CSV without header #88

Open
rabbit72 opened this issue Sep 1, 2023 · 4 comments · May be fixed by #93

@rabbit72

rabbit72 commented Sep 1, 2023

Is it possible to validate files without a header row?

I expected something like this to work, but it seems it's not possible:

validators = {
    0: [
        UniqueValidator(),
        IntValidator(),
    ],
    1: [
        FloatValidator(),
    ],
    2: [
        SetValidator(["USD", "EUR", "GBP"]),
    ],
}
@di
Owner

di commented Sep 13, 2023

Not currently. The header is required for certain validators to function properly, e.g. to determine row lengths for the RowLengthValidator, and it is also used in the output of results.

jonafato added a commit to jonafato/vladiate that referenced this issue Mar 9, 2024
This is a proof of concept to support validating CSVs that don't include
a header row. The implementation allows a user to define a sequence of
field names and passes these through to the underlying DictReader
instances. The default behavior remains unchanged. I did a quick manual
test with a slight modification of the example Vlad object and sample
CSV contents from the readme.

If this is a useful construct, I'll be happy to fill in the
documentation and tests for this option.

Resolves di#88.
jonafato linked a pull request Mar 9, 2024 that will close this issue
@jonafato
Collaborator

jonafato commented Mar 9, 2024

@rabbit72 I put together a small pull request at #93 as a proof of concept. This would allow a developer to define the header values at the validator level to validate CSV files without included header rows. Would this change address your use case?

jonafato added a commit to jonafato/vladiate that referenced this issue Mar 15, 2024
Add support for validating CSVs that don't include a header row. The
implementation allows a user to define a sequence of field names and
passes these through to the underlying `DictReader` instances. The
`fieldnames` attribute is `None` by default, which retains the existing
behavior of inferring field names from a header row in the source CSV.

Resolves di#88.
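
For readers unfamiliar with the mechanism the commit message relies on, here is a minimal sketch using only the standard library's `csv.DictReader`. It is not vladiate's code; the sample data and the column names `id`, `price`, and `currency` are made up for illustration.

```python
import csv
import io

# Hypothetical headerless CSV content; the column names below are
# illustrative and do not come from the issue or the pull request.
data = "1,9.99,USD\n2,4.50,EUR\n"
fieldnames = ["id", "price", "currency"]

# With fieldnames left as None (the default), DictReader consumes the
# first row and uses its values as the keys.
inferred = list(csv.DictReader(io.StringIO(data)))

# With explicit fieldnames, every row, including the first, is data.
named = list(csv.DictReader(io.StringIO(data), fieldnames=fieldnames))

print(len(inferred), len(named))
print(named[0])
```

Without `fieldnames`, the first data row is silently consumed as the header, which is why a headerless file cannot be validated by column name unless the names are supplied separately.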
@rabbit72
Author

@jonafato Hi, this looks promising!
Does it mean this change allows me to name columns in a CSV file that doesn’t contain the first row with columns?
Do you validate that all rows contain the same number of columns?

@jonafato
Collaborator

> Does it mean this change allows me to name columns in a CSV file that doesn’t contain the first row with columns?

Yes, if you use this feature, your CSV files should not include header rows. (This feature is not compatible with CSV files that include header rows, and setting fieldnames along with a header row is likely to result in a validation error or other invalid results.)
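
To illustrate the incompatibility, here is a small sketch of what happens at the `csv.DictReader` level when field names are supplied for a file that also has a header row. It uses only the standard library; the data, the column names, and the `is_int` helper are hypothetical and are not part of vladiate.

```python
import csv
import io

# Hypothetical CSV that DOES include a header row.
data_with_header = "id,price,currency\n1,9.99,USD\n"
fieldnames = ["id", "price", "currency"]

# Because fieldnames is given, the header line is read as an ordinary row.
rows = list(csv.DictReader(io.StringIO(data_with_header), fieldnames=fieldnames))


def is_int(value):
    """Hypothetical stand-in for an integer check on a single field."""
    try:
        int(value)
        return True
    except ValueError:
        return False


# Indexes of rows whose "id" field is not an integer.
bad_rows = [i for i, row in enumerate(rows) if not is_int(row["id"])]
print(rows[0])
print(bad_rows)
```

The header line comes back as a data row whose values are the column names themselves, so an integer check on `id` flags it.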

> Do you validate that all rows contain the same number of columns?

This is not enabled by default, but vladiate ships with a RowLengthValidator that you can use with the row_validators attribute to ensure that all rows have the same number of columns as the header row (or, with the feature in this PR, the fieldnames attribute). This row validator is available as of version 0.0.25.
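
As a rough idea of what a row-length check against a list of field names involves, here is a standard-library sketch. It is not the RowLengthValidator implementation; the function name `rows_with_wrong_length` and the sample data are assumptions made for illustration.

```python
import csv
import io


def rows_with_wrong_length(text, fieldnames):
    """Return (line_number, column_count) for each row whose length differs
    from the number of field names. Hypothetical helper, not vladiate's API."""
    expected = len(fieldnames)
    reader = csv.reader(io.StringIO(text))
    return [
        (line_no, len(row))
        for line_no, row in enumerate(reader, start=1)
        if row and len(row) != expected  # blank lines are skipped
    ]


# Hypothetical headerless data: line 2 is short, line 3 has an extra column.
sample = "1,9.99,USD\n2,4.50\n3,1.00,GBP,extra\n"
problems = rows_with_wrong_length(sample, ["id", "price", "currency"])
print(problems)
```

Counting columns with a plain `csv.reader` avoids `DictReader`'s padding of short rows and grouping of extra fields, which would otherwise hide the mismatch.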
