Bugfix for error describing CSV with semicolon delimiter #129
Merged
phargogh merged 3 commits into natcap:main on Apr 8, 2026
…tiating a resource. natcap#128
Text encoding does not make sense as an attribute of binary files. Also, frictionless assigns a default encoding when it fails to detect one, so the value it was giving us was spurious.
phargogh (Member) approved these changes on Apr 8, 2026 and left a comment:
Thanks @davemfish! I had one question about how it might make sense to approach the other attributes returned by frictionless, but then I second-guessed myself. I suppose we can revisit my question in the future if/when frictionless changes its attributes around or adds new ones.
Comment on lines +290 to +299
    # These are other attributes sometimes returned by frictionless.
    # We don't have a use for them in our metadata and we do not permit
    # arbitrary extra attributes in our models.
    description.pop('mediatype', None)
    description.pop('name', None)
    description.pop('profile', None)
    description.pop('dialect', None)
    description.pop('hash', None)
    description.pop('sources', None)
    description.pop('licenses', None)
In case frictionless adds some additional attributes in future releases, maybe it'd be more practical to have a list of attributes we want to keep and then delete the rest?
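For illustration, the keep-list idea could look roughly like the sketch below. The attribute names in `KEEP_ATTRIBUTES` and the helper `keep_known_attributes` are hypothetical, chosen only for this example; they are not taken from the codebase.

```python
# Hypothetical allowlist: these attribute names are assumptions made for
# illustration, not the project's actual model fields.
KEEP_ATTRIBUTES = frozenset({'path', 'type', 'format', 'scheme', 'schema'})


def keep_known_attributes(description, keep=KEEP_ATTRIBUTES):
    """Return a copy of ``description`` containing only allowlisted keys."""
    return {key: value for key, value in description.items() if key in keep}


# A made-up description dict, shaped like what a describe step might return.
described = {
    'path': 'data.csv',
    'type': 'table',
    'format': 'csv',
    'dialect': {'csv': {'delimiter': ';'}},
    'mediatype': 'text/csv',
}
filtered = keep_known_attributes(described)
```

With this approach, any attribute added by a future frictionless release would be dropped by default rather than needing another `pop` call.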
    """
    description = describe_file(source_dataset_path, scheme)
    description.pop('encoding', None)  # does not make sense for binary data
Oh, maybe my above comment doesn't make sense if the attributes to keep depend on the datatype.
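If the attributes to keep do vary by datatype, one possible shape is a keep-list per resource type. This is only a sketch: the type names, attribute names, and the helper `filter_for_type` are all assumptions made for illustration.

```python
# Hypothetical per-type keep-lists. Which attributes belong to which
# datatype is an assumption here, e.g. 'encoding' is kept only for tables.
COMMON = frozenset({'path', 'type', 'format', 'scheme'})
KEEP_BY_TYPE = {
    'table': COMMON | {'encoding', 'schema'},
    'raster': COMMON,
}


def filter_for_type(description, resource_type):
    """Keep only the attributes allowlisted for ``resource_type``."""
    keep = KEEP_BY_TYPE[resource_type]
    return {key: value for key, value in description.items() if key in keep}


raw = {'path': 'a.tif', 'type': 'raster', 'encoding': 'utf-8', 'hash': 'abc'}
as_raster = filter_for_type(raw, 'raster')
as_table = filter_for_type(raw, 'table')
```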
Such a CSV would be described by frictionless with an extra `dialect` attribute. Passing that through to our `TableResource` model raised a `ValidationError` because we do not allow extra attributes on our models.
Fixes #128
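The failure mode can be reproduced in miniature with a standard-library stand-in. The `StrictTable` dataclass below is hypothetical and is not the project's `TableResource`; a dataclass rejects unexpected keyword arguments with a `TypeError`, whereas the real model raises a `ValidationError`. The shape of the `dialect` value is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class StrictTable:
    """Hypothetical stand-in for a model that forbids extra attributes."""
    path: str
    format: str


# Made-up description of a semicolon-delimited CSV; the 'dialect' entry
# plays the role of the unexpected extra attribute.
description = {
    'path': 'data.csv',
    'format': 'csv',
    'dialect': {'csv': {'delimiter': ';'}},
}

try:
    StrictTable(**description)
    rejected = False
except TypeError:
    # The unexpected 'dialect' keyword is refused by the strict model.
    rejected = True

# Dropping the extra attribute first lets instantiation succeed.
description.pop('dialect', None)
resource = StrictTable(**description)
```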
I also addressed the issue of nonsensical encodings. A longer-term solution is probably to stop using frictionless to describe raster and vector datasets, but I'll save that for later.
Fixes #121