Change Summary
CSV file format class and related documentation & unit tests
GenericOptions.parse method
It was previously planned to implement nested classes:
CSV.Reading(delimiter=',', inferSchema=True)
CSV.Writing(delimiter=',', compression='gzip')
But every implementation I tried was messy or tricky to get working as expected.
So instead I've merged all options into a single class. We don't do anything with the input options, they are passed to Spark as-is, so there is no real reason to split them into 2 classes.
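To illustrate the idea, here is a minimal sketch of a single options class shared by reading and writing. It is not the code from this PR: the names CSVOptions, extra and to_spark_options are hypothetical, and a plain dataclass is assumed only to keep the example self-contained.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class CSVOptions:
    """Hypothetical stand-in for a single merged options class (not the PR's actual code)."""

    # Declared option with a default value
    delimiter: str = ","
    # Any other read or write options, stored untouched
    extra: Dict[str, Any] = field(default_factory=dict)

    def to_spark_options(self) -> Dict[str, Any]:
        # No validation and no read/write split: everything is forwarded as-is
        return {"delimiter": self.delimiter, **self.extra}


# The same class serves both directions
read_opts = CSVOptions(extra={"inferSchema": True}).to_spark_options()
write_opts = CSVOptions(extra={"compression": "gzip"}).to_spark_options()
```

In this sketch the resulting dictionaries could be handed to PySpark via spark.read.options(**read_opts) or df.write.options(**write_opts), which is why a separate class per direction adds nothing.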
I've also added several default options directly to the class instead of known_options to improve the developer experience: the IDE can suggest attribute names, default values are printed to logs, etc.
Related issue number
Checklist
docs/changelog/next_release/<pull request or issue id>.<change type>.rst file added describing the change (see CONTRIBUTING.rst for details).