-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a boolean converter to the CSV IO (#371) #373
Conversation
lib/daru/io/io.rb
Outdated
@@ -275,6 +275,17 @@ def html_table_to_dataframe(table) | |||
order: table[:order], | |||
name: table[:name] | |||
end | |||
|
|||
CSV::Converters[:boolean_converter] = lambda do |f, _| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, despite the fact that docs explicitly say "This Hash is intentionally left unfrozen and users should feel free to add values to it that can be accessed by all CSV objects", I don't believe any "global" solution is good or even necessary.
Imagine large project that uses daru AND some other code to work with CSV. What's the probability they also have defined :boolean_converter
with slightly different behavior (for ex., converting 'yes'
and 'no'
too)?..
Let's just have a constant with lambda in it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obviously people can still define their own custom boolean converters but I think it's good to have some sort of general convenience method built in. (For example Pandas converts booleans by default and I was surprised there was no built-in method here as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Daru should convert booleans, yes. All I am agains is to modify built-in hash of CSV::Converters
. I think it is a bad practice, generally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion the convenience of modifying the hash beats the better practice. Can change it though if that's the consensus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The converters option of the CSV parser accepts either the name (symbol) of a converter in the CSV::Converters
hash OR a lamba. So, you could probably build a Daru::CSV::Converters
hash for Daru-specific converters, and then have DataFrame.from_csv
accept converter names that first look in CSV::Converters
and then in Daru::CSV::Converters
. That way a user could override the Daru ones by either adding to CSV::Converters
or supplying their own lambda.
@zverok Have tried to implement the suggestions. Please have another look. |
@athityakumar may be of interest |
Resolves #371