Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a boolean converter to the CSV IO (#371) #373

Merged
merged 4 commits into from
Aug 3, 2017

Conversation

baarkerlounger
Copy link
Contributor

Resolves #371

@@ -275,6 +275,17 @@ def html_table_to_dataframe(table)
order: table[:order],
name: table[:name]
end

CSV::Converters[:boolean_converter] = lambda do |f, _|
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, despite the fact that docs explicitly say "This Hash is intentionally left unfrozen and users should feel free to add values to it that can be accessed by all CSV objects", I don't believe any "global" solution is good or even necessary.

Imagine large project that uses daru AND some other code to work with CSV. What's the probability they also have defined :boolean_converter with slightly different behavior (for ex., converting 'yes' and 'no' too)?..

Let's just have a constant with lambda in it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zverok went by the conversation at the bottom of this for why I added it in Daru #337

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obviously people can still define their own custom boolean converters but I think it's good to have some sort of general convenience method built in. (For example Pandas converts booleans by default and I was surprised there was no built-in method here as well).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Daru should convert booleans, yes. All I am agains is to modify built-in hash of CSV::Converters. I think it is a bad practice, generally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion the convenience of modifying the hash beats the better practice. Can change it though if that's the consensus.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@v0dro @gnilrets thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The converters option of the CSV parser accepts either the name (symbol) of a converter in the CSV::Converters hash OR a lamba. So, you could probably build a Daru::CSV::Converters hash for Daru-specific converters, and then have DataFrame.from_csv accept converter names that first look in CSV::Converters and then in Daru::CSV::Converters. That way a user could override the Daru ones by either adding to CSV::Converters or supplying their own lambda.

@baarkerlounger
Copy link
Contributor Author

@zverok Have tried to implement the suggestions. Please have another look.

@zverok zverok merged commit 19748d6 into SciRuby:master Aug 3, 2017
@baarkerlounger
Copy link
Contributor Author

@athityakumar may be of interest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants