New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add wine datasets #26

Merged
merged 5 commits into from Oct 15, 2018

Conversation

Projects
None yet
2 participants
@strviola
Contributor

strviola commented Oct 13, 2018

Issue #15

I tried to write code about wine (http://archive.ics.uci.edu/ml/datasets/wine).
I think we will be able to use wine Dataset.

@kou

This comment has been minimized.

Member

kou commented Oct 13, 2018

Thanks!
Can you try to write tests? Or should we do it as follow-up pull requests?

@strviola

This comment has been minimized.

Contributor

strviola commented Oct 13, 2018

Sorry, I'll try to write tests soon

@kou

This comment has been minimized.

Member

kou commented Oct 13, 2018

Great!

:ash,
:alcalinity_of_ash,
:magnesium,
:total_phenols,

This comment has been minimized.

@kou

kou Oct 13, 2018

Member

Does the field store "the number of phenols"?
If it's true, n_phenols is better. Because we're using n_XXX for "the number of XXX".

This comment has been minimized.

@strviola

strviola Oct 13, 2018

Contributor

fixed

module Datasets
class Wine < Dataset
Record = Struct.new(:wine_class,

This comment has been minimized.

@kou

kou Oct 13, 2018

Member

Did you choose wine_class instead of class to avoid conflict with Kernel#class?
We already use class in Iris. Should we change the class in Iris?

This comment has been minimized.

@strviola

strviola Oct 13, 2018

Contributor

fixed

:proanthocyanins,
:color_intensity,
:hue,
:od280_od315_of_diluted_wines,

This comment has been minimized.

@kou

kou Oct 13, 2018

Member

I don't know much about this area but optical_density may be better.

This comment has been minimized.

@strviola

strviola Oct 13, 2018

Contributor

I searched, use optical_nucleic_acid_concentration. ref: https://www.gelifesciences.co.jp/newsletter/biodirect_mail/chem_story/138.html (in Japanese)

This comment has been minimized.

@kou

kou Oct 13, 2018

Member

OK!

:alcalinity_of_ash,
:magnesium,
:total_phenols,
:flavanoids,

This comment has been minimized.

@kou

kou Oct 13, 2018

Member

Does the field store "the number of flavonoids"?
If so, n_flavonoids is better.

By the way, flavanoides (va -> vo) is typo?

This comment has been minimized.

@strviola

strviola Oct 13, 2018

Contributor

I see.

I copied attribute names from http://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.names, but here is flavanoides ...

This comment has been minimized.

@strviola

strviola Oct 13, 2018

Contributor

I will fix it, but spell "flavanoids" seems not to be typo. https://ejje.weblio.jp/content/flavanoids

This comment has been minimized.

@strviola

strviola Oct 13, 2018

Contributor

fixed

This comment has been minimized.

@kou

kou Oct 13, 2018

Member

I see.

:magnesium,
:total_phenols,
:flavanoids,
:nonflavanoid_phenols,

This comment has been minimized.

@kou

kou Oct 13, 2018

Member

ditto.

This comment has been minimized.

@strviola

strviola Oct 13, 2018

Contributor

fixed

:total_phenols,
:flavanoids,
:nonflavanoid_phenols,
:proanthocyanins,

This comment has been minimized.

@kou

kou Oct 13, 2018

Member

ditto.

This comment has been minimized.

@strviola

strviola Oct 13, 2018

Contributor

fixed

@strviola

This comment has been minimized.

Contributor

strviola commented Oct 13, 2018

Test added.

:malic_acid,
:ash,
:alcalinity_of_ash,
:magnesium,

This comment has been minimized.

@strviola

strviola Oct 13, 2018

Contributor

I think here would be better n_magnesium.

:color_intensity,
:hue,
:od280_od315_of_diluted_wines,
:proline)

This comment has been minimized.

@strviola

strviola Oct 13, 2018

Contributor

Here means "number of" too

@strviola

This comment has been minimized.

Contributor

strviola commented Oct 13, 2018

Fix attribute names. Please check them again.

@kou

Thanks!
I want to hear your ideas for attribute names.

:malic_acid,
:ash,
:alcalinity_of_ash,
:n_magnesium,

This comment has been minimized.

@kou

kou Oct 13, 2018

Member

The last s is missing: n_magnesiums. (n_magnesium_atoms may be better.)

This comment has been minimized.

@strviola

strviola Oct 14, 2018

Contributor

Sure.

This comment has been minimized.

@strviola

strviola Oct 14, 2018

Contributor

Fixed

:ash,
:alcalinity_of_ash,
:n_magnesium,
:n_phenols,

This comment has been minimized.

@kou

kou Oct 13, 2018

Member

I'm sorry but the field's value isn't "the number of XXX" (個数). I saw sample values in your tests code. They are floating point values (2.8 and 2.05). It seems that it's "the total amount of XXX" (総量). So we should not use n_XXX naming convention.

We don't have naming convention for "the total amount of XXX". Do you have any naming convention ideas for this case? total_XXX? XXX_amount?

This comment has been minimized.

@strviola

strviola Oct 14, 2018

Contributor

Sorry, total_phenols seems to be good.

This comment has been minimized.

@strviola

strviola Oct 14, 2018

Contributor

Fixed

@kou kou merged commit 6124ada into red-data-tools:master Oct 15, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@kou

This comment has been minimized.

Member

kou commented Oct 15, 2018

Thanks!

@strviola

This comment has been minimized.

Contributor

strviola commented Oct 15, 2018

Thanks! 😂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment