Wrong "adult" dataset in OpenML100 / CC18 #813
Comments
It doesn't look like |
Regarding scikit-learn, there was some flawed if logic in a for loop. I created a PR to improve the code-quality and fix the problem: |
Cool, I merged this one. But what about the wrong dataset being in the collections? cc @joaquinvanschoren @berndbischl (and this dataset not ignoring one of the columns it should be ignoring). |
We should deactivate the current dataset(s) and upload a new version. Based on what information do you think that the fnlwgt column should be ignored? |
Based on the description of the dataset, and maybe the original paper? It's a
per-individual reweighting so that the overall population is represented
better.
On Tue, Oct 9, 2018, 11:43 janvanrijn wrote:
We should deactivate the current dataset(s) and upload a new version.
Based on what information do you think that the fnlwgt column should be
ignored?
|
Ok looks like the original paper included the
I don't entirely understand this feature but I guess it should be included? |
also shouldn't |
Moved this to openml/benchmark-suites#37 |
This one is tagged:
https://www.openml.org/d/1590
It should be this:
https://www.openml.org/d/1119
It should exclude the "fnlwgt" column (I'm not sure whether it is marked as ignored; I can't see that in the web interface).
Also the sklearn fetcher fails on this dataset.
CC @janvanrijn
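
To make the "fnlwgt" point concrete, here is a minimal sketch of what excluding that column means for anyone consuming the data. It uses only the standard library. The rows are made-up toy values, not records from either OpenML dataset, and `drop_ignored` is a hypothetical helper, not part of OpenML or scikit-learn.

```python
# Toy rows for illustration only; the values are invented and do not come
# from OpenML dataset 1590 or 1119.
ROWS = [
    {"age": 30, "fnlwgt": 100000, "education": "HS-grad", "class": "<=50K"},
    {"age": 45, "fnlwgt": 250000, "education": "Masters", "class": ">50K"},
]


def drop_ignored(rows, ignored=("fnlwgt",)):
    """Return copies of the rows without the ignored columns.

    `fnlwgt` is a sampling weight (how many people in the population a
    record stands for), so it is excluded from the predictive features here.
    """
    ignored = set(ignored)
    return [
        {name: value for name, value in row.items() if name not in ignored}
        for row in rows
    ]


# Features without the weight column; the original rows are left untouched.
cleaned = drop_ignored(ROWS)
print(sorted(cleaned[0]))
```

If the weights are still wanted, for example as sample weights during evaluation, they can be read from the original rows before dropping the column.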