new openml scenario #18
base: master
Conversation
Could we please use the
There are different runs of the same algorithm with different parameter settings. How do we want to handle that?
I've made a different algorithm for each parameterisation now. Not sure if this is what you had in mind when proposing the format change though?
Yes, I did it in a similar way in the
I have not given each algorithm an id, but only each configuration.
Hmm, that seems a bit weird. So this metainfo thing is only there to get shorter names? It seems like it should enable grouping algorithms with different configurations.
why?
This was my initial motivation.
I don't really understand what you want to say.
It seems weird to allow only one configuration per algorithm. If I see a "configuration" field, I would expect to be allowed to have more than one. I guess "call" would be less ambiguous. But if it's ok with everybody else, let's leave it this way.
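For illustration, allowing more than one configuration per algorithm amounts to grouping runs by an (algorithm, configuration) pair instead of minting a pseudo-algorithm per parameterisation. A minimal Python sketch with made-up run records (not the actual scenario format):

```python
from collections import defaultdict

# Hypothetical run records: the same algorithm appears with several
# parameter settings, so (algorithm, configuration) identifies a run.
runs = [
    {"algorithm": "J48", "configuration": "-C 0.25", "instance": "X24_mushroom"},
    {"algorithm": "J48", "configuration": "-C 0.10", "instance": "X24_mushroom"},
    {"algorithm": "NaiveBayes", "configuration": "", "instance": "X24_mushroom"},
]

# Group configurations under their parent algorithm instead of
# creating one algorithm entry per parameterisation.
configs_per_algorithm = defaultdict(set)
for run in runs:
    configs_per_algorithm[run["algorithm"]].add(run["configuration"])

print(sorted(configs_per_algorithm["J48"]))  # two configurations of J48
```

With this grouping, "J48" keeps one identity while carrying both parameter settings, which is what a multi-valued "configuration" (or "call") field would express.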
Apart from that I think that this is ready to be merged.
@larskotthoff Could I convince you to drop the
In general, I miss a
Furthermore, the status of all algorithm runs is ok, but some have an acc of 0.0 sometimes.
It would also be great if the readme could explain why we have missing feature values.
Please note that I fixed two further issues in the
@joaquinvanschoren Is it correct that two datasets have exactly the same meta-feature vector?

Yes, all datasets share the same meta-features. I believe there are some cases where you have NaN when a division by zero happens, or a '-1' where something cannot be computed (e.g. mean number of nominal categories when the data is purely numeric). I think that, in both cases, this only happens for classification datasets with only numeric features.

Cheers,
Joaquin
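The two conventions described here (NaN on division by zero, -1 when a value cannot be computed) could be sketched as follows; these are illustrative functions, not OpenML's actual meta-feature code:

```python
def mean_nominal_categories(categories_per_nominal_feature):
    """Illustrative meta-feature: mean number of categories over the
    nominal features; returns -1 when the dataset has no nominal
    features, since the value cannot be computed for purely numeric data."""
    if not categories_per_nominal_feature:
        return -1
    return sum(categories_per_nominal_feature) / len(categories_per_nominal_feature)


def safe_ratio(numerator, denominator):
    """Illustrative division-based meta-feature: NaN on division by zero."""
    if denominator == 0:
        return float("nan")
    return numerator / denominator


print(mean_nominal_categories([]))  # -1: purely numeric dataset
print(safe_ratio(5, 0))             # nan: division by zero
```

A consumer of the scenario then has to treat both NaN and -1 as "missing" rather than as ordinary feature values.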
I meant that X24_mushroom and X809_mushroom have exactly the same vector. So, we cannot discriminate these two.

Another question: The features seem to have native feature groups, e.g., CfsSubsetEval, DecisionStump, Hoeffding, J48, NaiveBayes, REPTree, RandomTree, kNN1. I think it would improve the quality of the scenario if we would model these feature groups properly.

Oh, that is indeed a duplicate. Sorry about that. Are there more?

Cheers,
Joaquin
Not as far as I know.
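Duplicates like X24_mushroom/X809_mushroom can be found mechanically by inverting the instance-to-vector mapping; a small sketch with made-up feature vectors:

```python
# Hypothetical meta-feature table: instance name -> feature vector.
meta_features = {
    "X24_mushroom":  (22.0, 8124.0, 2.0),
    "X809_mushroom": (22.0, 8124.0, 2.0),  # same vector as X24_mushroom
    "X61_iris":      (4.0, 150.0, 3.0),
}

# Invert the mapping: vectors that map to more than one instance are
# indistinguishable for algorithm selection.
seen = {}
duplicates = []
for instance, vector in meta_features.items():
    if vector in seen:
        duplicates.append((seen[vector], instance))
    else:
        seen[vector] = instance

print(duplicates)  # [('X24_mushroom', 'X809_mushroom')]
```

Running such a check over the full feature table would answer the "Are there more?" question directly.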
I've shortened the names and added a readme. I don't see your point about the 0 accuracy values -- this is a valid number for accuracy and doesn't necessarily indicate an error. Regarding feature groups: As we don't have feature costs (and don't care about feature costs) I don't think that grouping them differently will make any difference.
Thanks!
I would say being worse than random is already problematic, but being always wrong is weird.
Not for your tools, but for mine. ;-)
Ok, feel free to change the feature groups. |
Accuracy 0 is indeed weird. Does it happen often?
1187 times.
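For illustration, counting such rows boils down to filtering the runs table on the accuracy column; a sketch over a toy CSV excerpt (not the real Runs_OpenML.csv data):

```python
import csv
import io

# Toy excerpt of a runs table; an accuracy of 0.0 is a valid score,
# but many such rows may still warrant a closer look.
csv_text = """instance,algorithm,accuracy
X24_mushroom,J48,0.0
X24_mushroom,NaiveBayes,0.97
X61_iris,J48,0.0
"""

zero_acc = [
    row for row in csv.DictReader(io.StringIO(csv_text))
    if float(row["accuracy"]) == 0.0
]
print(len(zero_acc))  # 2 zero-accuracy rows in this toy excerpt
```

Applied to the full file, this is the kind of count that yields the 1187 figure above.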
I reduced the number of feature groups.
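Modelling the native groups could, for example, map each feature to the landmarker group that produces it; group names are taken from the discussion above, while the concrete feature names are made up for illustration:

```python
# Feature groups named after the landmarkers mentioned above; the
# individual feature names are hypothetical.
feature_groups = {
    "J48": ["J48.AUC", "J48.ErrRate", "J48.Kappa"],
    "NaiveBayes": ["NaiveBayes.AUC", "NaiveBayes.ErrRate"],
    "REPTree": ["REPTree.AUC", "REPTree.ErrRate"],
}

# All features of a group are computed by one landmarker run, so a
# selector that requests one feature of a group effectively gets
# (and pays for) the whole group.
group_of = {f: g for g, feats in feature_groups.items() for f in feats}
print(group_of["J48.AUC"])  # J48
```

Even without feature costs, making these groups explicit documents which features are computed together.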
Using a grep on the original ASLib: Runs_OpenML.csv:
Ah, this was a different dataset?
Maybe I missed a concrete question: