Create a script to run the TableVectorizer on all openml datasets #665
Also, it might be useful to implement a hot-load functionality (which is already part of the benchmark framework), in case, for example, OpenML shuts off during the run. Adding a parameter
Edit: never mind, the hot-load functionality is not yet merged, as it's part of #593.
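For illustration only, here is a minimal sketch of what resuming an interrupted run could look like. It assumes results are stored in a JSON file keyed by task id. The names `run_with_resume`, `run_task` and `cache_path` are hypothetical, and this is not the hot-load implementation from #593.

```python
import json
import tempfile
from pathlib import Path


def run_with_resume(task_ids, run_task, cache_path):
    """Run `run_task` on each task id, skipping ids already in the cache file."""
    cache_path = Path(cache_path)
    # Reload earlier results if a previous run was interrupted.
    results = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    for task_id in task_ids:
        key = str(task_id)  # JSON object keys are always strings
        if key in results:
            continue
        results[key] = run_task(task_id)
        # Persist after every task so a crash loses at most one result.
        cache_path.write_text(json.dumps(results))
    return results


# Usage with a stand-in task function (hypothetical, for demonstration).
calls = []


def _fake_task(task_id):
    calls.append(task_id)
    return task_id * 2


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "results.json"
    run_with_resume([1, 2], _fake_task, cache)
    # The second run only evaluates task 3; tasks 1 and 2 come from the cache.
    resumed = run_with_resume([1, 2, 3], _fake_task, cache)
```

Writing the cache after each task trades some I/O for robustness, which seems reasonable when a single task can take minutes.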
Ah, the diff broke for some reason. I could fix it on one of my PRs by doing this:
Co-authored-by: Lilian <lilian@boulard.fr>
138 tasks raised errors. Some are not linked to skrub (NaNs in y, mixed types in y...). The only error linked to skrub is #679 (127 times).
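For reference, a tally like the one above can be produced with `collections.Counter`. This is a sketch under assumptions, not the script's actual code: `summarize_errors` and `run_task` are hypothetical names, and the fake task below only mimics the kinds of errors mentioned.

```python
from collections import Counter


def summarize_errors(task_ids, run_task):
    """Run every task and count the exceptions raised, grouped by message."""
    errors = Counter()
    for task_id in task_ids:
        try:
            run_task(task_id)
        except Exception as exc:
            # Group by exception type and message so identical failures add up.
            errors[f"{type(exc).__name__}: {exc}"] += 1
    return errors


# Stand-in task that fails in two different ways (hypothetical).
def _fake_run(task_id):
    if task_id % 3 == 0:
        raise ValueError("NaNs in y")
    if task_id % 3 == 1:
        raise TypeError("mixed types in y")


counts = summarize_errors(range(6), _fake_run)
for message, n in counts.most_common():
    print(n, message)
```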
LGTM. Merging. Thank you!
…rub-data#665)

* create script
* cache
* Use loguru for logging, various code improvements, slightly better doc and messages
* Fix condition
* fix import bug
* fix bug for empty evals
* fix 0 featues
* improvements
* Update benchmarks/run_on_openml_datasets.py

  Co-authored-by: Lilian <lilian@boulard.fr>
* import Counter
* test commit
* remove test commit
* fix bug

Co-authored-by: Lilian <lilian@boulard.fr>