Add all tasks from datasets #95
js/src/lib/interfaces/Types.ts
Outdated
    audio: "Audio",
    cv: "Computer Vision",
    rl: "Reinforcement Learning",
    time_series: "Time Series",
Suggested change:
    - time_series: "Time Series",
    + time_series: "Time Series",
    + structured: "Structured Data",
And actually, I would advocate for time series to be inside structured.
The number of Time Series datasets/models may be so small that a dedicated modality is overkill for now.
However, in the long run I expect it to be separated from the other structured datasets we'll have. We may want a separate modality just for time series at some point anyway, for classification, forecasting, anomaly detection, etc. Like the vision, audio and text modalities, time series require very specific preprocessing and model architectures.
That said, I agree such datasets often come with structured metadata to help predictions.
Individual tasks will still be time-series related, so personally I like using structured as an umbrella for a few different things. It feels like a better level of generality.
What about table tasks? Technically they're also structured, no? But we've kept them under NLP.
Since modality is not heavily used anywhere, I feel it's safe to change afterward if needed once we see more adoption. Let's go with structured and have everything together, so we don't have sections just pointing to a couple of models/datasets. Later on we can always split pragmatically.
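A minimal sketch of the outcome agreed above, assuming a plain key-to-label map like the one under review in `Types.ts`. The names `MODALITY_LABELS`, `TASK_MODALITY` and `tasksForModality`, the `nlp` entry, and the `time-series-forecasting` and `image-classification` task ids are hypothetical and only illustrate the idea: time-series tasks sit under a single `structured` modality, and splitting them out later would mean changing one mapping.

```typescript
// Hypothetical modality map: `structured` acts as the umbrella,
// so there is no dedicated `time_series` key for now.
const MODALITY_LABELS = {
	nlp: "Natural Language Processing", // assumed label, not shown in the diff
	audio: "Audio",
	cv: "Computer Vision",
	rl: "Reinforcement Learning",
	structured: "Structured Data",
} as const;

type Modality = keyof typeof MODALITY_LABELS;

// Hypothetical task-to-modality lookup, for illustration only.
const TASK_MODALITY: Record<string, Modality> = {
	"tabular-classification": "structured",
	"time-series-forecasting": "structured",
	"image-classification": "cv",
};

// Returns the task ids registered under a modality, in declaration order.
function tasksForModality(modality: Modality): string[] {
	return Object.keys(TASK_MODALITY).filter((task) => TASK_MODALITY[task] === modality);
}
```

If time series later earns its own section, only the `TASK_MODALITY` entries and one new label would need to move in this sketch.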
* This is not a Python repo anymore
* Update README.md
* rabbit hole: revert to package-lock.json lockfileVersion=1 (we use npm 6 stable for now)
* rabbit hole: let's try this?
* CI: Actually we should also build widgets in that case (they're broken currently) cc @mishig25
* Fix for new `tabular-classification`
* `export-data.ts` endpoint
* ci: trigger JS Interfaces CI run
* Revert "ci: trigger JS Interfaces CI run" (this reverts commit 34ac3e9)
* move export-tasks to a simple script and run using `tsm`
Third part of 4 (or 3?) for #83
This PR adds all tasks from `tasks.json` in `datasets` with the latest update from @lhoestq in huggingface/datasets#4066

Internal change needed: hide the new types based on `hideInModels`

Some changes along with the PR
* `hideInModels`
* `TASKS_DATA` and `TASKS_MODEL_LIBRARIES` to have same order as `PIPELINE_TAGS_DISPLAY_ORDER` since that really helps with diffs.

cc @lhoestq
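For readers unfamiliar with the two changes named in the description, here is a rough sketch of how a `hideInModels` flag and a shared key order might be used. The `TaskData` shape, the three task ids (including the placeholder `dataset-only-task`), and the helpers `visibleInModels` and `hasDisplayOrder` are assumptions made for illustration; the real definitions in the repo may differ.

```typescript
// Hypothetical task metadata shape; only `hideInModels` comes from the PR text.
interface TaskData {
	name: string;
	hideInModels?: boolean;
}

// Illustrative display order; the ids are placeholders, not the real list.
const PIPELINE_TAGS_DISPLAY_ORDER = [
	"text-classification",
	"tabular-classification",
	"dataset-only-task",
] as const;

type PipelineTag = (typeof PIPELINE_TAGS_DISPLAY_ORDER)[number];

// Keys are declared in the same order as PIPELINE_TAGS_DISPLAY_ORDER,
// so adding a task touches the same relative position in both places.
const TASKS_DATA: Record<PipelineTag, TaskData> = {
	"text-classification": { name: "Text Classification" },
	"tabular-classification": { name: "Tabular Classification" },
	"dataset-only-task": { name: "Dataset-only Task", hideInModels: true },
};

// Tags that would be shown on the models side: everything not flagged as hidden.
function visibleInModels(): PipelineTag[] {
	return PIPELINE_TAGS_DISPLAY_ORDER.filter((tag) => !TASKS_DATA[tag].hideInModels);
}

// True when `keys` lists exactly the display-order tags, in the same order.
function hasDisplayOrder(keys: string[]): boolean {
	return (
		keys.length === PIPELINE_TAGS_DISPLAY_ORDER.length &&
		keys.every((key, i) => key === PIPELINE_TAGS_DISPLAY_ORDER[i])
	);
}
```

A check such as `hasDisplayOrder(Object.keys(TASKS_DATA))` could run in CI to keep the objects aligned, which is one way to preserve the smaller diffs the description mentions.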