Closed
Description
Context
We're starting to generate more sources, but we don't update our training data.
Requirements
- We need a script that updates our training data. It should be something we can automate later; for now we run it manually, just before we retrain (which is infrequent).
- Check for newly labeled items from our source collector app.
- Grab not only items labeled "relevant" that are in our db, but also items labeled "not relevant".
- Collect HTML responses upstream of annotation #324
- Optional: apply keyword extraction to the labeled data.
- Submit approved URL Task #152
- Update the Hugging Face training-urls dataset, including the batch ID for each item.
- Tweak Data Collection #141
- place it there as raw data; we can transform it into more specific datasets as needed
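
As a rough sketch of the requirements above, the selection and raw export steps might look something like this. The `LabeledURL` record shape, its field names, and the helper names `select_new_rows` and `to_raw_jsonl` are all assumptions made for illustration; the real source collector schema and the dataset layout may differ, and the upload to Hugging Face is left out.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime


@dataclass
class LabeledURL:
    # Hypothetical record shape; the source collector's real schema may differ.
    url: str
    label: str       # e.g. "relevant" or "not relevant"
    labeled_at: str  # ISO 8601 timestamp with an explicit UTC offset
    batch_id: int


def select_new_rows(rows, last_update):
    """Keep every row labeled after the previous update, whatever its label."""
    cutoff = datetime.fromisoformat(last_update)
    return [r for r in rows if datetime.fromisoformat(r.labeled_at) > cutoff]


def to_raw_jsonl(rows):
    """Serialize rows untransformed, one JSON object per line, batch ID included."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in rows)


if __name__ == "__main__":
    rows = [
        LabeledURL("https://a.example", "relevant", "2024-01-10T00:00:00+00:00", 1),
        LabeledURL("https://b.example", "not relevant", "2024-02-10T00:00:00+00:00", 2),
    ]
    # Only the second row was labeled after the (made-up) last update time.
    print(to_raw_jsonl(select_new_rows(rows, "2024-02-01T00:00:00+00:00")))
```

Keeping both labels and writing the rows without transformation matches the "raw data" requirement; more specific datasets can be derived from this file later.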
Status
Done