use `classy-classification` for active learning #13

davidberenstein1957 · 2022-12-20T16:16:21Z

Ideally, we would be able to host active learners easily through a more abstract and intuitive process.

MVP

from argilla_plugins.active_learning import classy_classification_learner

# validation_threshold, min_n_samples and max_n_samples are ints; `...` marks values still to be chosen
learner = classy_classification_learner(name="dataset", model="bert", validation_threshold=..., min_n_samples=..., max_n_samples=...)
learner.start()

Stretch
Filtering variables such as a query could be added to limit the sync. Parameters such as a threshold could be added to pre-annotate and validate certain data.

davidberenstein1957 · 2023-01-19T12:49:56Z

Only update predictions that have predicted_by set to classy-classification.

dvsrepo · 2023-01-19T17:53:46Z

Really excited to see this happening!

Regarding the max-number-examples, I've been thinking about some scenarios related to continuous training and monitoring:

When we reach this limit, I understand we stop training, but we keep updating new records with the predictions of the model, right? This is the scenario where the user can send more data to the dataset and we use the model in the loop to label new data.

In the above scenario, if I have already reached the limit and the users annotate more data, will we retrain the model with the newest annotations? I think you mentioned that this would act as a LIFO queue? In my mind it makes total sense. We shift the few-shot training set towards more recent examples.
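The idea of a capped training set that shifts towards recent annotations can be sketched roughly as follows. This is only an illustration under assumptions: `select_training_window`, the `(text, label, annotated_at)` tuple layout, and applying the cap per label are all hypothetical and not taken from the plugin.

```python
from collections import defaultdict


def select_training_window(annotations, max_n_samples, newest_first=True):
    """Keep at most max_n_samples annotated texts per label.

    annotations: iterable of (text, label, annotated_at) tuples (hypothetical layout).
    newest_first=True keeps the most recent annotations (the LIFO-style behaviour),
    newest_first=False keeps the oldest ones (FIFO-style).
    """
    ordered = sorted(annotations, key=lambda a: a[2], reverse=newest_first)
    window = defaultdict(list)
    for text, label, _ in ordered:
        # Once a label has reached the cap, further examples of it are dropped.
        if len(window[label]) < max_n_samples:
            window[label].append(text)
    return dict(window)


annotations = [
    ("refund please", "billing", 1),
    ("card declined", "billing", 2),
    ("app crashes", "bug", 3),
    ("invoice missing", "billing", 4),
]
recent = select_training_window(annotations, max_n_samples=2)
print(recent)
```

With the cap at 2, the oldest "billing" example falls out of the window as soon as a newer one is annotated, which is the shift towards recent examples described above.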

dvsrepo · 2023-01-19T17:57:37Z

Not to overcomplicate things of course, just some quick thoughts about how powerful this could get!

davidberenstein1957 · 2023-01-31T14:29:52Z

@dvsrepo
The plugin currently works by fetching all annotated records, selecting annotations in FIFO or LIFO order, and creating a training dataset for classy-classification. The model trained on this dataset, which has index i, is then applied every interval t to a batch of x records that have no annotation and are queried where metadata.idx != i. These records are updated if the prediction score is certain enough and if the previous prediction is allowed to be overwritten.

This approach ensures the plugin keeps updating predictions in the background whenever new data is annotated, without each pass taking too long to apply the new knowledge.
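A single update pass of the kind described above might look roughly like the sketch below. It runs on plain dictionaries rather than real Argilla records, and everything in it is an assumption made for illustration: the `update_predictions` name, the record keys, the `predict` callable standing in for the trained model, and the rule that only empty predictions or earlier classy-classification predictions may be overwritten.

```python
def update_predictions(records, predict, dataset_idx, batch_size,
                       certainty_threshold, agent="classy-classification"):
    """Run one update pass and return the number of records changed.

    records: list of dicts with the keys "text", "annotation", "prediction",
             "prediction_agent" and "metadata" (hypothetical layout).
    predict: callable mapping text -> (label, score); stands in for the model.
    """
    # Unannotated records that were not yet scored with the current dataset index.
    pending = [
        r for r in records
        if r["annotation"] is None and r["metadata"].get("idx") != dataset_idx
    ][:batch_size]

    updated = 0
    for record in pending:
        label, score = predict(record["text"])
        # Mark the record as seen so the next pass moves on to other records.
        record["metadata"]["idx"] = dataset_idx
        # Assumed overwrite rule: empty predictions or ones made by this agent.
        overwritable = record["prediction_agent"] in (None, agent)
        if score >= certainty_threshold and overwritable:
            record["prediction"] = label
            record["prediction_agent"] = agent
            updated += 1
    return updated


def fake_predict(text):
    # Toy stand-in for the few-shot model: confident only about "card" texts.
    return ("billing", 0.9 if "card" in text else 0.4)


records = [
    {"text": "card declined", "annotation": None, "prediction": None,
     "prediction_agent": None, "metadata": {}},
    {"text": "hello", "annotation": None, "prediction": "bug",
     "prediction_agent": "other-model", "metadata": {}},
    {"text": "app crashes", "annotation": "bug", "prediction": None,
     "prediction_agent": None, "metadata": {}},
]
n_updated = update_predictions(records, fake_predict, dataset_idx=1,
                               batch_size=10, certainty_threshold=0.8)
print(n_updated)
```

Calling the function every interval t with a small `batch_size` keeps each pass short, and bumping `dataset_idx` after retraining makes all unannotated records eligible again.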

dvsrepo · 2023-01-31T16:03:49Z

Looks awesome, looking forward to trying it out

davidberenstein1957 · 2023-01-31T16:06:35Z

Yes, me too. I need to write tests for edge cases, but I want to do this more formal, structural work after reviewing the overall concept based on the input from PyData Bordeaux.

davidberenstein1957 added the active-learning label Dec 20, 2022

davidberenstein1957 mentioned this issue Jan 3, 2023

feat: basic active learning plugin for each NLP task argilla-io/argilla#2073

Closed

davidberenstein1957 added the help wanted Extra attention is needed label Jan 19, 2023

davidberenstein1957 self-assigned this Jan 19, 2023

davidberenstein1957 added a commit that referenced this issue Jan 31, 2023

added new version of classy-classification #13

627c8fd

davidberenstein1957 added a commit that referenced this issue Jan 31, 2023

added support for logging to classy active learner #21 #13

7aed621

davidberenstein1957 added a commit that referenced this issue Jan 31, 2023

added updated example #13 active learning

ff6e7ed
