Iterative Labelling & Model Building #3070

Open
phorne-uncharted opened this issue Nov 4, 2021 · 0 comments

phorne-uncharted commented Nov 4, 2021

Currently, when a model is built from user-labelled data, it uses a clone of the dataset that retains only the labelled rows. If the user is not happy with the results, they have to go back to the labelling screen, create a new clone, and then update the labels.

Instead of using the above approach, the user should be able to easily add more labelled data after reviewing the model results. The flow would be:

  • User labels an initial set of data
  • User clicks Create Model
  • User views model results
  • User chooses to label more data
  • User clicks Create Model
  • etc.

All of the above steps should be done without creating a new clone for each iteration.
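
As a rough illustration of the flow above, here is a minimal Python sketch in which labels are written in place on a single dataset and each Create Model step uses whatever rows are labelled at that point. The `Dataset` class, the `create_model` function, and the majority-label "model" are hypothetical stand-ins for illustration only, not the project's actual code.

```python
from collections import Counter


class Dataset:
    """Hypothetical in-memory dataset; unlabelled rows have label None."""

    def __init__(self, features):
        self.rows = [{"x": x, "label": None} for x in features]

    def set_label(self, index, label):
        # Labels are written in place, so no clone is needed per iteration.
        self.rows[index]["label"] = label


def create_model(dataset):
    """Stand-in for Create Model: uses only the rows labelled so far."""
    training = [r for r in dataset.rows if r["label"] is not None]
    majority = Counter(r["label"] for r in training).most_common(1)[0][0]
    return {"training_rows": len(training), "majority_label": majority}


dataset = Dataset(features=[0.1, 0.4, 0.6, 0.9])

# Iteration 1: label an initial set, then build.
dataset.set_label(0, "neg")
dataset.set_label(3, "pos")
first = create_model(dataset)

# Iteration 2: after reviewing results, label more rows on the same dataset.
dataset.set_label(2, "pos")
second = create_model(dataset)
```

The second build sees three labelled rows instead of two, while the dataset itself still holds all four rows and was never copied.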

To get there, rather than cloning the dataset (excluding rows with missing labels), the client should model the label field directly and include a filter in the model creation request that removes unlabelled rows in the prefiltering step. Depending on the exact nature of that filter, the API may need a slight change so that the client can specify that the filter is applied during prefiltering rather than prepended as a step in the model creation pipeline.
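
One possible shape for such a request, sketched in Python. Every field name here (`datasetId`, `target`, `prefilters`, `mode`, `values`) is an assumption made for illustration and does not reflect the existing API; `apply_prefilters` is likewise a hypothetical stand-in for the server-side prefiltering step.

```python
def build_model_request(dataset_id, label_field):
    """Hypothetical request body: model the label field and ask the
    server to drop unlabelled rows during prefiltering."""
    return {
        "datasetId": dataset_id,
        "target": label_field,
        # Assumed convention: a missing label is stored as None or "".
        "prefilters": [
            {"field": label_field, "mode": "exclude", "values": [None, ""]}
        ],
    }


def apply_prefilters(rows, prefilters):
    """Stand-in for server-side prefiltering; handles 'exclude' filters only."""
    kept = rows
    for f in prefilters:
        if f["mode"] == "exclude":
            kept = [r for r in kept if r.get(f["field"]) not in f["values"]]
    return kept


request = build_model_request("example-dataset", "label")
rows = [
    {"x": 0.1, "label": "neg"},
    {"x": 0.4, "label": ""},
    {"x": 0.6, "label": None},
    {"x": 0.9, "label": "pos"},
]
training_rows = apply_prefilters(rows, request["prefilters"])
```

Because the filter travels with the request, the stored dataset keeps its unlabelled rows, and they become available for training as soon as the user labels them.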

That should let the user iteratively add labels to improve model quality.

@Zac-hills Zac-hills self-assigned this Nov 8, 2021