Label encoding is bugged in active learning example #2

Closed
phurwicz opened this issue Dec 9, 2020 · 1 comment

Comments

@phurwicz
Owner

phurwicz commented Dec 9, 2020

This happens when the 'raw' subset of the dataset has a label that the 'dev' subset does not.
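
A minimal sketch of the failure mode, assuming the label encoder is built from the 'dev' labels only. `build_label_encoder` and the sample labels are hypothetical and are not hover's actual code:

```python
def build_label_encoder(labels):
    """Map each distinct label to an integer index, in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


# Hypothetical data: 'promo' appears in 'raw' but never in 'dev'.
dev_labels = ["spam", "ham", "ham"]
raw_labels = ["spam", "ham", "promo"]

# Assumption: the encoder only ever sees the 'dev' labels.
encoder = build_label_encoder(dev_labels)

# Labels in 'raw' that the encoder cannot translate.
unseen = sorted(set(raw_labels) - set(encoder))
print("labels missing from the encoder:", unseen)
```

Looking up any of the `unseen` labels in `encoder` raises a `KeyError`, which is the kind of mismatch described above.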

SupervisableDataset assumes that labels in the 'raw' subset are irrelevant while those in 'train' are relevant, which is all well and good.

However, it should be made clear how and when annotated data points in 'raw' get committed to 'train'. It should also be clear that there are times when one wants to deduplicate 'raw' against 'train' (i.e. to 'lock' those annotated points) and times when one doesn't (i.e. to keep those points open to modification).

  • These 'commit' and 'deduplicate' actions should be accessible as app-level (cross-explorer) widgets.
  • We also need a button to push dataframe updates to explorer sources. This isn't automatic due to performance considerations; perhaps add a scheduled pull/push from dataframes to sources?
  • update_population and retrain_model should read the 'train' set rather than the 'raw' set.
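
The commit and deduplicate actions could look roughly like the sketch below. It uses plain lists of dicts instead of dataframes to stay self-contained. The function names, the "text" key, and the "ABSTAIN" placeholder for unannotated points are all assumptions for illustration, not SupervisableDataset's confirmed API:

```python
ABSTAIN = "ABSTAIN"  # assumed placeholder meaning "not annotated yet"


def commit(raw, train, key="text"):
    """Return a new 'train' that includes the annotated rows of 'raw'.

    If a key is already in 'train', the label from 'raw' replaces it.
    """
    merged = {row[key]: dict(row) for row in train}
    for row in raw:
        if row["label"] != ABSTAIN:
            merged[row[key]] = dict(row)
    return list(merged.values())


def deduplicate(raw, train, key="text"):
    """Return 'raw' without the rows whose key already exists in 'train'."""
    train_keys = {row[key] for row in train}
    return [row for row in raw if row[key] not in train_keys]


# Hypothetical subsets.
raw = [
    {"text": "a", "label": "pos"},
    {"text": "b", "label": ABSTAIN},
    {"text": "c", "label": "neg"},
]
train = [{"text": "z", "label": "pos"}]

train_after = commit(raw, train)            # 'a' and 'c' join 'train'
raw_after = deduplicate(raw, train_after)   # 'a' and 'c' are locked out of 'raw'
```

Calling `commit` without `deduplicate` keeps the annotated points in 'raw', so they stay open to modification; calling both locks them.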
@phurwicz phurwicz added the bug Something isn't working label Dec 9, 2020
@phurwicz phurwicz self-assigned this Dec 9, 2020
@phurwicz
Owner Author

Resolved with hover 0.3.0 and the catchup commit.
