Label encoding is bugged in active learning example #2

phurwicz · 2020-12-09T09:49:05Z

This happens when the 'raw' subset of the dataset has a label that the 'dev' subset does not.

SupervisableDataset assumes that labels in the 'raw' subset are irrelevant while those in 'train' are relevant, which is all well and good.
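A minimal sketch of how this kind of mismatch can surface, in plain Python. The helper `build_encoder` and the example labels are hypothetical and only illustrate the failure mode; this is not hover's actual encoding code.

```python
# Hypothetical illustration of the mismatch; not hover's implementation.
def build_encoder(labels):
    """Map each distinct label to an integer index, in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}

dev_labels = ["positive", "negative", "positive"]
raw_labels = ["positive", "neutral"]  # "neutral" never appears in 'dev'

# The encoder only knows the labels of the subset it was built from.
encoder = build_encoder(dev_labels)

# Any 'raw' label missing from the encoder cannot be encoded;
# a direct lookup such as encoder["neutral"] would raise KeyError.
unseen = [label for label in raw_labels if label not in encoder]
```

Here `unseen` collects the labels that would break a direct lookup, which is the situation described above: a label present in 'raw' but absent from the subset the encoder was built on.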

However, it should be made clear how and when annotated data points in 'raw' get committed to 'train'. It should also be clear that there are times when one wants to deduplicate 'raw' against 'train' (i.e. to 'lock' those annotated points) and times when one doesn't (i.e. to keep those points open to modification).

These 'commit' and 'deduplicate' actions should be accessible as app-level (cross-explorer) widgets.
We also need a button to push dataframe updates to explorer sources. This isn't automatic due to performance considerations. Perhaps add a scheduled pull/push from dataframes to sources?
update_population and retrain_model should read the 'train' set rather than the 'raw' set.
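The 'commit' and 'deduplicate' actions could look roughly like the sketch below. Everything here is an assumption for illustration: the function names, the list-of-dicts representation, the `"text"` key, and the `"ABSTAIN"` placeholder for unannotated points are not taken from hover's API.

```python
# Hypothetical sketch of 'commit' and 'deduplicate'; not hover's API.
def commit(raw, train, abstain="ABSTAIN"):
    """Return a new 'train' list with copies of the annotated 'raw' rows appended."""
    committed = [dict(row) for row in raw if row["label"] != abstain]
    return train + committed

def deduplicate(raw, train, key="text"):
    """Return 'raw' without rows whose key already appears in 'train' (locks them)."""
    locked = {row[key] for row in train}
    return [row for row in raw if row[key] not in locked]

raw = [
    {"text": "a", "label": "ABSTAIN"},   # not annotated yet
    {"text": "b", "label": "positive"},  # annotated in 'raw'
]
train = [{"text": "c", "label": "negative"}]

# Commit copies annotated points into 'train'; 'raw' itself is unchanged.
train = commit(raw, train)

# Deduplicating afterwards locks the committed point by dropping it from 'raw'.
raw_locked = deduplicate(raw, train)
```

Skipping the `deduplicate` step leaves the committed point in 'raw', so it stays open to modification. A model-training step would then read `train` rather than `raw`.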

phurwicz · 2020-12-12T16:13:10Z

Resolved with hover 0.3.0 and the catchup commit.

phurwicz added the bug label Dec 9, 2020

phurwicz self-assigned this Dec 9, 2020

phurwicz closed this as completed Dec 12, 2020
