Search-tag on labels, resets prior annotation on text classification hand labeling with multi_label=True #1711

Closed
dhruvsakalley opened this issue Sep 10, 2022 · 6 comments · Fixed by #1736
Labels
type: bug Indicates an unexpected problem or unintended behavior

dhruvsakalley commented Sep 10, 2022

When hand labeling a text classification dataset with multiple labels, the search function on labels appears to clear all prior annotations once the search is closed. This is a major bug because it is not immediately apparent that the prior annotation labels have been reset, since they are out of visible scope. The problem is even more pronounced when working with a large number of labels.
Steps to reproduce:
Create a DatasetForTextClassification with an array of records created using:

import rubrix as rb

def make_record(row):
    # Build one multi-label text classification record from a DataFrame row
    record = rb.TextClassificationRecord(
        text=row["text"],
        multi_label=True,
    )
    return record

# df is a pandas DataFrame with a "text" column
records = []
for idx, row in df.iterrows():
    records.append(make_record(row))
dataset_rb = rb.DatasetForTextClassification(records)

Assign a large number of labels to the dataset:

  settings = rb.TextClassificationSettings(label_schema=get_lots_of_labels())

  # apply settings to new or already existing dataset
  rb.configure_dataset("my_dataset_name", settings=settings)

  # logging to the newly created dataset triggers the validation checks
  rb.log(dataset_rb, "my_dataset_name")

Switch to the web app and try hand labeling. Use the search on the labels (not on the records) to find and toggle labels: try a few search strings, clearing the search string after making each selection. Only the most recent labels keep their state; all prior label toggles are reset.

Appears to be a state management issue.
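
The suspected failure mode can be modeled in a few lines. This is only a sketch of the symptom described above, not Rubrix's actual frontend code: the label list and the functions `search`, `toggle_losing_state`, `toggle_keeping_state`, and `annotate` are hypothetical names made up for illustration. It assumes the bug comes from rebuilding the selection from the labels that are visible under the current search filter.

```python
# Hypothetical label schema, used only for this illustration
ALL_LABELS = ["billing", "bug", "feature", "login", "shipping"]

def search(query):
    """Return the labels visible for a given search string."""
    return [label for label in ALL_LABELS if query in label]

def toggle_losing_state(selected, visible, clicked):
    """Assumed buggy model: the selection is rebuilt from visible labels only,
    so anything selected under an earlier search is dropped."""
    still_selected = {label for label in selected if label in visible}
    return still_selected ^ {clicked}

def toggle_keeping_state(selected, visible, clicked):
    """Expected model: toggle the clicked label and keep every other
    selection, regardless of what the current filter shows."""
    return set(selected) ^ {clicked}

def annotate(toggle):
    """Select one label under each of two different search strings."""
    selected = set()
    for query, clicked in [("bill", "billing"), ("log", "login")]:
        selected = toggle(selected, search(query), clicked)
    return selected

buggy_result = annotate(toggle_losing_state)   # only the last toggle survives
fixed_result = annotate(toggle_keeping_state)  # both toggles survive
```

In this model the selection has to be stored independently of the search filter; the filter should only decide which labels are displayed.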

dhruvsakalley changed the title from "Bug Report: Text Classification Manual Annotation with Multi Labels" to "Bug Report: Search resets annotation on text classification hand labeling with multi_label=True" on Sep 10, 2022

frascuchon (Member) commented

Thanks for reporting @dhruvsakalley

We will take a look at this problem as soon as possible.

@frascuchon frascuchon added this to the v0.18.0 milestone Sep 13, 2022
@frascuchon frascuchon added the type: bug Indicates an unexpected problem or unintended behavior label Sep 13, 2022
dhruvsakalley (Author) commented

Thank you for the prompt response, @frascuchon. Similar behavior can be replicated while bulk annotating: it clears any prior annotations. Now I wonder whether the other controls respect multi_label = True.

dhruvsakalley changed the title from "Bug Report: Search resets annotation on text classification hand labeling with multi_label=True" to "Search-tag on labels, resets annotation on text classification hand labeling with multi_label=True" on Sep 19, 2022
dhruvsakalley changed the title from "Search-tag on labels, resets annotation on text classification hand labeling with multi_label=True" to "Search-tag on labels, resets prior annotation on text classification hand labeling with multi_label=True" on Sep 19, 2022
frascuchon pushed a commit that referenced this issue Sep 28, 2022
Closes #1711

Also, add some non-regression unit tests with Jest

(cherry picked from commit fcd6c81)
dvsrepo (Member) commented Oct 11, 2022

> Thank you for the prompt response, @frascuchon. Similar behavior can be replicated while bulk annotating: it clears any prior annotations. Now I wonder whether the other controls respect multi_label = True.

Dear @dhruvsakalley, sorry for the late heads-up. This should be fixed in 0.18.0, which we released last week.

Let us know if you find any issues.

dhruvsakalley (Author) commented

Hi, I can confirm the issue is partially fixed: searching tags and annotating now seem to work as expected. However, bulk annotation with "annotate as" still overwrites prior labels, much as search-and-annotate used to. My apologies if this is the intended way of working, but it does seem like a related issue. I can open another issue on the topic if you can confirm this is a bug and not a feature.

frascuchon (Member) commented

Hi @dhruvsakalley

Yes, this is the expected behavior. Bulk annotation sets the selected labels as the annotated ones. Indeed, in some cases when working with multi-label text classification, this partial bulk annotation could be useful.

Let us discuss this internally to evaluate the feature, @dvsrepo @davidberenstein1957.

Again, thanks for your feedback!
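
The two behaviors discussed in this exchange can be contrasted with a small sketch. It is only a model of the semantics, not Rubrix's implementation: `bulk_annotate_replace` and `bulk_annotate_merge` are hypothetical names, and records are represented as a plain dict from record id to a set of labels.

```python
def bulk_annotate_replace(records, labels):
    """Replace semantics (the behavior confirmed as expected above):
    every selected record ends up with exactly the chosen labels."""
    return {record_id: set(labels) for record_id in records}

def bulk_annotate_merge(records, labels):
    """Merge semantics (the partial bulk annotation under discussion):
    the chosen labels are added to whatever each record already has."""
    return {
        record_id: existing | set(labels)
        for record_id, existing in records.items()
    }

# Hypothetical data: record 1 already carries a label, record 2 has none
records = {1: {"bug"}, 2: set()}

replaced = bulk_annotate_replace(records, ["urgent"])  # "bug" is lost on record 1
merged = bulk_annotate_merge(records, ["urgent"])      # "bug" is kept on record 1
```

With replace semantics, earlier labels on any selected record are overwritten; with merge semantics, they survive the bulk action.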

dhruvsakalley (Author) commented Oct 24, 2022

Thanks for confirming.
I would like to add that resetting prior annotations without confirmation can lead to lost work. It might be useful to have an undo for accidents like these. Some tools, like Prodigy, keep track of the last n actions in the session and commit as a separate step. I find this very useful as a quick way to go back and change a label based on a new observation, or to undo a mistake, which makes the annotation flow faster.
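
The suggestion above (a bounded history of recent actions plus a separate commit step) might look roughly like the following. This is a minimal sketch under stated assumptions, not how Prodigy or Rubrix implement it: the `AnnotationSession` class and all of its methods are hypothetical.

```python
from collections import deque

class AnnotationSession:
    """Hypothetical session that remembers the last n label changes."""

    def __init__(self, history_size=3):
        self.labels = {}     # record id -> set of labels, not yet committed
        self.committed = {}  # snapshot written by commit()
        # deque with maxlen silently drops the oldest entry once full
        self.history = deque(maxlen=history_size)

    def set_labels(self, record_id, labels):
        # Remember the previous state (None if the record was unannotated)
        self.history.append((record_id, self.labels.get(record_id)))
        self.labels[record_id] = set(labels)

    def undo(self):
        """Revert the most recent change; return False if nothing to undo."""
        if not self.history:
            return False
        record_id, previous = self.history.pop()
        if previous is None:
            self.labels.pop(record_id, None)
        else:
            self.labels[record_id] = previous
        return True

    def commit(self):
        """Persist the current labels as a separate, explicit step."""
        self.committed = {rid: set(ls) for rid, ls in self.labels.items()}
        self.history.clear()

session = AnnotationSession()
session.set_labels("r1", {"bug", "login"})
session.set_labels("r1", {"urgent"})  # accidental overwrite
session.undo()                        # restores {"bug", "login"}
session.commit()
```

Because nothing is persisted until `commit()` is called, an accidental overwrite can be reverted before it reaches the stored dataset.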
