LocalLabel DoesNotExist errors between labelset-change and classifiers-reset #545

Open
StephenChan opened this issue Apr 15, 2024 · 0 comments
StephenChan commented Apr 15, 2024

Sequence:

  • Source owner edits the labelset, removing a label L that is used in some machine annotations in that source.

  • Upon submission of that form, CoralNet immediately changes the labelset, then queues a reset-classifiers async Job, which would delete the source's existing classifiers and machine annotations.

  • But before the reset-classifiers async Job actually gets to run, one of the following happens:

    • Someone visits the annotation tool for an image which contains machine annotations of label L
    • Someone visits the source's Backend page which shows classifier stats for a classifier containing label L
    • A classifier containing label L is used to classify some images

    In each of these cases, the code tries to look up the labelset entry (a LocalLabel) for label L, but that entry no longer exists, so the lookup fails with the error DoesNotExist: LocalLabel matching query does not exist.
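To make the timing concrete, here is a minimal in-memory sketch of the sequence above. It is not CoralNet code: the dict, the list, the queue, and every function name are hypothetical stand-ins, and a plain exception class stands in for Django's LocalLabel.DoesNotExist.

```python
from collections import deque


class LocalLabelDoesNotExist(Exception):
    """Hypothetical stand-in for Django's LocalLabel.DoesNotExist."""


# Hypothetical source state: label -> labelset-specific short code.
labelset = {"L": "Lcode", "M": "Mcode"}
# Machine annotations as (image, label) pairs.
machine_annotations = [("image1", "L"), ("image2", "M")]
# Queue of async Jobs that have not run yet.
job_queue = deque()


def get_local_label(label):
    """Look up the labelset entry for a label, failing if it was removed."""
    try:
        return labelset[label]
    except KeyError:
        raise LocalLabelDoesNotExist(
            "LocalLabel matching query does not exist.") from None


def reset_classifiers():
    """Stand-in for the reset Job: drop the source's machine annotations."""
    machine_annotations.clear()


def submit_labelset_edit(removed_label):
    """The labelset changes immediately; the reset is only queued."""
    del labelset[removed_label]
    job_queue.append(reset_classifiers)


def annotation_tool_codes(image):
    """Short codes the annotation tool would need for one image."""
    return [get_local_label(label)
            for img, label in machine_annotations if img == image]


submit_labelset_edit("L")

# Interim period: the Job is queued but has not run.
try:
    annotation_tool_codes("image1")
    error_seen = False
except LocalLabelDoesNotExist:
    error_seen = True

# Once the queued Job runs, the stale annotations are gone.
job_queue.popleft()()
codes_after_reset = annotation_tool_codes("image1")
```

In this sketch the lookup fails only between the form submission and the Job run, which matches the temporary nature of the errors.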

This situation is temporary: it lasts only until the classifiers are reset, which normally takes several minutes, or perhaps a couple of hours if the site is working through a queue of tasks. However, the image-classification case in particular can generate many of these errors in a short time (e.g. about 100 such errors in 1 minute on 2024/04/14).

Possible strategies:

  1. Delay the actual labelset change by adding that to the async Job.
  2. Queue the async Job as 'real time' instead of 'background'.
  3. Be able to work with annotations for a label that's no longer in the source's labelset. This might just entail using the label's default short code as a fallback when there isn't a labelset-specific short code, but there might be something else involved too.
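As a rough illustration of the fallback mentioned in strategy 3, the sketch below picks the labelset-specific short code when one exists and otherwise falls back to the label's default short code. The function name, the dict-based data model, and the example codes are all assumptions made for illustration; the real lookup would go through the LocalLabel model and may involve more than this.

```python
def short_code_for(label_id, local_short_codes, default_short_codes):
    """Return the short code to display for a label.

    local_short_codes: hypothetical mapping of label id -> labelset-specific
        short code, covering only labels currently in the source's labelset.
    default_short_codes: hypothetical mapping of label id -> default short
        code, covering every label on the site.
    """
    local_code = local_short_codes.get(label_id)
    if local_code is not None:
        return local_code
    # The label is no longer in the labelset: tolerate it rather than fail.
    return default_short_codes[label_id]


# Label 1 was just removed from the labelset; label 2 is still in it.
default_short_codes = {1: "Acrop", 2: "Porit"}
local_short_codes = {2: "POR"}

code_for_removed = short_code_for(1, local_short_codes, default_short_codes)
code_for_present = short_code_for(2, local_short_codes, default_short_codes)
```

Here the removed label still resolves to a usable short code during the interim period instead of raising an error.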

Strategy 1 is appealing because, depending on the order of operations in the Job, it can avoid leaving the source in an "inconsistent" state: there would be no interim period where a label has been removed from the source's labelset but is still part of the source's data. However, we might see these semantics in a different light later, depending on the direction that issue #537 takes us in. Strategy 2 minimizes the interim period instead of eliminating it, but could lead to an arguably better UX where the source owner can wait on the page for the reset Job to actually complete (generally takes under a minute). Strategy 3 does nothing about the interim period; it is defensive coding to tolerate that period.
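The point about order of operations in strategy 1 can be sketched as follows, again with plain in-memory data rather than CoralNet's models. All names here are hypothetical. The Job clears the classifiers and machine annotations first and only then removes the labels, and a helper checks after each step that no machine annotation refers to a label missing from the labelset.

```python
def dangling_labels(labelset, machine_annotations):
    """Labels used by machine annotations but missing from the labelset."""
    return {label for _image, label in machine_annotations} - labelset


def run_labelset_change_job(source, removed_labels, observe):
    """Hypothetical Job for strategy 1: reset first, change labelset last."""
    source["classifiers"].clear()
    observe(source)
    source["machine_annotations"].clear()
    observe(source)
    # Only now is the labelset itself changed.
    source["labelset"] -= set(removed_labels)
    observe(source)


source = {
    "labelset": {"L", "M"},
    "machine_annotations": [("image1", "L"), ("image2", "M")],
    "classifiers": ["classifier1"],
}

# For contrast: changing the labelset first leaves label L dangling.
dangling_if_changed_first = dangling_labels(
    source["labelset"] - {"L"}, source["machine_annotations"])

# Record the dangling labels seen after each step of the Job.
observations = []
run_labelset_change_job(
    source,
    removed_labels=["L"],
    observe=lambda s: observations.append(
        sorted(dangling_labels(s["labelset"], s["machine_annotations"]))),
)
```

With this ordering, every observation is empty, whereas applying the labelset change first leaves label L dangling until the reset happens.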

Multiple strategies may be worth doing, though either strategy 1 or strategy 3 on its own should suffice for resolving this issue at the moment. Again, there could be a complication with strategy 3 that I'm not thinking of right now.
