Fix that failing bucket requests marked buckets as missing on server #7477

Merged
merged 8 commits into master from fix-incorrect-missing-bucket-state on Dec 5, 2023

Conversation

@philippotto (Member) commented Dec 5, 2023

This PR fixes an incorrect handling of failed bucket requests. #4740 added a try-catch around the bucket fetching logic for better error reporting. The reporting itself is fine, but the catch clause returned null. However, null values are reserved for buckets that don't exist on the server. Thus, if bucket requests failed for some reason, the front-end marked these buckets as missing and assumed that a retry was not necessary (see the sketch after the list below). This had (at least) two implications:

  • Failed bucket requests were not retried at all (only when manually reloading a layer, or when moving somewhere else and then moving back, which usually triggers garbage collection) and the affected buckets just remained black (or were rendered with fallback data).
  • When annotating volume data on buckets whose requests had failed, the bucket was assumed to be empty/missing on the server. Thus, existing non-zero voxels (e.g., from fallback data or from previous annotations) were lost and overwritten with 0.
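For concreteness, here is a minimal TypeScript sketch of the old and the new behavior. The function names (requestBucketFromServer, reportError, fetchBucketBefore/After) are illustrative stand-ins, not the actual webknossos identifiers:

```typescript
declare function requestBucketFromServer(bucketAddress: string): Promise<Uint8Array>;
declare function reportError(error: unknown): void;

// Old behavior: a failed request was indistinguishable from a bucket that
// does not exist on the server, because both code paths returned null.
async function fetchBucketBefore(bucketAddress: string): Promise<Uint8Array | null> {
  try {
    return await requestBucketFromServer(bucketAddress);
  } catch (error) {
    reportError(error);
    // Bug: null is reserved for "bucket does not exist on the server", so the
    // front-end marked this bucket as missing and never retried it.
    return null;
  }
}

// New behavior: the error is reported and then re-thrown, so the caller (the
// pull queue) sees the failure and can schedule a retry instead of treating
// the bucket as missing.
async function fetchBucketAfter(bucketAddress: string): Promise<Uint8Array | null> {
  try {
    return await requestBucketFromServer(bucketAddress);
  } catch (error) {
    reportError(error);
    throw error;
  }
}
```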

Additionally, this PR makes the (now working) retry mechanism a bit less aggressive. This is more of a band-aid fix, since the actual architecture could use a slight refactoring. However, I think this is out of scope for this PR.
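The PR text does not spell out how exactly the retrying was softened, but as an illustration, a less aggressive retry typically means bounding the number of attempts and backing off between them. The constants and helper names below are assumptions, not the actual implementation:

```typescript
declare function requestBucketFromServer(bucketAddress: string): Promise<Uint8Array>;

const MAX_RETRY_COUNT = 5;
const BASE_RETRY_DELAY_MS = 500;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function fetchWithRetries(bucketAddress: string): Promise<Uint8Array> {
  let lastError: unknown;
  for (let attempt = 0; attempt < MAX_RETRY_COUNT; attempt++) {
    try {
      return await requestBucketFromServer(bucketAddress);
    } catch (error) {
      lastError = error;
      // Back off exponentially so repeated failures don't hammer the server.
      await sleep(BASE_RETRY_DELAY_MS * 2 ** attempt);
    }
  }
  throw lastError;
}
```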

URL of deployed dev instance (used for testing):

Steps to test:

  • scenario 1:
    • view a dataset
    • make tab offline (via devtools)
    • move somewhere else
    • data should be invisible
    • make tab online again
    • move a tiny bit -> buckets should now be requested again
      • ideally, moving should not be necessary, but I think this is not critical for this bug fix
  • scenario 2:
    • create an annotation with fallback data
    • make tab offline
    • reload the segmentation layer --> should be empty now
    • brush something
    • variant 1:
      • click save --> everything should turn red because the tab is offline
      • make tab online again
      • wait --> saving should recover
      • reload annotation --> everything should look as expected
    • variant 2:
      • click undo
      • make tab online again and follow steps from variant 1
    • variant 3:
      • click undo
      • click redo
      • make tab online again and follow steps from variant 1

TODOs:

  • re-enable commented-out lines

Issues:


(Please delete unneeded items, merge only when none are left open)

…alate the error again so that it will be retried by the pull queue
@philippotto self-assigned this on Dec 5, 2023

@daniel-wer (Member) left a comment

Great find! Worked very well for me 👍

Additionally, this PR makes the (now working) retry mechanism a bit less aggressive. This is more of a bandaid fix, since the actual architecture could use a slight refactoring. However, I think this is out of scope for this PR.

What are you thinking of there? :)

@philippotto (Member, Author) commented:

Additionally, this PR makes the (now working) retry mechanism a bit less aggressive. This is more of a bandaid fix, since the actual architecture could use a slight refactoring. However, I think this is out of scope for this PR.

What are you thinking of there? :)

The PullQueue is responsible for fetching buckets, but it relies on being triggered (via pull) by other components (though it also does so itself in the retry case). This means that it's hard to predict at what frequency pull is called. I think it would be nice if the PullQueue itself were responsible for calling pull. It could do so in reaction to add() and then take care of throttling etc.
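As a rough TypeScript sketch of that idea (all names are made up; this is not the actual webknossos PullQueue API), the queue could schedule its own throttled pull whenever something is added:

```typescript
import _ from "lodash";

declare function requestBucketFromServer(bucketAddress: string): Promise<Uint8Array>;

class SelfDrivenPullQueue {
  private queue: string[] = [];

  // At most one pull per 200 ms, no matter how often add() is called.
  private throttledPull = _.throttle(() => this.pull(), 200);

  add(bucketAddress: string): void {
    this.queue.push(bucketAddress);
    // The queue triggers its own pull instead of relying on callers to do it.
    this.throttledPull();
  }

  private async pull(): Promise<void> {
    const batch = this.queue.splice(0, this.queue.length);
    await Promise.all(
      batch.map(async (address) => {
        try {
          await requestBucketFromServer(address);
        } catch {
          // Re-enqueue failed buckets so a later pull retries them.
          this.queue.push(address);
          this.throttledPull();
        }
      }),
    );
  }
}
```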

I created #7480 for this :)

@philippotto merged commit b5744e4 into master on Dec 5, 2023
2 checks passed
@philippotto deleted the fix-incorrect-missing-bucket-state branch on December 5, 2023 at 18:10
Successfully merging this pull request may close these issues.

Retrying failed bucket requests does not work consistently