Uploading a new file should still set its location metadata to *something*, even if not perfect #554

Closed
brendanheywood opened this issue Mar 22, 2023 · 0 comments · Fixed by #555

Comments

@brendanheywood
Contributor

This is a regression caused by #524 (fyi @danmarsden )

Previous state:

We used to upload a file, check whether it existed in the remote store, and then set its location to either remote or local accordingly.

Current state:

We do nothing at all, which means the file is uploaded without any objectfs metadata, so we rely on the check_objects_location scheduled task to backfill the data. check_objects_location is an extremely expensive task to run on a large site. We only really want to run it to catch very weird edge cases that aren't caught as files are uploaded, and that catching on upload is no longer happening. On our sites we have throttled this task right back to run once a week or even once a month. The unintended side effect is that, because this task hasn't run, objectfs doesn't know about the files, so they sit in site data building up and then get shifted once a month. This has only started happening since this code has progressively rolled out.

Proposed state:
When we upload a file, we still don't want to touch the object store, which means we can't correctly determine whether the file is actually there. But we can apply some heuristics and make a pretty good guess:

  1. Worst case, we apply no heuristic and just say the file is local. If the file is also remote, it doesn't matter: the metadata will get corrected when the sync is attempted and finds the file already there.
  2. We can look for any other files which have the same hash and copy their metadata. This should be correct, but on the off chance it is wrong then it was already wrong, and the various scheduled tasks will eventually fix it up anyway.
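
To make the two heuristics concrete, here is a minimal sketch in Python. It is not the plugin's code (objectfs is PHP, and the real fix is in #555). The names `Location`, `guess_location_on_upload`, `record_upload`, and the in-memory `known` dict standing in for the objectfs metadata table are all hypothetical, assumed only for illustration.

```python
from enum import Enum
from typing import Dict


class Location(Enum):
    # Hypothetical location values; the issue only distinguishes local vs remote.
    LOCAL = "local"
    REMOTE = "remote"


def guess_location_on_upload(contenthash: str, known: Dict[str, Location]) -> Location:
    """Guess a file's location without contacting the object store.

    `known` is an assumed stand-in for existing metadata, keyed by content hash.
    """
    # Heuristic 2: another file with the same hash already has metadata, so copy it.
    existing = known.get(contenthash)
    if existing is not None:
        return existing
    # Heuristic 1 (fallback): assume local; a later sync can correct this.
    return Location.LOCAL


def record_upload(contenthash: str, known: Dict[str, Location]) -> Location:
    """Store the guessed location so the file always has *some* metadata."""
    location = guess_location_on_upload(contenthash, known)
    known[contenthash] = location
    return location
```

Either way the upload path never calls the remote store, and a wrong guess is left for the sync or the scheduled tasks to correct.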
nhoobin pushed a commit that referenced this issue Mar 28, 2023
brendanheywood added a commit that referenced this issue Apr 12, 2023
Add location metadata on upload (3.3 stable) #554