Mark object store bucket as full and move new files into new bucket #22039

Open
MorrisJobke opened this issue Jul 29, 2020 · 1 comment
Labels: 1. to develop (Accepted and waiting to be taken care of), enhancement, feature: object storage

Comments

@MorrisJobke
Member

Problem

  • object store vendors recommend a maximum number of objects per bucket
  • Nextcloud should respect those limits: if a bucket is close to this value, no more files should be put into it
  • new files should be put into a newly created bucket
  • multibucket setups should be able to be marked as "extensible": there is a default bucket count, and once the capacity of a bucket is reached, a new bucket is created
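
To make the "extensible" idea concrete, here is a minimal sketch in Python (not Nextcloud code). The class name `ExtensibleBucketPool`, the `bucket_` prefix, and the in-memory counters are all assumptions for illustration; a real implementation would take the counts from the database and the limit from configuration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ExtensibleBucketPool:
    """Hypothetical pool that starts with a default bucket count and grows."""

    default_bucket_count: int
    max_objects_per_bucket: int  # assumed to follow the vendor recommendation
    prefix: str = "bucket_"      # hypothetical naming scheme
    counts: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Create the default buckets up front, all empty.
        for i in range(self.default_bucket_count):
            self.counts[f"{self.prefix}{i}"] = 0

    def bucket_for_new_file(self) -> str:
        # Prefer the least-filled existing bucket.
        name = min(self.counts, key=self.counts.get, default=None)
        if name is None or self.counts[name] >= self.max_objects_per_bucket:
            # Every bucket has reached the limit: extend the pool by one.
            name = f"{self.prefix}{len(self.counts)}"
            self.counts[name] = 0
        self.counts[name] += 1
        return name
```

With two default buckets and a limit of two objects each, the fifth new file would be the first one to land in a newly created bucket.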

Things to keep in mind

  • we should look into storing the bucket a file is located in instead of calculating it on the fly:
    • instead of having this implicitly (calculating the hash and applying a formula that decides which bucket this hash goes into), we should store the result in the database, so that files that would otherwise share a bucket can be moved based on the number of files in that bucket
  • what about a migration? It would allow rebalancing buckets that are already over this limit and misbehave for that reason
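
A rough sketch of the difference between the implicit and the stored mapping, written in Python for brevity. Everything here is an assumption for illustration: the MD5-modulo formula is not claimed to be the one Nextcloud uses, `StoredBucketMap` is an in-memory stand-in for a database table, and `rebalance` only updates the mapping (a real migration would also have to copy the objects).

```python
import hashlib
from typing import Dict


def implicit_bucket(file_id: int, bucket_count: int) -> int:
    """Compute the bucket on the fly from a hash (illustrative formula only)."""
    digest = hashlib.md5(str(file_id).encode()).hexdigest()
    return int(digest, 16) % bucket_count


class StoredBucketMap:
    """In-memory stand-in for a table mapping file id -> bucket index."""

    def __init__(self, hash_buckets: int) -> None:
        self.hash_buckets = hash_buckets   # buckets addressed by the hash
        self.total_buckets = hash_buckets  # grows when overflow buckets are added
        self.assigned: Dict[int, int] = {}

    def bucket_of(self, file_id: int) -> int:
        # Store the computed bucket on first use; afterwards the stored value wins.
        if file_id not in self.assigned:
            self.assigned[file_id] = implicit_bucket(file_id, self.hash_buckets)
        return self.assigned[file_id]

    def count(self, bucket: int) -> int:
        return sum(1 for b in self.assigned.values() if b == bucket)

    def rebalance(self, limit: int) -> None:
        """Reassign surplus files from over-limit buckets to new buckets."""
        for bucket in range(self.hash_buckets):
            files = sorted(f for f, b in self.assigned.items() if b == bucket)
            for file_id in files[limit:]:
                newest = self.total_buckets - 1
                if newest < self.hash_buckets or self.count(newest) >= limit:
                    # No overflow bucket yet, or the newest one is full.
                    self.total_buckets += 1
                    newest = self.total_buckets - 1
                self.assigned[file_id] = newest
```

Because the lookup goes through the stored value, a file can be moved without its id or hash changing, which the purely computed mapping cannot express.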

@rullzer @kesselb @icewind1991 Feedback and ideas are welcome.

@icewind1991
Member

My idea would be to save "ranges" of fileids in a table for each multibucket storage, to split fileids into further buckets

With a table looking something like:

storage bucket    start    end     postfix
"bucket_1"        1        100     ""
"bucket_1"        101      null    "a"
"bucket_2"        1        null    ""

meaning that files stored within the multibucket storage "bucket_1" with fileids between 1 and 100 will be stored in bucket "bucket_1", and those with fileids from 101 onward in "bucket_1a".
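
The lookup this table implies could be sketched as follows (Python, illustration only). The `BucketRange` type and `resolve_bucket` function are hypothetical names; the rows mirror the example table, with `None` standing in for `null`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BucketRange:
    storage_bucket: str
    start: int
    end: Optional[int]  # None means the range is still open
    postfix: str


# The three rows from the example table.
RANGES = [
    BucketRange("bucket_1", 1, 100, ""),
    BucketRange("bucket_1", 101, None, "a"),
    BucketRange("bucket_2", 1, None, ""),
]


def resolve_bucket(ranges: List[BucketRange], storage_bucket: str, file_id: int) -> str:
    """Return the final bucket name for a fileid within a storage bucket."""
    for r in ranges:
        if r.storage_bucket != storage_bucket:
            continue
        if r.start <= file_id and (r.end is None or file_id <= r.end):
            return r.storage_bucket + r.postfix
    raise LookupError(f"no range covers fileid {file_id} in {storage_bucket}")
```

For example, `resolve_bucket(RANGES, "bucket_1", 50)` yields `"bucket_1"`, while fileid 101 in the same storage bucket resolves to `"bucket_1a"`.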

Since the fileids are already spread over multiple buckets, the ranges for different "storage buckets" overlap, and the bucket "bucket_1" will contain fewer files than its range might suggest.

When the storage needs to call the object storage, it would look up the range the fileid belongs to in order to find the final bucket name. When a new object is written, the number of objects in the current, final range can be queried from the filecache, and if the configured limit has been exceeded, a new range is created.
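
A possible shape for the write path, again as a Python sketch under stated assumptions. `count_objects` is a hypothetical callable standing in for the filecache query; the postfix sequence ("" then "a", "b", ...) is guessed from the example and not specified in the proposal; and the new fileid is assumed to be larger than every existing one. The sketch treats reaching the limit as "full".

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[object]]


def next_postfix(postfix: str) -> str:
    """Hypothetical naming scheme: "" -> "a" -> "b" ... up to "z"."""
    if postfix == "":
        return "a"
    if len(postfix) != 1 or not "a" <= postfix < "z":
        raise ValueError(f"no postfix after {postfix!r} in this sketch")
    return chr(ord(postfix) + 1)


def bucket_for_write(
    ranges: List[Row],
    storage_bucket: str,
    file_id: int,
    limit: int,
    count_objects: Callable[[str, int], int],
) -> str:
    """Pick the bucket for a new object, opening a new range when needed."""
    # The open range is the row for this storage bucket whose end is None.
    current = next(
        r for r in ranges
        if r["storage_bucket"] == storage_bucket and r["end"] is None
    )
    # Count the objects with fileid >= start of the open range.
    if count_objects(storage_bucket, current["start"]) >= limit:
        current["end"] = file_id - 1  # close the full range
        current = {
            "storage_bucket": storage_bucket,
            "start": file_id,
            "end": None,
            "postfix": next_postfix(current["postfix"]),
        }
        ranges.append(current)
    return storage_bucket + current["postfix"]
```

Only the open range is ever counted, so the query stays bounded to the newest fileids rather than scanning the whole storage.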

@rullzer modified the milestones: Nextcloud 21, Nextcloud 21.0.1 (Mar 1, 2021)
@szaimen added the label "1. to develop" (Accepted and waiting to be taken care of) and removed the label "0. Needs triage" (Pending check for reproducibility or if it fits our roadmap) (Aug 8, 2021)
@blizzz modified the milestones: Nextcloud 21.0.5, Nextcloud 23 (Sep 30, 2021)
@blizzz modified the milestones: Nextcloud 23, Nextcloud 24 (Nov 30, 2021)
@blizzz modified the milestones: Nextcloud 24, Nextcloud 25 (Apr 21, 2022)
@blizzz modified the milestones: Nextcloud 25, Nextcloud 26 (Oct 19, 2022)
@blizzz removed this from the Nextcloud 26 milestone (Mar 9, 2023)
@blizzz added this to the Nextcloud 27 milestone (Mar 9, 2023)
@skjnldsv removed this from the Nextcloud 27.0.2 milestone (Aug 8, 2023)
6 participants