
Add dataset size limits #191

Merged · 18 commits · Sep 28, 2021
Conversation

HeDo88TH
Collaborator

@HeDo88TH HeDo88TH commented Sep 27, 2021

This PR closes #31 by adding support for the maxStorageMB user metadata field.
A new configuration key, EnableStorageLimiter, enables this limiter; when false, the user's available storage space is not checked.
A new endpoint, GET /users/storage, returns the current user's storage info:

{
  "total": 314572800,
  "used": 307377818
}
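For a client of the new endpoint, the response is easy to turn into a remaining-space figure. A minimal sketch (the storage_summary helper is mine for illustration, not part of the Registry API; both fields are in bytes, per the example above):

```python
def storage_summary(info: dict) -> dict:
    """Derive remaining bytes and percentage used from a
    GET /users/storage response (both fields in bytes)."""
    total, used = info["total"], info["used"]
    return {
        "remaining": total - used,
        "used_pct": round(100 * used / total, 1),
    }

# The example response from above:
sample = {"total": 314572800, "used": 307377818}
print(storage_summary(sample))  # {'remaining': 7194982, 'used_pct': 97.7}
```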

When the user goes beyond their quota, a new MaxUserStorageException is thrown.

@HeDo88TH HeDo88TH added the enhancement New feature or request label Sep 27, 2021
@HeDo88TH HeDo88TH self-assigned this Sep 27, 2021
@HeDo88TH
Collaborator Author

In Hub we need to change the behaviour of Share, because it keeps retrying even after the server returns a quota-exceeded error. Tracked in DroneDB/Hub#47.

@HeDo88TH HeDo88TH marked this pull request as ready for review September 27, 2021 16:21
// It currently enumerates all the datasets and asks DDB for their size.
// If we notice a slowdown of the upload/share/push process, this could be the culprit.
// A simple cache layer could be the solution, but it would need to be kept in sync
// with all dataset operations. A real pain if it is not necessary.
Contributor

@pierotofy pierotofy Sep 27, 2021


We should at least memoize this. https://github.com/eirikt/Memoizer.NET

It's OK if we allow a user to exceed their quota for a brief period of time, so long as at some point we detect that the quota has been exceeded and can take action.

Collaborator Author


Very cool library, check #191 (comment)

@HeDo88TH
Collaborator Author

I performed some empirical tests on my machine, measuring the wait time after a new file upload for 1 to 200 datasets.
I'm using the cached S3 configuration. The wait times are per file upload.

Datasets   Wait time (s)
1          0.18
5          0.20
10         0.28
20         0.46
30         0.52
40         0.69
50         0.76
75         1.02
100        1.40
150        2.01
200        2.40

[chart: wait time (s) vs. number of datasets]

There is definitely a linear correlation between the wait time and the number of datasets.
We can optimize this, but I do not think it is high on our priority list @pierotofy
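The linearity claim can be checked with an ordinary least-squares fit over the measurements above (the data points are from the table; the fit itself is my addition):

```python
# Least-squares fit of wait time vs. dataset count,
# using the measurements from the table above.
xs = [1, 5, 10, 20, 30, 40, 50, 75, 100, 150, 200]
ys = [0.18, 0.20, 0.28, 0.46, 0.52, 0.69, 0.76, 1.02, 1.40, 2.01, 2.40]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

print(f"wait ≈ {intercept:.2f} + {slope:.4f} * datasets")
```

The slope comes out around 0.011-0.012 s per dataset, plus a fixed overhead of roughly 0.2 s per upload, which is consistent with a linear scan over all datasets.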

@pierotofy
Contributor

🐢

@pierotofy
Contributor

I would add this right away, but you decide. It will take more work later imo.

@HeDo88TH
Collaborator Author

I would add this right away, but you decide. It will take more work later imo.

Tracked in #192

@pierotofy
Contributor

pierotofy commented Sep 28, 2021

Works well for the share and file upload endpoints.

I found an issue with ddb push though:

If I clone an empty repository, add lots of files (past my storage limit), and then ddb push, the first ddb push succeeds. Subsequent attempts to push fail with the correct message, but the first one goes unchecked.

To reproduce:

ddb clone <dataset>
cd <dataset>
ddb add [files adding to the storage limit for my user]
ddb push
--> succeeds (incorrect)
ddb add [another file]
ddb push
--> fails (correct)

@HeDo88TH
Collaborator Author

Works well for the share and file upload endpoints.

I found an issue with ddb push though:

If I clone an empty repository, add lots of files (past my storage limit), and then ddb push, the first ddb push succeeds. Subsequent attempts to push fail with the correct message, but the first one goes unchecked.

Fixed in 8f3a3d1
Basically, we now check in the push upload endpoint instead of in init. This way the user can push an "all delete" without getting blocked if they have gone beyond their quota, but they do get blocked if they upload new files.
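The rule described here, allow delete-only pushes even when over quota but block new data at upload time, can be sketched as follows (the Registry itself is C#; the function, its parameters, and the byte values are illustrative, not the actual implementation):

```python
class MaxUserStorageException(Exception):
    """Raised when an upload would push the user past their quota."""

def check_push_upload(used_bytes: int, quota_bytes: int,
                      incoming_bytes: int) -> None:
    """Quota check performed at push-upload time, not at push-init.

    A push that only deletes files (incoming_bytes == 0) is always
    allowed, even if the user is already over quota, so they can free
    up space; only uploads of new data are blocked.
    """
    if incoming_bytes > 0 and used_bytes + incoming_bytes > quota_bytes:
        raise MaxUserStorageException(
            f"upload of {incoming_bytes} bytes would exceed quota "
            f"({used_bytes}/{quota_bytes} bytes used)"
        )

# Already over quota, but a delete-only push is still allowed:
check_push_upload(used_bytes=350, quota_bytes=300, incoming_bytes=0)
```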

@HeDo88TH
Collaborator Author

Improvements tracked in #192
@pierotofy can we merge this?

@pierotofy
Contributor

Looks good. Thanks!

@pierotofy pierotofy merged commit d5bbab3 into DroneDB:master Sep 28, 2021
@HeDo88TH HeDo88TH deleted the add-dataset-size-limits branch October 8, 2021 17:34