Improve file storage structure to improve delete/replace performance in large file sets #13559
Replies: 8 comments
-
I assume this may be related to "find and delete" instead of deleting by absolute path. Do you have any suggestions?
-
That's exactly what it is. It has to delete all files that start with the same prefix. A "delete where file name starts with X" isn't available on all/most platforms, which in turn means that the delete operation has to scan over every file name and delete the file when the name matches the check. There's no real way around that right now. The only way to fix this would be to upgrade the way Directus manages thumbnails in the first place, so we can easily delete a whole folder's worth of thumbnails rather than having to find individual files. So, for example, something like a dedicated thumbnail folder per asset that can be removed in one go.
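To make that concrete, here is a rough TypeScript sketch of the difference. This is not Directus's actual code; the ./uploads path and the per-asset thumbnails folder are just assumptions for illustration.

```ts
import { promises as fs } from 'node:fs';
import path from 'node:path';

const uploads = './uploads';

// Roughly how it works today: every thumbnail shares the asset's filename
// prefix, so a delete has to list the whole folder and unlink each match,
// which is O(total files in the folder) per deleted asset.
async function deleteByPrefix(prefix: string): Promise<void> {
  const names = await fs.readdir(uploads);
  await Promise.all(
    names
      .filter((name) => name.startsWith(prefix))
      .map((name) => fs.unlink(path.join(uploads, name))),
  );
}

// The proposed direction: keep generated thumbnails in one folder per asset,
// so deleting an asset's derivatives is a single recursive remove with no
// scan over unrelated files.
async function deleteAssetThumbnails(assetId: string): Promise<void> {
  await fs.rm(path.join(uploads, 'thumbnails', assetId), { recursive: true, force: true });
}
```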
-
Thank you for your prompt response. Your solution looks simple and great; that should fix it. As a dirty workaround, I created a second "local" storage with a new STORAGE_LOCAL_ROOT="./uploads-0001". Can you confirm that using "S3_DRIVER" won't fix this?
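For reference, if I recall the config pattern correctly, the extra local location is registered through the storage environment variables, roughly like this (the location name "local2" and both paths are placeholders, so double-check against the configuration docs):

```
STORAGE_LOCATIONS="local,local2"

STORAGE_LOCAL_DRIVER="local"
STORAGE_LOCAL_ROOT="./uploads"

STORAGE_LOCAL2_DRIVER="local"
STORAGE_LOCAL2_ROOT="./uploads-0001"
```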
-
It won't fully fix it, but it might run faster. I'm not sure whether AWS can process file reads quicker than Node can read the local file system.
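For what it's worth, S3 can at least list objects by key prefix, which is part of why it might run faster there. A rough sketch with the AWS SDK v3 (bucket, region, and prefix are placeholders; this is not the actual storage driver code):

```ts
import { S3Client, ListObjectsV2Command, DeleteObjectsCommand } from '@aws-sdk/client-s3';

const s3 = new S3Client({ region: 'us-east-1' });

// Delete every object whose key starts with the given prefix, e.g. the file id
// shared by an original and all of its generated thumbnails.
async function deleteByPrefix(bucket: string, prefix: string): Promise<void> {
  let token: string | undefined;
  do {
    const page = await s3.send(
      new ListObjectsV2Command({ Bucket: bucket, Prefix: prefix, ContinuationToken: token }),
    );
    const keys = (page.Contents ?? []).map((obj) => ({ Key: obj.Key! }));
    if (keys.length > 0) {
      // DeleteObjects accepts up to 1000 keys, which matches one list page.
      await s3.send(new DeleteObjectsCommand({ Bucket: bucket, Delete: { Objects: keys } }));
    }
    token = page.NextContinuationToken;
  } while (token);
}
```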
-
Thank you for opening this improvement up for discussion @rijkvanzanten. Here is my honest thought: it is like having a sports car with a 5-litre fuel tank. :) Any project can easily hit 20-30k files, or even way more. When you combine this with user-generated presets and system-generated thumbnails, it can easily multiply by 4-10x. If you have 200-300k files in a folder and try to delete 50-100 of them with find & delete, it takes many minutes on a regular server, and meanwhile the performance drop on the server is pretty significant. Also, if you are using S3, your server IP can be blocked for a limited time; this happened to me with 30k files stored in DigitalOcean S3. From my side, as the system owner, I can find a few hacks to live with this, but I believe this improvement is essential for the sake of Directus. Hack 1: create a new storage adapter for every 25-50k files.
-
I'd like to add a third idea, mentioned before in #12010. As a downside, it would need some more database queries (which shouldn't take as long).
-
This! It is an actual issue I'm dealing with in a production service (400k files, thumbnails not counted). Having thumbnails in filesystem folders would be a good start, I think. On the other hand, there is the discussion about reflecting the file library's virtual folders on the filesystem in #11148. I would be happy with any approach that reduces the number of files in the root upload folder in order to speed up find and delete. (Trying to rotate more storage adapters in the meantime.)
-
Heya! Thank you for taking the time to submit this request! It has been over 90 days, and this discussion has not received at least 15 votes from the community. This means that we don't feel like there's enough community interest to warrant further R&D into this topic at this time. 🧊 This request will now be closed to keep our discussions tidy. Please reach out if you have any questions! For more information, see our Feature Request Process. |
-
I have a "collection A" with "multiple files field" in it. Some times, I am deleting items from "collection A", so the files are no longer needed.
Regularly, I am deleting the files are no longer necessary.
After having around 1.8m files (all stored locally), deleting a file takes up to 10 secs (both API and App).
Mean while creating new items or uploading files works smoothly (I don't know if there are any performance drop as well, I am running directus on a good machine).
While normal running, CPU level is is perfectly fine, however, while I am deleting the files CPU hits 100% and this immediately affects all other operations which takes long.
I am deleting files 1by1, so I don't know how "Delete Multiple Files" would work.
I am running v9.7.0, on a dedicated mac mini m1 machine.
To Reproduce
Create a collection with a multiple files field in it.
Add many items with around 1-2 million files.
Try to delete a file.
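If it helps to isolate the scan cost, here is a small standalone timing sketch. It is not Directus code; the ./uploads-test folder, file naming, and counts are made up purely to show why a single delete slows down as the flat folder grows.

```ts
import { promises as fs } from 'node:fs';
import path from 'node:path';

const dir = './uploads-test';
const total = 100_000; // raise toward 1-2 million to approximate the numbers above

// Seed a flat folder with one "original" and one fake thumbnail per asset.
async function seed(): Promise<void> {
  await fs.mkdir(dir, { recursive: true });
  for (let i = 0; i < total; i++) {
    await fs.writeFile(path.join(dir, `asset-${i}.jpg`), '');
    await fs.writeFile(path.join(dir, `asset-${i}.thumbnail.jpg`), '');
  }
}

// Deleting one asset means listing the entire folder and matching by prefix.
async function deleteByPrefix(prefix: string): Promise<void> {
  const names = await fs.readdir(dir); // the expensive part at large file counts
  for (const name of names) {
    if (name.startsWith(prefix)) await fs.unlink(path.join(dir, name));
  }
}

(async () => {
  await seed();
  console.time('delete one asset');
  await deleteByPrefix('asset-12345.'); // trailing dot avoids matching asset-123450 etc.
  console.timeEnd('delete one asset');
})();
```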