Introduce storage manager concept #3577
Merged
Conversation
* Remove `initialize_status` (status already defaults to pending in the database)
* Remove `has_dimensions?` (always returns true)
* Remove `async_conversion?` (dead code)
* Remove `validate_file_exists` (unneeded checks)
* Simplify `calculate_dimensions`
* Merge `file_header_to_content_type` with `content_type_to_file_ext` (the content type isn't used elsewhere)
Move animated_gif / animated_png autotagging to take place during uploading, instead of during tag editing. We can't generally assume the file will be present on the local filesystem after uploading.
Refactors the upload process to pass around temp files, rather than passing around file paths and directly writing output to the local filesystem. This way we can pass the storage manager the preview / sample / original temp files, so it can deal with storage itself.

* Change `Download::File#download!` to return a temp file.
* Change `DanbooruImageResizer` and `PixivUgoiraConverter` to accept/return temp files instead of file paths.
* Change `Upload#generate_resizes` to return temp files for previews and samples.
* Change `Upload#generate_resizes` to generate ugoira .webm samples synchronously instead of asynchronously.
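A minimal sketch of the temp-file flow described above. The class and method names (`Download#download!`) follow the commit message, but the bodies here are illustrative stand-ins, not the actual Danbooru implementation:

```ruby
require "tempfile"

# Hypothetical sketch: the download step hands back an open Tempfile
# instead of writing to a fixed path on the local filesystem, so later
# stages (resizing, storage) never assume where the file lives on disk.
class Download
  def initialize(data)
    @data = data
  end

  # Returns an open, rewound Tempfile; the caller owns its lifecycle.
  def download!
    file = Tempfile.new("danbooru-download")
    file.binmode
    file.write(@data)
    file.rewind
    file
  end
end

file = Download.new("fake image bytes").download!
```

Each stage reads from the temp file it receives and returns new temp files for its outputs, so only the storage manager ever decides where bytes finally land.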
* Perform backups synchronously inside `distribute_files` instead of asynchronously in `queue_backup`. Asynchronous backups assumed that files are stored on the local filesystem, which isn't true in general.
* Remove obsolete backup service classes.
* Remove `file_path_for` and `cropped_file_url` (dead code)
* Remove `complete_preview_file_url` (`preview_file_url` now returns absolute links)
* Remove `file_name` (only used for the Download link in the sidebar)
Is this good to merge now?
Yeah, I'm finished with it, should be good to go.
This introduces a concept of storage managers for dealing with the storage of uploaded files. This incidentally fixes #3569, but the main goal is to clean up how archived files are dealt with.
This PR makes it possible to configure how uploads are stored: locally, via SFTP, or via S3. This also makes it possible to fully store all files externally, so that the webservers can be independent of the asset servers.
The basic idea is to introduce a `StorageManager` class with `store` and `delete` methods. Subclasses implement three different storage methods: local filesystem, SFTP, or S3. Each method supports flat or hierarchical directory structures. The SFTP method supports multiple hosts, which allows you to store files to both sonohara and hijiribe at the same time.

There is a fourth "hybrid" method, which lets you choose the method to use based on the post's ID and type (preview, sample, or original file). In this way it's possible to support complex configurations like Danbooru's, where files are stored across different servers with different URL structures. Here is how you would replicate Danbooru's current setup.
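The original configuration snippet isn't preserved in this excerpt. As a standalone sketch of the idea, here is a toy version: a `StorageManager` base class with `store`/`delete` (names taken from the PR description), an in-memory backend standing in for the local/SFTP/S3 backends, and a hybrid manager that routes each file to a backend based on the post's ID and type. Everything beyond those two method names is illustrative, not the actual implementation:

```ruby
require "stringio"

# Abstract interface: every backend knows how to store and delete a file.
class StorageManager
  def store(io, path); raise NotImplementedError; end
  def delete(path);    raise NotImplementedError; end
end

# Toy in-memory backend standing in for the local / SFTP / S3 subclasses.
class MemoryStorage < StorageManager
  attr_reader :files
  def initialize; @files = {}; end
  def store(io, path); @files[path] = io.read; end
  def delete(path);    @files.delete(path); end
end

# Hybrid manager: delegates to a backend chosen by post ID and file type.
class HybridStorage < StorageManager
  def initialize(&chooser); @chooser = chooser; end

  def store(io, path, post_id:, type:)
    @chooser.call(post_id, type).store(io, path)
  end
end

old_server = MemoryStorage.new
new_server = MemoryStorage.new

hybrid = HybridStorage.new do |post_id, type|
  # e.g. previews and older posts on one server, everything else on another
  (type == :preview || post_id < 1_000_000) ? old_server : new_server
end

hybrid.store(StringIO.new("preview bytes"), "preview/abc.jpg",
             post_id: 3_000_000, type: :preview)
hybrid.store(StringIO.new("original bytes"), "data/abc.png",
             post_id: 3_000_000, type: :original)
```

The chooser block is where a site-specific setup like Danbooru's would encode which server holds which ID ranges and file types.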
The basic idea itself is simple enough, but integrating it with the upload process turned out to be somewhat involved. This was mainly because of assumptions throughout the code that files are stored locally on the webserver. In short:
Autotagging of animated GIFs/PNGs was moved from post editing to upload time. The old behavior assumed that files were available on the local webserver, which we don't want to assume. Rechecking the file on every edit isn't very efficient anyway.
Ugoira .webm conversion was made synchronous instead of asynchronous. Doing it asynchronously created multiple problems: it assumed the .zip file was available locally, it could break backups and iqdb submissions (since the conversion may not be finished yet when these things run), and it required hacky file existence checks in various places to detect whether the conversion was finished and the images were ready to use.
Backups were made synchronous instead of asynchronous. Again, this assumed that uploaded files were present on the local webserver. It also meant that if the job worker went down, uploads would be accepted without being backed up.
The signature for `Post.delete_files` had to be changed. Important: this means existing file deletion jobs need to be cancelled ASAP. These jobs wouldn't work right anyway, since in many cases they're trying to delete files that have since been archived.

`/posts.json` now returns absolute URLs in all cases. Right now it returns absolute URLs for some posts and relative URLs for others, which clients have to detect and fix in order to download files. This change may break scrapers. However, it's not hard to fix and makes the API easier to deal with.

The upload process was reworked to pass around temp files, instead of writing previews/samples directly to the local filesystem. This simplified some things.
URL building was moved to `StorageManager`, since the URL structure depends on how files are stored.

The backup system was changed to use the storage management system. This means you can choose between backing up files to a local directory, to SFTP, or to S3.
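A short sketch of why URL building belongs in the storage manager: the manager is the only component that knows the directory layout, so it can map a stored file to its public URL. The method and parameter names here (`file_url`, `base_url`) are illustrative assumptions, not the actual Danbooru API:

```ruby
# Hypothetical local-filesystem manager that also builds URLs, since the
# URL structure mirrors how it lays files out on disk.
class LocalStorageManager
  def initialize(base_url:)
    @base_url = base_url
  end

  # Previews/samples live in type subdirectories; originals at the top level.
  def file_url(md5, ext, type)
    dir = type == :original ? "data" : "data/#{type}"
    "#{@base_url}/#{dir}/#{md5}.#{ext}"
  end
end

manager = LocalStorageManager.new(base_url: "https://example.com")
url = manager.file_url("0123456789abcdef", "jpg", :preview)
```

An S3 or SFTP manager would build URLs against its own host and layout, which is what lets `/posts.json` return correct absolute URLs regardless of where a given post's files ended up.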
Making backups and ugoira conversions synchronous may slow down uploads. Ultimately I think it'd be better to make the whole upload process asynchronous, but that's outside the scope of this PR.