Add ActiveStorage::Blob.compose #41544
You have my vote of support. I would love to combine files for reasons other than their size, for example combining various inputs with vips, ffmpeg, etc. (see #39283 to support non-image variants).
In the case of GCS at least, you'll get an "object" response which includes the MD5, CRC, etc. I know very little of Active Storage, but the
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
@gmcgibbon Do you have time to rebase this and address Jean's comments above? 🙏
Right. I remember now that I tried this, but composite objects on GCS don't have MD5 hashes, only CRC32C. Other storage services also seem not to include MD5 hashes. This leaves me no choice but to stream the file locally and calculate the MD5 myself, though there may be a more clever way of doing that.
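For illustration, streaming a file and computing its MD5 without loading it all into memory could look roughly like the sketch below. The helper name `streaming_md5_base64` and the chunk size are hypothetical and not part of this PR; the Base64 encoding is chosen to match how Active Storage stores blob checksums.

```ruby
require "digest"
require "stringio"

# Hypothetical helper (not from this PR): compute a Base64-encoded MD5
# digest of an IO by reading it in fixed-size chunks, so the whole file
# never has to be held in memory at once.
def streaming_md5_base64(io, chunk_size: 5 * 1024 * 1024)
  digest = Digest::MD5.new
  # IO#read(length) returns nil at end of stream, which ends the loop.
  while (chunk = io.read(chunk_size))
    digest << chunk
  end
  digest.base64digest
end

# Usage: any IO-like object works; a StringIO stands in for a downloaded file.
io = StringIO.new("part one" + "part two")
puts streaming_md5_base64(io, chunk_size: 4)
```

The cost is still a full download of the composed object, which is the part that hurts for large files.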
This seems to work well across services without having to calculate checksums on large files. It may be a slightly assertive change to the library's expectations around checksums, but composed blobs may be able to depend on the integrity of the individual blobs they are composed of. At least, storage services seem to think that way.
Summary
Retry of #37314
I'd like to be able to use GCS' compose feature within Active Storage. Essentially, it allows you to combine multiple files together. This is particularly useful when handling multiple large files.
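As a rough mental model, composing amounts to concatenating existing objects, in order, into a new object on the service side. The sketch below uses a hypothetical `InMemoryService` class as a stand-in for a real storage service; its method names are assumptions for illustration and are not the API added by this PR.

```ruby
# Hypothetical in-memory stand-in for a storage service (illustration only).
class InMemoryService
  def initialize
    @objects = {}
  end

  # Store a copy of the data under the given key.
  def upload(key, data)
    @objects[key] = data.dup
  end

  # Return the stored data; raises KeyError if the key is unknown.
  def download(key)
    @objects.fetch(key)
  end

  # Concatenate the source objects, in the order given, into a new object.
  # The sources are left untouched.
  def compose(source_keys, destination_key)
    @objects[destination_key] = source_keys.map { |key| download(key) }.join
  end
end

service = InMemoryService.new
service.upload("chunk-1", "first half, ")
service.upload("chunk-2", "second half")
service.compose(%w[chunk-1 chunk-2], "combined")
puts service.download("combined")
```

On a real service the concatenation happens remotely, so the client never sees the combined bytes; that is what makes the checksum question below awkward.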
One issue I can see is that it is impossible to compute the checksum for a file that's assembled on the storage provider's end. In this implementation, composed blobs cannot be opened as they would fail integrity checks. We could make a special blob type (e.g. `composed: true`) that skips checksum verification, but I'm not sure if that's a preferred approach for this use case. What do you think?
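To make the flag idea concrete, here is a minimal sketch of a blob-like object that verifies its checksum on read unless it is marked as composed. `SketchBlob`, `IntegrityError`, and the `read` method are hypothetical names for illustration; they are not Active Storage classes and not necessarily how this PR would implement it.

```ruby
require "digest"

# Hypothetical error class for this sketch.
class IntegrityError < StandardError; end

# Hypothetical blob-like value object: data, a Base64 MD5 checksum,
# and a flag saying whether it was assembled on the storage side.
SketchBlob = Struct.new(:data, :checksum, :composed, keyword_init: true) do
  # Return the data, verifying the checksum first unless the blob is composed.
  def read
    verify_integrity unless composed
    data
  end

  private

  def verify_integrity
    actual = Digest::MD5.base64digest(data)
    raise IntegrityError, "checksum mismatch" unless actual == checksum
  end
end

regular = SketchBlob.new(data: "hello", checksum: Digest::MD5.base64digest("hello"), composed: false)
composed = SketchBlob.new(data: "hello world", checksum: nil, composed: true)

puts regular.read
puts composed.read # no checksum is stored, so verification is skipped
```

The trade-off is that a composed blob is only as trustworthy as the service's own integrity guarantees for the parts it was built from.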