Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ActiveStorage::Blob.compose #41544

Merged
merged 1 commit into from Nov 26, 2021
Merged

Conversation

gmcgibbon
Copy link
Member

@gmcgibbon gmcgibbon commented Feb 24, 2021

Summary

Retry of #37314

I'd like to be able to use GCS' compose feature within Active Storage. Essentially, it allows you to combine multiple files together. This is particularly useful when handling multiple large files.

One issue I can see is that it is impossible to compute the checksum for a file that's assembled on the storage provider's end. In this implementation, composed blobs cannot be opened as they would fail integrity checks. We could make a special blob type ( eg. conposed: true) that skips checksum verification, but I'm not sure if that's a preferred approach for this use-case. What do you think?

@sedubois
Copy link
Contributor

You have my vote of support. I would love to combine files for other reasons than just them being large. For example combining various inputs with vips, ffmpeg, etc (see #39283 to support non-image variants).

@byroot
Copy link
Member

byroot commented Mar 28, 2021

particularly useful when handling multiple large files.

One issue I can see is that it is impossible to compute the checksum for a file that's assembled on the storage provider's end.

In the case of GCS at least, you'll get an "object" response which include the MD5, CRC etc.

I know very little of Active Storage, but the compose API could be required to return the composed object checksum. No?

@rails-bot
Copy link

rails-bot bot commented Jun 26, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Thank you for your contributions.

@rails-bot rails-bot bot added the stale label Jun 26, 2021
@rails-bot rails-bot bot closed this Jul 3, 2021
@zzak zzak reopened this Jul 4, 2021
@rails-bot rails-bot bot removed the stale label Jul 4, 2021
@zzak
Copy link
Member

zzak commented Jul 4, 2021

@gmcgibbon Do you have time to rebase this and address Jean's comments above? 🙏

@rails-bot
Copy link

rails-bot bot commented Oct 2, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Thank you for your contributions.

@rails-bot rails-bot bot added the stale label Oct 2, 2021
@rails-bot rails-bot bot closed this Oct 9, 2021
@gmcgibbon gmcgibbon reopened this Nov 4, 2021
@rails-bot rails-bot bot removed the stale label Nov 4, 2021
@gmcgibbon gmcgibbon removed the request for review from georgeclaghorn November 4, 2021 23:09
@gmcgibbon gmcgibbon force-pushed the active_storage_compose branch 3 times, most recently from 1575316 to cd94551 Compare November 5, 2021 19:50
@gmcgibbon
Copy link
Member Author

In the case of GCS at least, you'll get an "object" response which include the MD5, CRC etc.

I know very little of Active Storage, but the compose API could be required to return the composed object checksum. No?

Right. I remember now that I tried this but composite objects on GCS don't have MD5 hashes, only CRC32C. Other storage services seem to also not include MD5 hashes. This leaves me no choice but to stream the file locally and calculate the MD5 myself. Though, there may be a more clever way of doing that.

@gmcgibbon gmcgibbon force-pushed the active_storage_compose branch 5 times, most recently from 5fb4dfb to 995fc81 Compare November 26, 2021 00:33
@gmcgibbon
Copy link
Member Author

gmcgibbon commented Nov 26, 2021

This seems to work well across services without having to calculate checksums on large files. It may take a slightly assertive approach to changing library expectations in terms of checksums, but composed blobs may be able to depend on the integrity of the individual blobs you are composed of. At least, storage services seem to think that way.

@gmcgibbon gmcgibbon merged commit 1cb6db4 into rails:main Nov 26, 2021
@gmcgibbon gmcgibbon deleted the active_storage_compose branch November 26, 2021 00:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants