Add option to ActiveStorage::Blob to set extract_content_type_from_io #32833
Conversation
Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @schneems (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review. Please see the contribution instructions for more information. |
def build_after_upload(io:, filename:, content_type: nil, metadata: nil) | ||
# Set the +content_type+ and +extract_content_type_from_io+ to +false+ to manually | ||
# specify the content type rather than using automatic content type inference. | ||
def build_after_upload(io:, filename:, content_type: nil, metadata: nil, extract_content_type_from_io: true) |
georgeclaghorn
May 8, 2018
Member
I’d like to name this argument identify
here and in create_after_upload
for consistency with the identify
method.
I’d like to name this argument identify
here and in create_after_upload
for consistency with the identify
method.
ghost
May 8, 2018
Author
Renamed to identify in this commit 4f9defb42cf7c543906a7de72c7b6f91fa5058d4
Renamed to identify in this commit 4f9defb42cf7c543906a7de72c7b6f91fa5058d4
# | ||
# Normally, you do not have to call this method directly at all. Use the factory class methods of +build_after_upload+ | ||
# and +create_after_upload!+. | ||
def upload(io) | ||
def upload(io, extract_content_type_from_io = true) |
georgeclaghorn
May 8, 2018
Member
Let’s use a keyword argument here. Name it identify
to match build_after_upload
and create_after_upload
.
Let’s use a keyword argument here. Name it identify
to match build_after_upload
and create_after_upload
.
ghost
May 8, 2018
Author
Making it a keyword argument makes it much clearer. Amended here e77f3db21842d20a489853b41eed4233c5394666
Making it a keyword argument makes it much clearer. Amended here e77f3db21842d20a489853b41eed4233c5394666
test "create after upload extracts content_type when extract_content_type_from_io: true" do | ||
blob = create_csv_blob data: "Article,dates,analysis\n1, 2, 3", content_type: "text/csv", extract_content_type_from_io: true | ||
assert_equal "message/news", blob.content_type | ||
end |
georgeclaghorn
May 8, 2018
Member
The new option defaults to true, so this test is superfluous. The preexisting tests above cover this behavior.
The new option defaults to true, so this test is superfluous. The preexisting tests above cover this behavior.
ghost
May 8, 2018
Author
Removed this test in this commit 539e880fe4731ef430a857a35ec9efc167edf263
Removed this test in this commit 539e880fe4731ef430a857a35ec9efc167edf263
@@ -54,6 +54,10 @@ def create_blob(data: "Hello world!", filename: "hello.txt", content_type: "text | |||
ActiveStorage::Blob.create_after_upload! io: StringIO.new(data), filename: filename, content_type: content_type | |||
end | |||
|
|||
def create_csv_blob(data: "article,dates,analysis\n1, 2, 3", filename: "my_csv.csv", content_type: "text/csv", extract_content_type_from_io: true) | |||
ActiveStorage::Blob.create_after_upload! io: StringIO.new(data), filename: filename, content_type: content_type, extract_content_type_from_io: extract_content_type_from_io | |||
end |
georgeclaghorn
May 8, 2018
Member
We don’t need a special helper for creating a CSV blob. Modify create_blob
to take an identify
argument, then call it from the new test:
test "create after upload uses content_type when extract_content_type_from_io: false" do
blob = create_blob data: "Article,dates,analysis\n1,2,3", filename: "table.csv", content_type: "text/csv", identify: false
assert_equal "text/csv", blob.content_type
end
We don’t need a special helper for creating a CSV blob. Modify create_blob
to take an identify
argument, then call it from the new test:
test "create after upload uses content_type when extract_content_type_from_io: false" do
blob = create_blob data: "Article,dates,analysis\n1,2,3", filename: "table.csv", content_type: "text/csv", identify: false
assert_equal "text/csv", blob.content_type
end
ghost
May 8, 2018
Author
Done in this commit 539e880fe4731ef430a857a35ec9efc167edf263
Done in this commit 539e880fe4731ef430a857a35ec9efc167edf263
ghost
May 8, 2018
•
Author
Just remembered to include the filename arg for this updated test d6e1197edd9e626172de6dbf4c0380088c6e28f7
Just remembered to include the filename arg for this updated test d6e1197edd9e626172de6dbf4c0380088c6e28f7
test "create after upload extracts content_type from io when no content_type given" do | ||
blob = create_csv_blob content_type: nil | ||
assert_equal "text/csv", blob.content_type | ||
end |
georgeclaghorn
May 8, 2018
•
Member
This test doesn’t seem terribly valuable. I’d omit it or modify it to cover the case where content_type
is nil
and identify
is false.
This test doesn’t seem terribly valuable. I’d omit it or modify it to cover the case where content_type
is nil
and identify
is false.
ghost
May 8, 2018
Author
Modified for content_type nil and identify: false e34eec2c2c2133edf4512b1370455fe4c669c0ca
Modified for content_type nil and identify: false e34eec2c2c2133edf4512b1370455fe4c669c0ca
@georgeclaghorn This PR is now updated with the requests in your review. I felt like I learnt a bit more about Rails architecture and how to write better Ruby from this activity! |
This address the issue where the user defines a content type as an argument | ||
but the automatic MimeType inference from the io is not what is expected. | ||
It means a user avoids needing force the correct content type by | ||
patching `ActiveStorage::Blob.extract_content_type`. |
georgeclaghorn
May 8, 2018
Member
It’s uncommon to call ActiveStorage::Blob#create_after_upload
or #build_after_upload
directly. Can we demonstrate passing identify: false
to ActiveStorage::Attached::{One,Many}#attach
instead?
It’s uncommon to call ActiveStorage::Blob#create_after_upload
or #build_after_upload
directly. Can we demonstrate passing identify: false
to ActiveStorage::Attached::{One,Many}#attach
instead?
ghost
May 8, 2018
•
Author
I've added the update but not sure if the content is what you had in mind. It has a similar wording to the comments in blob.rb and I've added in an example based on what is in the RailsGuides.
commit 4152b834102b8dbd9f6cae9965d4d00bc1b3f512
I've added the update but not sure if the content is what you had in mind. It has a similar wording to the comments in blob.rb and I've added in an example based on what is in the RailsGuides.
commit 4152b834102b8dbd9f6cae9965d4d00bc1b3f512
self.checksum = compute_checksum_in_chunks(io) | ||
self.content_type = extract_content_type(io) | ||
self.content_type = extract_content_type(io) if self.content_type.nil? || identify |
georgeclaghorn
May 8, 2018
Member
content_type
doesn’t need an explicit receiver in the conditional. Replace self.content_type.nil?
with content_type.nil?
.
content_type
doesn’t need an explicit receiver in the conditional. Replace self.content_type.nil?
with content_type.nil?
.
ghost
May 8, 2018
Author
Removed reference to content_type explicit receiver here 606f42457796258f9f94622678f5d4aac10b6e29
Removed reference to content_type explicit receiver here 606f42457796258f9f94622678f5d4aac10b6e29
@@ -44,21 +44,25 @@ def find_signed(id) | |||
end | |||
|
|||
# Returns a new, unsaved blob instance after the +io+ has been uploaded to the service. | |||
def build_after_upload(io:, filename:, content_type: nil, metadata: nil) | |||
# Set the +content_type+ and +identify+ to +false+ to manually |
georgeclaghorn
May 8, 2018
•
Member
The first part of this sentence makes it sound like both content_type
and identify
should be nil
. Here’s a clarifying edit:
When providing a content type, pass identify: false
to bypass automatic content type inference.
The first part of this sentence makes it sound like both content_type
and identify
should be nil
. Here’s a clarifying edit:
When providing a content type, pass
identify: false
to bypass automatic content type inference.
ghost
May 8, 2018
Author
Clarified this sentence with the edit bad8fa0
Clarified this sentence with the edit bad8fa0
ghost
May 8, 2018
Author
Added in the formatting of identify: false here 231ad55
Added in the formatting of identify: false here 231ad55
@@ -142,13 +146,14 @@ def service_headers_for_direct_upload | |||
# | |||
# Prior to uploading, we compute the checksum, which is sent to the service for transit integrity validation. If the | |||
# checksum does not match what the service receives, an exception will be raised. We also measure the size of the +io+ | |||
# and store that in +byte_size+ on the blob record. | |||
# and store that in +byte_size+ on the blob record. The content type is automatically extracted from the +io+ unless | |||
# the user has specified a +content_type+ and set the +identify+ as false. |
georgeclaghorn
May 8, 2018
Member
Remove the extra “the” between “set” and “identify
.”
Remove the extra “the” between “set” and “identify
.”
ghost
May 8, 2018
Author
Removed extra 'the' here
c79c9e1
Removed extra 'the' here
c79c9e1
@@ -44,21 +44,24 @@ def find_signed(id) | |||
end | |||
|
|||
# Returns a new, unsaved blob instance after the +io+ has been uploaded to the service. | |||
def build_after_upload(io:, filename:, content_type: nil, metadata: nil) | |||
# When providing a content type, pass +identify: false+ to bypass automatic content type inference. |
georgeclaghorn
May 8, 2018
Member
+
only works around individual words. Use <tt></tt>
instead:
pass <tt>identify: false</tt>
+
only works around individual words. Use <tt></tt>
instead:
pass <tt>identify: false</tt>
ghost
May 8, 2018
•
Author
Ah right, updated 209f021c15373832726496c157b86223ceaa291b
Ah right, updated 209f021c15373832726496c157b86223ceaa291b
def create_after_upload!(io:, filename:, content_type: nil, metadata: nil) | ||
build_after_upload(io: io, filename: filename, content_type: content_type, metadata: metadata).tap(&:save!) | ||
# Set the +content_type+ and +identify+ to +false+ to manually | ||
# specify the content type rather than using automatic content type inference. |
georgeclaghorn
May 8, 2018
Member
This needs the same treatment as build_after_upload
.
This needs the same treatment as build_after_upload
.
ghost
May 8, 2018
Author
create_after_upload!
now has the same comment as build_after_upload
e5f7519
create_after_upload!
now has the same comment as build_after_upload
e5f7519
@@ -142,13 +145,14 @@ def service_headers_for_direct_upload | |||
# | |||
# Prior to uploading, we compute the checksum, which is sent to the service for transit integrity validation. If the | |||
# checksum does not match what the service receives, an exception will be raised. We also measure the size of the +io+ | |||
# and store that in +byte_size+ on the blob record. | |||
# and store that in +byte_size+ on the blob record. The content type is automatically extracted from the +io+ unless | |||
# the user has specified a +content_type+ and set +identify+ as false. |
georgeclaghorn
May 8, 2018
Member
Use “you specify” instead of “the user has specified” to match the adjacent paragraphs, which use the second person.
“Set identify
as false” doesn’t read quite right. How about “pass identify: false
”?
Use “you specify” instead of “the user has specified” to match the adjacent paragraphs, which use the second person.
“Set identify
as false” doesn’t read quite right. How about “pass identify: false
”?
ghost
May 8, 2018
Author
Yes, that read more fluently and consistently with the rest of the test
190693c
Yes, that read more fluently and consistently with the rest of the test
190693c
Looks like this change would affect the Rails Guides too: http://edgeguides.rubyonrails.org/active_storage_overview.html#attaching-file-io-objects in the section 3.3 Attaching File/IO Objects I've added this to the commit and squashed it in. |
Can you please squash your commits into one? |
Commits are now squashed and pushed up |
This adds a boolean argument called identify to ActiveStorage::Blob methods #create_after_upload, #build_after_upload and #upload. It allows a user to bypass the automatic content_type inference from the io.
activestorage CHANGELOG.md addition
Use
extract_content_type_from_io: false
as an argument inActiveStorage::Blob.create_after_upload
to manually override theautomatic extraction of content type from the io.
This address the issue where the user defines a content type as an argument
but the automatic MimeType inference from the io is not what is expected.
It means a user avoids needing force the correct content type by
patching
ActiveStorage::Blob.extract_content_type
.Ryan Davidson
Summary
This adds a boolean extract_content_type_from_io to ActiveStorage::Blob
methods #create_after_upload, #build_after_upload and #upload. It
allows a user to manually override the automatic MimeType extraction from
the io. This provides a patch for when Marcel::MimeType.for returns an
incorrect content type when #extract_content_type is called from the #upload method.
This patch aims to address:
It fixes bugs where a content_type is automatically extracted from the input and gives an inaccurate result. This is due to a bug in the Marcel::MimeType gem that is called in the extract_content_type method (which was called by default in the #upload method, which is in turn called by #build_after_upload and #create_after_upload)
Other Information
Added document comments to
activestorage/app/models/active_storage/blob.rb
Finally, if your pull request affects documentation or any non-code
changes, guidelines for those changes are available
here
Thanks for contributing to Rails!