Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Content-Disposition fast access in ClientResponse #2455

Merged
merged 2 commits into from Nov 7, 2017

Conversation

redixin
Copy link
Contributor

@redixin redixin commented Oct 31, 2017

Add content_disposition and content_disposition_dict properties

Partially implements #1670

What do these changes do?

Improves ClientResponse ergonomics

Are there changes in behavior for the user?

All changes are documented.

Related issue number

#1670

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bug)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

@redixin
Copy link
Contributor Author

redixin commented Oct 31, 2017

@fafhrd91 should I use issue number or pull request number for CHANGES? I'm asking because this PR implements only first item in issue #1670

@asvetlov
Copy link
Member

asvetlov commented Oct 31, 2017

I dislike two properties.
From architectural perspective better to create a ContentDisposition class (derived from named tuple maybe) and return this instance.

@fafhrd91
Copy link
Member

fafhrd91 commented Oct 31, 2017

  1. Aiohttp has content disposition parsing function. I think cgi module doesn’t handle all cases.
  2. Regarding changes, it is question for asvetlov

@fafhrd91
Copy link
Member

I like idea of ContentDisposition class

@asvetlov
Copy link
Member

PR number is ok for the case -- it doesn't cover the whole issue.

@fafhrd91
Copy link
Member

@codecov-io
Copy link

codecov-io commented Oct 31, 2017

Codecov Report

Merging #2455 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2455      +/-   ##
==========================================
+ Coverage   97.14%   97.14%   +<.01%     
==========================================
  Files          39       39              
  Lines        8125     8136      +11     
  Branches     1419     1420       +1     
==========================================
+ Hits         7893     7904      +11     
  Misses        101      101              
  Partials      131      131
Impacted Files Coverage Δ
aiohttp/client_reqrep.py 97.2% <100%> (+0.06%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c29b630...7c8f5f3. Read the comment docs.

@@ -737,3 +755,10 @@ def content_length(self, *, _CONTENT_LENGTH=hdrs.CONTENT_LENGTH):

if content_length:
return int(content_length)

@property
def content_disposition(self, *, header=hdrs.CONTENT_DISPOSITION):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the property is in HeadersMixin?
Should web.Request has the property?
I thought it is only for ClientResponse

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed it's response only header.

else:
value, params = multipart.parse_content_disposition(raw)
filename = multipart.content_disposition_filename(params)
self._content_disposition = ContentDisposition(value, params,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convert params into types.MappingProxyType. The value should be immutable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do in next patchset.

@property
def content_disposition(self, *, header=hdrs.CONTENT_DISPOSITION):
"""The value of Content-Disposition HTTP header."""
if self._content_disposition == False:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is False

@@ -703,6 +708,7 @@ class HeadersMixin:

_content_type = None
_content_dict = None
_content_disposition = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite strange default. Why False?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

False if header are not parsed/cached yet
None if there was an attempt to parse, but no header found
ContentDisposition if header was found and parsed

@@ -714,6 +720,18 @@ def _parse_content_type(self, raw):
else:
self._content_type, self._content_dict = cgi.parse_header(raw)

def _parse_content_disposition(self, _CONTENT_DISPOSITION):
from . import multipart
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why hidden import?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was here due to circular import

@@ -500,6 +505,7 @@ class ClientResponse(HeadersMixin):
raw_headers = None # Response raw headers, a sequence of pairs

_connection = None # current connection
_content_disposition = False # content disposition cache
Copy link
Member

@kxepal kxepal Nov 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think that's not good choice. None means no value. We tried to parse it, nothing happened. Yes, this will trigger headers lookup for each property access - but that's not a problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

helpers.sentinel is an option

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good alternative, but None is fine as well. However, even better would be to use reify here and avoid internal attributes. Headers are immutable in anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for reify

value, params = multipart.parse_content_disposition(raw)
params = MappingProxyType(params)
filename = multipart.content_disposition_filename(params)
return ContentDisposition(value, params, filename)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convert params into immutable structure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? params is already converted to MappingProxyType line 565

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry.
Nevermind

@asvetlov
Copy link
Member

asvetlov commented Nov 3, 2017

flake8 tool still disagrees with your code style :)

@@ -527,6 +532,7 @@ def __init__(self, method, url, *,
self._request_info = request_info
self._timer = timer if timer is not None else TimerNoop()
self._auto_decompress = auto_decompress
self._cache = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like no. All classes with @reify methods have this and AttributeError is raising without this line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implicit magic. Worth to add notice about.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the attr is required, not a notice :)
But the notice makes no harm

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, ok. Last time I saw reify in acttion, no special attrs were required.


assert 'attachment' == response.content_disposition.value
assert 'bar' == response.content_disposition.parameters["foo"]
assert 'archive.tar.gz' == response.content_disposition.filename
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth to add two more cases:

  1. content_disposition.parameters is immutable
  2. second access to content_disposition returns the same result if ClientResponse.headers were modified in-between (I'm surprised to see it's not read-only property).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do


.. attribute:: value

Value of Content-Disposition header itself, e.g. ``attachment``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"attachment" - you want to notice that the value is a string typed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do


.. attribute:: parameters

A :class:`dict` instance contains all parameters.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it's not a dict, but MapptingProxy. Difference in mutability.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could just replace it with *mapping* for sake of simplicity.
I pretty sure not many python developers know what MappingProxy is :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

@@ -33,6 +34,10 @@
__all__ = ('ClientRequest', 'ClientResponse', 'RequestInfo')


ContentDisposition = collections.namedtuple(
'ContentDisposition', ('value', 'parameters', 'filename'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think value is not the best choice. The RFC defines it as a type which suites more to what the value it holds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

raw = self._headers.get(hdrs.CONTENT_DISPOSITION)
if raw is None:
return None
value, params = multipart.parse_content_disposition(raw)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems only two values here are valid for HTTP headers case: inline and attachment. May be worth to raise a warning about invalid type or turn the value in Enum?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to https://tools.ietf.org/html/rfc6266#section-4.1 any token is valid

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like Enum but how the change affects multidict and existing users.
I suspect parse_content_disposition is internal API but not 100% sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@redixin Yes, but RFC defines more general Content-Disposition definition. Mozilla explicitly splits it into HTTP headers and Multpart headers. The second one utilize token rule completely, but here we are in HTTP response land.

@redixin redixin force-pushed the content-disposition branch 3 times, most recently from ff12184 to dec87c0 Compare November 6, 2017 15:51
Add ContentDisposition class and content_disposition property

Partially implements aio-libs#1670
@asvetlov asvetlov merged commit 78d08ce into aio-libs:master Nov 7, 2017
@asvetlov
Copy link
Member

asvetlov commented Nov 7, 2017

Great work!
Thank you!

@redixin redixin deleted the content-disposition branch November 7, 2017 23:47
@lock
Copy link

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs.
If you feel like there's important points made in this discussion, please include those exceprts into that [new issue].
[new issue]: https://github.com/aio-libs/aiohttp/issues/new

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Oct 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bot:chronographer:provided There is a change note present in this PR outdated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants