Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve MultipartWriter parts headers on write #3475

Merged
merged 2 commits into from
Jan 3, 2019

Conversation

kxepal
Copy link
Member

@kxepal kxepal commented Jan 2, 2019

This fixes #3035

aiohttp/payload.py Outdated Show resolved Hide resolved
@asvetlov
Copy link
Member

asvetlov commented Jan 2, 2019

@kxepal please fix make mypy run

@kxepal kxepal force-pushed the 3035-respect-payload-headers branch 2 times, most recently from 003d36c to 1536f25 Compare January 2, 2019 15:23
@kxepal
Copy link
Member Author

kxepal commented Jan 2, 2019

Should be fixed now. Waiting for CI resolution.

@codecov-io
Copy link

codecov-io commented Jan 2, 2019

Codecov Report

Merging #3475 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #3475      +/-   ##
=========================================
- Coverage   97.93%   97.9%   -0.03%     
=========================================
  Files          44      44              
  Lines        8559    8564       +5     
  Branches     1383    1383              
=========================================
+ Hits         8382    8385       +3     
- Misses         74      75       +1     
- Partials      103     104       +1
Impacted Files Coverage Δ
aiohttp/multipart.py 95.47% <100%> (-0.36%) ⬇️
aiohttp/payload.py 98.61% <100%> (+0.02%) ⬆️
aiohttp/web_fileresponse.py 96.55% <0%> (+0.04%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6205cce...7c3623c. Read the comment docs.

@asvetlov
Copy link
Member

asvetlov commented Jan 2, 2019

Good!
Sorry, type annotations for multipart are quite not strict.
I had problems with understanding actual types and used Any/type: ignore somewhere.

Copy link
Member

@asvetlov asvetlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, please merge when green

@kxepal
Copy link
Member Author

kxepal commented Jan 2, 2019

No worries. Actually, that failure was quite right one: I didn't to fix the _Part annonation which continued thinks that detached headers are passed around.

I'm actually wonder why coverage failed so dramatically. Should I worry about and fix that? Probably, it's because removed useless test. May be I should return it with few meaninful assertions...

@kxepal kxepal force-pushed the 3035-respect-payload-headers branch from 1536f25 to 0eeb460 Compare January 2, 2019 15:36
@kxepal
Copy link
Member Author

kxepal commented Jan 2, 2019

Ah, too outdated local master branch probably. Rebased now.

@kxepal kxepal force-pushed the 3035-respect-payload-headers branch from 0eeb460 to 8c5aa01 Compare January 2, 2019 15:38
@asvetlov
Copy link
Member

asvetlov commented Jan 2, 2019

Coverage report shows a couple missed lines: https://codecov.io/gh/aio-libs/aiohttp/pull/3475/diff
If you'll fix it -- would be nice!

This case is actually impossible because all the payload instances will
have headers defined with at least `Content-Type` definition.

While, it's theoretically possible to create `Payload` without headers
definition and Multipart format itself allows such parts, in multipart
module we already have set of assertions which wouldn't make this
possible.

Since `_binary_headers` is private property with unknown fate and
created just to not copy-paste headers serialization logic twice and
used exactly within `multipart` module, we're safe to ignore this
branch.

Proper fix would be refactoring the way how headers and their fragments
will get handled by `Payload` instances, but this quite a work out of
scope of current bugfix and will be addressed in upcoming PR.
@kxepal
Copy link
Member Author

kxepal commented Jan 2, 2019

@asvetlov
Well, it's possible to fix coverage here with some, but that fix will have no links with the real life. Proper one will require quite of refactoring which will touch too many bits and unrelated with bugfix itself. For now I added this commit: 7c3623c to make CI happy and will handle this issue in another PR. Are you ok with it?

@asvetlov
Copy link
Member

asvetlov commented Jan 2, 2019

@kxepal sounds perfect!

@kxepal kxepal merged commit 37722f8 into aio-libs:master Jan 3, 2019
@elsampsa
Copy link

elsampsa commented Jan 3, 2019

Just tried the latest main branch. Unfortunately, the bug still persists.

Expected behaviour

I want to modify the Content-Disposition field in the headers to this:

Content-Disposition: attachments; filename="5c2ca3d448cf7d16922acfdc.jpg"

Actual behaviour

Whatever I try, it always reads:

Content-Disposition: attachment; filename="test.jpg"; filename*=utf-8''test.jpg

Example code

Please try this simple code snippet:

# https://docs.aiohttp.org/en/stable/multipart.html

# nix
import sys
import io

# async
import asyncio
import aiohttp

class debugWriter:
    
    def __init__(self):
        self.buffer=""
        
    async def write(self, stuff):
        self.buffer += stuff.decode("utf-8","ignore")
        
    def done(self):
        print(self.buffer[:200])
 

async def bug1():
    """This doesn't work because:
    
    - payload.py : Payload.set_content_disposition : gets called and it overwrites header with some default values
    
    """
    
    mpwriter = aiohttp.MultipartWriter('form-data')
    f=open("test.jpg","wb"); f.write("123".encode("utf-8")); f.close() # create a dummy file
    
    f=open("test.jpg","rb")
    subwriter = mpwriter.append(f, {
        aiohttp.hdrs.CONTENT_DISPOSITION : 'attachments; filename="5c2ca3d448cf7d16922acfdc.jpg"', # has no effect
        aiohttp.hdrs.CONTENT_TYPE : "image/whatever"
            })

    w = debugWriter()
    await mpwriter.write(w)
    w.done()
    f.close()
    

async def bug2():
    """This doesn't work because:
    
    - multipart.py : MultiparWriter.append : headers get constructed once and only when append is called.
    - .. calling set_content_disposition after that has no effect
    
    """
    mpwriter = aiohttp.MultipartWriter('form-data')
    f=open("test.jpg","wb"); f.write("123".encode("utf-8")); f.close() # create a dummy file
    
    f=open("test.jpg","rb")
    subwriter = mpwriter.append(f, {
        aiohttp.hdrs.CONTENT_TYPE : "image/whatever"
            })
    
    mpwriter.set_content_disposition('attachments', filename='5c2ca3d448cf7d16922acfdc.jpg') # has no effect
    
    w = debugWriter()
    await mpwriter.write(w)
    w.done()
    f.close()
    
    
if __name__ == '__main__':
    asyncio.get_event_loop().run_until_complete(bug1())
    asyncio.get_event_loop().run_until_complete(bug2())

Quickfix

Modify Payload.set_content_disposition

def set_content_disposition(self,
                                disptype: str,
                                quote_fields: bool=True,
                                **params: Any) -> None:
        """Sets ``Content-Disposition`` header."""
        if self._headers is None:
            self._headers = CIMultiDict()

        if (hdrs.CONTENT_DISPOSITION in self._headers):
            print("set_content_disposition : CONTENT_DISPOSITION found : ", self._headers[hdrs.CONTENT_DISPOSITION])

        else:
            print("set_content_disposition : overwriting ")
            self._headers[hdrs.CONTENT_DISPOSITION] = content_disposition_header(
                disptype, quote_fields=quote_fields, **params)
        print("set_content_disposition : now ", self._headers[hdrs.CONTENT_DISPOSITION])

.. this way the case "bug1" in the code snippet above starts working

@asvetlov
Copy link
Member

asvetlov commented Jan 3, 2019

@kxepal ?

@kxepal
Copy link
Member Author

kxepal commented Jan 3, 2019

@elsampsa
Thanks! That's actually another story, but with the similar consequences. Let me see how to fix this once and for all time.

@kxepal
Copy link
Member Author

kxepal commented Jan 3, 2019

@elsampsa
Addressed this in #3479 - that patch should fix the issue now, but I'll add more tests later this evening to make sure we'll not repeat the same mistakes in future.

@elsampsa
Copy link

elsampsa commented Jan 3, 2019

Thanks! I'll try it tomorrow then ..

asvetlov pushed a commit that referenced this pull request Jan 3, 2019
* Preserve MultipartWriter parts headers on write

This fixes #3035 

* Mark case when payload has no headers as unreachable with FIXME

This case is actually impossible because all the payload instances will
have headers defined with at least `Content-Type` definition.

While, it's theoretically possible to create `Payload` without headers
definition and Multipart format itself allows such parts, in multipart
module we already have set of assertions which wouldn't make this
possible.

Since `_binary_headers` is private property with unknown fate and
created just to not copy-paste headers serialization logic twice and
used exactly within `multipart` module, we're safe to ignore this
branch.

Proper fix would be refactoring the way how headers and their fragments
will get handled by `Payload` instances, but this quite a work out of
scope of current bugfix and will be addressed in upcoming PR.
@elsampsa
Copy link

Tested this (version 3.5.4 from pypi). Works as expected.

Thanks for the great library and keep up the good work!

@asvetlov
Copy link
Member

Welcome!

@lock lock bot added the outdated label Jan 22, 2020
@lock lock bot locked as resolved and limited conversation to collaborators Jan 22, 2020
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Jan 22, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bot:chronographer:provided There is a change note present in this PR outdated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

part.set_content_disposition is ignored
4 participants