Fix Document.get_attachment bug #106

alfinkel · 2016-03-03T21:41:12Z

What

Fix the Document get_attachment method so that it correctly creates text and binary files as expected and returns text, binary, and JSON content appropriately.

How

Make the attachment_type method argument optional.
Add logic to figure out what type of content should be returned by the method as well as what kind of file should be created if a write_to argument is provided.
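
A rough sketch of the type-detection step described above, assuming the decision is driven by the attachment's Content-Type header. The helper name infer_attachment_type and its rules are illustrative assumptions, not the code merged in this PR.

```python
def infer_attachment_type(content_type):
    """Guess 'text', 'json', or 'binary' from a Content-Type header value.

    Hypothetical helper for illustration only; the library's real logic
    may differ.
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = (content_type or '').split(';', 1)[0].strip().lower()
    if media_type == 'application/json':
        return 'json'
    if media_type.startswith('text/'):
        return 'text'
    # Anything else (images, archives, unknown types) is treated as raw bytes.
    return 'binary'
```

With a rule like this, a caller that omits attachment_type would get text for text/* types, parsed JSON for application/json, and raw bytes otherwise.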

Testing

Add a test to verify getting a text attachment and writing it to a file.
Add a test to verify getting a json attachment and writing it to a file.
Add a test to verify getting a binary attachment and writing it to a file.

Reviewers

reviewer: @emlaver
reviewer: @ricellis

Issues

Document.get_attachment() fails with binary data and when write_to is not None #102

emlaver · 2016-03-04T14:04:09Z

+1

ricellis · 2016-03-04T15:04:44Z

What happens in the case that there is an attachment which is text/plain but uses a charset other than unicode? e.g. Content-type: text/plain; charset=IBM038
If using attachment_type=None it will assume text which will then hit write_to.write(resp.text), but what charset will that use? I'm guessing probably unicode by default?

alfinkel · 2016-03-04T15:23:22Z

re: #106 (comment)

It will (now) use utf-8 to encode. Done in b0b522c. Thanks for catching that.

ricellis · 2016-03-04T15:43:01Z

I think b0b522c is ok for the JSON case because JSON has to be unicode.
However, I still think this might be broken for a text attachment that is in another encoding. Don't we need to read the charset type from the header to either read it in the correct charset or convert it to unicode?
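
For reference, reading the charset out of a Content-Type header can be sketched with the standard library alone. The function name charset_from_content_type is an assumption made for this example and is not part of the library.

```python
from email.message import Message


def charset_from_content_type(content_type, default=None):
    """Return the lower-cased charset parameter of a Content-Type value.

    Hypothetical helper: it reuses the stdlib email header parser rather
    than any code from this PR.
    """
    msg = Message()
    msg['Content-Type'] = content_type
    # get_content_charset() returns `default` when no charset is declared.
    return msg.get_content_charset(default)
```

A header without a charset parameter yields the default, so the caller still has to decide what to assume in that case.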

alfinkel · 2016-03-04T15:58:44Z

The Python Requests module handles decoding/encoding. See http://docs.python-requests.org/en/master/user/quickstart/#response-content. Based on this I should probably remove the b0b522c commit.

alfinkel · 2016-03-04T16:00:30Z

Calling resp.text returns the content decoded to unicode, using the encoding Requests determines from the response headers.

ricellis · 2016-03-04T16:08:41Z

Ah ok, thanks for investigating. Would you mind adding a test that uses a text attachment with some other charset, just to validate that the behaviour is as we expect?

alfinkel · 2016-03-04T18:01:15Z

I don't think that there is much upside in creating a test to verify that the Python Requests module is behaving as it is supposed to. At best I think all that it would prove is that it works for the charset chosen for that test. If the concern is that a charset may not be handled correctly by Requests then there is a route for a user of this method to take, which is to provide the attachment_type='binary' argument to the method call and subsequently take the returned raw response content and encode it however they want to before writing it to a file.
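
The fallback route mentioned above might look roughly like this. The raw bytes here are a stand-in for what a call with attachment_type='binary' would return, and write_as_utf8 is a hypothetical helper, not part of the library.

```python
import io


def write_as_utf8(raw, source_charset, write_to):
    """Decode raw bytes with a known charset and write them out as UTF-8.

    `write_to` must be a binary stream. Hypothetical helper for illustration.
    """
    text = raw.decode(source_charset)
    write_to.write(text.encode('utf-8'))
    return text


# Stand-in for bytes obtained via get_attachment(..., attachment_type='binary').
raw = 'caf\u00e9'.encode('latin-1')
buf = io.BytesIO()
write_as_utf8(raw, 'latin-1', buf)
```

The caller stays in control of the charset, at the cost of having to know it up front.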

ricellis · 2016-03-07T08:59:48Z

ok +1

toddreed · 2016-03-07T16:09:00Z

I still think that the API for get_attachment() is flawed because of the dependency between the write_to parameter and the attachment_type. Suppose the content type of the attachment is not known to the caller: they pass None for attachment_type so that get_attachment() will choose. The problem is that the caller needs to provide a write_to that is compatible. If the content type is text, then a text stream needs to be opened; if the content type is binary, a binary stream needs to be opened:

with open('attachment', 'w') as f: # will fail if attachment is binary because f expects str not bytes
    doc.get_attachment('attachment', write_to=f)

I think that a get_attachment() API needs to provide the data and the content type. (For example, in my case my content type is image/*, and I want to know is it image/png or image/jpg?)
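
One possible shape for such an API, sketched under the assumption that the method always hands back raw bytes together with the server-reported content type. Attachment and save_attachment are hypothetical names, not existing library code.

```python
from collections import namedtuple
import io

# Hypothetical return type: raw bytes plus the server-reported content type.
Attachment = namedtuple('Attachment', ['content_type', 'data'])


def save_attachment(attachment, write_to):
    """Write the attachment bytes to a binary stream and return its content type."""
    write_to.write(attachment.data)
    return attachment.content_type


# Example data: the 8-byte PNG signature stands in for a real image.
png = Attachment('image/png', b'\x89PNG\r\n\x1a\n')
buf = io.BytesIO()
kind = save_attachment(png, buf)
```

Because the data is always bytes, the caller can open write_to in binary mode without knowing the content type in advance and can still tell image/png from image/jpg afterwards.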

alfinkel · 2016-03-07T18:17:07Z

Reopening issue #102 as a question and continuing the conversation there.

alfinkel force-pushed the 102-fix-get-attachment-bug branch from 5bde288 to adfe166 Compare March 3, 2016 21:53

alfinkel added 2 commits March 7, 2016 08:13

Fix get_attachment to write files correctly

3922535

- Change method signature and make attachment_type argument optional.
- Add logic to figure out whether attachment should be returned as text, json, or binary.
- Add additional get attachment tests.

Add CHANGES.rst entry for get_attachment fix

dcdad79

alfinkel force-pushed the 102-fix-get-attachment-bug branch from 4b4cc98 to dcdad79 Compare March 7, 2016 14:18

alfinkel added a commit that referenced this pull request Mar 7, 2016

Merge pull request #106 from cloudant/102-fix-get-attachment-bug

763e8d3

Merged

alfinkel merged commit 763e8d3 into master Mar 7, 2016

alfinkel mentioned this pull request Mar 7, 2016

Document.get_attachment() fails with binary data and when write_to is not None #102

Closed

ricellis deleted the 102-fix-get-attachment-bug branch August 3, 2016 10:23
