New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow any buffer type to be written to SFTPFile (1.17) #971

Merged
merged 8 commits into from Jun 9, 2017

Conversation

Projects
None yet
3 participants
@bz2
Contributor

bz2 commented May 26, 2017

Fixes #967 #968

Rollup of earlier branches proposed as #969 and #970 with additional fix inside sftp_client.

Includes new tests for SFTPFile usage.

bz2 added some commits May 25, 2017

Allow any buffer type to written to BufferedFile
Fixes #967

Also adds test coverage for writing various types to
BufferedFile which required some small changes to the test
LoopbackFile subclass.

Change against the 1.17 branch.
Allow any buffer type to be sent to Channel
Fixes #968

Changes the behaviour of the underlying asbytes helper to
pass along unknown types. Most callers already handle this
by passing the bytes along to a file or socket-like object
which will raise TypeError anyway.

Adds test coverage through the Transport implementation.

Change against the 1.17 branch.
Allow any buffer type to be written to SFTPFile
Fixes #967 #968

Rollup of earlier branches proposed as #969 and #970 with
additional fix inside sftp_client.

Includes new tests for SFTPFile usage.

Change against the 1.17 branch.
data = b(data)
if isinstance(data, text_type):
# GZ 2017-05-25: Accepting text on a binary stream unconditionally
# cooercing to utf-8 seems questionable, but compatibility reasons?

This comment has been minimized.

@bitprophet

bitprophet May 31, 2017

Member

Feel like I recall other tickets about this (re: preserving truly binary data instead of passing through Unicode), but unable to find them at a glance. ¯\_(ツ)_/¯ At present, this is effectively what b() was doing, so...not really much worse?

This comment has been minimized.

@ploxiln

ploxiln May 31, 2017

Contributor

Truly binary data would not be text_type (str for py3, unicode for py2), it would be bytes or something like that, so it should be passed through fine. If someone calls write() with a unicode type, this kicks in, turning unicode to utf-8 bytes. This seems like the best thing to do if someone passes unicode here.

This comment has been minimized.

@bitprophet

bitprophet May 31, 2017

Member

Well, on Python 2, binary data well might be str, thus the confusion :D but yea, not really that worried, just making a note of it.

This comment has been minimized.

@ploxiln

ploxiln May 31, 2017

Contributor

It's confusing - you're thinking of string_types. text_type is just unicode on python2

This comment has been minimized.

@bitprophet

bitprophet May 31, 2017

Member

Yup, I sure am.

This comment has been minimized.

@ploxiln

ploxiln May 31, 2017

Contributor

... I also typoed my first comment, was "str for py2, unicode for py2", first py2 should have been py3, I just corrected it ... that couldn't have helped ...

This comment has been minimized.

@bz2

bz2 May 31, 2017

Contributor

There's an important distinction from b() in this change, in that it allows all other types through. This is a tradeoff, callers get duck typing, but internal library interfaces can't rely on having a bytes type. For BufferedFile we get sane type handing when writing to BytesIO later so just passing along is best.

The reason I added this comment is that while handling here makes sense on existing branches for compatibility reasons, I'm not sure it does for master. It might be better to just raise if given unicode, and add an encoding parameter/encode step to higher level interfaces.

elif isinstance(item, SFTPAttributes):
item._pack(msg)
else:
raise Exception('unknown type for %r type %r' % (item, type(item)))

This comment has been minimized.

@bitprophet

bitprophet May 31, 2017

Member

I'd love a code comment here noting explicitly how "it was not any of the other expected types, so we assume it's something that can be asbytes()'d into a packet-level string type, and hand it to add_string() which does so" (assuming I'm interpreting that intent correctly!)

This comment has been minimized.

@bz2

bz2 May 31, 2017

Contributor

That's reasonable, I'll add a comment along those lines.

if isinstance(s, bytes_types):
return s
if isinstance(s, text_type):
# GZ 2017-05-25: Accept text and encode as utf-8 for compatibilty only.

This comment has been minimized.

@ploxiln

ploxiln May 31, 2017

Contributor

compatibilty -> compatibility

what is "GZ"? probably don't need signature/date for this comment

This comment has been minimized.

@bitprophet

bitprophet May 31, 2017

Member

Was curious about that too. Was wondering if some sort of joke around the submitter's Github username (bz2) or vice versa :D

This comment has been minimized.

@bz2

bz2 May 31, 2017

Contributor

'GZ' is indeed a person id for me, which I agree is confusing given the nick I use on this site.

I like id/date as markers with comments for review on tricky things that may need discussion. Am happy to move to docstring, strip or delete the comment for landing.

This comment has been minimized.

@bitprophet

bitprophet May 31, 2017

Member

I'm fine with them being comments instead of docstrings, it's just that we're not used to the "initials: date" prefix formatting. Stripping them out is probably the way to go, for consistency's sake. Git blame is a thing :D

@bitprophet

This comment has been minimized.

Member

bitprophet commented May 31, 2017

Appreciate the rigorous ticket + PR setup! Though in the interests of corralling conversation/changelog entries/etc, I've closed all but this one.

First, thanks for cleaning this up - sounds at a glance that most of the broken API promises re: buffer/socket type objects, happened in the big Python 3 shakeup (release 1.13). (The gift that keeps on giving!)

Second, it's probably a non-issue since it doesn't touch the crypto bits much, but I wonder how this will work out when merged up into the 2.x line - something to be aware of. Given the current state of branches, I may only release this in the 1.18+ or even 2.2+ lines, depending. Super appreciate the port to 1.17+ either way though - options are nice.

Third, I'm currently going through and reviewing / leaving occasional line notes on the patchset. Will report back here with anything major, otherwise once you have a chance to respond to those I'll look at merging. EDIT: all done with that, ball back in your court!

@bitprophet

This comment has been minimized.

Member

bitprophet commented May 31, 2017

Tagging as both bug and feature since I'm torn on whether this qualifies as one of those "bugfix, but potentially widely instability-causing" changes that like to come out in feature releases.

@bz2

This comment has been minimized.

Contributor

bz2 commented May 31, 2017

Thanks for the review! I've responded to the inline comments, and will push up a change with the suggested comment tweaks shortly.

I tried to write the change to minimise conflicts, so it should apply cleanly-ish to all later branches. That said, there is some fixing up that would be nice to do around 2/3 compat and tests after this lands on master, which I'd be happy to propose as well.

Tweak comments as suggested in review
Thanks to bitprophet and ploxiln.

bz2 added a commit to bz2/paramiko that referenced this pull request Jun 1, 2017

@bz2

This comment has been minimized.

Contributor

bz2 commented Jun 1, 2017

Added a possible changelog entry in f0124d9 - feel free to use/rewrite as desired.

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jun 9, 2017

  • Directly merged into 1.17 then up to 1.18
  • Couldn't find an alternate 2.0 version of the patchset so attempted to manually cherry-pick the whole range to the 2.0 branch (included some minor style/flake8 tweaks since the 2.0 lines got flake8'd recently)
  • tests still passing on both interpreter families so hopefully I didn't screw it up too badly
  • merged up to master

bitprophet added a commit that referenced this pull request Jun 9, 2017

@bitprophet bitprophet merged commit e25e1aa into paramiko:1.17 Jun 9, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

bitprophet added a commit that referenced this pull request Jun 9, 2017

bitprophet added a commit that referenced this pull request Jun 9, 2017

@bz2

This comment has been minimized.

Contributor

bz2 commented Jun 9, 2017

Thanks!

The 1.17 merge to 2.0 looked pretty clean so I didn't propose a separate branch for that series, but just double checked your cherrypick and it all looks good to me.

dkhapun pushed a commit to cyberx-labs/paramiko that referenced this pull request Jun 7, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment