readline() method #13

coreyhuinker · 2015-02-14T18:37:41Z

This would be useful for extracting the header from a csv.
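
A minimal sketch of that use case, assuming the returned file object behaves like a standard Python text stream. Here `io.StringIO` stands in for the remote file, and the sample data is made up for illustration:

```python
import csv
import io

# In-memory stand-in for a remote file object (assumption: the real object
# would expose the same readline() and iteration behaviour).
stream = io.StringIO("id,name,score\n1,alice,9.5\n2,bob,7.0\n")

# readline() consumes only the first line, leaving the pointer at the first data row.
header = next(csv.reader([stream.readline()]))

# The remaining lines can then be parsed against the header.
rows = [dict(zip(header, values)) for values in csv.reader(stream)]
```

Without `readline()`, the caller would have to start iterating over the whole stream just to peel off the first line.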

piskvorky · 2015-02-18T18:20:21Z

I agree. Can you implement it?

coreyatmoat · 2015-02-18T18:21:46Z

I'll give it a shot.

piskvorky · 2015-05-15T13:13:35Z

@coreyatmoat what's your progress?

coreyatmoat · 2015-05-15T16:50:52Z

Sadly, none. A while back a co-worker wrote https://gist.github.com/coreyatmoat/c961f99dfb0cfabdda54 as a sort of workaround, but I haven't had time to figure out how it would fit into your framework.

Mgutjahr · 2015-10-12T15:50:00Z

+1

mpenkov · 2016-06-10T19:19:59Z

I'm having a look at this now, since one of my projects needs this to work too :)

Is there any reason why we need to maintain independent pointers for read and __iter__? The default implementation of readline shares the file pointer with read (and other I/O operations), to the best of my knowledge.
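
For reference, the shared-pointer behaviour of the standard library can be seen with `io.BytesIO`. This only illustrates how built-in file objects behave; it says nothing about how this library is implemented:

```python
import io

buf = io.BytesIO(b"first line\nsecond line\nthird line\n")

head = buf.readline()   # reads up to and including the first newline
chunk = buf.read(6)     # continues from where readline() stopped
position = buf.tell()   # one offset shared by every operation
rest = list(buf)        # iteration resumes from that same offset
```

Here `readline()`, `read()` and iteration all advance the same offset, so nothing is read twice.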

piskvorky · 2016-06-11T00:09:59Z

Thanks @mpenkov.

The "independent pointer" came from difficulties with buffering, IIRC; the reason was purely technical (CC @ziky90). That behaviour is not part of the API contract (I don't think it's even documented).

If we can make a single pointer work, that's fine, even preferable.
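
One way a single pointer could work over a chunked source is sketched below. This is a hypothetical illustration, not the code in this repository: the class name `ChunkedReader` and the plain iterator of byte chunks standing in for the S3 response are both assumptions.

```python
class ChunkedReader:
    """Hypothetical reader over an iterator of byte chunks.

    read(), readline() and iteration all consume from one shared buffer,
    so there is a single logical position in the stream.
    """

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buffer = b""

    def _fill(self):
        """Pull one more chunk into the buffer; return False at end of input."""
        try:
            self._buffer += next(self._chunks)
            return True
        except StopIteration:
            return False

    def _take(self, size):
        """Remove and return the first `size` bytes of the buffer."""
        data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data

    def read(self, size=-1):
        if size < 0:
            # Drain the source completely, then return everything buffered.
            while self._fill():
                pass
            return self._take(len(self._buffer))
        while len(self._buffer) < size and self._fill():
            pass
        return self._take(size)

    def readline(self):
        # Keep fetching chunks until a newline is buffered or input ends.
        while b"\n" not in self._buffer and self._fill():
            pass
        end = self._buffer.find(b"\n")
        return self._take(end + 1 if end >= 0 else len(self._buffer))

    def __iter__(self):
        return self

    def __next__(self):
        line = self.readline()
        if not line:
            raise StopIteration
        return line
```

With input chunks `[b"a,b\n1,", b"2\n3,4"]`, a `readline()` followed by `read(2)` and then iteration walks through the stream once, with no bytes repeated or skipped. The sketch rescans the buffer for a newline on every fetch, which is simple but not tuned for very long lines.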

mpenkov · 2016-06-11T06:52:18Z

@piskvorky OK. I think I've got it working. Have a look here: https://github.com/mpenkov/smart_open/commit/aca3a18358afb5f42b5f7e5c1355e8ff93a9bcea

I made a separate branch (readline) for it, since an open pull request (#75) is currently using the clone's master. This branch also contains the gzip changes.

What do you think is the cleanest way of getting this into your repo? The simplest I can think of:

- Close #75 (Resolve issue #12: unable to read gz from S3)
- Open a new pull request from my new readline branch

Another way is:

- You guys merge #75 (Resolve issue #12: unable to read gz from S3) whenever you're ready
- I update my clone
- I make a new pull request for the readline stuff

Let me know if there's a simpler or more convenient way.

piskvorky · 2016-06-11T10:55:55Z

Great stuff! @tmylk please review.

Re. branches: it's your call. If splitting the changes into two separate PRs is too complicated, doing both in one branch is fine with me. We appreciate your work; this is much-needed functionality.

mpenkov · 2016-06-12T16:22:33Z

@piskvorky @tmylk I've authored a request to pull from my readline to your master (#76). Let me know if you need anything else from me.

- Bundle gzipstream to enable streaming of gzipped content from S3
- Update gzipstream to avoid deep recursion
- Implement readline for S3
- Add pip requirements.txt

mpenkov · 2016-07-25T07:13:20Z

@piskvorky @tmylk I think we can close this now. 78c461e resolved this.

piskvorky mentioned this issue Aug 7, 2015: missing readline in S3OpenRead #27 (Closed)

ziky90 mentioned this issue Aug 7, 2015: Fixed compatibility with S3 for gensim.utils.SaveLoad piskvorky/gensim#422 (Merged)

piskvorky assigned tmylk Jun 11, 2016

tmylk pushed a commit that referenced this issue Jun 27, 2016: Resolve issues #12 (gzipped S3) and #13 (readline) (#76), 78c461e

- Bundle gzipstream to enable streaming of gzipped content from S3
- Update gzipstream to avoid deep recursion
- Implement readline for S3
- Add pip requirements.txt

tmylk closed this as completed Sep 4, 2016
