Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partition #2

Closed
wants to merge 3 commits into from
Closed

Partition #2

wants to merge 3 commits into from

Conversation

mrocklin
Copy link
Member

@mrocklin mrocklin commented Apr 6, 2014

OK, here is a "getting the feet wet" pull request.

At first I just relied on zip and izip_longest, as we do in toolz. This yielded a 30% speedup.

Then I tried making an iterator class and ran into weird issues. I get a fair number of stackoverflows. In particular I get some strange integer wraparound at 256. Does Cython use one byte integers somehow? Anyway, if you have any thoughts on debugging this I'd be interested.

Fail case on my machine

import cytoolz
list(cytoolz.partition(5, range(1000)))

oddly this works

list(cytoolz.partition(5, range(200)))

but this doesn't

timeit list(cytoolz.partition(5, range(200)))

Conflicts:
	cytoolz/itertoolz/core.pyx
	cytoolz/itertoolz/tests/test_core.py
@mrocklin
Copy link
Member Author

mrocklin commented Apr 6, 2014

@eriknw do you get notifications on this repository?

@eriknw
Copy link
Member

eriknw commented Apr 6, 2014

Yeah, I get notifications.

PyTuple_SET_ITEM "steals" a reference (it takes responsibility for garbage collecting the element), so you need to do Py_INCREF(val) before calling PyTuple_SET_ITEM. Unless you handle PyObject* values, you almost never need to manually increment or decrements reference counts. Cython is actually pretty nice in this regard. Manually setting tuples and lists are the only exceptions I know about.

I do this for get. It was one of the lessons I learned and wrote down. I should probably share these in a wiki!

@mrocklin
Copy link
Member Author

mrocklin commented Apr 6, 2014

Thanks for the tip. That worked wonderfully.

Unfortunately my custom code is slower than the zip solution.

In [1]: import cytoolz

In [2]: import toolz

In [3]: data = range(1000)

In [4]: timeit list(toolz.partition(5, data))
100000 loops, best of 3: 12.9 µs per loop

In [5]: timeit list(cytoolz.partition(5, data))
100000 loops, best of 3: 16.2 µs per loop

I'll revert or (if I get adventurous), look into line profiling (I remember being impressed with cython tools for this).

@eriknw
Copy link
Member

eriknw commented Apr 6, 2014

Yeah, I wasn't sure how to best go about partition. Since your (perfectly reasonable) approach didn't beat the original, I bet the original implementation is pretty good! I think there is room for improvement using advanced (read: dirty-ish) techniques, but can you benchmark the following for ideal partitions to see if it's worth pursuing?

    def __next__(self):
        cdef tuple result
        cdef int i 
        cdef object val
        result = PyTuple_New(self.n)
        for i in range(self.n):  # efficient loop if `i` is cdef integer type
            val = next(self.seq)
            Py_INCREF(val)
            PyTuple_SET_ITEM(result, i, val)
        return result

I actually haven't done much with profile yet. I should probably try it out. Annotations--cython -a some_file.pxd, which creates an html file--are also useful.

@mrocklin
Copy link
Member Author

mrocklin commented Apr 6, 2014

No dice. I'll rebase out the recent commits and stick with the solution in mrocklin@4c29e14

In [1]: import toolz
In [2]: import cytoolz

In [3]: data = range(1000)

In [4]: timeit list(toolz.partition(5, data))
100000 loops, best of 3: 12.5 µs per loop

In [5]: timeit list(cytoolz.partition(5, data))
100000 loops, best of 3: 14.5 µs per loop

@eriknw
Copy link
Member

eriknw commented Apr 6, 2014

Awesome (yes, I think it's awesome when toolz wins)!

Does map return an iterator or a list? I actually haven't used map in cytoolz yet, and you may have noticed that cytoolz.compatibility doesn't yet exist.

Another pointer: enumerate also makes efficient loops if i is an integer type. It's not more efficient than if you do it yourself, but it is as efficient and can be more readable.

Nice job getting your feet wet by the way.

I can't wait to flatten the directory structure in cytoolz, which will reduce the number of things to do when adding a new function.

@eriknw
Copy link
Member

eriknw commented Apr 8, 2014

I played around with this some more, and, wow, the toolz implementation is insane! We should definitely stick with it.

We'll need to add cytoolz/compatibility.py for things like map, zip, zip_longest, etc.

I'm going to go ahead and flatten cytoolz like you did for toolz. It will then be easy to add the stock implementation of partition, and--unless you want to--I can go ahead and do it if you don't mind. It's awesome that toolz.partition performs so well; thanks for this discovery!

@mrocklin
Copy link
Member Author

mrocklin commented Apr 8, 2014

Yeah, I was confused the most recent time that I did timings, toolz was beating cytoolz. Perhaps this is always the way that it was and when I reported the 30% improvement I had it flipped around.

@eriknw
Copy link
Member

eriknw commented Apr 8, 2014

Perhaps this is always the way that it was and when I reported the 30% improvement I had it flipped around.

I wouldn't be surprised. I don't see where there is 30% to be gained. Heh, you chose a heck of a function with which to get your feet wet.

eriknw added a commit that referenced this pull request Apr 9, 2014
@eriknw
Copy link
Member

eriknw commented Apr 9, 2014

Closing. I just added the stock version after flattening the directory structure.

@eriknw eriknw closed this Apr 9, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants