Improve skip key #47

CarstVaartjes · 2015-09-22T16:22:22Z

The skip key is a slightly ugly way to handle irrelevant results (rows removed by filtering) during a groupby:

finding it can be slow in the current implementation (now cythonized)
getmask would be nicer; we also have to look at how bcolz iterblocks handles filters
how it's removed at the end (by manipulating a list of ndarrays) is not efficient
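
To make the points above concrete, here is a minimal NumPy-only sketch of a skip-key style groupby. It assumes the skip key is a sentinel group index placed one past the last real group; that assumption, the function name `grouped_sum_with_skip_key`, and the sample data are all illustrative and are not taken from the bquery source. Filtered-out rows are routed to the sentinel group, and the sentinel slot is dropped at the end with a single slice instead of by editing a list of ndarrays.

```python
import numpy as np


def grouped_sum_with_skip_key(group_ids, values, keep):
    """Sum `values` per group, sending filtered-out rows to a sentinel group.

    Hypothetical helper for illustration only; not bquery's implementation.
    """
    n_groups = int(group_ids.max()) + 1
    skip_key = n_groups  # assumed sentinel: one past the last real group index
    # Rows that fail the filter are re-labelled with the sentinel group.
    routed = np.where(keep, group_ids, skip_key)
    totals = np.bincount(routed, weights=values, minlength=n_groups + 1)
    # Drop the sentinel slot with one slice.
    return totals[:skip_key]


group_ids = np.array([0, 1, 0, 2, 1, 2, 0, 1])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
keep = values > 2.0  # the filter: rows with value <= 2.0 are irrelevant
skip_key_result = grouped_sum_with_skip_key(group_ids, values, keep)
```

With the sentinel at a known position, removing it is a constant-time slice; whether that maps cleanly onto the existing result structure would need checking against the actual code.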

waylonflinn · 2015-10-05T15:46:39Z

Would whereblocks be a good way to handle this?
I'm working on a PR that will add pivot table style aggregation, and I'm pursuing a method that would make extensive use of the existing filtering functionality. If there are speed gains to be had, I'd like to pursue them.

CarstVaartjes · 2015-10-05T19:46:07Z

whereblocks isn't Cython-based; @FrancescElies's work on iterblocks will work (and make it a lot more readable), but we still have to apply the filter ourselves :/
We could do a np.getmask on the chunk array, but I'm not sure whether that would be quicker than just looping through it. It also depends on how much you filter out, of course.
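
As a rough illustration of applying the filter ourselves per chunk, the sketch below uses a plain boolean mask on each slice and skips chunks where nothing survives the filter. It is NumPy-only and stands in for iterating over bcolz chunks; `grouped_sum_with_mask`, `chunk_len`, and the sample data are hypothetical names chosen for this example, and the boolean mask is a substitute for whatever getmask-style call would be used in practice.

```python
import numpy as np


def grouped_sum_with_mask(group_ids, values, keep, n_groups, chunk_len=4):
    """Sum `values` per group chunk by chunk, ignoring rows where `keep` is False.

    Hypothetical helper for illustration only; plain slices stand in for chunks.
    """
    totals = np.zeros(n_groups, dtype=np.float64)
    for start in range(0, len(values), chunk_len):
        stop = start + chunk_len
        mask = keep[start:stop]
        if not mask.any():
            continue  # the whole chunk is filtered out, so skip it
        totals += np.bincount(
            group_ids[start:stop][mask],
            weights=values[start:stop][mask],
            minlength=n_groups,
        )
    return totals


group_ids = np.array([0, 1, 0, 2, 1, 2, 0, 1])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
keep = values > 2.0
mask_result = grouped_sum_with_mask(group_ids, values, keep, n_groups=3)
```

Whether this beats a row-by-row loop likely depends on how selective the filter is, so it would be worth timing both on representative data before committing to either.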

waylonflinn · 2015-10-05T22:30:41Z

I found this excellent investigation of timings from @FrancescElies: Comment in Chunks Class iterator, PR 153

It looks like he's just using the top level iterblocks though: test_v5 in bench_iter_carray

Isn't that defined in Python here? iterblocks in toplevel.py

waylonflinn · 2015-10-05T23:13:53Z

I'm also happy to put together a PR that reproduces his earlier PR, since that will likely be hard to merge. Would that be helpful?

CarstVaartjes · 2016-07-04T11:53:06Z

@FrancescAlted also did some work to improve performance here (including tuples vs. namedtuples). I will check whether we can use that to improve this and save ourselves unneeded chunk decompressions.
