Topn fill #57

tgruben · 2016-02-29T16:30:28Z

The topn needs a bit of adjustment. The process should be:

scatter out topn to each slice
gather results
-->fetch missing counts from non-reporting slices<-- (this is the missing part)
sort and return top n
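
The steps above can be sketched roughly as follows. This is only an illustration under assumptions, not the executor's actual code: `Pair`, `Slice`, and `TopNFill` are hypothetical names, and each slice is modeled as an in-memory map rather than a remote node.

```go
package main

import "sort"

// Pair is a hypothetical (row ID, count) result.
type Pair struct {
	ID    uint64
	Count uint64
}

// Slice is a hypothetical stand-in for one slice: row ID -> count.
type Slice map[uint64]uint64

// sortPairs orders pairs by count descending, then by ID ascending.
func sortPairs(p []Pair) {
	sort.Slice(p, func(i, j int) bool {
		if p[i].Count != p[j].Count {
			return p[i].Count > p[j].Count
		}
		return p[i].ID < p[j].ID
	})
}

// TopN returns this slice's local top n rows.
func (s Slice) TopN(n int) []Pair {
	pairs := make([]Pair, 0, len(s))
	for id, c := range s {
		pairs = append(pairs, Pair{ID: id, Count: c})
	}
	sortPairs(pairs)
	if len(pairs) > n {
		pairs = pairs[:n]
	}
	return pairs
}

// TopNFill scatters TopN to every slice, gathers the results, fetches the
// counts that slices did not report for candidate rows, then sorts and trims.
func TopNFill(slices []Slice, n int) []Pair {
	totals := make(map[uint64]uint64)
	reported := make([]map[uint64]bool, len(slices))

	// Scatter and gather: sum each slice's local top n.
	for i, s := range slices {
		reported[i] = make(map[uint64]bool)
		for _, p := range s.TopN(n) {
			totals[p.ID] += p.Count
			reported[i][p.ID] = true
		}
	}

	// Fill: for every candidate row, add the count from each slice that
	// did not report it. Only existing keys of totals are updated here.
	for id := range totals {
		for i, s := range slices {
			if !reported[i][id] {
				totals[id] += s[id]
			}
		}
	}

	// Sort and return the top n.
	out := make([]Pair, 0, len(totals))
	for id, c := range totals {
		out = append(out, Pair{ID: id, Count: c})
	}
	sortPairs(out)
	if len(out) > n {
		out = out[:n]
	}
	return out
}
```

With slices `{1: 10, 2: 9, 3: 8}` and `{3: 8, 4: 7, 1: 1}` and n = 2, skipping the fill step would rank rows 1 and 2 first, while the fill step yields row 3 (16) and row 1 (11). Note that in this sketch a row that never appears in any slice's local top n is still not considered.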

I've added a test, TestExecutor_Execute_TopN_fill, to express the problem.

Do you think you can fix that, @benbjohnson?

benbjohnson · 2016-02-29T19:11:50Z

@tgruben Yep, I can take a look.

benbjohnson · 2016-02-29T23:29:59Z

@tgruben Did the previous implementation do a refetch? From what I see in 6822dd9, it does a TopN() to fetch N*2 results from the nodes and trims that down to N results.

This is the only change required for N*2: master...benbjohnson:top-n-2

I could see doing a min N of 10 or so just to make sure that small Ns always fetch a large enough result set.
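
A minimal sketch of that over-fetch rule, assuming the floor applies to the number of results requested from each node. `fetchSize` and `minFetch` are hypothetical names, the value 10 is just the "10 or so" mentioned above, and callers are assumed to pass n > 0:

```go
package main

// minFetch is a hypothetical floor on how many results to request per node.
const minFetch = 10

// fetchSize returns how many results to request from each node for a
// TopN of n: twice n, but never fewer than minFetch.
func fetchSize(n int) int {
	size := n * 2
	if size < minFetch {
		size = minFetch
	}
	return size
}
```

For example, `fetchSize(3)` requests 10 results and `fetchSize(20)` requests 40. Over-fetching like this only narrows the gap; it does not replace refetching counts from slices that did not report a row.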

Here are the lines from the previous implementation I was referencing:

https://github.com/umbel/pilosa/blob/6822dd99b5245195dbe45fbe9cb1718fdfb51a2f/core/topn.go#L174
https://github.com/umbel/pilosa/blob/6822dd99b5245195dbe45fbe9cb1718fdfb51a2f/core/topn.go#L190

tgruben · 2016-03-01T00:57:45Z

Yes, but most of the code that does the fill-in was located in core/query.go; look at TopNPackage and check_pair around line 252.

TopNPackage is basically the results of the first scatter. It gets the first batch, and then check_pair sends out the requests for the fill-in, so it was a little more complicated than just doubling the request.

Does that code make sense, or do I need to expand? That one function (CatQueryStepHandler) is basically the "combine results from all queries" part.

-Todd


benbjohnson · 2016-03-01T02:51:27Z

@tgruben Yep, that makes sense. I missed that refetching part the first time around. I'll get that added in.


benbjohnson self-assigned this Feb 29, 2016

benbjohnson mentioned this issue Mar 3, 2016

Refetch full counts for TopN() #58

Merged

tgruben closed this as completed in #58 Mar 3, 2016

tgruben pushed a commit to tgruben/pilosa that referenced this issue Sep 4, 2019

Merge pull request FeatureBaseDB#57 from yuce/pilosa-update

9958cd5

BLOCKED: Pilosa update

jaffee pushed a commit that referenced this issue Jan 12, 2020

Merge pull request #57 from tgruben/fix-minmax-count

c5aeed0

return total match counts for either min or max
