Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding filter to vector scanner #391

Closed
wants to merge 3 commits into from

Conversation

harsh-ps-2003
Copy link

@harsh-ps-2003 harsh-ps-2003 commented Jan 7, 2024

Fixes #338

Changes
Adding a filtering function to the vector scanner which would take in a step vector and produce a new step vector with filtered samples to optimize the query execution process.

Verification
Added a test case sum(kube_pod_info == 1)

Tests are passing locally using make test

Signed-off-by: Harsh Pratap Singh <harshpratapsingh8210@gmail.com>
Signed-off-by: Harsh Pratap Singh <harshpratapsingh8210@gmail.com>
@harsh-ps-2003
Copy link
Author

harsh-ps-2003 commented Jan 7, 2024

@fpetkovski the make tests are passing locally :

❯ make test
>> running unit tests (without cache)
?       github.com/thanos-io/promql-engine/api  [no test files]
ok      github.com/thanos-io/promql-engine/engine       589.694s
?       github.com/thanos-io/promql-engine/execution    [no test files]
?       github.com/thanos-io/promql-engine/execution/aggregate  [no test files]
?       github.com/thanos-io/promql-engine/execution/binary     [no test files]
?       github.com/thanos-io/promql-engine/execution/exchange   [no test files]
?       github.com/thanos-io/promql-engine/execution/function   [no test files]
?       github.com/thanos-io/promql-engine/execution/model      [no test files]
?       github.com/thanos-io/promql-engine/execution/noop       [no test files]
?       github.com/thanos-io/promql-engine/execution/remote     [no test files]
?       github.com/thanos-io/promql-engine/execution/parse      [no test files]
?       github.com/thanos-io/promql-engine/execution/step_invariant     [no test files]
?       github.com/thanos-io/promql-engine/execution/scan       [no test files]
?       github.com/thanos-io/promql-engine/execution/unary      [no test files]
?       github.com/thanos-io/promql-engine/execution/warnings   [no test files]
?       github.com/thanos-io/promql-engine/extlabels    [no test files]
ok      github.com/thanos-io/promql-engine/execution/storage    (cached)
?       github.com/thanos-io/promql-engine/query        [no test files]
?       github.com/thanos-io/promql-engine/worker       [no test files]
ok      github.com/thanos-io/promql-engine/logicalplan  (cached)
ok      github.com/thanos-io/promql-engine/ringbuffer   (cached)

but make lint is failing!
Please point out if I have messed up something!

Signed-off-by: Harsh Pratap Singh <harshpratapsingh8210@gmail.com>
@harsh-ps-2003 harsh-ps-2003 marked this pull request as ready for review January 7, 2024 10:48
labels labels.Labels
signature uint64
samples *storage.MemoizedSeriesIterator
filterFunc func(model.StepVector) model.StepVector
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think this is ever populated anywhere right?

Copy link
Author

@harsh-ps-2003 harsh-ps-2003 Jan 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be some gaps in my understanding (sorry) but I thought, the filterFunc field within the vectorScanner struct is being populated within the Next method of the vectorSelector struct. The Next method creates a new vectorScanner and assigns the filterFunc to it before returning the result. After the filterFunc is assigned within the Next method of the vectorSelector struct, the actual filtering process occurs when iterating through the vectors and applying the filterFunc to each StepVector. When the Next method is called, it returns a slice of model.StepVector instances. Each model.StepVector in the returned slice is processed by the filterFunc that was assigned to the corresponding vectorScanner during its creation. The filterFunc is applied to each model.StepVector to perform the desired filtration based on the defined logic within the filterFunc. Hence I dont understand how is it not getting populated!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean it needs to be instantiated somewhere, right now it is always nil as it is never assigned any filter function. You would need to do that. Thats also what the linter is complaining about!

@MichaHoffmann
Copy link
Contributor

MichaHoffmann commented Jan 7, 2024

I think we need to add a "FilterFunc" to logicalplan.VectorSelector and then pass it down from execute.go into the constructor of the scan.VectorScan operator. The logical filter should be populated by an optimizer that also removes the binary expression (which is the ultimate benefit ~ no binary operator on top of the vector scan anymore). I think that would be the way to implement the feature.

Edit: that is probably a lot of information! If you have any questions or i can help somehow, let me know!

@harsh-ps-2003
Copy link
Author

harsh-ps-2003 commented Jan 7, 2024

@MichaHoffmann thanks for the guidance! I am kinda noob to PromQL engine, I am trying to solve this just for getting some exposure!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Push boolean operations into scanners
2 participants