-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Add support for first and offset directive in has function. #3970
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: பாலாஜி ஜின்னா <balaji@dgraph.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ A review job has been created and sent to the PullRequest network.
@balajijinnah you can click here to see the review status or cancel the code review job.
|
Super excited about this! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
x/list.go
Outdated
| "github.com/dgraph-io/dgraph/protos/pb" | ||
| ) | ||
|
|
||
| // IsListFull check wheather the posting list is full or not based on the limit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// IsListFull check whether the posting List is full based on the limit
protos/pb.proto
Outdated
| @@ -70,6 +70,7 @@ message Query { | |||
|
|
|||
| uint64 read_ts = 13; | |||
| int32 cache = 14; | |||
| int32 count = 15; // used to limit the number of result. Typically, the count is value of first | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First what? Perhaps rewrite the comment to say
// limit result count. Typically equal to value of first
Signed-off-by: பாலாஜி ஜின்னா <balaji@dgraph.io>
Signed-off-by: பாலாஜி ஜின்னா <balaji@dgraph.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got some comments. Can you please make sure that we already have test cases in query package covering the common use cases for has function?
- First
- Offset
- Offset + first
- First + offset + filter
- First + offset + order
- First + offset + filter + order
Reviewed 4 of 5 files at r1, 1 of 1 files at r2.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @balajijinnah, @manishrjain, and @martinmr)
protos/pb.proto, line 73 at r2 (raw file):
uint64 read_ts = 13; int32 cache = 14; int32 count = 15; // used to limit the number of result. Typically, the count is value of first
Please also say its only being used for has query right now.
query/query.go, line 885 at r2 (raw file):
// { // q(func: has(name), first:1) { // count(name)
So for count(name) SubGraph DoCount would be true but that is different from the has SubGraph. In this case DoCount won't be true for the root subgraph, infact is never is true at root. It is only true when you are requesting for count of a predicate.
Look into this case as well.
{
q(func: has(name), first: 1) {
count(uid)
}
}
In this case uidCount would be true for subgraph at root so you should exclude that case.
query/query.go, line 896 at r2 (raw file):
// - No facet ordering // { // q(func: has(mobile), first:1) {
We can't have facet ordering at root and all the work that we are doing here is for the root has function, so is this case relevant?
query/query.go, line 900 at r2 (raw file):
// } // } // - subgraph should not be a filter, which has connective operations like (and, or , not)
Again this case would be discarded when you check for SrcFunc.Name to be `has.
query/query.go, line 912 at r2 (raw file):
// } // here one case is missing how to differentiate that this not the subgraph of // allofterms(name@en, "jurassic park") filter. This will fail the connective filter.
If its SubGraph of allofterms filter, then SrcFunc.Name won't be has. Which case are you talking about?
query/query.go, line 918 at r2 (raw file):
return sg.SrcFunc != nil && sg.SrcFunc.Name == "has" } if len(sg.Filters) == 0 && len(sg.Params.Order) == 0 && !sg.Params.DoCount &&
Please check that this should work with after argument as well.
query/query.go, line 919 at r2 (raw file):
} if len(sg.Filters) == 0 && len(sg.Params.Order) == 0 && !sg.Params.DoCount && isSupportedFunction() && len(sg.Params.FacetOrder) == 0 && len(sg.FilterOp) == 0 {
Do we need to check FilterOp here?
query/query.go, line 920 at r2 (raw file):
if len(sg.Filters) == 0 && len(sg.Params.Order) == 0 && !sg.Params.DoCount && isSupportedFunction() && len(sg.Params.FacetOrder) == 0 && len(sg.FilterOp) == 0 { // Offset also added because, we need n results to trim the offset.
Would be nice to do some benchmarks with large values of offsets. At some point, we might want to pass this in as well to the worker and only return the relevant ids.
Signed-off-by: பாலாஜி ஜின்னா <balaji@dgraph.io>
Signed-off-by: பாலாஜி ஜின்னா <balaji@dgraph.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 6 files reviewed, 10 unresolved discussions (waiting on @balajijinnah, @manishrjain, @martinmr, @pawanrawal, and @pullrequest[bot])
query/query.go, line 896 at r2 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
We can't have facet ordering at root and all the work that we are doing here is for the root has function, so is this case relevant?
Removed it
query/query.go, line 912 at r2 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
If its SubGraph of
alloftermsfilter, thenSrcFunc.Namewon't behas. Which case are you talking about?
Didn't know that we don't support first and offset in filter. I wrote this case by having has in the mind :(
query/query.go, line 918 at r2 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Please check that this should work with
afterargument as well.
AfterUID is passed as part for taskQuery. and we starting iteration from the uid onward. So, it should be fine. It won't affect the limit of the result.
query/query.go, line 919 at r2 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Do we need to check
FilterOphere?
Didn't know that we don't support first in the filter
x/list.go, line 23 at r1 (raw file):
Previously, pullrequest[bot] wrote…
// IsListFull check whether the posting List is full based on the limit
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one comment about the result of one of the tests. Let me know what do you find out about it. One more thing I'd recommend is to run all the tests by commenting out the new code and verify that you get some results as you do now. This would make sure results are identical using the new and old way.
Reviewed 1 of 4 files at r3.
Reviewable status: 3 of 6 files reviewed, 7 unresolved discussions (waiting on @balajijinnah, @manishrjain, @martinmr, @pawanrawal, and @pullrequest[bot])
query/query4_test.go, line 358 at r3 (raw file):
"q": [ { "name": ""
This is a bit strange that empty name is the first one that we get here. Can you check why this happens since "" < "Badger"
…aji/task_offset
|
@pawanrawal results are taken from master branch and cross verified. Regarding the sorting, It coming in the wrong order in master as well. I'll raise a separate issue for that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 6 files reviewed, 13 unresolved discussions (waiting on @balajijinnah, @martinmr, @pawanrawal, and @pullrequest[bot])
query/query.go, line 886 at r4 (raw file):
FacetsFilter: sg.facetsFilter, ExpandAll: sg.Params.ExpandAll, Count: int32(count),
Call it First to avoid confusion with docount. You are getting the first N results.
query/query.go, line 896 at r4 (raw file):
// calculateCountResult returns the count of result we need to proceed query further down. func calculateCountResult(sg *SubGraph) int {
CalculateFirstN
query/query.go, line 897 at r4 (raw file):
// calculateCountResult returns the count of result we need to proceed query further down. func calculateCountResult(sg *SubGraph) int { // by default count is zero. (zero will retrive all the results)
Should be maxint. Not zero by default.
query/query.go, line 919 at r4 (raw file):
// } // } isSupportedFunction := func() bool {
Why is this a function ? Can be a bool.
worker/task.go, line 2033 at r4 (raw file):
result.Uids = append(result.Uids, pk.Uid) // We'll stop fetching if we fetch the required count. if x.IsListFull(result, q.Count) {
no Need for a func. Len >= q.Count should be sufficient.
x/list.go, line 25 at r4 (raw file):
// IsListFull check whether the posting List is full based on the limit // by default it return false, if the limit is set to zero. func IsListFull(l *pb.List, limit int32) bool {
Don’t need this function . Remove.
Signed-off-by: பாலாஜி ஜின்னா <balaji@dgraph.io>
Signed-off-by: பாலாஜி ஜின்னா <balaji@dgraph.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 5 files reviewed, 13 unresolved discussions (waiting on @balajijinnah, @manishrjain, @martinmr, @pawanrawal, and @pullrequest[bot])
protos/pb.proto, line 73 at r1 (raw file):
Previously, pullrequest[bot] wrote…
First what? Perhaps rewrite the comment to say
// limit result count. Typically equal to value of first
Done.
protos/pb.proto, line 73 at r2 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Please also say its only being used for
hasquery right now.
Done.
query/query.go, line 885 at r2 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
So for
count(name)SubGraphDoCountwould be true but that is different from thehasSubGraph. In this caseDoCountwon't be true for the root subgraph, infact is never is true at root. It is only true when you are requesting for count of a predicate.Look into this case as well.
{ q(func: has(name), first: 1) { count(uid) } }In this case
uidCountwould be true for subgraph at root so you should exclude that case.
Done.
query/query.go, line 900 at r2 (raw file):
Previously, pawanrawal (Pawan Rawal) wrote…
Again this case would be discarded when you check for
SrcFunc.Nameto be `has.
Done.
query/query.go, line 886 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Call it First to avoid confusion with docount. You are getting the first N results.
Done.
query/query.go, line 896 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
CalculateFirstN
Done.
query/query.go, line 897 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Should be maxint. Not zero by default.
Done.
query/query.go, line 919 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Why is this a function ? Can be a bool.
Done.
worker/task.go, line 2033 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
no Need for a func. Len >= q.Count should be sufficient.
Done.
x/list.go, line 25 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Don’t need this function . Remove.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 5 files reviewed, 13 unresolved discussions (waiting on @balajijinnah, @manishrjain, @martinmr, @pawanrawal, and @pullrequest[bot])
query/query.go, line 896 at r5 (raw file):
// calculateFirstN returns the count of result we need to proceed query further down. func calculateFirstN(sg *SubGraph) int {
This can just return int32.
query/query.go, line 898 at r5 (raw file):
func calculateFirstN(sg *SubGraph) int { // by default count is zero. (zero will retrive all the results) count := math.MaxInt8
MaxInt8 would be 128. Surely, that's not what you want.
Signed-off-by: பாலாஜி ஜின்னா <balaji@dgraph.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 5 files reviewed, 13 unresolved discussions (waiting on @balajijinnah, @manishrjain, @martinmr, @pawanrawal, and @pullrequest[bot])
query/query.go, line 896 at r5 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
This can just return int32.
Done.
query/query.go, line 898 at r5 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
MaxInt8 would be 128. Surely, that's not what you want.
Done.
…aji/task_offset Signed-off-by: பாலாஜி ஜின்னா <balaji@dgraph.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 5 files at r5, 3 of 3 files at r6.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @balajijinnah, @manishrjain, @martinmr, @pawanrawal, and @pullrequest[bot])
Signed-off-by: பாலாஜி ஜின்னா <balaji@dgraph.io>
close #3964
Implementation:
first is passed as a parameter in worker query and the iteration is stopped after we get the required
result.
TODO:
Signed-off-by: பாலாஜி ஜின்னா balaji@dgraph.io
This change is