-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Return mutated uids which are part of an upsert operation in a map. #4145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return mutated uids which are part of an upsert operation in a map. #4145
Conversation
…we can add assertions to our upsert tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ A review job has been created and sent to the PullRequest network.
@pawanrawal you can click here to see the review status or cancel the code review job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- In
"uids"map, it has keys asuid(v)whereas in"mutated", it has keys as variable name. That is not consistent. - I think
mutatedmay not be a good choice for the key name, creation of new UIDs is also technically called mutation. - Not sure whether there was a test for bulk upsert too, just check once please.
- Need to update the docs too
- Ensure that
dgoPR is merged and vendor-ed after the merge
Reviewable status: 0 of 10 files reviewed, 1 unresolved discussion (waiting on @mangalaman93, @manishrjain, and @pawanrawal)
x/x.go, line 60 at r1 (raw file):
// ContextKey is used to set options in the context object. type ContextKey int
Adding this type doesn't seem very useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the case when a variable is defined in the query and not used in mutation (may be used in the query itself). It will show up in the response of mutated, right?
Additionally, I was thinking how it would work with value variables. What about the case when uid(v) is used in the object, i.e. modifying edges.
For the current situation, we could call it vars instead of mutated because we are just returning values for the defined variables. Not sure whether that is what we want though.
Reviewable status: 0 of 10 files reviewed, 1 unresolved discussion (waiting on @mangalaman93, @manishrjain, and @pawanrawal)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I am wondering whether it can be extended to multiple mutations where I can use same variable in multiple mutations. Just thinking out loud.
Reviewable status: 0 of 10 files reviewed, 1 unresolved discussion (waiting on @mangalaman93, @manishrjain, and @pawanrawal)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
edgraph/server.go
Outdated
| } | ||
| } | ||
|
|
||
| updateUIDInMutations(gmu, varToUID) | ||
| updateValInMutations(gmu, qr) | ||
| return l, nil | ||
| // varToUID is returned to the client, lets delete the dummy var that we put in there. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's describe here why varToUID is there, and why we should delete it.
edgraph/server.go
Outdated
| } | ||
| } | ||
|
|
||
| updateUIDInMutations(gmu, varToUID) | ||
| updateValInMutations(gmu, qr) | ||
| return l, nil | ||
| // varToUID is returned to the client, lets delete the dummy var that we put in there. | ||
| delete(varToUID, varName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just wondering, why is varToUID there to begin with?
x/x.go
Outdated
| @@ -108,6 +112,9 @@ const ( | |||
| {"predicate":"dgraph.user.group","list":true, "reverse": true, "type": "uid"}, | |||
| {"predicate":"dgraph.group.acl","type":"string"} | |||
| ` | |||
|
|
|||
| // DebugKey is the key used to toggle debug mode. | |||
| DebugKey ContextKey = iota | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DebugKey = 0, could easily overlap with another key in a context. It would be better if had DebugKey be a string.
dgraph/cmd/alpha/http.go
Outdated
| @@ -361,7 +366,8 @@ func mutationHandler(w http.ResponseWriter, r *http.Request) { | |||
| req.StartTs = startTs | |||
| req.CommitNow = commitNow | |||
|
|
|||
| ctx := attachAccessJwt(context.Background(), r) | |||
| ctx := context.WithValue(context.Background(), x.DebugKey, isDebugMode) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. I think it would be better if DebugKey is a string.
edgraph/server.go
Outdated
| if err != nil { | ||
| return resp, err | ||
| } | ||
| parsingTime += l.Parsing | ||
| if x.IsDebugRequest(ctx) && len(varToUID) > 0 { | ||
| // There could be a lot of these uids which could blow up the response size, specially for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: especially instead of specially
this comment should go above the if statement.
dgraph/cmd/alpha/http.go
Outdated
| @@ -207,7 +207,7 @@ func queryHandler(w http.ResponseWriter, r *http.Request) { | |||
| return | |||
| } | |||
|
|
|||
| ctx := context.WithValue(context.Background(), query.DebugKey, isDebugMode) | |||
| ctx := context.WithValue(context.Background(), x.DebugKey, isDebugMode) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted below, DebugKey is an int, and can get confusing. We should have x.DebugKey be a string, or have DebugKey be the value, with the field be something like mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed it to MutatedVars and we are only returning values of variables that were part of some mutation.
What about the case when a variable is defined in the query and not used in mutation (may be used in the query itself). It will show up in the response of mutated, right?
Good point, I now remove variables which were not part of a mutation.
Additionally, I was thinking how it would work with value variables. What about the case when uid(v) is used in the object, i.e. modifying edges.
Same thing, we only return uids corresponding to v in this case and not the value variables.
Also, I am wondering whether it can be extended to multiple mutations where I can use same variable in multiple mutations.
Should work with multiple mutations, we just take the union of the variables that were used in any mutation.
I have addressed the other questions @mangalaman93. Let me know if you think anything else needs to be changed.
Reviewed 2 of 10 files at r1, 8 of 9 files at r2.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @danielmai, @mangalaman93, @manishrjain, @MichaelJCompton, and @pullrequest[bot])
edgraph/server.go, line 681 at r1 (raw file):
Previously, pullrequest[bot] wrote…
Let's describe here why varToUID is there, and why we should delete it.
Done.
edgraph/server.go, line 682 at r1 (raw file):
Previously, pullrequest[bot] wrote…
just wondering, why is
varToUIDthere to begin with?
added a comment.
x/x.go, line 60 at r1 (raw file):
Previously, mangalaman93 (Aman Mangal) wrote…
Adding this type doesn't seem very useful.
reverted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
I have added support for Go Modules now, can remove the changes in vendor folder now.
-
I am thinking just
varsis fine instead ofmutation_vars.mutation_varsjust looks too long.varslooks more likeuidsthat we already have in the response. In a normal mutation, we only return new UIDs and not the modified ones. Similarly, here, we return the UIDs stored in variables which are modified, not the rest. Calling itvarsseems fine to me. -
Just realized that,
uid(u)is returned as key when its a variable whereas if it is a blank node we'd just returnu. For example, look at the following response. Could you add this as a comment somewhere in the code.
{
"data": {
"code": "Success",
"message": "Done",
"mutation_vars": {
"uid(u)": [
"6",
"9"
]
},
"uids": {
"u": "0xf"
}
},
"extensions": {
"server_latency": {
"parsing_ns": 24593,
"processing_ns": 33807452
},
"txn": {
"start_ts": 81,
"commit_ts": 82,
"preds": [
"1-name"
]
}
}
}
for the query:
upsert {
query {
users(func: eq(name, "user1")) {
u as uid
}
}
mutation {
set {
uid(u) <name> "user1" .
_:u <name> "user2" .
}
}
}
- I ran the following query and got a response that has
6instead of0x6in ratel. We should add a test for this too.
Steps to reproduce:
Schema:
name: string @index(exact) .
branch: string .
amount: float .
Mutation:
{
set {
_:user1 <name> "user1" .
_:user1 <amount> "10" .
_:user2 <name> "user2" .
_:user2 <amount> "100" .
_:user3 <name> "user3" .
_:user3 <amount> "1000" .
}
}
Upsert:
upsert {
query {
u as var(func: has(amount)) {
amt as amount
}
me () {
updated_amt as math(amt+1)
}
}
mutation {
set {
uid(u) <amount> val(updated_amt) .
}
}
}
Result:
{
"data": {
"code": "Success",
"message": "Done",
"mutation_vars": {
"uid(u)": [
"6",
"7",
"8"
]
},
"uids": {}
},
"extensions": {
"server_latency": {
"parsing_ns": 17432,
"processing_ns": 29050676
},
"txn": {
"start_ts": 40,
"commit_ts": 41,
"preds": [
"1-amount"
]
}
}
}
I also did the same thing using curl:
curl 'http://localhost:8180/mutate?commitNow=true' -H 'Content-Type: application/rdf' --data $'upsert {\n query {\n u as var(func: has(amount)) {\n amt as amount\n }\n me () {\n updated_amt as math(amt+1)\n }\n }\n\n mutation {\n set {\n uid(u) <amount> val(updated_amt) .\n }\n }\n}'
Same result:
{"data":{"code":"Success","message":"Done","mutation_vars":{"uid(u)":["6","7","8"]},"uids":{}},"extensions":{"server_latency":{"parsing_ns":54287,"processing_ns":36651897},"txn":{"start_ts":46,"commit_ts":47,"preds":["1-amount"]}}}
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @danielmai, @manishrjain, @MichaelJCompton, @pawanrawal, and @pullrequest[bot])
edgraph/server.go, line 521 at r3 (raw file):
// There could be a lot of these uids which could blow up the response size, especially // for bulk mutations, hence only return the first million. if numUids+len(uids) > 1e6 {
This addition is done twice, could be just done once.
edgraph/server.go, line 595 at r3 (raw file):
// doQueryInUpsert processes the query in upsert block. // It return the latency and a map of variables => [ uids ...] used in the mutation.
returns*
in the upsert
edgraph/server.go, line 690 at r3 (raw file):
updateUIDInMutations(gmu, varToUID) updateValInMutations(gmu, qr) // varToUID is returned to the client, lets delete the dummy var that we put in there for
Let's
wiki/content/mutations/index.md, line 875 at r3 (raw file):
* If the variable is empty i.e. no node matched the query, the `uid` function returns a new UID in case of a `set` operation and is thus treated similar to a blank node. On the other hand, for `delete/del` operation, it returns no UID, and thus the operation becomes a no-op and is silently ignored. * If the variable stores one or more than one UIDs, the `uid` function returns all the UIDs stored in the variable. In this case, the operation is performed on all the UIDs returned, one at a time. * A mapping of the uid variable to the list of uids that were used to execute the mutation are also returned as part of the response (as part of the `mutation_vars` key). Currently only upto a million uids are returned as part of the result.
Not sure why it was added here. This doesn't seem like the right place for this doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still not sure about Multiple Mutations. Think about a use case like this:
upsert {
query {
v as ...
}
mutation1 @if(false condition) { ... }
mutation2 @if(true condition) { .... }
}
I still can't figure out from the response which mutation was successfully executed.
Reviewable status: 2 of 8 files reviewed, 6 unresolved discussions (waiting on @danielmai, @manishrjain, @MichaelJCompton, @pawanrawal, and @pullrequest[bot])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to merge once the existing comments are addressed. Also, get the docs reviewed too.
Reviewed 1 of 10 files at r1, 1 of 9 files at r2, 6 of 6 files at r4.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @danielmai, @manishrjain, @MichaelJCompton, @pawanrawal, and @pullrequest[bot])
edgraph/server.go, line 521 at r3 (raw file):
Previously, mangalaman93 (Aman Mangal) wrote…
This addition is done twice, could be just done once.
You could directly assign to numUids. No need for totalUids
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But have a bunch of comments, so address those.
Reviewable status: 1 of 8 files reviewed, 7 unresolved discussions (waiting on @danielmai, @mangalaman93, @manishrjain, @MichaelJCompton, @pawanrawal, and @pullrequest[bot])
dgraph/cmd/alpha/upsert_test.go, line 36 at r5 (raw file):
for _, uid := range uids { require.True(t, strings.HasPrefix(uid, "0x"), "uid: [%v] should be a hex encoded string", uid) // ParseUint throws an error if the string has 0x prefix, so we need to strip the prefix here.
100 chars.
edgraph/server.go, line 521 at r3 (raw file):
Previously, mangalaman93 (Aman Mangal) wrote…
You could directly assign to
numUids. No need fortotalUids
I wouldn't even sum them up. Make a decision per variable. Only include the ones which have less than a mil results.
And do the string conv here. A fast way to convert to hex string.
edgraph/server.go, line 600 at r5 (raw file):
// It returns the latency and a map of variables => [ uids ...] used in the upsert mutation. func doQueryInUpsert(ctx context.Context, req *api.Request, gmu *gql.Mutation) ( *query.Latency, map[string][]string, error) {
map[string]*api.Uids
edgraph/server.go, line 675 at r5 (raw file):
uids := make([]string, len(v.Uids.Uids)) for i, u := range v.Uids.Uids { uids[i] = fmt.Sprintf("%#x", u)
fmt.Sprintf is slow. It acquires some sort of a global lock, don't quote me.
Also, if we have over a Mil results, don't do this conversion and don't return anything.
edgraph/server.go, line 695 at r5 (raw file):
// varToUID is returned to the client, let's delete the dummy var that we put in there for // evaluating the conditional upsert. delete(varToUID, varName)
Rename varName to something else.
| varName := fmt.Sprintf("__dgraph%d__", rand.Int()) | ||
| // conditionalVar is a dummy var that we use to evaluate the result of | ||
| // conditional upsert. | ||
| conditionalVar := fmt.Sprintf("__dgraph%d__", rand.Int()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
G404: Use of weak random number generator (math/rand instead of crypto/rand) (from gosec)
We now return a map of the variables that were part of an upsert mutation along with the list of uids that they map to when debug parameter is set to true.
The PR also refactors
http_test.goto return a struct frommutationWithTs.This change is