Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(countindex): Optimize count index #4666

Merged
merged 15 commits into from
Jul 10, 2020
Merged

Conversation

ashish-goswami
Copy link
Contributor

@ashish-goswami ashish-goswami commented Jan 24, 2020

Fixes: DGRAPH-890

Currently count index is calculated in the following way:

  • Calculate length of posting list for which mutation is getting performed.
  • Perform the actual mutation.
  • Calculate the length of posting list after mutation is performed.
  • Add count index for new count and delete count index for old count.

Finding length of posting list:
Length of posting list is calculated by merging mutable and immutable layer. Immutable layer is already sorted. We need to sort mutable layer for merging. This sorting turns out to be a costly function and shows up in CPU profile. Since we are calculating length twice, sorting is happening twice.

Optimization:
This PR avoids calculating length twice. It uses length calculated before and edge(for mutation) information, to calculate length after mutation.

Live loader performance comparison:
Master:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 14m13.252187311s
N-Quads processed per second : 24900

This PR:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 12m52.835002981s
N-Quads processed per second : 27512


This change is Reviewable

Docs Preview: Dgraph Preview

Copy link
Contributor

@pawanrawal pawanrawal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r1.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @animesh2049, @ashish-goswami, @manishrjain, and @martinmr)


posting/index.go, line 162 at r1 (raw file):

	}
	if hasCountIndex {
		if (found && edge.Op == pb.DirectedEdge_SET) || (!found && edge.Op == pb.DirectedEdge_DEL) {

Have a function for this as this is being reused across two functions.

updatedCount(countBefore, found, op) int

posting/index.go, line 168 at r1 (raw file):

		} else if found && edge.Op == pb.DirectedEdge_DEL {
			countAfter = countBefore - 1
		} else {

this else can be removed as we have covered the 4 cases possible.


posting/index.go, line 346 at r1 (raw file):

	countBefore, countAfter := 0, 0
	var currPost *pb.Posting
	var currVal types.Val

Let it remain as val.


posting/index.go, line 383 at r1 (raw file):

	}

	if found {
if doUpdateIndex && found {
}

posting/list.go, line 668 at r1 (raw file):

}

func (l *List) getPostingAndLength(readTs, afterUid, uid uint64) (int, bool, *pb.Posting) {

This can only return posting and length. Doesn't need to return found.

Copy link
Contributor Author

@ashish-goswami ashish-goswami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 5 unresolved discussions (waiting on @animesh2049, @manishrjain, @martinmr, and @pawanrawal)


posting/index.go, line 162 at r1 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

Have a function for this as this is being reused across two functions.

updatedCount(countBefore, found, op) int

Done.


posting/index.go, line 168 at r1 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

this else can be removed as we have covered the 4 cases possible.

Done.


posting/index.go, line 346 at r1 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

Let it remain as val.

Done.


posting/index.go, line 383 at r1 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…
if doUpdateIndex && found {
}

Done.


posting/list.go, line 668 at r1 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

This can only return posting and length. Doesn't need to return found.

I have made it like this in sync with findPosting and findValue.

Copy link
Contributor

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r2.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @animesh2049, @ashish-goswami, @manishrjain, and @pawanrawal)


posting/index.go, line 316 at r2 (raw file):

	}

	// (found && op == pb.DirectedEdge_SET) || (!found && op == pb.DirectedEdge_DEL)

this is commented out.


posting/index.go, line 333 at r2 (raw file):

	}

	getUID := func(t *pb.DirectedEdge) uint64 {

I think there's already a function doing this. Please check if you can re-use it here.


posting/index.go, line 340 at r2 (raw file):

	}

	delNonListPredicateCond := !schema.State().IsList(t.Attr) &&

something like delNonListPred should be enough.
Also move it closer to the usage (right above the switch block).


posting/index.go, line 343 at r2 (raw file):

// Here we want to call function, which is super set of all. 

Can you rephrase this? I am not sure what it means. In particular which function are we calling?

Copy link
Contributor Author

@ashish-goswami ashish-goswami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 9 unresolved discussions (waiting on @animesh2049, @manishrjain, @martinmr, and @pawanrawal)


posting/index.go, line 316 at r2 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

this is commented out.

Its a comment. Made is more clear now.


posting/index.go, line 333 at r2 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

I think there's already a function doing this. Please check if you can re-use it here.

Hey @martinmr, could not find it. Can you point it to me?


posting/index.go, line 340 at r2 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

something like delNonListPred should be enough.
Also move it closer to the usage (right above the switch block).

Done.


posting/index.go, line 343 at r2 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…
// Here we want to call function, which is super set of all. 

Can you rephrase this? I am not sure what it means. In particular which function are we calling?

Done.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Def run Jepsen bank test 10 times, 10 minutes each test, bank, no nemesis.

Reviewed 2 of 3 files at r1, 1 of 1 files at r3.
Reviewable status: all files reviewed, 14 unresolved discussions (waiting on @animesh2049, @ashish-goswami, @martinmr, and @pawanrawal)


posting/index.go, line 333 at r2 (raw file):

Previously, ashish-goswami (Ashish Goswami) wrote…

Hey @martinmr, could not find it. Can you point it to me?

Yeah, we should have this one.


posting/index.go, line 309 at r3 (raw file):

}

func countAfterMutation(countBefore int, found bool, op pb.DirectedEdge_Op) int {

count += countAfterMutation(op, found)

this func returns 1 or -1 or 0.


posting/index.go, line 355 at r3 (raw file):

	switch {
	case hasCountIndex:
		countBefore, found, currPost = l.getPostingAndLength(txn.StartTs, 0, getUID(t))

post, count :=

l.findPostingAndLength


posting/index.go, line 360 at r3 (raw file):

		}
	case doUpdateIndex || delNonListPredicate:
		found, currPost, err = l.findPosting(txn.StartTs, fingerprintEdge(t))

Make it return the post and error. If post is nil, then not found.


posting/list.go, line 668 at r3 (raw file):

}

func (l *List) getPostingAndLength(readTs, afterUid, uid uint64) (int, bool, *pb.Posting) {

lengthAndPosting

findPostingAndLength


posting/list.go, line 676 at r3 (raw file):

		if p.Uid == uid {
			post = p
			found = true

don't need found.

Copy link
Contributor Author

@ashish-goswami ashish-goswami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 3 files reviewed, 14 unresolved discussions (waiting on @animesh2049, @manishrjain, @martinmr, and @pawanrawal)


posting/index.go, line 309 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

count += countAfterMutation(op, found)

this func returns 1 or -1 or 0.

Done.


posting/index.go, line 355 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

post, count :=

l.findPostingAndLength

Done.


posting/index.go, line 360 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Make it return the post and error. If post is nil, then not found.

Done.


posting/list.go, line 668 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

lengthAndPosting

findPostingAndLength

Done.


posting/list.go, line 676 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

don't need found.

Done.

@ashish-goswami ashish-goswami changed the title [OPTIMIZATION] Optimize count index optimization(count index) Optimize count index Jul 10, 2020
@ashish-goswami ashish-goswami changed the title optimization(count index) Optimize count index perf(count index) Optimize count index Jul 10, 2020
@ashish-goswami ashish-goswami changed the title perf(count index) Optimize count index perf(countindex) Optimize count index Jul 10, 2020
@ashish-goswami ashish-goswami changed the title perf(countindex) Optimize count index perf(countindex): Optimize count index Jul 10, 2020
@ashish-goswami ashish-goswami merged commit 1167250 into master Jul 10, 2020
@ashish-goswami ashish-goswami deleted the ashish/ci-optimization branch July 10, 2020 17:47
ashish-goswami added a commit that referenced this pull request Jul 10, 2020
Fixes: DGRAPH-890

Currently count index is calculated in the following way:

Calculate length of posting list for which mutation is getting performed.
Perform the actual mutation.
Calculate the length of posting list after mutation is performed.
Add count index for new count and delete count index for old count.
Finding length of posting list:
Length of posting list is calculated by merging mutable and immutable layer. Immutable layer is already sorted. We need to sort mutable layer for merging. This sorting turns out to be a costly function and shows up in CPU profile. Since we are calculating length twice, sorting is happening twice.

Optimization:
This PR avoids calculating length twice. It uses length calculated before and edge(for mutation) information, to calculate length after mutation.

Live loader performance comparison:
Master:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 14m13.252187311s
N-Quads processed per second : 24900
This PR:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 12m52.835002981s
N-Quads processed per second : 27512

(cherry picked from commit 1167250)
ashish-goswami added a commit that referenced this pull request Jul 10, 2020
Fixes: DGRAPH-890

Currently count index is calculated in the following way:

Calculate length of posting list for which mutation is getting performed.
Perform the actual mutation.
Calculate the length of posting list after mutation is performed.
Add count index for new count and delete count index for old count.
Finding length of posting list:
Length of posting list is calculated by merging mutable and immutable layer. Immutable layer is already sorted. We need to sort mutable layer for merging. This sorting turns out to be a costly function and shows up in CPU profile. Since we are calculating length twice, sorting is happening twice.

Optimization:
This PR avoids calculating length twice. It uses length calculated before and edge(for mutation) information, to calculate length after mutation.

Live loader performance comparison:
Master:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 14m13.252187311s
N-Quads processed per second : 24900
This PR:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 12m52.835002981s
N-Quads processed per second : 27512

(cherry picked from commit 1167250)
ashish-goswami added a commit that referenced this pull request Jul 13, 2020
Fixes: DGRAPH-890

Currently count index is calculated in the following way:

Calculate length of posting list for which mutation is getting performed.
Perform the actual mutation.
Calculate the length of posting list after mutation is performed.
Add count index for new count and delete count index for old count.
Finding length of posting list:
Length of posting list is calculated by merging mutable and immutable layer. Immutable layer is already sorted. We need to sort mutable layer for merging. This sorting turns out to be a costly function and shows up in CPU profile. Since we are calculating length twice, sorting is happening twice.

Optimization:
This PR avoids calculating length twice. It uses length calculated before and edge(for mutation) information, to calculate length after mutation.

Live loader performance comparison:
Master:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 14m13.252187311s
N-Quads processed per second : 24900
This PR:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 12m52.835002981s
N-Quads processed per second : 27512

(cherry picked from commit 1167250)
parasssh pushed a commit that referenced this pull request Jul 13, 2020
Fixes: DGRAPH-890

Currently count index is calculated in the following way:

Calculate length of posting list for which mutation is getting performed.
Perform the actual mutation.
Calculate the length of posting list after mutation is performed.
Add count index for new count and delete count index for old count.
Finding length of posting list:
Length of posting list is calculated by merging mutable and immutable layer. Immutable layer is already sorted. We need to sort mutable layer for merging. This sorting turns out to be a costly function and shows up in CPU profile. Since we are calculating length twice, sorting is happening twice.

Optimization:
This PR avoids calculating length twice. It uses length calculated before and edge(for mutation) information, to calculate length after mutation.

Live loader performance comparison:
Master:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 14m13.252187311s
N-Quads processed per second : 24900
This PR:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 12m52.835002981s
N-Quads processed per second : 27512

(cherry picked from commit 1167250)
arijitAD pushed a commit that referenced this pull request Jul 14, 2020
Fixes: DGRAPH-890

Currently count index is calculated in the following way:

Calculate length of posting list for which mutation is getting performed.
Perform the actual mutation.
Calculate the length of posting list after mutation is performed.
Add count index for new count and delete count index for old count.
Finding length of posting list:
Length of posting list is calculated by merging mutable and immutable layer. Immutable layer is already sorted. We need to sort mutable layer for merging. This sorting turns out to be a costly function and shows up in CPU profile. Since we are calculating length twice, sorting is happening twice.

Optimization:
This PR avoids calculating length twice. It uses length calculated before and edge(for mutation) information, to calculate length after mutation.

Live loader performance comparison:
Master:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 14m13.252187311s
N-Quads processed per second : 24900
This PR:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 12m52.835002981s
N-Quads processed per second : 27512
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 18, 2020
Fixes: DGRAPH-890

Currently count index is calculated in the following way:

Calculate length of posting list for which mutation is getting performed.
Perform the actual mutation.
Calculate the length of posting list after mutation is performed.
Add count index for new count and delete count index for old count.
Finding length of posting list:
Length of posting list is calculated by merging mutable and immutable layer. Immutable layer is already sorted. We need to sort mutable layer for merging. This sorting turns out to be a costly function and shows up in CPU profile. Since we are calculating length twice, sorting is happening twice.

Optimization:
This PR avoids calculating length twice. It uses length calculated before and edge(for mutation) information, to calculate length after mutation.

Live loader performance comparison:
Master:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 14m13.252187311s
N-Quads processed per second : 24900
This PR:

Number of TXs run            : 21240                                                                
Number of N-Quads processed  : 21239870
Time spent                   : 12m52.835002981s
N-Quads processed per second : 27512
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants