
[OPTIMIZATION] Optimize snapshot creation #4901

Open · wants to merge 27 commits into base: master

Conversation

ashish-goswami (Member) commented Mar 9, 2020

Fixes: #4250, DGRAPH-676

Currently, snapshot creation takes a long time, which drives up mutation latency.
This PR optimizes snapshot creation in the following ways:

  • While saving a new snapshot, we delete all Raft entries up to the snapshot index. This requires performing a seek on badger. We used to seek from entry 0, which is not required; we can seek from FirstIndex at that point in time. This avoids walking over many previously deleted entries in the badger iterator and improves seek time significantly.
  • Saving the new snapshot and deleting old entries both used to be synchronous. This PR still saves the new snapshot synchronously, but performs the deletion of entries in the background (a sketch of this pattern follows below).

This PR reduces snapshot creation time from seconds (in the worst case) to milliseconds.
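
For orientation, here is a minimal Go sketch of the second change. The names (indexRange, DiskStorage, deleteRange) echo the review discussion below, but the bodies are illustrative assumptions, not the PR's actual code:

package raftwal

import "log"

// indexRange describes a span of Raft entries queued for background deletion.
type indexRange struct {
	from, until uint64 // Raft indices; entries in [from, until) are deleted.
}

type DiskStorage struct {
	indexRangeChan chan indexRange
}

// deleteRange would seek badger from `from` (not from 0) and batch-delete
// entry keys up to `until`. Stubbed here for illustration.
func (w *DiskStorage) deleteRange(from, until uint64) error { return nil }

// CreateSnapshot persists the new snapshot synchronously, then hands the
// deletion of old entries off to a background goroutine instead of blocking.
func (w *DiskStorage) CreateSnapshot(snapIdx, first uint64) {
	// ... save the snapshot to badger synchronously ...
	w.indexRangeChan <- indexRange{from: first - 1, until: snapIdx}
}

// processIndexRange is started once, as `go w.processIndexRange()`, and
// drains deletion requests in the background.
func (w *DiskStorage) processIndexRange() {
	for r := range w.indexRangeChan {
		if err := w.deleteRange(r.from, r.until); err != nil {
			log.Printf("deleteRange failed, from: %d, until: %d, error: %v", r.from, r.until, err)
		}
	}
}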

Logs from master for snapshot creation time:

I0318 10:30:55.174392      18 storage.go:595] CreateSnapshot i=20739611, cs=nodes:1 nodes:2 nodes:3
I0318 10:30:59.398864      18 draft.go:506] Creating snapshot at index: 20739611. ReadTs: 39763033. time taken: 4.22447658s
I0318 10:31:47.558142      18 log.go:34] Got compaction priority: {level:0 score:1 dropPrefix:[]}
I0318 10:31:47.558183      18 log.go:34] Running for level: 0
I0318 10:31:48.202713      18 log.go:34] LOG Compact 0->1, del 6 tables, add 1 tables, took 644.511199ms
I0318 10:31:48.202740      18 log.go:34] Compaction for level: 0 DONE
I0318 10:31:54.985492      18 draft.go:1404] Num pending txns: 54
I0318 10:31:55.131087      18 draft.go:804] [0x1] Set Raft progress to index: 20768031.
I0318 10:31:55.131145      18 draft.go:1404] Num pending txns: 60
I0318 10:31:55.272493      18 storage.go:595] CreateSnapshot i=20768104, cs=nodes:1 nodes:2 nodes:3
I0318 10:31:58.809421      18 draft.go:506] Creating snapshot at index: 20768104. ReadTs: 39817697. time taken: 3.536931581s
I0318 10:32:54.972385      18 draft.go:1404] Num pending txns: 68
I0318 10:32:55.124161      18 draft.go:804] [0x1] Set Raft progress to index: 20796681.
I0318 10:32:55.124219      18 draft.go:1404] Num pending txns: 56
I0318 10:32:55.322028      18 storage.go:595] CreateSnapshot i=20796770, cs=nodes:1 nodes:2 nodes:3
I0318 10:32:59.558248      18 draft.go:506] Creating snapshot at index: 20796770. ReadTs: 39872679. time taken: 4.236221663s
I0318 10:33:28.558129      18 log.go:34] Got compaction priority: {level:0 score:1 dropPrefix:[]}
I0318 10:33:28.558176      18 log.go:34] Running for level: 0
I0318 10:33:29.230710      18 log.go:34] LOG Compact 0->1, del 6 tables, add 1 tables, took 672.502982ms
I0318 10:33:29.230742      18 log.go:34] Compaction for level: 0 DONE
I0318 10:33:54.986669      18 draft.go:1404] Num pending txns: 64
I0318 10:33:55.125660      18 draft.go:804] [0x1] Set Raft progress to index: 20825162.
I0318 10:33:55.125710      18 draft.go:1404] Num pending txns: 60
I0318 10:33:55.264070      18 storage.go:595] CreateSnapshot i=20825236, cs=nodes:1 nodes:2 nodes:3
I0318 10:33:58.716431      18 draft.go:506] Creating snapshot at index: 20825236. ReadTs: 39927283. time taken: 3.452365171s
I0318 10:34:54.969608      18 draft.go:115] Operation started with id: 1

Logs from this branch for snapshot creation time:

I0318 11:02:42.103363       1 draft.go:425] Creating snapshot at index: 21496840. ReadTs: 41226003. time taken: 297.39µs
I0318 11:03:21.943711       1 log.go:34] Got compaction priority: {level:0 score:1 dropPrefix:[]}
I0318 11:03:21.943763       1 log.go:34] Running for level: 0
I0318 11:03:24.510385       1 log.go:34] LOG Compact 0->1, del 6 tables, add 1 tables, took 2.566600068s
I0318 11:03:24.510417       1 log.go:34] Compaction for level: 0 DONE
I0318 11:03:40.164856       1 draft.go:1283] Num pending txns: 56
I0318 11:03:40.330751       1 draft.go:709] [0x1] Set Raft progress to index: 21526400.
I0318 11:03:40.330806       1 draft.go:1283] Num pending txns: 55
I0318 11:03:40.503145       1 storage.go:645] CreateSnapshot i=21526477, cs=nodes:1 nodes:2 nodes:3
I0318 11:03:40.503317       1 draft.go:425] Creating snapshot at index: 21526477. ReadTs: 41282847. time taken: 174.56µs
W0318 11:04:20.508999       1 draft.go:1025] Raft.Ready took too long to process: Timer Total: 221ms. Breakdown: [{disk 220ms} {proposals 0s} {advance 0s}] Num entries: 0. MustSync: false
I0318 11:04:40.164813       1 draft.go:1283] Num pending txns: 66
I0318 11:04:40.429105       1 draft.go:709] [0x1] Set Raft progress to index: 21556865.
I0318 11:04:40.429157       1 draft.go:1283] Num pending txns: 23
I0318 11:04:40.639351       1 storage.go:645] CreateSnapshot i=21557009, cs=nodes:1 nodes:2 nodes:3
I0318 11:04:40.639667       1 draft.go:425] Creating snapshot at index: 21557009. ReadTs: 41341402. time taken: 317.73µs
W0318 11:04:40.952292       1 draft.go:1025] Raft.Ready took too long to process: Timer Total: 248ms. Breakdown: [{disk 243ms} {sync 5ms} {proposals 0s} {advance 0s}] Num entries: 30. MustSync: true
I0318 11:04:50.943669       1 log.go:34] Got compaction priority: {level:0 score:1 dropPrefix:[]}
I0318 11:04:50.943708       1 log.go:34] Running for level: 0
W0318 11:04:53.236307       1 draft.go:1025] Raft.Ready took too long to process: Timer Total: 443ms. Breakdown: [{sync 443ms} {disk 0s} {proposals 0s} {advance 0s}] Num entries: 35. MustSync: true
I0318 11:04:53.299058       1 log.go:34] LOG Compact 0->1, del 6 tables, add 1 tables, took 2.355332869s
I0318 11:04:53.299089       1 log.go:34] Compaction for level: 0 DONE
I0318 11:05:40.164832       1 draft.go:1283] Num pending txns: 39
I0318 11:05:40.340678       1 draft.go:709] [0x1] Set Raft progress to index: 21587019.
I0318 11:05:40.340732       1 draft.go:1283] Num pending txns: 34
I0318 11:05:40.511474       1 storage.go:645] CreateSnapshot i=21587137, cs=nodes:1 nodes:2 nodes:3
I0318 11:05:40.511781       1 draft.go:425] Creating snapshot at index: 21587137. ReadTs: 41399174. time taken: 306.48µs


Also pass the from index into the deleteUntil function,
which used to delete from 0 to the until index. Now
it deletes from the from index to the until index.
ashish-goswami marked this pull request as ready for review Mar 10, 2020
ashish-goswami requested review from manishrjain and dgraph-io/team as code owners Mar 10, 2020
ashish-goswami requested review from martinmr and pawanrawal Mar 10, 2020
martinmr (Member) left a comment

Reviewed 1 of 3 files at r1, 2 of 2 files at r3.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @ashish-goswami, @golangcibot, @manishrjain, and @pawanrawal)


raftwal/storage.go, line 90 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

printf: Errorf format %s has arg r.until of wrong type uint64 (from govet)

Same thing regarding the use of Wrapf instead of Errorf.
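
For illustration, a sketch of the fix both comments are asking for. It assumes github.com/pkg/errors for wrapping, and doDelete is a hypothetical helper standing in for the actual deletion logic:

func (w *DiskStorage) deleteUntil(until uint64) error {
	if err := w.doDelete(until); err != nil {
		// %d matches the uint64 argument (govet's complaint), and errors.Wrapf
		// preserves the underlying error instead of flattening it into a string.
		return errors.Wrapf(err, "while deleting entries until index: %d", until)
	}
	return nil
}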


raftwal/storage.go, line 80 at r3 (raw file):

func (w *DiskStorage) processDeleteRange() {
	// TODO: Figure out a way to close this.

For future reference, can you be more specific about what needs to be closed?

I am assuming you mean closing the channel, but I am not sure.


raftwal/storage.go, line 85 at r3 (raw file):

"deleteuntil

deleteUntil


raftwal/storage.go, line 85 at r3 (raw file):

glog.Errorf(

Don't print the error message here. Use Wrapf instead to include the previous error.


raftwal/storage.go, line 273 at r3 (raw file):

if snap, err := w.Snapshot(); err == nil && !raft.IsEmptySnap(snap) {

Why is the cache not being used anymore? If this is intentional, make sure to mention it in the PR description.


raftwal/storage.go, line 659 at r3 (raw file):

	}

	// TODO: should we block here?

Also for future reference, maybe add a short description of why you are concerned the code might have to block here.


raftwal/storage.go, line 660 at r3 (raw file):

	// TODO: should we block here?
	w.deleteChan <- deleteRange{first - 1, snap.Metadata.Index}

Why first - 1 instead of just first? Maybe add a comment explaining this, because it's not obvious.
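
One plausible reading, as an assumption based on etcd/raft storage conventions rather than anything confirmed in this thread: the slot at FirstIndex()-1 is a dummy entry holding the prior snapshot's index and term, so the deletion range has to start one below FirstIndex to reclaim that slot too. Roughly:

// Assumption (etcd/raft convention): the entry at FirstIndex()-1 is a dummy
// carrying the previous snapshot's metadata, so deletion starts there.
first, err := w.FirstIndex()
if err == nil {
	w.deleteChan <- deleteRange{first - 1, snap.Metadata.Index}
}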

manishrjain (Member) left a comment

Let's clean it up, so we can merge this.

Reviewed 1 of 3 files at r1, 1 of 2 files at r3, 1 of 1 files at r4.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @ashish-goswami, @golangcibot, and @pawanrawal)


raftwal/storage.go, line 81 at r4 (raw file):

	batch := w.db.NewWriteBatch()
	defer batch.Cancel()
	x.Check(w.deleteFrom(batch, first-2, true))

Use deleteUntil.

And if you're modifying the meaning of deleteUntil, then call it deleteRange.


raftwal/storage.go, line 89 at r4 (raw file):

func (w *DiskStorage) processDeleteRange() {
	// TODO: Figure out a way to close this.
	go func() {

Always start the goroutine from the caller. Don't wrap the entire function in a goroutine from within.

Also, you need a way to shut this down.
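
A sketch of the requested structure, assuming badger's y.Closer (which appears elsewhere in this PR's diff) as the shutdown primitive:

// The caller starts the goroutine; the function body stays synchronous.
func (w *DiskStorage) Start() {
	w.Closer = y.NewCloser(1)
	go w.processIndexRange()
}

// processIndexRange drains deletion requests until shutdown is signalled.
func (w *DiskStorage) processIndexRange() {
	defer w.Closer.Done()
	for {
		select {
		case r := <-w.indexRangeChan:
			if err := w.deleteRange(r.from, r.until); err != nil {
				glog.Errorf("deleteRange failed: %v", err)
			}
		case <-w.Closer.HasBeenClosed():
			return
		}
	}
}

On shutdown, w.Closer.SignalAndWait() blocks until the goroutine exits.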

ashish-goswami (Member, Author) left a comment

Reviewable status: 1 of 2 files reviewed, 10 unresolved discussions (waiting on @ashish-goswami, @golangcibot, @manishrjain, @martinmr, and @pawanrawal)


raftwal/storage.go, line 80 at r3 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

For future reference, can you be more specific about what needs to be closed?

I am assuming you mean closing the channel, but I am not sure.

Done.


raftwal/storage.go, line 85 at r3 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…
"deleteuntil

deleteUntil

Done.


raftwal/storage.go, line 273 at r3 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…
if snap, err := w.Snapshot(); err == nil && !raft.IsEmptySnap(snap) {

Why is the cache not being used anymore? If this is intentional, make sure to mention it in the PR description.

Done.


raftwal/storage.go, line 660 at r3 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

Why first - 1 instead of just first? Maybe add a comment explaining this, because it's not obvious.

Done.


raftwal/storage.go, line 81 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Use deleteUntil.

And if you're modifying the meaning of deleteUntil, then call it deleteRange.

Done.


raftwal/storage.go, line 89 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Always start the goroutine from the caller. Don't wrap the entire function in a goroutine from within.

Also, you need a way to shut this down.

Done.


worker/draft.go, line 427 at r1 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

line is 114 characters (from lll)

Fixed.

@@ -52,6 +52,7 @@ type groupi struct {
triggerCh chan struct{} // Used to trigger membership sync
blockDeletes *sync.Mutex // Ensure that deletion won't happen when move is going on.
closer *y.Closer
storeCloser *y.Closer // Closer for raftwal storage.


golangcibot commented Mar 18, 2020

field storeCloser is unused (from unused)

pawanrawal (Member) left a comment

:lgtm:

Reviewable status: 0 of 7 files reviewed, 13 unresolved discussions (waiting on @ashish-goswami, @golangcibot, @manishrjain, @martinmr, and @pawanrawal)


raftwal/storage.go, line 50 at r6 (raw file):

type indexRange struct {
	from, until uint64

Just add a comment that this is a Raft index.


raftwal/storage.go, line 290 at r6 (raw file):

// into the latest Snapshot).
func (w *DiskStorage) FirstIndex() (uint64, error) {
	// Previously we used to get snapshot key from cache. If it is not found in cache, we used to

Just add a comment explaining the current behavior: the cache is only set when CreateSnapshot is called. Also explain how deleting entries in the background affects this.
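
A hedged sketch of what such a comment might say in code form. The snapshot-first fast path follows the r3 snippet quoted above; seekFirstEntryIndex is a hypothetical fallback helper:

func (w *DiskStorage) FirstIndex() (uint64, error) {
	// The snapshot is the authoritative lower bound: entries at or below its
	// index are deleted (possibly still in progress, since deletion now runs
	// in the background), so a raw badger seek could see stale keys.
	if snap, err := w.Snapshot(); err == nil && !raft.IsEmptySnap(snap) {
		return snap.Metadata.Index + 1, nil
	}
	// No snapshot yet: fall back to seeking the first entry key in badger.
	return w.seekFirstEntryIndex() // hypothetical helper
}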

manishrjain (Member) left a comment

:lgtm: Nice work.

Reviewed 6 of 8 files at r5, 1 of 1 files at r6, 1 of 1 files at r7.
Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @ashish-goswami, @golangcibot, @manishrjain, and @martinmr)


dgraph/cmd/debug/wal.go, line 271 at r7 (raw file):

			case len(opt.wsetSnapshot) > 0:
				err := overwriteSnapshot(db, store)
				store.Closer.SignalAndWait()

Is there another goroutine here? I'm not sure. So, do run the tool and see if it gets stuck.


raftwal/storage.go, line 94 at r7 (raw file):

	defer w.Closer.Done()

	slurp := func(r indexRange) {

deleteRange


raftwal/storage.go, line 116 at r7 (raw file):

	// As we have already shutdown the node, it is safe to close indexRangeChan.
	// node.processApplyChan() calls CreteSnapshot, which internally sends values on this chan.

CreateSnapshot

ashish-goswami (Member, Author) left a comment

Reviewable status: 6 of 7 files reviewed, 16 unresolved discussions (waiting on @ashish-goswami, @golangcibot, @manishrjain, @martinmr, and @pawanrawal)


raftwal/storage.go, line 659 at r3 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

Also for future reference, maybe add a short description of why you are concerned the code might have to block here.

Done.


raftwal/storage.go, line 50 at r6 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

Just add a comment that this is a Raft index.

Done.


raftwal/storage.go, line 290 at r6 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

Just add a comment explaining the current behavior: the cache is only set when CreateSnapshot is called. Also explain how deleting entries in the background affects this.

Done.


raftwal/storage.go, line 94 at r7 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

deleteRange

Renamed it to processSingleRange.


raftwal/storage.go, line 116 at r7 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

CreateSnapshot

Done.


worker/groups.go, line 55 at r5 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

field storeCloser is unused (from unused)

Fixed.

ashish-goswami (Member, Author) left a comment

Reviewable status: 6 of 7 files reviewed, 16 unresolved discussions (waiting on @ashish-goswami, @golangcibot, @manishrjain, @martinmr, and @pawanrawal)


dgraph/cmd/debug/wal.go, line 271 at r7 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Is there another goroutine here? I'm not sure. So, do run the tool and see if it gets stuck.

Ran the debug tool on the w/zw directories of alpha/zero nodes after inserting 1 million records using the live loader. It didn't get stuck.

mangalaman93 (Contributor) left a comment

Reviewable status: 5 of 7 files reviewed, 17 unresolved discussions (waiting on @ashish-goswami, @golangcibot, @manishrjain, @martinmr, and @pawanrawal)


raftwal/storage.go, line 97 at r8 (raw file):

		batch := w.db.NewWriteBatch()
		if err := w.deleteRange(batch, r.from, r.until); err != nil {
			glog.Errorf("deleteRange failed with error: %s, from: %d, until: %d\n",

Why are we using %s for printing the error?
