Reformat journal index #7780

max-hoffman · 2024-04-25T02:54:57Z

Change the way we write journal index lookups. Each write now appends a lookup to a bufio.Writer, which writes to disk lazily. After some increment we flush a CRC/root value record, which bootstrap uses to check the index for consistency. This avoids big stalls from flushing a whole batch of index records at once. We also write only an addr16 now, because that's what we load into the default chunk address map.
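For readers who have not looked at the diff, here is a minimal sketch of the write path described above. It is not the code in this PR: the tag values, record layout, and the names `indexWriter`, `writeLookup`, `maybeWriteMeta`, and `demo` are assumptions made up for illustration. It only shows the shape of the idea: lookups go through a bufio.Writer with a running CRC, and a metadata record plus flush happens every few lookups.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// Hypothetical record layout, for illustration only.
const (
	tagLookup    byte = 1
	tagMeta      byte = 2
	addrLen           = 16                  // addr16 prefix of a chunk address
	rootLen           = 20                  // assumed root hash size
	lookupRecLen      = 1 + addrLen + 8 + 4 // tag, addr16, offset, length
	metaRecLen        = 1 + 4 + rootLen     // tag, batch CRC, root
)

type indexWriter struct {
	w        *bufio.Writer
	batchCrc uint32 // running CRC of lookups since the last metadata record
	pending  int    // lookups written since the last metadata record
	every    int    // write a metadata record after this many lookups
}

// writeLookup appends one lookup to the buffered writer. Nothing is
// guaranteed to reach the underlying file until a flush.
func (iw *indexWriter) writeLookup(addr [addrLen]byte, offset uint64, length uint32) error {
	var rec [lookupRecLen]byte
	rec[0] = tagLookup
	copy(rec[1:], addr[:])
	binary.BigEndian.PutUint64(rec[1+addrLen:], offset)
	binary.BigEndian.PutUint32(rec[1+addrLen+8:], length)
	iw.batchCrc = crc32.Update(iw.batchCrc, crc32.IEEETable, rec[:])
	iw.pending++
	_, err := iw.w.Write(rec[:])
	return err
}

// maybeWriteMeta writes a CRC/root record and flushes once enough lookups
// have accumulated. It reports whether a record was written.
func (iw *indexWriter) maybeWriteMeta(root [rootLen]byte) (bool, error) {
	if iw.pending < iw.every {
		return false, nil
	}
	var rec [metaRecLen]byte
	rec[0] = tagMeta
	binary.BigEndian.PutUint32(rec[1:], iw.batchCrc)
	copy(rec[5:], root[:])
	if _, err := iw.w.Write(rec[:]); err != nil {
		return false, err
	}
	iw.batchCrc, iw.pending = 0, 0
	return true, iw.w.Flush()
}

// demo writes n lookups and returns how many bytes reached the underlying
// buffer (standing in for the file) and how many metadata records were written.
func demo(n, every int) (int, int, error) {
	var disk bytes.Buffer
	iw := &indexWriter{w: bufio.NewWriter(&disk), every: every}
	metas := 0
	for i := 0; i < n; i++ {
		var addr [addrLen]byte
		addr[0] = byte(i)
		if err := iw.writeLookup(addr, uint64(i)*100, 100); err != nil {
			return 0, 0, err
		}
		wrote, err := iw.maybeWriteMeta([rootLen]byte{})
		if err != nil {
			return 0, 0, err
		}
		if wrote {
			metas++
		}
	}
	return disk.Len(), metas, nil
}

func main() {
	flushed, metas, err := demo(5, 2)
	if err != nil {
		panic(err)
	}
	fmt.Printf("flushed %d bytes, %d metadata records\n", flushed, metas)
}
```

Note that in `demo(5, 2)` the fifth lookup is still sitting in the buffer when the function returns. That is the kind of tail a bootstrap has to tolerate: lookups that were written after the last CRC/root record and may or may not have reached disk.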

Databases with the older format will pay a one-time startup penalty to rewrite the journal index. In testing this appears to be 5-10% of the import time for the database.

coffeegoddd · 2024-04-25T03:25:52Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`a1b2b7c`	ok	5937457

version	total_tests
`a1b2b7c`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-04-25T20:15:30Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 99.510960

version	result	total
`7a0804b`	did not run	7423
`7a0804b`	not ok	21613
`7a0804b`	ok	5908511
`7a0804b`	timeout	1

version	total_tests
`7a0804b`	5937548

correctness_percentage
99.51096

coffeegoddd · 2024-04-26T18:46:36Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 99.378724

version	result	total
`8e519a7`	did not run	996
`8e519a7`	not ok	35891
`8e519a7`	ok	5900570
`8e519a7`	timeout	1

version	total_tests
`8e519a7`	5937458

correctness_percentage
99.378724

coffeegoddd · 2024-04-27T00:08:25Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 99.387347

version	result	total
`7bf0ce8`	did not run	17061
`7bf0ce8`	not ok	19314
`7bf0ce8`	ok	5901082
`7bf0ce8`	timeout	1

version	total_tests
`7bf0ce8`	5937458

correctness_percentage
99.387347

max-hoffman · 2024-04-29T16:44:02Z

#benchmark

github-actions · 2024-04-29T16:44:24Z

@max-hoffman workflow run: https://github.com/dolthub/dolt/actions/runs/8882267970

coffeegoddd · 2024-04-29T17:19:50Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 99.686701

version	result	total
`a008969`	did not run	5107
`a008969`	not ok	13493
`a008969`	ok	5918856
`a008969`	timeout	2

version	total_tests
`a008969`	5937458

correctness_percentage
99.686701

coffeegoddd · 2024-05-03T19:41:33Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`ee36b06`	ok	5937457

version	total_tests
`ee36b06`	5937457

correctness_percentage
100.0

max-hoffman · 2024-05-03T20:25:18Z

#benchmark

coffeegoddd · 2024-05-03T20:32:55Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`290471d`	ok	5937457

version	total_tests
`290471d`	5937457

correctness_percentage
100.0

github-actions · 2024-05-03T20:33:00Z

@max-hoffman workflow run: https://github.com/dolthub/dolt/actions/runs/8944507673

coffeegoddd · 2024-05-03T21:08:33Z

@max-hoffman DOLT

test_name	from_latency_median	to_latency_median	is_faster
tpcc-scale-factor-1	97.55	90.78	0

test_name	from_server_name	from_server_version	from_tps	test_name	to_server_name	to_server_version	to_tps	is_faster
tpcc-scale-factor-1	dolt	`34c3613`	22.27	tpcc-scale-factor-1	dolt	`290471d`	24.97	0

coffeegoddd · 2024-05-03T21:54:23Z

@max-hoffman DOLT

read_tests	from_latency_median	to_latency_median	is_faster
covering_index_scan	3.07	3.02	0
groupby_scan	17.63	17.63	0
index_join	5.18	5.18	0
index_join_scan	2.22	2.26	0
index_scan	52.89	53.85	0
oltp_point_select	0.51	0.51	0
oltp_read_only	8.43	8.58	0
select_random_points	0.8	0.81	0
select_random_ranges	0.97	0.99	0
table_scan	53.85	54.83	0
types_table_scan	134.9	161.51	-1

write_tests	from_latency_median	to_latency_median
oltp_delete_insert	6.79	6.79
oltp_insert	3.36	3.36
oltp_read_write	16.12	16.12
oltp_update_index	3.49	3.49
oltp_update_non_index	3.43	3.43
oltp_write_only	7.84	7.84
types_delete_insert	7.56	7.56

reltuk

Generally seems fine. A few comments about recovery and the bootstrap process.

As discussed offline, not sure if this will impact existing databases with large journals: the loss of their index will cause a high one-time startup cost on upgrade.

reltuk · 2024-05-06T21:19:25Z

go/store/nbs/journal_index_record.go

+		recTag, err := rd.ReadByte()
+		if err != nil {
+			if errors.Is(err, io.EOF) {
+				return nil


We need to return to the caller the number of bytes read up to the last successful indexRecMeta callback, so that they can file.Seek() and file.Truncate() the output file to that point. Then we can start writing new records without causing a CRC failure in a later bootstrap.

Same for the points where we return nil from ErrUnexpectedEOF down below.

max-hoffman · 2024-05-08

A bit of context that I was missing: the way this used to work was that if anything went wrong we deleted the index and exited the process. The next startup loads all chunks into novel, and the next batch flush writes all chunks into a fresh index. Startup/clear/exit is the repeatable retry loop if anything goes wrong. Shitty for the next index flush, but it works.

The rewrite has different semantics, as you pointed out. Now we are concerned with hanging index lookups after the last metadata record. I added a seek/truncate, which clears the hanging lookups. We then have to take an extra step and add the hanging lookups back to the index for consistency. So I basically ignore most errors here (missing, malformed, io.EOF): we just truncate and rebuild. I think that should be equally repeatable, as long as there isn't a pathological loop where the index can't get a foothold for some reason.

One thing that is maybe a bit annoying in both versions is that we could rewrite the entire index, the server could quit before writing a root value, and then the next startup has to do it all over again: reread all of the lookups, delete the index because there is no batch metadata, and then rebuild the same index again. A 45 minute startup becoming like 3 hours would be annoying.
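To make the seek/truncate recovery discussed in this thread concrete, here is a rough sketch under stated assumptions. The record layout, tag values, and the names `validPrefix`, `truncateToValid`, and `sampleIndex` are hypothetical and are not taken from journal_index_record.go. The idea it illustrates: scan records, remember the offset just past the last metadata record whose CRC matched, treat EOF, a torn record, a CRC mismatch, or an unknown tag all the same way, and truncate the file back to that offset before appending again.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"fmt"
	"hash/crc32"
	"io"
	"os"
)

// Hypothetical record layout, for illustration only.
const (
	tagLookup    byte = 1
	tagMeta      byte = 2
	lookupRecLen      = 1 + 16 + 8 + 4 // tag, addr16, offset, length
	metaRecLen        = 1 + 4          // tag, CRC of the preceding lookups
)

// validPrefix scans records and returns the offset just past the last
// metadata record whose CRC matched. Any failure simply ends the scan.
func validPrefix(r io.Reader) int64 {
	rd := bufio.NewReader(r)
	var read, safe int64
	var crc uint32
	for {
		tag, err := rd.ReadByte()
		if err != nil {
			return safe
		}
		switch tag {
		case tagLookup:
			rec := [lookupRecLen]byte{tagLookup}
			if _, err := io.ReadFull(rd, rec[1:]); err != nil {
				return safe
			}
			crc = crc32.Update(crc, crc32.IEEETable, rec[:])
			read += lookupRecLen
		case tagMeta:
			var want [metaRecLen - 1]byte
			if _, err := io.ReadFull(rd, want[:]); err != nil || binary.BigEndian.Uint32(want[:]) != crc {
				return safe
			}
			read += metaRecLen
			safe, crc = read, 0
		default:
			return safe
		}
	}
}

// truncateToValid drops everything after the last valid metadata record
// and leaves the file offset there, ready for new appends.
func truncateToValid(f *os.File) (int64, error) {
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return 0, err
	}
	safe := validPrefix(f)
	if err := f.Truncate(safe); err != nil {
		return 0, err
	}
	_, err := f.Seek(safe, io.SeekStart)
	return safe, err
}

// sampleIndex builds two lookups, a matching metadata record, one hanging
// lookup, and a torn three-byte record.
func sampleIndex() []byte {
	l0 := make([]byte, lookupRecLen)
	l0[0] = tagLookup
	crc := crc32.Update(0, crc32.IEEETable, l0)
	crc = crc32.Update(crc, crc32.IEEETable, l0)
	meta := []byte{tagMeta, 0, 0, 0, 0}
	binary.BigEndian.PutUint32(meta[1:], crc)
	b := append(append(append([]byte{}, l0...), l0...), meta...)
	b = append(b, l0...)                    // hanging lookup, no metadata after it
	return append(b, tagLookup, 0xff, 0xff) // torn record
}

// demoValidPrefix returns the sample's total size and its valid prefix.
func demoValidPrefix() (int, int64) {
	data := sampleIndex()
	return len(data), validPrefix(bytes.NewReader(data))
}

func main() {
	total, safe := demoValidPrefix()
	fmt.Printf("index is %d bytes, valid prefix is %d bytes\n", total, safe)
}
```

In a real bootstrap, something like `truncateToValid` would run on the open index file, and the lookups that were cut off would then be re-added from the journal, which is the extra step mentioned above. The sketch does not attempt that part.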

reltuk · 2024-05-06T21:21:07Z

go/store/nbs/journal_index_record.go

+			batch = nil
+			batchCrc = 0
+		default:
+			return fmt.Errorf("expected record to start with a chunk or metadata type tag")


In some ways different, in some ways...maybe not so different? Why is this an error when ErrUnexpectedEOF is not? If there could be garbage beyond where we read...

reltuk · 2024-05-06T21:24:22Z

go/store/nbs/journal_writer.go

-	if _, err = wr.index.Write(buf); err != nil {
-		return err
-	}
+	writeJournalIndexMeta(wr.indexWriter, root, wr.indexed, end, wr.batchCrc)


Update the comment here to be more accurate.

coffeegoddd · 2024-05-08T01:00:35Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`0cc415b`	ok	5937457

version	total_tests
`0cc415b`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-05-08T15:48:29Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`9042954`	ok	5937457

version	total_tests
`9042954`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-05-08T17:56:30Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`f51e89a`	ok	5937457

version	total_tests
`f51e89a`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-05-08T19:12:22Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`cf2cf03`	ok	5937457

version	total_tests
`cf2cf03`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-05-08T19:21:07Z

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`9db5679`	ok	5937457

version	total_tests
`9db5679`	5937457

correctness_percentage
100.0

Reformat journal index

a1b2b7c

coffeegoddd added the correctness_approved label Apr 25, 2024

max-hoffman and others added 2 commits April 25, 2024 12:36

edits

7a0804b

[ga-format-pr] Run go/utils/repofmt/format_repo.sh and go/Godeps/upda…

dbeecee

…te.sh

coffeegoddd removed the correctness_approved label Apr 25, 2024

coffeegoddd and others added 4 commits April 25, 2024 20:16

[skip actions] [ga-update-correctness] SQL Correctness updated to 99.…

afb8c1d

…51096

fix race

39a42fc

Merge branch 'max/streamline-journal-index' of github.com:dolthub/dol…

e4d4824

…t into max/streamline-journal-index

write 16 bytes, not 20

8e519a7

coffeegoddd and others added 3 commits April 26, 2024 18:47

[skip actions] [ga-update-correctness] SQL Correctness updated to 99.…

b87ca4c

…378724

fix test

9b01f6a

Merge branch 'max/streamline-journal-index' of github.com:dolthub/dol…

7bf0ce8

…t into max/streamline-journal-index

coffeegoddd and others added 3 commits April 27, 2024 00:08

[skip actions] [ga-update-correctness] SQL Correctness updated to 99.…

54b6ae4

…387347

partial index updates are safe

929312b

Merge branch 'max/streamline-journal-index' of github.com:dolthub/dol…

a008969

…t into max/streamline-journal-index

coffeegoddd and others added 3 commits April 29, 2024 17:20

[skip actions] [ga-update-correctness] SQL Correctness updated to 99.…

9bf443c

…686701

edit

2e742d7

Merge branch 'max/streamline-journal-index' of github.com:dolthub/dol…

ee36b06

…t into max/streamline-journal-index

coffeegoddd added the correctness_approved label May 3, 2024

coffeegoddd and others added 3 commits May 3, 2024 19:42

[skip actions] [ga-update-correctness] SQL Correctness updated to 100

51181f1

cleanup

447f3c4

Merge branch 'max/streamline-journal-index' of github.com:dolthub/dol…

290471d

…t into max/streamline-journal-index

max-hoffman requested a review from reltuk May 3, 2024 23:03

reltuk approved these changes May 6, 2024

comments

0cc415b

save index rewrite if novel exceeds max

9042954

bugs

f51e89a

max-hoffman and others added 3 commits May 8, 2024 11:25

fix tests, better docs

a693d63

Merge branch 'main' into max/streamline-journal-index

cf2cf03

[ga-format-pr] Run go/utils/repofmt/format_repo.sh and go/Godeps/upda…

9db5679

…te.sh

max-hoffman merged commit 084b835 into main May 8, 2024
19 of 20 checks passed

max-hoffman deleted the max/streamline-journal-index branch May 8, 2024 19:58

BrewTestBot mentioned this pull request May 8, 2024

dolt 1.36.1 Homebrew/homebrew-core#171208

Merged

max-hoffman mentioned this pull request May 8, 2024

Journal index offset 8bytes #7836

Merged

BrewTestBot mentioned this pull request May 9, 2024

dolt 1.37.0 Homebrew/homebrew-core#171215

Merged
