
Index rebuilds with external key sorting #7754

Merged: 26 commits merged into main from max/external-sort-index-build on May 2, 2024

Conversation

@max-hoffman commented Apr 16, 2024:

Index builds now write keys to intermediate files and merge sort them before materializing the prolly tree for the secondary index. This contrasts with the default approach, which rebuilds the prolly tree each time we flush keys from memory. The old approach touches most of the tree with random reads and writes because the flushed keys are unsorted. The new approach structures the work for sequential IO by flushing sorted runs that are incrementally merge sorted. Sequential IO is dramatically faster on disk-based systems.
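
For intuition, here is a minimal sketch of the sorted-run spill step described above, in illustrative Go (the flushRun helper, the length-prefixed on-disk format, and the os.CreateTemp usage are assumptions for the sketch, not the PR's actual API). Each time the in-memory key buffer fills, it is sorted and written as one sequential run; merging the runs afterwards (see the k-way merge sketch further down) yields a single sorted stream from which the secondary index tree can be materialized in one pass.

```go
// Illustrative external-sort run writer: sort the in-memory keys and spill
// them to a temp file as one sequential, length-prefixed run.
// Hypothetical names and format; not the PR's actual API.
package sortsketch

import (
	"bufio"
	"encoding/binary"
	"os"
	"sort"
)

// flushRun writes one sorted run and returns the temp file's path.
func flushRun(keys [][]byte, cmp func(a, b []byte) int) (string, error) {
	sort.Slice(keys, func(i, j int) bool { return cmp(keys[i], keys[j]) < 0 })

	f, err := os.CreateTemp("", "key_file_")
	if err != nil {
		return "", err
	}
	defer f.Close()

	w := bufio.NewWriter(f)
	var sz [binary.MaxVarintLen64]byte
	for _, k := range keys {
		n := binary.PutUvarint(sz[:], uint64(len(k)))
		if _, err := w.Write(sz[:n]); err != nil {
			return "", err
		}
		if _, err := w.Write(k); err != nil {
			return "", err
		}
	}
	if err := w.Flush(); err != nil {
		return "", err
	}
	return f.Name(), nil
}
```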

@coffeegoddd: @max-hoffman DOLT correctness results for c3065ec: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @max-hoffman DOLT correctness results for b9f36b6: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @max-hoffman DOLT correctness results for 15d6d6a: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @coffeegoddd DOLT correctness results for 76893a7: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @max-hoffman DOLT correctness results for 8cd6aa2: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @coffeegoddd DOLT correctness results for e59b5a5: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@max-hoffman changed the title from "starter for external sorting index rebuilds" to "Index rebuilds with external key sorting" on Apr 17, 2024
@max-hoffman (author) commented:

I tested the preliminary speedup by comparing 1 million -> 3 million -> 10 million row build times on an HDD:

old: 19 seconds -> 15 minutes -> 280 minutes
new: 30 seconds -> 2 minutes -> 6 minutes

For comparison, building 3 million rows on an SSD goes from 10 seconds -> 30 seconds.

Tuning has some effect on the scaling constants, but the main difference is disk seeks on the HDD. We now build the tree only once, with sorted edits, by pre-sorting the keys with a merge sort. The previous version would rebuild the whole tree O(log n) times, with each rebuild reading every chunk with random IO. The merge sort shifts that work upfront into sequential IO.
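
As a rough, illustrative IO model (my own framing, not a claim made in the PR): let N be the number of keys, M the number of keys that fit in memory, k the merge fan-in, |T| the number of chunks in the tree, and R the number of tree rebuilds under the old scheme. Then, roughly,

\[
W_{\text{old}} \approx R \cdot |T| \cdot C_{\text{rand}},
\qquad
W_{\text{new}} \approx 2N\left(1 + \left\lceil \log_k \tfrac{N}{M} \right\rceil\right) C_{\text{seq}},
\]

and since a random chunk read costs far more on an HDD than streaming the same bytes sequentially (\(C_{\text{rand}} \gg C_{\text{seq}}\)), the gap widens sharply once the table outgrows memory, which is consistent with the 280-minute vs 6-minute result above.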

@max-hoffman marked this pull request as ready for review on April 17, 2024, 20:58
@coffeegoddd: @coffeegoddd DOLT correctness results for 82025f1: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @max-hoffman DOLT correctness results for d984695: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @max-hoffman DOLT correctness results for 0d14698: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@reltuk left a comment:

This looks like a great start. Had a number of initial comments. Happy to follow up on any of these. It's quite close, just needs a bit of cleanup, and it's going to be great :)

(Nine review threads on go/store/prolly/sort/external.go, since resolved.)
Review thread on this hunk in go/store/prolly/sort/external.go:

    }
    }
    })
    for i := 0; i < len(a.files); i += 2 {

If I understand the whole logic here, the asymptotics on this strategy as implemented here seem off to me, at least compared to optimality. AFAIU:

  1. we're scanning the entire data set on every compact, and
  2. every compact is coming at fixed size intervals, for example, every time we have 64 new files of 32MB each, we scan every existing file...

Don't we get N^2 asymptotics?

If you take a look at the file sizes of the files in each slot, you'll also see we're just building one big file at index 0 and everything else stays quite a bit smaller than it...so it's also probably not great for parallelism, because we're going to block on the linear scan of that file each compact...

I think an approach where we only merge the n/2 smallest files each compact (we only get by n/4 slots in that case) gets us back to better asymptotics, but there are probably much better approaches which take advantage of the k-way merge (instead of 2-way...)
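
As a rough illustration of this suggestion (not code from the PR; runFile and pickCompaction are hypothetical names), a compaction could select only the smallest half of the run files, so the one big file at index 0 is not rescanned on every pass:

```go
// Sketch: compact only the smallest half of the run files, leaving the
// large, already-merged runs untouched until a later compaction.
// Illustrative only; runFile and pickCompaction are hypothetical.
package sortsketch

import "sort"

type runFile struct {
	name string
	size int64
}

// pickCompaction returns the smallest half of the runs (at least two).
func pickCompaction(runs []runFile) []runFile {
	if len(runs) < 2 {
		return nil
	}
	sorted := append([]runFile(nil), runs...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].size < sorted[j].size })
	n := len(sorted) / 2
	if n < 2 {
		n = 2
	}
	return sorted[:n]
}
```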

@max-hoffman (author) commented Apr 19, 2024:

@reltuk With your changes, index building looks like it runs about 2x as fast as the first version. Compaction strategies seem edge-casey, so I picked what seems like a simple one: leveled tiers that are compacted (with parallelism=1) to the next level after a size threshold.

1 million rows (18s) -> 3 million rows (1 min) -> 10 million rows (4 min)

@coffeegoddd: @max-hoffman DOLT correctness results for 4f2619b: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @max-hoffman DOLT correctness results for 9b09fe7: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @max-hoffman DOLT correctness results for 1b19a44: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@reltuk left a comment:

LGTM. I like the level change and the heap-based multi-merger.

I think it's worth taking a short pass to get timely resource finalization on the file handles and cleanup of the temp files from disk. Somewhat your call.
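
For reference, the heap-based multi-merge mentioned here can be sketched with Go's container/heap; this is illustrative only (keyIter, mergeItem, and kWayMerge are hypothetical names, not the PR's types):

```go
// Sketch of a heap-based k-way merge over sorted runs: always emit the
// smallest head key, then refill the heap from the run it came from.
// Illustrative only; keyIter is a hypothetical iterator interface.
package sortsketch

import "container/heap"

type keyIter interface {
	Next() (key []byte, ok bool) // next key in sorted order; ok=false when exhausted
}

type mergeItem struct {
	key []byte
	src keyIter
}

type mergeHeap struct {
	items []mergeItem
	cmp   func(a, b []byte) int
}

func (h *mergeHeap) Len() int           { return len(h.items) }
func (h *mergeHeap) Less(i, j int) bool { return h.cmp(h.items[i].key, h.items[j].key) < 0 }
func (h *mergeHeap) Swap(i, j int)      { h.items[i], h.items[j] = h.items[j], h.items[i] }
func (h *mergeHeap) Push(x any)         { h.items = append(h.items, x.(mergeItem)) }
func (h *mergeHeap) Pop() any {
	old := h.items
	it := old[len(old)-1]
	h.items = old[:len(old)-1]
	return it
}

// kWayMerge streams keys from all runs in globally sorted order.
func kWayMerge(runs []keyIter, cmp func(a, b []byte) int, emit func([]byte) error) error {
	h := &mergeHeap{cmp: cmp}
	for _, r := range runs {
		if k, ok := r.Next(); ok {
			h.items = append(h.items, mergeItem{key: k, src: r})
		}
	}
	heap.Init(h)
	for h.Len() > 0 {
		it := heap.Pop(h).(mergeItem)
		if err := emit(it.key); err != nil {
			return err
		}
		if k, ok := it.src.Next(); ok {
			heap.Push(h, mergeItem{key: k, src: it.src})
		}
	}
	return nil
}
```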

Comment on lines 106 to 109:

    defer it.Close()
    if err != nil {
        return nil, err
    }

Suggested change (check the error before deferring the Close, since the iterator may not be valid when err is non-nil):

    if err != nil {
        return nil, err
    }
    defer it.Close()

Review comment:

More Close()es, but I think only a defer sorter.Close() up above, if we add timely resource finalization.

Review thread on this hunk:

    }

    func (a *tupleSorter) newFile() (*os.File, error) {
        f, err := a.tmpProv.NewFile("", "key_file_")

Something seems off about the lifetime of these files. We put them in keyIterables, and those return the Iter, but the iter's Close() method doesn't close the file. And there's no way provided in the API to close and clean them up in the case of an error. And it seems like we would want to os.Remove() these files after we're done with them, since the contract was to use a certain number of files?

Near at hand is something like:

  1. Make it explicit that keyFile.IterAll() can only ever be called once.
  2. On keyFile.IterAll(), set keyFile's f *os.File to nil.
  3. func (k *keyFile) Close() { if k.f != nil { k.f.Close(); os.Remove(k.f.Name()) } }
  4. keyFileReader gets the *os.File as well as its buffered reader.
  5. func (r keyFileReader) Close() { r.f.Close(); os.Remove(r.f.Name()) }
  6. func (a *tupleSorter) Close() { for _, fs := range a.files { for _, f := range fs { f.Close() } } }

Or something like that?
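
A minimal sketch of the ownership pattern outlined above, assuming hypothetical keyFile / keyFileReader shapes (not the PR's exact definitions): IterAll hands the *os.File off to the reader, and whichever side still owns the file closes and removes it.

```go
// Sketch of the temp-file finalization pattern described above: exactly one
// owner of the *os.File at a time, and Close both closes and removes it.
// Hypothetical types; not the PR's exact code.
package sortsketch

import (
	"bufio"
	"os"
)

type keyFile struct {
	f *os.File
}

// IterAll may only be called once; it transfers file ownership to the reader.
func (k *keyFile) IterAll() *keyFileReader {
	r := &keyFileReader{f: k.f, rd: bufio.NewReader(k.f)}
	k.f = nil
	return r
}

// Close releases and removes the file if IterAll was never called.
func (k *keyFile) Close() {
	if k.f != nil {
		k.f.Close()
		os.Remove(k.f.Name())
	}
}

type keyFileReader struct {
	f  *os.File
	rd *bufio.Reader
}

// Close removes the temp file once the reader is done with it.
func (r *keyFileReader) Close() {
	r.f.Close()
	os.Remove(r.f.Name())
}
```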

Review thread on this hunk:

    outF := newKeyFile(newF, a.batchSize)
    m, err := newFileMerger(ctx, a.keyCmp, outF, fileLevel[:a.fileMax]...)
    if err != nil {
        return err

If we add resource finalization, careful not to leak newF or any of fileLevel[:a.fileMax] that wasn't successfully IterAll'd here.

Review thread on this hunk:

        return err
    }
    if err := m.run(ctx); err != nil {
        return err

If we add resource finalization, careful not to leak the file newF here. m.run() should probably guarantee that all iterators it made are closed.

Review thread on this hunk:

    } else {
        reader.iter.Close()
        if err != nil {
            return err

If the contract is that these are all closed when we return, we need to add some code to close all the ones remaining in m.mq here.

(Five more review threads on go/store/prolly/sort/external.go, since resolved.)
Co-authored-by: Aaron Son <aaron@dolthub.com>
@coffeegoddd: @max-hoffman DOLT correctness results for 3194cd4: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @max-hoffman DOLT correctness results for 77827df: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @max-hoffman DOLT correctness results for 034fc3b: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @max-hoffman DOLT correctness results for d7bd28b: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @max-hoffman DOLT correctness results for e190f41: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@coffeegoddd: @coffeegoddd DOLT correctness results for ed8bdca: 5937457/5937457 tests ok, correctness_percentage 100.0 (comparing 100.000000 to 100.000000)

@max-hoffman merged commit 9ec3ce2 into main on May 2, 2024
20 checks passed
@max-hoffman deleted the max/external-sort-index-build branch on May 2, 2024, 16:36
github-actions bot commented May 2, 2024:

@max-hoffman DOLT

| test_name | detail | row_cnt | sorted | mysql_time | sql_mult | cli_mult |
|---|---|---|---|---|---|---|
| batching | LOAD DATA | 10000 | 1 | 0.05 | 1.4 | |
| batching | batch sql | 10000 | 1 | 0.07 | 2 | |
| batching | by line sql | 10000 | 1 | 0.07 | 1.86 | |
| blob | 1 blob | 200000 | 1 | 0.9 | 3.84 | 4.4 |
| blob | 2 blobs | 200000 | 1 | 0.91 | 4.68 | 4.99 |
| blob | no blob | 200000 | 1 | 0.9 | 1.99 | 2.08 |
| col type | datetime | 200000 | 1 | 0.81 | 2.6 | 2.83 |
| col type | varchar | 200000 | 1 | 0.69 | 2.78 | 2.8 |
| config width | 2 cols | 200000 | 1 | 0.82 | 1.99 | 2.13 |
| config width | 32 cols | 200000 | 1 | 1.87 | 1.72 | 2.49 |
| config width | 8 cols | 200000 | 1 | 0.96 | 2.03 | 2.3 |
| pk type | float | 200000 | 1 | 1.01 | 1.65 | 1.74 |
| pk type | int | 200000 | 1 | 0.84 | 1.98 | 2.06 |
| pk type | varchar | 200000 | 1 | 1.61 | 1.44 | 1.39 |
| row count | 1.6mm | 1600000 | 1 | 5.63 | 2.42 | 2.58 |
| row count | 400k | 400000 | 1 | 1.49 | 2.23 | 2.39 |
| row count | 800k | 800000 | 1 | 2.79 | 2.44 | 2.59 |
| secondary index | four index | 200000 | 1 | 3.53 | 1.29 | 1.09 |
| secondary index | no secondary | 200000 | 1 | 0.9 | 2.04 | 2.1 |
| secondary index | one index | 200000 | 1 | 1.14 | 2.05 | 2.11 |
| secondary index | two index | 200000 | 1 | 1.95 | 1.6 | 1.51 |
| sorting | shuffled 1mm | 1000000 | 0 | 4.85 | 2.54 | 2.71 |
| sorting | sorted 1mm | 1000000 | 1 | 4.88 | 2.54 | 2.68 |

github-actions bot commented May 2, 2024:

@max-hoffman DOLT

| name | detail | mean_mult |
|---|---|---|
| dolt_blame_basic | system table | 1.23 |
| dolt_blame_commit_filter | system table | 3.3 |
| dolt_commit_ancestors_commit_filter | system table | 0.84 |
| dolt_commits_commit_filter | system table | 0.91 |
| dolt_diff_log_join_from_commit | system table | 2.08 |
| dolt_diff_log_join_to_commit | system table | 2.14 |
| dolt_diff_table_from_commit_filter | system table | 1.12 |
| dolt_diff_table_to_commit_filter | system table | 1.12 |
| dolt_diffs_commit_filter | system table | 1 |
| dolt_history_commit_filter | system table | 1.21 |
| dolt_log_commit_filter | system table | 0.89 |
