Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Figure out why Kilic is so slow #37

Closed
gballet opened this issue Apr 28, 2021 · 6 comments
Closed

Figure out why Kilic is so slow #37

gballet opened this issue Apr 28, 2021 · 6 comments
Labels
question Further information is requested

Comments

@gballet
Copy link
Member

gballet commented Apr 28, 2021

A mainnet tree conversion with kilic is ~2x as slow as with herumi. I doubt it's entirely due to the library being slower. This is a tracking issue to figure out if the problem is indeed on the kilic side or if it's based on our usage of it.

@gballet gballet added the question Further information is requested label Apr 28, 2021
@gballet
Copy link
Member Author

gballet commented Apr 28, 2021

A naive benchmark shows no significant difference, kilic even being faster with 256:

kilic:

goos: linux
goarch: amd64
pkg: github.com/gballet/go-verkle
BenchmarkCommitLeaves/insert/leaves/1000/width/10-8         	       1	1474368639 ns/op	292095272 B/op	 3192268 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/1000/width/10-8  	       7	 155914651 ns/op	22354277 B/op	   65587 allocs/op
BenchmarkCommitLeaves/insert/leaves/10000/width/10-8        	       1	1376285817 ns/op	146847120 B/op	  636855 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/10000/width/10-8 	       1	1661157609 ns/op	156908680 B/op	  709441 allocs/op
BenchmarkCommitLeaves/insert/leaves/1000/width/8-8          	       7	 152014956 ns/op	12296058 B/op	   65042 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/1000/width/8-8   	       6	 179306638 ns/op	14418436 B/op	   79603 allocs/op
BenchmarkCommitLeaves/insert/leaves/10000/width/8-8         	       1	1419358018 ns/op	111304536 B/op	  667735 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/10000/width/8-8  	       1	1755614599 ns/op	120034984 B/op	  731625 allocs/op
BenchmarkCommitFullNode/width/10-8                          	      34	  33119308 ns/op	  799179 B/op	   10952 allocs/op
BenchmarkCommitFullNode/width/8-8                           	     100	  11298582 ns/op	  225296 B/op	    2840 allocs/op
PASS
ok  	github.com/gballet/go-verkle	15.269s

herumi:

goos: linux
goarch: amd64
pkg: github.com/gballet/go-verkle
BenchmarkCommitLeaves/insert/leaves/1000/width/10-8         	       1	1259425911 ns/op	15187432 B/op	   16579 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/1000/width/10-8  	       6	 171992046 ns/op	14097186 B/op	   15976 allocs/op
BenchmarkCommitLeaves/insert/leaves/10000/width/10-8        	       1	1432095986 ns/op	60271552 B/op	  118016 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/10000/width/10-8 	       1	1500091174 ns/op	61187144 B/op	  136318 allocs/op
BenchmarkCommitLeaves/insert/leaves/1000/width/8-8          	       8	 139774493 ns/op	 3760832 B/op	   13300 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/1000/width/8-8   	       6	 167537888 ns/op	 3897029 B/op	   16084 allocs/op
BenchmarkCommitLeaves/insert/leaves/10000/width/8-8         	       1	1317603382 ns/op	18962752 B/op	  117465 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/10000/width/8-8  	       1	1418402187 ns/op	19791336 B/op	  134347 allocs/op
BenchmarkCommitFullNode/width/10-8                          	      19	  59106862 ns/op	  463930 B/op	    8781 allocs/op
BenchmarkCommitFullNode/width/8-8                           	      73	  14779280 ns/op	  116086 B/op	    2188 allocs/op
PASS
ok  	github.com/gballet/go-verkle	13.406s

It looks like kilic is much faster than herumi when the nodes are really full, and seems to be even more pronounced when nodes are wider. So far comparisons only occured on a width of 256, I'll run benchmarks with width 1024.

@gballet
Copy link
Member Author

gballet commented Apr 28, 2021

One explanation could be that we set the "lincomb" threshold based on herumi's numbers, and kilic could work with an ever lower/higher threshold. @s1na do you remember which bignum library you used to find a threshold of 110?

@gballet
Copy link
Member Author

gballet commented Apr 29, 2021

As expected, when comparing the time taken to translate a mainnet trie into a verkle tree, it is noticeably faster but still 40% slower. ~10h(herumi) vs ~14h (kilic).

@gballet
Copy link
Member Author

gballet commented Apr 30, 2021

Running the benchmarks on an AWS instance x5.large

herumi:

goos: linux
goarch: amd64
pkg: github.com/gballet/go-verkle
cpu: Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
BenchmarkProofCalculation-4            6         177339859 ns/op         1005608 B/op      21452 allocs/op
BenchmarkProofVerification-4         487           2486951 ns/op           20368 B/op         73 allocs/op
BenchmarkCommitLeaves/insert/leaves/1000/width/10-4                   10         108025027 ns/op        13653744 B/op      11873 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/1000/width/10-4             8         127878044 ns/op        13799688 B/op      14880 allocs/op
BenchmarkCommitLeaves/insert/leaves/10000/width/10-4                   2         981093258 ns/op        58890640 B/op     108000 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/10000/width/10-4            1        1075515054 ns/op        59779912 B/op     126228 allocs/op
BenchmarkCommitLeaves/insert/leaves/1000/width/8-4                    12          99175598 ns/op         3594284 B/op      12271 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/1000/width/8-4              9         117806597 ns/op         3727641 B/op      15046 allocs/op
BenchmarkCommitLeaves/insert/leaves/10000/width/8-4                    2         923288118 ns/op        17821912 B/op     107731 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/10000/width/8-4             1        1004784412 ns/op        18645288 B/op     124762 allocs/op
BenchmarkCommitFullNode/width/10-4                                    28          40760101 ns/op          324521 B/op       7757 allocs/op
BenchmarkCommitFullNode/width/8-4                                    100          10180493 ns/op           81221 B/op       1932 allocs/op
BenchmarkModifyLeaves-4                                                1        15078576557 ns/op       130852168 B/op   1251054 allocs/op

kilic:

goos: linux
goarch: amd64
pkg: github.com/gballet/go-verkle
cpu: Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
BenchmarkProofCalculation-4            9         122313086 ns/op         2421853 B/op      34401 allocs/op
BenchmarkProofVerification-4         663           1834717 ns/op           65265 B/op        421 allocs/op
BenchmarkCommitLeaves/insert/leaves/1000/width/10-4                   10         100830027 ns/op        19457588 B/op      47432 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/1000/width/10-4             8         129216877 ns/op        21877086 B/op      63958 allocs/op
BenchmarkCommitLeaves/insert/leaves/10000/width/10-4                   1        1227360081 ns/op        145180104 B/op    626447 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/10000/width/10-4            1        1347071574 ns/op        155180896 B/op    698801 allocs/op
BenchmarkCommitLeaves/insert/leaves/1000/width/8-4                     8         126863070 ns/op        12174806 B/op      64138 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/1000/width/8-4              7         153912399 ns/op        14287331 B/op      78662 allocs/op
BenchmarkCommitLeaves/insert/leaves/10000/width/8-4                    1        1259118860 ns/op        110126448 B/op    658312 allocs/op
BenchmarkCommitLeaves/insertOrdered/leaves/10000/width/8-4             1        1370844519 ns/op        118980800 B/op    723114 allocs/op
BenchmarkCommitFullNode/width/10-4                                    45          26299748 ns/op          667911 B/op       9927 allocs/op
BenchmarkCommitFullNode/width/8-4                                    135           8771252 ns/op          192461 B/op       2584 allocs/op
BenchmarkModifyLeaves-4                                                1        8808043081 ns/op        455726400 B/op   3727681 allocs/op

Building the trie with kilic is noticeably slower, however calculating proofs / root commitments is noticeably faster 🤔

@gballet
Copy link
Member Author

gballet commented May 18, 2021

After changing multiExpThreshold to 26, the conversion comes down to ~11 hours, which indicates that knowing when to call LinComb has a significant impact on performance.

CPU profiling output of a conversion:
profile001

@gballet
Copy link
Member Author

gballet commented Nov 11, 2021

We are no longer using KZG or BLS, closing this tracker issue.

@gballet gballet closed this as completed Nov 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

1 participant