String Bank Performance Gains #1367

Merged: 1 commit, Aug 26, 2022

Contributor

@mbolt35 mbolt35 commented Aug 25, 2022

Summary

This change updates stringutil.Bank to use a map[string]string guarded by a sync.Mutex instead of a sync.Map. The earlier choice of sync.Map was based on the Go documentation's description of the "proper" use cases for sync.Map:

The Map type is optimized for two common use cases: (1) when the entry for a given key is only ever written once but read many times, as in caches that only grow, or (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys. In these two cases, use of a Map may significantly reduce lock contention compared to a Go map paired with a separate Mutex or RWMutex.
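
As a rough illustration of the map-plus-mutex approach described above, here is a minimal sketch. The variable names, the ClearBank and BankSize helpers, and the internal layout are assumptions made for illustration; this is not the code merged in this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical sketch: a string bank backed by a plain map and a mutex.
var (
	bankLock sync.Mutex
	bank     = make(map[string]string)
)

// Bank returns the canonical copy of s, storing s the first time it is seen.
func Bank(s string) string {
	bankLock.Lock()
	defer bankLock.Unlock()
	if ss, ok := bank[s]; ok {
		return ss
	}
	bank[s] = s
	return s
}

// ClearBank drops every cached string.
func ClearBank() {
	bankLock.Lock()
	defer bankLock.Unlock()
	bank = make(map[string]string)
}

// BankSize reports how many unique strings are cached (helper for this sketch).
func BankSize() int {
	bankLock.Lock()
	defer bankLock.Unlock()
	return len(bank)
}

func main() {
	a := Bank("namespace")
	b := Bank("namespace")
	fmt.Println(a == b, BankSize())
}
```

One plausible reason for the drop in allocations is that sync.Map stores keys and values as interface values, so a LoadOrStore(s, s) call has to box the string, while a typed map[string]string does not.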

The results from the benchmarks are as follows:

goos: linux
goarch: amd64
pkg: github.com/opencost/opencost/pkg/util/stringutil
cpu: Intel(R) Core(TM) i7-7800X CPU @ 3.50GHz

stringutil.Bank

Legacy stringutil.Bank implementation using sync.Map

BenchmarkLegacyStringBank90PercentDuplicate-12    	       2	 541466030 ns/op	50062496 B/op	 2205611 allocs/op
BenchmarkLegacyStringBank75PercentDuplicate-12    	       2	 712892249 ns/op	77102440 B/op	 2513852 allocs/op
BenchmarkLegacyStringBank50PercentDuplicate-12    	       2	 559443800 ns/op	50060480 B/op	 2205600 allocs/op
BenchmarkLegacyStringBank25PercentDuplicate-12    	       2	 747127020 ns/op	113600264 B/op	 3527178 allocs/op
BenchmarkLegacyStringBankNoDuplicate-12           	       1	1024655135 ns/op	179816424 B/op	 4038152 allocs/op

New stringutil.Bank implementation using map[string]string and sync.Mutex

BenchmarkStringBank90PercentDuplicate-12          	       5	 209994527 ns/op	10631940 B/op	    3919 allocs/op
BenchmarkStringBank75PercentDuplicate-12          	       4	 321212347 ns/op	40644560 B/op	    9486 allocs/op
BenchmarkStringBank50PercentDuplicate-12          	       5	 223152649 ns/op	10648472 B/op	    3977 allocs/op
BenchmarkStringBank25PercentDuplicate-12          	       3	 457292383 ns/op	83616184 B/op	   27167 allocs/op
BenchmarkStringBankNoDuplicate-12                 	       2	 654713730 ns/op	162575576 B/op	   38279 allocs/op

At a 90% collision rate (90% of the 1,000,000 total strings were duplicates), the new Bank made 2,201,692 fewer allocations and allocated roughly 21% of the bytes that the legacy Bank did. It also ran more than 2.5x faster than the legacy Bank.
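
The comparison above can be re-derived from the two 90PercentDuplicate rows. The short program below is only a worked check of that arithmetic; the result type and helper names are made up for this sketch, and the numbers are copied from the benchmark output in this description.

```go
package main

import "fmt"

// result holds the three metrics from one line of `go test -bench` output.
type result struct {
	nsPerOp, bytesPerOp, allocsPerOp int64
}

var (
	// BenchmarkLegacyStringBank90PercentDuplicate-12
	legacy = result{nsPerOp: 541466030, bytesPerOp: 50062496, allocsPerOp: 2205611}
	// BenchmarkStringBank90PercentDuplicate-12
	updated = result{nsPerOp: 209994527, bytesPerOp: 10631940, allocsPerOp: 3919}
)

// allocsSaved is the absolute drop in allocations per op.
func allocsSaved(old, cur result) int64 { return old.allocsPerOp - cur.allocsPerOp }

// bytesRatio is the new bytes per op as a fraction of the old.
func bytesRatio(old, cur result) float64 {
	return float64(cur.bytesPerOp) / float64(old.bytesPerOp)
}

// speedup is how many times faster the new run is.
func speedup(old, cur result) float64 {
	return float64(old.nsPerOp) / float64(cur.nsPerOp)
}

func main() {
	fmt.Println("allocations saved:", allocsSaved(legacy, updated))
	fmt.Printf("bytes allocated vs legacy: %.1f%%\n", 100*bytesRatio(legacy, updated))
	fmt.Printf("speedup: %.2fx\n", speedup(legacy, updated))
}
```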

stringutil.BankFunc

Using this type of map also opened the door to further optimization with util.Buffer in the bytesToString(b []byte) string implementation. We can now use the existing []byte to check the cache instead of allocating a new string just to throw it away.
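
To make the "check the cache with the existing []byte" idea concrete, here is a minimal sketch. It is not the BankFunc implementation from this PR: BankBytes and BankLen are hypothetical names, and instead of the pinned byte-to-string conversion described here it leans on the Go compiler's handling of m[string(b)] in a map lookup, which avoids allocating a temporary string for the key.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	bankLock sync.Mutex
	bank     = make(map[string]string)
)

// BankBytes (hypothetical) returns the banked string equal to b.
// A string is allocated only when b has not been seen before.
func BankBytes(b []byte) string {
	bankLock.Lock()
	defer bankLock.Unlock()
	// Lookup only: the compiler does not allocate for string(b) in this form.
	if s, ok := bank[string(b)]; ok {
		return s
	}
	s := string(b) // the one allocation, paid once per unique string
	bank[s] = s
	return s
}

// BankLen reports how many unique strings are cached (helper for this sketch).
func BankLen() int {
	bankLock.Lock()
	defer bankLock.Unlock()
	return len(bank)
}

func main() {
	buf := []byte("kube-system")
	first := BankBytes(buf)
	second := BankBytes([]byte("kube-system"))
	fmt.Println(first == second, BankLen())
}
```

Because the stored string is a copy, the caller is free to reuse or overwrite the byte buffer after the call, which matters when decoding from a shared read buffer.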

The BankFunc benchmarks are very comparable to Bank:

BenchmarkStringBankFunc90PercentDuplicate-12      	       5	 211575388 ns/op	10645822 B/op	    3967 allocs/op
BenchmarkStringBankFunc75PercentDuplicate-12      	       4	 313956547 ns/op	40626992 B/op	    9425 allocs/op
BenchmarkStringBankFunc50PercentDuplicate-12      	       5	 212961263 ns/op	10649969 B/op	    3982 allocs/op
BenchmarkStringBankFunc25PercentDuplicate-12      	       3	 457920294 ns/op	83649976 B/op	   27285 allocs/op
BenchmarkStringBankFuncNoDuplicate-12             	       2	 672757022 ns/op	162559736 B/op	   38224 allocs/op

Buffer.bytesToString

Using the BankFunc option with the Buffer and bingen-file-loader, these bench tests represent a 40-day sequential load of scale Allocation Data:

Using Old Bank() which allocates the new string no matter what, throws away the string if dupe exists, and uses sync.Map for string cache:

BenchmarkOpenAllocationsBingen-12    	        1	88042658214 ns/op	19644383856 B/op	139868514 allocs/op

Using New Bank() which still allocates the new string no matter what and throws away the string if dupe exists:

BenchmarkOpenAllocationsBingen-12    	        1	87603554924 ns/op	19028404576 B/op	101537636 allocs/op

Using BankFunc() for pinned byte -> string Load And Store. Strings aren't allocated on the "key check", only when the allocation should be stored:

BenchmarkOpenAllocationsBingen-12    	        1	62323902163 ns/op	17093897432 B/op	82898037 allocs/op

Load times for this change were 7-8s faster for the full 40 days.

Loading bingen concurrently instead of serially went from 26s to a total time of 13078ms. This change shaved 4-5s off the concurrent load of 40d scale data.

Signed-off-by: Matt Bolt <mbolt35@gmail.com>

What does this PR change?

  • Notable performance gains in string caching and bingen decoding

How will this PR impact users?

  • Faster bingen decoding and string caching

How was this PR tested?

  • Benchmarks and bingen-file-loader.

@AjayTripathy
Contributor

Sweet!

@dwbrown2
Collaborator

Wow, this looks insane!

Contributor

@nikovacevic nikovacevic left a comment

⚡ ⚡ ⚡

@mbolt35 mbolt35 self-assigned this Aug 25, 2022
Contributor

@michaelmdresser michaelmdresser left a comment

Amazing documentation, benchmarks, and super helpful PR notes. The speedup is nice too 😁

Comment on lines +14 to +22
// This is the old implementation of the string bank to use for comparison benchmarks
func BankLegacy(s string) string {
	ss, _ := oldBank.LoadOrStore(s, s)
	return ss.(string)
}

func ClearBankLegacy() {
	oldBank = sync.Map{}
}
Contributor

Should these be exported?

Contributor Author

Exported is fine in the test context, since _test files aren't compiled outside of testing, but not necessary. Probably worth a change.

Contributor

I didn't know that about compilation, always good to expand my knowledge of Go esoterica a bit.

Comment on lines +51 to +55
b.StopTimer()
randStrings := generateBenchData(totalStrings, totalUnique)

for i := 0; i < b.N; i++ {
	b.StartTimer()
Contributor

For a future PR, I think this is more idiomatic (unless I'm missing something!):

Suggested change
- b.StopTimer()
- randStrings := generateBenchData(totalStrings, totalUnique)
- for i := 0; i < b.N; i++ {
- 	b.StartTimer()
+ randStrings := generateBenchData(totalStrings, totalUnique)
+ b.ResetTimer()
+ for i := 0; i < b.N; i++ {

Contributor Author

Does Reset() call Start() on an already Stopped() bench?

Contributor

My assumption is that a new call to a BenchmarkFoo func (if being handled by the bench runner for a higher-b.N bench) should have a totally fresh timer. But... can't say for sure.
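
For reference on the timer question in this thread: the testing package documents that ResetTimer zeroes the elapsed time and allocation counters and does not affect whether the timer is running, so it will not restart a timer that StopTimer has stopped. The timer is already running when the benchmark function is entered, which is why setup followed by ResetTimer works without a StopTimer/StartTimer pair. The sketch below shows that pattern; generateData and benchJoin are hypothetical stand-ins, not the benchmark code from this PR.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// generateData is a hypothetical stand-in for the PR's generateBenchData.
func generateData(n int) []string {
	out := make([]string, n)
	for i := range out {
		out[i] = strings.Repeat("x", i%16+1)
	}
	return out
}

// benchJoin performs setup first, then discards the setup cost.
func benchJoin(b *testing.B) {
	data := generateData(1000) // setup: the timer is already running here
	b.ResetTimer()             // zero the counters; the timer keeps running
	for i := 0; i < b.N; i++ {
		_ = strings.Join(data, ",")
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of `go test`.
	res := testing.Benchmark(benchJoin)
	fmt.Println(res.N > 0, res.T > 0)
}
```

If a benchmark has to reset state between iterations with the timer stopped, the per-iteration StartTimer and StopTimer calls are still needed, since ResetTimer alone never changes the running state.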

Comment on lines +60 to +62
ClearBankLegacy()
runtime.GC()
debug.FreeOSMemory()
Contributor

Interesting! Does Go not guarantee a GC in between benchmarks?

Contributor Author

I have no idea, belt and suspenders :)

…ngen decoding

Signed-off-by: Matt Bolt <mbolt35@gmail.com>
@mbolt35 mbolt35 force-pushed the bolt/string-bank-improvements branch from 40fded4 to 5eca029 Compare August 26, 2022 18:04
@mbolt35 mbolt35 merged commit c3fe308 into develop Aug 26, 2022
@michaelmdresser michaelmdresser deleted the bolt/string-bank-improvements branch June 23, 2023 19:10