Improve lz4 compression #2614

Merged
merged 4 commits into from
Sep 30, 2020

cyriltovena
Contributor

  • Move to v4 of the lz4 library.
  • Remove unneeded checksumming (we already checksum in the chunk).
  • The default now writes 4M blocks; reads remain backward compatible.
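To illustrate why a codec-level checksum is redundant when the chunk already carries one, here is a minimal sketch using only the standard library. The names `sealBlock` and `openBlock`, the 4-byte trailer layout, and the choice of CRC-32C are assumptions made for this example, not the actual chunk format in `pkg/chunkenc`.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// castagnoli is the CRC-32C table. Using this polynomial is an assumption
// for the sketch, not a statement about the real chunk format.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

var errChecksum = errors.New("block checksum mismatch")

// sealBlock (hypothetical helper) appends a 4-byte big-endian CRC-32C of the
// already-compressed payload. The payload is treated as opaque bytes.
func sealBlock(compressed []byte) []byte {
	out := make([]byte, len(compressed)+4)
	copy(out, compressed)
	binary.BigEndian.PutUint32(out[len(compressed):], crc32.Checksum(compressed, castagnoli))
	return out
}

// openBlock (hypothetical helper) verifies the trailer and returns the payload.
// Corruption is caught here, before any decompressor sees the bytes.
func openBlock(sealed []byte) ([]byte, error) {
	if len(sealed) < 4 {
		return nil, errChecksum
	}
	payload, sum := sealed[:len(sealed)-4], sealed[len(sealed)-4:]
	if crc32.Checksum(payload, castagnoli) != binary.BigEndian.Uint32(sum) {
		return nil, errChecksum
	}
	return payload, nil
}

func main() {
	sealed := sealBlock([]byte("pretend this is a compressed lz4 frame"))
	_, err := openBlock(sealed)
	fmt.Println("intact block error:", err)

	sealed[0] ^= 0xff // simulate a corrupted byte in storage
	_, err = openBlock(sealed)
	fmt.Println("corrupted block error:", err)
}
```

Because the outer check covers every compressed byte, a second checksum computed inside the compressor would only add CPU work on both the write and the read path.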

Performance-wise, I see about a 20% improvement compared to snappy for reads (writes are slower), along with a better compression ratio, at the expense of more CPU and memory.

See the benchmark and compression test results below.

bench:

go test -benchmem -run=BenchmarkRead -bench BenchmarkRead -v ./pkg/chunkenc
goos: darwin
goarch: amd64
pkg: github.com/grafana/loki/pkg/chunkenc
BenchmarkRead
BenchmarkRead/none
    memchunk_test.go:605: bytes per second  691 MB
    memchunk_test.go:606: n= 1
    memchunk_test.go:605: bytes per second  716 MB
    memchunk_test.go:606: n= 100
BenchmarkRead/none-4         	     100	  10566322 ns/op	 7761962 B/op	   28476 allocs/op
BenchmarkRead/gzip
    memchunk_test.go:605: bytes per second  370 MB
    memchunk_test.go:606: n= 1
    memchunk_test.go:605: bytes per second  322 MB
    memchunk_test.go:606: n= 3
BenchmarkRead/gzip-4         	       3	 341573597 ns/op	112990232 B/op	  413237 allocs/op
BenchmarkRead/lz4-64k
    memchunk_test.go:605: bytes per second  1.1 GB
    memchunk_test.go:606: n= 1
    memchunk_test.go:605: bytes per second  792 MB
    memchunk_test.go:606: n= 15
BenchmarkRead/lz4-64k-4      	      15	  99341841 ns/op	80775355 B/op	  296551 allocs/op
BenchmarkRead/lz4-256k
    memchunk_test.go:605: bytes per second  593 MB
    memchunk_test.go:606: n= 1
    memchunk_test.go:605: bytes per second  821 MB
    memchunk_test.go:606: n= 8
    memchunk_test.go:605: bytes per second  722 MB
    memchunk_test.go:606: n= 10
BenchmarkRead/lz4-256k-4     	      10	 116166896 ns/op	86147066 B/op	  316208 allocs/op
BenchmarkRead/lz4-1M
    memchunk_test.go:605: bytes per second  856 MB
    memchunk_test.go:606: n= 1
    memchunk_test.go:605: bytes per second  682 MB
    memchunk_test.go:606: n= 12
BenchmarkRead/lz4-1M-4       	      12	 125007983 ns/op	87994374 B/op	  321232 allocs/op
BenchmarkRead/lz4
    memchunk_test.go:605: bytes per second  758 MB
    memchunk_test.go:606: n= 1
    memchunk_test.go:605: bytes per second  799 MB
    memchunk_test.go:606: n= 9
    memchunk_test.go:605: bytes per second  951 MB
    memchunk_test.go:606: n= 10
    memchunk_test.go:605: bytes per second  825 MB
    memchunk_test.go:606: n= 13
BenchmarkRead/lz4-4          	      13	 103320290 ns/op	89072964 B/op	  321186 allocs/op
BenchmarkRead/snappy
    memchunk_test.go:605: bytes per second  785 MB
    memchunk_test.go:606: n= 1
    memchunk_test.go:605: bytes per second  621 MB
    memchunk_test.go:606: n= 15
BenchmarkRead/snappy-4       	      15	  97019479 ns/op	61859299 B/op	  226125 allocs/op
PASS
ok  	github.com/grafana/loki/pkg/chunkenc	17.788s
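For anyone who wants to compare the result lines above side by side, here is a rough sketch that parses the standard `go test -bench` result format and prints lz4 relative to snappy. The `benchResult` type and `parseBenchLine` helper are made up for this example; only the two input lines come from the run above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// benchResult holds the per-op metrics from one benchmark result line.
type benchResult struct {
	Name        string
	Iterations  int
	NsPerOp     float64
	BytesPerOp  float64
	AllocsPerOp float64
}

// parseBenchLine (hypothetical helper) parses a line of the form
// "Name N value unit value unit ...", as printed with -benchmem.
func parseBenchLine(line string) (benchResult, error) {
	f := strings.Fields(line)
	if len(f) < 4 || len(f)%2 != 0 {
		return benchResult{}, fmt.Errorf("unexpected benchmark line: %q", line)
	}
	n, err := strconv.Atoi(f[1])
	if err != nil {
		return benchResult{}, fmt.Errorf("iteration count: %w", err)
	}
	r := benchResult{Name: f[0], Iterations: n}
	// Remaining fields come in (value, unit) pairs.
	for i := 2; i+1 < len(f); i += 2 {
		v, err := strconv.ParseFloat(f[i], 64)
		if err != nil {
			return benchResult{}, fmt.Errorf("metric %q: %w", f[i+1], err)
		}
		switch f[i+1] {
		case "ns/op":
			r.NsPerOp = v
		case "B/op":
			r.BytesPerOp = v
		case "allocs/op":
			r.AllocsPerOp = v
		}
	}
	return r, nil
}

func main() {
	// Both lines are copied from the benchmark output above.
	lz4, err := parseBenchLine("BenchmarkRead/lz4-4 13 103320290 ns/op 89072964 B/op 321186 allocs/op")
	if err != nil {
		panic(err)
	}
	snappy, err := parseBenchLine("BenchmarkRead/snappy-4 15 97019479 ns/op 61859299 B/op 226125 allocs/op")
	if err != nil {
		panic(err)
	}
	fmt.Printf("lz4 vs snappy: %.2fx ns/op, %.2fx B/op, %.2fx allocs/op\n",
		lz4.NsPerOp/snappy.NsPerOp,
		lz4.BytesPerOp/snappy.BytesPerOp,
		lz4.AllocsPerOp/snappy.AllocsPerOp)
}
```

Note that the amount of data read per op may differ between encodings, so the logged "bytes per second" lines are probably the fairer read-throughput comparison. The per-op memory and allocation figures are still useful for judging the extra memory cost.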

compression test:

=== RUN   TestChunkSize
=== RUN   TestChunkSize/none
    memchunk_test.go:435: Chunk size 1.5 MB
    memchunk_test.go:436: characters  1516641
=== RUN   TestChunkSize/gzip
    memchunk_test.go:435: Chunk size 1.3 MB
    memchunk_test.go:436: characters  22029822
=== RUN   TestChunkSize/lz4-64k
    memchunk_test.go:435: Chunk size 1.3 MB
    memchunk_test.go:436: characters  15734244
=== RUN   TestChunkSize/lz4-256k
    memchunk_test.go:435: Chunk size 1.3 MB
    memchunk_test.go:436: characters  16774337
=== RUN   TestChunkSize/lz4-1M
    memchunk_test.go:435: Chunk size 1.3 MB
    memchunk_test.go:436: characters  17039875
=== RUN   TestChunkSize/lz4
    memchunk_test.go:435: Chunk size 1.3 MB
    memchunk_test.go:436: characters  17039875
=== RUN   TestChunkSize/snappy
    memchunk_test.go:435: Chunk size 1.3 MB
    memchunk_test.go:436: characters  12059524
--- PASS: TestChunkSize (0.48s)
    --- PASS: TestChunkSize/none (0.00s)
    --- PASS: TestChunkSize/gzip (0.19s)
    --- PASS: TestChunkSize/lz4-64k (0.03s)
    --- PASS: TestChunkSize/lz4-256k (0.04s)
    --- PASS: TestChunkSize/lz4-1M (0.17s)
    --- PASS: TestChunkSize/lz4 (0.03s)
    --- PASS: TestChunkSize/snappy (0.01s)
PASS
ok  	github.com/grafana/loki/pkg/chunkenc	1.064s
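To put the compression test numbers in relative terms, here is a small sketch that divides each encoding's "characters" value by snappy's. It assumes "characters" is the amount of uncompressed input that fit into a chunk of roughly the same final size; that reading is an interpretation of the log output, not something the test states. The `relativeTo` helper is made up for this example.

```go
package main

import "fmt"

// charsPerChunk holds the "characters" values logged by TestChunkSize above.
var charsPerChunk = map[string]int64{
	"gzip":     22029822,
	"lz4-64k":  15734244,
	"lz4-256k": 16774337,
	"lz4-1M":   17039875,
	"lz4":      17039875,
	"snappy":   12059524,
}

// relativeTo (hypothetical helper) returns how many times more input the
// named encoding packed into a chunk compared to the baseline encoding.
func relativeTo(chars map[string]int64, name, baseline string) float64 {
	return float64(chars[name]) / float64(chars[baseline])
}

func main() {
	for _, name := range []string{"gzip", "lz4-64k", "lz4-256k", "lz4-1M", "lz4"} {
		fmt.Printf("%-9s %.2fx snappy\n", name, relativeTo(charsPerChunk, name, "snappy"))
	}
}
```

Under that assumption, the default lz4 setting packs roughly 1.4 times as much input per chunk as snappy, while gzip still compresses best.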

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
@codecov-commenter

Codecov Report

Merging #2614 into master will decrease coverage by 0.00%.
The diff coverage is 83.33%.

@@            Coverage Diff             @@
##           master    #2614      +/-   ##
==========================================
- Coverage   62.87%   62.86%   -0.01%     
==========================================
  Files         170      170              
  Lines       15049    15045       -4     
==========================================
- Hits         9462     9458       -4     
  Misses       4826     4826              
  Partials      761      761              

Impacted Files               Coverage Δ
pkg/chunkenc/pool.go         87.27% <81.81%> (-2.21%) ⬇️
pkg/chunkenc/interface.go    87.50% <100.00%> (ø)
pkg/logql/evaluator.go       92.88% <0.00%> (+0.40%) ⬆️

Member

@owen-d left a comment

LGTM, sorry for the delay. Nice work.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
@cyriltovena cyriltovena merged commit 6500f82 into grafana:master Sep 30, 2020
cyriltovena added a commit to cyriltovena/loki that referenced this pull request Oct 21, 2020
* Improve lz4 compression.

- Move to v4.
- Remove unneeded checksumming.
- Default will now write 4M blocks, it's backward compatible for reads.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* vendor update

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>