Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compress boltdb files to gzip while uploading from shipper #2507

Merged

Conversation

sandeepsukhani
Copy link
Contributor

@sandeepsukhani sandeepsukhani commented Aug 14, 2020

What this PR does / why we need it:
Adds supports for compressing boltdb files before uploading which is expected to reduce file sizes by 1/3rd.
The files which are compressed would have .gz extension and until we build a compactor we would have both compressed and uncompressed files in the storage which is taken care of in the read path.

While it is not necessary but it would be good to merge PR #2487 before merging this to avoid having some files in both compressed and uncompressed form.

It uses gzip pool that we already have built for reducing allocations while compressing/decompressing chunks.

Checklist

  • Tests updated

@codecov-commenter
Copy link

codecov-commenter commented Aug 14, 2020

Codecov Report

Merging #2507 into master will increase coverage by 0.14%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2507      +/-   ##
==========================================
+ Coverage   63.45%   63.60%   +0.14%     
==========================================
  Files         163      163              
  Lines       14263    14273      +10     
==========================================
+ Hits         9051     9078      +27     
+ Misses       4499     4481      -18     
- Partials      713      714       +1     
Impacted Files Coverage Δ
pkg/storage/stores/shipper/downloads/table.go 67.88% <100.00%> (+0.66%) ⬆️
pkg/storage/stores/shipper/uploads/table.go 69.62% <100.00%> (+0.99%) ⬆️
pkg/logql/evaluator.go 92.88% <0.00%> (+0.40%) ⬆️
pkg/promtail/positions/positions.go 59.64% <0.00%> (+13.15%) ⬆️

Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see the randomness removed from the test then LGTM.

@@ -142,7 +147,28 @@ func SetupDBTablesAtPath(t *testing.T, tableName, path string, dbs map[string]DB

for name, dbRecords := range dbs {
AddRecordsToDB(t, filepath.Join(tablePath, name), boltIndexClient, dbRecords.Start, dbRecords.NumRecords)
if compressRandomFiles && rand.Intn(2) == 0 {
Copy link
Member

@owen-d owen-d Aug 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I think adding randomness like this to tests is bad practice because it creates flakey test cases. Instead, we can run the test case twice, once against compressed files and once against uncompressed files, or you can make this "randomness" deterministic by basing it off of the index like

var i int
for name, dbRecords := range dbs {
  if compress && i % 2 == 0 {
    // compress it
  }
  i++
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed it. Thanks for the review!

@sandeepsukhani sandeepsukhani merged commit 97dfb29 into grafana:master Aug 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants