download boltdb files parallelly during reads #2483

sandeepsukhani · 2020-08-10T10:16:35Z

What this PR does / why we need it:
When files are first downloaded for reads, we fetch them one at a time. That is quite slow now that we create 96 files per ingester per day. This PR changes the code to download up to 50 files in parallel.

Checklist

Tests updated

codecov-commenter · 2020-08-10T10:28:10Z

Codecov Report

Merging #2483 into master will increase coverage by 0.06%.
The diff coverage is 72.50%.

@@            Coverage Diff             @@
##           master    #2483      +/-   ##
==========================================
+ Coverage   62.91%   62.98%   +0.06%     
==========================================
  Files         162      162              
  Lines       13998    14035      +37     
==========================================
+ Hits         8807     8840      +33     
- Misses       4502     4505       +3     
- Partials      689      690       +1

| Impacted Files | Coverage Δ | |
|---|---|---|
| pkg/storage/stores/shipper/downloads/table.go | `66.66% <72.50%> (+1.66%)` | ⬆️ |
| pkg/logql/evaluator.go | `92.47% <0.00%> (-0.41%)` | ⬇️ |
| pkg/promtail/targets/file/filetarget.go | `69.64% <0.00%> (+1.78%)` | ⬆️ |
| pkg/promtail/targets/file/tailer.go | `78.40% <0.00%> (+4.54%)` | ⬆️ |

slim-bean · 2020-08-10T11:11:53Z

pkg/storage/stores/shipper/downloads/table.go

+	if err != nil {
+		return err
+	}
+
 	folderPath, err := t.folderPathForTable(true)


I think this could be moved before the call to t.doParallelDownload, and you could pass the folder path into that function. That avoids calling t.folderPathForTable both here and inside that function. WDYT?

slim-bean · 2020-08-10T11:17:54Z

pkg/storage/stores/shipper/downloads/table.go

+
+	queue := make(chan chunk.StorageObject)
+	n := util.Min(len(objects), downloadParallelism)
+	incomingErrors := make(chan error, n)


I'm wondering whether this should be an unbuffered channel. Honestly, I'm not sure it will make much difference, given how fast the loop that reads from it should be.

With a buffered channel, one of the workers could hit an error, send it, and then immediately start downloading the next item from the queue.

With an unbuffered channel, the check for success becomes synchronous across all the workers, and any error encountered would result in an immediate cancel before workers start another download.

I think it makes more sense to remove the buffering on this channel.

slim-bean

LGTM!

download boltdb files parallelly during reads

951a88e

sandeepsukhani requested a review from slim-bean August 10, 2020 10:16

pull-request-size bot added the size/L label Aug 10, 2020

slim-bean reviewed Aug 10, 2020

changes suggested from PR review

926215d

slim-bean approved these changes Aug 10, 2020

slim-bean merged commit c5bc416 into grafana:master Aug 10, 2020
