This repository has been archived by the owner on Dec 8, 2021. It is now read-only.

backend/local: optimize local writer concurrency and memory usage #560

Closed: glorv wants to merge 23 commits from the opt-local-writer branch

@glorv glorv commented Jan 28, 2021

What problem does this PR solve?

Optimize local writer performance and memory usage.

What is changed and how it works?

Optimize the local writer by:

  • Opening an index LocalWriter for each chunk to avoid the bottleneck when writing index KVs
  • Replacing the async AppendRows with a synchronous operation, to reduce memory usage and simplify the logic
  • Decreasing the size of the encode/deliverLoop channel to reduce memory usage
  • Using a memory buffer to store the temporary KV pairs before writing them to an SST file
  • Batch-ingesting SST files into pebble to avoid the sync overhead of pebble's ingest operation
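
The buffering and batch-ingest items above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `kvPair`, `bufferedWriter`, `appendRows`, `flush`, and `ingestAll` are hypothetical names, the "SST file" is modelled as a sorted in-memory slice, and the ingest step is a plain callback instead of a real pebble call.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// kvPair is a hypothetical key-value pair produced by the encoder.
type kvPair struct {
	key, val []byte
}

// bufferedWriter is an illustrative stand-in for a per-chunk local writer.
// It holds KV pairs in memory and emits one sorted batch per flush.
type bufferedWriter struct {
	buf       []kvPair
	bufBytes  int
	threshold int        // flush once the buffered bytes reach this size
	pending   [][]kvPair // sorted batches waiting to be ingested
}

// appendRows is synchronous: it returns only after the pairs are buffered
// (and flushed, if the threshold was reached), so no channel is involved.
func (w *bufferedWriter) appendRows(kvs []kvPair) {
	for _, kv := range kvs {
		w.buf = append(w.buf, kv)
		w.bufBytes += len(kv.key) + len(kv.val)
	}
	if w.bufBytes >= w.threshold {
		w.flush()
	}
}

// flush sorts the buffered pairs by key and queues them as one batch.
func (w *bufferedWriter) flush() {
	if len(w.buf) == 0 {
		return
	}
	sort.Slice(w.buf, func(i, j int) bool {
		return bytes.Compare(w.buf[i].key, w.buf[j].key) < 0
	})
	w.pending = append(w.pending, w.buf)
	w.buf, w.bufBytes = nil, 0
}

// ingestAll passes every pending batch to the callback in a single call,
// standing in for one batched ingest instead of one ingest per file.
func (w *bufferedWriter) ingestAll(ingest func(batches [][]kvPair)) {
	w.flush()
	if len(w.pending) > 0 {
		ingest(w.pending)
		w.pending = nil
	}
}

func main() {
	w := &bufferedWriter{threshold: 8}
	w.appendRows([]kvPair{{[]byte("b"), []byte("2")}, {[]byte("a"), []byte("1")}})
	w.appendRows([]kvPair{{[]byte("d"), []byte("4444444")}})
	w.appendRows([]kvPair{{[]byte("c"), []byte("3")}})

	calls := 0
	w.ingestAll(func(batches [][]kvPair) {
		calls++
		fmt.Println("batches ingested in one call:", len(batches))
	})
	fmt.Println("ingest calls:", calls)
}
```

In this sketch the second `appendRows` call crosses the 8-byte threshold and produces the first sorted batch; the remaining pair is flushed by `ingestAll`, which then hands both batches to the callback at once.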

Bench result:
The benchmarks were run on a 40-core machine with an NVMe disk, using the following three data sets and table schemas:

  • DataSet1. 14k warehouse tpcc data
  • DataSet2. 1k warehouse order_line table with 3 indexes, so each row generates 4 KVs.
    PRIMARY KEY (`ol_w_id`,`ol_d_id`,`ol_o_id`,`ol_number`),
    KEY `idx_d_i` (`ol_d_id`, `ol_i_id`),
    KEY `idx_d_w_supply` (`ol_d_id`, `ol_w_id`, `ol_supply_w_id`)
    
  • DataSet3. 1k warehouse order_line table with 7 indexes, so each row generates 8 KVs.
    PRIMARY KEY (`ol_w_id`,`ol_d_id`,`ol_o_id`,`ol_number`),
    KEY `idx_d_i` (`ol_d_id`,`ol_i_id`),
    KEY `idx_d_w_supply` (`ol_d_id`,`ol_w_id`,`ol_supply_w_id`),
    KEY `idx_o_amount` (`ol_o_id`,`ol_amount`),
    KEY `idx_d_supply` (`ol_d_id`,`ol_supply_w_id`),
    KEY `idx_o_d_i` (`ol_o_id`,`ol_d_id`,`ol_i_id`),
    KEY `idx_i_id` (`ol_i_id`)
    

Bench Result:

| DataSet | Branch | Peak memory | Time cost |
| --- | --- | --- | --- |
| DataSet1 | master | 34GB | 2h25m |
| DataSet1 | opt-local-writer | 29GB | 1h46m |
| DataSet2 | master | 30GB | 24m57s |
| DataSet2 | opt-local-writer | 30GB | 9m32s |
| DataSet3 | master | 64GB | 1h22m |
| DataSet3 | opt-local-writer | 33GB | 33m |
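
For a quick read of the numbers above, the speedup per data set can be derived from the reported times. The helper below is only an illustrative calculation; `speedup` is a hypothetical name, and the durations are copied from the results as reported.

```go
package main

import (
	"fmt"
	"time"
)

// speedup returns how many times faster the optimized run was than master.
// Both arguments use Go duration syntax, e.g. "2h25m" or "9m32s".
func speedup(master, optimized string) (float64, error) {
	m, err := time.ParseDuration(master)
	if err != nil {
		return 0, err
	}
	o, err := time.ParseDuration(optimized)
	if err != nil {
		return 0, err
	}
	if o <= 0 {
		return 0, fmt.Errorf("optimized duration must be positive, got %v", o)
	}
	return m.Seconds() / o.Seconds(), nil
}

func main() {
	// Times taken from the benchmark results in this PR description.
	runs := []struct{ name, master, optimized string }{
		{"DataSet1", "2h25m", "1h46m"},
		{"DataSet2", "24m57s", "9m32s"},
		{"DataSet3", "1h22m", "33m"},
	}
	for _, r := range runs {
		s, err := speedup(r.master, r.optimized)
		if err != nil {
			fmt.Println(r.name, "error:", err)
			continue
		}
		fmt.Printf("%s: %.2fx faster\n", r.name, s)
	}
}
```

This works out to roughly 1.37x for DataSet1, 2.62x for DataSet2, and 2.48x for DataSet3, so the gain is largest on the index-heavy data sets.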

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

Related changes

@glorv glorv force-pushed the opt-local-writer branch 4 times, most recently from 1708476 to 79a411d Compare February 3, 2021 06:03
@glorv glorv added the status/PTAL This PR is ready for review. Add this label back after committing new changes label Feb 3, 2021
@sleepymole sleepymole mentioned this pull request Feb 5, 2021
glorv commented Feb 22, 2021

/run-all-tests

Comment on lines 1268 to 1272
engineCtx := context.WithValue(ctx, kv.LocalEngineConfigKey, kv.LocalEngineConfig{
	Compact:            true,
	CompactConcurrency: 4,
	CompactThreshold:   4 << 30, // 4GB
})
Review comment from a collaborator:

It is very strange to use context.WithValue to pass in these settings. Since we control the interface of OpenEngine, we could add an argument specifying the type of engine (index engine, sorted data engine, or unsorted data engine).
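
One way to read this suggestion is sketched below: the caller names the engine type explicitly, and the settings are derived from it instead of travelling through the context. Everything here is hypothetical: `EngineKind`, its constants, and `configFor` are illustrative names, the `LocalEngineConfig` struct only mirrors the fields visible in the snippet under review, and the choice of which engine type enables compaction is an assumption made for the example.

```go
package main

import "fmt"

// EngineKind is a hypothetical enum for the engine types named in the review.
type EngineKind int

const (
	IndexEngine EngineKind = iota
	DataEngineSorted
	DataEngineUnsorted
)

// LocalEngineConfig mirrors the fields shown in the reviewed snippet.
type LocalEngineConfig struct {
	Compact            bool
	CompactConcurrency int
	CompactThreshold   int64
}

// configFor derives the engine settings from an explicit kind argument,
// so nothing has to be smuggled through context.WithValue.
// Assumption for illustration only: just the index engine compacts.
func configFor(kind EngineKind) LocalEngineConfig {
	switch kind {
	case IndexEngine:
		return LocalEngineConfig{
			Compact:            true,
			CompactConcurrency: 4,
			CompactThreshold:   4 << 30, // 4GB, as in the reviewed snippet
		}
	default:
		return LocalEngineConfig{}
	}
}

func main() {
	for _, k := range []EngineKind{IndexEngine, DataEngineSorted, DataEngineUnsorted} {
		fmt.Printf("kind=%d config=%+v\n", k, configFor(k))
	}
}
```

An explicit parameter keeps the settings visible in the function signature and checked by the compiler, whereas a context value is untyped at the call site and easy to omit silently.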

glorv commented Feb 25, 2021

/run-all-tests

glorv commented Feb 25, 2021

closed in favor of pingcap/br#753

@glorv glorv closed this Feb 25, 2021
2 participants