Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reserve disk space before create db engine #6321

Merged
merged 15 commits into from Jan 10, 2020
Merged

Conversation

@Little-Wallace
Copy link
Contributor

Little-Wallace commented Dec 23, 2019

Signed-off-by: Little-Wallace bupt2013211450@gmail.com

What have you changed?

I create a big file in data_dir before TiKV start. When TiKV write so much data into RocksDB that RocksDB, TiKV may crash because the file system can not be store more data.

  • PD could transfer region to another TiKV server, but tikv still need space to write the command of 'transfer region' on disk. If there is no space left, tikv could not execute any command which could change membership of regions on it.
  • If you want to delete some keys of RocksDB to reclaim space, it would not work. Because RocksDB deletes a key by appending a new key with delete tag just like other LSM-Tree engine, which need more disk space.
  • If you want to run a compaction job to reclaim space occupied by deleted keys, RocksDB must created a new file to store result of compaction before it could remove input files.
  • I set 2GB as a default value of reserve_space, because max total size of the files which are involved in a one compaction job is 2GB by default. So RocksDB could reclaim space after the reserve_file is removed.

How to recover with this feature.

If you found TiKV crash or failed to write data into RocksDB, you could update config filetikv.yml and set reserve_space 0MB. Then you could delete the file named 'space_placeholder_file' in data_dir and restart TiKV-process, and use tikv-ctl to compact some range in which many update or delete command were written into.

What is the type of the changes?

  • New feature (a change which adds functionality)

How is the PR tested?

  • Unit test

Does this PR affect documentation (docs) or should it be mentioned in the release notes?

No.

Does this PR affect tidb-ansible?

Yes. I would submit a PR later.

Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
fix fmt
Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
file::delete_file_if_exist(path.clone()).unwrap();
}
let f = File::create(path).unwrap();
f.allocate(file_size).unwrap();

This comment has been minimized.

Copy link
@lonng

lonng Dec 23, 2019

Member

I think we should not panic if cannot reserve enough space configured.

This comment has been minimized.

Copy link
@Little-Wallace

Little-Wallace Dec 23, 2019

Author Contributor

If the capacity of disk is so small that TiKV can not reserve enough space, you can adjust config to a suitable value.

fix test
Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
Copy link
Contributor

brson left a comment

This seems like a plausible solution for solving what I think the problem is.

It needs more comments about the nature of the problem it's solving and how. What kind of recovery triggers the problem? What is happening that causes the disk-space exhaustion? Why does this help? Why is 2GB a reasonable default?

From reading the patch I don't understand how it works. It looks like TiKV allocates a chunk of disk space on startup, but never truncates the file to free the diskspace. How does this patch work to free diskspace during recovery?

A static size of 2GB seems suspicious. Does recovery take a fixed amount of disk space so that a single number should be enough for most (or every) database? Does a human need to tune this number for their specific database, and change it as the size grows?

Having a single function that is overloaded for two purposes (one to allocate the space, one to free the space) makes for a confusing read. One of the purposes (delete) does the opposite of the name of the function (reserve). Two functions would be clearer. If it makes sense to do so, please split it into two functions.

components/tikv_util/src/lib.rs Show resolved Hide resolved
Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
src/storage/config.rs Outdated Show resolved Hide resolved
@@ -55,6 +57,7 @@ pub mod timer;
pub mod worker;

static PANIC_WHEN_UNEXPECTED_KEY_OR_DATA: AtomicBool = AtomicBool::new(false);
const SPACE_PLACEHOLDER_FILE: &str = "space_placeholder_file";

This comment has been minimized.

Copy link
@breeswish

breeswish Dec 27, 2019

Member

maybe call this file reserved_space is more clear

This comment has been minimized.

Copy link
@Little-Wallace

Little-Wallace Dec 31, 2019

Author Contributor

RESERVER_SPACE_FILE ?

This comment has been minimized.

Copy link
@breeswish

breeswish Dec 31, 2019

Member

I'm fine with it.

This comment has been minimized.

Copy link
@Little-Wallace

Little-Wallace Dec 31, 2019

Author Contributor

Emmmm. Its previous name was RESERVER_SPACE_FILE and Jinpeng suggested me to replace with space_placeholder

zhangjinpeng1987 and others added 3 commits Dec 30, 2019
Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
@Little-Wallace

This comment has been minimized.

Copy link
Contributor Author

Little-Wallace commented Dec 31, 2019

/bench

workload:
  threads: 1024
tikv.yml:
  rocksdb:
     lockcf:
         write-buffer-size: "32MB"
@sre-bot

This comment has been minimized.

Copy link
Collaborator

sre-bot commented Dec 31, 2019

Benchmark Report

Run Sysbench Performance Test on VMs

@@                               Benchmark Diff                               @@
================================================================================
tidb: 81a6eb04a585ba38bc8e6e7c0591411a39a5a827
--- tikv: 8376bb6b8458d3ba0cc179f9158fb0e931abbb26
+++ tikv: cb5897f6e75fd6b3077cfe02fa718cfea6edef62
pd: b3612833fde586a77bb520dbe1eeb7caad7d64ea
================================================================================
oltp_update_index:
    * QPS: 8833.66 ± 0.31% (std=18.13) delta: 0.79% (p=0.049)
    * Latency p50: 114.59 ± 0.27% (std=0.20) delta: -1.80%
    * Latency p99: 225.37 ± 0.90% (std=2.03) delta: -0.89%
            
oltp_insert:
    * QPS: 12102.43 ± 0.82% (std=65.57) delta: 0.50% (p=0.215)
    * Latency p50: 84.59 ± 0.83% (std=0.46) delta: -0.49%
    * Latency p99: 141.12 ± 0.90% (std=1.27) delta: -0.89%
            
oltp_read_write:
    * QPS: 15782.61 ± 0.25% (std=26.22) delta: 0.35% (p=0.127)
    * Latency p50: 1317.28 ± 0.30% (std=2.47) delta: -0.43%
    * Latency p99: 2493.86 ± 0.00% (std=0.00) delta: -1.78%
            
oltp_point_select:
    * QPS: 52041.12 ± 0.07% (std=27.79) delta: 0.67% (p=0.025)
    * Latency p50: 19.69 ± 0.37% (std=0.04) delta: -0.55%
    * Latency p99: 52.89 ± 0.00% (std=0.00) delta: 0.00%
            
oltp_update_non_index:
    * QPS: 14284.31 ± 0.68% (std=66.37) delta: 0.54% (p=0.136)
    * Latency p50: 71.66 ± 0.68% (std=0.33) delta: -0.53%
    * Latency p99: 155.80 ± 0.00% (std=0.00) delta: -1.78%
            
@Little-Wallace Little-Wallace force-pushed the Little-Wallace:reserve_space branch from 56e5131 to cb5897f Jan 3, 2020
@Little-Wallace

This comment has been minimized.

Copy link
Contributor Author

Little-Wallace commented Jan 3, 2020

/test

@Little-Wallace

This comment has been minimized.

Copy link
Contributor Author

Little-Wallace commented Jan 6, 2020

@breeswish Any comment?

@Little-Wallace Little-Wallace added this to the v4.0.0-beta milestone Jan 10, 2020
Copy link
Contributor

5kbpers left a comment

LGTM

@sre-bot

This comment has been minimized.

Copy link
Collaborator

sre-bot commented Jan 10, 2020

/run-all-tests

@brson

This comment has been minimized.

Copy link
Contributor

brson commented Jan 10, 2020

Thanks for the updates @Little-Wallace. I don't think my review has been sufficiently addressed. The use of this features is now described in this PR, but the code is still inscrutable. The code is what should be documented.

I now see that the function doesn't serve two purposes as I guessed. It is not meant to automatically free space at all - the entire feature requires user intervention, which is not so user-friendly. I would expect this to "just work" - run out of space -> delete file -> do operation -> create file again. But this might be reasonable too, or at least a reasonable first step. I don't know.

Since this is a new feature, the OP should seemingly indicate that docs should be updated, and docs should be updated.

I'm removing my request for changes, but I think the code should at least be documented. As-is I expect future readers to be confused.

@brson brson dismissed their stale review Jan 10, 2020

per comments

@Little-Wallace

This comment has been minimized.

Copy link
Contributor Author

Little-Wallace commented Jan 10, 2020

/merge

@sre-bot

This comment has been minimized.

Copy link
Collaborator

sre-bot commented Jan 10, 2020

Your auto merge job has been accepted, waiting for 6419

@sre-bot

This comment has been minimized.

Copy link
Collaborator

sre-bot commented Jan 10, 2020

/run-all-tests

@sre-bot

This comment has been minimized.

Copy link
Collaborator

sre-bot commented Jan 10, 2020

@Little-Wallace merge failed.

@Little-Wallace

This comment has been minimized.

Copy link
Contributor Author

Little-Wallace commented Jan 10, 2020

/test

@Little-Wallace

This comment has been minimized.

Copy link
Contributor Author

Little-Wallace commented Jan 10, 2020

/merge

@sre-bot

This comment has been minimized.

Copy link
Collaborator

sre-bot commented Jan 10, 2020

/run-all-tests

@sre-bot sre-bot merged commit 0f37255 into tikv:master Jan 10, 2020
6 checks passed
6 checks passed
DCO All commits are signed off!
Details
idc-jenkins-ci-tikv/integration-common-test Jenkins job succeeded.
Details
idc-jenkins-ci-tikv/integration-compatibility-test Jenkins job succeeded.
Details
idc-jenkins-ci-tikv/integration-copr-test Jenkins job succeeded.
Details
idc-jenkins-ci-tikv/integration-ddl-test Jenkins job succeeded.
Details
idc-jenkins-ci/test Jenkins job succeeded.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.