Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve accuracy of I/O stats collection of external SST ingestion. #3713

Closed
wants to merge 7 commits into from

Conversation

riversand963
Copy link
Contributor

@riversand963 riversand963 commented Apr 12, 2018

RocksDB supports ingestion of external ssts. If ingestion_options.move_files is true, when performing ingestion, RocksDB first tries to link external ssts. If external SST file resides on a different FS, or the underlying FS does not support hard link, then RocksDB performs actual file copy. However, no matter which choice is made, current code increase bytes-written when updating compaction stats, which is inaccurate when RocksDB does NOT copy file.

Rename a sync point.

RocksDB supports ingestion of external ssts. If ingestion_options.move_files is
true, when performing ingestion, RocksDB
first tries to link external ssts. If
external sst file resides on a different FS, or the underlying FS does not
support hard link, then RocksDB performs actual file copy.
However, no matter which choice is made, current code increase bytes-written
when updating compaction stats, which is inaccurate when RocksDB does NOT copy
file.

Also refactor a struct to conform to C++ style.

Rename a sync point.
Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@riversand963 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@riversand963
Copy link
Contributor Author

@ajkr When you have a chance, could you PTAL? Thanks!

@facebook-github-bot
Copy link
Contributor

@riversand963 has updated the pull request. View: changes, changes since last import

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@riversand963 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@ajkr ajkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@@ -103,12 +103,14 @@ Status ExternalSstFileIngestionJob::Prepare(
// Original file is on a different FS, use copy instead of hard linking
status = CopyFile(env_, path_outside_db, path_inside_db, 0,
db_options_.use_fsync);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will the compiler complain if we put f.copy_file=true after this CopyFile and the lower one, and give it no default value? It's slightly clearer and less error-prone to me to initialize it only in the places where we know for certain what its value should be (and that prevents us from accidentally using the default value when code paths change in the future). But that's just my personal preference, will leave it up to you whether to address.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if (f.copy_file) {
stats.bytes_written = f.fd.GetFileSize();
} else {
stats.bytes_moved = f.fd.GetFileSize();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh that's nice we already had bytes_moved for trivial move compactions, good reuse

@facebook-github-bot
Copy link
Contributor

@riversand963 has updated the pull request. View: changes, changes since last import

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@riversand963 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@riversand963 has updated the pull request. View: changes, changes since last import

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@riversand963 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@riversand963 has updated the pull request. View: changes, changes since last import

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@riversand963 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@riversand963 has updated the pull request. View: changes, changes since last import

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@riversand963 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@riversand963 is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@riversand963 riversand963 deleted the t28017432 branch April 13, 2018 18:03
// Whether to copy or link the external sst file. copy_file will be set to
// false if ingestion_options.move_files is true and underlying FS
// supports link operation.
bool copy_file;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ubsan_check is complaining about the missing default value here.

db/external_sst_file_ingestion_job.h:23:8: runtime error: load of value 252, which is not a valid value for type 'bool'

facebook-github-bot pushed a commit that referenced this pull request Apr 16, 2018
Summary:
The reason for this initialization is that LLVM UBSAN check will fail due to
uninitialized bool. [StackOverflow post](https://stackoverflow.com/questions/31420154/runtime-error-load-of-value-127-which-is-not-a-valid-value-for-type-bool).

UBSAN log:
> ===== Running external_sst_file_basic_test
[==========] Running 7 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 7 tests from ExternalSSTFileBasicTest
[ RUN      ] ExternalSSTFileBasicTest.Basic
[       OK ] ExternalSSTFileBasicTest.Basic (6 ms)
[ RUN      ] ExternalSSTFileBasicTest.NoCopy
db/external_sst_file_ingestion_job.h:23:8: runtime error: load of value 253, which is not a valid value for type 'bool'

miasantreble  I've tested this locally using the following command.
```
TEST_TMPDIR=/dev/shm/rocksdb COMPILE_WITH_UBSAN=1 OPT=-g make J=1 -j8 ubsan_check
```

ajkr This PR is related to your review comment in [PR](#3713). It turns out that, with UBSAN enabled, we must provide a default value for boolean member variables.
Closes #3728

Differential Revision: D7642476

Pulled By: riversand963

fbshipit-source-id: 4c09a4b8d271151cb99ae7393db9e4ad9f29762e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants