Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protect existing files in FaultInjectionTest{Env,FS}::ReopenWritableFile() #8995

Closed
wants to merge 12 commits into from

Conversation

ajkr
Copy link
Contributor

@ajkr ajkr commented Oct 8, 2021

FaultInjectionTest{Env,FS}::ReopenWritableFile() functions were accidentally deleting WALs from previous db_stress runs causing verification to fail. They were operating under the assumption that ReopenWritableFile() would delete any existing file. It was a reasonable assumption considering the {Env,FileSystem}::ReopenWritableFile() documentation stated that would happen. The only problem was neither the implementations we offer nor the "real" clients in RocksDB code followed that contract. So, this PR updates the contract as well as fixing the fault injection client usage.

The fault injection change exposed that ExternalSSTFileBasicTest.SyncFailure was relying on a fault injection Env dropping unsynced data written by a regular Env. I changed that test to make its SstFileWriter use fault injection Env, and also implemented LinkFile() in fault injection so the unsynced data is tracked under the new name.

Test Plan:

  • Verified it fixes the following failure:
$ ./db_stress --clear_column_family_one_in=0 --column_families=1 --db=/dev/shm/rocksdb_crashtest_whitebox --delpercent=5 --expected_values_dir=/dev/shm/rocksdb_crashtest_expected --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=100000 --max_key_len=3 --nooverwritepercent=1 --ops_per_thread=1000 --prefixpercent=0 --readpercent=60 --reopen=0 --target_file_size_base=1048576 --test_batches_snapshots=0 --write_buffer_size=1048576 --writepercent=35 --value_size_mult=33 -threads=1
...
$ ./db_stress --avoid_flush_during_recovery=1 --clear_column_family_one_in=0 --column_families=1 --db=/dev/shm/rocksdb_crashtest_whitebox --delpercent=5 --destroy_db_initially=0 --expected_values_dir=/dev/shm/rocksdb_crashtest_expected --iterpercent=10 --key_len_percent_dist=1,30,69 --max_bytes_for_level_base=4194304 --max_key=100000 --max_key_len=3 --nooverwritepercent=1 --open_files=-1 --open_metadata_write_fault_one_in=8 --open_write_fault_one_in=16 --ops_per_thread=1000 --prefix_size=-1 --prefixpercent=0 --readpercent=50 --sync=1 --target_file_size_base=1048576 --test_batches_snapshots=0 --write_buffer_size=1048576 --writepercent=35 --value_size_mult=33 -threads=1
...
Verification failed for column family 0 key 000000000000001300000000000000857878787878 (1143): Value not found: NotFound:
Crash-recovery verification failed :(
...
  • make check -j48

@facebook-github-bot
Copy link
Contributor

@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@ajkr has updated the pull request. You must reimport the pull request before landing.

@ajkr ajkr changed the title Protect existing WALs in FaultInjectionTest{Env,FS}::ReopenWritableFile() Protect existing files in FaultInjectionTest{Env,FS}::ReopenWritableFile() Oct 10, 2021
@facebook-github-bot
Copy link
Contributor

@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ajkr ajkr requested a review from ltamasi October 11, 2021 17:44
Copy link
Contributor

@ltamasi ltamasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the fixes @ajkr ! LGTM, I just have a couple of questions

Comment on lines +81 to +83
# Setting `nooverwritepercent > 0` is only possible because we do not vary
# the random seed, so the same keys are chosen by every run for disallowing
# overwrites.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice to mention this in the db_stress help as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, not noticing this took me on a long detour.

} else if (!exists) {
// It was created by this `Env` just now.
should_track = true;
open_files_.insert(fname);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we would want to add the file to open_files_ even if should_track is false? (The same question applies to FaultInjectionTestFS.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, when we don't use TestWritableFile wrapper, open_files_ won't be maintained:

open_files_.erase(state.filename_);

From the perspective of this Env it doesn't need to know an unmanaged file is open, I guess? Maybe we should rename it open_managed_files_.

Comment on lines 712 to 723
// We preserve contents of overwritten files up to a size threshold.
// We could keep previous file in another name, but we need to worry about
// garbage collect the those files. We do it if it is needed later.
// We ignore I/O errors here for simplicity.
std::string previous_contents = kNewFileNoOverwrite;
if (target()->FileExists(t, IOOptions(), nullptr).ok()) {
uint64_t file_size;
if (target()->GetFileSize(t, IOOptions(), &file_size, nullptr).ok() &&
file_size < 1024) {
ReadFileToString(target(), t, &previous_contents).PermitUncheckedError();
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, do we force-create the link when there is already a file with the new name? (If my understanding is correct, that's the scenario when we would potentially need the previous contents.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, seems our LinkFile() doesn't actually overwrite. I will just use kNewFileNoOverwrite instead. We could also carry over the value of dir_to_new_files_since_last_sync_[sdn.first][sdn.second] but I find that hard to think about.

Copy link
Contributor Author

@ajkr ajkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review!

Comment on lines 712 to 723
// We preserve contents of overwritten files up to a size threshold.
// We could keep previous file in another name, but we need to worry about
// garbage collect the those files. We do it if it is needed later.
// We ignore I/O errors here for simplicity.
std::string previous_contents = kNewFileNoOverwrite;
if (target()->FileExists(t, IOOptions(), nullptr).ok()) {
uint64_t file_size;
if (target()->GetFileSize(t, IOOptions(), &file_size, nullptr).ok() &&
file_size < 1024) {
ReadFileToString(target(), t, &previous_contents).PermitUncheckedError();
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, seems our LinkFile() doesn't actually overwrite. I will just use kNewFileNoOverwrite instead. We could also carry over the value of dir_to_new_files_since_last_sync_[sdn.first][sdn.second] but I find that hard to think about.

Comment on lines +81 to +83
# Setting `nooverwritepercent > 0` is only possible because we do not vary
# the random seed, so the same keys are chosen by every run for disallowing
# overwrites.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, not noticing this took me on a long detour.

} else if (!exists) {
// It was created by this `Env` just now.
should_track = true;
open_files_.insert(fname);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, when we don't use TestWritableFile wrapper, open_files_ won't be maintained:

open_files_.erase(state.filename_);

From the perspective of this Env it doesn't need to know an unmanaged file is open, I guess? Maybe we should rename it open_managed_files_.

@facebook-github-bot
Copy link
Contributor

@ajkr has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ajkr added a commit to ajkr/rocksdb that referenced this pull request Oct 11, 2021
…File()` (facebook#8995)

Summary:
`FaultInjectionTest{Env,FS}::ReopenWritableFile()` functions were accidentally deleting WALs from previous `db_stress` runs causing verification to fail. They were operating under the assumption that `ReopenWritableFile()` would delete any existing file. It was a reasonable assumption considering the `{Env,FileSystem}::ReopenWritableFile()` documentation stated that would happen. The only problem was neither the implementations we offer nor the "real" clients in RocksDB code followed that contract. So, this PR updates the contract as well as fixing the fault injection client usage.

The fault injection change exposed that `ExternalSSTFileBasicTest.SyncFailure` was relying on a fault injection `Env` dropping unsynced data written by a regular `Env`. I changed that test to make its `SstFileWriter` use fault injection `Env`, and also implemented `LinkFile()` in fault injection so the unsynced data is tracked under the new name.

Pull Request resolved: facebook#8995

Test Plan:
- Verified it fixes the following failure:

```
$ ./db_stress --clear_column_family_one_in=0 --column_families=1 --db=/dev/shm/rocksdb_crashtest_whitebox --delpercent=5 --expected_values_dir=/dev/shm/rocksdb_crashtest_expected --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=100000 --max_key_len=3 --nooverwritepercent=1 --ops_per_thread=1000 --prefixpercent=0 --readpercent=60 --reopen=0 --target_file_size_base=1048576 --test_batches_snapshots=0 --write_buffer_size=1048576 --writepercent=35 --value_size_mult=33 -threads=1
...
$ ./db_stress --avoid_flush_during_recovery=1 --clear_column_family_one_in=0 --column_families=1 --db=/dev/shm/rocksdb_crashtest_whitebox --delpercent=5 --destroy_db_initially=0 --expected_values_dir=/dev/shm/rocksdb_crashtest_expected --iterpercent=10 --key_len_percent_dist=1,30,69 --max_bytes_for_level_base=4194304 --max_key=100000 --max_key_len=3 --nooverwritepercent=1 --open_files=-1 --open_metadata_write_fault_one_in=8 --open_write_fault_one_in=16 --ops_per_thread=1000 --prefix_size=-1 --prefixpercent=0 --readpercent=50 --sync=1 --target_file_size_base=1048576 --test_batches_snapshots=0 --write_buffer_size=1048576 --writepercent=35 --value_size_mult=33 -threads=1
...
Verification failed for column family 0 key 000000000000001300000000000000857878787878 (1143): Value not found: NotFound:
Crash-recovery verification failed :(
...
```

- `make check -j48`

Reviewed By: ltamasi

Differential Revision: D31495388

Pulled By: ajkr

fbshipit-source-id: 7886ccb6a07cb8b78ad7b6c1c341ccf40bb68385
yoori pushed a commit to yoori/rocksdb that referenced this pull request Nov 26, 2023
…File()` (#8995)

Summary:
`FaultInjectionTest{Env,FS}::ReopenWritableFile()` functions were accidentally deleting WALs from previous `db_stress` runs causing verification to fail. They were operating under the assumption that `ReopenWritableFile()` would delete any existing file. It was a reasonable assumption considering the `{Env,FileSystem}::ReopenWritableFile()` documentation stated that would happen. The only problem was neither the implementations we offer nor the "real" clients in RocksDB code followed that contract. So, this PR updates the contract as well as fixing the fault injection client usage.

The fault injection change exposed that `ExternalSSTFileBasicTest.SyncFailure` was relying on a fault injection `Env` dropping unsynced data written by a regular `Env`. I changed that test to make its `SstFileWriter` use fault injection `Env`, and also implemented `LinkFile()` in fault injection so the unsynced data is tracked under the new name.

Pull Request resolved: facebook/rocksdb#8995

Test Plan:
- Verified it fixes the following failure:

```
$ ./db_stress --clear_column_family_one_in=0 --column_families=1 --db=/dev/shm/rocksdb_crashtest_whitebox --delpercent=5 --expected_values_dir=/dev/shm/rocksdb_crashtest_expected --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=100000 --max_key_len=3 --nooverwritepercent=1 --ops_per_thread=1000 --prefixpercent=0 --readpercent=60 --reopen=0 --target_file_size_base=1048576 --test_batches_snapshots=0 --write_buffer_size=1048576 --writepercent=35 --value_size_mult=33 -threads=1
...
$ ./db_stress --avoid_flush_during_recovery=1 --clear_column_family_one_in=0 --column_families=1 --db=/dev/shm/rocksdb_crashtest_whitebox --delpercent=5 --destroy_db_initially=0 --expected_values_dir=/dev/shm/rocksdb_crashtest_expected --iterpercent=10 --key_len_percent_dist=1,30,69 --max_bytes_for_level_base=4194304 --max_key=100000 --max_key_len=3 --nooverwritepercent=1 --open_files=-1 --open_metadata_write_fault_one_in=8 --open_write_fault_one_in=16 --ops_per_thread=1000 --prefix_size=-1 --prefixpercent=0 --readpercent=50 --sync=1 --target_file_size_base=1048576 --test_batches_snapshots=0 --write_buffer_size=1048576 --writepercent=35 --value_size_mult=33 -threads=1
...
Verification failed for column family 0 key 000000000000001300000000000000857878787878 (1143): Value not found: NotFound:
Crash-recovery verification failed :(
...
```

- `make check -j48`

Reviewed By: ltamasi

Differential Revision: D31495388

Pulled By: ajkr

fbshipit-source-id: 7886ccb6a07cb8b78ad7b6c1c341ccf40bb68385
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants