Skip to content

Conversation

@zhichao-cao
Copy link
Contributor

Add new AppendWithVerify and PositionedAppendWithVerify APIs to Env and FileSystem to bring the data verification information (data checksum information) from upper layer (e.g., WritableFileWriter) to the storage layer. In this way, the customized Env or FileSystem is able to verify the correctness of data being written to the storage on time. Add checksum_handoff_file_types to DBOptions. User can use this option to control which file types (e.g., SST, WAL, Manifest) should use the new AppendWithVerify and PositionedAppendWithVerfiy APIs to handoff the verification information.

test plan: make asan_check/ make check

Copy link
Contributor

@mrambacher mrambacher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of the implementations of the new APIs default to using the old API, which means 1) the requested new functionality is not there and is silently ignored and 2) the new API might not really be needed.

I do not understand exactly what the change is hoping to accomplish and do not see any tests that validate it is doing the right thing, or that it is handling any of the error cases.

I am unsure where the resulting checksum is used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add the new API rather than expand the IOOptions to have a DataVerificationInfo element

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IOOptions suppose to bring the hints down to the FS, might not be a appropriate place to bring the verification information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo "storage"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you adding an option to the DBOptions that is not settable as an option? You should be able to set this using the options code.

db/builder.cc Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a method somewhere? There is almost the same code block in many locations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is bad as it silently drops the fact that the user requested data validation. Why can't this be done at a higher level during options validation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the checksum method name part of the Env? Shouldn't it be coming from the Options?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use FileType? Wouldn't this end up there anyway (but with better-but-different names)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option doesn't make sense to me as written. Like Mark pointed out elsewhere, this makes it easy for the user not to get handoff checksums as expected because there are a minimum of two things to turn on to get it. Why?

At first I thought this didn't make sense at all as a DB option, but it seems plausible that DBs would have different integrity requirements for different file types. For example, some applications might only rely on WAL for fast recovery, but have an alternate method of recovery in case of corrupt WAL.

But shouldn't we have a sensible set of defaults and let Env decide whether to enable checksum handoff? If a DB option is always needed to enable it, that seems like a bad (distant, unnecessary) and dangerous (not getting what you think you're getting) abstraction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pdillinger My consideration is, FileSystem gives the capability to support checksum handoff. So I plan to introduce a API to FileSystem to return the ChecksumHandoffFuncName (or similar). If FileSystem does not support it, or does not want to enable checksum handoff in its layer, ChecksumHandoffFuncName will return an empty string. Also, if the checksum function is different from the one in the RocksDB, we do not need to even calculate the checksum and send it. So, that part is like the mechanism.

On the other side, I think we should give user/applications the option to decide, even FileSystem support it, but user/application do not want to use it, or partial use it. So I introduce the checksum_handoff_file_types to achieve the control. Otherwise, how can we deliver such flexibility?

Copy link
Contributor

@pdillinger pdillinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, since nothing but wrappers call *AppendWithVerify, what would be accomplished by picking this into 6.13?

HISTORY.md Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this way, the customized Env or FileSystem is able to verify the correctness of data being written to the storage on time.

Except this doesn't actually work yet, right? So why is this stated in HISTORY.md?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a need to define these new interfaces in WritableFile? Since we want to deprecate the storage part of Env eventually, why not just define them in file_system.h and redirect in LegacyWritableFileWrapper. Users like Zippy can work with the FileSystem directly or call env->GetFileSystem()->AppendWithVerify().

@zhichao-cao
Copy link
Contributor Author

Many of the implementations of the new APIs default to using the old API, which means 1) the requested new functionality is not there and is silently ignored and 2) the new API might not really be needed.

I do not understand exactly what the change is hoping to accomplish and do not see any tests that validate it is doing the right thing, or that it is handling any of the error cases.

I am unsure where the resulting checksum is used.

Hi Mark, for the review. This PR is only the API change of the checksum handoff. The motivation of this project is to pass the data verification information (e.g., checksum for now. Maybe offset in the file and others) together with the data from RocksDB to the storage layer. In this way, a customized env/fs which has the capability to verify the data can use these information to check the correctness of the data and catch the potential corruptions.

This is only the first PR and the code actually does not execute. So I did not include the testing case.

The checksum will be generated in WritableFileWriter for the Slice data and be verified in the Env/FS WritableFile implementation.

@zhichao-cao zhichao-cao changed the title Introduce Checksum handoff APIs and Control Options [Draft] Introduce Checksum handoff APIs and Control Options Sep 21, 2020
IOSTATS_CPU_TIMER_GUARD(cpu_write_nanos, env_);
s = writable_file_->Append(Slice(src, allowed), IOOptions(), nullptr);
if (perform_data_verification_) {
std::string checksum;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reuse it for the whole file, rather than recreating a string for every write?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

void WritableFileWriter::DataChecksumCalculation(const char* data, size_t size,
std::string* checksum) {
uint32_t v_crc32c = crc32c::Extend(0, data, size);
PutFixed32(checksum, EndianSwapValue(v_crc32c));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why endian swap?

s = writable_file_->PositionedAppend(Slice(src, size), write_offset,
IOOptions(), nullptr);
if (perform_data_verification_) {
std::string checksum;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Can we reuse this string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants