Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of Crc32c combine function #8305

Closed
wants to merge 3 commits into from

Conversation

zhichao-cao
Copy link
Contributor

Implement a function to generate the crc32c of two combined strings. Suppose we have the string 1 (s1) with crc32c checksum crc32c_1 and string 2 (s2) with crc32c checksum crc32c_2, the new string is s1+s2 and its checksum is crc32c_new=Crc32cCombine(crc32c_1, crc32c_2, s2.size).

test plan: make check, added new testing case

Copy link
Contributor

@pdillinger pdillinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also going to send over a commented/simplified version of Crc32cCombine (which I helped to adapt to our crc32c implementation)

@@ -137,6 +137,22 @@ TEST(CRC, Mask) {
ASSERT_EQ(crc, Unmask(Unmask(Mask(Mask(crc)))));
}

TEST(CRC, Crc32cCombineTest1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add more tests. We should probably cover all sizes of crc2 from 0 to 1000. Perhaps also one big size like 2^24 - 1 (for good coverage of the modular exponentiation).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will add the testing

ASSERT_EQ(crc3, crc1_2_combine);
}

TEST(CRC, Crc32cCombineTest2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps call it Crc32cCombineOrderMatters

@facebook-github-bot
Copy link
Contributor

@zhichao-cao has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

util/crc32c_test.cc Outdated Show resolved Hide resolved
util/crc32c_test.cc Outdated Show resolved Hide resolved
util/crc32c_test.cc Outdated Show resolved Hide resolved
util/crc32c_test.cc Outdated Show resolved Hide resolved
uint32_t crc1 = Value(s1.c_str(), size_1);
for (int i = 0; i < scale; i++) {
int size_2 = i;
std::string s2 = rnd.RandomString(size_2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RandomBinaryString should have better coverage here.

@facebook-github-bot
Copy link
Contributor

@zhichao-cao has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@zhichao-cao has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@pdillinger pdillinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once msvc is happy

util/crc32c.cc Outdated
// clang-format off
return i == 32 ? p : gf_multiply_sw_1(
/* i = */ i + 1,
/* p = */ static_cast<uint32_t>(p ^ (static_cast<int64_t>(-((b >> 31) & 1) & a))),
Copy link
Contributor

@pdillinger pdillinger Jun 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we have to polish this line from folly to make Microsoft happy. I'm not sure what the casts are supposed to be for. It looks like it computes /* p = */ p ^ ((0U - (b >> 31)) & a) which is clearly optimizing to avoid conditional branch

@zhichao-cao
Copy link
Contributor Author

LGTM once msvc is happy

Thanks for the help and review. Will address the windows build failu.

@facebook-github-bot
Copy link
Contributor

@zhichao-cao has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@zhichao-cao has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@zhichao-cao has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@zhichao-cao merged this pull request in ecccc63.

facebook-github-bot pushed a commit that referenced this pull request Jun 25, 2021
…AL (#8412)

Summary:
In PR #7523 , checksum handoff is introduced in RocksDB for WAL, Manifest, and SST files. When user enable checksum handoff for a certain type of file, before the data is written to the lower layer storage system, we calculate the checksum (crc32c) of each piece of data and pass the checksum down with the data, such that data verification can be down by the lower layer storage system if it has the capability. However, it cannot cover the whole lifetime of the data in the memory and also it potentially introduces extra checksum calculation overhead.

In this PR, we introduce a new interface in WritableFileWriter::Append, which allows the caller be able to pass the data and the checksum (crc32c) together. In this way, WritableFileWriter can directly use the pass-in checksum (crc32c) to generate the checksum of data being passed down to the storage system. It saves the calculation overhead and achieves higher protection coverage. When a new checksum is added with the data, we use Crc32cCombine #8305 to combine the existing checksum and the new checksum. To avoid the segmenting of data by rate-limiter before it is stored, rate-limiter is called enough times to accumulate enough credits for a certain write. This design only support Manifest and WAL which use log_writer in the current stage.

Pull Request resolved: #8412

Test Plan: make check, add new testing cases.

Reviewed By: anand1976

Differential Revision: D29151545

Pulled By: zhichao-cao

fbshipit-source-id: 75e2278c5126cfd58393c67b1efd18dcc7a30772
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants