Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-2380. Use the Table.isExist() API instead of get() API while checking for presence of key. #102

Closed
wants to merge 1 commit into from

Conversation

avijayanhwx
Copy link
Contributor

@avijayanhwx avijayanhwx commented Oct 29, 2019

What changes were proposed in this pull request?

Currently, when OM creates a file/directory, it checks the absence of all prefix paths of the key in its RocksDB. Since we don't care about the deserialization of the actual value from the DB, we should use the isExist API added in org.apache.hadoop.hdds.utils.db.Table which internally uses the more performant keyMayExist API of RocksDB. More documentation on the keyMayExist API can be found here This API was already used in the old OM create file flow. This patch adds it to the new OM code flow that uses Double buffering.
There is a similar check in Recon container DB. Refactored that as well.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-2380

How was this patch tested?

Manually tested.

…r pre-existing files in the directory path.
Copy link
Contributor

@bharatviswa504 bharatviswa504 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. One minor comment.
And also can you confirm if UT failure is related, as I don't see UT's failing in other CI runs.

@avijayanhwx avijayanhwx changed the title HDDS-2380. OMFileRequest should use the isExist API while checking fo… HDDS-2380. Use the Table.isExist API instead of get() call while checking for presence of key. Oct 29, 2019
@avijayanhwx avijayanhwx changed the title HDDS-2380. Use the Table.isExist API instead of get() call while checking for presence of key. HDDS-2380. Use the Table.isExist() API instead of get() API while checking for presence of key. Oct 29, 2019
@avijayanhwx
Copy link
Contributor Author

Overall LGTM. One minor comment.
And also can you confirm if UT failure is related, as I don't see UT's failing in other CI runs.

Changed the title of the JIRA instead :)

@bharatviswa504
Copy link
Contributor

Ready to go, if UT failures are not related.

@bharatviswa504
Copy link
Contributor

/retest

@@ -60,15 +60,15 @@ public static OMDirectoryResult verifyFilesInPath(
String dbDirKeyName = omMetadataManager.getOzoneDirKey(volumeName,
bucketName, pathName);

if (omMetadataManager.getKeyTable().get(dbKeyName) != null) {
if (omMetadataManager.getKeyTable().isExist(dbKeyName)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch looks correct, but I have a question about the earlier patch that went in to support this.

The documentation of keyMayExist -- says this.

public boolean keyMayExist(ColumnFamilyHandle columnFamilyHandle,
byte[] key,
java.lang.StringBuilder value)
If the key definitely does not exist in the database, then this method returns false, else true. This check is potentially lighter-weight than invoking DB::Get(). One way to make this lighter weight is to avoid doing any IOs.
Parameters:
columnFamilyHandle - ColumnFamilyHandle instance
key - byte array of a key to search for
value - StringBuilder instance which is a out parameter if a value is found in block-cache.

There is a value parameter in the call, which is documented as -- out parameter if the value is found in the block cache.

The objective of calling into this function, I suppose to leverage that -- Mind you the keyMayExist is not bloom filter lookup, it says If the value is in memory already you get access to it.

However, if you read the code -- it looks like we ignore the possiblity of gettting the value back from the block cache.

Here is the code that was committed via
"HDDS-1691 : RDBTable#isExist should use Rocksdb#keyMayExist"
// RocksDB#keyMayExist // If the key definitely does not exist in the database, then this // method returns false, else true. return db.keyMayExist(handle, key, new StringBuilder()) && db.get(handle, key) != null;

So how does this call eliminate the IO cost, since the new StringBuilder is completely ignored ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @anuengineer. This is a valid observation. I will fix the working of the isExist method in #101 where we are modifying the implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thanks @anuengineer for flagging it. Yes in true case if we get value from block cache we can use that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect, let me go ahead and commit this patch, since you will fix that issue in the #101 pull request.

@anuengineer
Copy link
Contributor

I have committed this patch to the master branch. @bharatviswa504 Thanks for the review. @avijayanhwx Thank you for the contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants