-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-23017 Verify the file integrity in persistent IOEngine #626
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
5ca09dc
to
b797c16
Compare
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
@@ -1053,6 +1065,12 @@ private void persistToFile() throws IOException { | |||
} | |||
try (FileOutputStream fos = new FileOutputStream(persistencePath, false)) { | |||
fos.write(ProtobufMagic.PB_MAGIC); | |||
byte[] checksum = ((PersistentIOEngine) ioEngine).calculateChecksum(algorithm); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not correct.
In master we already have the PB_MAGIC and we are writing the meta info into the persisting file in protobuf way. We can change the PB proto message to include optional checksum also. If the checksum is there, write those bytes. Marking it as optional so retrieving it back as bytes and check whether it was there or not, all these work PB will help us. We dont have to have any extra logic.
@@ -1081,8 +1101,47 @@ private void retrieveFromFile(int[] bucketSizes) throws IOException { | |||
throw new IOException("Persistence file does not start with protobuf magic number. " + | |||
persistencePath); | |||
} | |||
byte[] pbuf2 = new byte[pblen]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these logic will go away. From the PB message we can get.. So getting back the checksum should be included as part of parsePB()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has been modified to use PB persistent checksum. But I have not designed how to maintain backward compatibility.
💔 -1 overall
This message was automatically generated. |
required string map_class = 3; | ||
map<int32, string> deserializers = 4; | ||
required BackingMap backing_map = 5; | ||
optional bytes checksum = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put checksum bytes as the last entry. Do not change the existing order.
@@ -1067,7 +1081,9 @@ private void retrieveFromFile(int[] bucketSizes) throws IOException { | |||
} | |||
assert !cacheEnabled; | |||
|
|||
try (FileInputStream in = deleteFileOnClose(persistenceFile)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
* persistent file integrity, default algorithm is MD5 | ||
* */ | ||
private String algorithm; | ||
private byte[] checksum; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why to keep this as state in BC?
@@ -1053,6 +1066,7 @@ private void persistToFile() throws IOException { | |||
} | |||
try (FileOutputStream fos = new FileOutputStream(persistencePath, false)) { | |||
fos.write(ProtobufMagic.PB_MAGIC); | |||
checksum = ((PersistentIOEngine) ioEngine).calculateChecksum(algorithm); | |||
BucketProtoUtils.toPB(this).writeDelimitedTo(fos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The checksum can be passed as an argument. Keeping checksum as a state var in BC does not look correct.
@@ -1131,6 +1155,8 @@ private void verifyCapacityAndClasses(long capacitySize, String ioclass, String | |||
} | |||
|
|||
private void parsePB(BucketCacheProtos.BucketCacheEntry proto) throws IOException { | |||
((PersistentIOEngine) ioEngine).verifyFileIntegrity(proto.getChecksum().toByteArray(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For compatibility, u need to check whether the checksum is there or not. Use proto.hasChecksum() before calling the getter.
@@ -1525,6 +1551,10 @@ float getMemoryFactor() { | |||
return memoryFactor; | |||
} | |||
|
|||
public byte[] getChecksum() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this public getter needed then?
private static final DuFileCommand DU = new DuFileCommand(new String[] {"du", ""}); | ||
protected final String[] filePaths; | ||
|
||
public PersistentIOEngine(String[] filePaths) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the arg not using String... ? U can just call super(filePath); from FileMmapIOEngine ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because FileIOEngine can use multiple files as cache file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont mean used String type. Strings only but as in FileIOE.. String... filePaths
If so in FileMmapIOEngine we can just call super(filePath)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see.
* Verify cache files's integrity | ||
* @param algorithm the backingMap persistence path | ||
*/ | ||
protected void verifyFileIntegrity(byte[] persistentChecksum, String algorithm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether this in line with the branch-1 patch now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In branch-1 the method verifyFileIntegrity() return boolean type to indicate whether the verification is successful. Here we throw IOE if verification failed. Because the method parsePB() use this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine. The IOE is handled in BC.
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
algorithm); | ||
} else { | ||
// if has not checksum, it means the persistence file is old format | ||
LOG.warn("Persistence file is old format, it does not support verifying file integrity!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be a Warn? It can be just Info right?
@@ -1235,6 +1253,10 @@ public long getCurrentSize() { | |||
return this.bucketAllocator.getUsedSize(); | |||
} | |||
|
|||
public String getAlgorithm() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed now?
.putAllDeserializers(CacheableDeserializerIdManager.save()) | ||
.setBackingMap(BucketProtoUtils.toPB(cache.backingMap)) | ||
.setChecksum(ByteString.copyFrom(((PersistentIOEngine) cache.ioEngine). | ||
calculateChecksum(cache.getAlgorithm()))).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok used here. Fine. It could be have been just package protected only not public?
private static final DuFileCommand DU = new DuFileCommand(new String[] {"du", ""}); | ||
protected final String[] filePaths; | ||
|
||
public PersistentIOEngine(String[] filePaths) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont mean used String type. Strings only but as in FileIOE.. String... filePaths
If so in FileMmapIOEngine we can just call super(filePath)?
* Verify cache files's integrity | ||
* @param algorithm the backingMap persistence path | ||
*/ | ||
protected void verifyFileIntegrity(byte[] persistentChecksum, String algorithm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine. The IOE is handled in BC.
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@Reidddddd Can u also review once and if looks good, commit?
algorithm); | ||
} else { | ||
// if has not checksum, it means the persistence file is old format | ||
LOG.info("Persistence file is old format, it does not support verifying file integrity!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Persistent? Adjective instead of noun?
.setMapClass(cache.backingMap.getClass().getName()) | ||
.putAllDeserializers(CacheableDeserializerIdManager.save()) | ||
.setBackingMap(BucketProtoUtils.toPB(cache.backingMap)) | ||
.setChecksum(ByteString.copyFrom(((PersistentIOEngine) cache.ioEngine). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure here, since i don't read the code. What if the cache is not PersistentIOEngine? Will here throw cast exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling this method has shown that ioEngine is persistent, and all the persistent ioEngine extend PersistentIOEngine. So it won't throw cast exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, and a question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
💔 -1 overall
This message was automatically generated. |
HBASE-23017