Ensure that content IDs are unique in a Nessie repository #7757
LGTM although there are multiple conflicts because of #7771.
Nessie content IDs are random IDs, but we do not guarantee that they are actually unique. This change adds a new object type to ensure that a generated ID is unique by leveraging existing functionality of the `Persist` framework, which already provides "`INSERT IF NOT EXIST`" guarantees. Content IDs generated from this change onward are verified. This change does not include functionality to automatically register already existing content IDs. IMHO that is probably okay for now, given the practically nonexistent probability of content-ID conflicts.
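As a rough illustration of the insert-if-absent idea described above, here is a minimal in-memory sketch. It does not use the `Persist` API; `UniqueIdRegistry`, `tryClaim`, and `generateUnique` are hypothetical names, and a `ConcurrentHashMap` stands in for the backing store.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical stand-in for a store with insert-if-absent semantics; not the Nessie Persist API. */
final class UniqueIdRegistry {
  // Key is the (space, value) pair; the map plays the role of the backing database.
  private final Map<List<String>, Boolean> claimed = new ConcurrentHashMap<>();

  /** Returns true if the ID was newly claimed, false if it already existed in that space. */
  boolean tryClaim(String space, String value) {
    // putIfAbsent is atomic: only one caller can ever observe null for a given key.
    return claimed.putIfAbsent(List.of(space, value), Boolean.TRUE) == null;
  }

  /** Draws candidates from the generator until one is claimed, giving up after maxAttempts. */
  String generateUnique(String space, Supplier<String> generator, int maxAttempts) {
    for (int i = 0; i < maxAttempts; i++) {
      String candidate = generator.get();
      if (tryClaim(space, candidate)) {
        return candidate;
      }
    }
    throw new IllegalStateException("no unique ID after " + maxAttempts + " attempts");
  }
}
```

With a random generator such as `() -> UUID.randomUUID().toString()`, the retry loop would almost never run more than once. IDs that were created before the registry existed are not covered unless they are claimed explicitly.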
@@ -122,4 +123,10 @@ static ObjId stringDataHash(
hasher.putBytes(text.asReadOnlyByteBuffer());
return hashAsObjId(hasher);
}

static ObjId uniqueIdHash(String space, String value) {
One general remark: wouldn't it be simpler to model the value as an opaque byte array? It seems we could save some conversions to and from strings in the common case where the value is a UUID.
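To make the conversion in question concrete, here is a minimal sketch of a UUID packed into 16 raw bytes, compared with its 36-character string form. `UuidBytes` is a hypothetical helper for illustration only and is not part of this PR.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

/** Hypothetical helper, for illustration only. */
final class UuidBytes {
  /** Packs a UUID into 16 bytes, big-endian, most significant bits first. */
  static byte[] toBytes(UUID id) {
    return ByteBuffer.allocate(16)
        .putLong(id.getMostSignificantBits())
        .putLong(id.getLeastSignificantBits())
        .array();
  }

  /** Rebuilds the UUID from the 16-byte form produced by toBytes. */
  static UUID fromBytes(byte[] bytes) {
    if (bytes.length != 16) {
      throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
    }
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    long most = buf.getLong();
    long least = buf.getLong();
    return new UUID(most, least);
  }
}
```

The byte form needs no formatting or parsing step, whereas `UUID.toString()` and `UUID.fromString(...)` go through the 36-character hex representation.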
import org.projectnessie.versioned.storage.common.persist.Persist;

/**
 * Describes the <em>internal</em> state of a reference when it has been created, managed by {@link
The javadoc seems to refer to `RefObj`.