-
Notifications
You must be signed in to change notification settings - Fork 3.8k
CASSANDRA-17792: Fix race condition on updating cdc size and advancing to next segment #1770
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to compare to DBD.getCDCTotalSpace - the size of a CL Segment. A hypothetical example of an edge case w/this logic here:
- CommitLogSegment size 128mb
- getCDCTotalSpace = 3.95G out of 4.0G
- we see that getCDCTotalSpace is < max, so we flip the bit
- Then we overflow by
CommitLogSegment size - (4.0 - 3.95)
OR - we just leave it as is because this is likely not a big problem and maybe document it. 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CDCSizeTracker sums up the size of all commit logs, including CDC and non-CDC files, since it creates hard links for all newly created files. (A side question is.. maybe we want to created the link unless the state is FORBIDDEN?). So the size is often an over-estimate when the workload contains both CDC and non-CDC traffic.
Back to the scenario in allocate... if the state was previously FORBIDDEN, it means the total size has exceeded the limit at the time of creation. Now, the total size drops below the limit. The CDCSizeTracker must have run at least once. In this case, the size of the file (allocatingFrom) is already counted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CDCSizeTracker must have run at least once. In this case, the size of the file (allocatingFrom) is already counted. <- I didn't connect that. You're right.
maybe we want to created the link unless the state is FORBIDDEN?
🤔 We shouldn't hard link the file if the cdcState in there is FORBIDDEN should we?
// Hard link file in cdc folder for realtime tracking
FileUtils.createHardLink(segment.logFile, segment.getCDCFile());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. If the the state is FORBIDDEN, we do not create a link. The link is created only when the state is NOT FORBIDDEN (i.e. PERMITTED/CONTAINS).
The current implementation is that a hardlink is created for all new segments, regardless of the state. Since each FORBIDDEN segment has a change to revert back to PERMITTED, this handling is easy but at the cost of slightly over-counting (the size of one file).
I think, and as you mentioned, it likely not a big problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused. Here's what I'm seeing in the code:
public CommitLogSegment createSegment()
{
CommitLogSegment segment = CommitLogSegment.createSegment(commitLog, this);
// Hard link file in cdc folder for realtime tracking
FileUtils.createHardLink(segment.logFile, segment.getCDCFile());
cdcSizeTracker.processNewSegment(segment);
return segment;
}
I think the code would need to read something like this to prevent hard-linking files with cdc disabled:
public CommitLogSegment createSegment()
{
CommitLogSegment segment = CommitLogSegment.createSegment(commitLog, this);
if (segment.getCDCState() != CDCState.FORBIDDEN)
{
// Hard link file in cdc folder for realtime tracking
FileUtils.createHardLink(segment.logFile, segment.getCDCFile());
}
cdcSizeTracker.processNewSegment(segment);
return segment;
}
Further, we'd need to add logic to hard link the file on transition from FORBIDDEN to PERMITTED if that happened later in its lifecycle.
That make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense. The if statement is what I proposed in the earlier comment. The ease of handling in the current implementation is what I guessed by reading the code.
src/java/org/apache/cassandra/db/commitlog/CommitLogSegmentManagerCDC.java
Outdated
Show resolved
Hide resolved
src/java/org/apache/cassandra/db/commitlog/CommitLogSegmentManagerCDC.java
Outdated
Show resolved
Hide resolved
src/java/org/apache/cassandra/db/commitlog/CommitLogSegmentManagerCDC.java
Outdated
Show resolved
Hide resolved
test/unit/org/apache/cassandra/db/commitlog/CommitLogSegmentManagerCDCTest.java
Outdated
Show resolved
Hide resolved
50dcc26 to
afcc58c
Compare
jmckenzie-dev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 after the changes.
f3028bf to
bbec627
Compare
…#1770) If the index fails to build, mark the index non-queryable, but don't fail the C* flush. This way the node can continue to run the other queries.
…#1770) (apache#1970) If the index fails to build, mark the index non-queryable, but don't fail the C* flush. This way the node can continue to run the other queries. --------- Co-authored-by: Piotr Kołaczkowski <pkolaczk@datastax.com>
No description provided.