Skip to content

Conversation

@yifan-c
Copy link
Contributor

@yifan-c yifan-c commented Aug 5, 2022

No description provided.

@jmckenzie-dev jmckenzie-dev self-requested a review August 5, 2022 17:53
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to compare to DBD.getCDCTotalSpace - the size of a CL Segment. A hypothetical example of an edge case w/this logic here:

  1. CommitLogSegment size 128mb
  2. getCDCTotalSpace = 3.95G out of 4.0G
  3. we see that getCDCTotalSpace is < max, so we flip the bit
  4. Then we overflow by CommitLogSegment size - (4.0 - 3.95)

OR - we just leave it as is because this is likely not a big problem and maybe document it. 🤷

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CDCSizeTracker sums up the size of all commit logs, including CDC and non-CDC files, since it creates hard links for all newly created files. (A side question is.. maybe we want to created the link unless the state is FORBIDDEN?). So the size is often an over-estimate when the workload contains both CDC and non-CDC traffic.
Back to the scenario in allocate... if the state was previously FORBIDDEN, it means the total size has exceeded the limit at the time of creation. Now, the total size drops below the limit. The CDCSizeTracker must have run at least once. In this case, the size of the file (allocatingFrom) is already counted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CDCSizeTracker must have run at least once. In this case, the size of the file (allocatingFrom) is already counted. <- I didn't connect that. You're right.

maybe we want to created the link unless the state is FORBIDDEN?
🤔 We shouldn't hard link the file if the cdcState in there is FORBIDDEN should we?

// Hard link file in cdc folder for realtime tracking
FileUtils.createHardLink(segment.logFile, segment.getCDCFile());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. If the the state is FORBIDDEN, we do not create a link. The link is created only when the state is NOT FORBIDDEN (i.e. PERMITTED/CONTAINS).
The current implementation is that a hardlink is created for all new segments, regardless of the state. Since each FORBIDDEN segment has a change to revert back to PERMITTED, this handling is easy but at the cost of slightly over-counting (the size of one file).
I think, and as you mentioned, it likely not a big problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. Here's what I'm seeing in the code:

    public CommitLogSegment createSegment()
    {
        CommitLogSegment segment = CommitLogSegment.createSegment(commitLog, this);

        // Hard link file in cdc folder for realtime tracking
        FileUtils.createHardLink(segment.logFile, segment.getCDCFile());

        cdcSizeTracker.processNewSegment(segment);
        return segment;
    }

I think the code would need to read something like this to prevent hard-linking files with cdc disabled:

    public CommitLogSegment createSegment()
    {
        CommitLogSegment segment = CommitLogSegment.createSegment(commitLog, this);

        if (segment.getCDCState() != CDCState.FORBIDDEN)
        {
            // Hard link file in cdc folder for realtime tracking
            FileUtils.createHardLink(segment.logFile, segment.getCDCFile());
        }

        cdcSizeTracker.processNewSegment(segment);
        return segment;
    }

Further, we'd need to add logic to hard link the file on transition from FORBIDDEN to PERMITTED if that happened later in its lifecycle.

That make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense. The if statement is what I proposed in the earlier comment. The ease of handling in the current implementation is what I guessed by reading the code.

@yifan-c yifan-c force-pushed the CASSANDRA-17792/trunk branch from 50dcc26 to afcc58c Compare August 5, 2022 18:25
Copy link
Contributor

@jmckenzie-dev jmckenzie-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 after the changes.

@yifan-c yifan-c force-pushed the CASSANDRA-17792/trunk branch from f3028bf to bbec627 Compare August 11, 2022 19:29
@yifan-c yifan-c closed this Aug 11, 2022
michaeljmarshall pushed a commit to michaeljmarshall/cassandra that referenced this pull request Sep 17, 2025
…#1770)

If the index fails to build, mark the index non-queryable,
but don't fail the C* flush. This way the node can continue
to run the other queries.
michaelsembwever pushed a commit to thelastpickle/cassandra that referenced this pull request Jan 7, 2026
…#1770) (apache#1970)

If the index fails to build, mark the index non-queryable, but don't
fail the C* flush. This way the node can continue to run the other
queries.

---------

Co-authored-by: Piotr Kołaczkowski <pkolaczk@datastax.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants