Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CASSANDRA-18714 CQLSSTableWriter writes SAI index #3029

Merged
merged 1 commit into from
Jan 22, 2024

Conversation

smiklosovic
Copy link
Contributor

Thanks for sending a pull request! Here are some tips if you're new here:

  • Ensure you have added or run the appropriate tests for your PR.
  • Be sure to keep the PR description updated to reflect all changes.
  • Write your PR title to summarize what this PR proposes.
  • If possible, provide a concise example to reproduce the issue for a faster review.
  • Read our contributor guidelines
  • If you're making a documentation change, see our guide to documentation contribution

Commit messages should follow the following format:

<One sentence description, usually Jira title or CHANGES.txt summary>

<Optional lengthier description (context on patch)>

patch by <Authors>; reviewed by <Reviewers> for CASSANDRA-#####

Co-authored-by: Name1 <email1>
Co-authored-by: Name2 <email2>

The Cassandra Jira

@smiklosovic smiklosovic force-pushed the CASSANDRA-18714-5.0 branch 4 times, most recently from a183c86 to 18658f7 Compare January 7, 2024 19:50
@smiklosovic smiklosovic marked this pull request as ready for review January 8, 2024 17:01
@smiklosovic smiklosovic force-pushed the CASSANDRA-18714-5.0 branch 2 times, most recently from 9d56c44 to 0575e71 Compare January 9, 2024 09:39
public List<String> importNewSSTables(String ksName, String cfName, Set<String> srcPaths, boolean resetLevel, boolean clearRepaired, boolean verifySSTables, boolean verifyTokens, boolean invalidateCaches, boolean extendedVerify, boolean copyData)
{
return getCfsProxy(ksName, cfName).importNewSSTables(srcPaths, resetLevel, clearRepaired, verifySSTables, verifyTokens, invalidateCaches, extendedVerify, copyData);
public List<String> importNewSSTables(String ksName, String cfName, Set<String> srcPaths, boolean resetLevel,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created new method instead of modifying the old one to be backward compatible / not break stuff for people using the older version already.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So your comment says you added a new one, but it looks like you modified the old one... I don't know if there's anything other than the Import command that actually uses it, but wanted to check if it was intentional that this comment doesn't match the code - or did you mean you made a new one in ColumnFamilyStore/ColumnFamilyStoreMBean which would also make sense, and which you did... so maybe I just misunderstood this PR comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JeetKunDoug correct, I created new one on ColumnFamilyStore/MBean but I changed the one on NodeProbe. That is just fine, because NodeProbe one is "internal" to Cassandra. Nobody depends on NodeProbe across versions ... Does that make sense?

@@ -80,6 +80,16 @@ public class Import extends NodeToolCmd
description = "Copy data from source directories instead of moving them")
private boolean copyData = false;

@Option(title = "fail_on_missing_index",
name = {"-f", "--fail-on-missing-index"},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am open to different flags here, -f and -i just makes sense to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about -ri/--require-index-component and -siv/--skip-index-validation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to -siv/--skip-index-validation

Still thinking about the other...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine w/ --require-index-component too, although I think it should be --require-index-components. (An SAI index is really a bag of separate components that enable related functionality, but perhaps I'm too in the weeds...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for --skip-index-validation, I do not like this one because if you notice what other flags are, there is prefix --no - --no-verify, --no-tokens, --no-invalidate-caches - there is no --skip.... We might do --no-index-validation, that would work, but not --skip-index-validation.

When it comes to --require-index-component, I do not like the fact that this does not say what happens when they are required and they are not present. To prepend it with --fail is pretty obvious to me. Anyway, one can conclude that if something is required and it is not there, it will probably complain and fail eventually ... yeah, I think this could fly (and the fact it will fail is also mentioned in the description field).

flags are now -ri and -niv as in --require-index-components and --no-index-validation.

Copy link
Contributor

@JeetKunDoug JeetKunDoug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for picking up this JIRA ticket, and already addressing the review comments I had posted earlier. Generally looks great, but I think there's one area where there's a bug introduced (in ColumnFamilyStore#importNewSSTables, where we lost the value of the copyData parameter in one of the overloads).

@@ -80,6 +80,16 @@ public class Import extends NodeToolCmd
description = "Copy data from source directories instead of moving them")
private boolean copyData = false;

@Option(title = "fail_on_missing_index",
name = {"-f", "--fail-on-missing-index"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about -ri/--require-index-component and -siv/--skip-index-validation

src/java/org/apache/cassandra/db/ColumnFamilyStore.java Outdated Show resolved Hide resolved
src/java/org/apache/cassandra/db/ColumnFamilyStore.java Outdated Show resolved Hide resolved
src/java/org/apache/cassandra/db/SSTableImporter.java Outdated Show resolved Hide resolved
public List<String> importNewSSTables(String ksName, String cfName, Set<String> srcPaths, boolean resetLevel, boolean clearRepaired, boolean verifySSTables, boolean verifyTokens, boolean invalidateCaches, boolean extendedVerify, boolean copyData)
{
return getCfsProxy(ksName, cfName).importNewSSTables(srcPaths, resetLevel, clearRepaired, verifySSTables, verifyTokens, invalidateCaches, extendedVerify, copyData);
public List<String> importNewSSTables(String ksName, String cfName, Set<String> srcPaths, boolean resetLevel,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So your comment says you added a new one, but it looks like you modified the old one... I don't know if there's anything other than the Import command that actually uses it, but wanted to check if it was intentional that this comment doesn't match the code - or did you mean you made a new one in ColumnFamilyStore/ColumnFamilyStoreMBean which would also make sense, and which you did... so maybe I just misunderstood this PR comment?

Copy link
Contributor

@maedhroz maedhroz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall structure looks pretty good. A few things to resolve around per-column component validation, some naming to finalize, and maybe a new test case for empty index validation, and we should be set...

@smiklosovic smiklosovic force-pushed the CASSANDRA-18714-5.0 branch 2 times, most recently from 60f7692 to bed6fc5 Compare January 11, 2024 10:15
Copy link
Contributor

@maedhroz maedhroz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple minor notes, but it looks like the important stuff is all taken care of.

LGTM

Thanks for the patch!

@smiklosovic smiklosovic force-pushed the CASSANDRA-18714-5.0 branch 3 times, most recently from e397f88 to b9ab129 Compare January 12, 2024 12:43
@maedhroz
Copy link
Contributor

maedhroz commented Jan 12, 2024

Pushed up an alternate approach to fixing the client mode issues here: fecac72

Tried to minimize the changes to existing code and the need for making up mock values by bypassing certain paths earlier. Also took some inspiration from what StressCQLSSTableWriter does in this case.

@smiklosovic Let me know what you think...

@maedhroz
Copy link
Contributor

This PR is replaced by #3046, yes?

@smiklosovic
Copy link
Contributor Author

@maedhroz yes, in 3046 there are refactored tests, your commit and one extra commit fixing the issue we talked about

@smiklosovic smiklosovic force-pushed the CASSANDRA-18714-5.0 branch 4 times, most recently from 874636c to 90bdd75 Compare January 20, 2024 17:41
patch by Stefan Miklosovic; reviewed by Caleb Rackliffe, Doug Rohrer for CASSANDRA-18714
@smiklosovic smiklosovic merged commit 016dd6c into apache:cassandra-5.0 Jan 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants