Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CASSANDRA-18670 Importer should build SSTable indexes successfully before making new … #2492

Closed
wants to merge 2 commits into from

Conversation

maedhroz
Copy link
Contributor

…SSTables readable

patch by Caleb Rackliffe; reviewed by ? for CASSANDRA-18670

@maedhroz
Copy link
Contributor Author

// Make any indexes involved in this transaction non-queryable, as they will likely not match the backing table.
if (fromIndex)
indexes.forEach(StorageAttachedIndex::makeIndexNonQueryable);

Copy link
Contributor Author

@maedhroz maedhroz Jul 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in abort() here are probably what require some real discussion. The position I'm going to try to defend here is that making the index non-queryable is sub-optimal in all cases if our goal is to keep indexes and backing tables consistent w/ one another. (I've fixed SegmentFlushingFailureTester to reflect this, but more tests are probably necessary around Memtable flush...not hard to do.)

(I don't see any problem w/ continuing to not throw when the abort is upstream is generated from outside the index itself.)

For this to work, the companion changes in SSTableWriter are here: https://github.com/apache/cassandra/pull/2492/files#diff-27996cac323bd3bafcfd28e84caf649d44f3b4d83477874ae4d08c7513303209R234

@adelapena @mike-tr-adamson

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't hit Memtable flush yet, but just added IndexStreamingFailureTest, which documents how SAI handles index build failures during streaming in the non-entire file case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favour of this change. I never felt that it should be down to the index to say whether or not it is queryable. Particularly at this point in the writer. Short of validating the index here we have no idea whether it is valid or not.

}

return txnProxy.commit(accumulate);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change here in some way mirrors what already happens for streaming and what now happens for import in this patch. We make sure index components are complete before proceeding w/ making the related SSTables visible for reads, etc.

observers.forEach(SSTableFlushObserver::complete);
txnProxy.finish();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get a failure in complete(), and we've already called finish(), we'll get an illegal state...i.e. that we'll try to abort after committing due to the error from complete().

onSSTableChanged(Collections.emptySet(), notice.added, indexes, validate);
// Avoid validation for index files just written following Memtable flush. Otherwise, the new SSTables have
// come either from import, streaming, or a standalone tool, where they have also already been validated.
onSSTableChanged(Collections.emptySet(), notice.added, indexes, IndexValidation.NONE);
Copy link
Contributor Author

@maedhroz maedhroz Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, I think we can no longer get into the trouble we might find ourselves in w/ #2460 (comment), and the comment explains why we're not validating at all.

Once we get here, index components should have been built and such, so the only thing we're doing is updating the views and potentially marking the index non-queryable, which I think is okay. (i.e. Marking the index non-queryable is the only thing that we can really do, but what I need to confirm is whether we're being strict enough around that.)

Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is close to the change that I added in my original patch for CASSANDRA-18653. At this point, we should already know that the indexes are built and valid. If they aren't then we aren't in a position to build them anyway.

I'm wondering whether we always need to do this here or whether we only need to do it if a memtable is present. The SIM handles the notification as well and will trigger a build if the memtable is not present.

Copy link
Contributor Author

@maedhroz maedhroz Jul 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SIM handles the notification, but it will no longer trigger builds for SAI, given SAI will now return true from Index#isSSTableAttached(). I think we need to update the view no matter what.

@maedhroz
Copy link
Contributor Author

maedhroz commented Jul 19, 2023

There's one failure here I think I've caused: https://app.circleci.com/pipelines/github/maedhroz/cassandra/746/workflows/ac4f54f6-4cd6-4884-8204-366a8be43525/jobs/10985/tests#failed-test-0

Investigating...

EDIT: Fixes this and cleaned up the checked exceptions situation in the process :)

…SSTables readable

- Avoid validation in response to SSTableAddedNotification, as it should already have been done somewhere else
- Change SSTableWriter to prevent commit when a failure is thrown out of an index build

patch by Caleb Rackliffe; reviewed by Mike Adamson and Andres de la Peña for CASSANDRA-18670
@maedhroz
Copy link
Contributor Author

Committed as 83523dd

@maedhroz maedhroz closed this Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants