Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CASSANDRA-17044: Refactor schema management #1270

Conversation

jacek-lewandowski
Copy link
Contributor

No description provided.

…tifier

- Converted SchemaChangeListener to an interface with default empty methods - refactored overrides and added @OverRide annotations.
- Extracted functionality of registering and unregistering listeners as well as issuing individual notifications for each updated element in the schema to SchemaChangeNotifier. The only public methods are for whole keyspace creation, dropping and alteration. This simplifies SchemaManager class by removing a lot of code.
Table metadata reference cache has been extracted out from SchemaManager as a distinct entity.
Renamed values denoting timeouts so that it is clear which time unit they expect.
…onResult

Static subclass TransformationResult was removed from SchemaManager and added to SchemaTransformation class.

Also removed fields like success or exception as it made no sense for them to be in that class.
LocalKeyspaces holds metadata of the system local keyspaces, namely system and system_schema.
Metadata of those keyspaces are immutable, they are not synced with other nodes and not stored. As such, we want to separate them from metadata of the keyspaces which are modifiable and synced between nodes, hence called sharedKeyspaces.

There a couple of minor refactorings following that change - some query methods in SchemaManager were renamed and refactored to return immutable Keyspaces object rather than ordinary collections of KeyspaceMetadata or names.
Make most of the methods private, refactor tests to properly use mocking external services
Moved some static methods from MigrationManager to MigrationCoordinator and made them instance methods, as well as to the test code
It turned out that many static methods in MigrationManager are actually used only in the test code. I created SchemaTestUtil class in test sources and moved those methods there so that MigrationManager is not polluted.
There are two purposes of this change:
1. Move the code around to the more appropriate places (for example, CFS specific code for dropping table was moved from SM class to CFS class)
2. Relax the synchronization when adding/removing keyspace instances in SM - instead of synchronizing the whole collection of keyspace instances, we only synchronize the related item (the original idea authored by Branimir Lambov)
Added a couple of useful methods and made some small self-explanatory fixes.
Sometimes it happens we replace the keyspace. In this case the keyspace with the same name is in dropped and created sets. If we first try to create it, we would just replace the existing one and the remove it, which is not what we want.
Once joined the ring, StorageService artificially executes onChange (endpoint event subscriber method) with all the updated values. However, we want that all registered endpoint event subscribers get updated.
The code which is responsible for dealing with SchemaKeyspace (storing changes in system tables) has been extracted out to DefaultSchemaUpdateHandler.
The other changes are the consequences of that change.

Delegate some functionality related to MigrationCoordinator through DefaultSchemaUpdateHandler
DefaultSchemaUpdateHandler also registers itself as a endpoint event subscriber, so we may avoid explicit calls on MigrationCoordinator when certain events happen.

Squash SchemaManager.loadFromDisk methods

Make some methods private in SchemaManager

Remove direct usages of MigrationCoordinator
After previous changes, it became possible to remove some methods and reduce visibility of others. The main change is that we no longer need to get direct MigrationCoordinator instance outside DefaultSchemaUpdateHandler.
Factory is used to create the right instance depending on the context - whether we are online or not (server mode / tool or client mode). Provider is for creating the factory so that we can switch the implementation by providing a system property. This can be also useful for testing.
This seemed to be screwed a bit. In just two schema alteration statements we collected client warning which were captured during the transformation into a local collection. I guess it was done because the transformation was being done in a different stage (migration) and client warnings collected in that stage would not be present in the stage where the query was executed. Then, the client warnings are retrieved using clientWarnings method and added to the captured client warnings in the stage which is executing the query. Once we are not executing explicit transformations in migration stage (the methods are synchronized anyways and this was a blocking operation) the client warnings got duplicated - we collected them and then added again. This weird mechanism was implemented only in two schema alteration statements - I wonder if for other ones, the client warnings could simply get lost before this change.
@jacek-lewandowski jacek-lewandowski deleted the CASSANDRA-17044-4.0 branch September 13, 2022 06:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants