New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce citus local tables #4143
Introduce citus local tables #4143
Conversation
@@ -96,13 +96,12 @@ CallFuncExprRemotely(CallStmt *callStmt, DistObjectCacheEntry *procedure, | |||
CitusTableCacheEntry *distTable = GetCitusTableCacheEntry(colocatedRelationId); | |||
Var *partitionColumn = distTable->partitionColumn; | |||
bool colocatedWithReferenceTable = false; | |||
if (IsCitusTableTypeCacheEntry(distTable, REFERENCE_TABLE)) | |||
if (IsCitusTableTypeCacheEntry(distTable, CITUS_TABLE_WITH_NO_DIST_KEY)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that change make sense ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should punt on this change for now, because we have a lot of reference table usage: https://github.com/citusdata/citus/compare/single-placement-table/master-cache-entry-rebased...add_funct_call_delegation?expand=1
I'd suggest we make this back as REFERENCE_TABLE and add Support function call delegation
to the todo list. Maybe share this super basic change that potentially allows it.
src/include/distributed/commands.h
Outdated
* Flags that can be passed to GetForeignKeyOids to indicate | ||
* which foreign key constraint OIDs are to be extracted | ||
*/ | ||
typedef enum ExtractForeignKeyConstrainstMode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Constrainst -> Constraints
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets not forget this
Just a heads up, in 5012c6b, the commit disabling |
@@ -96,13 +96,12 @@ CallFuncExprRemotely(CallStmt *callStmt, DistObjectCacheEntry *procedure, | |||
CitusTableCacheEntry *distTable = GetCitusTableCacheEntry(colocatedRelationId); | |||
Var *partitionColumn = distTable->partitionColumn; | |||
bool colocatedWithReferenceTable = false; | |||
if (IsCitusTableTypeCacheEntry(distTable, REFERENCE_TABLE)) | |||
if (IsCitusTableTypeCacheEntry(distTable, CITUS_TABLE_WITH_NO_DIST_KEY)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should punt on this change for now, because we have a lot of reference table usage: https://github.com/citusdata/citus/compare/single-placement-table/master-cache-entry-rebased...add_funct_call_delegation?expand=1
I'd suggest we make this back as REFERENCE_TABLE and add Support function call delegation
to the todo list. Maybe share this super basic change that potentially allows it.
/* assert that we created the shell table properly in the same schema */ | ||
Assert(OidIsValid(shellRelationId)); | ||
|
||
DropAndMoveDefaultSequenceOwnerships(shardRelationId, shellRelationId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be nice to have a comment on why and how we handle sequences on citus local tables
src/include/distributed/commands.h
Outdated
* Flags that can be passed to GetForeignKeyOids to indicate | ||
* which foreign key constraint OIDs are to be extracted | ||
*/ | ||
typedef enum ExtractForeignKeyConstrainstMode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets not forget this
@@ -30,6 +30,7 @@ typedef enum | |||
} PropSetCmdBehavior; | |||
extern PropSetCmdBehavior PropagateSetCommands; | |||
extern bool EnableDDLPropagation; | |||
extern bool AllCitusLocalTable; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this new variable? We seem to not use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was part of vanilla testing, removing it :)
|
||
|
||
/* | ||
* create_citus_local_table creates a citus table from the table with relationId. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: a citus table -> a citus local table
|
||
PG_RETURN_VOID(); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor:
It could be nice to order the functions in a way that they are called, for example you could place CreateCitusLocalTable
below this function for better readability
static void | ||
CreateCitusLocalTable(Oid relationId) | ||
{ | ||
/* these checks should be done before acquiring the lock on the table */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: it could be nicer to explain "why" these checks should be done "before". That way when adding something new, it could help people place it to the right place
|
||
/* | ||
* ErrorIfUnsupportedCreateCitusLocalTable errors out if we cannot create the | ||
* citus local table from the relation with relationId. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from the relation with relationId -> from the relation
we don't have relationId
as a parameter
|
||
|
||
/* | ||
* FinalizeCitusLocalTableCreation performs completes creation of the citus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
either remove performs
or completes
@@ -1401,6 +1666,87 @@ InterShardDDLTaskList(Oid leftRelationId, Oid rightRelationId, | |||
} | |||
|
|||
|
|||
/* | |||
* CreateRightShardListForInterShardDDLTask is an helper function that creates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an helper -> a helper
|
||
|
||
/* | ||
* DeferErrorIfUnsupportedModifyQueryWithCitusLocalTable is an helper function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an helper -> a helper
|
||
|
||
/* | ||
* DeferErrorIfUnsupportedModifyQueryWithPostgresLocalTable is an helper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/an helper/ a helper/g
{ | ||
bool groupContainsNodes = false; | ||
|
||
LockRelationOid(DistNodeRelationId(), ShareLock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the difference between not releasing a lock and releasing it with NO_LOCK
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can't unlock a relation with NO_LOCK
* returns coordinator node. | ||
*/ | ||
WorkerNode * | ||
CoordinatorNode() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CoordinatorNode -> CoordinatorNodeIfAddedAsWorker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about CoordinatorNodeIfAddedAsWorkerOrError ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah even better :)
FROM (SELECT shardid FROM pg_dist_shard WHERE logicalrelid='citus_local_table_1'::regclass) as shardid; | ||
-- undistribute_table is supported | ||
BEGIN; | ||
SELECT undistribute_table('citus_local_table_1'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also make sure that master_copy_shard_placement()
has an explicit error check for citus local tables. The correct place could be ErrorIfTableCannotBeReplicated()
If you create a citus local table with statement replication, it gives a weird error now
SELECT master_copy_shard_placement(102082, 'localhost', 5432, 'localhost', 9700);
ERROR: could not find placement matching "localhost:9700"
HINT: Confirm the placement still exists and try again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be outdated, I added necessary changes for that in
citus/src/backend/distributed/operations/repair_shards.c
Lines 186 to 213 in e36b26e
/* | |
* ErrorIfTableCannotBeReplicated function errors out if the given table is not suitable | |
* for its shard being replicated. Shard replications is not allowed only for MX tables, | |
* since RF=1 is a must MX tables. | |
*/ | |
static void | |
ErrorIfTableCannotBeReplicated(Oid relationId) | |
{ | |
/* | |
* Note that ShouldSyncTableMetadata() returns true for both MX tables | |
* and reference tables. | |
*/ | |
bool shouldSyncMetadata = ShouldSyncTableMetadata(relationId); | |
if (!shouldSyncMetadata) | |
{ | |
return; | |
} | |
CitusTableCacheEntry *tableEntry = GetCitusTableCacheEntry(relationId); | |
char *relationName = get_rel_name(relationId); | |
if (IsCitusTableTypeCacheEntry(tableEntry, CITUS_LOCAL_TABLE)) | |
{ | |
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), | |
(errmsg("Table %s is a citus local table. Replicating " | |
"shard of a citus local table currently is not " | |
"supported", quote_literal_cstr(relationName))))); | |
} |
SELECT * FROM citus_local_table_3; | ||
|
||
-- finally show that we do not allow defining foreign key in mx nodes | ||
ALTER TABLE citus_local_table_3 ADD CONSTRAINT fkey_local_to_local_2 FOREIGN KEY(l1) REFERENCES citus_local_table_4(l1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets add some savepoint regression tests at this moment. It shouldn't take a lot of time, you can check the existing savepoint tests.
To simple tests like,
Do a modification to citus local table;
rollback to savepoint;
see that the modification is rolled back
===
Do a modification to citus local table;
Do a modification to distributed table;
rollback to savepoint;
see that the modificatios are rolled back
===
Do a modification to citus local table;
Do a modification to reference table;
rollback to savepoint;
see that the modifications are rolled back
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are good to merge this now. It has been a long journey, we started with the idea of enabling foreign keys between local tables and reference tables, and we ended-up with a first class table type in Citus. This work will be essential for hybrid local/distributed table clusters.
We'll need later to iterate to implement all the missing pieces (#4145).
This commit brings following features: Foreign key support from citus local tables to reference tables * Foreign key support from reference tables to citus local tables (only with RESTRICT & NO ACTION behavior) * ALTER TABLE ENABLE/DISABLE trigger command support * CREATE/DROP/ALTER trigger command support and disallows: * ALTER TABLE ATTACH/DETACH PARTITION commands * CREATE TABLE <postgres table> ATTACH PARTITION <citus local table> commands * Foreign keys from postgres tables to citus local tables (the other way was already disallowed) for citus local tables.
ab643b5
to
3a73fba
Compare
DESCRIPTION: Introduce citus local tables
This is the final pr to merge Citus Local Tables into master 🎵 🕺
The commits in this pr are merged from other sub-pr's:
create_citus_local_table
udffor citus local tables
places to integrate citus local tables into our distributed execution logic
We are introducing citus local tables, which a new table type to citus.
To be able to create a citus local table, first we need to add coordinator as a worker
node.
Then, we can create a citus local table via
SELECT create_citus_local_table(<tableName>)
.Calling this udf from coordinator will actually create a single-shard table whose shard
is on the coordinator.
Also, from the citus metadata perspective, for citus local tables:
partitionMethod
is set toDISTRIBUTE_BY_NONE
(like reference tables) andreplicationModel
is set to the current value ofcitus.replication_model
, whichalready can't be equal to
REPLICATION_MODEL_2PC
, which is only used for referencetables internally.
Note that currently we support creating citus local tables only from postgres tables
living in the coordinator.
That means, it is not allowed to execute this udf from worker nodes or it is not allowed
to move shard of a citus local table to any other nodes.
Also, run-time complexity of calling
create_citus_local_table
udf does not dependon the size of the relation, that means, creating citus local tables is actually a
non-blocking operation.
This is because, instead of copying the data to a new shard, this udf just does the
following:
the
shardId
to it's name, constraints, indexes and triggers etc.,metadata sycn is enabled.
Here, we should also note we can execute queries/dml's from mx worker nodes
as citus local tables are already first class citus tables.
Even more, we brought trigger support for citus local tables.
That means, we can define triggers on citus local tables so that users can define trigger
objects to perform execution of custom functions that might even modify other citus tables
and other postgres tables.
Other than trigger support, citus local tables can also be involved in foreign key relationships
with reference tables.
Here the only restriction is, foreign keys from reference tables to citus local tables cannot
have behaviors other than
RESTRICT
&NO ACTION
behavior.Other than that, foreign keys between citus local tables and reference tables just work fine.
All in all, citus local tables are actually just local tables living in the coordinator, but natively
accessible from other nodes like other first class citus tables and this enables us to set foreign
keys constraints between very big coordinator tables and reference tables without having to
do any data replication to worker nodes for local tables.
Parent issue that describes current limitations & TO-DO's can be found here.