Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propagate foreign server ops #5468

Merged
merged 1 commit into from Dec 23, 2021
Merged

Conversation

agedemenli
Copy link
Contributor

@agedemenli agedemenli commented Nov 16, 2021

DESCRIPTION: Adds propagation for foreign server commands

@@ -837,6 +865,16 @@ GetDistributeObjectOps(Node *node)
return &Any_CreateFunction;
}

case T_CreateFdwStmt:
{
return &Any_CreateFdw;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since foreign data wrappers are almost always created by an extension and we explicitly do not want to propagate objects created by an extension, we could skip CREATE FOREIGN DATA WRAPPER.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if it's not created by an extension? Like the one we have in the regression tests:

CREATE FUNCTION fake_fdw_handler()
RETURNS fdw_handler
AS 'citus'
LANGUAGE C STRICT;

CREATE FOREIGN DATA WRAPPER fake_fdw HANDLER fake_fdw_handler;

I know that it doesn't seem like something a user would do :)
So you're suggesting to skip those CREATE FDW statements because (almost) no one will try to create fdw's manually, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The utility seems low and there is some cost (e.g. making sure we don't accidentally break anything, making sure deparsing logic does not have SQL injection bugs). I'm ok supporting it, but I'm also ok not supporting it.

PreprocessCreateFdwStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
EnsureCoordinator();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EnsureCoordinator should be below ShouldPropagate, otherwise we block this command on workers altogether

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, makes sense

@codecov
Copy link

codecov bot commented Nov 23, 2021

Codecov Report

Merging #5468 (7958d44) into master (77d948a) will increase coverage by 0.02%.
The diff coverage is 97.20%.

❗ Current head 7958d44 differs from pull request most recent head 945f174. Consider uploading reports for the commit 945f174 to get more accurate results

@@            Coverage Diff             @@
##           master    #5468      +/-   ##
==========================================
+ Coverage   92.75%   92.77%   +0.02%     
==========================================
  Files         215      217       +2     
  Lines       45294    45509     +215     
==========================================
+ Hits        42012    42222     +210     
- Misses       3282     3287       +5     

@agedemenli agedemenli marked this pull request as ready for review November 24, 2021 14:01
src/backend/distributed/commands/foreign_server.c Outdated Show resolved Hide resolved
PreprocessAlterForeignServerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
if (!ShouldPropagate())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use ShouldPropagateObject instead of calling ShouldPropagate and IsObjectDistributed separately, as it looks more clear to me for objects. It is same for ones below as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, done

src/test/regress/sql/multi_create_fdw.sql Outdated Show resolved Hide resolved
/*
* CreateForeignServerStmtObjectAddress finds the ObjectAddress for the server
* that is created by given CreateForeignServerStmt. If missingOk is false and if
* the statistics does not exist, then it errors out.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale statistics comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, thanks

@@ -12,5 +12,44 @@ RETURNS fdw_handler
AS 'citus'
LANGUAGE C STRICT;

set citus.enable_ddl_propagation to off;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we disable ddl propagation here?

Copy link
Contributor Author

@agedemenli agedemenli Dec 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to propagate the commands between set citus.enable_ddl_propagation to off; and set citus.enable_ddl_propagation to on;, because these objects are already created on the workers, in pg_regress_multi.pl. Because of that, we get an "already exists" error on the worker, if we don't disable ddl prop.

We also don't want to remove them since some other tests depend on these.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably remove from pg_regress_multi.pl and let this create

static char * GetDefElemActionString(DefElemAction action);

char *
DeparseCreateForeignServerStmt(Node *node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: needs comment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it seems we don't have much comments on these deparse functions, so I'm fine with not having here as well

-- verify that the server is created on the worker
SELECT COUNT(*)=1 FROM pg_foreign_server WHERE srvname = 'foreign_server';
\c - - - :master_port
ALTER SERVER foreign_server OPTIONS (ADD passfile 'to_be_dropped');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have any quoted/escaped options and/or values?

foreach_ptr(def, stmt->options)
{
const char *value = defGetString(def);
appendStringInfo(buf, "%s \'%s\'", def->defname, value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly: should we call quote_literal_cstr for name/value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defnames don't have quotes according to pg syntax.
Values have single quotes. I've added quote_identifier for that

{
bool missingOk = false;
ObjectAddress typeAddress = GetObjectAddressFromParseTree(node, missingOk);
EnsureDependenciesExistOnAllNodes(&typeAddress);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of dependency a foreign server might have? Do we have a test for that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think a server can depend to an extension, and an extension can depend on a schema? So, such a test might be nice:

cREATE SCHEMa sc1;
CREATE EXTENSION postgres_fdw WITH SCHEMA sc1;
CREATE SERVER .... 

-- make sure that schema is also distributed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, thanks!

@@ -9,3 +9,71 @@ AS 'citus'
LANGUAGE C STRICT;
CREATE FOREIGN DATA WRAPPER fake_fdw HANDLER fake_fdw_handler;
CREATE SERVER fake_fdw_server FOREIGN DATA WRAPPER fake_fdw;
-- test propagating foreign server creation
CREATE EXTENSION postgres_fdw;
CREATE SERVER foreign_server TYPE 'test_type' VERSION 'v1'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have a very basic test with citus_add_node. Probably here is not the best place to add such a test, but can you find a place where we ensure that adding node sends the foreign server to the new node as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went over the test schedule but this file seemed like the most relevant place for this test. So I've added it to multi_create_fdw.sql.

If you have a suggestion I could consider moving it

{
const char *value = defGetString(def);

appendStringInfo(buf, " \'%s\'", value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any time you see quotes in an appendStringInfo, assume that there's a SQL injection bug.

Quoting should always be done separately to take into account escaping.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added that for CREATE SERVER OPTIONS: https://github.com/citusdata/citus/pull/5468/files#diff-c2f60db4cb9c4046fb711bd8ded1841ad0cede72d4ed8239aaf91e6151e9f90eR150-R151

But apparently missed that for ALTER SERVER.

Could you please check the one that is for CREATE SERVER? I'm considering applying the same here

@agedemenli agedemenli force-pushed the propagate-foreign-server-ops branch 4 times, most recently from df10a6a to 2c1ed64 Compare December 7, 2021 15:47
Copy link
Member

@onderkalaci onderkalaci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll continue reviewing & testing

UnmarkObjectDistributed(address);
}

/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels dangerous. Can't we create a new DropStmt with distributedServerNames?

In fact, I'm in favor of throwing an error if not all of them are distributed servers.

As we discussed earlier, these kinds of operations are dangerous for MX anyway.

CreateForeignServerStmtObjectAddress(Node *node, bool missing_ok)
{
CreateForeignServerStmt *stmt = castNode(CreateForeignServerStmt, node);
ForeignServer *server = GetForeignServerByName(stmt->servername, missing_ok);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think passing servername to this function can help to remove lots of repetition in the code, say we can use it in PreprocessRenameForeignServerStmt and others

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we expect all object address functions to have this signature?

https://github.com/citusdata/citus/blob/master/src/backend/distributed/deparser/objectaddress.c#L36

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is move the following -- by passing servername as parameter:

	ForeignServer *server = GetForeignServerByName(servername, missing_ok);
	Oid serverOid = server->serverid;
	ObjectAddress address = { 0 };
	ObjectAddressSet(address, ForeignServerRelationId, serverOid);

to a function, and use it in functions such as PreprocessRenameForeignServerStmt. But you need to palloc and return address*

CreateForeignServerStmt *createStmt = makeNode(CreateForeignServerStmt);

/* set server name and if_not_exists fields */
createStmt->servername = server->servername;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we need to createStmt->servername in this memory context? createStmt->servername = pstrdup(server->servername);

We should do same for all the pointers in the new CreateForeignServerStmt. As an example, see RecreateExtensionStmt

SELECT srvowner FROM pg_foreign_server WHERE srvname = 'foreign''server_1!';
srvowner
---------------------------------------------------------------------
3373
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we join with pg_roles to print the rolname instead of oid? This applies to all role ids in the test

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something like:

SELECT rolname FROM pg_roles JOIN pg_foreign_server ON (pg_roles.oid=pg_foreign_server.srvowner) WHERE srvname = 'foreign_server';
   rolname   
-------------
 onderkalaci
(1 row)

so that the tests do not rely on OIDs

}

EnsureCoordinator();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need EnsureDependenciesExistOnAllNodes here? I think with AlterServer you cannot alter the server to depend on a new object, right? Such as ALTER SERVER .. SET SCHEMA new_schema or such? If so, we should be fine.

As far as I can read from the docs, ALTER SERVER can only change Server's options?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's allowed to alter a server's owner, name and options. I thought like we might need to ensure the new owner role exists on the workers?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's why we do it on PostProcess.. ?

PreprocessAlterForeignServerOwnerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that is tricky in this PR is alter owner. What if the new owner role is not on the workers?

alter server myserver2 OWNER TO my_role;
ERROR:  role "my_role" does not exist
CONTEXT:  while executing command on localhost:9702
Time: 52.760 ms

I think that might be fine on the community, but on the enterprise we probably need EnsureDependencies call to make sure that the necessary objects(e.g., roles) are on the workers. So, we should still add EnsureDependenciesExistOnAllNodes here.

And, on the enteprise specific tests, we should add something like

-- prevent creating the roles just yet, let ALTER SERVER do that
SET citus.enable_object_propagation TO off;
create role ... ;
SET citus.enable_object_propagation TO on;

ALTER SERVER .. SET OWNER ...;
.. check owner is set properly on the workers ..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have the code and the same test for schemas, so probably fine not to add a specific test on this.

After getting this to enterprise, can you manually verify that it works expected (e.g., we send the roles)?

return;
}

appendStringInfoString(buf, "OPTIONS (");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you might be able to use AppendOptionListToString here

Copy link
Member

@onderkalaci onderkalaci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is almost ready to merge, I had few minor questions

SELECT srvowner FROM pg_foreign_server WHERE srvname = 'foreign''server_1!';
srvowner
---------------------------------------------------------------------
3373
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something like:

SELECT rolname FROM pg_roles JOIN pg_foreign_server ON (pg_roles.oid=pg_foreign_server.srvowner) WHERE srvname = 'foreign_server';
   rolname   
-------------
 onderkalaci
(1 row)

so that the tests do not rely on OIDs

PreprocessAlterForeignServerOwnerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have the code and the same test for schemas, so probably fine not to add a specific test on this.

After getting this to enterprise, can you manually verify that it works expected (e.g., we send the roles)?

foreach_ptr(serverValue, serverNames)
{
char *serverNameString = strVal(serverValue);
ForeignServer *server = GetForeignServerByName(serverNameString, false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use GetObjectAddressByServerName?

src/backend/distributed/commands/foreign_server.c Outdated Show resolved Hide resolved
src/backend/distributed/commands/foreign_server.c Outdated Show resolved Hide resolved
Assert(stmt->renameType == OBJECT_FOREIGN_SERVER);

char *serverName = strVal(stmt->object);
ForeignServer *server = GetForeignServerByName(serverName, false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use GetObjectAddressByServerName?

GetForeignServerCreateDDLCommand(Oid serverId)
{
/* generate a statement for creation of the server in "if not exists" construct */
Node *stmt = RecreateForeignServerStmt(serverId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really couldn't follow why we only have RecreateForeignServerStmt but say we don't have ReCreateDropForeignServerStmt or ReCreateAlterForeignServerStmt.

What makes RecreateForeignServerStmt special?

If you copy & pasted the logic from somewhere else, can you ask the original author why that is the case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is getting used to create the server on a worker when, say, we are doing a citus_add_node.
Here we generate a CreateForeignServerStmt with if_not_exists = true, then deparse it and send the command to workers.

@agedemenli agedemenli force-pushed the propagate-foreign-server-ops branch 2 times, most recently from fc2819b to dc36a4c Compare December 23, 2021 12:09
static char * GetDefElemActionString(DefElemAction action);

char *
DeparseCreateForeignServerStmt(Node *node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it seems we don't have much comments on these deparse functions, so I'm fine with not having here as well

FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'testhost', port '5432', dbname 'testdb');

\c - - - :master_port
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not leave any objects behind. So, maybe recreate test_dependent_schema and make sure once we drop, no objects left behind.

In general, left over objects can become painful once you do large refactors etc.

@agedemenli agedemenli merged commit 63626d4 into master Dec 23, 2021
@agedemenli agedemenli deleted the propagate-foreign-server-ops branch December 23, 2021 16:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants