Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix validation of available extensions on data node #2617

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
49 changes: 38 additions & 11 deletions tsl/src/data_node.c
Expand Up @@ -32,6 +32,7 @@
#include <utils/inval.h>
#include <utils/syscache.h>

#include "config.h"
#include "fdw/fdw.h"
#include "remote/async.h"
#include "remote/connection.h"
Expand Down Expand Up @@ -563,21 +564,26 @@ connect_for_bootstrapping(const char *node_name, const char *const host, int32 p
}

/**
* Validate that extension is available and with the correct version.
* Validate that compatible extension is available on the data node.
*
* If the extension is not available on the data node, we will get strange
* errors when we try to use functions, so we check that the extension is
* available before attempting anything else.
* We check all available extension versions. Since we are connected to
* template DB when performing this check, it means we can't
* really tell if a compatible extension is installed in the database we
* are trying to add to the cluster. However we can make sure that a user
* will be able to manually upgrade the extension on the data node if needed.
*
* Will abort with error if there is an issue, otherwise do nothing.
* Will abort with error if there is no compatible version available, otherwise do nothing.
*/
static void
data_node_validate_extension_availability(TSConnection *conn)
{
PGresult *res = remote_connection_execf(conn,
"SELECT default_version, installed_version FROM "
"pg_available_extensions WHERE name = %s",
quote_literal_cstr(EXTENSION_NAME));
bool compatible;

PGresult *res =
remote_connection_execf(conn,
"SELECT version FROM pg_available_extension_versions WHERE name = "
"%s AND version ~ '\\d+.\\d+.\\d+.*' ORDER BY version DESC",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the version regexp needed here? Do we expect versions with a different format and why should those not be included in that case?

It is also unclear to me which use case this change addresses. I guess it is trying to check that there is at least one compatible version available for installation?

But there seems to be many different cases to handle here:

  1. The database does not exist on the data node. Requirement: there must be a compatible extension version available
  2. The database exists but does not have the extension installed. Same requirement as 1.
  3. The database exists and extension is installed. Requirement: the installed extension must be a compatible version.

Which of these cases do we handle? It seems this change mostly targets case 1 and 2 above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regexp is there to avoid tests breaking - since some of our so mocks don't follow our semantic versioning. If we would change the way we do versioning then that will require changes in few other places in our code (eg. logic that does version compatibility checking).

Right, your observations are pretty much correct. Let me explain a bit more. We have two types of checks: 1. check available versions 2. check installed version. Check 1 is what I was focused on here and that check if performed only
when trying to bootstrap DN. Check 2 is performed later on after we bootstrap DN and also each time we establish a connection.
This means that with this change in place the bootstrap process will not fail if it finds some available extension but can still fail later on if installed version is incompatible with version on AN. This would mean that user will need to do ALTER
EXTENSION UPDATE on problematic DN and then try adding the DN again. Hope this makes it more clear now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this same function used for case 2? Do we support case 3 (and which function is used to validate the extension in that case)?

Would be good to make this clear in comments or the function description. For instance, it would be good to describe exactly how the function is used (i.e., which cases above).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this function is only used when bootstrapping parameter is true (that's by default). So it will run for all three case you've mentioned. No matter if DB/extension exists or not.

Copy link
Contributor

@pmwkaa pmwkaa Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how Postgres stores versions here, for example is 2.0 will be always 2.0.0? Curious if this could mess up the sorting in some odd way

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell Postgres only stores string so we control the versioning. The order is not particularly important (code would work without ordering as well). I've only added it since most users would run the latest version so this would speed up the loop below.

quote_literal_cstr(EXTENSION_NAME));

if (PQresultStatus(res) != PGRES_TUPLES_OK)
ereport(ERROR,
Expand All @@ -589,8 +595,29 @@ data_node_validate_extension_availability(TSConnection *conn)
errmsg("TimescaleDB extension not available on remote PostgreSQL instance"),
errhint("Install the TimescaleDB extension on the remote PostgresSQL instance.")));

/* Here we validate the available version, not the installed version */
remote_validate_extension_version(conn, PQgetvalue(res, 0, 0));
Assert(PQnfields(res) == 1);

int i;
StringInfo concat_versions = makeStringInfo();
bool old_version;
for (i = 0; i < PQntuples(res); i++)
{
appendStringInfo(concat_versions, "%s, ", PQgetvalue(res, i, 0));
compatible = dist_util_is_compatible_version(PQgetvalue(res, i, 0),
TIMESCALEDB_VERSION,
&old_version);
if (compatible)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this version should be installed in such case during the CREATE EXTENSION operation on data node? I don't think we specify any version there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we install the default version and that default might not always be compatible with the one on AN (eg. default is 1.7.4 and we have 2.0.0 as well). But in that case a user can ALTER EXTENSION UPDATE on DN and the problem will be solved.

break;
}

if (!compatible)
ereport(ERROR,
(errcode(ERRCODE_TS_DATA_NODE_INVALID_CONFIG),
errmsg("remote PostgreSQL instance has an incompatible timescaledb extension "
"version"),
errdetail_internal("Access node version: %s, available remote versions: %s.",
TIMESCALEDB_VERSION_MOD,
concat_versions->data)));
}

/**
Expand Down
1 change: 1 addition & 0 deletions tsl/src/dist_util.c
Expand Up @@ -340,6 +340,7 @@ dist_util_is_compatible_version(const char *data_node_version, const char *acces
unsigned int access_node_major, access_node_minor, access_node_patch;

Assert(is_old_version);
Assert(data_node_version);

if (sscanf(data_node_version,
"%u.%u.%u",
Expand Down