New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature issue#16 #73
Feature issue#16 #73
Conversation
This commit aims to check the type of distribution column. After this commit, pg_shard errors out if it cannot work on the distribution column on table distribution call.
BTGreaterEqualStrategyNumber); | ||
|
||
/* if partition column does not support <= and >= operators, error out*/ | ||
if (lessEqualOperatorId == InvalidOid || greaterEqualOperatorId == InvalidOid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could not find any type, which has non-InvalidOid partitionColumnOpClassId, with invalid equality operators.
But, to be on the safe side, I added this extra control.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this check only makes sense for tables that use range partitioning: for hash partitioning, the shards are always defined using 32-bit integers, and we can always find the inequality operator for that.
If using range partitioning, shard boundaries will be defined using the partition column type, so it makes sense in that case to check for inequality operators.
Can you change this section to do something like:
if using hash partitioning
check for presence of hash operator for partition column type
else if using range partitioning
check for existence of inequality operators for partition column type
else
error out that partition type is unexpected
Hey @onderkalaci — I know I said I'd email @sumedhpathak and @ozgune about breaking the error checking out into its own function, but what I hadn't realized is that So there is no argument to support breaking error handling into its own function here: it makes sense that Which just leaves my point about the inequality operator checks. I'll see whether there are any other stylistic things that need addressing and add a checklist at the top of this review. After those changes I think we're good to merge. |
This commit add checks to find whether the data type of the partition column has the required support routines for future select/insert/update commands.
This commit aims to improve style of tests.
Hey @jasonmp85, @sumedhpathak Below I summerize what is included in the updated pull request:
|
@@ -50,6 +56,7 @@ static List * ParseWorkerNodeFile(char *workerNodeFilename); | |||
static int CompareWorkerNodes(const void *leftElement, const void *rightElement); | |||
static bool ExecuteRemoteCommand(PGconn *connection, const char *sqlCommand); | |||
static text * IntegerToText(int32 value); | |||
static Oid GetSupportRoutine(Oid columnOid, Oid accessMethodId, int16 supportFunctionNumber); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Routine is an unfamiliar word in this codebase. I don't see any functions in CitusDB or PostgreSQL that use this term except those related to foreign data wrappers. Most comments refer to this as a support function, and I think we should follow that pattern.
Ok, so after staring at this for a while, I think I know what I'd like to see happen… In some code you haven't modified, there's a section like: partitionColumnId = get_attnum(distributedTableId, partitionColumnName);
if (partitionColumnId == InvalidAttrNumber)
{
ereport(ERROR, (errmsg("could not find column: %s", partitionColumnName)));
} This error checking is duplicated by Then, If the type does not have a default operator class, I think this could help code flow quite a bit while cutting down on verbosity and duplication. |
Var* partitionColumnVar = NULL; | ||
Oid partitionColumnTypeId = InvalidOid; | ||
Oid partitionColumnOpClassId = InvalidOid; | ||
Oid btreeSupportRoutine = InvalidOid; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move the declaration of btreeSupportRoutine
and hashSupportRoutine
into the respective if
/else if
blocks to reduce their scope… no reason to declare them up here. And when you move them, you can combine the declaration and assignment into a one-line definition for each.
This commit aims to rename GetSupportRoutine function to GetSupportFunction. Also, updated related variable names and comments.
*/ | ||
if (partitionColumnOpClassId == InvalidOid) | ||
{ | ||
ereport(ERROR, (errcode(ERRCODE_WRONG_OBJECT_TYPE), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ERRCODE_WRONG_OBJECT_TYPE
isn't appropriate here. Let's stick with what other functions do when they get an InvalidOid
here. I think the following should work:
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("data type %s has no default operator class for specified partition method",
format_type_be(attrType)),
errdetail("Partition column types must have a default operator class defined.")));
Ok @onderkalaci — I've left my checklist at the top of this PR. I think the reorganization is enough that I'll want to check this out once more before a merge, but assuming it's mostly in shape by morning (MDT) I'll merge it myself if it looks good. |
This commit aims to apply feedback from @jasonmp85. Basically, change code flow so that code gets shorter and easier to read. Also, apply style changes.
This commits includes only some whitespace changes.
@jasonmp85 Thanks for the review. |
@onderkalaci By default, Coveralls will complain if coverage decreases. We discussed the coverage of your patch (and how it couldn't be complete until we support range partitioning) and I expected coverage to decrease slightly. If you notice, the merge button is still present, and nothing can ever stop any of us from merging from the command line. So it's just a slight inconvenience designed to make sure you are aware you lowered coverage. In this case, it's expected, so no biggie. |
HASH_AM_OID, HASHPROC); | ||
if (hashSupportFunction == InvalidOid) | ||
{ | ||
Oid partitionColumnTypeId = partitionColumnVar->vartype; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the point in this variable existing, so I'm going to remove it before merging (it's only referenced once).
Oid columnOid = partitionColumn->vartype; | ||
Oid operatorClassId = GetDefaultOpClass(columnOid, accessMethodId); | ||
|
||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is actually incorrect, I think… This documentation section shows that an index can be defined using a customer operator class rather than the default. So it's possible we could support types without default operator classes, so long as the user specified which custom operator class to use at the time of distribution. I'll update this comment to clarify that we just don't support anything other than using a default operator class at the moment.
Fixed some alignment issues, clarified some comment language, renamed a variable, and moved error call for range partitioning.
I've tested this on the latest CitusDB, PostgreSQL 9.3, and PostgreSQL 9.4 on OS X. Travis'll take care of Linux. After it passes I'm gonna merge. |
Detect unsupported partition key types during dist. table creation. cr: @jasonmp85
This pull request aims to prevent tables to be distributed with columns that cannot be used as distribution column.
fixes #16
Code Review Tasks
get_attnum
section and replace with call toColumnNameToColumn
GetSupportRoutine
asSupportFunctionForColumn
and make it take aVar *
SupportFunctionForColumn