Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent gp_tablespace_segment_location() from executing on entrydb QE #13287

Merged

Conversation

interma
Copy link
Member

@interma interma commented Mar 21, 2022

gp_tablespace_segment_location() has its own dispatching logic, so they do not work on entrydb QE since we do not support dispatching on it. These are other functions like it, we fix them before by giving a warning to user, example: #11178.

gp_tablespace_segment_location() is defined in src/backend/catalog/system_views.sql, it has been fixed in master (with more details and repro scenario): #13075: changed it definition in system_views.sql.

But it cannot port to 6X directly: modify system_views.sql breaks the catalog interface, so we need handle it in c code.
This PR does it and refactors the previous similar logic.

Here are some reminders before you submit the pull request

  • Add tests for the change
  • Document changes
  • Communicate in the mailing list if needed
  • Pass make installcheck
  • Review a PR in return to support the community

;
}

if (!strcmp(proname, "gp_tablespace_segment_location"))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compare the function name one by one is not effective here, but I think it's acceptable:

  • there are not many such functions
  • only exec_location == PROEXECLOCATION_ALL_SEGMENTS and run on entrydb does this check

@interma interma changed the title [WIP] Prevent gp_tablespace_segment_location() from executing on entrydb QE Prevent gp_tablespace_segment_location() from executing on entrydb QE Mar 23, 2022
@interma interma requested a review from kainwen March 23, 2022 01:54
Copy link
Contributor

@kainwen kainwen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the banned list logic might be added in init_fcache. Please spike it.

static bool
function_not_run_entrydb(const char *proname, Oid funcid)
{
// check by oid (if it has a fixed oid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/* */ style comments.

}

// check by func name
if (!strcmp(proname, "gp_tablespace_segment_location"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot compare via text name. In theory, user can define a UDF has the same name but in different schema.

So please use funcid, this is the identifier.

@@ -2793,7 +2814,45 @@ create_functionscan_path(PlannerInfo *root, RelOptInfo *rel,
}
}
else
{
// Gp_role == GP_ROLE_EXECUTE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment style and extra blank line.

const char *proname = NameStr(procform->proname);;
char this_exec_location = func_exec_location(funcid);

if (GpIdentity.segindex == MASTER_CONTENT_ID &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_function_scan is called during planning stage. Most of the case, GpIdentity.segindex == MASTER_CONTENT_ID will be true. Here we are creating plan not running a function. Thsi is not correct place to add the logic.

@interma
Copy link
Member Author

interma commented Mar 23, 2022

thanks Zhenghua, I will fix them later.

@interma
Copy link
Member Author

interma commented Mar 30, 2022

thanks zhenghua for review!

@interma
Copy link
Member Author

interma commented Mar 30, 2022

refactor the code: integrated Junfeng's #11178 logic in my PR.

if (retvalue)
return retvalue;

char exec_location = func_exec_location(foid);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two lines 1487 & 1488 seem redundant to me.

Copy link
Member Author

@interma interma Apr 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thoughts:
so far, these functions (need to check by name) are not too many, and all of them arePROEXECLOCATION_ALL_SEGMENTS, using the if condition first can boost a little performance (avoid every function to go to strcmp later).

I will add some comments in code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thoughts: so far, these functions (need to check by name) are not too many, and all of them arePROEXECLOCATION_ALL_SEGMENTS, using the if condition first can boost a little performance (avoid every function to go to strcmp later).

I will add some comments in code.

But the logic is a bit confusing. We want to disallow a function's call by name, but here we'd try to fetch another field to filter out first.

Also func_exec_location will search the syscache and I do not think the perf is changed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no problem, I will fix it.

@@ -2793,7 +2793,9 @@ create_functionscan_path(PlannerInfo *root, RelOptInfo *rel,
}
}
else
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems not related to the issue, it is better not to change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

* So they do not work on entrydb since we do not support dispatching
* from entry-db currently.
*/
static bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe one word on why some are checked via OID and some are by text.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@kainwen kainwen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@interma
I have added some more minor comments. Please address them before merging. Thanks.

@interma
Copy link
Member Author

interma commented Apr 12, 2022

got it, will fix them later, thanks zhenghua!

@interma interma merged commit 4bc90ea into greenplum-db:6X_STABLE Apr 14, 2022
@interma
Copy link
Member Author

interma commented Apr 14, 2022

merged, thanks zhenghua for review it!

24nishant pushed a commit to 24nishant/gpdb that referenced this pull request Oct 17, 2022
…greenplum-db#13287)

gp_tablespace_segment_location() has its own dispatching logic, so they do not work on entrydb QE since we do not support dispatching on it. These are other functions like it, we fix them before by giving a warning to user, example: greenplum-db#11178.

gp_tablespace_segment_location() is defined in src/backend/catalog/system_views.sql, it has been fixed in master (with more details and repro scenario): greenplum-db#13075: changed it definition in system_views.sql.

But it cannot port to 6X directly: modify system_views.sql breaks the catalog interface, so we need handle it in c code.
This PR does it and refactors the previous similar logic.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants