Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SELECT FOR JSON AUTO support in Babelfish #2243

Conversation

Jakeowen1
Copy link
Contributor

@Jakeowen1 Jakeowen1 commented Jan 10, 2024

Description

This change adds SELECT FOR JSON AUTO support to Babelfish which nests JSON objects based on the structure of the Select statement.

Issues Resolved

BABEL-3668

Test Scenarios Covered

  • Use case based -
select U.Id AS "users.userid", O.productId AS "order.productId", O.Id AS "product.oid", P.price AS "product.price" FROM forjson_nesting_vu_t_users U JOIN forjson_nesting_vu_t_orders O ON (U.id = O.userid) JOIN forjson_nesting_vu_t_products P ON (P.id = O.productid) FOR JSON AUTO

[{"users.userid": 1, "o": [{"order.productId": 1, "product.oid": 2, "p": [{"product.price": "20"}, {"product.price": "20"}, {"product.price": "30"}, {"product.price": "30"}]}, {"order.productId": 1, "product.oid": 1, "p": [{"product.price": "20"}, {"product.price": "20"}, {"product.price": "30"}, {"product.price": "30"}]}]}]

  • Boundary conditions -
select U.Id AS "users.userid", O.productId AS "order.productId", O.Id AS "product.oid", P.price AS "product.price", S.totalSales AS "totalsales" FROM forjson_nesting_vu_t_users U JOIN forjson_nesting_vu_t_orders O ON (U.id = O.userid) JOIN forjson_nesting_vu_t_products P ON (P.id = O.productid) JOIN forjson_nesting_vu_t_sales S ON (P.price = S.price) FOR JSON AUTO

[{"users.userid": 1, "o": [{"order.productId": 1, "product.oid": 2, "p": [{"product.price": "20", "s": [{"totalsales": 50}, {"totalsales": 50}]}]}, {"order.productId": 1, "product.oid": 1, "p": [{"product.price": "30", "s": [{"totalsales": 100}, {"totalsales": 100}]}]}, {"order.productId": 1, "product.oid": 2, "p": [{"product.price": "30", "s": [{"totalsales": 100}, {"totalsales": 100}]}]}, {"order.productId": 1, "product.oid": 1, "p": [{"product.price": "20", "s": [{"totalsales": 50}, {"totalsales": 50}]}]}]}]

  • Arbitrary inputs -
INSERT INTO forjson_nesting_vu_t_sales VALUES (1, NULL, NULL), (2, NULL, NULL);
GO

select U.Id AS "users.userid", O.productId AS "order.productId", O.Id AS "product.oid", P.price AS "product.price", S.totalSales AS "totalsales" FROM forjson_nesting_vu_t_users U JOIN forjson_nesting_vu_t_orders O ON (U.id = O.userid) JOIN forjson_nesting_vu_t_products P ON (P.id = O.productid) JOIN forjson_nesting_vu_t_sales S ON (P.price = S.price) FOR JSON AUTO

[{"users.userid": 1, "o": [{"order.productId": 1, "product.oid": 2, "p": [{"product.price": "20", "s": [{"totalsales": 50}, {"totalsales": 50}]}]}, {"order.productId": 1, "product.oid": 1, "p": [{"product.price": "30", "s": [{"totalsales": 100}, {"totalsales": 100}]}]}, {"order.productId": 1, "product.oid": 2, "p": [{"product.price": "30", "s": [{"totalsales": 100}, {"totalsales": 100}]}]}, {"order.productId": 1, "product.oid": 1, "p": [{"product.price": "20", "s": [{"totalsales": 50}, {"totalsales": 50}]}]}]}]
  • Negative test cases -
1> select cast(null as datetime) for JSON AUTO
2> go
json                                                                                                                                                                                                                                                            
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[{}]                                                                                                                                                                                                                                                            

(1 rows affected)
1> 
  • Minor version upgrade tests -
    NA

  • Major version upgrade tests -
    Added to upgrade scripts

  • Performance tests -
    NA

  • Tooling impact -
    NA

  • Client tests -
    NA

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Jake Owen added 4 commits January 10, 2024 21:05
Signed-off-by: Jake Owen <owjco@amazon.com>
Signed-off-by: Jake Owen <owjco@amazon.com>
Signed-off-by: Jake Owen <owjco@amazon.com>
Signed-off-by: Jake Owen <owjco@amazon.com>
@coveralls
Copy link
Collaborator

coveralls commented Jan 11, 2024

Pull Request Test Coverage Report for Build 7561344918

Warning: This coverage report may be inaccurate.

We've detected an issue with your CI configuration that might affect the accuracy of this pull request's coverage report.
To ensure accuracy in future PRs, please see these guidelines.
A quick fix for this PR: rebase it; your next report should be accurate.

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.1%) to 71.821%

Totals Coverage Status
Change from base Build 7472197412: 0.1%
Covered Lines: 39695
Relevant Lines: 55269

💛 - Coveralls

Jake Owen added 4 commits January 11, 2024 00:20
Signed-off-by: Jake Owen <owjco@amazon.com>
Signed-off-by: Jake Owen <owjco@amazon.com>
Signed-off-by: Jake Owen <owjco@amazon.com>
Signed-off-by: Jake Owen <owjco@amazon.com>
TargetEntry* te = (TargetEntry*) lc->ptr_value;
if(te && strncmp(te->resname, "json", 4) == 0 && te->expr != NULL && ((Expr*) te->expr)->type == T_FuncExpr) {
List* args = ((FuncExpr*) te->expr)->args;
if(args != NULL && ((Node*) ((ListCell*) args->elements)->ptr_value)->type == T_Aggref) {
Copy link
Contributor

@forestkeeper forestkeeper Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean need to get the first elements of List arg here ?
I think better to write linitial ? or proper function from pg_list.h

// Handle Views with an alias
SubLink* sl = (SubLink*) te->expr;
if(((Node*) sl->subselect)->type == T_Query)
checkForJsonAuto((Query*) sl->subselect);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any else in here ?

Jake Owen added 3 commits January 12, 2024 01:46
Signed-off-by: Jake Owen <owjco@amazon.com>
Signed-off-by: Jake Owen <owjco@amazon.com>
Signed-off-by: Jake Owen <owjco@amazon.com>
@@ -865,6 +875,9 @@ pltsql_pre_parse_analyze(ParseState *pstate, RawStmt *parseTree)
static void
pltsql_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
{
if(!checkForJsonAuto(query))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The place to put query tree walker should be after tsql dialect check

@@ -865,6 +875,9 @@ pltsql_pre_parse_analyze(ParseState *pstate, RawStmt *parseTree)
static void
pltsql_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
{
if(!checkForJsonAuto(query))
(void) query_tree_walker(query, check_json_auto_walker, (void *) pstate, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In here we should call check_json_auto_walker instead of query_tree_walker to avoid one more stack level.

forjson_table **tableInfoArr;
if(target) {
ListCell* lc = list_nth_cell(target, 0);
if(lc != NULL && ((Node*) lfirst(lc))->type == T_TargetEntry) {
Copy link
Contributor

@forestkeeper forestkeeper Jan 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use nodeTag(node) == T_TargetEntry to make the code more readable


// Modify query to be of the form "JSONAUTOALIAS.[nest_level].[table_alias]"
rtable = (List*) query->rtable;
if(rtable != NULL) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if it's == NULL, in such case we should throw exception ? If so , pls use assert

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise we'll have to write some else clause to cover those

JsonbPair *rowPairs;
if(currDepth == maxDepth) {
jsonbArray->val.array.nElems++;
jsonbArray->val.array.elems = (JsonbValue *) repalloc(jsonbArray->val.array.elems, sizeof(JsonbValue) * (jsonbArray->val.array.nElems));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we realloc some space in here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to allocate space for the new array element being added to jsonbArray

if(args != NULL && ((Node*) linitial(args))->type == T_Aggref) {
Aggref* agg = linitial_node(Aggref, args);
List* aggargs = agg->args;
if(aggargs != NULL && list_nth_cell(aggargs, 1) != NULL && ((Node*) lsecond(aggargs))->type == T_TargetEntry) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list_nth_cell(aggargs, 1) != NULL

--> should be

list_length(aggargs) > 1 ?

}

for(int i = 0; i < subq->targetList->length; i++) {
TargetEntry* te = castNode(TargetEntry, lfirst(list_nth_cell(subq->targetList, i)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're for loop into targetList, why not using foreach( lc , targetlist ) ?

Jake Owen added 4 commits January 17, 2024 17:15
Signed-off-by: Jake Owen <owjco@amazon.com>
Signed-off-by: Jake Owen <owjco@amazon.com>
Signed-off-by: Jake Owen <owjco@amazon.com>
Signed-off-by: Jake Owen <owjco@amazon.com>
ctequery = (Query*) cte->ctequery;
foreach(lc2, ctequery->rtable) {
subqRte = castNode(RangeTblEntry, lfirst(lc2));
if(subqRte->rtekind == RTE_RELATION)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls add a

assert(subqRte->rtekind == RTE_RELATION)

in here to make sure we don't miss anything in future

CREATE TRIGGER forjson_vu_trigger_1 on forjson_auto_vu_t_users for insert as
begin
select U.Id AS "users.userid",
U.firstname as "firstname",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add one more test case that inside the trigger like this :

with cte as (
select * from inserted
)
select * from cte as json auto

@@ -215,6 +215,7 @@ forjson
forjson-datatypes
forjson-subquery
forjson-nesting
forjsonauto
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we already support for json auto syntax, then we should add this test case into other Mvu schedule files as well.

Jake Owen added 2 commits January 17, 2024 19:40
Signed-off-by: Jake Owen <owjco@amazon.com>
Signed-off-by: Jake Owen <owjco@amazon.com>
@forestkeeper forestkeeper merged commit e5bb2ec into babelfish-for-postgresql:BABEL_4_X_DEV Jan 17, 2024
31 checks passed
Jakeowen1 added a commit to amazon-aurora/babelfish_extensions that referenced this pull request Jan 17, 2024
…ql#2243)

This change adds SELECT FOR JSON AUTO support to Babelfish which nests JSON objects based on the structure of the Select statement.

Task: BABEL-3668
Signed-off-by: Jake Owen <owjco@amazon.com>
Jakeowen1 added a commit to amazon-aurora/babelfish_extensions that referenced this pull request Jan 19, 2024
…ql#2243)

This change adds SELECT FOR JSON AUTO support to Babelfish which nests JSON objects based on the structure of the Select statement.

Task: BABEL-3668
Signed-off-by: Jake Owen <owjco@amazon.com>
forestkeeper pushed a commit that referenced this pull request Jan 19, 2024
This change adds SELECT FOR JSON AUTO support to Babelfish which nests JSON objects based on the structure of the Select statement.

Task: BABEL-3668
Signed-off-by: Jake Owen <owjco@amazon.com>
staticlibs pushed a commit to wiltondb/babelfish_extensions that referenced this pull request Apr 22, 2024
…ql#2243) (babelfish-for-postgresql#2270)

This change adds SELECT FOR JSON AUTO support to Babelfish which nests JSON objects based on the structure of the Select statement.

Task: BABEL-3668
Signed-off-by: Jake Owen <owjco@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants