New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport JSON support to Greenplum #530

Merged
merged 13 commits into from Aug 1, 2016

Conversation

Projects
None yet
@ahachete

ahachete commented Mar 15, 2016

No description provided.

@gpdbdreddbot

This comment has been minimized.

gpdbdreddbot commented Mar 15, 2016

Hello ahachete!

Thanks for submitting this pull request!

All pull request authors must have a Contributor License Agreement (CLA) on-file with us. Please sign the appropriate CLA (individual or corporate).

When sending signed CLA please provide your github username in case of individual CLA or the list of github usernames that can make pull requests on behalf of your organization.

If you are confident that you're covered under a Corporate CLA, please make sure you've publicized your membership in the appropriate Github Org, per these instructions.

extern Datum json_recv(PG_FUNCTION_ARGS);
extern Datum json_send(PG_FUNCTION_ARGS);
#endif /* XML_H */

This comment has been minimized.

@danielgustafsson

danielgustafsson Mar 15, 2016

Member

s/XML_H/JSON_H/

This comment has been minimized.

@ahachete

ahachete Mar 15, 2016

On 15/03/16 03:57, Daniel Gustafsson wrote:

In src/include/utils/json.h
#530 (comment):

  • * src/include/utils/json.h
  • *-------------------------------------------------------------------------
  • /
    +
    +#ifndef JSON_H
    +#define JSON_H
    +
    +#include "fmgr.h"
    +
    +extern Datum json_in(PG_FUNCTION_ARGS);
    +extern Datum json_out(PG_FUNCTION_ARGS);
    +extern Datum json_recv(PG_FUNCTION_ARGS);
    +extern Datum json_send(PG_FUNCTION_ARGS);
    +
    +#endif /
    XML_H */

s/XML_H/JSON_H/

 While you're right :) this comes from the original patch:

$ git diff
5384a73~1..5384a73
src/include/utils/json.h |grep endif
+#endif /* XML_H */

 This is fixed by future patch 58e9f974dcfae7c4c445631afad47d80deb83160:

$ git diff
58e9f97~1..58e9f97
src/include/utils/json.h |grep endif
-#endif /* XML_H /
+#endif /
JSON_H */

so I'll wait until this patch is applied (as I mentioned before, I plan
to try to apply all patches related to JSON until REL_9_2_0). Let me
know if I should proceed otherwise.

 Álvaro

Álvaro Hernández Tortosa


8Kdata

This comment has been minimized.

@danielgustafsson

danielgustafsson Mar 15, 2016

Member

That's fine, let's await the future patch.

@schubert schubert closed this Mar 22, 2016

@schubert schubert reopened this Mar 22, 2016

@gpdbdreddbot

This comment has been minimized.

gpdbdreddbot commented Mar 22, 2016

Hello ahachete!

Thanks for submitting this pull request! I'm here to inform the recipients of the pull request that you've already signed the CLA.

@kuien

This comment has been minimized.

Contributor

kuien commented Apr 9, 2016

I like the comment "FIXME commit xxx".

And any additional regression tests to cover this feature, for example, sql/json.sql?

@ahachete

This comment has been minimized.

ahachete commented Apr 11, 2016

On 09/04/16 03:32, Kuien Liu wrote:

I like the comment "FIXME commit xxx".

And any additional regression tests to cover this feature, for
example, sql/json.sql?

 Yes.

 Please understand that this PR is a WIP. There are more patches to 

be backported, until all JSON code released with 9.2 is backported.

 In particular, the next commit to be backported 

(postgres/postgres@39909d1)
introduces regression testing.

 Hopefully some more patches will be backported soon :)

 Cheers,

 Álvaro

@ahachete ahachete force-pushed the ahachete:json branch from 3204450 to b9bbe97 Apr 24, 2016

@ahachete

This comment has been minimized.

ahachete commented Apr 24, 2016

Commit refactored to take into account the changes to pg_type.h introduced by commit 8954a53

Also rebased to current standard_conforming_strings_on which is currently even with master.

More commits will be added to this PR to continue with the json backporting process.

@ascherbaum-pivotal

This comment has been minimized.

Contributor

ascherbaum-pivotal commented May 6, 2016

I went ahead and committed #527. I understand that is a dependency for this PR, right?

@ahachete

This comment has been minimized.

ahachete commented May 7, 2016

Yes, that's correct, it's a dependency for this PR. Thanks for merging #527

@ascherbaum-pivotal

This comment has been minimized.

Contributor

ascherbaum-pivotal commented May 16, 2016

@ahachete Where are you with this PR? Do you plan to add more code?

@ahachete

This comment has been minimized.

ahachete commented May 16, 2016

 Indeed. I hope to have some more patch(es) this week. I will let 

you know.

 Thanks!

On 16/05/16 15:45, Andreas Scherbaum wrote:

@ahachete https://github.com/ahachete Where are you with this PR? Do
you plan to add more code?


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#530 (comment)

Álvaro Hernández Tortosa


8Kdata

@ascherbaum-pivotal

This comment has been minimized.

Contributor

ascherbaum-pivotal commented Jun 10, 2016

@ahachete Any update on this PR?

@ahachete ahachete force-pushed the ahachete:json branch 2 times, most recently from 1fbcc91 to 1f58d9c Jun 13, 2016

@ahachete

This comment has been minimized.

ahachete commented Jun 14, 2016

Hi @ascherbaum-pivotal

So far, most of my planned work to bring JSON to the level at which it was by 9.2 is completed. There are still some patches to backport on top of these ones, but should be easy work that I will have completed before the end of this week's.

Regression tests seem to work for me, and Travis too. In any case, I believe a good patch review would be more than welcome, specially "Add array_to_json and row_to_json functions" which required debugging and writing some GPDB specific code.

Patches are rebased against current master.

@ahachete ahachete force-pushed the ahachete:json branch from 3501600 to 3991594 Jun 14, 2016

@ahachete

This comment has been minimized.

ahachete commented Jun 14, 2016

Hi all.

Finished backporting all the patches that bring JSON functionality on par with that of Postgres' 9.2. I will be shortly posting a more detailed explanation to the mailing list.

Please let me know if there's anything else that would require fixing.

Enjoy JSON.

@ahachete ahachete force-pushed the ahachete:json branch 2 times, most recently from 111de02 to ae0f671 Jun 15, 2016

@@ -181,6 +182,22 @@ ExplainResultDesc(ExplainStmt *stmt)
{
TupleDesc tupdesc;
/* FIXME

This comment has been minimized.

@ascherbaum-pivotal

ascherbaum-pivotal Jun 16, 2016

Contributor

Does not look nice if we leave FIXME items around in the code.

@hlinnaka Will it be enough to just merge the changes in explain.c from postgres/postgres@5384a73#diff-9ce3cf19c55c48d8714787cf9e3b533f or is there more work required?

This comment has been minimized.

@danielgustafsson

danielgustafsson Jun 17, 2016

Member

This refers to the functionality in PostgreSQL to format EXPLAIN output as XML or JSON, functionality we have yet to merge (it originates in 9.0). Personally I think we can remove this comment altogether as it's not related to JSON datatype support per se but using the JSON code for EXPLAIN.

@@ -410,6 +410,12 @@ DESCR("XML content");
#define XMLOID 142
DATA(insert OID = 143 ( _xml PGNSP PGUID -1 f b t \054 0 142 0 array_in array_out array_recv array_send - - - i x f 0 -1 0 _null_ _null_ ));
/* FIXME: PostgreSQL's original JSON OIDs were 114 and 199 (see https://github.com/postgres/postgres/commit/5384a73f98d9829725186a7b65bf4f8adb3cfaf1#diff-8cd627dd576152b8b60cd08967931fff) */

This comment has been minimized.

@ascherbaum-pivotal

ascherbaum-pivotal Jun 16, 2016

Contributor

I'm ok leaving a comment in the code about the changed OID, but there is nothing we can fix here, right?

@@ -683,6 +689,12 @@ DATA(insert OID = 3311 ( _gpxlogloc PGNSP PGUID -1 f b t \054 0 3310 0 array_
*/
DATA(insert OID = 2249 ( record PGNSP PGUID -1 f p t \054 0 0 0 record_in record_out record_recv record_send - - - d x f 0 -1 0 _null_ _null_ ));
#define RECORDOID 2249
/* FIXME: RECORDARRAYOID's original oid was 2287, replaced by 3287

This comment has been minimized.

@ascherbaum-pivotal

ascherbaum-pivotal Jun 16, 2016

Contributor

I'm ok leaving a comment in the code about the changed OID, but there is nothing we can fix here, right?

@ascherbaum-pivotal

This comment has been minimized.

Contributor

ascherbaum-pivotal commented Jun 16, 2016

The FIXME need to be removed, but mostly there is nothing we can do here (like OIDs already allocated in GPDB). Everything else looks good to me.

@ahachete

This comment has been minimized.

ahachete commented Jun 16, 2016

Well, I left those FIXMEs there because they cannot be fixed by now. They require merging other patches that may be quite ahead in the future (compared to 8.3) and pulling all those in a chain may end up in an endless process. So I thought it would be better to leave that comment to signal that JSON patch backporting was not done completely. But of course it wouldn't be a problem removing them.

Also, as you say, the used OIDs cannot be changed. But it won't hurt to signal that so that future merges don't get very difficult when these OIDs show up again.

The idea is that when GPDB reaches 9.2, all those FIXMEs would be removed.

@ascherbaum-pivotal

This comment has been minimized.

Contributor

ascherbaum-pivotal commented Jun 16, 2016

The "FIXME" for the OIDs ca be removed, but you can leave the comments there (just without the FIXME - because there is nothing to do).

@danielgustafsson

This comment has been minimized.

Member

danielgustafsson commented Jun 17, 2016

I think this looks good and playing around with it it certainly does what it says on the tin. I will do some more testing and unless I find anything or anyone else chimes in against I will push this with a catversion bump etc.

@danielgustafsson

This comment has been minimized.

Member

danielgustafsson commented Jun 23, 2016

Having played around with it a bit more I'm 👍 on getting this in the tree with the EXPLAIN comment removed (as noted earlier in the discussion). @ascherbaum-pivotal will be the committer of this patch. Thanks for the hard work put into this, greatly appreciated!

Robert Haas and others added some commits Jan 31, 2012

Built-in JSON data type.
Like the XML data type, we simply store JSON data as text, after checking
that it is valid.  More complex operations such as canonicalization and
comparison may come later, but this is enough for not.

There are a few open issues here, such as whether we should attempt to
detect UTF-8 surrogate pairs represented as \uXXXX\uYYYY, but this gets
the basic framework in place.

Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
Date:               Mon Mar 14 17:35:20 UTC 2016

doc/src/sgml/datatype.sgml: changed ignored

src/backend/utils/adt/json.c: FIXME: s/errdetail_internal/errdetail/
as errdetail_internal is not supported (yet?) in GPDP.

src/backend/commands/explain.c: code from commit
5384a73 was not merged here as
ExplainStmt->options is not supported yet in GPDB. Code to be merged:
postgres/postgres@5384a73#diff-9ce3cf19c55c48d8714787cf9e3b533f

src/include/catalog/pg_type.h: original PostgreSQL's json and
_json OIDs where 114 and 199. Since 114 is reserved in GPDB, current
assigned OIDs are 3114 and 3199.
Add array_to_json and row_to_json functions.
Also move the escape_json function from explain.c to json.c where it
seems to belong.

Andrew Dunstan, Reviewd by Abhijit Menon-Sen.

(cherry-picked from 39909d1)

    Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
    Date:               Mon Jun 13 11:30:32 UTC 2016

doc/src/sgml/func.sgml:	not merged

src/include/catalog/pg_type.h: backported RECORDARRAYOID.
RECORDARRAYOID's original oid was 2287, replaced by 3287 due to conflict
in GPDB. Also enabled the TYPCATEGORY_* #defines, required by the JSON
code.

src/include/parser/parse_coerce.h: backported TYPCATEGORY typedef

src/backend/utils/adt/json.c: added a temporary function
TypeCategoryJsonGPDB() that emulates parse_coerce's TypeCategory() in
the way it works in PG 9.2. Should be deleted when
bac3e83 is backported.

src/test/regress/sql/json.sql: some tests deleted as there's no current
support for array_agg(RECORD).
Fix a couple of cases of JSON output.
First, as noted by Itagaki Takahiro, a datum of type JSON doesn't
need to be escaped. Second, ensure that numeric output not in
the form of a legal JSON number is quoted and escaped.

(cherry-picked from 83fcaff)

Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
Date:               Tue Jun 14 00:24:42 UTC 2016
Fix typo, noticed by Will Crawford.
(cherry-picked from 6b044cb)

Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
Date:               Tue Jun 14 00:36:00 UTC 2016
Correctly handle NULLs in JSON output.
Error reported by David Wheeler.

(cherry-picked from 0c9e5d5)

Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
Date:               Tue Jun 14 00:41:52 UTC 2016
Fix typo in comment
Haifeng Liu

(cherry-picked from 58e9f97)

Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
Date:               Tue Jun 14 00:43:48 UTC 2016
Fix incorrect logic in JSON number lexer
Detectable by gcc -Wlogical-op.

Add two regression test cases that would previously allow incorrect
values to pass.

(cherry-picked from f1f6737)

Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
Date:               Tue Jun 14 00:43:48 UTC 2016
Fix bogus handling of control characters in json_lex_string().
The original coding misbehaved if "char" is signed, and also made the
extremely poor decision to print control characters literally when trying
to complain about them.  Report and patch by Shigeru Hanada.

In passing, also fix core dump risk in report_parse_error() should the
parse state be something other than what it expects.

(cherry-picked from 3dd8e59)

Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
Date:               Tue Jun 14 00:59:18 UTC 2016
Make include files work without having to include other ones first
(cherry-picked from 8570114)

Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
Date:               Tue Jun 14 01:01:49 UTC 2016
Mark JSON error detail messages for translation.
Per gripe from Tom Lane.

(cherry-picked from 36b7e3d)

Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
Date:               Tue Jun 14 21:43:25 UTC 2016

src/backend/utils/adt/json.c: remove FIXME: comments that where
introduced backporting 5384a73,
which was using errdetail_internal vs errdetail. This patch now uses
errdetail, which is present and working in current GPDB.
Minor code review for json.c.
Improve commenting, conform to project style for use of ++ etc.
No functional changes.

(cherry-picked from f871ef7)

Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
Date:               Tue Jun 14 22:02:55 UTC 2016
Revisit error message details for JSON input parsing.
Instead of identifying error locations only by line number (which could
be entirely unhelpful with long input lines), provide a fragment of the
input text too, placing this info in a new CONTEXT entry.  Make the
error detail messages conform more closely to style guidelines, fix
failure to expose some of them for translation, ensure compiler can
check formats against supplied parameters.

(cherry-picked from 80edfd7)

Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
Date:               Tue Jun 14 22:14:05 UTC 2016
Remove inappropriate semicolons after function definitions.
Solaris Studio warns about this, and some compilers might think it's an
outright syntax error.

(cherry-picked from 4f1e0e4)

Author:             Álvaro Hernández Tortosa <aht@8kdata.com>
Date:               Tue Jun 14 22:38:28 UTC 2016

@ahachete ahachete force-pushed the ahachete:json branch from ae0f671 to 91b4e0c Jun 28, 2016

@ahachete

This comment has been minimized.

ahachete commented Jun 28, 2016

OK. Done. Pointed FIXME comments removed, also from commit message. Rebased with current master. Thank you @ascherbaum-pivotal and @danielgustafsson for the review. Let me know if there's anything else to be done.

@ahachete

This comment has been minimized.

ahachete commented Jul 28, 2016

Hey! How is it going? Is it finally going to be merged? ^_^

Thanks,

Álvaro

@hlinnaka

This comment has been minimized.

Member

hlinnaka commented Jul 28, 2016

On 07/28/2016 01:12 PM, Álvaro Hernández Tortosa wrote:

Hey! How is it going? Is it finally going to be merged? ^_^

Hi! Daniel's on vacation. I guess I should pick the ball on this one.

So, my biggest question is how exactly we want the commits to look like,
when they hit the tree? I'm kind of feeling that the trivial "fix typo"
commits could well be squashed into the preceding commits, it seems like
unnecessary noise, if you're browsing the git history. We do want to
keep some record that they were included, though, so perhaps include the
details in the commit message of the squashed commit. Then again, if you
want to later compare a commit with the corresponding upstream commit,
it might be nicer to have them separate. Also, how do we want the git
Author and Committer fields to look like in the git history?

For a complicated patch series like this, a Merge commit might actually
be best. I have argued strongly against merge commits before, they just
add noise when cherry-picking a single commit, or merging a messy
development where you won't care about all the "oops, forgot to git add
this file" style commits once it's pushed. But in this case, we do want
to keep the individual commits, but it'd also be nice to have a commit
message for the patch series as whole, that describes the overall effort.

So, right now, I'm leaning towards doing a "git merge" of this PR. Thoughts?

  • Heikki
@ahachete

This comment has been minimized.

ahachete commented Aug 1, 2016

I guess this is mostly a Pivotal decision :)

My very own opinion would be to leave the commits as they happened on the PG tree (that's what I tried to follow) since it will probably help future merge processes to know which commits were cherry-picked and which ones didn't.

Regarding the merge, makes sense. However, if you would prefer a ff, let me know if you need me to rebase it.

Thanks!

Álvaro

@hlinnaka hlinnaka merged commit 91b4e0c into greenplum-db:master Aug 1, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@hlinnaka

This comment has been minimized.

Member

hlinnaka commented Aug 1, 2016

Pushed, finally!

Many thanks for the effort, and especially for keeping a clear history of the commits that were cherry-picked and what the commit messages on the changes. I created the merge commit to bring those in, unchanged, so that it's easy to see that they belong together in the history.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment