Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHOENIX-628 Support native JSON data type #1780

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

ranganathg
Copy link

@ranganathg ranganathg commented Jan 3, 2024

@stoty
Copy link
Contributor

stoty commented Jan 3, 2024

Can you clean up the commit message ?

Use of colons, dashes, etc is inconsitent.

@ranganathg ranganathg changed the title PHOENIX-628 JSON support in Phoenix PHOENIX-628 Support native JSON data type Jan 3, 2024
@stoty
Copy link
Contributor

stoty commented Jan 3, 2024

The commit message doesn't match the JIRA ticket description

@ranganathg
Copy link
Author

ranganathg commented Jan 3, 2024

The commit message doesn't match the JIRA ticket description

Can it be different from the JIRA description? I think the description has a recommendation - can I add the message as "Adding JSON native datatype and JSON functions" - That would be more relevant to the commit instead of the JIRA description?

@stoty
Copy link
Contributor

stoty commented Jan 3, 2024

The convention is to use the JIRA description for the first line of the commit message.

If you think that the JIRA description is not sufficient, you may want to change the JIRA ticket description, just keep it in sync with the commit message.

@virajjasani
Copy link
Contributor

It's good to continue the review, however just wanted to bring up this discuss thread: Discuss thread: https://lists.apache.org/thread/xhwqfwytlklylt6kwqm1botsxr7dd4zb

I am hopeful that in a couple of weeks, we should be well positioned to get this PR move forward, but review can definitely continue in parallel.
Thank you @ranganathg!

@ranganathg ranganathg force-pushed the PHOENIX-628-feature_2 branch 4 times, most recently from cfca694 to 99be560 Compare January 9, 2024 02:50
@virajjasani
Copy link
Contributor

Re-triggered after fixing build failure on master branch: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1780/16/

Copy link
Contributor

@virajjasani virajjasani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using Upsert Select at server side plan, if we can set the JSON_MODIFY expressions (serialized) with Mutation and let the server evaluate the expression by reading the row, that would be great. Otherwise, with UPSERT SELECT plan, we loose the capability to return atomic upsert status code as a result of statement#executeUpdate.

If auto-commit is on, we would like client to send the mutations to server side. If the auto-commit is off, we can accummulate mutations and let server perform them in batches?

Copy link
Contributor

@virajjasani virajjasani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RawBsonDocument extends BsonDocument, I am not sure if we need to use BsonBinaryReader just to convert RawBsonDocument to BsonDocument.

public ByteBuffer updateValue(Object top, String jsonPathExprStr, String newVal) {
Configuration conf = Configuration.builder().jsonProvider(new BsonJsonProvider()).build();
BsonValue newValue = JsonPath.using(conf).parse(newVal).json();
BsonDocument root = fromRaw((RawBsonDocument) top);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use fromRaw() here which uses BsonBinaryReader and then decode RawBsonDocument to BsonDocument?
Can we use this directly? It seems safe type cast.

BsonDocument root = ((RawBsonDocument) top).asDocument()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason we use RawBSONDocument is because we want to do lazy parsing of bson bytes. Once we have the byte buffer representing json if we call asDocument it will parse the entire byte buffer and construct a BsonDocument. But then you have lost the efficiency gains. We only want to parse the byte[] which is queried in the jsonpath expression leaving the rest of the bytes uninterpreted. By uninterpreted I mean we don't parse those bytes but just read them into memory into the buffer. This is where the performance gains come from. If you parse everything you lose that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

@ranganathg could you document this as comments for fromRaw()?
fromRaw() has only one comment // Transform to an in memory BsonDocument instance as of now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw the definition of asDocument()

  /**
     * Gets this value as a BsonDocument if it is one, otherwise throws exception
     *
     * @return a BsonDocument
     * @throws org.bson.BsonInvalidOperationException if this value is not of the expected type
     */
    public BsonDocument asDocument() {
        throwIfInvalidType(BsonType.DOCUMENT);
        return (BsonDocument) this;
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that, asDocument() just does the typecast, so even if we use it, the underlying object is still of Raw type, meaning we will not be able to update the document. Hence, we need to create updatable BsonDocument object using BsonBinaryReader.

public boolean isPathValid(Object top, String path) {
try{
Configuration conf = Configuration.builder().jsonProvider(new BsonJsonProvider()).build();
BsonDocument root = fromRaw((RawBsonDocument) top);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, can we just type cast BsonDocument root = ((RawBsonDocument) top).asDocument() ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants