Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce GetJsonObjectOptions in getJSONObject Java API #14956

Merged

Conversation

SurajAralihalli
Copy link
Contributor

@SurajAralihalli SurajAralihalli commented Feb 2, 2024

Description

Addresses 10219

This PR introduces a new class named GetJsonObjectOptions that holds the configurations to control the behavior of the underlying cudf::get_json_object function. It incorporates this new class into the getJSONObject JAVA API as an additional argument but also keeps the previous API to maintain backwards compatibility. It also includes a test case, testGetJSONObjectWithSingleQuotes, validating the behavior of getJSONObject when single quotes are enabled.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added the Java Affects Java cuDF API. label Feb 2, 2024
@SurajAralihalli SurajAralihalli added the 2 - In Progress Currently a work in progress label Feb 2, 2024
@SurajAralihalli SurajAralihalli changed the title [WIP] Introduce GetJsonObjectOptions to address Single Quotes issue [WIP] Introduce GetJsonObjectOptions in getJSONObject Java API Feb 2, 2024
@SurajAralihalli SurajAralihalli changed the title [WIP] Introduce GetJsonObjectOptions in getJSONObject Java API Introduce GetJsonObjectOptions in getJSONObject Java API Feb 2, 2024
@SurajAralihalli SurajAralihalli marked this pull request as ready for review February 2, 2024 18:49
@SurajAralihalli SurajAralihalli requested a review from a team as a code owner February 2, 2024 18:49
@SurajAralihalli SurajAralihalli added Spark Functionality that helps Spark RAPIDS and removed 2 - In Progress Currently a work in progress labels Feb 2, 2024
Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great

@karthikeyann karthikeyann added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Feb 7, 2024
Copy link
Contributor

@karthikeyann karthikeyann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Just curious about the design choices used here.
The Java public interface has aGetJsonObjectOptions.
then JNI.cpp interface uses 3 booleans.
libcudf uses get_json_object_options (constructed from these 3 booleans)

The data flows as (Java object) -> 3 booleans in function arguments -> (C++ object)

Is this a limitation of JNI?
Can we directly use (Java object) or (C++ object) in the function argument?

@revans2
Copy link
Contributor

revans2 commented Feb 7, 2024

@karthikeyann it really is about JNI limitations. We have a few choices for what we could do.

We could have a java class that creates an instance of a C++ object and holds a reference to it. Then anything that interacts with the object needs to call into JNI to change things. It also has the possibility of being leaked if the java code does not close it, or if we don't put in a finalizer to make sure that the data is released when GC happens. Neither is very user friendly from a java standpoint.

We could have a pure java class that holds the values, and we could pass an instance of that class into JNI. But it is very slow to call back into java from JNI. So if we know that all of the values are going to be used, we typically just pull it apart in java and send them down as separate parameters so the performance is not too bad.

Signed-off-by: Suraj Aralihalli <suraj.ara16@gmail.com>
Signed-off-by: Suraj Aralihalli <suraj.ara16@gmail.com>
@revans2
Copy link
Contributor

revans2 commented Feb 12, 2024

/merge

@rapids-bot rapids-bot bot merged commit 49c2995 into rapidsai:branch-24.04 Feb 12, 2024
69 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function Java Affects Java cuDF API. non-breaking Non-breaking change Spark Functionality that helps Spark RAPIDS
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

4 participants