[SPARK-41444][CONNECT] Support read.json() #38975

Closed
wants to merge 1 commit into from

Conversation

amaliujia
Contributor

What changes were proposed in this pull request?

This PR adds support for the json() API in DataFrameReader. The API is built on top of the reader's core APIs (schema, load, option, etc.).

Loading data is the first step for users trying out Spark Connect, and JSON is one of the most popular file formats.
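
To illustrate what "built on top of the core API" can look like, here is a minimal sketch in plain Python. It does not use Spark and is not the code from this PR: `SketchReader`, its fields, and the dict it returns are hypothetical stand-ins, assuming only that `json()` delegates to `format`, `schema`, `option`, and `load`.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SketchReader:
    """Hypothetical stand-in for a DataFrameReader (not the Spark Connect class)."""

    _format: Optional[str] = None
    _schema: Optional[str] = None
    _options: Dict[str, str] = field(default_factory=dict)

    def format(self, source: str) -> "SketchReader":
        self._format = source
        return self

    def schema(self, schema: str) -> "SketchReader":
        self._schema = schema
        return self

    def option(self, key: str, value: object) -> "SketchReader":
        self._options[key] = str(value)
        return self

    def load(self, path: str) -> dict:
        # A real reader would build a logical plan; this sketch returns a dict.
        return {
            "format": self._format,
            "schema": self._schema,
            "options": dict(self._options),
            "path": path,
        }

    def json(self, path: str, schema: Optional[str] = None, **options: object) -> dict:
        # json() adds no new machinery: it only chains the core calls.
        if schema is not None:
            self.schema(schema)
        for key, value in options.items():
            if value is not None:
                self.option(key, value)
        return self.format("json").load(path)


# The path and schema string below are made-up example values.
plan = SketchReader().json(
    "/tmp/people.json", schema="name STRING, age INT", multiLine=True
)
print(plan)
```

With this shape, a convenience method such as `json()` stays a thin wrapper, so any behavior fixed in the core calls is inherited automatically.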

Why are the changes needed?

API coverage

Does this PR introduce any user-facing change?

NO

How was this patch tested?

UT

@amaliujia
Contributor Author

@zhengruifeng

@@ -121,6 +121,23 @@ def test_simple_read(self):
        # Check that the limit is applied
        self.assertEqual(len(data.index), 10)

    def test_json(self):
Member

It would be better to add a JIRA number here, but let me get this in first.

@HyukjinKwon
Member

Merged to master.

@HyukjinKwon changed the title from "[SPARK-41284][CONNECT] Support read.json()" to "[SPARK-41444][CONNECT] Support read.json()" on Dec 8, 2022
@zhengruifeng
Contributor

LGTM

beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 18, 2022
### What changes were proposed in this pull request?

This PR adds support for the `json()` API in DataFrameReader. The API is built on top of the reader's core APIs (schema, load, option, etc.).

Loading data is the first step for users trying out Spark Connect, and JSON is one of the most popular file formats.

### Why are the changes needed?

API coverage

### Does this PR introduce _any_ user-facing change?

NO

### How was this patch tested?

UT

Closes apache#38975 from amaliujia/have_json.

Authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>