@HyukjinKwon
Member

What changes were proposed in this pull request?

This PR proposes to restructure the User Guide in the PySpark documentation for pandas APIs on Spark.

Before

[Screenshot: User Guide before the restructuring]

After

[Screenshot: User Guide after the restructuring]

Note that I mostly just moved the contents around, except for these minor changes:

  • Removed some questions in the FAQ that don't make sense in Apache Spark
  • Renamed the subtitle "Working with pandas and PySpark" to "From/to pandas and PySpark DataFrames"

Renaming Koalas to either pandas-on-Spark or pandas APIs on Spark will be done in SPARK-35591.

Why are the changes needed?

For better readability.

Does this PR introduce any user-facing change?

Yes, it restructures the documentation as shown above.

How was this patch tested?

I built the docs manually and checked the result.

@HyukjinKwon
Member Author

cc @itholic @ueshin @xinrong-databricks FYI

@SparkQA

SparkQA commented Jun 8, 2021

Test build #139479 has finished for PR 32820 at commit 3fd1721.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44002/

@SparkQA

SparkQA commented Jun 8, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44002/

Working with pandas and PySpark
===============================
=====================================
From/to pandas and PySpark DataFrames
Member

From/to pandas and PySpark? From/to pandas/PySpark? Conversion from/to pandas and PySpark? DataFrames conversion from/to pandas and PySpark?
I just wanted to offer more options; the existing one looks good as well. :)

Member Author

lol. I think it's fine either way. I intentionally wrote PySpark and pandas DataFrames because we're already in the same PySpark.

@xinrong-meng
Member

LGTM! Thanks.

@HyukjinKwon
Member Author

Thanks @xinrong-databricks.

Merged to master.

@HyukjinKwon HyukjinKwon deleted the SPARK-35647 branch January 4, 2022 00:53