[SPARK-35755][PYTHON][INFRA] Use higher PyArrow versions in GitHub Actions build#32906

Closed
HyukjinKwon wants to merge 1 commit into apache:master from HyukjinKwon:SPARK-35755

@HyukjinKwon
Member

What changes were proposed in this pull request?

This PR proposes to use higher versions of PyArrow in the GitHub Actions build, so that the tests run against versions that more users actually have installed.

Without this PR, the testing matrix is as follows:

  • (Python 3.8) Use PyArrow 2.x in pandas UDF tests on the SQL side
  • (Python 3.6) Use PyArrow 2.x in PySpark tests
  • (Python 3.9) Use PyArrow 4.x in PySpark tests (no change)
  • (Python 3.6) Use PyArrow 2.x in PySpark documentation generation (it runs Spark jobs to generate images to use in PySpark API docs)

After this PR, the testing matrix is as follows:

  • (Python 3.8) Use PyArrow 4.x in pandas UDF tests on the SQL side
  • (Python 3.6) Use PyArrow 3.x in PySpark tests
  • (Python 3.9) Use PyArrow 4.x in PySpark tests (no change)
  • (Python 3.6) Use PyArrow 4.x in PySpark documentation generation (it runs Spark jobs to generate images to use in PySpark API docs)
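
For illustration, the two matrices above can be written down as data and compared. This is only a minimal sketch, not the workflow change itself: the job labels, the helper names `changed_jobs` and `pip_spec`, and the `pyarrow==N.*` pin format are assumptions made for this example. The real pins live in the GitHub Actions configuration touched by this PR and may be expressed differently.

```python
# Illustrative only: job labels are shorthand for the bullets above,
# not identifiers taken from the actual workflow file.
BEFORE = {
    ("3.8", "pandas UDF tests (SQL side)"): 2,
    ("3.6", "PySpark tests"): 2,
    ("3.9", "PySpark tests"): 4,
    ("3.6", "PySpark documentation generation"): 2,
}

AFTER = {
    ("3.8", "pandas UDF tests (SQL side)"): 4,
    ("3.6", "PySpark tests"): 3,
    ("3.9", "PySpark tests"): 4,
    ("3.6", "PySpark documentation generation"): 4,
}


def changed_jobs(before, after):
    """Return {job: (old_major, new_major)} for jobs whose PyArrow major version differs."""
    return {
        job: (before[job], after[job])
        for job in before
        if before[job] != after[job]
    }


def pip_spec(major):
    """Hypothetical pin format: any release within one PyArrow major version."""
    return f"pyarrow=={major}.*"


if __name__ == "__main__":
    for (python, job), (old, new) in changed_jobs(BEFORE, AFTER).items():
        print(f"Python {python}, {job}: PyArrow {old}.x -> {new}.x ({pip_spec(new)})")
```

Running the script lists the three jobs whose PyArrow major version moves; the Python 3.9 job is omitted because it already uses PyArrow 4.x.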

Why are the changes needed?

To test against a matrix of PyArrow versions that more people use.

Does this PR introduce any user-facing change?

No, dev and testing only.

How was this patch tested?

The GitHub Actions build in this PR should exercise the change.

@github-actions bot added the INFRA label Jun 14, 2021
@HyukjinKwon
Member Author

FWIW, for branch-3.1, I think we're using:

  • (Python 3.8) Use PyArrow 2.x in pandas UDF tests on the SQL side
  • (Python 3.6) Use PyArrow 2.x in PySpark tests
  • (Python 3.8) Use PyArrow 2.x in PySpark tests

Maybe I should change it to something like the following:

  • (Python 3.8) Use PyArrow 4.x in pandas UDF tests on the SQL side
  • (Python 3.6) Use PyArrow 3.x in PySpark tests
  • (Python 3.8) Use PyArrow 2.x in PySpark tests

I will create another PR for that later.

@SparkQA
SparkQA commented Jun 14, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44300/

@SparkQA
SparkQA commented Jun 14, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44300/

@SparkQA
SparkQA commented Jun 14, 2021

Test build #139774 has finished for PR 32906 at commit e357f98.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@srowen left a comment

Sounds reasonable to me

@HyukjinKwon
Member Author

Merged to master.

@HyukjinKwon deleted the SPARK-35755 branch January 4, 2022 00:53
3 participants