[SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

### What changes were proposed in this pull request?

This PR proposes to add plotly, pyarrow and pandas dependencies for generating the API documentation for pandas API on Spark.

The pinned versions, `pandas==1.1.5 pyarrow==3.0.0 plotly==5.4.0`, match the versions currently used in branch-3.2 with Python 3.6.

### Why are the changes needed?

Currently, the function references for pandas API on Spark are all missing (see https://spark.apache.org/docs/latest/api/python/reference/pyspark.pandas/series.html) because these dependencies are not installed when the docs are built.

### Does this PR introduce _any_ user-facing change?

Yes. The broken documentation links at https://spark.apache.org/docs/latest/api/python/reference/pyspark.pandas/series.html will all be restored.

### How was this patch tested?

To be honest, it has not been tested. I don't have the nerve to run the Docker release script just for the sake of testing, so I defer to the next release manager.

This combination of dependency versions is being tested in GitHub Actions on `branch-3.2`.
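
Since the image itself was not built for this patch, here is a minimal sketch of how the three pins could be checked inside a Python environment. It is not part of this change. The helper names `parse_pins` and `find_mismatches` are hypothetical, and the sketch assumes Python 3.8 or later for `importlib.metadata` (on Python 3.6, the `importlib_metadata` backport would be needed instead).

```python
from importlib import metadata

# Pins copied from this commit message.
PINS = "pandas==1.1.5 pyarrow==3.0.0 plotly==5.4.0"


def parse_pins(spec):
    """Turn 'name==version name==version ...' into {name: version}."""
    return dict(item.split("==", 1) for item in spec.split())


def find_mismatches(pins, installed_version=metadata.version):
    """Return {name: (expected, found)} for packages that are missing or differ.

    `found` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Hypothetical usage: report every pin the current environment does not satisfy.
    for name, (expected, found) in find_mismatches(parse_pins(PINS)).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result would suggest that the environment matches the pins. It would not prove that the docs build succeeds.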

Closes #34813 from HyukjinKwon/SPARK-37554.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 03750c0)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
HyukjinKwon authored and dongjoon-hyun committed Jul 11, 2022
1 parent 9dd4c07 commit 001d8b0
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion dev/create-release/spark-rm/Dockerfile
@@ -42,7 +42,7 @@ ARG APT_INSTALL="apt-get install --no-install-recommends -y"
# We should use the latest Sphinx version once this is fixed.
# TODO(SPARK-35375): Jinja2 3.0.0+ causes error when building with Sphinx.
# See also https://issues.apache.org/jira/browse/SPARK-35375.
-ARG PIP_PKGS="sphinx==3.0.4 mkdocs==1.1.2 numpy==1.19.4 pydata_sphinx_theme==0.4.1 ipython==7.19.0 nbsphinx==0.8.0 numpydoc==1.1.0 jinja2==2.11.3 twine==3.4.1 sphinx-plotly-directive==0.1.3"
+ARG PIP_PKGS="sphinx==3.0.4 mkdocs==1.1.2 numpy==1.19.4 pydata_sphinx_theme==0.4.1 ipython==7.19.0 nbsphinx==0.8.0 numpydoc==1.1.0 jinja2==2.11.3 twine==3.4.1 sphinx-plotly-directive==0.1.3 pandas==1.1.5 pyarrow==3.0.0 plotly==5.4.0"
ARG GEM_PKGS="bundler:2.2.9"

# Install extra needed repos and refresh.
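
The two `PIP_PKGS` values in the diff above are long and differ only at the end. The sketch below is one way to make the difference explicit. The strings are copied from the diff; the helper name `added_pins` is hypothetical and is not part of the Spark repository.

```python
# PIP_PKGS value before the change, copied from the diff.
OLD = (
    "sphinx==3.0.4 mkdocs==1.1.2 numpy==1.19.4 pydata_sphinx_theme==0.4.1 "
    "ipython==7.19.0 nbsphinx==0.8.0 numpydoc==1.1.0 jinja2==2.11.3 "
    "twine==3.4.1 sphinx-plotly-directive==0.1.3"
)

# PIP_PKGS value after the change, copied from the diff.
NEW = (
    "sphinx==3.0.4 mkdocs==1.1.2 numpy==1.19.4 pydata_sphinx_theme==0.4.1 "
    "ipython==7.19.0 nbsphinx==0.8.0 numpydoc==1.1.0 jinja2==2.11.3 "
    "twine==3.4.1 sphinx-plotly-directive==0.1.3 "
    "pandas==1.1.5 pyarrow==3.0.0 plotly==5.4.0"
)


def added_pins(old, new):
    """Return the pins present in `new` but not in `old`, in their original order."""
    old_set = set(old.split())
    return [pin for pin in new.split() if pin not in old_set]


print(added_pins(OLD, NEW))  # → ['pandas==1.1.5', 'pyarrow==3.0.0', 'plotly==5.4.0']
print(added_pins(NEW, OLD))  # → [] (nothing was removed or re-pinned)
```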