[doc] Integrate pyspark module into sphinx doc [skip ci] #8066

Merged
merged 1 commit into from
Jul 17, 2022
3 changes: 2 additions & 1 deletion doc/conf.py
@@ -207,10 +207,11 @@
"python": ("https://docs.python.org/3.6", None),
"numpy": ("https://docs.scipy.org/doc/numpy/", None),
"scipy": ("https://docs.scipy.org/doc/scipy/reference/", None),
-"pandas": ("http://pandas-docs.github.io/pandas-docs-travis/", None),
+"pandas": ("https://pandas.pydata.org/pandas-docs/stable/", None),
"sklearn": ("https://scikit-learn.org/stable", None),
"dask": ("https://docs.dask.org/en/stable/", None),
"distributed": ("https://distributed.dask.org/en/stable/", None),
+"pyspark": ("https://spark.apache.org/docs/latest/api/python/", None),
}
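
For context on the mapping above: each entry pairs a project name with a base URL, and when the second tuple element is `None`, Sphinx looks for the inventory file `objects.inv` directly under that base URL. The following is a minimal sketch of that lookup rule using only the standard library. The helper name `inventory_url` is hypothetical and is not part of Sphinx or of this PR.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Subset of the intersphinx mapping from doc/conf.py after this change.
intersphinx_mapping = {
    "pandas": ("https://pandas.pydata.org/pandas-docs/stable/", None),
    "pyspark": ("https://spark.apache.org/docs/latest/api/python/", None),
}


def inventory_url(base_url, inventory=None):
    """Hypothetical helper: return where the inventory file is expected."""
    if inventory is not None:
        # An explicit inventory location overrides the default.
        return inventory
    parts = urlsplit(base_url)
    # Default: objects.inv sits directly under the base URL path.
    path = posixpath.join(parts.path, "objects.inv")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


for name, (base, inv) in intersphinx_mapping.items():
    print(name, "->", inventory_url(base, inv))
```

If the inventory cannot be fetched at build time, references such as `pyspark.ml.tuning.CrossValidator` simply stay unresolved, so checking the computed URL in a browser is a quick sanity test.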


26 changes: 26 additions & 0 deletions doc/python/python_api.rst
@@ -147,3 +147,29 @@ Dask API
:members:
:inherited-members:
:show-inheritance:
+
+
+PySpark API
+-----------
+
+.. automodule:: xgboost.spark
+
+.. autoclass:: xgboost.spark.SparkXGBClassifier
+:members:
+:inherited-members:
+:show-inheritance:
+
+.. autoclass:: xgboost.spark.SparkXGBClassifierModel
+:members:
+:inherited-members:
+:show-inheritance:
+
+.. autoclass:: xgboost.spark.SparkXGBRegressor
+:members:
+:inherited-members:
+:show-inheritance:
+
+.. autoclass:: xgboost.spark.SparkXGBRegressorModel
+:members:
+:inherited-members:
+:show-inheritance:
4 changes: 3 additions & 1 deletion doc/requirements.txt
@@ -9,4 +9,6 @@ graphviz
numpy
recommonmark
xgboost_ray
-sphinx-gallery
+sphinx-gallery
+pyspark
+cloudpickle
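
A plausible reason for the two new requirements is that the `automodule` and `autoclass` directives import the documented module at build time, so `xgboost.spark` and the packages it imports must be installed in the docs environment. That motivation is an assumption; the PR does not state it. A small pre-build check could look like the sketch below, where `missing_modules` is a hypothetical helper name.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two packages this PR adds to doc/requirements.txt.
missing = missing_modules(["pyspark", "cloudpickle"])
if missing:
    print("install before building docs:", ", ".join(missing))
else:
    print("all doc build dependencies found")
```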
27 changes: 16 additions & 11 deletions python-package/xgboost/spark/estimator.py
@@ -15,12 +15,13 @@ class SparkXGBRegressor(_SparkXGBEstimator):
"""
SparkXGBRegressor is a PySpark ML estimator. It implements the XGBoost regression
algorithm based on XGBoost python library, and it can be used in PySpark Pipeline
-and PySpark ML meta algorithms like CrossValidator/TrainValidationSplit/OneVsRest.
+and PySpark ML meta algorithms like :py:class:`~pyspark.ml.tuning.CrossValidator`/
+:py:class:`~pyspark.ml.tuning.TrainValidationSplit`/
+:py:class:`~pyspark.ml.classification.OneVsRest`

SparkXGBRegressor automatically supports most of the parameters in
`xgboost.XGBRegressor` constructor and most of the parameters used in
-`xgboost.XGBRegressor` fit and predict method (see `API docs <https://xgboost.readthedocs\
-.io/en/latest/python/python_api.html#xgboost.XGBRegressor>`_ for details).
+:py:class:`xgboost.XGBRegressor` fit and predict method.

SparkXGBRegressor doesn't support setting `gpu_id` but support another param `use_gpu`,
see doc below for more details.
@@ -65,7 +66,8 @@ class SparkXGBRegressor(_SparkXGBEstimator):

.. Note:: This API is experimental.

-**Examples**
+Examples
+--------

>>> from xgboost.spark import SparkXGBRegressor
>>> from pyspark.ml.linalg import Vectors
@@ -104,15 +106,16 @@ def _pyspark_model_cls(cls):


class SparkXGBClassifier(_SparkXGBEstimator, HasProbabilityCol, HasRawPredictionCol):
-"""
-SparkXGBClassifier is a PySpark ML estimator. It implements the XGBoost classification
-algorithm based on XGBoost python library, and it can be used in PySpark Pipeline
-and PySpark ML meta algorithms like CrossValidator/TrainValidationSplit/OneVsRest.
+"""SparkXGBClassifier is a PySpark ML estimator. It implements the XGBoost
+classification algorithm based on XGBoost python library, and it can be used in
+PySpark Pipeline and PySpark ML meta algorithms like
+:py:class:`~pyspark.ml.tuning.CrossValidator`/
+:py:class:`~pyspark.ml.tuning.TrainValidationSplit`/
+:py:class:`~pyspark.ml.classification.OneVsRest`

SparkXGBClassifier automatically supports most of the parameters in
`xgboost.XGBClassifier` constructor and most of the parameters used in
-`xgboost.XGBClassifier` fit and predict method (see `API docs <https://xgboost.readthedocs\
-.io/en/latest/python/python_api.html#xgboost.XGBClassifier>`_ for details).
+:py:class:`xgboost.XGBClassifier` fit and predict method.

SparkXGBClassifier doesn't support setting `gpu_id` but support another param `use_gpu`,
see doc below for more details.
@@ -127,6 +130,7 @@ class SparkXGBClassifier(_SparkXGBEstimator, HasProbabilityCol, HasRawPrediction

Parameters
----------
+
callbacks:
The export and import of the callback functions are at best effort. For
details, see :py:attr:`xgboost.spark.SparkXGBClassifier.callbacks` param doc.
@@ -166,7 +170,8 @@ class SparkXGBClassifier(_SparkXGBEstimator, HasProbabilityCol, HasRawPrediction

.. Note:: This API is experimental.

-**Examples**
+Examples
+--------

>>> from xgboost.spark import SparkXGBClassifier
>>> from pyspark.ml.linalg import Vectors
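
The docstring changes in estimator.py replace plain class names with `:py:class:` roles. In Sphinx's Python domain, a leading `~` in the role target keeps the link pointing at the full dotted path but displays only the last component, which is why `CrossValidator` still renders as a short name. The toy sketch below illustrates that display rule only; `role_display_text` is a hypothetical function and does not call Sphinx.

```python
def role_display_text(target):
    """Return (link_target, display_text) for a Python-domain role target."""
    if target.startswith("~"):
        full = target[1:]
        # With "~", only the final dotted component is shown as link text.
        return full, full.rsplit(".", 1)[-1]
    # Without "~", the full dotted path is shown.
    return target, target


for t in ("~pyspark.ml.tuning.CrossValidator", "xgboost.XGBRegressor"):
    print(role_display_text(t))
```

The `pyspark.*` targets resolve to external pages only through the `"pyspark"` intersphinx entry added to doc/conf.py in this same PR.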