DOC Add example showcasing HGBT regression #26991

Merged 67 commits on Feb 22, 2024
Changes from 13 commits

Commits
4240074
DOC Add example showcasing HGBT regression
Aug 2, 2023
9728566
Replace the landing-page figure
Aug 2, 2023
7842e6d
Several tweaks
Aug 2, 2023
f5ac584
Wording
Aug 2, 2023
353329d
Add cross-links from other examples
Aug 2, 2023
1d56abd
Use dictionary to define monotonic_cst
Aug 2, 2023
78eda9d
Merge branch 'main' into hgbt_new_example
ArturoAmorQ Aug 2, 2023
ff89b7c
Add cross-links in the documentation
Aug 3, 2023
543d280
Change title
Aug 3, 2023
1550069
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
Sep 7, 2023
b77ab5c
Apply suggestions from code review
ArturoAmorQ Sep 7, 2023
0126963
Merge branch 'hgbt_new_example' of github.com:ArturoAmorQ/scikit-lear…
Sep 7, 2023
4689b0f
Iter on suggestions from code-review
Sep 7, 2023
2685f9b
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
Oct 3, 2023
86f8f67
Remove comment that will no longer be true in v1.4
Oct 3, 2023
35c065a
Address comment from Christian on calibration
Oct 3, 2023
c3e01fc
Address comment from Christian on bias
Oct 3, 2023
093b8dd
Apply suggestions from code review
ArturoAmorQ Oct 4, 2023
ff2888f
Iter on suggestions
Oct 4, 2023
7471959
Silence warning from DataFrame.groupby
Oct 6, 2023
9a486b8
Add discussion on early stopping
Oct 6, 2023
822f3db
Wording
Oct 6, 2023
97cf642
Rename instances of hgbt
Oct 6, 2023
60d8f61
Remove distinction on type of missingness
Oct 6, 2023
8799932
Apply suggestions from code review
ArturoAmorQ Nov 2, 2023
e89c942
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
Nov 2, 2023
c3c883c
Use numbered list
Nov 2, 2023
9ddaf6f
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
Nov 9, 2023
26ddf3b
Prefer lineplot instead of pairplot
Nov 9, 2023
4d70038
Prefer sample over example
Nov 9, 2023
5b0dcfd
Remove stepwise constant piece of dataset
Nov 10, 2023
29146ae
Plot predictions on unseen data
Nov 10, 2023
25978ae
Refactor code
Nov 13, 2023
16a19b1
Use train set for determining max_iter
Nov 13, 2023
70c021f
Use test set for plots and add generate_missing_values function
Nov 13, 2023
5cf52c2
Reference the problem of coverage
Nov 13, 2023
a8f68dd
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
Nov 13, 2023
214a083
Fix typo
Nov 13, 2023
64ff629
Apply suggestions from code review
ArturoAmorQ Nov 14, 2023
604283e
Prefer ax instead of plt to plot
Nov 14, 2023
11d165c
Add brief interpretation of plot
Nov 14, 2023
3abb0c4
Revert use of numbered list
Nov 21, 2023
13d7a4c
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
Nov 21, 2023
6142af3
Merge branch 'main' into hgbt_new_example
ArturoAmorQ Dec 11, 2023
7c84068
Apply suggestions from code review
ArturoAmorQ Jan 6, 2024
5db0094
Merge branch 'main' into hgbt_new_example
ArturoAmorQ Jan 6, 2024
dcdf851
Lint
Jan 6, 2024
7bfc635
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
Jan 16, 2024
ab0e21a
Fix FutureWarning
Jan 16, 2024
c4d1b3b
List of features as suggested by Christian
Jan 16, 2024
49587ab
Simplify code
ArturoAmorQ Jan 17, 2024
42c1742
Print simple stats
Jan 17, 2024
37bb831
Fix indentation
Jan 17, 2024
d1b809a
Use programmatic way to round up n_iter
Jan 17, 2024
5b18755
Set random state for deterministic results
Jan 17, 2024
9499e61
Add explanation on time-aware cross validation
Jan 17, 2024
3b1789e
Add comment on overcronstraining feature
Jan 17, 2024
d972fae
Apply suggestion from Guillaume
ArturoAmorQ Jan 23, 2024
9f49ad5
Update examples/ensemble/plot_adaboost_regression.py
ArturoAmorQ Jan 23, 2024
d333c6d
Format
Jan 23, 2024
593c5fb
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
Jan 23, 2024
d46cd41
Merge branch 'main' into hgbt_new_example
ArturoAmorQ Feb 14, 2024
c4a79e6
Update examples/ensemble/plot_hgbt_regression.py
ArturoAmorQ Feb 19, 2024
1010ecc
Fix random_state
Feb 19, 2024
31db489
Wording as suggested by Guillaume
Feb 19, 2024
5e19545
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
Feb 19, 2024
ea70999
Merge branch 'hgbt_new_example' of github.com:ArturoAmorQ/scikit-lear…
Feb 19, 2024
11 changes: 9 additions & 2 deletions doc/modules/ensemble.rst
@@ -80,7 +80,8 @@ are not yet supported, for instance some loss functions.

.. topic:: Examples:

-* :ref:`sphx_glr_auto_examples_inspection_plot_partial_dependence.py`
+* :ref:`sphx_glr_auto_examples_inspection_plot_partial_dependence.py`
+* :ref:`sphx_glr_auto_examples_ensemble_plot_forest_hist_grad_boosting_comparison.py`

Usage
^^^^^
@@ -129,6 +130,8 @@ Note that for technical reasons, using a scorer is significantly slower than
using the loss. By default, early-stopping is performed if there are at least
10,000 samples in the training set, and uses the validation loss.

+.. _nan_support_hgbt:

Missing values support
^^^^^^^^^^^^^^^^^^^^^^

@@ -167,6 +170,10 @@ If no missing values were encountered for a given feature during training,
then samples with missing values are mapped to whichever child has the most
samples.

+.. topic:: Examples:
+
+* :ref:`sphx_glr_auto_examples_ensemble_plot_hgbt_regression.py`

.. _sw_hgbdt:

Sample weight support
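
Reviewer note on the missing-values hunk above: the fallback rule it describes (no missing values seen for a feature during training, so samples with missing values follow the larger child) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not scikit-learn's implementation: `Split`, `goes_left`, and every field name are hypothetical, and the tie-breaking choice (left child on equal counts) is an assumption as well. When missing values were seen during training, the sketch just follows a stored direction that stands in for whatever the tree grower learned.

```python
import math
from dataclasses import dataclass


@dataclass
class Split:
    """Hypothetical stand-in for one tree node's split (illustration only)."""

    threshold: float
    n_left: int  # training samples routed to the left child
    n_right: int  # training samples routed to the right child
    saw_missing_in_training: bool
    missing_go_left: bool = False  # only meaningful if NaNs were seen in training


def goes_left(split: Split, value: float) -> bool:
    """Decide which child a single feature value is routed to."""
    if math.isnan(value):
        if split.saw_missing_in_training:
            # Follow the direction stored for missing values at this split.
            return split.missing_go_left
        # Fallback from the text: send NaN to the child with the most samples.
        # Assumption: ties go to the left child.
        return split.n_left >= split.n_right
    return value <= split.threshold


split = Split(threshold=1.0, n_left=30, n_right=70, saw_missing_in_training=False)
print(goes_left(split, float("nan")))  # NaN follows the larger (right) child
print(goes_left(split, 0.5))  # regular values are compared to the threshold
```

With real estimators none of this is written by hand: `HistGradientBoostingRegressor` accepts `NaN` entries in `X` directly.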
@@ -315,6 +322,7 @@ Also, monotonic constraints are not supported for multiclass classification.
.. topic:: Examples:

* :ref:`sphx_glr_auto_examples_ensemble_plot_monotonic_constraints.py`
+* :ref:`sphx_glr_auto_examples_ensemble_plot_hgbt_regression.py`

.. _interaction_cst_hgbt:

@@ -1634,4 +1642,3 @@ minimum required number of samples to consider a split ``min_samples_split``).

.. [HTF] T. Hastie, R. Tibshirani and J. Friedman, "Elements of
Statistical Learning Ed. 2", Springer, 2009.

4 changes: 2 additions & 2 deletions doc/templates/index.html
@@ -70,8 +70,8 @@ <h4 class="sk-landing-subheader text-white font-italic mb-3">Machine Learning in
and <a href="supervised_learning.html#supervised-learning">more...</a></p>
</div>
<div class="overflow-hidden mx-2 text-center flex-fill">
-<a href="auto_examples/ensemble/plot_adaboost_regression.html" aria-label="Regression">
-<img src="_images/sphx_glr_plot_adaboost_regression_thumb.png" class="sk-index-img" alt="Decision Tree Regression with AdaBoost">
+<a href="auto_examples/ensemble/plot_hgbt_regression.html" aria-label="Regression">
+<img src="_images/sphx_glr_plot_hgbt_regression_002.png" class="sk-index-img" alt="Decision Tree Regression with HGBT">
</a>
</div>
<a href="auto_examples/index.html#examples" class="sk-btn-primary btn text-white btn-block" role="button">Examples</a>
4 changes: 4 additions & 0 deletions examples/ensemble/plot_adaboost_regression.py
@@ -9,6 +9,10 @@
regressor. As the number of boosts is increased the regressor can fit more
detail.

+See :ref:`sphx_glr_auto_examples_ensemble_plot_hgbt_regression.py` for an
+example showcasing the benefits of using more robust regressions such as
+:class:`~ensemble.HistGradientBoostingRegressor`.

.. [1] `H. Drucker, "Improving Regressors using Boosting Techniques", 1997.
<https://citeseerx.ist.psu.edu/doc_view/pid/8d49e2dedb817f2c3330e74b63c5fc86d2399ce3>`_

@@ -22,7 +22,9 @@
the predicted value. RFs, on the other hand, are based on bagging and use a
majority vote to predict the outcome.

-For more information on ensemble models, see the :ref:`User Guide <ensemble>`.
+See the :ref:`User Guide <ensemble>` for more information on ensemble models or
+see :ref:`sphx_glr_auto_examples_ensemble_plot_hgbt_regression.py` for an
+example showcasing some other features of HGBT models.
"""

# Author: Arturo Amor <david-arturo.amor-quiroz@inria.fr>
4 changes: 4 additions & 0 deletions examples/ensemble/plot_gradient_boosting_categorical.py
@@ -21,6 +21,10 @@
We will work with the Ames Iowa Housing dataset which consists of numerical
and categorical features, where the houses' sales prices is the target.

+See :ref:`sphx_glr_auto_examples_ensemble_plot_hgbt_regression.py` for an
+example showcasing some other features of
+:class:`~ensemble.HistGradientBoostingRegressor`.

"""

# %%
4 changes: 3 additions & 1 deletion examples/ensemble/plot_gradient_boosting_quantile.py
@@ -4,7 +4,9 @@
=====================================================

This example shows how quantile regression can be used to create prediction
-intervals.
+intervals. See :ref:`sphx_glr_auto_examples_ensemble_plot_hgbt_regression.py`
+for an example showcasing some other features of
+:class:`~ensemble.HistGradientBoostingRegressor`.

"""

Expand Down
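
Reviewer note on the quantile example docstring above: prediction intervals from quantile regression rest on the pinball loss, whose minimizer is the requested quantile. The sketch below uses only the standard library and constant predictions to make that concrete. It is an illustration under stated assumptions, not the example's code: `pinball_loss` and `best_constant` are hypothetical helper names, and the toy targets are made up.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball (quantile) loss; hypothetical helper for illustration."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-predictions are weighted by `quantile`,
        # over-predictions by `1 - quantile`.
        total += quantile * diff if diff >= 0 else (quantile - 1) * diff
    return total / len(y_true)


def best_constant(y_true, quantile):
    """Return the observed value that minimizes the loss as a constant prediction."""
    return min(
        sorted(set(y_true)),
        key=lambda c: pinball_loss(y_true, [c] * len(y_true), quantile),
    )


y = list(range(1, 100))  # toy targets 1..99 (made-up data)
lower = best_constant(y, 0.05)  # estimate of the 5th percentile
upper = best_constant(y, 0.95)  # estimate of the 95th percentile
coverage = sum(lower <= v <= upper for v in y) / len(y)
print(lower, upper, round(coverage, 3))
```

On this toy data the two constants come out as 5 and 95, so the interval covers 91 of the 99 targets, close to the nominal 90%. A model trained with a quantile loss does the same thing conditionally on the features instead of with a single constant.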
5 changes: 4 additions & 1 deletion examples/ensemble/plot_gradient_boosting_regression.py
@@ -11,7 +11,10 @@
and 500 regression trees of depth 4.

Note: For larger datasets (n_samples >= 10000), please refer to
-:class:`~sklearn.ensemble.HistGradientBoostingRegressor`.
+:class:`~sklearn.ensemble.HistGradientBoostingRegressor`. See
+:ref:`sphx_glr_auto_examples_ensemble_plot_hgbt_regression.py` for an example
+showcasing some other advantages of
+:class:`~ensemble.HistGradientBoostingRegressor`.

"""
