Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: automatically test all notebooks under examples/notebooks and render them in docs page #756

Merged
merged 19 commits into from
Sep 18, 2023

Conversation

wesk
Copy link
Collaborator

@wesk wesk commented Sep 6, 2023

This PR:

  1. Adds automation to test notebooks with every pull request
  2. Adds automation to render notebooks in our documentation website
  3. Adds a new example notebook

More detail on the three aspects below.

Automatically test notebooks with every pull request

To add a new Jupyter (.ipynb) file to the repository, simply add it to the examples/notebooks folder.

Every notebook in the examples/notebooks folder will be converted to a python script and executed by a GitHub Actions workflow.

If you don't want specific cells in a notebook to be run as part of our automatic tests, tag them with the ci-skip-cell tag, and the CI will not execute the cell. The reason this feature exists is because some cells that are used to install dependencies, such as !pip install xgboost do not convert easily into directly executable scripts. The first code cell in the XGBoost notebook is the motivating example for this feature.

Automatically render notebooks in Syne Tune documentation

The idea with this PR is to keep a single source of truth: examples/notebooks as the location where all example notebooks are kept.

To have your notebook show up in documentation, first add it to the examples/notebooks folder, and then update the following syntax inside the examples_toc.rst page to make sure that your notebook is included in the documentation.

Example showing adding the xgboost notebook:

Examples
========

.. toctree::
   :name: Example Notebooks
   :caption: Example Notebooks
   :maxdepth: 1

   notebooks/xgboost_example.ipynb


.. toctree::
   :name: Example Tuning Scripts
   :caption: Example Tuning Scripts
   :maxdepth: 1

   examples

Before you add a new notebook, be sure to clear the cell outputs, unless you also want them to render in the resulting documentation.

Note that the documentation build process copies the notebooks to the docs/source/notebooks folder. This is because the nbsphinx extension only knows about files inside the docs/source/ folder.

New example notebook: Tune XGBoost

This PR adds a new example notebook demonstrating how you can use Syne Tune to tune XGBoost's hyperparameters on the UCI hand-written digits dataset.


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@codecov
Copy link

codecov bot commented Sep 6, 2023

Codecov Report

Patch coverage has no change and project coverage change: -0.03% ⚠️

Comparison is base (78fe6ff) 64.23% compared to head (a1ee27c) 64.21%.
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #756      +/-   ##
==========================================
- Coverage   64.23%   64.21%   -0.03%     
==========================================
  Files         435      435              
  Lines       29049    29049              
==========================================
- Hits        18661    18654       -7     
- Misses      10388    10395       +7     

see 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@wesk wesk changed the title docs: add notebook ci: automatically test all notebooks under examples/notebooks and render them in docs page Sep 7, 2023
@wesk wesk marked this pull request as ready for review September 7, 2023 15:12
@wesk wesk requested review from mseeger and geoalgo September 7, 2023 15:12
Copy link
Contributor

@geoalgo geoalgo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me overall.

One question would be what is the rationale for having a notebook and not an example?

I think there should be a clear way to say how functionalities are committed otherwise you may have a mix-and-match of notebook/python examples (for instance a bunch of the current examples could be notebooks).

def copy_notebooks_into_docs_folder(app):
# .ipynb files must be inside the docs/ folder for Jupyter to be able to convert them
currpath = os.path.dirname(os.path.realpath(__file__))
source = os.path.join(currpath, "../../examples/notebooks")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use Pathlib parent rather than "../../" which is not portable?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Comment on lines 27 to 33
source = os.path.join(currpath, "../../examples/notebooks")
destination = os.path.join(currpath, "notebooks")
print(
f"current path: {currpath}; source path: {source}; destination path: {destination}"
)
try:
shutil.rmtree(destination)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this rmtree? It seems potentially dangerous, most of the time we would execute on the remote machine but I could see a case where we would execute locally and may get bad surprise.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I updated it to shutil.copytree(source, destination, dirs_exist_ok=True), the dirs_exist_ok=True is what I was missing previously.

https://docs.python.org/3/whatsnew/3.8.html#shutil

mseeger
mseeger previously approved these changes Sep 11, 2023
Copy link
Collaborator

@mseeger mseeger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just have some small comments.

examples/notebooks/tune_xgboost.ipynb Show resolved Hide resolved
"config_space = {\n",
" \"max_depth\": randint(1,10),\n",
" \"gamma\": uniform(1,10),\n",
" \"reg_lambda\": uniform(.0000001, 1),\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use loguniform instead of uniform

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

" import xgboost\n",
" import numpy as np\n",
"\n",
" eval_metric = \"merror\"\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - removed in next revision

" max_depth=max_depth,\n",
" eval_metric=eval_metric,\n",
" )\n",
" clf.fit(X_train, y_train, eval_set=[(X_val, y_val)])\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the eval_set parameter needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's not required, removed in next revision.

@wesk
Copy link
Collaborator Author

wesk commented Sep 11, 2023

It looks good to me overall.

One question would be what is the rationale for having a notebook and not an example?

I think there should be a clear way to say how functionalities are committed otherwise you may have a mix-and-match of notebook/python examples (for instance a bunch of the current examples could be notebooks).

Thanks for the feedback -- I updated CONTRIBUTING.md with some general guidance in the next revision.

I intentionally kept the guidance general - we can update/refine it over time.

Another consideration is that not every example can easily be run today as a notebook.

My hypothesis is that users, especially those who are new to Syne Tune, will appreciate notebook examples, and we may want to consider converting some existing script examples into notebooks over time.

@wesk wesk merged commit 8546a4a into main Sep 18, 2023
35 checks passed
@wesk wesk deleted the 2023-09-05example branch September 18, 2023 23:59
@wesk wesk added the documentation Improvements or additions to documentation label Oct 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants