-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: automatically test all notebooks under examples/notebooks
and render them in docs page
#756
Conversation
Codecov ReportPatch coverage has no change and project coverage change:
Additional details and impacted files@@ Coverage Diff @@
## main #756 +/- ##
==========================================
- Coverage 64.23% 64.21% -0.03%
==========================================
Files 435 435
Lines 29049 29049
==========================================
- Hits 18661 18654 -7
- Misses 10388 10395 +7 ☔ View full report in Codecov by Sentry. |
examples/notebooks
and render them in docs page
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good to me overall.
One question would be what is the rationale for having a notebook and not an example?
I think there should be a clear way to say how functionalities are committed otherwise you may have a mix-and-match of notebook/python examples (for instance a bunch of the current examples could be notebooks).
docs/source/conf.py
Outdated
def copy_notebooks_into_docs_folder(app): | ||
# .ipynb files must be inside the docs/ folder for Jupyter to be able to convert them | ||
currpath = os.path.dirname(os.path.realpath(__file__)) | ||
source = os.path.join(currpath, "../../examples/notebooks") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use Pathlib
parent
rather than "../../" which is not portable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
docs/source/conf.py
Outdated
source = os.path.join(currpath, "../../examples/notebooks") | ||
destination = os.path.join(currpath, "notebooks") | ||
print( | ||
f"current path: {currpath}; source path: {source}; destination path: {destination}" | ||
) | ||
try: | ||
shutil.rmtree(destination) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this rmtree
? It seems potentially dangerous, most of the time we would execute on the remote machine but I could see a case where we would execute locally and may get bad surprise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I updated it to shutil.copytree(source, destination, dirs_exist_ok=True)
, the dirs_exist_ok=True
is what I was missing previously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just have some small comments.
"config_space = {\n", | ||
" \"max_depth\": randint(1,10),\n", | ||
" \"gamma\": uniform(1,10),\n", | ||
" \"reg_lambda\": uniform(.0000001, 1),\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use loguniform
instead of uniform
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
" import xgboost\n", | ||
" import numpy as np\n", | ||
"\n", | ||
" eval_metric = \"merror\"\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No - removed in next revision
" max_depth=max_depth,\n", | ||
" eval_metric=eval_metric,\n", | ||
" )\n", | ||
" clf.fit(X_train, y_train, eval_set=[(X_val, y_val)])\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the eval_set
parameter needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's not required, removed in next revision.
Thanks for the feedback -- I updated I intentionally kept the guidance general - we can update/refine it over time. Another consideration is that not every example can easily be run today as a notebook. My hypothesis is that users, especially those who are new to Syne Tune, will appreciate notebook examples, and we may want to consider converting some existing script examples into notebooks over time. |
This PR:
More detail on the three aspects below.
Automatically test notebooks with every pull request
To add a new Jupyter (.ipynb) file to the repository, simply add it to the
examples/notebooks
folder.Every notebook in the
examples/notebooks
folder will be converted to a python script and executed by a GitHub Actions workflow.If you don't want specific cells in a notebook to be run as part of our automatic tests, tag them with the
ci-skip-cell
tag, and the CI will not execute the cell. The reason this feature exists is because some cells that are used to install dependencies, such as!pip install xgboost
do not convert easily into directly executable scripts. The first code cell in the XGBoost notebook is the motivating example for this feature.Automatically render notebooks in Syne Tune documentation
The idea with this PR is to keep a single source of truth:
examples/notebooks
as the location where all example notebooks are kept.To have your notebook show up in documentation, first add it to the
examples/notebooks
folder, and then update the following syntax inside theexamples_toc.rst
page to make sure that your notebook is included in the documentation.Example showing adding the xgboost notebook:
Before you add a new notebook, be sure to clear the cell outputs, unless you also want them to render in the resulting documentation.
Note that the documentation build process copies the notebooks to the
docs/source/notebooks
folder. This is because the nbsphinx extension only knows about files inside thedocs/source/
folder.New example notebook:
Tune XGBoost
This PR adds a new example notebook demonstrating how you can use Syne Tune to tune XGBoost's hyperparameters on the UCI hand-written digits dataset.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.