New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FEA Add examples recommender system #1125
FEA Add examples recommender system #1125
Conversation
This is exciting. It works: https://output.circle-artifacts.com/output/job/24f2114f-3815-45f3-8e2e-aec20d01b25a/artifacts/0/rtd_html/auto_examples/plot_0_sin.html#sphx-glr-auto-examples-plot-0-sin-py . If at some point you can test a documentation with more examples such as matplotlib or scikit-learn's, it would be very useful to get a feeling of how well it works. |
I'll take the bait :) Tested for MNE-Python (192 examples) in mne-tools/mne-python#11619. Spot checking I see that:
So it seems to work well! FYI I do get some warnings like:
|
Thanks a lot, @larsoner ! I don't know MNE-Python that well. At a quick glance it seems that the recommendations are imperfect but definitely useful! @ArturoAmorQ a couple of quick comments:
|
…pull/1/files Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments regarding some clean up
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@ArturoAmorQ let me know when this is ready to go in! |
@lucyleeow On my side it's good to go :D |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all your work! Sorry, last nitpicks I think.
sphinx_gallery/gen_rst.py
Outdated
@@ -1466,6 +1470,11 @@ def save_rst_example( | |||
|
|||
example_rst += CODE_DOWNLOAD.format(example_file.name, language) | |||
|
|||
if gallery_conf["recommender"]["enable"]: | |||
# extract the filename without the extension | |||
recommend_fname = os.path.splitext(os.path.split(example_fname)[1])[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be nicer if we use Path
instead?
return app | ||
|
||
|
||
def test_recommend_n_examples(sphinx_app): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So sorry, I should have been clearer, this test can stay in test_full.py
. Since it uses sphinx_app
it may as well stay in test_full.py
, I was meaning more the other 2 tests, which don't need spinx_app
.
Also thank you for separating test_example_recommender_methods
to it's own unit test.
Co-authored-by: Lucy Liu <jliu176@gmail.com>
I hope that all the comments have been addressed. Please let me know otherwise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So sorry, just thought of 2 more things. Thanks for your patience. This is the last thing!
the very rare/very common words. This may improve the recommendations quality, | ||
but more importantly, it spares some computation resources that would be wasted | ||
on non-informative tokens. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is already implied above but could we explicitly say that only the examples within a single gallery (and it's sub galleries) are used for computing closest examples.
Also this is probably obvious but can we add that only recommendations for .py
files will be generated.
n_examples = sphinx_app.config.sphinx_gallery_conf["recommender"]["n_examples"] | ||
|
||
assert '<p class="rubric">Related examples</p>' in html | ||
assert count == n_examples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about adding a check that the same 3 related examples are found?
for more information, see https://pre-commit.ci
Note CI failure due to: scipy/docs.scipy.org#80 (we use 'http://docs.scipy.org/doc/scipy/wrong_url' in our tinybuild) Will merge once green, just in case. @ArturoAmorQ I've made the last small changes. |
Thanks @lucyleeow :) |
Woohoo!! Thanks everybody involved, in particular Arturo. I'm really looking forward to this feature being used in the downstream projects!
|
Would be good to get this out there -- @lucyleeow do you have time to cut a release next week? If not, I can do it |
Tentative yes. I'll ping you if not. |
… to version 0.15.0 v0.15.0 ------- Support for Python 3.7 dropped in this release. Requirement is now Python >=3.8. Pillow added as a dependency. **Implemented enhancements:** - ENH: Improve logging visibility of errors and filenames `#1225 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1225>`__ (`larsoner <https://github.com/larsoner>`__) - ENH: Improve API usage graph `#1203 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1203>`__ (`larsoner <https://github.com/larsoner>`__) - ENH: Always write sg_execution_times and make DataTable `#1198 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1198>`__ (`larsoner <https://github.com/larsoner>`__) - ENH: Write all computation times `#1197 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1197>`__ (`larsoner <https://github.com/larsoner>`__) - ENH: Support source files in any language `#1192 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1192>`__ (`speth <https://github.com/speth>`__) - FEA Add examples recommender system `#1125 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1125>`__ (`ArturoAmorQ <https://github.com/ArturoAmorQ>`__) (NEWS truncated at 15 lines)
Sorry to post about it here, but I don't have folks emails. For GSOC this year, one of the proposed projects is an image based search. I think it could work as an extension of this feature/built on top if it, so I was wondering if any of the folks here had time/interest in being subject area mentors? Thanks! |
Hmm... I'm not totally convinced it would be useful to other projects that use SG like scikit-learn, MNE, etc. Would it? If it's mostly (or only) useful for |
The reason I was thinking custom solution is primarily b/c my end goal is a detexify type interface & also possibly a component based search. Granted like a detexify interface could hook into an image-based search - and also yeah that might be generally useful b/c I don't think my motivation of "folks may not know what a chart is called" is necessarily specific to matplotlib. |
@lucyleeow would you be interested in mentoring this potentially? You might be a better fit |
I was thinking the scikit-learn folks would be more knowledgable about machine learning algorithms but @story645 email me, we could be able to work something out. |
Actually do you have my email? I'll make it visible on my github account. |
Fixes #1081
For a proof of concept using the scikit-learn library navigate through:
https://output.circle-artifacts.com/output/job/22589aa6-cfe8-4daf-a9c0-b5abf94098b6/artifacts/0/doc/auto_examples/index.html
This PR is ready for review.