Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Support source files in any language #1192

Merged
merged 37 commits into from Oct 2, 2023

Conversation

speth
Copy link
Contributor

@speth speth commented Sep 9, 2023

This PR adds support for reading source files in any language supported by Pygments and generating annotated HTML files from them.

I saw the previous work that was attempted on multi-language support (#196/#197), and it's clear that some of the refactoring work that's been done since allowed this feature to be introduced with relatively few changes to the existing code base.

Obviously, this is a fair bit short of being able to execute such examples and collect outputs, but I think it's a useful capability for projects that have multiple language interfaces to be able to present examples this way, even if only the Python ones have all the bells and whistles.

There are a few things that still need to be resolved here, including additional tests and documentation, but I wanted to get some feedback first before going too much further. There is a bit of refactoring of py_source_parser.py that could be done to share some of the implementation with the new BlockParser class, which I'd be happy to do if desired.

For context, I'm one of the developers of Cantera, a library for chemical kinetics that provides interfaces in Python, Matlab, C++, and Fortran. I'm currently working toward a replacement for our home-grown index of examples, which is implemented as a set of extensions to Nikola as we move these examples into our Sphinx-generated site. I think sphinx-gallery provides a great set of features for our Python examples, but I don't want to leave our other interfaces behind. I'm hoping this will also be useful for other projects that include interfaces in multiple languages.

Copy link
Contributor

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a clean implementation so far! Just some preliminary comments from me. Just for background, how will this help you(r project) specifically? I assume you have some mix of Python and something else, and being able to link to non-executed something-else code will make your doc better...

sphinx_gallery/gen_rst.py Show resolved Hide resolved
sphinx_gallery/gen_rst.py Outdated Show resolved Hide resolved
sphinx_gallery/gen_rst.py Outdated Show resolved Hide resolved
sphinx_gallery/gen_rst.py Outdated Show resolved Hide resolved
sphinx_gallery/gen_rst.py Show resolved Hide resolved
sphinx_gallery/utils.py Outdated Show resolved Hide resolved
@speth
Copy link
Contributor Author

speth commented Sep 12, 2023

Just for background, how will this help you(r project) specifically? I assume you have some mix of Python and something else, and being able to link to non-executed something-else code will make your doc better...

I updated the PR description to provide some context.

@larsoner
Copy link
Contributor

One last testing idea -- can you modify sphinx_gallery/tests/tinybuild to use this functionality? It's really convenient to be able to do make in that dir and look quickly to make sure everything renders properly, plus CircleCI renders it for us (so easy to review by eye). After you add it, you can modify sphinx_gallery/tests/test_full.py -- maybe make a new function that uses the sphinx_app fixture that gives the paths to the source-copied-to-a-tmpdir and rendered site -- to have some lines that make sure that whatever non-Python source file you add gets rendered in at least some way you expect.

@speth speth force-pushed the multi-src-lang branch 2 times, most recently from 30c742d to eb4f15e Compare September 19, 2023 22:13
@speth
Copy link
Contributor Author

speth commented Sep 19, 2023

Thanks for the pointers on setting up some integration tests using the tinybuild project. I added a couple of examples there with corresponding tests of the resulting .rst files.

The ci/circleci: build_docs failure:

! LaTeX Error: Unicode character π (U+03C0)
               not set up for use with LaTeX.

Is apparently caused by the fact that I tried to use π in a Julia example. I'm not sure how to fix this other than by not using Unicode in the example.

The failure on sphinx-gallery.sphinx-gallery (Main Linux conda_python310_sphinxDev):

Theme error:
An error happened in rendering the page advanced.
Reason: ImportError("cannot import name '_get_toctree_ancestors' from 'sphinx.environment.adapters.toctree' (/usr/share/miniconda/lib/python3.10/site-packages/sphinx/environment/adapters/toctree.py)")

Doesn't seem to have anything to do with the changes in this PR.

@larsoner
Copy link
Contributor

Yes sphinx dev is unrelated so you can ignore it for now. Can you avoid Unicode for now and we can fix it later if needed?

@speth speth force-pushed the multi-src-lang branch 4 times, most recently from ef7e565 to d13963d Compare September 22, 2023 23:53
@speth speth marked this pull request as ready for review September 23, 2023 00:02
@speth speth requested a review from larsoner September 30, 2023 00:43
Copy link
Contributor

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than a few minor comments, looks good to me!

@lucyleeow will you have time to look? If not I think we can rely on our unit tests in addition to my review here, but this touches some core code so if you can look that's better if possible

Comment on lines +1289 to +1290
Currently, all generated notebooks specify Python as the kernel. After downloading,
the user will need to manually change to the correct kernel.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a follow-up PR? I think in principle changing the skeleton might do it

https://github.com/speth/sphinx-gallery/blob/0157d3f801069f6b805b3f232453d75553e0c6cd/sphinx_gallery/notebook.py#L30

But I have no experience with this so...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is on my list for follow-on work.

@@ -10,7 +10,7 @@ clean:
rm -rf gen_modules/

html:
sphinx-build -b html -d _build/doctrees $(SPHINXOPTS) . _build/html
sphinx-build -b html -v -d _build/doctrees $(SPHINXOPTS) . _build/html
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather move the -v to SPHINXOPTS at the top

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't mean to commit that in the first place.

speth and others added 21 commits September 30, 2023 19:32
Allow just '%%' instead of '% %%'.

Also, treat the rest of the line as a heading, which is what
Matlab does. This is enabled for all languages, as it provides
a nice short syntax for just adding headings to the code.
Include non-Python source files in the source code zip, and adjust
the file name if it includes non-Python sources.

Only generate Jupyter notebook zip file and link if at least
some of the files are Jupyter compatible.
@lucyleeow
Copy link
Contributor

I'll have a look next week.

Copy link
Contributor

@lucyleeow lucyleeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @speth , this is great and well tested, only some nits.

@@ -211,6 +214,25 @@ consult the `regular expressions`_ module for more details.

$ sphinx-build -D sphinx_gallery_conf.filename_pattern=plot_specific_example\.py ...

To parse examples in other languages, add their extensions to ``example_extensions``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could be more explicit in saying this is only parsing and syntax highlighting, not executing (I know its obvious for us)? Something like:

"You can also parse and highlight syntax examples in other languages by adding their extensions to example_extensions, though they will not be executed."

determined by the file extension. To override Pygments' default file associations, the
``filetype_parsers`` option can be used to specify a ``dict`` mapping a file extension
to any of the `pygments language names <https://pygments.org/languages/>`__. For
example::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we specify a non default file association, should the extension in example_extensions above be the default one or non-default one? Could we state this ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to clarify. This option isn't so much about handling the case of an extension with no corresponding lexer (though it can be used that way), but for the case where an extension corresponds to several lexers and Pygments' defaults to the wrong one. For example, Pygments lists '.m as a known extension for Matlab, Objective-C, and Mason, with the default being Objective-C.

downloads can be enabled using the ``notebook_extensions`` option. For example::

sphinx_gallery_conf = {
"notebook_extensions": {".py", ".jl"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, if a non default extension is used in filetype_parsers, should this be the default or non default extension? Could we state this?

doc/syntax.rst Outdated

For such examples, the header for the example is defined by the first comment block
in the file, which must contain a reST title, and may contain any additional reST
content that should appear above the example. For example, A C++ example could start
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
content that should appear above the example. For example, A C++ example could start
content that should appear above the first code block. For example, a C++ example could start

Comment on lines +173 to +174
Any text following the special delimiter on the same line will be converted into a reST
heading (underlined with ``-``). The reST block continues until a line that does not
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the "(underlined with -)" mean that the header should be underlined with - or it will become a reST header as if it was underlined with e.g., - ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I am not so familiar with other languages, does 'single line' (?) comment languages make supporting the current header syntax not possible? e.g.,

# %%
# Header
# -----------

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means that the underlining will be added, using the - character. I specified this so the user would know what character to use if they wanted other headings at either the same or a different level.

The existing header syntax is also supported. However, in the case of Matlab, which has had the notion of "code cells" for a long time (possibly where this idea originated), the style is that text appearing on the same line as the %% delimiter is treated as a heading, rather than ignored as sphinx-gallery currently does for Python scripts. Beyond Matlab, I thought it was a handy syntax especially in the case where all you want to add is the heading.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, sorry I think I got confused.

thumb, _ = _find_image_ext(
os.path.join(target_dir, "images", "thumb", f"sphx_glr_{fname[:-3]}_thumb.png")
os.path.join(target_dir, "images", "thumb", f"sphx_glr_{fname.stem}_thumb.png")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice!

@@ -0,0 +1,361 @@
"""BlockParser divides source files into blocks of code and markup text."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""BlockParser divides source files into blocks of code and markup text."""
"""BlockParser divides non `.py` source files into blocks of code and markup text."""

nitpick

flag_start = rf"^[\ \t]*(?:{comment_start})\s*"

self.infile_config_pattern = re.compile(
flag_start + r"sphinx_gallery_([A-Za-z0-9_]+)(\s*=\s*(.+))?[\ \t]*\n?",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, r"sphinx_gallery_([A-Za-z0-9_]+)(\s*=\s*(.+))?[\ \t]*\n?" is repeated in py_source_parser.py and is complex enough that we could use a variable? (Even if we refactor in future)

Comment on lines 48 to 54
("#= comment =#", "#=", "=#", None), # Julia multiline
("# comment", "#", None, "#{20,}"),
("// comment", "//", None, "/{20,}"),
("/* comment */", r"/\*", r"\*/", r"/\*{20,}/"),
("% comment", "%", None, "%{20,}"),
("! comment", "!", None, "!{20,}"),
("c comment", r"^c(?:$| )", None, None),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do each of these correspond to a (or more) language? Could we add a comment specifying the associated language?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I added some examples for each. There are others, but I didn't want to go overboard.

@speth
Copy link
Contributor Author

speth commented Oct 2, 2023

Thanks for the review, @lucyleeow and @larsoner. I think I've addressed all the review comments.

@lucyleeow lucyleeow merged commit 846c1a0 into sphinx-gallery:master Oct 2, 2023
17 checks passed
@lucyleeow
Copy link
Contributor

Thank you @speth ! This is a great addition!

clrpackages pushed a commit to clearlinux-pkgs/pypi-sphinx_gallery that referenced this pull request Nov 28, 2023
… to version 0.15.0

v0.15.0
-------

Support for Python 3.7 dropped in this release. Requirement is now Python >=3.8.
Pillow added as a dependency.

**Implemented enhancements:**

-  ENH: Improve logging visibility of errors and filenames `#1225 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1225>`__ (`larsoner <https://github.com/larsoner>`__)
-  ENH: Improve API usage graph `#1203 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1203>`__ (`larsoner <https://github.com/larsoner>`__)
-  ENH: Always write sg_execution_times and make DataTable `#1198 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1198>`__ (`larsoner <https://github.com/larsoner>`__)
-  ENH: Write all computation times `#1197 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1197>`__ (`larsoner <https://github.com/larsoner>`__)
-  ENH: Support source files in any language `#1192 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1192>`__ (`speth <https://github.com/speth>`__)
-  FEA Add examples recommender system `#1125 <https://github.com/sphinx-gallery/sphinx-gallery/pull/1125>`__ (`ArturoAmorQ <https://github.com/ArturoAmorQ>`__)

(NEWS truncated at 15 lines)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants