Are double percent scripts of Jupytext compatible with that of Spyder? #7933

Closed
mwouts opened this issue Sep 22, 2018 · 18 comments

Comments

@mwouts

mwouts commented Sep 22, 2018

I see that Spyder identifies cells based on lines starting with # %%. As this seems to be a common practice (among Spyder, Hydrogen, VS Code) I have just implemented, in Jupytext, a Jupyter notebook converter to double percent scripts, cf. mwouts/jupytext#59.

Could you please provide feedback on the format specifications? I found no mention of how markdown cells should be represented (or even cell metadata), so I had to make a few choices on my own, and I would like to make sure that these choices are fine with Spyder's practice as well.

If you also want to test the converter, then install jupytext RC with

pip install jupytext==0.7.0rc0

and use the command line converter

jupytext --to py:percent notebook.ipynb            # create a notebook.py file in the double percent format
jupytext --to ipynb notebook.py                    # create a notebook.ipynb from a double percent script that has at least two cells
jupytext --from py:percent --to ipynb notebook.py  # create a notebook.ipynb from a double percent script with only one cell
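
To make the cell convention discussed in this issue concrete, here is a minimal sketch of how a script could be split into cells on marker lines. It is not Spyder's or Jupytext's actual implementation: the function name `split_cells` and the marker regex (which accepts both `# %%` and `#%%`) are assumptions for illustration.

```python
import re

# Assumption: a cell starts at any line beginning with "# %%" or "#%%".
CELL_MARKER = re.compile(r"^#\s?%%")


def split_cells(source):
    """Split script text into cells, each returned as a list of lines."""
    cells = [[]]
    for line in source.splitlines():
        # Start a new cell at a marker, unless the current cell is still empty.
        if CELL_MARKER.match(line) and cells[-1]:
            cells.append([])
        cells[-1].append(line)
    return [cell for cell in cells if cell]


script = "import math\n# %% first\nx = 1\n#%% second\ny = x + 1\n"
cells = split_cells(script)
```

Lines before the first marker end up in a cell of their own, which is one plausible reading of how an unmarked preamble could be handled.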
@ccordoba12
Member

As this seems to be a common practice (among Spyder, Hydrogen, VScode)

Well, we introduced that practice, i.e. we were the first ones to define cells with #%% about 5 years ago.

I found no mention of how markdown cells should be represented (or even cell metadata)

At the moment we can't process markdown and I don't think we could in the future. That's because we're really oriented to evaluate plain text files.

However, the cell metadata stuff looks interesting. Pinging @bcolsen about this, who has been working with cells a lot lately.

How do you use and handle cell metadata in Jupytext? It's not that clear from the spec.


And thanks a lot for reaching out to us!

@mwouts
Author

mwouts commented Sep 22, 2018

Thanks @ccordoba12, that's interesting feedback. It's good to know that the # %% format comes from Spyder! Sure, I will review the documentation and say a bit more about the specs. Schematically it's like this:

# %% [cell type] some optional text that is mapped to cell metadata 'name' {other cell metadata in JSON format, if any}

with [cell type] empty for code cells, and markdown or raw for other cell types.

Jupytext offers no option to render or execute the text notebooks, but it can convert them to Jupyter notebooks (or to R markdown), which can then be executed and exported to HTML, PDF, etc. That is the point of having support for markdown cells, and cell metadata (taken verbatim from Jupyter, converted to matching knitr options when converting to R markdown).

I am not sure I am aware of what Spyder can do with these cells in scripts. I've seen that I was able to run the cells one by one in the console. Is it possible to 'render' the scripts to HTML or some other format? Does Spyder include any renderer like pystitch or knitpy? I am asking as you seem to consider applications of cell metadata.

Anyway, for now I think the most important questions are...

  1. Do you think the current implementation is compatible with Spyder - I would say it's OK since cell delimiters just need to start with # %%, right?
  2. Do you think the format is sustainable? For instance, having a special meaning for the first word after # %% (when it's either raw or markdown) may be questionable... But I like the relative simplicity of how to create a markdown cell!

@bcolsen
Member

bcolsen commented Sep 23, 2018

@mwouts Jupytext looks like a great way to integrate the Spyder text editor with the notebook plugin! I haven't really used the Jupyter notebook that much, but I'll try to be helpful there. I have spent a couple of hours writing and thinking about this, so I hope it's helpful.

I might be missing some functionality that already exists in Spyder (@ccordoba12?), but it would be pretty cool to execute a text file and have the output go to the notebook plugin, and then to run a cell in the text editor and have it update only that notebook cell.

I'm not sure what your plan is here with # %% cell separators. I've looked at your current plain text files and while they are quite compact and read well, if I were to write one from scratch I would have to be careful about my commenting in the code so I wouldn't accidentally make a markdown cell. In the spirit of "Explicit is better than implicit", specifying each cell with # %% would be a good way to go.

Do you think the format is sustainable? For instance, having a special meaning for the first word after # %% (when it's either raw or markdown) may be questionable... But I like the relative simplicity of how to create a markdown cell!

Again, it would be better to be explicit here as well. My take on the format would look like this:

# %% cell name [cell type] {JSON format metadata}

I looked at your examples in the issue you linked and it seems like [] aren't part of the syntax. Here you would need to enclose the cell type in [] so there is no ambiguity between the name and the type. The order wouldn't actually matter, everything could be optional, and it doesn't rely on special words. This way one day you could have a crazy file with [R], [julia], [python], [html], and [latex] blocks or some craziness. If you just want a markdown block, it's still just # %% [markdown]: not too long or complicated.

Even though the order wouldn't matter, I'm suggesting name first for compatibility with existing editors that use the labels like Spyder does in the outline explorer. The editors will still likely take the whole line as the name, but at least it will start with something user defined rather than a bunch of [markdown]s.
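
The header layout proposed above (optional name, optional bracketed cell type, optional JSON metadata) could be parsed roughly as follows. This is a hedged sketch rather than Jupytext's parser: `parse_header` and the returned dictionary keys are hypothetical, and it assumes the cell name itself contains no `{`.

```python
import json
import re

# Assumption: "#", an optional space, two or more "%", then the rest of the header.
HEADER = re.compile(r"^#\s?%%+\s*(.*)$")


def parse_header(line):
    """Parse '# %% name [type] {json}' into a dict, or return None."""
    match = HEADER.match(line)
    if match is None:
        return None
    rest = match.group(1)
    metadata = {}
    brace = rest.find("{")
    if brace != -1:
        # Everything from the first "{" is treated as the JSON metadata.
        metadata = json.loads(rest[brace:])
        rest = rest[:brace]
    cell_type = "code"  # default when no [type] is given
    type_match = re.search(r"\[(\w+)\]", rest)
    if type_match:
        cell_type = type_match.group(1)
        rest = rest[:type_match.start()] + rest[type_match.end():]
    return {"name": rest.strip(), "type": cell_type, "metadata": metadata}


parsed = parse_header('# %% Load data [markdown] {"tags": ["intro"]}')
```

Because the type is found by its brackets rather than by position, the name and the type can appear in either order, as suggested in the comment.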

As for the non-code cells, I would suggest wrapping them in """ strings; this would be similar to the PHP heredoc style. This could allow an editor to change the syntax highlighting to markdown, like many do for HTML in PHP scripts. Perhaps more excitingly, it would also allow f-strings and string formatting in markdown. But this might be challenging to get back from the notebook to a text file.
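
As a purely illustrative sketch of that idea, a markdown cell written as a triple-quoted f-string might look like the snippet below. Nothing in this thread says that Spyder or Jupytext handles such strings this way; the variable names and the cell header are made up for the example.

```python
n_rows = 150  # hypothetical value computed earlier in the script

# %% Summary [markdown]
summary = f"""
## Results

The dataset contains **{n_rows}** rows.
Literal braces must be doubled in an f-string: {{like this}}.
"""
```

Since the text is an ordinary Python string, the script stays runnable as plain Python, at the cost of having to escape any braces that should appear literally.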

Spyder recently implemented grouped code cells with the # %% as the top cell and # %%% as a subgroup. You could use this to label code cells with the appropriate heading styles.

@mwouts
Author

mwouts commented Sep 23, 2018

Thanks @bcolsen , there are so many useful suggestions in your comments!

  • Default jupytext format was designed a few months back, when I was not aware of the # %% cells. At some point we may consider changing the default to the percent cell format. One thing that I like from the light format is that it can read any script as a notebook (and split it into natural cells).
  • Sample percent representations of Jupyter notebook are available at https://github.com/mwouts/jupytext/tree/v0.7.0/tests/notebooks/mirror/ipynb_to_percent
  • Now for the double percent format: I agree that cell type should go to the end, and was also thinking of using [] - granted for the next RC!
  • I will take care to implement support for subgroup cells - do you expect that they are rendered as independent cells in the notebook, or should they belong to the same Jupyter notebook cell?
  • I like very much the suggestion of f-strings. I am aware of, but I never tried, a Jupyter notebook extension named Python Markdown which does display Python variables in markdown cells - using however pairs of curly brackets rather than simple curly brackets...
  • I guess that updating the notebook outputs from Spyder is doable, but that requires using Jupyter kernels for executing code. The output of Jupyter kernels is quite similar to the outputs found in notebooks. But... there are no plans to do that within Jupytext!

Another side remark... I just saw that Jupyter requires that name in cell metadata is unique (see also nteract/papermill#30), so I plan to store the cell name into another field of Jupyter notebook cell metadata, possibly title.

@bcolsen
Member

bcolsen commented Sep 24, 2018

@mwouts

I will take care to implement support for subgroup cells - do you expect that they are rendered as independent cells in the notebook, or should they belong to the same Jupyter notebook cell?

I think each one should be a cell. They are really just there to help code organization, but it would be sweet if the notebook actually had code cell groups. The only thing for Jupytext to worry about is to catch any number of %'s without spaces before the name. For example, # %%%%%%% name would be named name, not %%%%% name.
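
A small sketch of that rule, assuming the marker is `#`, an optional space, and then an unbroken run of two or more `%` signs (the helper name `cell_depth_and_name` is hypothetical):

```python
import re

# Assumption: the whole run of "%" belongs to the marker, never to the name.
MARKER = re.compile(r"^#\s?(%{2,})\s*(.*)$")


def cell_depth_and_name(line):
    """Return (number of percent signs, cell name), or None if not a marker."""
    match = MARKER.match(line)
    if match is None:
        return None
    return len(match.group(1)), match.group(2).strip()


depth, name = cell_depth_and_name("# %%%%%%% name")
```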

I like very much the suggestion of f-strings. I am aware of, but I never tried, a Jupyter notebook extension named Python Markdown which does display Python variables in markdown cells - using however pairs of curly brackets rather than simple curly brackets...

They likely do that so {}'s you want in the text don't need to be escaped. It would be pretty simple just to use {{}} to have a string that would be directly compatible with an existing plugin. I'm still for using the """ strings in the markdown cells for the possibility of doing markdown syntax highlighting for markdown cells in Spyder.

I guess that updating the notebook outputs from Spyder is doable, but that requires using Jupyter kernels for executing code. The output of Jupyter kernels is quite similar to the outputs found in notebooks. But... there are no plans to do that within Jupytext!

Yeah, this would be a Spyder thing. I think with the current syntax we are talking about we could actually get something similar to R markdown files for Python files.

@mwouts
Author

mwouts commented Sep 24, 2018

Thanks @bcolsen for the useful comments. I have included a few of your suggestions above for the next format update at mwouts/jupytext#89 !

I'm not sure I would recommend working on an alternative to R markdown for Python files. I've seen a few previous attempts (pystitch, knitpy), and I think very hard work is required to achieve the same functionality as R markdown. Why not use either jupyter nbconvert, or even R's rmarkdown::render directly (which also works well for Python code, and can even render matplotlib plots)?

@bcolsen
Member

bcolsen commented Sep 25, 2018

I'm not sure I would recommend to work on an alternative to R markdown for python files.
Why not using either jupyter nbconvert?

Exactly what I was thinking (and didn't say). Use jupytext to make a notebook from Python, execute the notebook to get the results (this should be possible), and use nbconvert to go from there to PDF or HTML. I would just make an "export as" button in Spyder that calls the commands, and that's it!

@mwouts
Author

mwouts commented Sep 26, 2018

Great! That's documented here. You can either execute the notebook on the command line, or in Python directly (in combination with jupytext.readf("notebook.py", format_name="percent")). The latter is possibly more appropriate for your use case if you want to run the notebook in the current Python environment. In either case, you may also want to set the timeout option to None, otherwise long notebooks won't be rendered.
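
For the command-line route, an "export as" action could be sketched as a thin wrapper around the two tools. The helper names `export_commands` and `export` are hypothetical, and the snippet assumes the `jupytext` and `jupyter nbconvert` executables are on the PATH. The `--ExecutePreprocessor.timeout=-1` flag is meant to disable the per-cell timeout (the command-line counterpart of setting it to None); check it against the installed nbconvert version.

```python
import subprocess
from pathlib import Path


def export_commands(script, output_format="html"):
    """Build the commands that convert, execute and render a percent script."""
    notebook = str(Path(script).with_suffix(".ipynb"))
    return [
        # Convert the double percent script to a Jupyter notebook.
        ["jupytext", "--from", "py:percent", "--to", "ipynb", script],
        # Execute the notebook and render it, with no per-cell timeout.
        ["jupyter", "nbconvert", "--to", output_format, "--execute",
         "--ExecutePreprocessor.timeout=-1", notebook],
    ]


def export(script, output_format="html"):
    """Run the commands in order, stopping at the first failure."""
    for command in export_commands(script, output_format):
        subprocess.run(command, check=True)


commands = export_commands("notebook.py")
```

Calling `export("notebook.py")` would then run both steps. Note that this executes the notebook in whatever kernel nbconvert picks, not necessarily in the current Python environment.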

@mwouts
Author

mwouts commented Oct 3, 2018

@bcolsen, @ccordoba12, do you think I could map the Spyder cell title to the cell content in the Jupyter representation? Currently I map that to the 'title' cell metadata, but that's not very useful. Precisely, what would you think of converting

# %% cell title {JSON metadata}
1 + 1

to

# cell title
1 + 1

in the Jupyter notebook? (and conversely, map the first comment of code cells to the cell title)

@ccordoba12
Member

I don't have a problem with that.

@bcolsen
Member

bcolsen commented Oct 3, 2018

By

# cell title
1 + 1

You mean the title would be heading1 above the cell? That would be good.

Also:

# %%% cell title {JSON metadata}
1 + 1

could be

## cell title
1 + 1

with heading2.

Basically:

# %%  = Heading1(#)
# %%%  = Heading2(##)
# %%%%  = Heading3(###)
# %%%%%  = Heading4(####)
# %%%%%%  = Heading5(#####)
# %%%%%%%...  = Heading5(#####)

I don't think there are more than 5 headings.
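
The table above amounts to "heading level = number of percent signs minus one, capped at 5". A minimal sketch, with a hypothetical function name and the cap exposed as a parameter (Markdown itself defines six heading levels, so a different cap would also be possible):

```python
def heading_for_marker(percent_count, title, max_level=5):
    """Map a cell marker with `percent_count` percent signs to a heading line."""
    if percent_count < 2:
        raise ValueError("a cell marker needs at least two percent signs")
    # "# %%" (two signs) is level 1; deeper markers are capped at max_level.
    level = min(percent_count - 1, max_level)
    return "#" * level + " " + title


top = heading_for_marker(2, "cell title")
sub = heading_for_marker(3, "cell title")
deep = heading_for_marker(9, "cell title")
```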

@mwouts
Author

mwouts commented Oct 3, 2018

Thanks @bcolsen , @ccordoba12 for your quick replies. Well, @bcolsen , the case of sub cells is interesting. In Python I think a line starting with two # would not be pep8-compliant, and will be modified by most editors when the user 'reformats' the code... We could either insert that space between the two hash signs, or keep the third percent sign and write the sub cell as

# % sub cell title
1 + 1

Which one do you prefer?

Also, more generally, I think there is the question of defining the specifications for that format. nteract has done an interesting job of making the format language agnostic in Hydrogen, and it looks very plausible that they will have to implement another parser (in JavaScript - they're not Python based). I am afraid that, if we don't have clear and well accepted specs, we may end up with incompatible implementations, and that would be a pity, right? Do you have an idea on how we should proceed for that? Is there a natural authority for this (Python board? Jupyter?)

@bcolsen
Member

bcolsen commented Oct 4, 2018

@mwouts

I think I misread your comment. I thought you were talking about a markdown/HTML formatted header in the notebook above the cell, not a comment in the cell.

For a comment in the actual code your suggestion would work, but it would lose the sub cell information. I think actually leaving the cell name comment as is, with the metadata and cell type stripped, would be the safest thing to do:

# %% cell title {JSON metadata}
1 + 1

to

# %% cell title
1 + 1

This way users can use whatever cell separators they want (as long as you have it in Jupytext) and it won't be lost.
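
That transformation could be sketched as below: the marker and the cell name are kept verbatim, and only a bracketed cell type and a trailing JSON block are removed. The function name `strip_header` is hypothetical, and the sketch assumes the cell name contains no `{`.

```python
import re

# Assumption: same marker convention as elsewhere in this thread.
MARKER = re.compile(r"^#\s?%%+")


def strip_header(line):
    """Drop the [cell type] and {metadata} parts of a cell marker line."""
    if not MARKER.match(line):
        return line  # ordinary code and comments are left untouched
    head = line.split("{", 1)[0]            # remove the JSON metadata, if any
    head = re.sub(r"\s*\[\w+\]", "", head)  # remove a bracketed cell type
    return head.rstrip()


stripped = strip_header("# %% cell title {JSON metadata}")
```

Because the percent signs are left alone, sub cell markers such as `# %%%` survive the round trip.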

@goanpeca
Member

goanpeca commented May 5, 2020

Just wondering what is the state of this @ccordoba12 :-p

@ccordoba12
Member

We don't have plans to implement this.

@bcolsen, what do you think about it?

@bcolsen
Member

bcolsen commented May 5, 2020

I would say that this conversation has fulfilled its purpose in that jupytext now supports Spyder style "percent" cells.

The only other interesting idea in this thread would be integrating jupytext with the spyder-notebook plugin so a file in the Spyder editor could be opened in the notebook or a notebook could be opened in the editor. I don't use the notebook at all currently, but maybe @jitseniesen or a notebook user might be interested. It looks like jupytext does most of the heavy lifting and it's mostly a user interface thing. All that being said, maybe if we had this feature I might use the notebook more :-)

@jitseniesen
Member

Nice idea, I opened an issue in the notebook plugin.

@mwouts
Author

mwouts commented May 5, 2020

I would say that this conversation has fulfilled its purpose in that jupytext now supports Spyder style "percent" cells.

I do agree 100%. I think we're done with this question and that now, the Spyder format and the py:percent format in Jupytext do match well (so I'll close this issue, thanks again for your feedback).

Nice idea, I opened an issue in the notebook plugin.

Thank you @jitseniesen !

Indeed, it would be great to offer integration directly in Spyder, as is done in Vim with the jupytext.vim plugin: when users open an .ipynb file in the editor, they get the .py form, and when they save it, the .ipynb file is updated (as for the other way around, i.e. opening .py files as notebooks in Jupyter, this does come with Jupytext and its custom contents manager).

@mwouts mwouts closed this as completed May 5, 2020