Export Notebook - markdown cells not correctly exported to ipynb #1296

Closed
dynobo opened this issue Apr 22, 2018 · 23 comments · Fixed by #1498
@dynobo

dynobo commented Apr 22, 2018

Description:

When exporting a .py file to an .ipynb file, Markdown cells are exported as code cells rather than as Markdown cells.

Steps to Reproduce:

  1. Copy the following code to a new py-File:
#%%
print('hello world')

#%%markdown
# # Headline1
# - bullet point
# **bold text**

# %%
print('bye world')
  2. Run "Hydrogen - Export Notebook"

  3. Open the created ipynb file in Jupyter Notebook. You'll see:

screen1

I would expect a markdown cell instead:

screen2
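
To make the difference concrete, here is a minimal sketch of the two cell shapes in the notebook JSON (nbformat 4). The exact `source` string Hydrogen writes for the exported code cell is an assumption here; only the `cell_type` difference is taken from the report.

```python
import json

# Roughly what the export produces today, per the report: a code cell whose
# source still carries the comment prefixes (exact content is an assumption).
exported_cell = {
    "cell_type": "code",
    "execution_count": None,
    "metadata": {},
    "outputs": [],
    "source": "# # Headline1\n# - bullet point\n# **bold text**",
}

# What the report expects: a markdown cell. Markdown cells have no
# "outputs" or "execution_count" fields.
expected_cell = {
    "cell_type": "markdown",
    "metadata": {},
    "source": "# Headline1\n- bullet point\n**bold text**",
}

print(json.dumps(expected_cell, indent=2))
```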

Versions:
Hydrogen 2.4.1
Atom 1.25.1
Running on Solus Linux

Logs:
not relevant

@BenRussert BenRussert added the enhancement 🌟 New feature ideas label Jun 16, 2018
@Madhu94
Contributor

Madhu94 commented Jun 28, 2018

Picking this up, as I made the original PR for exporting to notebook.

@Madhu94
Contributor

Madhu94 commented Jul 1, 2018

@dynobo Is the "%%markdown" syntax expected to behave like IPython's "%%markdown" cell magic? In that case, I think the expected output would be this:

image

@dynobo
Author

dynobo commented Jul 4, 2018

@Madhu94
No, not really. I would expect it to be converted into a cell of type "markdown". After running a cell of this type, the source code vanishes and only the rendered markdown stays visible.

For an example, see the text in this shot of a notebook. The first two cells ("Jupyter Notebook Slides.." and "Overview") are markdown cells:
notebook

@Madhu94
Contributor

Madhu94 commented Jul 13, 2018

My mistake, I thought %%markdown was a magic like %%javascript. I just wondered where the "%%markdown" syntax came from; I couldn't find it in the Hydrogen docs either.

@dynobo
Author

dynobo commented Jul 16, 2018

@Madhu94: I stumbled upon it through the "rich document" feature, described briefly at the bottom of this page. I expected the markdown parts of such a document to be exported as markdown cells even without the "%%markdown" syntax (this would be my favourite solution), but as that didn't happen, I thought at least the markdown syntax might handle it.

But I want to give a bit of background:

I like R Notebooks a lot, which allow mixing code and Markdown in the same file. There, I can write everything from top to bottom in a single text file in my IDE, and it can also be read in any simple text editor. In Jupyter, for example, I have to create and move cells in the web app, and I need a notebook viewer to render the content.

Hydrogen, with "Rich documents", is just soooo close to filling this need, which would be really cool. Unfortunately, as long as Hydrogen is not "the standard", such a feature would still be quite worthless without being able to export such a Rich Document to Jupyter Notebooks, so I can share it in my university class or something...

So, if the markdown parts of a "Rich document" were rendered as Jupyter markdown cells (with, or even better without, %%markdown), this would be so perfect. I guess other people who know R Notebooks might feel the same... :-)

@kylebarron
Contributor

@dynobo You might want to try the pweave package. It was modeled after R Markdown and lists jupyter notebooks as one of its output formats, though I haven't tested that part.

@kylebarron
Contributor

I'm interested in taking a closer look at this, though I'm about to go on vacation for a week and a half.

It seems like it's currently a simple wrapper around stringifyNotebook. So would a fix be done in nteract/commutable instead of here?

writeFile(fname, stringifyNotebook(store.notebook), function(err, data) {

@BenRussert
Member

@kylebarron yes, it would be in commutable. We have actually been talking about refactoring that package.

@mwouts

mwouts commented Sep 6, 2018

Hello @dynobo, I see you have previous experience with R markdown notebooks. Have you ever tried to open these documents with Jupyter? There are a few plugins around that allow this. I recommend in particular notedown (many GitHub stars) and jupytext (my contribution).

Are you able to run Python cells in markdown or R markdown documents in Hydrogen? If yes, you could experiment with editing these documents simultaneously in Hydrogen and Jupyter (run %autosave 0 if you open both editors at the same time, and use Ctrl+R to reload from file while preserving the kernel).

@kiwi0fruit
Contributor

Does this work in Hydrogen export to ipynb?

# <markdowncell>
# # Headline1
# - bullet point
# **bold text**

As nbconvert seems to support that: http://ipython.org/ipython-doc/2/notebook/nbconvert.html#notebook-json-file-format

@mwouts

mwouts commented Oct 23, 2018

Hello @kiwi0fruit , I think there are plans to support this in Hydrogen at some point. For now you can use an independent program named jupytext (I am the main author). Jupytext does support the double percent cells syntax (this is documented here). Markdown cells are defined like this:

# %% [markdown]
# Content of 
# markdown cell
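
As a rough illustration of how such a block maps back to Markdown text, here is a minimal sketch in plain Python. It is not jupytext's implementation: the helper name `markdown_cell_source` is hypothetical, it assumes `#` as the comment prefix, and it ignores cell titles and metadata on the marker line.

```python
import re

# Assumed marker shape: "# %% [markdown]" with flexible spacing. Titles and
# metadata on the marker line are deliberately not handled here.
MARKDOWN_MARKER = re.compile(r"^#\s*%%\s*\[markdown\]\s*$")


def markdown_cell_source(block: str) -> str:
    """Return the Markdown text of a '# %% [markdown]' block (hypothetical helper)."""
    lines = block.splitlines()
    if not lines or not MARKDOWN_MARKER.match(lines[0]):
        raise ValueError("block does not start with a '# %% [markdown]' marker")
    # Strip one leading '#' and at most one following space from each body line.
    body = [re.sub(r"^# ?", "", line) for line in lines[1:]]
    return "\n".join(body)


example = "# %% [markdown]\n# Content of \n# markdown cell"
print(markdown_cell_source(example))
```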

@kylebarron
Contributor

I think I've narrowed down the issue in Hydrogen. Hydrogen:export-notebook calls exportNotebook() here, which is mostly boilerplate around

JSON.stringify(store.notebook, null, 2)

So the issue here is in store.notebook. The definition of .notebook is:

@computed
get notebook() {
  const editor = this.editor;
  if (!editor) {
    return null;
  }
  // Should we consider starting off with a monocellNotebook ?
  let notebook = commutable.emptyNotebook;
  const cellRanges = codeManager.getCells(editor);
  _.forEach(cellRanges, cell => {
    const { start, end } = cell;
    let source = codeManager.getTextInRange(editor, start, end);
    source = source ? source : "";
    const newCell = commutable.emptyCodeCell.set("source", source);
    notebook = commutable.appendCellToNotebook(notebook, newCell);
  });
  return commutable.toJS(notebook);
}

To me, it appears the issue is likely in codeManager.getCells(editor);. That probably generates cell ranges that include comments in the code blocks, and there probably needs to be a way to divide the cell ranges into Markdown blocks as well.

@kylebarron
Contributor

kylebarron commented Oct 24, 2018

Actually the codeManager.getCells(editor); is fine. It's the commutable.emptyCodeCell that needs to be changed. It needs to be emptyMarkdownCell for cells that started with # %% markdown. This might require a change to getCells, since I don't think it currently returns the text next to %%.
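
The logic described above can be sketched outside Hydrogen. The following is a minimal, standalone Python illustration, not Hydrogen's code (which is JavaScript and builds cells through commutable): the function `script_to_notebook` and the marker regex are assumptions made for this example, and `#` is assumed as the comment prefix. It picks the cell type from the text next to `%%` and strips the comment prefix from markdown cells.

```python
import json
import re

# Assumed marker: "# %%" optionally followed by "markdown" or "[markdown]".
MARKER = re.compile(r"^#\s*%%\s*(\[?markdown\]?)?\s*$")


def script_to_notebook(text):
    """Split a script at cell markers and return an nbformat-4-style dict."""
    cells = []      # list of (cell_type, lines)
    current = None
    for line in text.splitlines():
        match = MARKER.match(line)
        if match:
            if current is not None:
                cells.append(current)
            current = ("markdown" if match.group(1) else "code", [])
        elif current is not None:
            current[1].append(line)
        elif line.strip():
            # Code before the first marker starts an implicit code cell.
            current = ("code", [line])
    if current is not None:
        cells.append(current)

    nb_cells = []
    for cell_type, lines in cells:
        while lines and not lines[-1].strip():
            lines.pop()  # drop trailing blank lines
        if cell_type == "markdown":
            source = "\n".join(re.sub(r"^# ?", "", l) for l in lines)
            nb_cells.append(
                {"cell_type": "markdown", "metadata": {}, "source": source}
            )
        else:
            nb_cells.append({
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": "\n".join(lines),
            })
    return {"cells": nb_cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 2}


SAMPLE = """#%%
print('hello world')

#%%markdown
# # Headline1
# - bullet point
# **bold text**

# %%
print('bye world')
"""

notebook = script_to_notebook(SAMPLE)
print(json.dumps(notebook, indent=2))
```

Run against the script from the original report, this yields a code cell, a markdown cell, and a second code cell.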

@mwouts

mwouts commented Oct 24, 2018

Hello @kylebarron , I think we should use compatible specifications across the multiple implementations of the text to jupyter notebook converters.

May I suggest that Hydrogen could use # %% [markdown] for marking markdown cells? That syntax was suggested by the Spyder team at spyder-ide/spyder#7933, and that's the one implemented in jupytext.

See also the documentation for the percent format in jupytext, or this review of script formats for notebooks. Interestingly the percent format is supported beyond Hydrogen and Spyder, in as many as five Python editors!

@kylebarron
Contributor

kylebarron commented Oct 24, 2018

What about Hydrogen having a superset? Something like

^#\s*%%\s*\[?(markdown|md)\]?

Hydrogen already supports other cell markers,

(%%| %%| <codecell>| In\[[0-9 ]*\]:?)

So we might also allow something like <markdown> or <md>
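
A quick way to see what the proposed pattern accepts is to run it over a few candidate marker lines. This is only a sketch using Python's `re` module; the candidate lines are made up for illustration.

```python
import re

# The superset pattern proposed above, unchanged.
MARKDOWN_MARKER = re.compile(r"^#\s*%%\s*\[?(markdown|md)\]?")

candidates = [
    "# %% [markdown]",
    "# %% markdown",
    "#%%md",
    "# %%",
    "# %% markdown_helpers",  # a cell title that merely starts with "markdown"
]

# Map each candidate line to whether the pattern treats it as a markdown marker.
results = {line: bool(MARKDOWN_MARKER.match(line)) for line in candidates}
for line, matched in results.items():
    print(f"{line!r}: {matched}")
```

Because the pattern is not anchored at the end of the line, the last candidate also matches, so a cell title beginning with "markdown" or "md" would be read as a markdown marker.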

@kiwi0fruit
Contributor

kiwi0fruit commented Oct 24, 2018

I think block comment way is more convenient for one way conversion (hydrogen to ipynb). Like in vscode-ipynb-py-convert and pandoctools.

# %%
'''
# Header

text.
'''

# %%
print('hello')

@kylebarron
Contributor

kylebarron commented Oct 24, 2018

Well in the long term I think it's optimal to have two-way conversion in Hydrogen, both to and from .ipynb files.

Those multiline strings are problematic because, most importantly, you're using Python-specific syntax. Nothing in Hydrogen should be Python specific. Everything should be language agnostic, because we're aiming to support all Jupyter kernels.

@kiwi0fruit
Contributor

kiwi0fruit commented Oct 24, 2018 via email

@mwouts

mwouts commented Oct 24, 2018

Thanks @kylebarron , @kiwi0fruit ! It's good to have this conversation.

> What about Hydrogen having a superset?

Well, if you choose a superset that we can also implement in jupytext, I will be glad to follow your standard. Not a bare markdown keyword, however, because that is hard to distinguish from the cell title used by Spyder. But I have no objection to <markdown> or <md>. By the way, do you plan to support raw cells?

> I think block comment way is more convenient for one way conversion

The Spyder team also suggested representing markdown cells as multiline strings; they like the potential for using f-strings. Sphinx-Gallery also offers markdown cells as multiline strings. However, we have not yet implemented support for these multiline strings in jupytext, and I am not sure we will, for the following reasons:

  • multiline strings are Python specific
  • they are more difficult to parse than commented lines, especially in Hydrogen's case, where the parser has to be developed in JavaScript.

@kiwi0fruit
Contributor

kiwi0fruit commented Oct 24, 2018 via email

@kiwi0fruit
Contributor

kiwi0fruit commented Oct 24, 2018 via email

@kylebarron
Contributor

Sorry, I edited my previous post a bit too extensively after posting...

I think one of the best and most important features of Hydrogen is its total language agnosticness. I don't think any of the maintainers want to add something to Hydrogen that only works for Python.

> By the way, do you plan to support raw cells?

Not in the short term, theoretically in the long term, but it really depends if there's a desire for it in the community.

@kiwi0fruit
Contributor

In addition to vscode-ipynb-py-convert, block-commented Markdown cells are supported by pandoctools (which uses knitty, which uses stitch). The # %% format supported by pandoctools is also convertible to the stitch format.
