Use Pandoc to convert ipynb into markdown #94

Closed
choldgraf opened this issue Feb 2, 2019 · 8 comments

@choldgraf
Member

choldgraf commented Feb 2, 2019

Currently we're using nbconvert and a custom template to create jekyll-ready markdown files. Now that pandoc supports jupyter notebooks, perhaps we could use it to offload some custom code onto pandoc, which is far better tested and more widely used.

Pandoc will convert Jupyter Notebooks to pandoc-flavored markdown, which is pretty close to jekyll-flavored markdown. I think that if we did the pandoc conversion, then ran find/replaces on:

  1. ``` {.python} -> ```python
  • (or r, or whatever the language was)
  2. ::: {. -> ::: {:.
  • (since Jekyll classes start with {:.classname, not {.classname)
  3. ::: ->
  • (since the remaining ::: shouldn't be needed)

I think that this would get us the same functionality, including a ton of metadata from the conversion process.
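A minimal sketch of the three find/replace steps listed above, written in Python. It is an illustration only: the function name `pandoc_md_to_jekyll` is hypothetical, the regular expressions assume pandoc emits exactly three colons and one attribute class per fence, and whether the result matches what Jekyll expects is not verified here.

```python
import re

# Built programmatically so the example does not contain a literal code fence.
FENCE = "`" * 3


def pandoc_md_to_jekyll(text: str) -> str:
    """Apply the three proposed find/replace steps to pandoc-flavored markdown."""
    # Step 1: turn a fence opener with a braced class into a bare language name.
    text = re.sub(
        r"^" + FENCE + r" ?\{\.([\w+-]+)\}[ \t]*$",
        FENCE + r"\1",
        text,
        flags=re.MULTILINE,
    )
    # Step 2: rewrite the pandoc attribute prefix into the Jekyll one.
    text = re.sub(r"^::: \{\.", "::: {:.", text, flags=re.MULTILINE)
    # Step 3: drop every remaining ":::" marker (openers and closers alike).
    text = re.sub(r"^:::[ \t]?", "", text, flags=re.MULTILINE)
    return text


sample = "::: {.cell}\n" + FENCE + " {.python}\nprint(1)\n" + FENCE + "\n:::\n"
converted = pandoc_md_to_jekyll(sample)
print(converted)
```

The steps run in the order given in the list, so step 3 also strips the ":::" prefix that step 2 leaves in front of each `{:.classname}`.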

Also, I don't believe that pandoc is difficult to install, since it has binaries available for many platforms. Any feedback on that would be great.

@choldgraf
Member Author

@jgm in case you've got intuition on this one, does the above look roughly true?

@jgm

jgm commented Feb 2, 2019

If you want your code blocks to say

``` python

instead of

``` {.python}

then you can do pandoc -t markdown-fenced_code_attributes. (This disables the full attribute syntax with the curly braces. Read - as "minus".) You may need to make some other adjustments in markdown extensions to more closely match what jekyll needs. I'm not too familiar with jekyll, but the transformation you recommend with ::: is probably not quite equivalent. It looks like

{:.MyClass}

in jekyll will apply the class to the next element. But in pandoc ::: is a fence creating a div with attributes; it ends at the next ::: and might contain several elements.

Looks like kramdown supports the <div markdown="1"> syntax, to allow markdown to be interpreted inside a div marked with HTML tags. If that's right, then you can get pandoc to generate

<div markdown="1" class="MyClass">
...
</div>

instead of

::: {.MyClass}
...
:::

by disabling fenced_divs and enabling markdown_attribute. So,

pandoc -t markdown-fenced_divs+markdown_attribute-fenced_code_attributes

But it doesn't quite work: it looks like the markdown writer doesn't add the markdown="1". I'll look into fixing that.
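For a pandoc build without that fix, one conceivable workaround is to add the attribute in a post-processing pass. The sketch below is only an illustration under assumptions: it assumes the writer's output contains plain `<div ...>` opening tags, and `add_markdown_attribute` is a hypothetical helper, not part of pandoc or jupyter-book.

```python
import re

# Matches an opening <div ...> tag and captures its attribute string.
_DIV_OPEN = re.compile(r"<div\b([^>]*)>")


def add_markdown_attribute(text: str) -> str:
    """Insert markdown="1" into every opening div tag that lacks it."""

    def fix(match: re.Match) -> str:
        attrs = match.group(1)
        # Leave tags alone if they already carry a markdown attribute.
        if re.search(r"\bmarkdown\s*=", attrs):
            return match.group(0)
        return '<div markdown="1"' + attrs + ">"

    return _DIV_OPEN.sub(fix, text)


before = '<div class="MyClass">\n...\n</div>'
after = add_markdown_attribute(before)
print(after)
```

Running the helper twice leaves the text unchanged, so it should be safe to apply to output that already has the attribute.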

@jgm

jgm commented Feb 2, 2019

OK, with the commit I'm about to push, the following works to produce the divs with markdown="1":

pandoc -t markdown-fenced_divs+markdown_attribute-fenced_code_attributes-native_divs-markdown_in_html_blocks

commit 20a0b4433f1fa72f921b5b660a43c221926634ec
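As a rough sketch of how a build script might assemble that invocation, the Python below composes the output format from a base name plus enabled and disabled extensions. The helper names `pandoc_format` and `pandoc_command` and the file names are hypothetical. The extensions appear in a different order than in the command above, on the assumption that pandoc does not care about the order of distinct extensions. The `-f ipynb` reader is assumed to be available in the installed pandoc.

```python
from typing import Iterable, List


def pandoc_format(base: str, enable: Iterable[str] = (), disable: Iterable[str] = ()) -> str:
    """Build a pandoc format spec such as 'markdown+ext_a-ext_b'."""
    plus = "".join("+" + name for name in enable)
    minus = "".join("-" + name for name in disable)  # read "-" as "minus"
    return base + plus + minus


def pandoc_command(src: str, dest: str) -> List[str]:
    """Return the argument list for converting a notebook to Jekyll-oriented markdown."""
    target = pandoc_format(
        "markdown",
        enable=["markdown_attribute"],
        disable=[
            "fenced_divs",
            "fenced_code_attributes",
            "native_divs",
            "markdown_in_html_blocks",
        ],
    )
    return ["pandoc", "-f", "ipynb", "-t", target, src, "-o", dest]


cmd = pandoc_command("notebook.ipynb", "notebook.md")
print(" ".join(cmd))
# To actually run it (requires a pandoc build that includes the commit above):
# import subprocess; subprocess.run(cmd, check=True)
```

Keeping the extension lists as data makes it easier to adjust them if Jekyll turns out to need other markdown extensions toggled.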

@emdupre
Collaborator

emdupre commented Feb 12, 2019

This would be really great to have! It looks like R has something in the works, Radix, which uses pandoc to convert RMarkdown into Distill-style sites.

Right now their recommended workflow for Jupyter Notebooks is to convert to markdown, but pandoc would likely be a better bridge between these two projects.

@choldgraf
Member Author

@emdupre as a user, do you think "you need to use pandoc" would be a big bump in complexity? Most of it would still happen under the hood, so I don't think users would need to be exposed to it much (and we could then use things like citeproc for references, I believe).

I think that @jgm mentioned there may be some changes to how pandoc handles the notebook in the pandoc AST, so we should probably wait until that happens before exploring this further.

If somebody wants to prototype ripping out the current markdown conversion and adding a dependency on pandoc (no bells and whistles yet, just current functionality w/ pandoc) then that'd be awesome!

@emdupre
Collaborator

emdupre commented Feb 12, 2019

One question from the user-experience side: would this mean we develop a jupyter-book Docker image?

It looks like in converting to pandoc we could pull in their docker container as it's finalized. I'm quite comfortable with docker, and I know we've used it with success in fMRIPrep (i.e., many users are able to run the docker / singularity images, though we do end up with a modest amount of container support on NeuroStars).

Relying on users to have a functioning pandoc dependency is the obvious alternative, but in my experience distributing containers is actually easier to debug.

I can work on a prototype with the pandoc Docker image, if you think that's a viable path forward!

@choldgraf
Member Author

I'm a fan of making both an option :-) I think that getting a working Docker image that had:

  • Ruby
  • Pandoc
  • jupyter-book + env needed for it

installed would be a really nice first start and wouldn't be too hard to maintain. I think the biggest challenge there would be figuring out how to expose ports etc properly so that you can still demo your site from within the container.

Another thing we'd need to make sure of is that we don't make it harder to work without Docker (that's a more long-term conversation).

@choldgraf
Member Author

Closing, as this will be superseded by https://beta.jupyterbook.org/intro.html

choldgraf added a commit to choldgraf/jupyter-book that referenced this issue Apr 28, 2020
…docs_tests

updating CLI docs and adding more tests