
Have a sync meeting? #44

Closed
choldgraf opened this issue Mar 22, 2020 · 24 comments

@choldgraf
Member

Hey all - in particular, @jstac , @mmcky , @chrisjsewell

It's been several weeks since the last time that we spoke together. There has been a ton of progress on our various tools (though still much more to do!)

Can we have a meeting to re-sync a little bit, set some expectations for what we'll have time to work on in the coming weeks, and make a plan for early releases? In particular, I'd like to plan a date by which we release early versions of these tools as "public-ready".

One reason I think this is timely is because a lot of courses are going to move online in the next month, over the summer, and into next semester. The earlier we can solidify the EBP stack, the higher chance that we will be providing a really crucial tool to help instructors move their work remote.

I'm generally available for most of this and next week...can we find a time that works? I am also happy to have two meetings if the timezones don't work out...

@chrisjsewell
Member

chrisjsewell commented Mar 23, 2020

Hey yeh I'm sure I can squeeze in some time 👍 (although I won't do much work on anything ebp related this week).

One thing I wanted to float, which would involve a pretty big change to myst-parser (although hopefully it shouldn't have much impact on upstream packages), is moving to a different Markdown parser...

When we initially looked at parsers, we correctly surmised that the current Python-based Markdown parsers were insufficient for our needs: principally lacking extensibility to add the 'MyST' components, and the ability to capture line numbers for warning/error reporting. So now we are using mistletoe-ebp, which at this point is basically a completely separate thing from mistletoe (driven by that package's maintainer being unavailable).
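To illustrate why line numbers matter here, the sketch below splits a document into blank-line-separated blocks and keeps each block's source line range, so a warning can point at the right place. It is only a toy under stated assumptions, not how mistletoe-ebp or markdown-it work internally (markdown-it records a comparable line range in each block token's `map`); the names `split_blocks` and `source` are hypothetical.

```python
from typing import List, Tuple


def split_blocks(text: str) -> List[Tuple[int, int, str]]:
    """Split text into blank-line-separated blocks with 1-based (start, end) lines."""
    blocks: List[Tuple[int, int, str]] = []
    current: List[str] = []
    start = 0
    for number, line in enumerate(text.splitlines(), start=1):
        if line.strip():
            if not current:
                start = number  # first line of a new block
            current.append(line)
        elif current:
            # A blank line closes the block that ended on the previous line.
            blocks.append((start, number - 1, "\n".join(current)))
            current = []
    if current:
        blocks.append((start, start + len(current) - 1, "\n".join(current)))
    return blocks


# Hypothetical input with a misspelled directive name in the third block.
source = "# Title\n\nfirst paragraph\nstill first\n\n```{unknwon}\nbody\n```\n"
blocks = split_blocks(source)

for start, end, content in blocks:
    if content.startswith("```{unknwon}"):
        print(f"WARNING: unknown directive (lines {start}-{end})")
```

Without the stored range, the warning could only say that something in the file is wrong, which is not much help in a long document.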

In my recent messing around with the VS Code extension, to extend VS Code's built-in previewer, I've needed to use the JavaScript package they use: https://github.com/markdown-it/markdown-it. This parser is great, in terms of extendability; there's plenty of extensions already available, and it was pretty easy for me to write an extra one for MyST: https://github.com/ExecutableBookProject/myst-language-support/blob/master/src/md_it_plugin.ts, and obviously it's an extremely well used/tested/supported package.

This got me thinking: does a Python port of it exist (no), or can you create one, given that JavaScript is very similar to Python? This is what I've started doing in: https://github.com/chrisjsewell/markdown-it/tree/js2py/lib; if you compare the original package and the fork, you'll see that it basically looks exactly the same, just with Python files swapped in for the JavaScript ones. And (to the extent of the components that I've ported) it actually works! You can clone this and run:

In [1]: from lib import MarkdownIt                                                            
In [2]: md = MarkdownIt("working")                                                            

In [3]: md.render(""" 
   ...: hallo __*heres*__ 
   ...:  
   ...: - my port or `MarkdownIt` 
   ...: """)                                                                                  
Out[3]: '<p>hallo <strong><em>heres</em></strong></p>\n<ul>\n<li>my port or <code>MarkdownIt</code></li>\n</ul>\n'

This would require extra time/effort to fully port + implement the MyST syntax extensions + rewrite aspects of myst-parser to use the new API. But I think it has a lot of long term benefits:

  1. It has a lot better scope for community buy-in: calling it something like markdown-it-py, you have instant name recognition. Potentially also we can get support/endorsement from the actual markdown-it guys
  2. You instantly get a decent level of cross-language support: Python, which we obviously require for the rest of our tool-chain, but also JavaScript for any web/HTML based stuff, i.e. VS Code and Jupyter Lab extensions
  3. It makes it really easy to add extensions: as I mentioned before, there are plenty of existing plugins: https://www.npmjs.com/search?q=keywords:markdown-it-plugin, and it's pretty easy to port any of these over to Python. Or you write it in Python first, then port it to JavaScript.
  4. It should be relatively easy to test/maintain: you can just auto-generate a bunch of test inputs/outputs using the actual markdown-it and check that this package creates the same output (+ running the CommonMark spec tests). With a modicum of JS + Python knowledge it's pretty easy to compare the two code bases, to check for any discrepancies, and to port any updates from markdown-it to markdown-it-py
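The testing approach in point 4 could look roughly like the sketch below: expected HTML is generated once by the reference markdown-it, stored as fixtures, and the port is checked against it. The names `Fixture`, `check_fixtures`, and the stand-in `toy_render` are hypothetical; a real run would pass the port's own render function and fixtures dumped from the JavaScript package.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Fixture:
    """One test case: Markdown input plus the HTML the reference parser produced."""
    name: str
    markdown: str
    expected_html: str


def check_fixtures(render: Callable[[str], str], fixtures: List[Fixture]) -> List[str]:
    """Return the names of fixtures whose rendered output differs from the reference."""
    failures = []
    for fixture in fixtures:
        if render(fixture.markdown) != fixture.expected_html:
            failures.append(fixture.name)
    return failures


def toy_render(text: str) -> str:
    """Stand-in renderer (NOT a Markdown parser): wraps each non-empty line in <p>."""
    return "".join(f"<p>{line}</p>\n" for line in text.splitlines() if line.strip())


# Hypothetical fixtures; in practice these would be generated by the reference markdown-it.
fixtures = [
    Fixture("plain paragraph", "hallo", "<p>hallo</p>\n"),
    Fixture("emphasis", "*hi*", "<p><em>hi</em></p>\n"),
]

failed = check_fixtures(toy_render, fixtures)
print(failed)  # the stand-in does not handle emphasis, so that fixture is reported
```

Because the harness only needs a `render` callable, the same fixtures can be reused unchanged as more components of the port are completed.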

@choldgraf
Member Author

@chrisjsewell sounds good - if we don't hear from the others, then I'd also like to just meet for 30 minutes or so to get on the same page. Knowing what your availability will be in general over the coming weeks will be helpful in planning what to work on.

re: markdown-it...that is a pretty intriguing idea, some initial thoughts from me:

  • Agreed that one potential benefit of this is a standardization of tools across languages. If we wanted a javascript markdown parser for MyST, then we just need to add a markdown-it extension in javascript right?
  • What do you think about the underlying structure of markdown-it? I remember you quite liked mistletoe, so curious what you see as better.
  • Any idea on whether this will affect speed?

I think that the biggest drawback here is that it would be a chunk of work to do, and we are already operating on limited resources and availability due to COVID + @chrisjsewell starting his new position soon.

How about in our meeting, we figure out the things that need to happen for an MVP of our final user-facing build chain. If the remaining items are doable by others, perhaps @AakashGfude and I can focus on those pieces and @chrisjsewell could focus on exploring the new markdown-it-py (pymarkdown-it?) parser.

What do you think?

@phaustin

I don't have anything particular to contribute to a timeline discussion, but we (EOAS/UBC) are in the process of moving about a dozen courses over to Jupyter in the next three years, and I'm currently interviewing summer co-op students to begin work on one of our larger courses (a 2nd year MATLAB-based coding introduction). We're more than happy to help with field testing, user documentation, etc.

@choldgraf
Member Author

@phaustin as an aside, thank you for all the testing, opening PRs/Issues, etc...it is really helpful :-)

As you have students etc continue to transition material, please do have them reach out, and if you know anybody that is technically-inclined and may be interested in helping out with any of the projects, we'd love that too!

@phaustin

for sure -- we're just starting a $250K project, and the timing with ebp is really fortuitous. The first couple pages of our proposal give an idea of our objectives/workplan.

@choldgraf
Member Author

That's fantastic! I suspect that many, many courses are just now launching efforts to move their material to online only, so I hope that the EBP tools will be a huge help here. The more bugs you can uncover / feedback to give / etc, the better @phaustin :-)

BTW - are you building things directly with Sphinx and MyST-NB? Or are you using the command-line interface at https://github.com/executablebookproject/cli ?

@phaustin

right now I'm just replacing nbsphinx with myst_nb in my own course repos. I'm also nearly finished with an nbconvert jinja2 template to produce pdf-ready html following pagedown/pagedjs. I'll move that over to sphinx once I figure out why equation numbers aren't being rendered. For us, pdf is a pretty high priority -- students expect labs/handouts in pdf that they can annotate. Using latex for pdf rendering is probably a bridge too far for the 2nd wave of our adopters (geophysics faculty using matlab). The plan is to use @page media and trio_cdp to make that step as low friction as possible.

@choldgraf
Member Author

@phaustin I would love help in figuring out how to get paged HTML -> PDF support in the Jupyter Book sphinx theme...in Jupyter Book 1.0, I was using PrintJS for this, but it was still kind of clunky. I know that @mmcky and @AakashGfude are planning to work on a top-quality Latex template to allow Sphinx to build PDF that way, but for the HTML theme, I would also love to have a top-quality paged-output that looks nice.

@jstac
Member

jstac commented Mar 23, 2020

Hi all, sorry to be late to the party. It would be good to hear updates. I ditched my Nokia for a Pixel 3a and now actually have the ability to speak during our zoom calls :-)

@phaustin, thanks for trialing the tools. It's great to have you involved at this early stage. All feedback is most appreciated, as well as requests for new features. The more diverse the pool of early adopters, the better the tools will be in terms of broad applicability.

I can chat this afternoon west coast time --- any time after 9:30am Canberra time.

@mmcky and @AakashGfude, it would be good to have you involved in this call, if possible. I assume you're not going anywhere 😬

I'm keen to hear about hooking up jupyter-cache so we can get the CLI working, as well as PDF output.

@choldgraf
Member Author

I could meet from 4:30pm onward today (California time), or 3:30pm onward tomorrow. Do we think it's possible to get all of Australia / Cali / England in one call? Or try to do separate calls?

@mmcky
Member

mmcky commented Mar 23, 2020

I am free anytime after 9:30am Canberra time also.

@chrisjsewell
Member

I can make this

@jstac
Member

jstac commented Mar 23, 2020

So, 4:30pm PST on zoom?

Ping @AakashGfude

@mmcky
Member

mmcky commented Mar 23, 2020

Ah sorry was an hour early.

4:30PM - California
10:30AM - Canberra
11:30PM - London

@choldgraf
Member Author

Great, looking forward to seeing you all in ~45 minutes

@chrisjsewell
Member

> re: markdown-it...that is a pretty intriguing idea, some initial thoughts from me:

> Agreed that one potential benefit of this is a standardization of tools across languages. If we wanted a javascript markdown parser for MyST, then we just need to add a markdown-it extension in javascript right?

Yes absolutely. As per the markdown-it-py port, you would just have one file/folder that contains the plugin written in JS and another written in Python.

Note obviously that you can't have a sphinx renderer in JS, and so here (as I have done in the VS Code extension) you add extra aspects to the JS plugin to do some simple mirroring of the sphinx logic, for previews, e.g. treating certain directives as admonitions, and correcting the language of code/code-block directives.
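As a rough illustration of that kind of mirroring, a preview plugin can map directive names to a simplified rendering instead of running Sphinx. The sketch below is written in Python to match the other code in this thread, although the VS Code extension does this in its JS plugin; the directive sets, the `preview_directive` function, and the HTML it emits are assumptions, not the extension's actual logic.

```python
from html import escape

# Assumed sets of directive names; the real plugin's lists may differ.
ADMONITION_DIRECTIVES = {"note", "warning", "tip", "important"}
CODE_DIRECTIVES = {"code", "code-block"}


def preview_directive(name: str, argument: str, body: str) -> str:
    """Render a directive for preview only, approximating what Sphinx would do."""
    if name in ADMONITION_DIRECTIVES:
        # Treat the directive as a titled admonition box.
        return (
            f'<div class="admonition {name}">'
            f"<p>{name.title()}</p><p>{escape(body)}</p></div>"
        )
    if name in CODE_DIRECTIVES:
        # The directive argument, not the directive name, gives the code language.
        language = escape(argument) if argument else "text"
        return f'<pre><code class="language-{language}">{escape(body)}</code></pre>'
    # Unknown directives fall back to a plain listing of their content.
    return f"<pre><code>{escape(body)}</code></pre>"


note_html = preview_directive("note", "", "Remember to <save>.")
code_html = preview_directive("code-block", "python", "print(1 < 2)")
print(note_html)
print(code_html)
```

The fallback branch keeps the preview usable for directives that only Sphinx knows how to render.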

> What do you think about the underlying structure of markdown-it? I remember you quite liked mistletoe, so curious what you see as better.

While mistletoe is, IMO, the best API of the python packages, I'd say markdown-it is just next-level, in terms of its API simplicity and flexibility.

> Any idea on whether this will affect speed?

Well given markdown-it is battle tested daily with the 1000s of people using VS Code's previewer, I'd imagine it's pretty rapid. But obviously I can only compare properly when I have ported all the CommonMark spec parts of the code, and can run the benchmarking tests: https://mistletoe-ebp.readthedocs.io/en/latest/using/intro.html#performance

@mmcky
Member

mmcky commented Mar 23, 2020

Join Zoom Meeting
https://anu.zoom.us/j/661104161?pwd=UlJiR3pST3BLSnJqOC9WNUlBN3JSdz09

Meeting ID: 661 104 161
Password: 538474

One tap mobile
+61370182005,,661104161# Australia
+61871501149,,661104161# Australia

Dial by your location
+61 3 7018 2005 Australia
+61 8 7150 1149 Australia
+61 2 8015 6011 Australia
Meeting ID: 661 104 161
Find your local number: https://anu.zoom.us/u/adPV8hRMX1

Join by SIP
661104161@113.197.7.78
661104161@113.197.7.79

Join by H.323
113.197.7.78
113.197.7.79
Meeting ID: 661 104 161
Password: 538474

@choldgraf
Member Author

waiting in the meeting room now...

@mmcky
Member

mmcky commented Mar 24, 2020

hey guys -- I think we just timed out.

@choldgraf
Member Author

Hey all - thanks for a productive meeting. I'll get to work on finishing up the CLI stuff and improving documentation. Here were some major takeaways for me, please feel free to add your own if I missed something:

  • Main things to do before a first release of the stack:

    • Merge @AakashGfude 's PR for jupyter cache in myst-nb
    • Get a myst-notebook -> jupytext -> sphinx chain working that uses the cache
    • Figure out the book template features that we want (for what is created in jupyter-book create mybookname)
    • Improve the user-facing CLI documentation so it is user-ready
      • Agree on the list of features to add to this documentation before we're ready to go
    • Figure out what kind of "quick start" options users should have access to from the CLI
    • Get a minimal PDF output working (maybe via HTML) and mention we'll have latex working in the future
  • Future items

    • Explore a full LSP that can be utilized by non-js editors like VIM
    • Parallel jupyter-cache execution
    • PDF generation via latex

@mmcky
Member

mmcky commented Mar 24, 2020

thanks @choldgraf this is a nice summary.

@chrisjsewell
Member

chrisjsewell commented Mar 24, 2020

@choldgraf

> Any idea on whether this will affect speed?

Test document: spec.md
Test iterations: 50
Running 7 test(s) ...
=====================
mistune               (0.8.4): 5.29 s
mistletoe            (0.10.0): 14.08 s
*markdown-it-py       (x.x.x): 19.45 s
commonmark-py         (0.9.1): 33.66 s
panflute             (1.12.5): 46.91 s
pymarkdown            (3.2.1): 63.13 s
pymarkdown:extra      (3.2.1): 73.81 s
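A comparison like the one above can be produced with a small timing harness. The sketch below is not the mistletoe-ebp benchmark script; it assumes each parser is exposed as a callable taking the document text, and the two stand-in "parsers" and the `benchmark` helper are hypothetical placeholders.

```python
import timeit
from typing import Callable, Dict


def benchmark(
    parsers: Dict[str, Callable[[str], object]], text: str, iterations: int = 50
) -> Dict[str, float]:
    """Time each parser processing `text` `iterations` times; return total seconds."""
    results = {}
    for name, parse in parsers.items():
        results[name] = timeit.timeit(lambda: parse(text), number=iterations)
    return results


# Stand-in "parsers" (hypothetical): swap in real render functions to compare libraries.
parsers = {
    "split-lines": lambda text: text.splitlines(),
    "split-words": lambda text: text.split(),
}

document = "hallo __*heres*__\n\n- a list item\n" * 200
timings = benchmark(parsers, document, iterations=50)

# Print fastest first, one "name seconds" row per parser.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:<15} {seconds:.4f} s")
```

Totals from a harness like this depend heavily on the machine and the test document, so they are only meaningful relative to each other within one run.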

@choldgraf
Member Author

Great - not much of a difference from current mistletoe there 👍

if your intuition is that this will reduce our maintenance burden and be a useful tool in the python ecosystem, I'd say it's worth trying it out. I want to make sure you don't accidentally fall down an unforeseen rabbit hole, but if you think it's a good choice then I support it. Probably not something that we need to prioritize ASAP or anything, unless you think it'd save significant time in other projects you're working on...

@jstac
Member

jstac commented Mar 24, 2020

Thanks for putting the list together @choldgraf . I'll make some comments in a separate issue on a minimum demo project and quickstart.

I agree that markdown-it-py is likely to reduce maintenance burden.

@jstac jstac mentioned this issue Mar 24, 2020
11 tasks
chrisjsewell added a commit to executablebooks/MyST-Parser that referenced this issue Apr 1, 2020
This commit implements the move from `mistletoe` to `markdown-it-py` as the underlying markdown parser. The reasons for this are discussed in executablebooks/meta#44 (comment) and executablebooks/meta#47, and PR #123 discusses the update in more detail.

Additional changes:

- Update `pydata-sphinx-theme` requirement
- Improve testing and move to GitHub Actions CI
- Add tests and fixes for reporter warnings and include directive
- Add documentation of sphinx parser options
- Apply doc fixes suggested by @rossbar in #121
- Add warning for non-consecutive headings