empty lines at end of certain files cause parse to fail #31

poldrack · 2020-08-05T14:18:37Z

Describe the bug

The presence of three empty lines at the end of a particular file causes the build to break.

To Reproduce

Steps to reproduce the behavior:

1. Clone https://github.com/poldrack/psych-open-science-guide
2. "jb build guide" should work properly
3. Add two additional line feeds to the end of guide/4_reproducibleanalysis.md
4. "jb build guide" should now fail with an error.
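
The step that adds the line feeds can be scripted. This is only a sketch: `append_line_feeds` is a hypothetical helper written for this illustration, and the path in the usage comment is the file named in the steps above.

```python
def append_line_feeds(path: str, count: int = 2) -> None:
    """Append `count` line feeds to the end of the file at `path`."""
    # newline="\n" keeps the appended characters as plain LF on every platform.
    with open(path, "a", encoding="utf-8", newline="\n") as handle:
        handle.write("\n" * count)


# Usage (run from the root of the cloned repository):
# append_line_feeds("guide/4_reproducibleanalysis.md", 2)
```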

Expected behavior

When the extra lines are added, the following exception occurs:

Environment

Python 3.8.3
output of jupyter-book --version:
Jupyter Book: 0.7.3
MyST-NB: 0.8.4
Sphinx Book Theme: 0.0.33
MyST-Parser: 0.9.0
Jupyter-Cache: 0.2.2
Operating System: Mac OS X

choldgraf · 2020-08-05T14:28:54Z

I think you didn't paste in the exception :-)

poldrack · 2020-08-05T14:33:30Z

odd, thought I had, here it is:

Exception occurred:
File "/Users/poldrack/anaconda3/envs/py38/lib/python3.8/site-packages/markdown_it/rules_block/list.py", line 285, in list_block
contentStart = state.bMarks[startLine]
IndexError: list index out of range

choldgraf · 2020-08-05T15:05:39Z

Interesting - I wonder if the empty lines are being treated as a special block by the markdown parser, then failing because they're empty?

ping @chrisjsewell since this seems like something in the bowels of markdown-it-py. I think we could probably fix this in jupyter book by stripping the end of the page, but maybe it's a bug that should be fixed deeper?
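
For reference, the "strip the end of the page" workaround could look roughly like the sketch below. It assumes the page text is available as a string before it reaches the parser; `strip_trailing_blank_lines` is a hypothetical helper, not an existing Jupyter Book or markdown-it-py function.

```python
def strip_trailing_blank_lines(text: str) -> str:
    """Drop trailing whitespace-only lines, leaving a single final newline."""
    stripped = text.rstrip()
    # An all-whitespace page becomes empty; otherwise end with exactly one newline.
    return stripped + "\n" if stripped else ""
```

This only hides the symptom for a given page; the parser would still fail on the unstripped input.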

firasm · 2020-08-14T00:01:47Z

I have also had this empty-line issue in some of my notebooks. I now know how to debug it, so I've just been fixing it wherever it comes up.

If another reproducible example is needed, I could probably create one if it'll help, but I think the example above may already give enough info?

chrisjsewell · 2020-08-14T02:13:45Z

moving this to markdown-it-py

chrisjsewell · 2020-08-14T02:16:13Z

could someone copy/link here a minimal example Markdown file where this occurs? thanks

choldgraf · 2020-08-14T04:19:03Z

Ping @firasm and @poldrack in case they don't see Chris's request above!

firasm · 2020-08-14T04:33:51Z

oops - didn't get notified of the above! Thanks @choldgraf

@chrisjsewell: Here's an example. There are only two commits in this repo, the first commit without the two blank lines (works fine), and the second commit with the two blank lines (build fails).

▶ jb build .
Running Sphinx v2.4.4
loading pickled environment... done
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 1 source files that are out of date
updating environment: 0 added, 1 changed, 0 removed
reading sources... [100%] markdown                                              
Exception occurred:
  File "/Users/firasm/.pyenv/versions/3.8.3/lib/python3.8/site-packages/markdown_it/rules_block/list.py", line 285, in list_block
    contentStart = state.bMarks[startLine]
IndexError: list index out of range
The full traceback has been saved in /var/folders/64/bfv2dn992m17r4ztvfrt93rh0000gn/T/sphinx-err-k2gw1der.log, if you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error message can be provided next time.
A bug report can be filed in the tracker at <https://github.com/sphinx-doc/sphinx/issues>. Thanks!
Traceback (most recent call last):
  File "/Users/firasm/.pyenv/versions/3.8.3/bin/jb", line 8, in <module>
    sys.exit(main())
  File "/Users/firasm/.pyenv/versions/3.8.3/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/firasm/.pyenv/versions/3.8.3/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/firasm/.pyenv/versions/3.8.3/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/firasm/.pyenv/versions/3.8.3/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/firasm/.pyenv/versions/3.8.3/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/firasm/.pyenv/versions/3.8.3/lib/python3.8/site-packages/jupyter_book/commands/__init__.py", line 140, in build
    _error(
  File "/Users/firasm/.pyenv/versions/3.8.3/lib/python3.8/site-packages/jupyter_book/utils.py", line 65, in _error
    raise kind(box)
ValueError: 
===============================================================================

There was an error in building your book. Look above for the error message.

===============================================================================

P.S. @poldrack: to reproduce the bug, I created a template jupyterbook and replaced the content of markdown.md with your file, guide/4_reproducibleanalysis.md (it was easier than tracking down which one of my files was showing this behaviour). I'll delete the repo once this issue is resolved.

firasm · 2020-08-14T04:34:35Z

For what it's worth, I couldn't reproduce the issue with any old md file by adding two blank lines, only certain files.

sildar · 2020-08-17T11:40:12Z

Hi,

I took a quick look at this.

When processing lists, there is a call to tokenize() that advances the state.line attribute:

list.py 
line 264: state.md.block.tokenize(state, startLine, endLine)

After this line is executed, state.line can be larger than endLine, leading to the IndexError when checking the state at that index.

Breaking right after this line when state.line > endLine doesn't work, though: we first have to update the nextLine variable and close the list. That logic is already in the codebase, but the check comes one line too late:

line 280 onwards:
    token = state.push("list_item_close", "li", -1)
    token.markup = chr(markerCharCode)

    nextLine = startLine = state.line
    itemLines[1] = nextLine
    contentStart = state.bMarks[startLine]

    if nextLine >= endLine:
        break

The easy solution would be to just put the check a few lines earlier.

line 280 onwards:
    token = state.push("list_item_close", "li", -1)
    token.markup = chr(markerCharCode)

    nextLine = startLine = state.line  # we actually need to update these before breaking

    if nextLine >= endLine:
        break

    itemLines[1] = nextLine
    contentStart = state.bMarks[startLine]
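
To make the ordering issue concrete, here is a toy model of the tail of that loop. It is a sketch and not the markdown-it-py code: `read_content_start`, its parameters and the sample offsets are made up for illustration, with `b_marks` standing in for state.bMarks and `line` for state.line after tokenize() has run.

```python
def read_content_start(b_marks, line, end_line, check_first):
    """Mimic one pass through the end of the list loop.

    Returns the content start offset, or None when the loop would break.
    """
    next_line = start_line = line

    # Proposed order: test the bound before touching b_marks.
    if check_first and next_line >= end_line:
        return None

    # Raises IndexError when start_line is past the end of b_marks.
    content_start = b_marks[start_line]

    # Current order: the bound is only tested after the lookup.
    if next_line >= end_line:
        return None

    return content_start


marks = [0, 7, 8]  # toy line-start offsets

# Inside the range, both orders behave the same.
print(read_content_start(marks, 1, 2, check_first=False))
print(read_content_start(marks, 1, 2, check_first=True))

# Past the range, only the early check avoids the IndexError.
print(read_content_start(marks, 3, 2, check_first=True))
```

Calling it with `line=3` and `check_first=False` raises the same kind of IndexError as in the traceback above.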

Also, I'm not sure what itemLines does; it isn't really used in this method.

I can't run the test suite right now to check that this solution doesn't break anything else. But a diff on spec.md shows no difference.

If no one else writes the fix (feel free to do it), I'll do it this evening or tomorrow.

Edit: also, maybe there is a more elegant solution to implement in the tokenize() method.
Edit 2: I improved my reply to include more information/clarify

chrisjsewell transferred this issue from executablebooks/jupyter-book Aug 14, 2020

chrisjsewell added the bug Something isn't working label Aug 14, 2020

executablebooks deleted a comment from welcome bot Aug 14, 2020

chrisjsewell added this to To do in Chris S's TODO list via automation Aug 14, 2020

chrisjsewell changed the title ~~empty lines at end of certain files cause build to fail~~ empty lines at end of certain files cause parse to fail Aug 14, 2020

chrisjsewell mentioned this issue Aug 14, 2020

👌 IMPROVE: Parsing performance #32

Merged

sildar added a commit to sildar/markdown-it-py that referenced this issue Aug 17, 2020

🐛FIX: Fix issue executablebooks#31: parsing a document ending with a …

1352da8

…list and several empty lines fails

sildar mentioned this issue Aug 17, 2020

🐛 FIX: empty lines after certain lists raises exception #36

Merged

chrisjsewell added a commit that referenced this issue Aug 17, 2020

🧪 TEST: Identify minmal failure for #31

ef59092

chrisjsewell linked a pull request Aug 17, 2020 that will close this issue

🐛 FIX: empty lines after certain lists raises exception #36

Merged

chrisjsewell closed this as completed in #36 Aug 17, 2020

Chris S's TODO list automation moved this from To do to Done Aug 17, 2020
