Add performance benchmark #88

Merged
merged 6 commits into develop on Feb 28, 2020
Conversation

chrisjsewell
Member

No description provided.

@chrisjsewell
Member Author

Block quotes don't seem to be getting rendered correctly: https://181-240151150-gh.circle-artifacts.com/0/html/using/benchmark.html#blockquote. However, the source seems to be parsing as expected:

> This is the first level of quoting.
>
> > This is nested blockquote.
>
> Back to the first level.
<document source="notset">
    <block_quote>
        <paragraph>
            This is the first level of quoting.
        <block_quote>
            <paragraph>
                This is nested blockquote.
        <paragraph>
            Back to the first level.
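
For comparison, here is a minimal sketch (assuming only docutils is installed) that produces the same pseudo-XML shape from the equivalent reStructuredText, where indentation creates block quotes; the lead-in paragraph is only there because the quotes need surrounding text to be indented against:

```python
# Reproduce the nested block_quote structure above with docutils alone.
from docutils.core import publish_doctree

rst_source = """\
A lead-in paragraph.

    This is the first level of quoting.

        This is nested blockquote.

    Back to the first level.
"""

doctree = publish_doctree(rst_source)
print(doctree.pformat())  # pseudo-XML dump, like the <document> shown above
```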

@chrisjsewell
Member Author

@choldgraf the issue above looks to be another fix needed in pandas_sphinx_theme, since when I remove the theme I get the correct block quote indentation: https://183-240151150-gh.circle-artifacts.com/0/html/using/benchmark.html#blockquote

This reverts commit 5761fdb.
@choldgraf
Member

good catch, opened up: pydata/pydata-sphinx-theme#103

btw, I'm curious how this maps on to, say, the amount of content that the QuantEcon book has. It seems like our Sphinx parser will take relatively more time, but what about the absolute time for an amount of content like what the QE lectures have?

@chrisjsewell
Member Author

> btw, I'm curious how this maps on to, say, the amount of content that the QuantEcon book has. It seems like our Sphinx parser will take relatively more time, but what about the absolute time for an amount of content like what the QE lectures have?

How would you envisage benchmarking this? As we saw before in your profiling, the bottleneck will really be in calling certain roles/directives that do a lot of processing (perhaps we could add a profiler for that, or upstream it to Sphinx). The raw parsing speed would mainly be a factor if you are doing 'real-time' parsing (for linting, previews, etc.); there you probably wouldn't actually call all the directives/roles (maybe just a small 'whitelist').
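
For the 'real-time' case, a rough micro-benchmark sketch using only the standard library; `parse` here is a hypothetical placeholder for whichever parser is under test, not a real API:

```python
# Time raw parsing speed over repeated runs; swap in the real parser call.
import timeit

source = "\n".join(["Lorem ipsum dolor sit amet."] * 1000)  # ~1000-line doc

def parse(text):
    return text.splitlines()  # placeholder: substitute the parser under test

n_iter = 1000
total = timeit.timeit(lambda: parse(source), number=n_iter)
print(f"{total:.2f}s total, {total / n_iter * 1000:.3f}ms per iteration")
```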

@chrisjsewell chrisjsewell merged commit cf3352c into develop Feb 28, 2020
@choldgraf
Member

@chrisjsewell I don't care so much about benchmarking this, but about having a number that we can use to convince people that performance won't be an issue here. Your test doc was 1000 lines (which is probably longer than most), and you ran the iteration 1000 times.

E.g., does this mean that if this process took 73 seconds, then we have 73 / 1000 = 0.073 seconds per page? Or about 1 second of processing per 10 pages? I'm just trying to tie these benchmarking numbers to people's expected subjective experience.
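
Spelling out that arithmetic (the 73 s figure is the assumed benchmark wall time):

```python
total_seconds = 73   # assumed: benchmark wall time for 1000 iterations
iterations = 1000    # each iteration parses the ~1000-line test doc
per_page = total_seconds / iterations
print(per_page)        # 0.073 seconds per page
print(per_page * 10)   # ~0.73 seconds per 10 pages
```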

@chrisjsewell
Member Author

chrisjsewell commented Feb 29, 2020

> E.g., does this mean that if this process took 73 seconds, then we have 73 / 1000 = 0.073 seconds per page? Or about 1 second of processing per 10 pages? I'm just trying to tie these benchmarking numbers to people's expected subjective experience.

Going back to my patented sphinx summary:

1. event.config-inited(app,config)
2. event.builder-inited(app)
3. event.env-get-outdated(app, env, added, changed, removed)
4. event.env-before-read-docs(app, env, docnames)

for docname in docnames:
    5.  event.env-purge-doc(app, env, docname)
    if doc changed and not removed:
        6. event.source-read(app, docname, source)
        7. run parser: text -> docutils.document
        8. apply transforms (by priority): docutils.document -> docutils.document
        9. event.doctree-read(app, doctree)

10. event.env-updated(app, env)
11. event.env-check-consistency(app, env)

for docname in docnames:
    12. apply post-transforms (by priority): docutils.document -> docutils.document
    13. event.doctree-resolved(app, doctree, docname)

14. call builder

15. event.build-finished(app, exception)

It means that stage (7) will take x/1000 seconds per page, with x being the value for the DocutilsRenderer (with no sphinx initiation), IF you are parsing a page with no roles/directives. But then, for the subjective experience, you also need to factor in (a) which roles/directives you are using, and how much processing time they take, and (b) all the other stages of the sphinx build.
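
As a rough way to measure that per-page cost in a real build, here is a hedged sketch of a tiny extension (dropped into conf.py) that times stages (6)-(9) for each document; the event names and handler signatures are standard Sphinx, the timing dict is ours:

```python
# Time each document from source-read (stage 6) to doctree-read (stage 9).
import time

_start = {}

def on_source_read(app, docname, source):
    _start[docname] = time.perf_counter()

def on_doctree_read(app, doctree):
    docname = app.env.docname  # docname currently being read
    elapsed = time.perf_counter() - _start.pop(docname, time.perf_counter())
    print(f"{docname}: parse + transforms took {elapsed:.3f}s")

def setup(app):
    app.connect("source-read", on_source_read)
    app.connect("doctree-read", on_doctree_read)
```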

You could write a function to run/measure this, given a certain set of source files (in the same way I run contained sphinx builds in tests/test_sphinx). But obviously you couldn't benchmark this against the other Markdown parsers, except recommonmark, because they don't have sphinx parsers.
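
A sketch of what such a function might look like, timing a contained build end-to-end (the docs path is an assumption; the Sphinx application API itself is real):

```python
# Time a full contained Sphinx build for a given source directory.
import tempfile
import time
from pathlib import Path

from sphinx.application import Sphinx

def time_build(srcdir: str, buildername: str = "html") -> float:
    """Run a contained build of ``srcdir`` and return the wall time."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp)
        app = Sphinx(
            srcdir, srcdir,          # source dir and conf.py dir
            str(out / buildername),  # output directory
            str(out / "doctrees"),   # doctree cache directory
            buildername,
        )
        start = time.perf_counter()
        app.build()
        return time.perf_counter() - start

print(f"full build: {time_build('path/to/docs'):.2f}s")  # assumed path
```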

@chrisjsewell chrisjsewell deleted the benchmark branch March 4, 2020 09:13