New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialize markdown only once #329

Merged
merged 3 commits into from Jul 8, 2018

Conversation

2 participants
@Glandos
Contributor

Glandos commented Jun 16, 2018

This is a small performance improvement. But it should be somewhat significant for large collections with a lot of markdown.

@Glandos

This comment has been minimized.

Contributor

Glandos commented Jun 17, 2018

Notice that this is about performance for no-op builds. Otherwise, image resizing and video compression is far more important for the CPU.

@saimn

This comment has been minimized.

Owner

saimn commented Jun 18, 2018

Hi,
Could you give an example of the gain with this (on a no-op build) ? From what I can see you need quite a big number of markdown files to notice the difference:

In [3]: %timeit md = Markdown(extensions=['markdown.extensions.meta','markdown.extensions.tables'],output_format='html5')
256 µs ± 9.04 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Also, the py35 build is broken for some reason (ffmpeg disappeared there 😢) but the py36 has a test failing with your PR.

@Glandos

This comment has been minimized.

Contributor

Glandos commented Jun 18, 2018

Performance figures

Using master:

adrien@complexe ~/D/s/tests> perf stat -r10 -B ./sigal_py3_venv/bin/sigal build photos/ _build_prod;
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.54 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 2.03 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 2.05 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.82 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.47 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.95 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.80 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 2.06 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.83 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.40 seconds.

 Performance counter stats for './sigal_py3_venv/bin/sigal build photos/ _build_prod' (10 runs):

       2360,241115      task-clock (msec)         #    0,969 CPUs utilized            ( +-  4,24% )
               488      context-switches          #    0,207 K/sec                    ( +-  5,88% )
                11      cpu-migrations            #    0,005 K/sec                    ( +-  8,89% )
            12 157      page-faults               #    0,005 M/sec                    ( +-  0,02% )
     5 705 818 220      cycles                    #    2,417 GHz                      ( +-  0,40% )  (49,92%)
     4 917 277 168      instructions              #    0,86  insn per cycle           ( +-  0,17% )  (74,93%)
     1 036 823 766      branches                  #  439,287 M/sec                    ( +-  0,12% )  (75,09%)
        50 325 792      branch-misses             #    4,85% of all branches          ( +-  0,28% )  (74,99%)

       2,436645094 seconds time elapsed                                          ( +-  4,28% )

Applying this patch:

> perf stat -r10 -B ./sigal_py3_venv/bin/sigal build photos/ _build_prod;
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.34 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.29 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.30 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.42 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.47 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.71 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.55 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.33 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.86 seconds.
  Sorting albums  [####################################]  100%
   Sorting media  [####################################]  100%
Collecting files  [####################################]  100%                                                   
Processing files  [------------------------------------]  0/0
   Writing files  [####################################]  100%                                                   

Done.
Processed 0 images (379 skipped) and 0 videos (22 skipped) in 1.39 seconds.

 Performance counter stats for './sigal_py3_venv/bin/sigal build photos/ _build_prod' (10 runs):

       2043,372745      task-clock (msec)         #    0,963 CPUs utilized            ( +-  3,35% )
               453      context-switches          #    0,222 K/sec                    ( +- 10,21% )
                10      cpu-migrations            #    0,005 K/sec                    ( +- 13,89% )
            12 249      page-faults               #    0,006 M/sec                    ( +-  0,02% )
     5 344 997 080      cycles                    #    2,616 GHz                      ( +-  0,62% )  (50,02%)
     4 637 683 203      instructions              #    0,87  insn per cycle           ( +-  0,11% )  (75,03%)
       981 760 224      branches                  #  480,461 M/sec                    ( +-  0,16% )  (75,01%)
        47 384 136      branch-misses             #    4,83% of all branches          ( +-  0,33% )  (74,97%)

       2,120834234 seconds time elapsed                                          ( +-  3,26% )

So yes, this is a 12.9% improvement for me.

By the way, I have 221 Markdown files in my photos directory.

@Glandos

This comment has been minimized.

Contributor

Glandos commented Jun 18, 2018

This PR is also based on the documentation of the Markdown library:

You should also use this method if you want to process multiple strings without creating a new instance of the class for each string.

Note that the reset() method should be called between convert(). I will push it with some fix for the failing tests.

Speaking about tests, there is no real assertions about what should be extracted from an empty file. Currently, all preprocessors are skipped, but attributes are not deleted, so the Meta attribute have the previous data. This seems a bug to me. But this is for another day.

Oh, and by the way: shame on me for not running the tests before pushing.

@Glandos

This comment has been minimized.

Contributor

Glandos commented Jun 19, 2018

See Python-Markdown/markdown#671 and Python-Markdown/markdown#672.
There was a bug there, that will be solved in the next release. Until then, this has to be done manually.

Glandos added some commits Jun 19, 2018

Call reset() if already initialized.
This also include a workaround for a
waiting-to-be-merged bug in python-markdown.
@codecov

This comment has been minimized.

codecov bot commented Jun 19, 2018

Codecov Report

Merging #329 into master will decrease coverage by 0.03%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #329      +/-   ##
==========================================
- Coverage   87.74%   87.71%   -0.04%     
==========================================
  Files          19       19              
  Lines        1412     1416       +4     
==========================================
+ Hits         1239     1242       +3     
- Misses        173      174       +1
Impacted Files Coverage Δ
sigal/utils.py 94.44% <100%> (-3.56%) ⬇️
sigal/plugins/watermark.py 93.02% <0%> (+2.32%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 99bafaf...f95393b. Read the comment docs.

@saimn

This comment has been minimized.

Owner

saimn commented Jun 25, 2018

Could you please run your benchmark once more with the reset changes, just to be sure that there is still a significant enough gain ? Thanks.

@Glandos

This comment has been minimized.

Contributor

Glandos commented Jul 5, 2018

OK, here are more figures. For brevity, I skip output from sigal. I manually applied the patch for python-markdown. All sigal tests are still OK.

master with photoswipe theme

 Performance counter stats for './sigal_py3_venv/bin/sigal build photos/ _build_prod' (10 runs):

       5015,872216      task-clock (msec)         #    0,990 CPUs utilized            ( +-  5,22% )
               676      context-switches          #    0,135 K/sec                    ( +-  7,62% )
                 9      cpu-migrations            #    0,002 K/sec                    ( +- 15,99% )
            13 899      page-faults               #    0,003 M/sec                    ( +-  0,83% )
    12 944 448 854      cycles                    #    2,581 GHz                      ( +-  0,17% )  (49,95%)
    11 274 264 137      instructions              #    0,87  insn per cycle           ( +-  0,10% )  (74,98%)
     2 354 613 689      branches                  #  469,433 M/sec                    ( +-  0,13% )  (75,04%)
       115 908 249      branch-misses             #    4,92% of all branches          ( +-  0,17% )  (75,01%)

       5,067768353 seconds time elapsed                                          ( +-  5,26% )

This patch with photoswipe theme

 Performance counter stats for './sigal_py3_venv/bin/sigal build photos/ _build_prod' (10 runs):

       4794,963802      task-clock (msec)         #    0,963 CPUs utilized            ( +-  3,02% )
               922      context-switches          #    0,192 K/sec                    ( +- 13,65% )
                13      cpu-migrations            #    0,003 K/sec                    ( +- 26,46% )
            14 410      page-faults               #    0,003 M/sec                    ( +-  0,82% )
    12 704 923 530      cycles                    #    2,650 GHz                      ( +-  0,84% )  (50,01%)
    10 998 472 366      instructions              #    0,87  insn per cycle           ( +-  0,16% )  (75,02%)
     2 297 597 621      branches                  #  479,169 M/sec                    ( +-  0,17% )  (75,04%)
       112 298 936      branch-misses             #    4,89% of all branches          ( +-  0,38% )  (74,95%)

       4,977598037 seconds time elapsed                                          ( +-  3,55% )

This is 1.77% faster, so it is insignificant.

master branch with a custom theme

This is a theme I am currently working on. It's a derivative of photoswipe

 Performance counter stats for './sigal_py3_venv/bin/sigal build photos/ _build_prod' (10 runs):

       2219,313680      task-clock (msec)         #    0,980 CPUs utilized            ( +-  4,67% )
               428      context-switches          #    0,193 K/sec                    ( +-  5,91% )
                 8      cpu-migrations            #    0,004 K/sec                    ( +- 10,91% )
            12 130      page-faults               #    0,005 M/sec                    ( +-  0,03% )
     5 711 295 676      cycles                    #    2,573 GHz                      ( +-  0,39% )  (50,02%)
     4 948 863 961      instructions              #    0,87  insn per cycle           ( +-  0,21% )  (75,08%)
     1 044 639 102      branches                  #  470,704 M/sec                    ( +-  0,14% )  (75,13%)
        50 406 356      branch-misses             #    4,83% of all branches          ( +-  0,25% )  (74,85%)

       2,263710201 seconds time elapsed                                          ( +-  4,44% )

Here I'm starting to have a doubt.

This patch with my custom theme

 Performance counter stats for './sigal_py3_venv/bin/sigal build photos/ _build_prod' (10 runs):

       2064,062561      task-clock (msec)         #    0,970 CPUs utilized            ( +-  4,44% )
               434      context-switches          #    0,210 K/sec                    ( +-  5,50% )
                 9      cpu-migrations            #    0,004 K/sec                    ( +- 13,92% )
            12 767      page-faults               #    0,006 M/sec                    ( +-  0,72% )
     5 379 681 810      cycles                    #    2,606 GHz                      ( +-  0,35% )  (49,92%)
     4 665 004 022      instructions              #    0,87  insn per cycle           ( +-  0,07% )  (74,98%)
       985 776 218      branches                  #  477,590 M/sec                    ( +-  0,11% )  (75,06%)
        47 268 135      branch-misses             #    4,80% of all branches          ( +-  0,16% )  (75,02%)

       2,126881752 seconds time elapsed                                          ( +-  4,42% )

And we still have a 6% improvement. Not as significant as before, but still.

Conclusion

My custom theme seems to spend way less time to write files. I didn't have the time to inspect why for now. But only with it the parsing of Markdown is significant, otherwise, the time spent when writing files is much higher than parsing.

No doubt that this patch improve the markdown parsing. But it seems to be insignificant for a lot of cases. I'd like to see it merged, but I could understand that it will probably be invisible for everyone.

@saimn saimn added this to the 2.0 milestone Jul 8, 2018

@saimn

This comment has been minimized.

Owner

saimn commented Jul 8, 2018

Thanks @Glandos for the new numbers. The difference is quite small and will not be noticeable in most cases, but I guess there is no real drawback and the additional complexity is limited, so let's go for it.

@saimn saimn merged commit c02f6ff into saimn:master Jul 8, 2018

3 checks passed

codecov/patch 100% of diff hit (target 87.74%)
Details
codecov/project Absolute coverage decreased by -0.03% but relative coverage increased by +12.25% compared to 99bafaf
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment