Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Untwine Pod #1300

Merged
merged 1 commit into from Dec 13, 2017
Merged

Untwine Pod #1300

merged 1 commit into from Dec 13, 2017

Conversation

softmoth
Copy link
Contributor

@softmoth softmoth commented Dec 8, 2017

This is a re-merge of LLFour:pod-fc-fixes branch, rakudo/rakudo pull #651

The changes to roast (see below) are still required.

Original commit message:

tl;dr =pod C now becomes [ FormattingCode ] instead of ["",FormattingCode,""]

After merge, merge:

perl6/roast#90 (tests)
perl6/doc#261

long story
I started out trying to fix stray empty strings being everywhere:

https://rt.perl.org/Public/Bug/Display.html?id=126966

Turns out empty strings were a symptom of a twine algorithm (every FormattingCode has to be intertwined with a string). This implementation detail side effect then leaked out everywhere:

In the tests: perl6/roast#90 (depends on odd array indexes because of empty strings)
in Pod::To::HTML: perl6/Pod-To-HTML#9 (caused by empty strings)
in the htmlifying code: https://github.com/perl6/doc/blob/master/htmlify.p6#L379
it sorta meant it had to serialize strings that never got used

The purpose of this commit is to expunge this algorithm. Because it is a relatively large commit I did some due diligence and tested it on a pod file containing the concatenation of all the .pod in docs and then putting $=pod.perl.say at the end.

bash-3.2$ /usr/bin/time -l perl6 test.pod | wc -c
58.72 real 58.26 user 0.35 sys
705699840 maximum resident set size
...
3672024
bash-3.2$ /usr/bin/time -l p6 test.pod | wc -c #patched
59.07 real 58.29 user 0.54 sys
1181294592 maximum resident set size
...
3663120

I also did a precompile with --target=mbc to see the difference:

1.5M 24 Dec 13:59 test.pod
6.8M 24 Dec 23:08 test.pod.moarvm
6.1M 24 Dec 23:03 test.patched.pod.moarvm

This shows:

This commit increases the amount of memory used during compilation from 700Mb => 1200Mb
that there are less characters in the output (due to less "" everywhere)
The patched one has a smaller image size due to less unnecessary serialization
They are roughly the same speed

Despite 1. I suggest merging this. It's not caused by my code (I think) but my code exacerbates whatever issue is causing rakudo to need 700Mb to compile a 1.5Mb pod file in the first place. I made a RT:
https://rt.perl.org/Public/Bug/Display.html?id=127020

Given we have precomp now 馃帀 this patch only benefits real world non-compiling-large-amounts of pod-code-at-runtime stuff. Also from here it's easier to implement NYI tags like S<> T<> etc.

Oh and =pod B< < B > > will work instead of dying with LTA error now

Merry christmas!

This is a re-merge of LLFour:pod-fc-fixes branch, rakudo/rakudo pull rakudo#651

The changes to roast (see below) are still required.

Original commit message:

tl;dr =pod C<foo> now becomes [ FormattingCode ] instead of ["",FormattingCode,""]

After merge, merge:

    Raku/roast#90 (tests)
    Raku/doc#261

long story
I started out trying to fix stray empty strings being everywhere:

https://rt.perl.org/Public/Bug/Display.html?id=126966

Turns out empty strings were a symptom of a twine algorithm (every FormattingCode has to be intertwined with a string). This implementation detail side effect then leaked out everywhere:

    In the tests: Raku/roast#90 (depends on odd array indexes because of empty strings)
    in Pod::To::HTML: Raku/Pod-To-HTML#9 (caused by empty strings)
    in the htmlifying code: https://github.com/perl6/doc/blob/master/htmlify.p6#L379
    it sorta meant it had to serialize strings that never got used

The purpose of this commit is to expunge this algorithm. Because it is a relatively large commit I did some due diligence and tested it on a pod file containing the concatenation of all the .pod in docs and then putting $=pod.perl.say at the end.

bash-3.2$ /usr/bin/time -l perl6 test.pod  | wc -c
       58.72 real        58.26 user         0.35 sys
 705699840  maximum resident set size
...
 3672024
bash-3.2$ /usr/bin/time -l p6 test.pod  | wc -c #patched
       59.07 real        58.29 user         0.54 sys
1181294592  maximum resident set size
...
 3663120

I also did a precompile with --target=mbc to see the difference:

1.5M 24 Dec 13:59 test.pod
6.8M 24 Dec 23:08 test.pod.moarvm
6.1M 24 Dec 23:03 test.patched.pod.moarvm

This shows:

    This commit increases the amount of memory used during compilation from 700Mb => 1200Mb
    that there are less characters in the output (due to less "" everywhere)
    The patched one has a smaller image size due to less unnecessary serialization
    They are roughly the same speed

Despite 1. I suggest merging this. It's not caused by my code (I think) but my code exacerbates whatever issue is causing rakudo to need 700Mb to compile a 1.5Mb pod file in the first place. I made a RT:
https://rt.perl.org/Public/Bug/Display.html?id=127020

Given we have precomp now 馃帀 this patch only benefits real world non-compiling-large-amounts of pod-code-at-runtime stuff. Also from here it's easier to implement NYI tags like S<> T<> etc.

Oh and =pod B< < B<foo> > > will work instead of dying with LTA error now

Merry christmas!
@AlexDaniel
Copy link
Contributor

馃憤 for the effort and the PR.

Is Raku/roast#90 all that was needed? (it's merged now) Also, are there any other tests coming?
FWIW Raku/doc#261 is also already merged.

This commit increases the amount of memory used during compilation from 700Mb => 1200Mb

That sounds scary and unacceptable, but that's just the synthetic test, right? Does it affect memory usage of anything else (e.g. rakudo compilation)?

@softmoth
Copy link
Contributor Author

While Raku/roast#90 is shown as merged, it isn't fixed. So the same changes are still required.

I don't think any other tests are required, since S26-documentation/08-formattingcodes.t already catches this.

@softmoth
Copy link
Contributor Author

I've re-run LLFourn's concatenate-all-p6doc test, and the results reflect the many rakudo improvements over the past two years:

  • Rakudo 2017.10: 51 seconds, 262M maximum resident set size
  • This branch: 53 seconds, 292M maximum resident set size

So this is an 11% increase in memory usage, on a 1.3M pod6 file.

$ find ~/.perl6/doc -name '*.pod*' | xargs cat > test.pod6
$ vim test.pod6  # Fix pod for rakudo/rakudo#1282
$ /usr/bin/time perl6 test.pod6
Length of resulting documentation: 3072092 chars
51.07user 0.11system 0:50.87elapsed 100%CPU (0avgtext+0avgdata 269124maxresident)k
0inputs+0outputs (0major+58416minor)pagefaults 0swaps
$ /usr/bin/time ./install/bin/perl6 test.pod6
===WARNING: One or more tables evidence bad practice.
==          Set environment variable 'RAKUDO_POD6_TABLE_DEBUG' for more details.
Length of resulting documentation: 3064320 chars
53.54user 0.15system 0:53.37elapsed 100%CPU (0avgtext+0avgdata 311084maxresident)k
0inputs+0outputs (0major+68124minor)pagefaults 0swaps
$ 

softmoth added a commit to softmoth/roast that referenced this pull request Dec 12, 2017
This adjusts for rakudo/rakudo#1300.

- rejigs the array indexes so that the tests work without "" being
   everywhere

- adds a new test B< < B<foo> > > which used to fail
@softmoth
Copy link
Contributor Author

The needed pod changes are re-merged here, ready for a roast pull request:

Raku/roast@master...softmoth:pod-fc-fixes

softmoth added a commit to softmoth/roast that referenced this pull request Dec 12, 2017
This adjusts for rakudo/rakudo#1300.

- rejigs the array indexes so that the tests work without "" being
   everywhere

- adds a new test B< < B<foo> > > which used to fail
@LLFourn
Copy link
Contributor

LLFourn commented Dec 13, 2017

@AlexDaniel that merge was reverted.

Thanks @softmoth for renewing the effort to fix this. The conclusion I came to while doing this work is that, really POD needs a complete re-work at every level. I think it needs to be its own language on the braid.

Just recently @timo commented that it may be using a lot of memory to parse things because it matches character by character.
https://rt.perl.org/Public/Bug/Display.html?id=127020

What this patch does is allows us to get the tests passing right now so the person doing the re-implementation doesn't have to worry about it.

@AlexDaniel AlexDaniel merged commit c398eb7 into rakudo:master Dec 13, 2017
@softmoth softmoth deleted the pod-fc-fixes branch December 13, 2017 05:59
AlexDaniel pushed a commit to Raku/roast that referenced this pull request Dec 17, 2017
This adjusts for rakudo/rakudo#1300.

- rejigs the array indexes so that the tests work without "" being
   everywhere
@softmoth softmoth mentioned this pull request Jan 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants