lazy parsing #274

Open
rurban opened this issue May 2, 2017 · 12 comments

Comments

2 participants
@rurban
Member

commented May 2, 2017

From JavaScript optimizers: parse function bodies only when they are executed or inspected. Keep the body as a string; WebKit even stores only the range in the source file, which is mapped.
Before starting we should check the percentage of dead code, the memory cost of strings vs. ops, and the parse time per typical body.
-c (PL_minus_c) needs to turn it off.

See e.g. Mozilla: https://bugzilla.mozilla.org/show_bug.cgi?id=678037

            # scripts created / dead / %    totalSize created / dead / %
 Startup:   4.5K / 3.6K / 80%               2.0MB / 1.6MB / 79%
 GMail:     20K  / 15K  / 73%               8.7MB / 6.3MB / 73%
 Facebook:  9K   / 7K   / 75%               4.1MB / 3.0MB / 73%
 Google:    6.9K / 5.3K / 76%               3.1MB / 2.3MB / 73%
 Session:   97K  / 72K  / 74%               43MB  / 30MB  / 68%

or https://ariya.io/2012/07/lazy-parsing-in-javascript-engines, http://www.mattzeunert.com/2017/01/30/lazy-javascript-parsing-in-v8.html or
https://pointersgonewild.com/about/
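
The idea above (keep the body as a string, parse only on first use) can be sketched in a few lines of JavaScript. This is only an illustration of the technique, not how any engine or cperl implements it; `defineLazy` and `stats` are hypothetical names, and `new Function` stands in for the real parser.

```javascript
// Hypothetical sketch: keep each function body as a source string and
// compile it only on the first call. `defineLazy` and `stats` are
// illustrative names, not part of any engine or of cperl.
const stats = { defined: 0, compiled: 0 };

function defineLazy(params, bodySource) {
  let compiled = null; // nothing is parsed at definition time
  stats.defined++;
  return function (...args) {
    if (compiled === null) {
      // Parsing and compilation happen here, at first use.
      compiled = new Function(...params, bodySource);
      stats.compiled++;
    }
    return compiled.apply(this, args);
  };
}

const add = defineLazy(["a", "b"], "return a + b;");
// This body is a syntax error, but nobody notices unless it is called.
const unused = defineLazy([], "return this is never parsed;");

console.log(add(2, 3)); // 5
console.log(stats);     // { defined: 2, compiled: 1 }
```

The never-called `unused` costs only its source string, which is the saving being discussed; the price is that its syntax error stays hidden until the first call.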

@rurban rurban self-assigned this May 2, 2017

@rurban rurban added the enhancement label May 2, 2017

@vendethiel

Contributor

commented May 2, 2017

How well will this work with strict+warnings?

$ perl -E 'use strict; use warnings; sub a { $x };'
Global symbol "$x" requires explicit package name at -e line 1.
Execution of -e aborted due to compilation errors.
@rurban

Member Author

commented May 3, 2017

This error will be deferred until the sub is called, so lazy parsing will miss a lot of warnings in dead code. I will check how JS does its strict checking.

@vendethiel

Contributor

commented May 3, 2017

JS doesn't have such strict checking (partly because everything can be global), that's why it can get away with it.

@rurban

Member Author

commented May 3, 2017

No, they recently added a Perl-inspired "use strict".

@vendethiel

Contributor

commented May 3, 2017

The checks all happen at runtime.

> "use strict"; function a() { x += 1 }
(no error)
> "use strict"; console.log(1); x+=1;
1
ReferenceError: x is not defined
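
The same behaviour can be reproduced as a small script rather than at a REPL prompt. This is a minimal sketch, assuming Node.js and that no global named `x` exists; it only shows that the undeclared variable is reported when the body runs, not when the function is defined.

```javascript
"use strict";

// Defining the function is silent, even though `x` is never declared.
function a() { x += 1; }

let caught = null;
try {
  a(); // the body executes now, so the error finally surfaces
} catch (e) {
  caught = e;
}

console.log(caught instanceof ReferenceError); // true
```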
@rurban

Member Author

commented May 3, 2017

Yes, because all JavaScript engines I know of do lazy parsing.

@vendethiel

Contributor

commented May 3, 2017

That second one doesn't even have a function surrounding it. It's not about function bodies not being parsed.

In general, it's a tradeoff: you lose static checking of your functions (which is a big advantage) in exchange for a reduction in memory.

@rurban

Member Author

commented May 3, 2017

ad 1: that's why the 2nd example did find the problem, the first not.

ad 2:
You lose static checking at compile-time, but can enforce it via -c or some new switch to turn off lazy parsing. You get all checks at run-time or for BEGIN blocks.

The advantage is not only memory; creating useless code is mostly a performance issue.
For ~80% dead code this is a valid tradeoff; that's why JavaScript did it.
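
A switch that turns lazy parsing off, or a check pass that forces every body through the parser, could look roughly like the sketch below. All names (`LazyUnit`, `define`, `checkAll`) are hypothetical, and `new Function` only reports syntax errors, so this models the mechanism rather than Perl's strict checks.

```javascript
// Hypothetical sketch of a lazy/eager switch; not cperl's implementation.
class LazyUnit {
  constructor({ lazy = true } = {}) {
    this.lazy = lazy;
    this.subs = new Map(); // name -> { params, source, fn }
  }

  define(name, params, source) {
    this.subs.set(name, { params, source, fn: null });
    // Eager mode: compile right away, so errors surface at definition.
    if (!this.lazy) this.compile(name);
  }

  compile(name) {
    const entry = this.subs.get(name);
    if (entry.fn === null) {
      entry.fn = new Function(...entry.params, entry.source);
    }
    return entry.fn;
  }

  call(name, ...args) {
    return this.compile(name)(...args);
  }

  // Analogue of a check-only run: force every stored body through the parser.
  checkAll() {
    const errors = [];
    for (const name of this.subs.keys()) {
      try {
        this.compile(name);
      } catch (e) {
        errors.push(`${name}: ${e.name}`);
      }
    }
    return errors;
  }
}

const unit = new LazyUnit();
unit.define("ok", ["n"], "return n * 2;");
unit.define("broken", [], "return (;"); // accepted silently in lazy mode

console.log(unit.call("ok", 21)); // 42
console.log(unit.checkAll());     // [ 'broken: SyntaxError' ]
```

With `new LazyUnit({ lazy: false })` the `broken` definition would throw immediately, which corresponds to running with lazy parsing switched off.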

@vendethiel

Contributor

commented May 3, 2017

"ad 1: that's why the 2nd example did find the problem, the first not."

It doesn't. The console.log is executed.

@rurban

Member Author

commented May 3, 2017

  • function a() { x += 1 }: error inside a function body that is never executed, so it stays silent.
  • console.log(1);: no error, execution continues.
  • x+=1;: error outside a function body, so it is reported.
@vendethiel

Contributor

commented May 3, 2017

What I'm getting at is that JS doesn't have a Perl-like lexical scope checking, even with "use strict".
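
The distinction can be made concrete with a short sketch, assuming Node.js. Strict mode does reject a few constructs when the source is parsed (`with` is one of them), but an undeclared variable is not among those; it parses fine and fails only when executed. `notDeclaredAnywhere` is just an illustrative name.

```javascript
// Strict-mode early error: rejected as soon as the source is parsed.
let parseError = null;
try {
  new Function('"use strict"; with (Math) { return PI; }');
} catch (e) {
  parseError = e;
}
console.log(parseError instanceof SyntaxError); // true

// Undeclared variable: parses without complaint, fails only when run.
const f = new Function('"use strict"; return notDeclaredAnywhere;');
let runError = null;
try {
  f();
} catch (e) {
  runError = e;
}
console.log(runError instanceof ReferenceError); // true
```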

@rurban

Member Author

commented May 15, 2017

Yes, JS is a very primitive language. Dart or TypeScript are the better versions.

@rurban rurban added the in progress label May 17, 2017

@rurban rurban added this to the v5.28.0c milestone May 18, 2017

@rurban rurban removed the in progress label May 28, 2017

rurban added a commit that referenced this issue Sep 18, 2018

lazyparse: #274 WIP
On SUB subname, scan ahead over the block and store it
as a string SV in CvLAZY(PL_compcv).
Don't store any optree; defer building it until the sub
is actually called.
This is a benefit when there's a lot of dead code (~70% typically)
and the storage for the string buffers is not too big,
i.e. many ops or 1-3 ops.

Better than storing the block buffer would be to store just
the mmap'ed bufptr, offset and len of the source file. But we don't
mmap by default yet, even if it's faster.

Note that subs are stored in slabs. As we store almost no slabs at
compile-time (via PL_compcv) it gets tricky.
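
The "scan the block ahead" step and the offset/len alternative described in this commit message can be illustrated with a naive brace matcher. This is a sketch in JavaScript, not the C code from the commit: `scanBlock` is a hypothetical name, and it assumes no braces occur inside strings, comments, regexes or heredocs, which a real scanner would have to handle.

```javascript
// Hypothetical sketch: find the extent of a sub body by matching braces.
// Returns { offset, len } into the original source instead of copying it.
// Assumption: braces never appear inside strings, comments or regexes.
function scanBlock(source, start) {
  const open = source.indexOf("{", start);
  if (open === -1) return null;
  let depth = 0;
  for (let i = open; i < source.length; i++) {
    const ch = source[i];
    if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
      if (depth === 0) return { offset: open, len: i - open + 1 };
    }
  }
  return null; // unbalanced block
}

const src = "sub foo { if ($a) { return 1 } return 2 }\nsub bar { 3 }\n";
const range = scanBlock(src, src.indexOf("sub foo"));

// The body text is recovered from the range only when it is needed.
console.log(src.slice(range.offset, range.offset + range.len));
// { if ($a) { return 1 } return 2 }
```

Storing only the range keeps the per-sub cost at two integers, provided the source buffer itself stays available (for example, mapped), which is the caveat the commit message raises.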

rurban added a commit that referenced this issue Nov 2, 2018

rurban added a commit that referenced this issue Nov 25, 2018

rurban added a commit that referenced this issue Mar 18, 2019

rurban added a commit that referenced this issue Apr 1, 2019

rurban added a commit that referenced this issue Apr 5, 2019

rurban added a commit that referenced this issue Apr 5, 2019

rurban added a commit that referenced this issue Apr 30, 2019

rurban added a commit that referenced this issue Jun 12, 2019

rurban added a commit that referenced this issue Jun 24, 2019

rurban added a commit that referenced this issue Jun 26, 2019

rurban added a commit that referenced this issue Jun 27, 2019

rurban added a commit that referenced this issue Jul 1, 2019

rurban added a commit that referenced this issue Jul 2, 2019

rurban added a commit that referenced this issue Jul 2, 2019

rurban added a commit that referenced this issue Jul 3, 2019
