New lexer #486
Conversation
Co-authored-by: Iban Eguia <razican@protonmail.ch>
I would go with option 3: lexing and parsing are one operation now. What do you think? @Razican @jasonwilliams
Agreed, option 3; they can't run separately anyway.
Yes, that was the reason we added the lexing part to the parsing benchmarks. Sorry I didn't have time over the last few weeks to review this. Hopefully I will get some time these days. This is an important step!
I reviewed some files, will continue when I have a bit of time. Thanks for the awesome work!!
```rust
/// It will fill the buffer with checked ASCII bytes.
pub(super) fn fill_bytes(&mut self, buf: &mut [u8]) -> io::Result<()> {
    unimplemented!("Lexer::cursor::fill_bytes {:?}", buf)
}
```
Do we need this?
I don't think so. It's from the string_literal code, and I assumed you were using it for a specific reason, so I left it.
Are we still using it in string literals, or did you find a better way to do it?
> Are we still using it in string literals, or did you find a better way to do it?
I didn't really change the string literal stuff, so it's still there. That said, all the tests pass, so it can't be doing anything major at the moment; I will need to look back into the source code and see.
It's effectively sugar for:
```rust
for x in 0..buf.len() {
    buf[x] = next()?;
}
```
so we could just write that out in the one place it is used.
Actually, I think we might need it: the only usage seems to require the raw `u8` bytes rather than a Rust `char`, so the regular `.next()` method won't work.
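To make the distinction concrete, here is a minimal sketch of a byte-level `fill_bytes` over a raw byte source. This is illustrative only, not Boa's actual cursor; `Cursor` and `next_byte` are hypothetical stand-ins for the real cursor and its byte source.

```rust
use std::io;

// Hypothetical stand-in for the lexer's cursor, reading raw u8 bytes
// (not decoded chars) from an underlying byte iterator.
struct Cursor<I: Iterator<Item = u8>> {
    iter: I,
}

impl<I: Iterator<Item = u8>> Cursor<I> {
    // Returns the next raw byte, or an error on unexpected end of input.
    fn next_byte(&mut self) -> io::Result<u8> {
        self.iter.next().ok_or_else(|| {
            io::Error::new(io::ErrorKind::UnexpectedEof, "unexpected end of input")
        })
    }

    // Fills `buf` with raw bytes; effectively the loop described above,
    // but operating on u8 rather than char.
    fn fill_bytes(&mut self, buf: &mut [u8]) -> io::Result<()> {
        for slot in buf.iter_mut() {
            *slot = self.next_byte()?;
        }
        Ok(())
    }
}

fn main() {
    let mut c = Cursor { iter: b"abc".iter().copied() };
    let mut buf = [0u8; 3];
    c.fill_bytes(&mut buf).expect("enough input");
}
```

The point is only that a byte-oriented helper is needed wherever the escape-sequence handling must see individual bytes rather than decoded characters.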
About benchmarks: it seems that the parser is now 2x-3x slower, except, as expected, for long files, where we get a huge speed bump (from 6.1±0.25ms to 742.7±54.50ns, an 87.8% speed improvement). I will review the parser, especially the parser cursor, to see if further improvements can be made.
It would be interesting to see a benchmark with a lot of goal symbol switches vs. one without.
I suspect the mechanism used to allow peeking 2 tokens ahead might be a good place to start (it might be faster to replace the VecDeque with just an array of a certain size).
Do you have an example code that would cause this?
Using an array on the stack would be very helpful, I think. What do we use the VecDeque for?
The VecDeque is basically a queue implementation used to store peeked elements. It is also used in one case for the `push_back` method, which pushes an element back onto the peek queue. We only ever need to skip a single element when peeking ahead, but because of the possibility of `push_back()`, the peek buffer needs to be of at least size 3 (peek + peek_skip + push_back).
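A fixed-size replacement could look something like the sketch below. This is a hypothetical illustration, not Boa's actual API: `PeekBuffer` and `Token` are made-up names, and the capacity of 3 mirrors the peek + peek_skip + push_back requirement described above.

```rust
// Stand-in for the real token type; PartialEq/Debug only for testing.
#[derive(Debug, Clone, PartialEq)]
struct Token(u32);

// Hypothetical fixed-capacity peek queue replacing VecDeque.
// Capacity 3 covers peek + peek_skip + one pushed-back token.
struct PeekBuffer {
    buf: [Option<Token>; 3],
    len: usize,
}

impl PeekBuffer {
    fn new() -> Self {
        Self { buf: [None, None, None], len: 0 }
    }

    // Push a token back so it is returned first (the push_back case above).
    fn push_front(&mut self, tok: Token) {
        assert!(self.len < 3, "peek buffer overflow");
        self.buf.rotate_right(1);
        self.buf[0] = Some(tok);
        self.len += 1;
    }

    // Append a freshly peeked token at the end of the queue.
    fn push_back(&mut self, tok: Token) {
        assert!(self.len < 3, "peek buffer overflow");
        self.buf[self.len] = Some(tok);
        self.len += 1;
    }

    // Pop the oldest token, shifting the rest forward.
    fn pop_front(&mut self) -> Option<Token> {
        if self.len == 0 {
            return None;
        }
        let tok = self.buf[0].take();
        self.buf.rotate_left(1);
        self.len -= 1;
        tok
    }
}

fn main() {
    let mut pb = PeekBuffer::new();
    pb.push_back(Token(1));
    pb.push_front(Token(0));
    assert_eq!(pb.pop_front(), Some(Token(0)));
}
```

Since the capacity is a compile-time constant, the whole buffer lives on the stack and avoids VecDeque's heap allocation, which is the potential win being suggested.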
```js
function foo(regex, num) {}
let i = 0;
while (i < 1000000) {
    foo(/ab+c/, 5.0/5);
    i++;
}
```
I will add this as a parsing benchmark, so that we can see differences.
Co-authored-by: Iban Eguia <razican@protonmail.ch>
@Lan2u could you rebase the branch to get the new benchmarks and compare? I think we will get nice insights there :)
Catchup master
Done (I think)
Should we close this in favour of #559?
This PR includes the new lexer and parser, made to work with goal symbols.
Addresses #294 primarily, and #456 as a side effect.
Hopefully, once this is finished, we can complete #12.