Use SIMD to accelerate comment lexing #18997

Open
dlangBugzillaToGithub opened this issue Jun 1, 2015 · 1 comment
Comments

@dlangBugzillaToGithub
Walter Bright (@WalterBright) reported this on 2015-06-01T19:37:55Z

Transferred from https://issues.dlang.org/show_bug.cgi?id=14641

Description

We encourage use of Ddoc to document functions. But this can result in voluminous comments, which slow down the lexer. Lexing comments can be accelerated by using SIMD vector instructions.

A little inline assembler in the lexer.c dmd source code would implement this.
@dlangBugzillaToGithub

briancschott commented on 2015-06-01T20:17:09Z

The best way to do this that I've found is to skip everything other than a set of bytes that varies based on the comment being lexed:

For /* */ comments:
0x0a (\n)
0x0d (\r)
0x2a (*)
0x2f (/)
0xe2 (Beginning of multi-byte UTF-8 newline)

For /+ +/ comments:
0x0a (\n)
0x0d (\r)
0x2b (+)
0x2f (/)
0xe2 (Beginning of multi-byte UTF-8 newline)

For // comments:
0x0a (\n)
0x0d (\r)
0xe2 (Beginning of multi-byte UTF-8 newline)
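
A minimal sketch in C of why 0xe2 belongs in every set, assuming the "multi-byte UTF-8 newline" refers to U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR), which encode as E2 80 A8 and E2 80 A9. The helper name `newline_len` is hypothetical and is not taken from dmd or libdparse. Because 0xe2 also begins many other three-byte sequences, the lexer has to inspect the two following bytes before counting a line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from dmd or libdparse): returns the length in
 * bytes of the line terminator starting at p, or 0 if there is none. */
size_t newline_len(const unsigned char *p, size_t len)
{
    assert(p != NULL || len == 0);
    if (len == 0)
        return 0;
    if (p[0] == 0x0a)                       /* \n */
        return 1;
    if (p[0] == 0x0d)                       /* \r, optionally followed by \n */
        return (len >= 2 && p[1] == 0x0a) ? 2 : 1;
    if (p[0] == 0xe2 && len >= 3 && p[1] == 0x80 &&
        (p[2] == 0xa8 || p[2] == 0xa9))     /* U+2028 or U+2029 */
        return 3;
    return 0;                               /* e.g. 0xe2 starting another character */
}
```

A return value of 0 on a 0xe2 byte means the lexer should simply step over it and keep scanning.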

The instruction used in libdparse to do this is "pcmpestri", which requires SSE4.2 (first released in 2008, according to Wikipedia). My advice is to leave most of the logic intact and implement the assembly code so that it may advance the lexer by zero or more bytes; that way the rest of the algorithm is not disrupted on machines that don't support SSE4.2.
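
The "advance zero or more bytes" idea could look roughly like the following C sketch. It uses the `_mm_cmpestri` intrinsic (which compiles to pcmpestri) rather than inline assembler, and assumes GCC or Clang for the `target` attribute and `__builtin_cpu_supports`. The names `simd_skip` and `find_stop` are hypothetical; `find_stop` only stands in for the lexer's existing byte-at-a-time loop and is not how dmd or libdparse actually structure their code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <nmmintrin.h>              /* SSE4.2 intrinsics, including _mm_cmpestri */
#define HAVE_SSE42_PATH 1

/* Skips 16 bytes at a time until a chunk contains one of the stop bytes. */
__attribute__((target("sse4.2")))
static size_t skip_sse42(const unsigned char *p, size_t len,
                         const unsigned char *stops, int nstops)
{
    unsigned char buf[16] = {0};
    memcpy(buf, stops, (size_t)nstops);
    const __m128i needles = _mm_loadu_si128((const __m128i *)buf);
    size_t i = 0;
    while (len - i >= 16) {
        const __m128i chunk = _mm_loadu_si128((const __m128i *)(p + i));
        /* Index of the first byte in chunk equal to any needle, or 16 if none. */
        int idx = _mm_cmpestri(needles, nstops, chunk, 16,
                               _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY |
                               _SIDD_LEAST_SIGNIFICANT);
        i += (size_t)idx;
        if (idx < 16)
            break;
    }
    return i;
}
#endif

/* Hypothetical entry point: returns how many leading bytes are known to
 * contain no stop byte. Returns 0 when SSE4.2 is unavailable, so the
 * caller's scalar loop does all the work. */
size_t simd_skip(const unsigned char *p, size_t len,
                 const unsigned char *stops, int nstops)
{
    assert(nstops >= 1 && nstops <= 16);
#ifdef HAVE_SSE42_PATH
    if (__builtin_cpu_supports("sse4.2"))
        return skip_sse42(p, len, stops, nstops);
#endif
    (void)p; (void)len; (void)stops;
    return 0;
}

/* Stand-in for the lexer's existing scalar loop, left intact. */
size_t find_stop(const unsigned char *p, size_t len,
                 const unsigned char *stops, int nstops)
{
    size_t i = simd_skip(p, len, stops, nstops);    /* may be 0 */
    for (; i < len; i++)
        if (memchr(stops, p[i], (size_t)nstops))
            break;
    return i;
}
```

Since the scalar loop runs after the skip either way, `find_stop` returns the same offset with or without SSE4.2; only the amount of work done per byte changes.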
