Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/scanner: potential inefficiency #35588

Closed
danaugrs opened this issue Nov 14, 2019 · 3 comments

Comments

@danaugrs
Copy link

@danaugrs danaugrs commented Nov 14, 2019

I noticed a potential scanner inefficiency here:

switch ch := s.ch; {
case isLetter(ch):
lit = s.scanIdentifier()

s.scanIdentifier() calls isLetter(s.ch) with the same char that was just checked in the case condition:

func (s *Scanner) scanIdentifier() string {
offs := s.offset
for isLetter(s.ch) || isDigit(s.ch) {
s.next()
}
return string(s.src[offs:s.offset])
}

I think that instead of calling s.next() on line 809 we should both record the current char and call s.next() immediately before the entire nested switch statement. That would remove the double call to isLetter and also remove the need for the s.peek() on line 805. What do you think? It's very possible I'm overlooking something. Let me know if I should put together a PR.

@mvdan

This comment has been minimized.

Copy link
Member

@mvdan mvdan commented Nov 14, 2019

There are benchmarks. Does any change you can come up with actually improve the benchmarks with benchstat? If not, then it's probably not worth optimizing.

@danaugrs danaugrs changed the title Potential scanner inefficiency go/scanner: potential inefficiency Nov 14, 2019
@danaugrs

This comment has been minimized.

Copy link
Author

@danaugrs danaugrs commented Nov 14, 2019

Before modifying and benchmarking something I find it beneficial to discuss it with those familiar with the matter. Is anyone familiar with the scanner logic interested in reasoning about the inefficiency I described and my proposal? @griesemer

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Nov 14, 2019

It is almost always better to write code that is clear and easy to adjust in the future than code that is awkward but saves a few cycles. In this case, the string conversion far outweighs the cost of potentially looking up the first character twice. Even if the string conversion were not here, the current code can't be running slower than if every identifier were just one character longer, which we know doesn't really affect time. (Otherwise everyone would be saying to use short names in your program to make it parse faster.)

There's no change to make here, but thanks for taking the time to file the issue.

@rsc rsc closed this Nov 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.