[CompilerPerf] Fix large ranges #4476
@dsyme, conflicts to resolve. How do you propose investigating the perf implications you mention?
Lots of bit twiddling and masking here. Looks good though.
@KevinRansom Good question. I ran the compiler perf script a few times and it seems to be slower by around 3-5%, which is kind of what I expected. BUT, I'm always a bit sceptical of the reliability of the data collection in that compiler perf script... It's possible we should just do this properly by changing to use the character offset from the start of the file. That is, however, a more intrusive change...
That would be great.
Great work! I don't think you should let the 3% performance drop stop you from merging this; it already solves quite a few stability problems. And for code generation this is great, because generated files sometimes get much larger than 64k lines. This would solve some really old issues, for instance with source stepping, i.e. #758 from 2015 (I know there were others, but couldn't locate them anymore). It also has a StackOverflow reference: https://stackoverflow.com/questions/33794570.
In case it helps, there's a repro sln in #758 that could perhaps be used to write a test case for this.
@dsyme, sorry mate, resolution required :-(
@dsyme merge issues :-)
I talked with @dsyme about this: the approach here is fundamentally flawed, and packing in more information will result in compiler and tooling slowdowns that we're hesitant to accept. The right way to store this information is as an offset, calculating range information as needed rather than flowing that larger amount of data everywhere. Such a change is more invasive, but it is the correct approach. @dsyme do you want to keep this open?
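To make the offset idea above concrete, here is a minimal sketch in Python (not the compiler's F# code, and not how the compiler actually does it). It assumes a single character offset is stored per position and that line/column are derived on demand from a per-file table of line starts. The names `line_starts` and `offset_to_line_col` are hypothetical, as is the convention of 1-based lines and 0-based columns.

```python
from bisect import bisect_right
from typing import List, Tuple


def line_starts(text: str) -> List[int]:
    """Offsets at which each line begins (the first line starts at offset 0)."""
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            starts.append(i + 1)
    return starts


def offset_to_line_col(starts: List[int], offset: int) -> Tuple[int, int]:
    """Map a character offset to a (1-based line, 0-based column) pair."""
    # Index of the last line start that is <= offset.
    line_index = bisect_right(starts, offset) - 1
    return line_index + 1, offset - starts[line_index]


# Hypothetical three-line source file, used only for illustration.
source = "let x = 1\nlet y = 2\nprintfn \"%d\" (x + y)\n"
starts = line_starts(source)

# Only the offset is stored; line and column are computed when needed.
pos = offset_to_line_col(starts, source.index("y"))
```

The table is built once per file, so each lookup is a binary search. The cost is that every consumer needing a line or column has to reach that table, which is part of why the change is described as more invasive.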
I do think the slight perf slowdown is probably still acceptable given we just have to fix this problem. I'll bring this up to date and do some more perf checks.
This is ready. I think we should merge it and just be done with the issue. We have to be correct in these cases, and the indications above are that performance is reasonable.
@KevinRansom @TIHan could you give this a review? Given @dsyme's investigations I think this is worth taking, especially because it fixes at least two bugs. It'd be good to do it the Right Way ™️, but that's very invasive work and this is ready to be flighted in dev16 previews.
Created an issue to track that we need perf improvements in the future: #5826
* Fix large ranges
* update FCS
* I can't add up
* fix build
* test fix
* fix tests
* fix tests
* fix tests
The F# compiler has historically used a 32-bit integer for "position" information and a 64-bit integer for "range" information. But this means we lose information and can only accurately handle:
So we have bugs where long lines and/or long files and/or lots-of-files give inaccurate information because we compress the range information into a 64-bit integer.
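The kind of information loss described above can be illustrated with a small Python sketch. The field widths here (16 bits for the line, 16 bits for the column, packed into a 32-bit position) are assumptions chosen purely for illustration and are not taken from the compiler source. `pack_pos` and `unpack_pos` are hypothetical names.

```python
from typing import Tuple

# Hypothetical field widths, chosen only for illustration; the real
# compiler's layout may differ.
LINE_BITS = 16
COL_BITS = 16

LINE_MASK = (1 << LINE_BITS) - 1
COL_MASK = (1 << COL_BITS) - 1


def pack_pos(line: int, col: int) -> int:
    """Pack (line, col) into one 32-bit integer, silently truncating each field."""
    return ((line & LINE_MASK) << COL_BITS) | (col & COL_MASK)


def unpack_pos(packed: int) -> Tuple[int, int]:
    """Recover (line, col) from a packed position."""
    return (packed >> COL_BITS) & LINE_MASK, packed & COL_MASK


# A position inside the limits round-trips exactly.
small = unpack_pos(pack_pos(1200, 37))

# A line number past 2**16 - 1 wraps around: the high bits are lost.
wrapped = unpack_pos(pack_pos(70_000, 5))
```

Under these assumed widths, line 70,000 comes back as line 4,464. Errors of this kind show up as diagnostics or debugger steps pointing at the wrong place in a very large file.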
This PR is a possible fix: it uses 128 bits for range values. This gives limits:
However, I'm concerned about the perf implications of using 128 bits for range values, since these get passed around everywhere. I will test perf on this PR and will compare with other compilers such as Roslyn.
(Quote of the day: "F# file lengths of 640K should be enough for anyone!")