Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CompilerPerf] Fix large ranges #4476

Merged
merged 12 commits into from
Oct 25, 2018
Merged

[CompilerPerf] Fix large ranges #4476

merged 12 commits into from
Oct 25, 2018

Conversation

dsyme
Copy link
Contributor

@dsyme dsyme commented Mar 9, 2018

The F# compiler has historically used a 32-bit integer for "position" information and a 64-bit integer for "range" information. But this means we lose information and can only accurately handle:

  • file references up to 16384 referenced files (across all projects being analysed by FSharp.Compiler.Service)
  • line length up to 512
  • file length up to 65536
  • intra-file range spans up to 32768

So we have bugs where long lines and/or long files and/or lots-of-files give inaccurate information because we compress the range information into a 64-bit integer.

This PR is a possible fix for this by using 128-bits. This gives limits:

  • file references up to ~16million referenced files (across all projects being analysed by FSharp.Compiler.Service)
  • line length up to ~1million
  • file length up to ~2billion

However I’m concerned about perf implications of using 128bits for range values since these get passed around everywhere. I will test perf on this PR and will compare with other compilers such as Roslyn.

(Quote of the day: "F# file lengths of 640K should be enough for anyone!")

@dsyme dsyme changed the title Fix large ranges [CompilerPerf] Fix large ranges Mar 9, 2018
@KevinRansom
Copy link
Member

@dsyme, conflicts to resolve. How do you propose investigating the perf implications you mention?

Copy link
Member

@KevinRansom KevinRansom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of bit twiddling and masking here. Looks good though.

@dsyme
Copy link
Contributor Author

dsyme commented Mar 27, 2018

@dsyme, conflicts to resolve. How do you propose investigating the perf implications you mention?

@KevinRansom Good question. I ran the compiler perf script a few times and it seems to be slower by around 3-5%, which is kind of what I expected. BUT, I'm always a bit sceptical of the reliability of the data collection in that compiler perf script....

It's possible we should just do this properly by changing to use the character-offset-from-start-of-file. That is, however, a more intrusive change....

@auduchinok
Copy link
Member

It's possible we should just do this properly by changing to use the character-offset-from-start-of-file. That is, however, a more intrusive change....

That would be great.

@abelbraaksma
Copy link
Contributor

Great work! I don't think you should let the 3% performance drop stop you from merging this, it solves quite a few stability problems already. And for code-generation this is great, because sometimes these files get much larger than 64k lines.

This would solve some really old issues, for instance with source stepping, i.e. #758 from 2015 (I know there were others, but couldn't locate them anymore).

It also has a StackOverflow reference: https://stackoverflow.com/questions/33794570.

@abelbraaksma
Copy link
Contributor

In case it helps, there's a repro sln in #758 that could perhaps be used to write a test case for this.

@cartermp cartermp added this to the 16.0 milestone Jul 21, 2018
@cartermp
Copy link
Contributor

@dsyme I'd like to consider this for dev16, as the issue in #758 implies this is the right stability fix, and will also help and other tooling that do things that aren't AST-based

@dsyme
Copy link
Contributor Author

dsyme commented Jul 30, 2018

@dsyme I'd like to consider this for dev16, as the issue in #758 implies this is the right stability fix, and will also help and other tooling that do things that aren't AST-based

@cartermp ok

@KevinRansom
Copy link
Member

@dsyme, sorry mate resolution required :-(

@KevinRansom
Copy link
Member

@dsyme merge issues :-)

@cartermp
Copy link
Contributor

After talking with @dsyme about this, the approach here is fundamentally flawed, and packing in more information here will result in compiler and tooling slowdowns that we're hesitant to accept. The right way to store this information is as an offset, calculating range information as needed rather than flowing that larger amount of data everywhere. Such a change is more invasive, but the correct approach.

@dsyme do you want to keep this open?

@dsyme
Copy link
Contributor Author

dsyme commented Oct 17, 2018

After talking with @dsyme about this, the approach here is fundamentally flawed, and packing in more information here will result in compiler and tooling slowdowns that we're hesitant to accept.

I do think the slight perf slow down is probably still acceptable given we just have to fix this problem. I'll bring this up to date and do some more perf checks.

@dsyme
Copy link
Contributor Author

dsyme commented Oct 24, 2018

This is ready, I think we should merge it and just be done with the issue. We have to be correct in these cases and the indications above are that performance is reasonable.

@cartermp
Copy link
Contributor

@KevinRansom @TIHan could you give this a review? Given @dsyme's investigations I think this is worth taking, especially because it fixes at least two bugs. It'd be good to do it the Right Way ™️, but that's very invasive work and this is ready to be flighted in dev16 previews.

@TIHan TIHan merged commit 4a368d4 into dotnet:master Oct 25, 2018
@TIHan
Copy link
Member

TIHan commented Oct 25, 2018

Created an issue to track that we need perf improvements in the future: #5826

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants