[CompilerPerf] Fix large ranges #4476

dsyme · 2018-03-09T13:02:47Z

The F# compiler has historically used a 32-bit integer for "position" information and a 64-bit integer for "range" information. But this means we lose information and can only accurately handle:

file references up to 16384 referenced files (across all projects being analysed by FSharp.Compiler.Service)
line length up to 512
file length up to 65536
intra-file range spans up to 32768

So we have bugs where long lines and/or long files and/or lots-of-files give inaccurate information because we compress the range information into a 64-bit integer.
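The information loss described above can be sketched as follows. This is a minimal illustration, not the compiler's actual code: the field widths (14, 16, 9 and 15 bits) are assumptions inferred from the limits listed above, and `pack_range` / `unpack_range` are hypothetical helper names.

```python
# Hypothetical field widths, inferred from the limits listed above
# (not taken from the compiler source).
FILE_BITS = 14    # 2**14 = 16384 file references
LINE_BITS = 16    # 2**16 = 65536 lines per file
COL_BITS = 9      # 2**9  = 512 columns per line
HEIGHT_BITS = 15  # 2**15 = 32768 lines spanned by one range


def mask(bits):
    """Return an integer with the low `bits` bits set."""
    return (1 << bits) - 1


def pack_range(file_idx, start_line, start_col, end_line, end_col):
    """Pack a range into one integer, silently truncating oversized fields."""
    height = end_line - start_line
    packed = file_idx & mask(FILE_BITS)
    packed = (packed << LINE_BITS) | (start_line & mask(LINE_BITS))
    packed = (packed << COL_BITS) | (start_col & mask(COL_BITS))
    packed = (packed << HEIGHT_BITS) | (height & mask(HEIGHT_BITS))
    packed = (packed << COL_BITS) | (end_col & mask(COL_BITS))
    return packed


def unpack_range(packed):
    """Recover (file_idx, start_line, start_col, end_line, end_col)."""
    end_col = packed & mask(COL_BITS)
    packed >>= COL_BITS
    height = packed & mask(HEIGHT_BITS)
    packed >>= HEIGHT_BITS
    start_col = packed & mask(COL_BITS)
    packed >>= COL_BITS
    start_line = packed & mask(LINE_BITS)
    packed >>= LINE_BITS
    file_idx = packed & mask(FILE_BITS)
    return file_idx, start_line, start_col, start_line + height, end_col


# A range within the limits round-trips exactly.
small = (3, 100, 10, 102, 40)
assert unpack_range(pack_range(*small)) == small

# Column 600 does not fit in 9 bits, so it comes back as 600 % 512 = 88.
wide = (3, 100, 600, 100, 620)
print(unpack_range(pack_range(*wide)))  # (3, 100, 88, 100, 108)
```

With these assumed widths the five fields take 63 bits in total, which is why nothing larger fits without widening the representation.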

This PR is a possible fix for this by using 128 bits. This gives the following limits:

file references up to ~16 million referenced files (across all projects being analysed by FSharp.Compiler.Service)
line length up to ~1 million
file length up to ~2 billion
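As a rough check of how such limits could fit into 128 bits, the arithmetic can be sketched like this. The widths below are assumptions chosen to match the approximate limits above (24 bits for about 16 million, 20 bits for about 1 million, 31 bits for about 2 billion); the field names and the choice to store both start and end lines in full are hypothetical, not the PR's confirmed layout.

```python
# Hypothetical widths chosen to match the approximate limits above.
FIELDS = {
    "file_index": 24,
    "start_line": 31,
    "start_column": 20,
    "end_line": 31,
    "end_column": 20,
}

for name, bits in FIELDS.items():
    print(f"{name}: {bits} bits -> up to {2 ** bits:,} values")

total_bits = sum(FIELDS.values())
print(f"total: {total_bits} bits")

# Under these assumptions everything fits in two 64-bit words.
assert total_bits <= 128
```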

However, I'm concerned about the perf implications of using 128 bits for range values, since these get passed around everywhere. I will test perf on this PR and compare with other compilers such as Roslyn.

(Quote of the day: "F# file lengths of 640K should be enough for anyone!")

KevinRansom · 2018-03-22T19:51:05Z

@dsyme, conflicts to resolve. How do you propose investigating the perf implications you mention?

KevinRansom

Lots of bit twiddling and masking here. Looks good though.

dsyme · 2018-03-27T23:51:34Z

> @dsyme, conflicts to resolve. How do you propose investigating the perf implications you mention?

@KevinRansom Good question. I ran the compiler perf script a few times and it seems to be slower by around 3-5%, which is kind of what I expected. BUT, I'm always a bit sceptical of the reliability of the data collection in that compiler perf script....

It's possible we should just do this properly by changing to use the character-offset-from-start-of-file. That is, however, a more intrusive change....

auduchinok · 2018-03-28T07:51:00Z

> It's possible we should just do this properly by changing to use the character-offset-from-start-of-file. That is, however, a more intrusive change....

That would be great.

abelbraaksma · 2018-03-28T19:15:54Z

Great work! I don't think you should let the 3% performance drop stop you from merging this, it solves quite a few stability problems already. And for code-generation this is great, because sometimes these files get much larger than 64k lines.

This would solve some really old issues, for instance with source stepping, i.e. #758 from 2015 (I know there were others, but couldn't locate them anymore).

It also has a StackOverflow reference: https://stackoverflow.com/questions/33794570.

abelbraaksma · 2018-03-28T19:19:17Z

In case it helps, there's a repro sln in #758 that could perhaps be used to write a test case for this.

cartermp · 2018-07-21T18:38:27Z

@dsyme I'd like to consider this for dev16, as the issue in #758 implies this is the right stability fix, and it will also help other tooling that does things that aren't AST-based.

dsyme · 2018-07-30T15:27:56Z

> @dsyme I'd like to consider this for dev16, as the issue in #758 implies this is the right stability fix, and it will also help other tooling that does things that aren't AST-based.

@cartermp ok

KevinRansom · 2018-09-15T03:09:56Z

@dsyme, sorry mate resolution required :-(

KevinRansom · 2018-09-26T18:14:40Z

@dsyme merge issues :-)

cartermp · 2018-10-16T17:07:11Z

After talking with @dsyme about this, the approach here is fundamentally flawed, and packing in more information here will result in compiler and tooling slowdowns that we're hesitant to accept. The right way to store this information is as an offset, calculating range information as needed rather than flowing that larger amount of data everywhere. Such a change is more invasive, but the correct approach.

@dsyme do you want to keep this open?
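The offset-based alternative mentioned in the comment above can be sketched roughly as follows: store only character offsets, and compute line and column on demand from a per-file table of line starts. This is an illustrative sketch under assumptions, not code from the compiler; `line_starts` and `offset_to_line_col` are hypothetical names, and the 1-based line / 0-based column convention is chosen here for the example.

```python
from bisect import bisect_right


def line_starts(text):
    """Offsets at which each line begins (line 1 starts at offset 0)."""
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            starts.append(i + 1)
    return starts


def offset_to_line_col(starts, offset):
    """Convert a character offset to a 1-based line and 0-based column."""
    line_idx = bisect_right(starts, offset) - 1
    return line_idx + 1, offset - starts[line_idx]


source = "let x = 1\nlet y = 2\nprintfn \"%d\" (x + y)\n"
starts = line_starts(source)

# A range is just a (start_offset, end_offset) pair; line and column are
# only computed when something (an error message, an editor) needs them.
start = source.index("y =")
end = start + 1
print(offset_to_line_col(starts, start))  # (2, 4)
print(offset_to_line_col(starts, end))    # (2, 5)
```

The trade-off in this sketch is that each lookup costs a binary search over the line table, while the value passed around stays small regardless of file or line length.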

dsyme · 2018-10-17T20:36:53Z

> After talking with @dsyme about this, the approach here is fundamentally flawed, and packing in more information here will result in compiler and tooling slowdowns that we're hesitant to accept.

I do think the slight perf slow down is probably still acceptable given we just have to fix this problem. I'll bring this up to date and do some more perf checks.

dsyme · 2018-10-24T12:52:56Z

This is ready, I think we should merge it and just be done with the issue. We have to be correct in these cases and the indications above are that performance is reasonable.

cartermp · 2018-10-25T17:23:12Z

@KevinRansom @TIHan could you give this a review? Given @dsyme's investigations I think this is worth taking, especially because it fixes at least two bugs. It'd be good to do it the Right Way ™️, but that's very invasive work and this is ready to be flighted in dev16 previews.

TIHan · 2018-10-25T18:15:58Z

Created an issue to track that we need perf improvements in the future: #5826

* Fix large ranges
* update FCS
* I can't add up
* fix build
* test fix
* fix tests
* fix tests
* fix tests

dsyme added 2 commits March 9, 2018 12:53

Fix large ranges

cde161e

update FCS

a070b38

dsyme changed the title ~~Fix large ranges~~ [CompilerPerf] Fix large ranges Mar 9, 2018

I can't add up

e49ac8a

KevinRansom approved these changes Mar 22, 2018

integrate master

1feaceb

dsyme added 2 commits March 29, 2018 00:29

fix build

fa2ed0d

Merge branch 'master' of http://github.com/Microsoft/visualfsharp into range1

46be8be

cartermp mentioned this pull request May 30, 2018

[WIP] Encode pos as a pair of 16 bit values #5044

Closed

cartermp added this to the 16.0 milestone Jul 21, 2018

cartermp mentioned this pull request Aug 13, 2018

Long format string causes bogus error in editor FS0001 #5041

Closed

dsyme added 6 commits October 23, 2018 15:41

merge master

83ee64a

test fix

3b4c80d

fix tests

27df13a

Merge branch 'master' of http://github.com/Microsoft/visualfsharp into range1

84572d0

fix tests

ff23033

fix tests

ed68599

TIHan approved these changes Oct 25, 2018

TIHan merged commit 4a368d4 into dotnet:master Oct 25, 2018

TIHan mentioned this pull request Oct 25, 2018

Range Performance Improvements #5826

Closed

This was referenced Oct 25, 2018

Long line of union cases, records, module, function names breaks coloring (512 characters, specifically) #4268

Closed

Large, single-file sources of F# get the line number and debug-stepping totally wrong #758

Closed

brettfo mentioned this pull request Dec 3, 2018

Invalid range when parsing generated yacc files. #5976

Closed

cartermp mentioned this pull request Mar 5, 2019

Problems with very long lines and/or files fsprojects/fantomas#119

Closed

cartermp mentioned this pull request Mar 3, 2020

Hmm? #8649

Closed

cartermp mentioned this pull request May 2, 2020

Internal compiler error when creating Portable PDB files for source files with very long lines #3866

Closed

nosami pushed a commit to xamarin/visualfsharp that referenced this pull request Jan 26, 2022

[CompilerPerf] Fix large ranges (dotnet#4476)

67c0d46

* Fix large ranges
* update FCS
* I can't add up
* fix build
* test fix
* fix tests
* fix tests
* fix tests
