Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallel type checking for impl files with backing sig files #13737

Merged
merged 33 commits into from Sep 21, 2022

Conversation

nojaf
Copy link
Contributor

@nojaf nojaf commented Aug 20, 2022

Trying to pick up where #11152 left.

@nojaf nojaf changed the title Apply original changes of TIHan. Revisit Parallel type checking for impl files with backing sig files Aug 20, 2022
@dsyme
Copy link
Contributor

dsyme commented Aug 22, 2022

@nojaf Any idea what actually goes wrong here?

@nojaf
Copy link
Contributor Author

nojaf commented Aug 22, 2022

@dsyme for some reason

C:\Users\nojaf\Projects\fsharp\src\Compiler\Checking\InfoReader.fs(932,5): error FS0971: Undefined value 'ExcludeHiddenOfMethInfos' [C:\Users\nojaf\Projects\fsharp\src\Compiler\FSharp.Compiler.Service.fsproj]

is no longer recognised.
At first glance, I'm not quite sure why. I'll try and reproduce this on a smaller scale.

@nojaf
Copy link
Contributor Author

nojaf commented Aug 22, 2022

@dsyme I can reproduce it with the following:

signature:

module FSharpPlaygroundProject.InfoReader

type  TcGlobals = { A: int }

type  ImportMap = { A: int }

type  range = { A: int }

type  MethInfo = { A: int }

/// An InfoReader is an object to help us read and cache infos.
/// We create one of these for each file we typecheck.
type InfoReader =

    new: g: TcGlobals * amap: ImportMap -> InfoReader

    static member ExcludeHiddenOfMethInfos:
        g: TcGlobals -> amap: ImportMap -> m: range -> minfos: MethInfo list list -> MethInfo list

val  ExcludeHiddenOfMethInfos:
    g: TcGlobals -> amap: ImportMap -> m: range -> minfos: MethInfo list list -> MethInfo list

implementation:

module  FSharpPlaygroundProject.InfoReader

type TcGlobals = { A: int }
type ImportMap = { A: int }
type range = { A: int }
type MethInfo = { A: int }

/// An InfoReader is an object to help us read and cache infos.
/// We create one of these for each file we typecheck.
type InfoReader(g: TcGlobals, amap: ImportMap) =
    static member ExcludeHiddenOfMethInfos
        (g: TcGlobals)
        (amap: ImportMap)
        (m: range)
        (minfos: MethInfo list list)
        : MethInfo list =
        []

let ExcludeHiddenOfMethInfos (g: TcGlobals) (amap: ImportMap) (m: range) (minfos: MethInfo list list) : MethInfo list =
    InfoReader.ExcludeHiddenOfMethInfos g amap m minfos

The problem goes away when ExcludeHiddenOfMethInfos is renamed to ExcludeHiddenOfMethInfos2.
So I believe it to be a naming clash between a static member and a top-level binding.

@dsyme
Copy link
Contributor

dsyme commented Aug 22, 2022

@nojaf I believe the problem is that all the signatures (e.g. of InfoReader.fsi) are present in the tcState at this line https://github.com/dotnet/fsharp/pull/13737/files#diff-efdad7fa53b23cfadc0c316627cd7f8833d0a531db500cc7c220cc4fa0465ab2R1384-R1387 where they shouldn't be. For example, when processing InforReader.fsi only the signatures up to (but not including) InforReader.fsi should be in the tcImplEnv of tcState.

I think the tcState and other inputs to use in the delayed parallel work need to be exactly the ones that would have been used here https://github.com/dotnet/fsharp/pull/13737/files#diff-efdad7fa53b23cfadc0c316627cd7f8833d0a531db500cc7c220cc4fa0465ab2R1227

It is probably better to replace these lines with some more explicit return of the delayed stuff to be processed later, including the relevant tcState etc. Plus this entire section](https://github.com/dotnet/fsharp/pull/13737/files#diff-efdad7fa53b23cfadc0c316627cd7f8833d0a531db500cc7c220cc4fa0465ab2R1245-R1290) should likely be factored out as a separate function representing "work to be done to incorporate the results once the check of the impl file is completed". I can pair program a draft with you if you like.

@dsyme
Copy link
Contributor

dsyme commented Aug 22, 2022

The problem goes away when ExcludeHiddenOfMethInfos is renamed to ExcludeHiddenOfMethInfos2. So I believe it to be a naming clash between a static member and a top-level binding.

This is definitely going to be because the signature is present in the name-checking environment when it shouldn't be (the presence of the signature causes multiple InfoReader names to be in scope in a way that's slightly different to the scoping available when starting without the signature present - the latter being what "should" be used for InfoReader.fs).

There will be many other manifestations of this problem - e.g. the current diff would allow implementation files to see "later" signature files in the compilation order.

Happy to pair-program a better starting point, the current one is just not logically right

@dsyme
Copy link
Contributor

dsyme commented Aug 22, 2022

@nojaf we should turn it on by default in the PR for testing purposes

@dsyme
Copy link
Contributor

dsyme commented Aug 23, 2022

@nojaf Bootstrap now completes ✔️

I did a quick performance check on my machine after integrating #13740 (comment)

That ignores the time for the Proto phase. Note clean builds hit #13730 so are a PITA until that's fixed.

That's looking very promising given that

  • quite a lot of the files being compiled don't have signature files (FSharp.Build, most test files)
  • there's also significant fsyacc and fslex work in there

@dsyme dsyme force-pushed the parallel-type-checking branch 2 times, most recently from 360dfa5 to 2144ba9 Compare August 23, 2022 01:56
@nojaf
Copy link
Contributor Author

nojaf commented Aug 23, 2022

Wonderful news! I'm having similar results when building the Fantomas solution (a couple of projects have signatures, some don't have any)

Before: 17.49sec
After: 14.28sec

@dsyme
Copy link
Contributor

dsyme commented Aug 23, 2022

I pushed some more fixes regarding diagnostics

There's quite a lot of cleanup in this branch now, we could factor it out, though I'm actually happy for the thing to go in as disabled or preview-enabled once green, I'll talk to @vzarytovskii and @KevinRansom about it.

Getting exactly the same diagnostics emitted is a bit tricky. The compiler uses this checkForErrors thing to prevent cascading errors for multiple phases of checking a file, and also from file to file. So an error in an earlier file (e.g. signature file) can cause some errors to be suppressed in later files. This makes it hard to get perfect match in diagnostics produced. But what's there now is very close.

@dsyme
Copy link
Contributor

dsyme commented Aug 23, 2022

@nojaf I could see this causing a bunch of features....

  • "add signatures for all files" action in IDEs
  • IDE assists to maintain signatures under change to implementation
  • IDE assists aligning comments and other metadata in signature and implementation
  • Enable incremental compilation if the inferred signature of an implementation file hasn't changed (hard - requires intra-compilation state)

@nojaf
Copy link
Contributor Author

nojaf commented Aug 23, 2022

Yes, generating all signatures correctly is already on my radar:
https://github.com/dotnet/fsharp/issues?q=is%3Aissue+label%3AArea-Compiler-SigFileGen+author%3Anojaf

@dsyme
Copy link
Contributor

dsyme commented Aug 23, 2022

@nojaf The failures now look mostly related to the fact that we delay formatting of diagnostics until "after the whole file has been checked", which means more type inference parameters have been solved. This causes some changes in the types reported in the diagnostics.

I need to have a think about the best way to approach this. We use the delay-and-replay-diagnostics in other places but this is the first time we're doing it for anything involving type inference. I guess we need to format the diagnostics early.

@dsyme
Copy link
Contributor

dsyme commented Aug 23, 2022

@nojaf I guess people would also love some kind of "store local signature files silently, automatically keep them up to date each time I compile, use them for better compilation performance if they're there, if something changes then update them and re-do compilation" feature.

1 similar comment
@dsyme
Copy link
Contributor

dsyme commented Aug 23, 2022

@nojaf I guess people would also love some kind of "store local signature files silently, automatically keep them up to date each time I compile, use them for better compilation performance if they're there, if something changes then update them and re-do compilation" feature.

@dsyme
Copy link
Contributor

dsyme commented Aug 23, 2022

@nojaf Determinism failed for the following binaries:

FSharp.Compiler.Service.dll (old hash: FD69A91B4E4F03271607DD246B279941; new hash: ED75EABE3B10F32C12064BB56812E5B6)

This sort of thing is going to be quite painful to track down :) We will probably need a better binary diff to determine differences in IL metadata (dumping ILDASM and doing a diff may be a good start, but likely even more detail is needed)

@nojaf
Copy link
Contributor Author

nojaf commented Aug 23, 2022

people would also love some kind of "store local signature files silently, automatically keep them up to date each time I compile, use them for better compilation performance if they're there, if something changes then update them and re-do compilation" feature.

Yeah, I thought about that, but I think you typically will want to generate the signatures once.
Throw out what you don't want to expose and keep impl/sig in sync afterwards.

As soon as the implementation no longer matches the signature you want to alt-enter to sync the signature file. That is at least the approach I'm thinking of.

@nojaf
Copy link
Contributor Author

nojaf commented Aug 23, 2022

Determinism failed for the following binaries:

Would any of this have an impact on the stability of the mvid in references assemblies?

@kerams
Copy link
Contributor

kerams commented Aug 23, 2022

Yeah, I thought about that, but I think you typically will want to generate the signatures once.
Throw out what you don't want to expose and keep impl/sig in sync afterwards.

I imagine that would be geared towards users that don't need or care about signature files (like me), but would like to reap the performance benefits of having them. I mean, why not have the compiler always generate hidden signature files (unless explicit signatures exist) in the bin folder? Sort of like ref assemblies, but on the file level.

@auduchinok
Copy link
Member

I mean, why not have the compiler always generate hidden signature files (unless explicit signatures exist) in the bin folder?

Because it'll break how inferred type changes flow between files.

@Lenne231
Copy link

I mean, why not have the compiler always generate hidden signature files (unless explicit signatures exist) in the bin folder?

Because it'll break how inferred type changes flow between files.

Would it be possible to enforce explicit signatures of public APIs in fs files with a compiler flag? A fs file could be treated similar to a signature file then, right?

@auduchinok
Copy link
Member

Would it be possible to enforce explicit signatures of public APIs in fs files with a compiler flag? A fs file could be treated similar to a signature file then, right?

This is what I've also had in mind. It could work and would probably be much more pleasant than requiring signature files. @dsyme have you considered that before?

@kerams
Copy link
Contributor

kerams commented Aug 23, 2022

Because it'll break how inferred type changes flow between files.

Couldn't we also invalidate the signature files of all dependent files?

@dsyme
Copy link
Contributor

dsyme commented Aug 23, 2022

@auduchinok @kerams @Lenne231 let's take the discussion of compilation and tooling related to signature files to a discussion thread or similar.

@nojaf

Would any of this have an impact on the stability of the mvid in references assemblies?

I'm not sure. Most likely it's non-determinism in generation of compiler-generated names of some kind.

@dsyme
Copy link
Contributor

dsyme commented Aug 23, 2022

@nojaf I think I've fixed the remaining errors. The core of the fix is to do eager formatting on diagnostics as they are produced in the parallel type checking.

There's another set of code simplifications in the latest commits:

  • SplitRelatedDiagnostics was old code that never actually returned any "related" diagnostics, so the second list was always empty. The routine actually basically doesn nothing and should likely be removed.

  • The "DiagnosticLoggerProvider" stuff was over-complicated and could be simplified out

@nojaf
Copy link
Contributor Author

nojaf commented Sep 19, 2022

Hello @vzarytovskii and @dsyme, any thoughts on how we can move forward with this?
I understand you probably don't want this for 17.4 but could we merge it into main maybe?

@vzarytovskii vzarytovskii merged commit 51635bb into dotnet:main Sep 21, 2022
@nojaf
Copy link
Contributor Author

nojaf commented Sep 22, 2022

Many thanks @vzarytovskii for merging this in!

@TIHan
Copy link
Member

TIHan commented Sep 24, 2022

Congrats @nojaf for your work on this! Congrats to the F# team on getting this in!

@nojaf
Copy link
Contributor Author

nojaf commented Sep 25, 2022

Thank you @TIHan, although I can hardly take any credit here. The original idea came from yourself, @dsyme has been a tremendous help during the implementation and the F# team supported us to get in it.
I'm very excited about this either way 🎉.

@nojaf nojaf deleted the parallel-type-checking branch September 25, 2022 09:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants