Parallel type checking #11550

kerams · 2021-05-11T20:25:24Z

The recent awesome work on parallel parsing and IL gen got me thinking whether it would be feasible to parallelize type checking as well.

Given F#'s significant file order, if the compiler can quickly establish that file B can't possibly reference anything in file A declared directly above it, can't A and B be typechecked (and optimized) simultaneously? This would require an initial traversal of the AST in every file in order to create a file dependency graph. Specifically, things like this need to be checked:

If B opens a module or class defined in A, B depends on A
If A contains an AutoOpen that has immediate effect in B (because of overlapping namespaces, regardless of actual contents), B depends on A
If B references something from A using a qualified name, B depends on A

I think these can be gleaned from the untyped syntax tree alone, the question is whether it would be fast enough to offset the overhead and whether typechecking has a significant cost. Apologies in advance if there's a gaping flaw in here somewhere :).

The text was updated successfully, but these errors were encountered:

smoothdeveloper · 2021-05-12T01:03:12Z

@kerams, would you mind paying a look at the recent comments I made in the broader/fuzzy discussion for incremental recompile (#8044)?

What I touch on my comments in #8044 is targeted at identifying if a file and a diff of signatures before it should engender a recompile of the file.

The same approach could be taken earlier, it could be on diff of AST that we could already identify opportunities to skip recompile of a file, or more likely, proceed in checking the next stage.

Why is it related to incremental recompilation?

Because incremental recompilation will require serialisation and heuristics on the serialized graphs, and making things like a graph and serialised, means they can conceptually be parallelised for read only usage 🙂 (I know I'm stretching everything a little bit).

Now, regarding your more precise issue, I have few things coming to mind:

Are you familiar with the NameResolution business? if there was a way to populate those after AST parsing, and do the checks that you are looking for, and serialise the outcome, it would benefit both some of the rough ideas I explore for incremental recompilation and as you say, gets us closer to the ability to type check in parallel, according to your more precise plot (the F# world domination, errr, compilation plan, I presume).

If NameResolution as we know (but me) is not good enough (because needs a first pass of type checking, or parsing of references), we could maybe have a lower one, that still targets few use cases for your parallel type checking graph splitter endeavours.

That would be the beginning of a NameResolution that enrich at each stages:

AST
assembly reference loading
later type checking
codegen (not sure if relevant, but maybe for relinking, in context of type provider SDK...)

It means invalidation of the serialised NameResolutions could be fine grained in terms of which context it impacts.

One big pitfall, at least, nest of edge cases, is shadowing and aliasing features of the language, and also stuff like AutoOpen and open static.

It sounds like "after assembly reference resolution" maybe the next stage if bare AST can't help much in sorting out what you are pointing at.

edit:

Apologies in advance if there's a gaping flaw in here somewhere :)

Don't worry, you are covered at least 10x by all mine :)

TIHan · 2021-05-12T16:23:32Z

We have a PR that could enable this: #11152

The restriction is that it will only parallelize type checking impl files that have a backing signature file. Our hope in the future is that we will add better tooling to create these signature files automatically through a gesture in an editor.

Trying to parallelize type checking for impl files without sig files is a much more difficult undertaking. Sure, you could add heuristics like you suggested, but we would have to have a dependency graph of types and functions; which may be more expensive in of itself.

kerams · 2021-05-12T19:32:58Z

Yes, I'm specifically talking about no sig scenarios.

a dependency graph of types and functions;

Yes, that would be expensive. I was thinking keeping track of namespaces and modules alone could theoretically suffice. For instance, if we collect those for file A and see a related open in B, we can't parallelize. It does not matter what exactly is used (or if the open is indeed redundant). This heuristic would only guarantee what definitely isn't used, and would because of this specificity hopefully run very fast.

@smoothdeveloper, I'm afraid I don't know enough about those topics, or whether your approach, if viable, would supersede mine (perhaps they could even complement each other). My evening thoughts were very narrowly concerned about full compilation performance.

kerams · 2021-05-12T21:18:27Z

Here's a really silly idea. If I extract all namespaces and top level modules (a very shallow AST traversal would be enough) from file A and do a full text search for them in file B, will I not be perfectly sure that the files are independent if there are no hits (other than perhaps a shared top level namespace)? :)))))

I'll answer myself. The aforementioned auto opens would cause a problem, but that can be easily checked separately. Partially qualified opens are worse though.

smoothdeveloper · 2021-05-12T22:27:05Z

It seems to me like a great bang for buck first pass at splitting out stuff just at the AST level, like having both a genius and captain obvious being one and single person bringing this :D

I'm curious what roadblocks would be hit by this heuristics to turn faulty in the wild.

@vasily-kirichenko, any opinion?

cartermp · 2021-06-04T16:53:31Z

Will move this to a discussion

cartermp closed this as completed Jun 4, 2021

dotnet locked and limited conversation to collaborators Jun 4, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

This issue was moved to a discussion.

Parallel type checking #11550

Parallel type checking #11550

kerams commented May 11, 2021 •

edited

Loading

smoothdeveloper commented May 12, 2021 •

edited

Loading

TIHan commented May 12, 2021 •

edited

Loading

kerams commented May 12, 2021

kerams commented May 12, 2021 •

edited

Loading

smoothdeveloper commented May 12, 2021

cartermp commented Jun 4, 2021

This issue was moved to a discussion.

This issue was moved to a discussion.

Parallel type checking #11550

Parallel type checking #11550

Comments

kerams commented May 11, 2021 • edited Loading

smoothdeveloper commented May 12, 2021 • edited Loading

TIHan commented May 12, 2021 • edited Loading

kerams commented May 12, 2021

kerams commented May 12, 2021 • edited Loading

smoothdeveloper commented May 12, 2021

cartermp commented Jun 4, 2021

This issue was moved to a discussion.

kerams commented May 11, 2021 •

edited

Loading

smoothdeveloper commented May 12, 2021 •

edited

Loading

TIHan commented May 12, 2021 •

edited

Loading

kerams commented May 12, 2021 •

edited

Loading