Plan: Modernizing F# Analysis #11976

TIHan · 2021-08-16T17:55:24Z

@dsyme and I sat down to document my overall plan for modernizing the way F# and FCS do analysis (Don did most of the writing here). Much of this work is already done; this documents the plan end-to-end. Here's what we came up with. Please comment and discuss below.

See also #7077 for a previous description.

Planning: Modernizing F# Analysis

This note describes the technical agenda to "modernize" the FCS analysis services to use best-known techniques from Roslyn.

Executive summary

The core of the plan is to adopt a more Roslyn-like model of analysis, based on

Immutable snapshots of the contents of documents and projects
Immutable views of their enrichment with analysis information
A cacheless compiler-service API

In the long term this agenda delivers multiple critical benefits:

High-performance multi-threaded analysis
A more reliable basis for implementing multiple IDE features, including cross-file refactorings and analysis
Allows the features "in-memory documents" and "in-memory cross-project references to C#", simplifying the user experience of using F# in Visual Studio.
It aligns F# with the architectural principles of Roslyn, allowing contributors to transfer experience between the two
Looking forward, gives a strong basis for reliably making F# analysis more incremental w.r.t. incremental changes in inputs.
Looking forward, gives a strong basis for reliably building a simple, reliable "out-of-proc" LSP implementation for F# following Roslyn design principles.
Our compiler testing framework can be simplified; it will not read files on disk in order to run a test that verifies parsing and type-checking behavior.

The Current Situation and why it's a Problem

FCS provides services to compute analysis information from inputs. For the current API, some of these inputs are filenames, and so FCS relies in part on the state of the file system, which is highly mutable state and is highly problematic.

Specifically, when requesting the analysis of a file in a project (e.g. for a refactoring or a tooltip), the state of the current file is captured as a "snapshot", but the state of other files in the project is accessed via the file system as the analysis proceeds. This causes three problems:

A. The state of these files as saved on disk may have changed in-between

B. Differences between the saved and unsaved contents of in-memory buffers in the IDE.

C. It is extremely error-prone to implement incremental updates to analysis w.r.t. incremental changes in input

Additionally, FCS had two other major problems:

D. FCS is stateful and implements multiple kinds of caching for parsing and analysis

E. FCS was single-threaded, with a "reactor thread" compilation lock

Problem A leads to:

Repeated polling checks on timestamps of files whenever checking for validity of the results
A very frequent source of bugs (BUG LINKS)

Problem B leads to:

Confusion for the user who doesn't understand that prior files must be saved.
Double type-checking of open files in a project when a change is made: once as part of the "background" build that is done using on-disk representations, once as part of the "foreground" build in order to get diagnostics for currently open documents. This is not something users perceive, but is a potential overall performance gain we can deliver for large projects.
Unnecessary complexity and distinctions in the FCS API that can make it difficult to understand what's going on; for example, we must document and test the differences between foreground and background checking.
A slew of bugs in cross-file refactorings (LINK LINK), cross-file goto-definition, cross-file tooltips (based on saved, not unsaved contents)
Missed opportunities to take advantage of F# language features for more efficient incremental checking, in particular signature files.

Problems C & D lead to a slew of bugs related to not invalidating cache entries with regard to changes in on-disk files. Problem D also causes issues with memory usage and too many analysis results being "kept live" by the FCS caches.

Problems A-C also apply to the "referenced assemblies" inputs to analysis, particularly cross-project references.
Before the start of this work, specifying a cross-project reference was done via a graph of FSharpProjectOptions,
but no in-memory cross-project references were allowed to C# projects. Further, the cross-project references
led to reading input files from disk for other projects and assessing their timestamps, causing bugs
and inconsistencies.

In combination, these issues lead to a kind of "gridlock" where the root causes of the kinds of bugs we see are not addressed using best-known techniques. We patch a few bugs, which can cause other bugs, and so on. We know the solution to unlock this, which is to follow the design principles used by Roslyn.

Aside: Problem B can be partly addressed by an existing "hack" in the FCS API that allows the file system used by FCS to be "shimmed". This is used by JetBrains Rider in order to implement in-memory documents. However, this is an awkward solution that differs greatly from the Roslyn approach, and Problems A, C and D still remain.

What's Needed

The Roslyn approach to these problems is to

Make all inputs to analysis be "snapshot" objects
Make all analysis results on-demand stateless enrichments of these snapshots
Do not implement ad hoc caching of analysis objects within Roslyn, but rather allow liveness of analysis objects to determine lifetimes.

The technical agenda is based on transforming FCS to correspond to these principles.

Aside: when we say Roslyn analysis objects (e.g. Compilation LINK) are "on-demand stateless enrichments", this means there may be internal state recording what enrichments have already been computed, and this may be important for reasoning about memory usage. However, logically speaking, the analysis objects are still functional enrichments. Roslyn analysis objects are effectively like a composition of multiple lazy values, computed on-demand.
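
To make the "snapshot plus lazy enrichment" idea concrete, here is a minimal sketch in Python. It is purely illustrative and is not FCS or Roslyn code: the names `SourceSnapshot`, `ProjectSnapshot`, `CheckedProject` and the whitespace-splitting "parser" are all hypothetical stand-ins. It assumes only that inputs are immutable values and that analysis results are computed at most once, on first request.

```python
from dataclasses import dataclass
from functools import cached_property


@dataclass(frozen=True)
class SourceSnapshot:
    """Immutable snapshot of one document (hypothetical name)."""
    path: str
    text: str


@dataclass(frozen=True)
class ProjectSnapshot:
    """Immutable snapshot of a project: an ordered tuple of documents."""
    sources: tuple

    def with_source(self, updated: SourceSnapshot) -> "ProjectSnapshot":
        # An edit never mutates the old snapshot; it yields a new one
        # that shares every unchanged document with the old one.
        return ProjectSnapshot(tuple(
            updated if s.path == updated.path else s for s in self.sources))


class CheckedProject:
    """Enrichment of a snapshot; its results are computed lazily."""

    def __init__(self, snapshot: ProjectSnapshot):
        self.snapshot = snapshot
        self.parse_count = 0  # internal bookkeeping only, for illustration

    @cached_property
    def parsed(self) -> dict:
        # Stand-in for real parsing: split each document on whitespace.
        self.parse_count += 1
        return {s.path: s.text.split() for s in self.snapshot.sources}
```

In this sketch there is no global cache: the parsed result lives exactly as long as the `CheckedProject` object that holds it, which is one way to read "allow liveness of analysis objects to determine lifetimes".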

Technical Agenda

The agenda is as follows. Where new constructs are brought into existence in the FCS API, we show their correspondence to Roslyn equivalents.

And that is all.

Looking ahead

Further Incrementality

One result of the above agenda is that it provides a basis to begin to implement finer-grained incremental adjustment of analysis results w.r.t. incremental changes in inputs. Currently (at the end of the above agenda) incrementality is at the granularity of replacing the contents of an entire file. We could now consider incrementality w.r.t. adding text at the end of a file, or changes within a line. This requires incremental parsing and checking. The aim here would be higher-performance IDE analysis.

Roslyn supports this kind of incrementality but it is not an essential part of the above agenda.
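
As a rough illustration of file-granularity incrementality, the sketch below (Python, illustrative only, not FCS code) relies on the fact that F# files are checked in order, so an edit can only affect the edited file and the files after it. The function names and the `check_file` callback are hypothetical, and the sketch ignores further refinements such as signature files.

```python
def first_changed_index(old_files, new_files):
    """Index of the first position where two ordered file lists differ.

    Each list holds (path, text) pairs in compilation order.
    """
    for i, (old, new) in enumerate(zip(old_files, new_files)):
        if old != new:
            return i
    # No difference in the common prefix: any change is an added/removed tail.
    return min(len(old_files), len(new_files))


def recheck(old_files, old_results, new_files, check_file):
    """Reuse results before the first change; re-check from there onward."""
    start = first_changed_index(old_files, new_files)
    results = list(old_results[:start])
    for path, text in new_files[start:]:
        results.append(check_file(path, text))
    return results
```

With three files A, B, C in order, editing B re-checks only B and C while A's result is reused unchanged. Finer-grained incrementality (within a file or a line) would need more than this.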

LSP and Out of Process

Future changes to Roslyn will require F# to implement LSP, at least for the minimum of doing diagnostic analysis out-of-process (see #11969; note that this is a tiny part of LSP, and Ionide provides a full implementation).

An LSP implementation of F# will host FSharp.Compiler.Service and should ideally have an implementation architecture very similar to the C# out-of-proc LSP implementation. Completing this agenda allows us to use this approach. For example, the out-of-proc process will mirror the Roslyn workspace and hold handles to the appropriate FSharpProject objects, just as the C# version of the same holds a Roslyn Compilation object.

Crucially, this means the F# LSP implementation will be simple, reliable and relatively stateless (apart from holding FSharpProject
objects).


baronfel · 2021-08-16T18:03:30Z

Aside: Problem B can be partly addressed by an existing "hack" in the FCS API that allows the file system used by FCS to be "shimmed". This is used by JetBrains Rider in order to implement in-memory documents. However this is an awkward solution that differs greatly from the Roslyn approach, and Problems A, C and D still remains.

Yep, worth noting that we do this in FSAC as well and our in-memory FS is powered by LSP file changed messages. We do still have all of the mentioned issues as well 👍

I skimmed through the rest of this and it sounds generally quite nice. Interested in seeing the details of course, but I'm encouraged by everything I see here. Excellent writeup!

alfonsogarciacaro · 2021-09-09T07:27:58Z

This is great work @TIHan, looking forward to it!

Not entirely sure if it's related but just to mention that it'd be nice if this work also takes into account FSC-based compilers targeting other platforms. Particularly about incremental compilation. For Fable, we use a custom build of FSC from a branch of @ncave fork that does some simple caching (it only recompiles the first changed file and those below) and skips work that's not needed to get the typed AST (basically symbol information, if I'm not mistaken). It'd be great if we could have a similar mechanism directly built into FSC.

kerams · 2022-04-04T10:32:55Z

I assume out of process hosting implies leaving the .NET Framework world, with free runtime performance gains, type providers' design time parts being able to target .NET 6, etc.? Can't wait.

Any updates on how it's going and what the plans for the immediate future are? Obviously Will's leaving the team has thrown a wrench into this endeavor a bit.

TIHan added the Area-Compiler, Area-FCS, Area-VS-Editor, Area-LangService-API, and Plan labels Aug 16, 2021

dsyme mentioned this issue Aug 17, 2021

Migrate F# to LSP pull diagnostics #11969

Open

TIHan added the Feature Request label Aug 18, 2021

dsyme removed the Area-Compiler label Mar 4, 2022

dsyme removed the Area-VS-Editor label Mar 30, 2022

dsyme removed the Area-FCS label Apr 5, 2022

SilkyFowl mentioned this issue May 26, 2022

Is there any chance of getting a hot-reloading designer? fsprojects/Avalonia.FuncUI#115

Closed

vzarytovskii added this to the Backlog milestone Sep 21, 2022

0101 mentioned this issue Oct 6, 2022

Make Visual Studio provide FSharpSource objects based on live buffers #14033

Closed

0101 mentioned this issue May 3, 2023

[Experimental] [WIP] Transparent Compiler #15179

Merged

22 tasks

T-Gro self-assigned this Dec 7, 2023

T-Gro modified the milestones: Backlog, December-2023 Dec 7, 2023

T-Gro modified the milestones: December-2023, January-2024 Jan 4, 2024

vzarytovskii removed this from the January-2024 milestone Feb 5, 2024

brianrourkeboll mentioned this issue Jun 7, 2024

FS-1135 implementation - random functions for collections #17277

Merged

3 tasks
