Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move to System.Reflection.Metadata? #3976

Open
jack-pappas opened this Issue Nov 19, 2017 · 18 comments

Comments

Projects
None yet
10 participants
@jack-pappas
Copy link
Contributor

commented Nov 19, 2017

Is it worth simplifying/eliminating/consolidating a big chunk of the F# compiler code by replacing most of the AbsIL code with the System.Reflection.Metadata library? That’s the library used by Roslyn to read/write .NET assemblies; switching over to using that library would more closely align the F# compiler with the Roslyn compilers and reduce the amount of code we need to manage/maintain in the F# compiler. It’s also been well-optimized so may provide some performance benefits during compilation.

If there’s interest in this from the F# compiler team (e.g. @dsyme or @KevinRansom), I’m willing to take a shot at prototyping it.

@dsyme

This comment has been minimized.

Copy link
Contributor

commented Nov 20, 2017

@jack-pappas I'm not in favour for two reasons

  • It's a big job and destabilizing

  • We know that the F# language design and core implementation is sufficiently independent that it allows things like Fable to achieve viable lift-off from a technical perspective. We should keep it that way.

Specifically for (2), the Fable/Javascript port of F# includes the binary reader/writer for components on the .NET IL. Even though the compiler is run using Javascript, it still uses the .NET IL format as its binary format, both for input and output metadata. The same would apply whenever we cross-compile the F# compiler to other runtimes - e.g. there is an Erlang-runtime port of F#.

(2) means there are a major long-term advantages to having the F# compiler be an all-F# program: it helps maintain the technical independence of F# from the .NET runtime. This is at the cost of actually implementing the binary reader/writer in the first place, but that cost is paid.

(2) also implies that I'm generally not in favor of FSharp.Compiler.Service.dll picking up any new dependencies on C#-implemented libraries, ever. For things like emitting debug symbols it is less contentious, and we did pick up a new dependency there (though again I would generally prefer not to).

@dsyme dsyme added the discussion label Nov 20, 2017

@KevinRansom

This comment has been minimized.

Copy link
Contributor

commented Nov 20, 2017

@dsyme, @jack-pappas.

The advantage of moving over to libraries shared with Roslyn are that we can leverage more of the Roslyn team for work. The disadvantage is that if we manage it ourselves we can leverage more of the F# Open Source community. We have gained enormous benefits from the work of the community and the flexibility that offers them in implementing new things.
I think we should continue that way ... leveraging Roslyn when it makes sense but otherwise, continue to plough our own furrow.

@cartermp

This comment has been minimized.

Copy link
Collaborator

commented Nov 20, 2017

@dsyme (and @alfonsogarciacaro)

We know that the F# language design and core implementation is sufficiently independent that it allows things like Fable to achieve viable lift-off from a technical perspective. We should keep it that way.

Say that, prior to Fable being created, we deferred to S.R.M. for reading/writing IL. Would that have prevented Fable from being developed in the first place?

@dsyme

This comment has been minimized.

Copy link
Contributor

commented Nov 20, 2017

Would that have prevented Fable from being developed in the first place?

Yes.

That is, it would have likely prevented the cross-compilation of the F# compiler with Fable to execute on Javascript, maintained in this branch. This form of the F# compiler is used as the basis for the Fable REPL and has been mooted to be used for part of the re-implementation of tryfsharp.org.. Moving to S.R.M would almost certainly terminate that effort or cause a lot of extra re-implementation (basically forcing those contributors to re-implement S.R.M in F# or extend Fable to consume C#), which is not something I feel we should do.

I think we should continue that way ... leveraging Roslyn when it makes sense ...

An all-F# implementation of F# is much preferred for long-term language-independence. Otherwise F#'s existence and manifestations just becomes fundamentally dependent on Roslyn. The Fable port of the F# compiler is one great example, but Fez is another (F# to Erlang) and it's quite possible there will be others. The ideal is the FSharp.Compiler.Service.dll has no dependencies besides FSharp.Core.dll.

@alfonsogarciacaro

This comment has been minimized.

Copy link
Contributor

commented Nov 20, 2017

@ncave has done all the job of porting the F# compiler to Fable so I'm not the biggest authority here ;) but just a few notes:

  • There're two things: the Fable CLI and the Fable REPL. The former is still a netcore application, so it shouldn't be an issue if FCS had other .dll dependencies (though it makes everything easier if there're none), while the latter is FCS+Fable sources dogfooded into Fable CLI to create an F#-to-JS compiler that can run entirely on the browser. This would probably hadn't be possible if FCS had dependencies as @dsyme confirms.
  • Fable can only compile source files, no dlls (that's why Fable libraries include sources in the .nupkg). This is because FCS can only provide the AST from source files.
  • FCS still needs some of the .dlls from the BCL to read the type metadata. This is also a requirement in the Fable REPL. It was discussed if another format should be adopted, but precisely because FCS contained the code to read the binary files and Fable could compile the code to JS, the easiest option was to keep the .dll format.

The only problem with this is the Fable REPL has to download around 2MB of gzipped dll files. I would like if there was way to have only the metada without the rest of the code, be it in a .dll or a different format. But it's not really a big issue.

@ncave

This comment has been minimized.

Copy link
Contributor

commented Nov 21, 2017

IMO the easy availability of the F# IL reader/writer makes tinkering with assemblies (e.g. exporting the BCL metadata, porting to other platforms, etc.) trivial, thus lowering the bar for entry.

@alfonsogarciacaro The whole netcore 2.0 BCL metadata is less than 0.8 MB (gzipped), and FSharp.Core metadata is another 0.5 MB (gzipped).

@zpodlovics

This comment has been minimized.

Copy link

commented Nov 22, 2017

Please keep the whole compiler in pure F# (including the F# IL writer/reader), because it will allows F# to port F# to different runtimes. For example there is a port in the pipe (as I the compiler is implemented similarly to Fable) to Erlang BEAM virtual machine (https://github.com/kjnilsson/fez). I also have plans - but no time yet - to port a minimal subset to LLVM (probably on top of https://github.com/kp-tech/fshlvm).

@dsyme

This comment has been minimized.

Copy link
Contributor

commented Nov 22, 2017

@zpodlovics Potentially porting F# to the JVM would also be another example - no on has bitten that off but it is not impossible by any means (you would just lose some features, e.g. tailcalls not guaranteed, generics would box etc.)

@cartermp

This comment has been minimized.

Copy link
Collaborator

commented Dec 6, 2017

I'll close this as it's clear that we won't be moving to S.R.M. any time soon (or ever).

@cartermp cartermp closed this Dec 6, 2017

@TIHan

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2019

Re-opening, as moving to SRM could actually be on the horizon. I've re-implemented ilread.fs using SRM, https://github.com/TIHan/visualfsharp/blob/new-metadata-reader/src/fsharp/FSharp.Compiler.Private/ilread_srm.fs
We still have the same API as before, it's just the implementation is using SRM. It drastically simplifies the previous implementation and technically performs better. There is still work to be done, see "TODO:" in the source code. Some parts still need to read bytes manually but it's still simpler than what was there before. It will need heavy testing, but so far it can open mscorlib and .net core assemblies, as well as others.

The remaining unknowns are native resources, managed resources, and pdb info (such as sequence points) as I'm not sure what to do about them yet.

@TIHan TIHan reopened this Apr 9, 2019

@cartermp

This comment has been minimized.

Copy link
Collaborator

commented Apr 9, 2019

In summary,

image

@jack-pappas

This comment has been minimized.

Copy link
Contributor Author

commented Apr 10, 2019

@TIHan @cartermp I still think this is a good idea, but others had some concerns about it (above). What’s changed since then that you’re going ahead with the change now (in spite of those concerns)?

@cartermp

This comment has been minimized.

Copy link
Collaborator

commented Apr 10, 2019

We have an approach that doesn't involve ripping things out. F# on .NET would use SRM, and there would be a configuration to not use it in other contexts, should someone decide to take on the (insanely complicated) task of doing something like porting F# to the JVM or LLVM.

The primary motivations are perf and maintainability. Initial benchmarks show that it's considerably faster and uses significantly less memory. SRM is pretty highly-optimized, so that shouldn't be a surprise. Reading IL in particular is significant from a perf perspective for F# in an IDE, so there are very concrete benefits to doing this for the large majority of F# users.

@TIHan

This comment has been minimized.

Copy link
Contributor

commented Apr 10, 2019

Initial benchmarks show that it's better in perf, but this is with a lot of caching enabled so that may have something to do with it plus we still have more work to do that could impact perf later on; though we got most of the major stuff implemented already. The primary motivation of it was to improve the codebase and as a consequence, improves perf a bit. Thus, I believe it is worth it.

Yes, the concerns presented above are real in regards to Fable. I don't have a clear picture on what we need to do that could help them in this context. This type of change is to be well communicated in advance. I don't know when we will actually start trying to get this in; it will be a while, but not never.

We want to improve the F# compiler and tooling. We want to make it faster and more maintainable. There is no smoking bullet to make that happen; we need to try to improve everything. This is just another step into that direction.

@smoothdeveloper

This comment has been minimized.

Copy link
Contributor

commented Apr 10, 2019

@TIHan it sounds very interesting, do you happen to know if having this in place would help on the issue I was describing in #2345

The issue back then (not sure it still applies): memory was taken from referenced assemblies embedding large resources.

Finding ways to trim resources when reading the IL into memory for compile / type checking purpose could save a great deal of space.

When this code will get rearchitected, it may be worth to try to get a clear cut between the state and the querying over it that occurs later in the code, eventually those could be split for out of process cache of IL references in context of VS language service.

@TIHan

This comment has been minimized.

Copy link
Contributor

commented Apr 10, 2019

The issue you described should be resolved due to Don's work to integrate with roslyn metadata stuff. It shouldn't always load the resources into memory today.

@matthid

This comment has been minimized.

Copy link
Contributor

commented Apr 10, 2019

It drastically simplifies the previous implementation

So to be clear: What I read is that Microsoft will no longer maintain the F# version of the metadata reader and writer, but it will be handed over to the community, correct?

If this is true, one additional thing to consider is that it might be hard for the community to make sure the quality of the F# version will be "on-par" with the SRM implementation. In practice, that means breaking some of the above use-cases until someone is willing to PR a bugfix and then waiting for the next release.

Note: I'm not saying we shouldn't do this — just something we might need to keep in mind. Maybe we can do something about it (process wise or via some other way)?

Perhaps it's a non-issue just something I wanted to bring up into the discussion.

@dsyme

This comment has been minimized.

Copy link
Contributor

commented Apr 10, 2019

So to be clear: What I read is that Microsoft will no longer maintain the F# version of the metadata reader and writer, but it will be handed over to the community, correct?

I think it would become part of the Fable PR to cross-compile the compiler with Fable. That would make sense to me. Alternatively we could #if it but I could see the code being removed long term

I don't expect any need for the ilread.fs binary reader in any other situation (well, there's a variant of it living on in ProvidedTypes.fs, but that's a separate thing, see fsprojects/FSharp.TypeProviders.SDK#298

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.