Questions and thoughts on Compilation.EmitDifference #8962

Closed
JoshVarty opened this Issue Feb 20, 2016 · 22 comments

@JoshVarty
Contributor

JoshVarty commented Feb 20, 2016

So I'm working on a project that would greatly benefit from a generalized version of the EditAndContinue API.

Today I began investigating what it would take to use Compilation.EmitDifference(). I don't think the API is currently usable as-is, so maybe we could discuss what path forward might enable its use by public consumers. My use-case is probably not very common, so I understand that this probably won't be a high-priority issue. I'm willing to contribute however I can, but I'll definitely need some guidance.

My overall goal is to compile and emit a user's code. As they edit their code, I believe I can use Roslyn to emit the changes in IL and PDB and apply them to the running process via the CLR API ICorDebugModule2::ApplyChanges.

I started by looking at the EmitDifference signature:

public EmitDifferenceResult EmitDifference(
    EmitBaseline baseline,
    IEnumerable<SemanticEdit> edits,
    Stream metadataStream,
    Stream ilStream,
    Stream pdbStream,
    ICollection<MethodDefinitionHandle> updatedMethods,
    CancellationToken cancellationToken = default(CancellationToken))

EmitBaseline

To build EmitBaseline I can use EmitBaseline.CreateInitialBaseline(), a method with the following signature:

public static EmitBaseline CreateInitialBaseline(
    ModuleMetadata module,
    Func<MethodDefinitionHandle, EditAndContinueMethodDebugInformation> debugInformationProvider)

ModuleMetadata

ModuleMetadata can be created easily from CreateFromStream().

Func<MethodDefinitionHandle, EditAndContinueMethodDebugInformation>

I'm not quite sure how to create this properly. I see in various tests that you're creating an ISymUnmanagedReader and then using the GetEncMethodDebugInfo method as your Func. Is this only suitable for tests, or would this work as a general approach?
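A minimal sketch of the baseline step, assuming the simplest possible provider: one that returns default(EditAndContinueMethodDebugInformation) for every method. That throws away whatever local-slot and lambda information a PDB would supply, so it is only plausible for the restricted case described in this post, not as a general replacement for the ISymUnmanagedReader approach. BaselineSketch and CreateBaseline are hypothetical names, not Roslyn APIs.

```csharp
using System;
using System.IO;
using System.Reflection.Metadata;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Emit;

// Hypothetical helper, for illustration only.
public static class BaselineSketch
{
    // peStream must contain the full PE image produced by the initial Emit call.
    // The metadata is read lazily, so the caller has to keep the stream alive
    // for as long as the returned baseline is in use.
    public static EmitBaseline CreateBaseline(Stream peStream)
    {
        peStream.Position = 0;
        ModuleMetadata module = ModuleMetadata.CreateFromStream(peStream, leaveOpen: true);

        // Assumption: no per-method EnC debug information is needed.
        Func<MethodDefinitionHandle, EditAndContinueMethodDebugInformation> noDebugInfo =
            handle => default(EditAndContinueMethodDebugInformation);

        return EmitBaseline.CreateInitialBaseline(module, noDebugInfo);
    }
}
```

Swapping noDebugInfo for a PDB-backed function would leave the rest of the helper unchanged.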

IEnumerable<SemanticEdit>

This is where things get really fun. 😄 SemanticEdit is public, but it looks like a bit of a headache to create and manage these things. My use-case is simpler than Visual Studio's as I can assume that no active statements (or active methods) will ever be modified.

There are a few different options I can think of to make managing SemanticEdits easier:

  1. Make AbstractEditAndContinueAnalyzer public. Then I could probably implement my own EditAndContinueAnalyzer tailored to my use case. We'd also have to make a few other types such as DocumentAnalysisResults public so consumers could use AbstractEditAndContinueAnalyzer.AnalyzeDocumentAsync().
  2. Make CSharpEditAndContinueAnalyzer and IEditAndContinueAnalyzer public. Consumers would be able to use EnC with identical semantics to C#.
  3. Make Match<T> and associated APIs easier to work with. Match<T> is public, but has no public constructors and is instead exposed via the abstract TreeComparer class. TreeComparer has two concrete but internal implementations (StatementSyntaxComparer and TopSyntaxComparer) that would make this easier.

I'm personally leaning towards the first option, but I'm probably not fully aware of all the considerations we'd have to make when thinking about implementing this.

Stream metadataStream, Stream ilStream and Stream pdbStream

I believe (correct me if I'm wrong) these are just empty readable streams that get fed to the CLR.

ICollection<MethodDefinitionHandle>

I believe (correct me if I'm wrong) this is just an empty collection that I feed to the CLR. (Although I'm not sure how or where, yet).
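As far as I can tell from the signature, the three streams and the collection are outputs: the caller passes in empty writable streams and an empty collection, Roslyn fills them, and the resulting bytes are what would later be handed to the CLR. A rough sketch of that call, using the overload quoted above (other Roslyn versions may expose a different signature). EmittedDelta and EmitDifferenceSketch are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection.Metadata;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Emit;

// Hypothetical container for everything one EmitDifference call produces.
public sealed class EmittedDelta
{
    public byte[] Metadata { get; set; }
    public byte[] IL { get; set; }
    public byte[] Pdb { get; set; }
    public List<MethodDefinitionHandle> UpdatedMethods { get; set; }
    public EmitBaseline NextBaseline { get; set; }
}

public static class EmitDifferenceSketch
{
    public static EmittedDelta Emit(
        Compilation newCompilation,
        EmitBaseline baseline,
        IEnumerable<SemanticEdit> edits)
    {
        using (var metadataStream = new MemoryStream())
        using (var ilStream = new MemoryStream())
        using (var pdbStream = new MemoryStream())
        {
            // Starts empty; EmitDifference adds the handles of the methods it re-emitted.
            var updatedMethods = new List<MethodDefinitionHandle>();

            EmitDifferenceResult result = newCompilation.EmitDifference(
                baseline, edits, metadataStream, ilStream, pdbStream, updatedMethods);

            if (!result.Success)
            {
                throw new InvalidOperationException(
                    string.Join(Environment.NewLine, result.Diagnostics));
            }

            return new EmittedDelta
            {
                Metadata = metadataStream.ToArray(),
                IL = ilStream.ToArray(),
                Pdb = pdbStream.ToArray(),
                UpdatedMethods = updatedMethods,
                // The next round of edits should be emitted against this baseline.
                NextBaseline = result.Baseline
            };
        }
    }
}
```

Keeping NextBaseline around matters because each subsequent delta is emitted against the baseline returned by the previous one, not against the initial baseline.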

Hopefully this gives you guys a good sense of where I'm coming from with this issue. I realize these might not be simple changes to make, so I'd love to get your feedback and hear whether or not you think this might be feasible.

Member

tmat commented Feb 20, 2016

Before we even get to EmitDifferences, have you actually tried to use ICorDebugModule2::ApplyChanges to apply changes to the process?

Contributor

JoshVarty commented Feb 20, 2016

No I haven't, I figured I'd start at the end I was more familiar with (Roslyn) and work towards it. The way you posed the question makes me wonder: Am I trying to do something impossible here?

Member

tmat commented Feb 20, 2016

Almost.

Member

tmat commented Feb 20, 2016

I'd actually be interested in making it possible to apply changes to a process that doesn't have a debugger attached. If you want to explore that, start by figuring out how to do so with ApplyChanges:
https://github.com/dotnet/coreclr/blob/43b39a73cbf832ec13ec29ed356cb75834e7a8d7/src/debug/di/module.cpp#L2156

Contributor

JoshVarty commented Feb 20, 2016

I could probably start trying to figure that out. I suppose I could use one of the Roslyn test cases as examples for the metadata, IL and PDB changes and see what happens. (I've never worked with the CLR APIs before, but I'll give it a shot)

Member

tmat commented Feb 20, 2016

That's exactly how I'd go about it.

Contributor

JoshVarty commented Feb 26, 2016

I should preface this by saying I've never used the CLR APIs before, so I'm probably doing a lot of things wrong, but I've cobbled together an example based on the mdbg.exe sample debugger code.

You can see my work so far here.

I am trying to attach my debugger to a process that just writes out a string to the console:

using System;

class Program
{
    static System.Timers.Timer timer = new System.Timers.Timer();
    static int count = 0;
    static void Main(string[] args)
    {
        timer.Elapsed += Timer_Elapsed;
        timer.Interval = 2000;
        timer.Start();
        Console.ReadLine();
    }

    private static void Timer_Elapsed(object sender, System.Timers.ElapsedEventArgs e)
    {
        Console.WriteLine("ORIGINAL " + count++);
    }
}

I'm trying to change the text ORIGINAL to 12345678.

I tried to use the Roslyn EnC tests to generate the IL/Metadata delta byte[] that I need, but I'm not sure if I did it correctly.

Current issues I still have to figure out:

  1. When I attach my debugger, all the threads seem to immediately stop. I'm not sure why this is the case, but mdbg.exe seems to behave the same.
  2. When I actually apply my changes, I get an AccessViolationException. I'm guessing this means I'm not using the correct deltas, so I'll have to look into this. Maybe I should start with an even simpler example.

Contributor

JoshVarty commented Mar 2, 2016

I've made more progress and I can start/stop program execution. I've just been trying to figure out this issue with the AccessViolationException and I had a question I hoped you could help me with @tmat.

I'm using Roslyn's tests to generate metadata and IL diffs for:

public class C
{
    public static void Main() { F(); }
    public static void F() { System.Console.WriteLine(1); System.Console.ReadLine(); }
}

and

public class C
{
    public static void Main() { F(); }
    public static void F() { System.Console.WriteLine(2); System.Console.ReadLine(); }
}

(Changing the 1 to 2 in Console.WriteLine())

I notice that I get different metadata deltas on each run. Is this expected? What are the differences? Would I need to be using the correct deltas when I try using the CLR's ApplyChanges?
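For a body-only change like the 1 to 2 edit above, a single SemanticEdit of kind Update that pairs the old and new symbols for F is probably sufficient. The sketch below builds the two compilations and that edit by hand, assuming a three-argument SemanticEdit constructor is available in the Roslyn version in use. SemanticEditSketch, BuildVersions and UpdateOf are hypothetical names, and the single mscorlib/core-library reference is an assumption that may need extending on runtimes where Console lives in a separate assembly.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Emit;

// Hypothetical helper, for illustration only.
public static class SemanticEditSketch
{
    // Builds the "before" and "after" compilations from two versions of one file.
    public static Tuple<CSharpCompilation, CSharpCompilation> BuildVersions(
        string source0, string source1)
    {
        SyntaxTree tree0 = CSharpSyntaxTree.ParseText(source0);
        SyntaxTree tree1 = CSharpSyntaxTree.ParseText(source1);

        CSharpCompilation compilation0 = CSharpCompilation.Create(
            "Sample",
            new[] { tree0 },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(
                OutputKind.DynamicallyLinkedLibrary,
                optimizationLevel: OptimizationLevel.Debug));

        // Same options and references, only the syntax tree differs.
        CSharpCompilation compilation1 = compilation0.ReplaceSyntaxTree(tree0, tree1);
        return Tuple.Create(compilation0, compilation1);
    }

    // Pairs the old and new symbol of one method whose body changed.
    public static SemanticEdit UpdateOf(
        Compilation oldCompilation, Compilation newCompilation,
        string typeName, string methodName)
    {
        ISymbol oldMethod = oldCompilation.GetTypeByMetadataName(typeName)
            .GetMembers(methodName).Single();
        ISymbol newMethod = newCompilation.GetTypeByMetadataName(typeName)
            .GetMembers(methodName).Single();
        return new SemanticEdit(SemanticEditKind.Update, oldMethod, newMethod);
    }
}
```

For the classes above this would be called as UpdateOf(versions.Item1, versions.Item2, "C", "F"). Anything beyond a simple body change (added members, changed lambdas, active statements) needs the kind of analysis the EditAndContinueAnalyzer performs.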

Contributor

JoshVarty commented Mar 7, 2016

Alright I've finally got it.

Noah Falk from Microsoft helped me navigate the debugger landscape and I followed this tutorial to generate the deltas.

My implementation is available here on GitHub.

@tmat Now that I've got a proof of concept up, do you want to discuss how best to move forward on the Roslyn side of things?

hmemcpy commented Mar 7, 2016

@JoshVarty this isn't directly related to your suggestion, but there are a couple more code bases you could look at to maybe get some more ideas about EnC and/or the debugger API:

  1. mdbg - the Microsoft Debugging Sample: a reference implementation of a managed debugger, wrapping the Cor* APIs nicely.
  2. Wicca - an old debugger-based AOP framework, written by Marc Eaddy (together with Dr. Alfred Aho, of the Dragon compiler book fame), who later joined Microsoft Research to work on the (now dead, sadly) Phoenix Compiler project... anyway, this is an Edit-and-Continue based framework, which emits diffs at runtime and applies them. I think you'll have a hard time getting it to run (it requires the Phoenix SDK, which is no longer available to download), and the code base is quite old. But anyway, it serves as a great example of how to use the managed debugging API and EnC API to emit and apply diffs at runtime.

Hope that helps :)

Contributor

JoshVarty commented Mar 7, 2016

👍 mdbg is largely what I based my implementation on. I'll take a peek at Wicca as well. Thanks!

Member

tmat commented Mar 7, 2016

@JoshVarty So you're ok with having to attach a debugger? Note that only one debugger can be attached to a process.

Contributor

JoshVarty commented Mar 7, 2016

Unfortunately it looks like the API requires a debugger to be attached. I have to set some JIT compiler flags when the module is first loaded and I'm not sure how I could do so without attaching the debugger. (It might be possible, but I'm not nearly familiar enough with the CLR to figure out how).

Anyways, it will still work for my use-case. We have full control over the process and we'll be the only ones debugging it.

@ManishJayaswal ManishJayaswal added this to the 2.0 (RC) milestone Mar 7, 2016

Member

tmat commented Mar 8, 2016

If you have full control over the process then you're good. I assume though that you won't be placing breakpoints, right?

How are you building the base binary and pdb for the project? Invoking regular build action on the solution?

First you'll need the initial baseline. That requires access to the .dll and the .pdb. You can use helpers from http://source.roslyn.io/#Roslyn.Test.PdbUtilities/Pdb/SymReaderFactory.cs to create an ISymUnmanagedReader. You might need to copy some code for now, since these helpers are not public API (yet).

Contributor

JoshVarty commented Mar 8, 2016

Nope, no breakpoints.

Right now we're not using the regular build action. We're calling Compilation.Emit directly. (This means any special post-build tasks like IL weaving/rewriting are ignored). Currently we emit into memory and load that directly into an AppDomain to avoid hitting disk.
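A minimal sketch of what "emit into memory and load directly" can look like, assuming a single source string and only the core library as a reference. InMemoryEmitSketch and CompileAndLoad are hypothetical names, and this is not the project's actual code.

```csharp
using System;
using System.IO;
using System.Reflection;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper, for illustration only.
public static class InMemoryEmitSketch
{
    public static Assembly CompileAndLoad(string source)
    {
        CSharpCompilation compilation = CSharpCompilation.Create(
            "InMemorySample",
            new[] { CSharpSyntaxTree.ParseText(source) },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        using (var peStream = new MemoryStream())
        {
            // Compilation.Emit bypasses MSBuild entirely, so post-build steps never run.
            var result = compilation.Emit(peStream);
            if (!result.Success)
            {
                throw new InvalidOperationException(
                    string.Join(Environment.NewLine, result.Diagnostics));
            }

            // Load from the raw bytes; nothing is written to disk.
            return Assembly.Load(peStream.ToArray());
        }
    }
}
```

The same peStream bytes are what an initial EmitBaseline would be created from, so it may be worth holding on to them rather than discarding them after the load.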

If you'd like I can work on this and submit a PR where we can discuss what needs to be made public. Once I've got a good sense for what needs to be made public I'll probably cobble together a version using Reflection that we'll use while we wait for 2.0.

Member

tmat commented Mar 8, 2016

I don't think anything needs to get public atm. You should copy the PDB helpers for now.

Not sure I understand. Are you building the project directly in the process that's running it?

Contributor

JoshVarty commented Mar 8, 2016

> I don't think anything needs to get public atm. You should copy the PDB helpers for now.

Do you think producing the SemanticEdit is feasible without opening up AbstractEditAndContinueAnalyzer or the TreeComparer classes? That's the part I'm most worried about.

Another question: Can we just ignore PDBs? Currently we're not emitting them as we're not setting breakpoints. The CLR's EnC API doesn't seem to require PDBs either.

> Not sure I understand. Are you building the project directly in the process that's running it?

Today we are, but we'll have to change our implementation in order to attach the debugger to it, so we're currently moving the code execution out-of-process. This is an ongoing effort so I'm not yet sure how we'll do it. It's possible we'll emit to disk, but we're also investigating whether or not sending the byte[] between processes is feasible.
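One plausible way to move the emitted byte[] blobs between processes is to length-prefix each one and write it to whatever stream connects the two sides (a pipe, a socket, a file). The sketch below shows only that framing, using nothing beyond the base class library; the transport itself is left open, and DeltaFraming, WriteBlob and ReadBlob are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, for illustration only.
public static class DeltaFraming
{
    // Writes a 4-byte length followed by the payload.
    public static void WriteBlob(Stream output, byte[] blob)
    {
        using (var writer = new BinaryWriter(output, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(blob.Length);
            writer.Write(blob);
            writer.Flush();
        }
    }

    // Reads one blob written by WriteBlob.
    public static byte[] ReadBlob(Stream input)
    {
        using (var reader = new BinaryReader(input, Encoding.UTF8, leaveOpen: true))
        {
            int length = reader.ReadInt32();
            byte[] blob = reader.ReadBytes(length);
            if (blob.Length != length)
            {
                throw new EndOfStreamException("Stream ended before the whole blob arrived.");
            }
            return blob;
        }
    }
}
```

The compiling side would call WriteBlob once each for the metadata and IL deltas, and the debugger side would call ReadBlob in the same order before passing the results on.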

Contributor

JoshVarty commented Mar 15, 2016

Alright, I've got it working. I can compile code, run it, generate deltas and apply them to the running process. The current implementation is pretty sloppy, but it works.

Couple of notes:

  • Importing the SymReaderFactory.cs code was a bit of a hassle. Each class depended on more and more internal parts until I was pulling in ObjectPool, PooledDictionary etc.
  • I used the CSharpEditAndContinueAnalyzer and base method AnalyzeDocumentAsync directly (via Reflection). These have worked very well for me so far.

Member

tmat commented Oct 5, 2016

@JoshVarty Re SymReaderFactory - it should now be simpler to instantiate the native SymReader with the latest https://dotnet.myget.org/gallery/symreader package: http://source.roslyn.io/#Roslyn.Test.PdbUtilities/Pdb/SymReaderFactory.cs,26

Contributor

JoshVarty commented Oct 6, 2016

I think the biggest issue for me is that I've currently copied over PdbTestUtilities.GetEncMethodDebugInfo() for use in EmitBaseline.CreateInitialBaseline().

PdbTestUtilities.GetEncMethodDebugInfo() ends up bringing stuff from CustomDebugInfoReader which forces me to bring in PooledStringBuilder, ObjectPool etc.

Member

tmat commented Oct 6, 2016

Oh, yes... decoding CDIs is currently not well encapsulated :(

Contributor

JoshVarty commented May 8, 2017

Closing this because we no longer need the changes discussed.

@JoshVarty JoshVarty closed this May 8, 2017
