Bad IL format from Memory Stream #39470

Open
lpeixotoo opened this issue Oct 18, 2019 · 14 comments
Labels
Area-Compilers, Concept-API, Investigation Required

Comments

@lpeixotoo

Bad IL format from Memory Stream

Hello guys,

I'm trying to run a basic compilation sample by

  • Compiling source code;
  • Emitting it into a memory stream;
  • Loading and invoking it;

using a docker environment with:

  • .NET Core SDK (reflecting any global.json):
    Version: 3.0.100
    Commit: 04339c3a26
  • Runtime Environment:
    OS Name: ubuntu
    OS Version: 18.04
    OS Platform: Linux
    RID: ubuntu.18.04-x64
    Base Path: /usr/share/dotnet/sdk/3.0.100/

General

I based my code on this article.

A reproducible repository can be found here.

Sample Code

For quick reference, here's my code:

using System;
using System.IO;                           
using System.Reflection;                   
using Microsoft.CodeAnalysis;              
using Microsoft.CodeAnalysis.CSharp;       
using Xunit;

namespace DotNetLib                        
{                                          
    public  class Lib                
    {                                      
                                           
        [Fact]                                   
        public int Compile()
        {                                  

           var tree = CSharpSyntaxTree.ParseText(@"
           using System;
           public class MyClass
           {
               public static void Main()
               {
                   Console.WriteLine(""Hello World!"");
               }
           }");
           
           var mscorlib = MetadataReference.CreateFromFile(typeof(object).Assembly.Location);
           var compilation = CSharpCompilation.Create("MyCompilation",
               syntaxTrees: new[] { tree }, references: new[] { mscorlib });
           
           //Emit to stream
           var ms = new MemoryStream();
           var emitResult = compilation.Emit(ms);

           //Load into currently running assembly. Normally we'd probably
           //want to do this in an AppDomain
           var ourAssembly = Assembly.Load(ms.GetBuffer());
           var type = ourAssembly.GetType("MyClass");
           
           //Invokes our main method and writes "Hello World" :)
           type.InvokeMember("Main", BindingFlags.Default | BindingFlags.InvokeMethod, null, null, null);
           return 0;
        }
    }                                      
}

Am I doing something wrong?

@vcsjones
Member

I don't know much about Roslyn, but the use of GetBuffer on MemoryStream is usually suspicious to me. It returns the internal buffer of the memory stream, which may (and probably does) contain unwritten garbage data at the end.

Try using ToArray() instead of GetBuffer() on your memory stream.
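
To illustrate the difference, here is a minimal sketch that needs only the base class library. `BufferDemo` and `Measure` are hypothetical names made up for this example. It writes a known number of bytes and compares the lengths of the two arrays; the exact capacity of the internal buffer is an implementation detail, so the only assumption is that it is at least as large as the data written.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class BufferDemo
{
    // Writes `bytesWritten` bytes to a fresh MemoryStream and reports
    // the length of the internal buffer versus the length of the copy.
    public static (int BufferLength, int ArrayLength) Measure(int bytesWritten)
    {
        using var ms = new MemoryStream();
        ms.Write(new byte[bytesWritten], 0, bytesWritten);

        // GetBuffer() exposes the whole backing array, including unused capacity.
        int bufferLength = ms.GetBuffer().Length;

        // ToArray() copies only the bytes that were actually written.
        int arrayLength = ms.ToArray().Length;

        return (bufferLength, arrayLength);
    }
}
```

Calling `BufferDemo.Measure(100)` returns an `ArrayLength` of exactly 100, while `BufferLength` can be larger because the stream grows its buffer in steps.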

@lpeixotoo
Author

lpeixotoo commented Oct 18, 2019

@vcsjones

I've used ToArray() for the assembly loading, but the error still persists. I've also tried resetting the memory stream to the beginning. Same error.

@lpeixotoo
Author

It happens specifically when Console.WriteLine is in the "to-be-compiled" code.

@stephentoub stephentoub transferred this issue from dotnet/core Oct 23, 2019
@sharwell sharwell reopened this Oct 23, 2019
@jinujoseph jinujoseph added Area-IDE Bug Area-Compilers Concept-API This issue involves adding, removing, clarification, or modification of an API. and removed Area-IDE Bug labels Oct 23, 2019
@TylerBurnett

TylerBurnett commented Nov 17, 2019

I can confirm the issue is reproducible. My code is quite similar:

    public static void GenerateAssembly(string Code, string ClassName, string MethodName)
    {
        var tree = SyntaxFactory.ParseSyntaxTree(Code);
        string fileName = "payload.dll";

        List<MetadataReference> References = new List<MetadataReference>();

        foreach(string location in Refs)
        {
            References.Add(MetadataReference.CreateFromFile(location));
        }

        // A single, immutable invocation to the compiler
        // to produce a library
        var compilation = CSharpCompilation.Create(fileName)
            .WithOptions(new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary))
            .AddReferences(References.ToArray())
            .AddSyntaxTrees(tree);

        var stream = new MemoryStream();
        EmitResult compilationResult = compilation.Emit(stream);

        if (compilationResult.Success)
        {
            // Load the assembly
            Assembly asm = AssemblyLoadContext.Default.LoadFromStream(stream);
            // Invoke the RoslynCore.Helper.CalculateCircleArea method passing an argument
            asm.GetType(ClassName).GetMethod(MethodName).Invoke(null, null);
        }
        else
        {
            foreach (Diagnostic codeIssue in compilationResult.Diagnostics)
            {
                string issue = $"ID: {codeIssue.Id}, Message: {codeIssue.GetMessage()},Location: { codeIssue.Location.GetLineSpan()},Severity: { codeIssue.Severity} ";
                Console.WriteLine(issue);
            }
        }
    }

Originally, my code wrote the compiled contents to an assembly file, which was then read back. That approach worked perfectly fine with no issues. However, switching from it to a memory stream produces this exact error.

Here's my error information:
System.BadImageFormatException
HResult=0x8007000B
Message=Bad IL format.
Source=
StackTrace:

@danielwcarey

Maybe this working example will help. I took the original source above and turned it into a LINQPad snippet (https://www.linqpad.net - v6.4.4).

// Sample Memory Stream Execution
//
// Packages
//   Microsoft.CodeAnalysis.CSharp
//
// using Microsoft.CodeAnalysis;
// using Microsoft.CodeAnalysis.CSharp;
// using System.Threading.Tasks;
//
void Main() {
  var tree = CSharpSyntaxTree.ParseText(@"
           using System;
           public class MyClass
           {
               public static void Main()
               {
                   Console.WriteLine(""Hello World!"");
               }
           }");

  var baseDotNetPath = @"C:\Program Files\dotnet\shared\Microsoft.NETCore.App\3.0.0\";  
  var references = new List<MetadataReference>() {
    MetadataReference.CreateFromFile($@"{baseDotNetPath}System.Private.CoreLib.dll"),
    MetadataReference.CreateFromFile($@"{baseDotNetPath}System.dll"),
    MetadataReference.CreateFromFile($@"{baseDotNetPath}System.Console.dll"),
    MetadataReference.CreateFromFile($@"{baseDotNetPath}System.Runtime.dll"),
  };

  var compilation = CSharpCompilation.Create("MyCompilation",
      syntaxTrees: new[] { tree }, references: references);

  //Emit to stream
  var ms = new MemoryStream();
  var emitResult = compilation.Emit(ms);

  //Load into currently running assembly. Normally we'd probably
  //want to do this in an AppDomain
  var ourAssembly = Assembly.Load(ms.GetBuffer());
  var type = ourAssembly.GetType("MyClass");

  //Invokes our main method and writes "Hello World" :)
  type.InvokeMember("Main", BindingFlags.Default | BindingFlags.InvokeMethod, null, null, null);
}

@peteraritchie
Contributor

I've encountered this. A little extra info: in my case, compilation.Emit fails, so emitResult.Success == false. I don't see that being checked before the Assembly.Load above. Again, in my case, when I look at emitResult.Diagnostics there are some errors.

@lpeixotoo
Author

@danielwcarey
I have a working example similar to yours, but that kind of solution seems to be a workaround for the problem rather than a fix. I've used something like this:

using System;   
using System.Collections.Generic;          
using System.Globalization;                
using System.IO;                           
using System.Linq;                         
using System.Reflection;                   
using System.Runtime.InteropServices;      
using Microsoft.CodeAnalysis;              
using Microsoft.CodeAnalysis.CSharp;       
using Microsoft.CodeAnalysis.Text;         
                                           
namespace DotNetLib                        
{                                          
    public static class Lib                
    {                                      
        [StructLayout(LayoutKind.Sequential)]
        public struct LibArgs              
        {                                  
            public IntPtr SourceCode;      
            public int Number;             
            public int FuncOid;
        }                                  
                                           
        static MemoryStream memStream;
        static IDictionary<int, (string, MemoryStream)> funcBuiltCodeDict;

        public static int Compile(IntPtr arg, int argLength)
        {                                  
            LibArgs libArgs = Marshal.PtrToStructure<LibArgs>(arg);
            string sourceCode = Marshal.PtrToStringAuto(libArgs.SourceCode);

            if (Lib.funcBuiltCodeDict == null)
                Lib.funcBuiltCodeDict = new Dictionary<int, (string, MemoryStream)>();
            else {
                // Code has not changed then it is not needed to build it
                try {
                    Lib.funcBuiltCodeDict.TryGetValue(libArgs.FuncOid,
                    out (string src, MemoryStream builtCode) pair);
                    if  (pair.src == sourceCode) {
                        Lib.memStream = pair.builtCode;
                        return 0;
                    }
                }catch{}
            }

            SyntaxTree tree = SyntaxFactory.ParseSyntaxTree(sourceCode);

            var trustedAssembliesPaths = ((string)AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES")).Split(Path.PathSeparator);

            var neededAssemblies = new[]
            {
                "System.Runtime",
                "System.Private.CoreLib",
                "System.Console",
            };

            List<PortableExecutableReference> references = trustedAssembliesPaths
                .Where(p => neededAssemblies.Contains(Path.GetFileNameWithoutExtension(p)))
                .Select(p => MetadataReference.CreateFromFile(p))
            .ToList();

            CSharpCompilation compilation = CSharpCompilation.Create(
                "plnetproc.dll",
                options: new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary),
                syntaxTrees: new[] { tree },
                references: references);
                                           
            Lib.memStream = new MemoryStream();
            Microsoft.CodeAnalysis.Emit.EmitResult compileResult = compilation.Emit(Lib.memStream);

            if(!compileResult.Success)
            {
                Console.WriteLine("\n********ERROR************\n");
                foreach(var diagnostic in compileResult.Diagnostics)
                {
                    Console.WriteLine(diagnostic.ToString());
                }
                Console.WriteLine("\n********ERROR************\n");
                return 0;
            }

            funcBuiltCodeDict[libArgs.FuncOid] = (sourceCode, Lib.memStream);

            return 0;
        }

        public static int Run(IntPtr arg, int argLength)
        {
            Assembly compiledAssembly;     
            compiledAssembly = Assembly.Load(Lib.memStream.GetBuffer());

            Type procClassType = compiledAssembly.GetType("DotNetLib.ProcedureClass");
            MethodInfo procMethod = procClassType.GetMethod("ProcedureMethod");
            procMethod.Invoke(null, new object[] {arg, argLength});

            return 0;
        }                                  
    }                                      
}

@lpeixotoo
Author

@peteraritchie What kind of errors?

@peteraritchie
Contributor

@peteraritchie What kind of errors?

They could be compile errors: source code that is still being written is invalid for a while, and may be invalid when it gets compiled. Typically, though, I see reference-related errors.

@BobSilent

My code is similar to @TylerBurnett's above:

using (var archive = System.IO.Compression.ZipFile.OpenRead(@"Path to ZIP File"))
{
    var entry = archive.Entries.Single(e => string.Equals("xxx.dll", e.Name));

    var context = new AssemblyLoadContext("Test", true);
    using (Stream assemblyStream = entry.Open())
    using (MemoryStream ms = new MemoryStream())
    {
        assemblyStream.CopyTo(ms);
        var assembly = context.LoadFromStream(ms);
        ....
    }
}

First I tried loading directly from the stream of the ZipEntry (which is a DeflateStream in my case).
This throws System.NotSupportedException: 'This operation is not supported.'

Then I changed the code to copy to a MemoryStream first, and this now throws System.BadImageFormatException: 'Bad IL format.'.

Using a FileStream (first writing the file to disk, then opening it as a FileStream) works.

Then I played around a little and finally changed the code to set MemoryStream.Position to 0 first, which did the trick:

assemblyStream.CopyTo(ms);
ms.Position = 0;
var assembly = context.LoadFromStream(ms);

No exception is thrown.
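
The effect of the stream position can be seen in a minimal sketch that uses only the base class library. `RewindDemo` and `ReadAfterCopy` are hypothetical names made up for this example. After `CopyTo`, the destination stream's position sits at the end of the data, so a reader that starts from the current position finds nothing left to read unless the stream is rewound.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class RewindDemo
{
    // Copies `payload` into a MemoryStream, optionally rewinds it,
    // and returns how many bytes a subsequent Read call delivers.
    public static int ReadAfterCopy(byte[] payload, bool rewind)
    {
        using var source = new MemoryStream(payload);
        using var ms = new MemoryStream();

        // CopyTo leaves ms.Position at the end of the copied data.
        source.CopyTo(ms);

        if (rewind)
        {
            ms.Position = 0;
        }

        var buffer = new byte[payload.Length];
        return ms.Read(buffer, 0, buffer.Length);
    }
}
```

With a four-byte payload, `ReadAfterCopy(payload, rewind: false)` returns 0 and `ReadAfterCopy(payload, rewind: true)` returns 4.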

@jebwatson

jebwatson commented May 20, 2020

Hi, I would recommend reviewing this article explaining runtime assembly creation, specifically the sentence in the Executing Code section that says:

It’s important to mention that the output assembly is a .NET Standard Library, so compiling source text will succeed only if the code parsed with Roslyn relies on APIs that are included in .NET Standard.

Console.WriteLine is not a part of .NET Standard as far as I'm aware, so Emit will not be able to compile any code containing such a reference. Hope this helps.

Edit: I have been corrected. Thanks @svick for the correction.

@svick
Contributor

svick commented May 21, 2020

@jebwatson

  1. I believe the article is wrong: to create a .NET Standard 2.x library, you need to reference netstandard.dll, not the runtime assembly that contains object (which is what you get with typeof(object).GetTypeInfo().Assembly.Location).
  2. Even if it were right, referencing something outside of the target framework should cause a compiler diagnostic, not a BadImageFormatException.
  3. Even if that weren't the case, Console.WriteLine is part of .NET Standard.

@svick
Contributor

svick commented May 21, 2020

As far as I can see, @lpeixotoo, @TylerBurnett and @BobSilent all experienced issues because, due to a bug in their code, they attempted to load an empty assembly. I think the exception message in that case should be improved, so I opened dotnet/runtime#36814 about that.

With that, I think this issue should be closed, since there doesn't seem to be any problem with Roslyn here.
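
For anyone landing here later, the following is a minimal sketch of the checks discussed in this thread, not a definitive implementation. `InMemoryCompile` and `CompileAndLoad` are hypothetical names. It assumes the Microsoft.CodeAnalysis.CSharp package is referenced and that the three reference assemblies from @lpeixotoo's later sample are enough for the source being compiled. It fails loudly when Emit does not succeed and rewinds the stream before loading.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Runtime.Loader;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper, for illustration only.
public static class InMemoryCompile
{
    public static Assembly CompileAndLoad(string source)
    {
        var tree = CSharpSyntaxTree.ParseText(source);

        // Reference assemblies taken from the running runtime; this short
        // list is an assumption and may need extending for other code.
        var needed = new[] { "System.Runtime", "System.Private.CoreLib", "System.Console" };
        var references = ((string)AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES"))
            .Split(Path.PathSeparator)
            .Where(p => needed.Contains(Path.GetFileNameWithoutExtension(p)))
            .Select(p => MetadataReference.CreateFromFile(p))
            .ToList();

        var compilation = CSharpCompilation.Create(
            "MyCompilation",
            syntaxTrees: new[] { tree },
            references: references,
            options: new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        using var ms = new MemoryStream();
        var result = compilation.Emit(ms);

        // If Emit failed, the stream holds no usable assembly, so surface
        // the diagnostics instead of trying to load it.
        if (!result.Success)
        {
            var messages = result.Diagnostics.Select(d => d.ToString());
            throw new InvalidOperationException(string.Join(Environment.NewLine, messages));
        }

        // Emit leaves the position at the end of the stream; rewind first.
        ms.Position = 0;
        return AssemblyLoadContext.Default.LoadFromStream(ms);
    }
}
```

With the source from the original report, `CompileAndLoad(source).GetType("MyClass").GetMethod("Main").Invoke(null, null)` would then invoke the compiled method. If a reference is missing, the exception carries the compiler diagnostics rather than a BadImageFormatException.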
