Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Visual Basic Yield Return state machine decompilation #2874

Merged
merged 5 commits into from
Apr 2, 2023

Conversation

ElektroKill
Copy link
Contributor

@ElektroKill ElektroKill commented Dec 23, 2022

Link to issue(s) this covers
N/A

Problem

ILSpy did not support yield return state machines generated by both the legacy Visual Basic compiler as well as the Roslyn Visual Basic compiler.

The following changes were made:

  • Adjust detection of methods such as MoveNext or Dispose to support the interface implementations generated by the VB.NET compiler.
  • Detect VB.NET doFinallyBodies variable used by the Legacy Visual Basic compiler.
  • Add necessary code to cleanup finally blocks by removing dead conditions.
  • Extend state machine detection logic for VB.NET generated code. I'm still not entirely happy with my way of detecting whether a state machine was compiled by Legacy or Roslyn Visual Basic compilers, but it works.
  • Execute additional cleanup steps before analyzing MoveNext to accommodate for dead code generated by the VB compiler.
  • Use -1 as the initial state when decompiling code generated by the Legacy VB.NET compiler.
  • Add additional logic to disregard conditions testing on disposeField when converting the body.
  • Extract the real body out of the try-catch handler covering all the code emitted by the Legacy Visual Basic compiler.

Solution

  • Extend the yield return decompilation logic to handle the different code generation properly.
  • Add a VBPretty test for the newly added behavior.

Feedback is greatly appreciated, this is probably my most complicated PR to the decompiler yet and I'm not sure whether I implemented everything up to standard :P

EDIT: Would using an enum for the compiler used for state machines be beneficial for code quality? It would help clean up all the boolean fields and we could even use bitwise operations and masks to implement the two visual basic fields in a prettier way.

/// <summary>
/// Gets whether the YieldReturnDecompiler determined that the Legacy VB compiler was used to compile this function.
/// </summary>
public bool StateMachineCompiledWithLegacyVisualBasic;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you suggested in your PR intro, maybe switching this to a flags enum is a good idea?

Copy link
Contributor Author

@ElektroKill ElektroKill Apr 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I came up with the following enum and additional extension methods. What do you think of this?

[Flags]
public enum CompilerFlags : ushort
{
	// Lower 8 bits are used for compiler flags, allowing for 256 possible languages
	Language_None = 0,
	Language_CSharp = 0x1,
	Language_VisualBasic = 0x2,
	LanguageMask = 0xFF,

	// Upper 8 bits are used for compiler kind, allowing for 256 possible compiler kinds
	CompilerKind_None = 0,
	CompilerKind_Mono = 0x100,
	CompilerKind_Legacy = 0x200,
	CompilerKindMask = 0xFF00,

	CSharp = Language_CSharp | CompilerKind_None,
	CSharpLegacy = Language_CSharp | CompilerKind_Legacy,
	CSharpMono = Language_CSharp | CompilerKind_Mono,

	VisualBasic = Language_VisualBasic | CompilerKind_None,
	VisualBasicLegacy = Language_VisualBasic | CompilerKind_Legacy,
	VisualBasicMono = Language_VisualBasic | CompilerKind_Mono,
}

public static class CompilerFlagsExtensions
{
	public static bool IsCSharp(this CompilerFlags flags) => (flags & CompilerFlags.LanguageMask) == CompilerFlags.Language_CSharp;

	public static bool IsVisualBasic(this CompilerFlags flags) => (flags & CompilerFlags.LanguageMask) == CompilerFlags.Language_VisualBasic;

	public static bool IsLegacy(this CompilerFlags flags) => (flags & CompilerFlags.CompilerKindMask) == CompilerFlags.CompilerKind_Legacy;

	public static bool IsMono(this CompilerFlags flags) => (flags & CompilerFlags.CompilerKindMask) == CompilerFlags.CompilerKind_Mono;
}

The rationale behind splitting up compiler kind and language is to allow easy future expansion in the case of entries for other languages and compilers being necessary in the future.

Although I am wondering whether the default compiler flag should be an unspecified language and compiler kind or should we default to C#.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do the refactoring in a later PR... the above proposal looks fine... maybe a little bit overengineered... I doubt there will be need for up to 256 languages or compiler kinds :)

I think that C# and None should share the same enum value... but the details are still up for discussion!

@siegfriedpammer
Copy link
Member

Otherwise LGTM. Thanks a lot for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants