Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BAML→XAML decompile: fix nested Type names & MarkupExtension handling #188

Merged
merged 9 commits into from Apr 26, 2023

Conversation

glenn-slayden
Copy link

@glenn-slayden glenn-slayden commented Apr 25, 2023

1.) BAML examples containing nested type names in MarkupExtensions,now much improved:

  System.Activities.Presentation.View.TypeBrowser
  System.Activities.Core.Presentation.FlowchartDesigner
  System.Activities.Core.Presentation.FlowchartDesigner
  System.Activities.Presentation.View.VBIdentifierDesigner

Use + when decompiling from BAML to XAML when some MarkupExtensions contain nested Type names e.g.
"{x:Static sad:WorkflowDesignerIcons+DesignerItems.WarningValidation}". The plus symbol is what the XAML compiler will accept, so I matched the behavior somewhat, but unfortunately, 'plus' is not a legal character in an XML name, when bare (unquoted) as I have currently done...

BEFORE...
The use of several nested types causes rampant corruption in System.Activities.Presentation.View.TypeBrowser:

<HierarchicalDataTemplate
	x:Key="{DataTemplateKey {x:Type global:AssemblyNode}}"
	x:Uid="HierarchicalDataTemplate_1" xmlns:global="clr-namespace:"
	DataType="{x:Type global:AssemblyNode}"
	ItemsSource="{Binding Path=Namespaces}">
    <!-- ... -->
</HierarchicalDataTemplate>

AFTER...
There are still serious problems with nested types support in XAML, but I was able to fix the major problems:

<HierarchicalDataTemplate
	x:Key="{DataTemplateKey {x:Type swdv:TypeBrowser+AssemblyNode}}"
	x:Uid="HierarchicalDataTemplate_1"
	DataType="{x:Type swdv:TypeBrowser+AssemblyNode}"
	ItemsSource="{Binding Path=Namespaces}">
    <!-- ... -->
</HierarchicalDataTemplate>

2.) Fix BAML→XAML corruption due to incorrect MarkupExtension re-expansion

BEFORE... (ToolboxDefaultTemplate.baml in System.Activities.Presentation.g.resources...)

<ResourceDictionary x:Uid="ResourceDictionary_1"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
	xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
	xmlns:swdv="clr-namespace:System.Activities.Presentation.View"
	xmlns:swd="clr-namespace:System.Activities.Presentation"
	xmlns:swdt="clr-namespace:System.Activities.Presentation.Toolbox">

	<ControlTemplate
		x:Key="ToolboxDefaultTemplate"
		x:Uid="ControlTemplate_1"
		TargetType="{x:Type swdt:ToolboxControl}"
		Resources="{swd:CachedResourceDictionary Uid=swd:CachedResourceDictionaryExtension_1, Source=pack://application:,,,/System.Activities.Presentation;component/Themes/Generic.xaml}">

		<Grid> <!-- ... --> </Grid>
	</ControlTemplate>
</ResourceDictionary>

Notice above that the "Resources" property accepting a MarkupExtension was attempted to be inlined, but this decision was faulty and led to the total corruption as shown. There were about three bugs affecting the decision conditions of whether it can be done. Most seriously, you cannout promote a MarkupExtension to inline if there is a XAML directive (i.e. x:) attribute on the instance itself (and recursively inward), because directives are not actual attributes and would not properly understood by the MarkupExtension syntax processing. The exisiting code seemed to be aware of this idea, but it was implemented incorrectly, and also had a logic-polarity bug whereby it was ignoring most of its own efforts.

AFTER...

<ResourceDictionary x:Uid="ResourceDictionary_1"
	xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
	xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
	xmlns:swdv="clr-namespace:System.Activities.Presentation.View"
	xmlns:swd="clr-namespace:System.Activities.Presentation"
	xmlns:swdt="clr-namespace:System.Activities.Presentation.Toolbox">
	
	<ControlTemplate
		x:Key="ToolboxDefaultTemplate"
		x:Uid="ControlTemplate_1"
		TargetType="{x:Type swdt:ToolboxControl}">
		
		<FrameworkTemplate.Resources>
			<swd:CachedResourceDictionaryExtension x:Uid="swd:CachedResourceDictionaryExtension_1" Source="pack://application:,,,/System.Activities.Presentation;component/Themes/Generic.xaml" />
		</FrameworkTemplate.Resources>
		
		<Grid> <!-- ... --> </Grid>
	</ControlTemplate>
</ResourceDictionary>

Nested Type names were quite broken.  Of course they are doomed in XAML anyway, but this solves a fair about of the corruption and crashing-out-of-decompile problems.

Examples now much improved:
System.Activities.Presentation.View.TypeBrowser
System.Activities.Core.Presentation.FlowchartDesigner
fix nested Type names in BAML -> XAML decompile
* fix nested Type names in BAML -> XAML decompile

Nested Type names were quite broken.  Of course they are doomed in XAML anyway, but this solves a fair about of the corruption and crashing-out-of-decompile problems.

Examples now much improved:
System.Activities.Presentation.View.TypeBrowser
System.Activities.Core.Presentation.FlowchartDesigner
System.Activities.Presentation.View.VBIdentifierDesigner

* Update XamlUtils.cs

use '+' when decompiling from BAML to XAML when some MarkupExtensions contain nested Type names e.g.

--> "{x:Static sad:WorkflowDesignerIcons+DesignerItems.WarningValidation}"
BEFORE...   (ToolboxDefaultTemplate.baml  in   System.Activities.Presentation.g.resources...)
```
<ResourceDictionary x:Uid="ResourceDictionary_1"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
	xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
	xmlns:swdv="clr-namespace:System.Activities.Presentation.View"
	xmlns:swd="clr-namespace:System.Activities.Presentation"
	xmlns:swdt="clr-namespace:System.Activities.Presentation.Toolbox">

	<ControlTemplate
		x:Key="ToolboxDefaultTemplate"
		x:Uid="ControlTemplate_1"
		TargetType="{x:Type swdt:ToolboxControl}"
		Resources="{swd:CachedResourceDictionary Uid=swd:CachedResourceDictionaryExtension_1, Source=pack://application:,,,/System.Activities.Presentation;component/Themes/Generic.xaml}">

		<Grid> <!-- ... --> </Grid>
	</ControlTemplate>
</ResourceDictionary>
```

AFTER...
```
<ResourceDictionary x:Uid="ResourceDictionary_1"
	xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
	xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
	xmlns:swdv="clr-namespace:System.Activities.Presentation.View"
	xmlns:swd="clr-namespace:System.Activities.Presentation"
	xmlns:swdt="clr-namespace:System.Activities.Presentation.Toolbox">
	
	<ControlTemplate
		x:Key="ToolboxDefaultTemplate"
		x:Uid="ControlTemplate_1"
		TargetType="{x:Type swdt:ToolboxControl}">
		
		<FrameworkTemplate.Resources>
			<swd:CachedResourceDictionaryExtension x:Uid="swd:CachedResourceDictionaryExtension_1" Source="pack://application:,,,/System.Activities.Presentation;component/Themes/Generic.xaml" />
		</FrameworkTemplate.Resources>
		
		<Grid> <!-- ... --> </Grid>
	</ControlTemplate>
</ResourceDictionary>
```
@@ -53,6 +54,8 @@ sealed class MarkupExtensionRewritePass : IRewritePass {
return doWork;
}

static bool NonInlineAttibute(XAttribute attr) => attr.Name.Namespace == XamlContext.KnownNamespace_Xaml;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about renaming the function to IsXAMLAttribute to better describe its behavior?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Settling on IsXamlNsAttr, because names of reflection member entities that aren't classes derived from System.Attribute now appear to be confusing to some of the latest static analysis tools. VS intellisense seems over-eager to strip off -Attribute as an aliasing suffx.

Comment on lines 92 to 99
static string NestedReflectionName(ITypeDefOrRef type, out string clrNs) {
var name = type.ReflectionFullName;
while (type.DeclaringType is ITypeDefOrRef t2)
type = t2;
clrNs = type.ReflectionNamespace;
name = name.Substring(clrNs.Length + 1);
return name;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename t2 to declaringType to better reflect what it is :p

sb.Append(name.LocalName);
sb.Append(name.LocalName.Replace("_x002B_", "+"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that undoing the escaping of the + character is always the right thing to do? The function you modified is used for values of XML attributes (where the + character is allowed since the value is stored within double quotes) as well as in the values of XML elements where I am not sure if the + character is valid since the value is stored raw without being placed in double quotes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I stated that this was sketchy in my intro. I’ll remove this line from the PR.

Don't use unquoted '+' char within the XAML name for nested types--even though that's what the MS toolset seems to expect--because it's not a valid character to be exposed as an identifier in naked XML.

A plus sign currently is still deployed into the name, as fetched from reflection via `nSpy.BamlDecompiler.XamlContextNestedReflectionName(...)`. But as a result of multiple XML identifier sterilizations, such cases will now appear in the final XAML as, e.g.:

```
ns:WorkflowDesignerIcons_x002B_DesignerItems.WarningValidation
```
@glenn-slayden
Copy link
Author

glenn-slayden commented Apr 26, 2023

New behavior of nested type names in Xaml... Although the dnSpy XAML result will not be directly usable by the MS toolset, it has been deemed wise not to expose the unsanctioned XML-identifier character '+' in XAML names, since they theoretically end up being exposed as naked XML, in the form of XAML tag-syntax instantiation.

Untitled5

@glenn-slayden
Copy link
Author

glenn-slayden commented Apr 26, 2023

By the way, not sure what the "unorthodox markup extension" is that was causing a the problem in 4522338, but you might want to re-check if your fix from last week is still needed after my changes, since it seemed like some of the corruption I repair here was able to spread beyond the immediate BAML elements that were directly affected, in a few cases.

I wasn't able to hit a breakpoint on your new code at all while running through hundreds of BAML-to-XAML decompliations

@ElektroKill
Copy link
Member

ElektroKill commented Apr 26, 2023

By the way, not sure what the "unorthodox markup extension" is that was causing a the problem in 4522338, but you might want to re-check if your fix from last week is still needed after my changes, since it seemed like some of the corruption I repair here was able to spread beyond the immediate BAML elements that were directly affected, in a few cases.

I wasn't able to hit a breakpoint on your new code at all while running through hunders of BAML-to-XAML decompliations

My fix was for a rare case of a special markup extension. The code was {TypeName String:}. Looking back at it now, it was not the proper fix as it only allowed to skip over the error in the decompilation. It should be decompiled as {TypeName 'String:'} due to the rules set for XAML markup extensions. The decompiler is not really aware of this rule and does not introduce the quotes when necessary. The XAML parser used by the UI is also not aware of this syntax. I'm currently working on a proper fix for this problem which will result in proper usage of ' when necessary in the decompiled output.

Your changes look good to me now so I will go ahead and merge now!

@ElektroKill ElektroKill merged commit a0535a3 into dnSpyEx:master Apr 26, 2023
3 checks passed
@glenn-slayden
Copy link
Author

glenn-slayden commented Apr 27, 2023

It should be decompiled as {TypeName 'String:'}

Super clever; I would have been tricked into just dismissing it as aberrant and moving on.

As an aside, despite what everybody says, {x:Reference abcd} (the amazing wormhole) does in fact work, within reason, in modern WPF, but I'd be fairly positive dnSpy won't be happy about encountering it. I have a major app that uses it profusely. I wonder how much effort it would be worth to improve that capability in dnSpy? Do you even know what I'm talking about?

@ElektroKill
Copy link
Member

It should be decompiled as {TypeName 'String:'}

Super clever; I would have been tricked into just dismissing it as aberrant and moving on.

The current decompilation as {TypeName String:} is correct and the XAML compiler will compile it how is expected, however, VS and other XAML editing tools will mark it as an error as they will try and interpret the String: as an incomplete type name thus I believe that placing the text in single quotes will be a better solution.

As an aside, despite what everybody says, {x:Reference abcd} (the amazing wormhole) does in fact work, within reason, in modern WPF, but I'd be fairly positive dnSpy won't be happy about encountering it. I have a major app that uses it profusely. I wonder how much effort it would be worth to improve that capability in dnSpy? Do you even know what I'm talking about?

I'm not exactly sure what you mean here. {x:TypeName SomeString} has been valid syntax for a while and dnSpy will properly decompile it unless it contains some special XAML or XML characters, in that case, it will not escape them properly (I have started working on improving the support for this but it won't be finished soon as I'm leaving for a 1-week long holiday today). What other support and capability would you be looking for in dnSpy in the case you mentioned?

@glenn-slayden
Copy link
Author

glenn-slayden commented Apr 29, 2023

I'm talking about the {x:Reference someName} markup extension, which deploys another copy of an existing instance of a named, xaml-instantiated object from the same document at another point elsewhere within that document.

Its use is discouraged in WPF, where it does work in some cases, but it seems to be fine in general for non-WPF Xaml usage. I've been assuming that dnSpy doesn't do the special handling that would be necessary to represent the various effects of co-identifying the duplicated instance?

@glenn-slayden
Copy link
Author

Oh wait nevermind, I guess you're saying that dnSpy doesn't need to care about the runtime effects of any MarkupExtension (or Xaml-instantiated entity in general) in order to re-generate the source Xaml?

Ok, perhaps true for {x:Reference} then, but on a related note, there is apparently an exception to that rule that you may be aware of, having to do with IComponentConnector and IStyleConnector. Currently, dnSpy attempts to do some elaborate analysis on the extracted IL code for the prevailing implementation of those two interfaces, if present, as part of re-generating the corresponding Xaml.

In many cases, the IStyleConnector processing part of this is detection is currently broken in dnSpy. In the broken cases, the branching structure of the decompiled IL code for IStyleConnector.Connect(...) does not match one of the hard-coded patterns that is expected/recognizable by dnSpy.BamlDecompiler.Rewrite.ConnectionIdRewritePass.ExtractConnectionId(...) via GetCaseBlocks(...), in particular because the extracted backing code is only checking for a single Id value (2), and thus doesn't need elaborate branching constructs or exhbit any "case blocks". For example, System.Activities.Presentation.View.TypeBrowser from Framework System.Activities.Presentation.dll. I am very close to a fix for this bug, and will create an issue or PR for it soon. Stop me if this conflicts with your work in this area.

@ElektroKill
Copy link
Member

Oh wait nevermind, I guess you're saying that dnSpy doesn't need to care about the runtime effects of any MarkupExtension (or Xaml-instantiated entity in general) in order to re-generate the source Xaml?

Yes, dnSpy does not need to know or care about what a MarkupExtension does during to properly generate XAML from its compiled form (BAML).

Ok, perhaps true for {x:Reference} then, but on a related note, there is apparently an exception to that rule that you may be aware of, having to do with IComponentConnector and IStyleConnector. Currently, dnSpy attempts to do some elaborate analysis on the extracted IL code for the prevailing implementation of those two interfaces, if present, as part of re-generating the corresponding Xaml.

These interfaces are not really part of any markup extension and the IDs used for lookup are stored in a special BAML record. The ID is added to the XElement when the XML model is being built and then during a later post processing step the IL of the Connect methods is analyzed to map the IDs to the event and handler names.

In many cases, the IStyleConnector processing part of this is detection is currently broken in dnSpy. In the broken cases, the branching structure of the decompiled IL code for IStyleConnector.Connect(...) does not match one of the hard-coded patterns that is expected/recognizable by dnSpy.BamlDecompiler.Rewrite.ConnectionIdRewritePass.ExtractConnectionId(...) via GetCaseBlocks(...), in particular because the extracted backing code is only checking for a single Id value (2), and thus doesn't need elaborate branching constructs or exhbit any "case blocks". For example, System.Activities.Presentation.View.TypeBrowser from Framework System.Activities.Presentation.dll. I am very close to a fix for this bug, and will create an issue or PR for it soon. Stop me if this conflicts with your work in this area.

The code which is responsible for detecting the conditions inside the Connect methods is currently quite buggy and I’ve already started to look into rewriting the part which handles the case when the method does not contain a switch statement but rather a series of if statements (the buggy case). I would tell you more about my new approach to this but I’m currently on holiday without access to my PC (typing this out on a mobile device :p). The examples of the current implementation not sufficing will definitely be helpful :D

@glenn-slayden
Copy link
Author

Ok, so we are conflicting then, so I will stop what I'm doing for now; the football is yours. The deepest point of the false-positive rejection for this bug is here:

if (body[pos] is not ILCondition cond) {
if (!body[pos].Match(ILCode.Stfld, out IField _, out var ldthis, out var ldci4) || !ldthis.MatchThis() || !ldci4.MatchLdcI4(1))
return null;

@ElektroKill
Copy link
Member

ElektroKill commented May 5, 2023

@glenn-slayden I partially rewrote the algorithm used to extract the connection ID information from the if-else structures found in the Connect methods. I also fixed a bug that resulted in the incomplete detection of EventSetter elements. The code with my changes is available on the https://github.com/dnSpyEx/dnSpy/tree/feature/improved-baml branch. Please check and let me know if the changes I made fix your issues with connection IDs.

@glenn-slayden
Copy link
Author

glenn-slayden commented May 6, 2023

TL;DR Thanks.

Wow super nice job! Looks like the issue I mentioned above (namely, corrupted BAML result when browsing e.g.,System.Activities.Presentation.View.TypeBrowser) is robustly fixed. I manually re-checked the list of crashing test cases I had gathered last week, and from doing that it was obvious that the BAML readback, as a whole, is now greatly improved versus before.

The left side is before your fix and the right side after:

Untitled4

Notice the green inline error message on is now gone, and instead there are now three new EventSetter members being displayed that were not included at all before. I kept a breakpoint on line 337 (the snippet of my previous message) for the duration of my re-testing, and it was never hit. Yay.

I haven't studied your changes in detail yet. Did it turn out to be what we suspected? My crude analysis was that, in some cases, the IL extracted from the Connect() method would manifest an outermost branching structure that didn't match any of the handful of basic patterns ExtractConnectionId() was qualified to understand.

But just a few hard-coded scenarios should have remained sufficient to cover any realistic number of connection ID that might happen, especially if one expects the switch opcode most of the time... And also, it seems likely that the ExtractConnectionId() code might have worked properly at some point in the distant past, fwiw...

My main hunch was that the bug would be related to whenConnect() only needs to handle exactly one ID value. I discovered that, before processing, the received IL did still show an outermost branch--in this case for the purpose of validating the singular expected ID value only. (As opposed to every other case n > 1, which all benefit from accidental/inherent validation, namely the "free" side-effect of the unavoidable routing process.)

Anyway, I was starting to suspect that, in the n == 1 case, the feeble branching structure--or maybe the tiny method size overall--caused the AST optimizer to get excited--and more emboldened (or capable, or enthusiastic) to transform this one more aggressively than the others. Based on studying a listing of the rewrite the AST applied, I speculated that this especially dramatic reorganization of the IL code was probably what sent ExtractConnectionId() beyond the bounds of anything it might expect to recognize. Am I even close?

@ElektroKill
Copy link
Member

ElektroKill commented May 7, 2023

Hi, the issue you were facing was caused by two major problems in the connection ID extraction.

  1. dnSpy expected only one EventSetter connection per ID, this was a false assumption as it is possible to have more than one per ID. This was fixed in 885f0ba
  2. The analysis of the if structure was not perfect and had flaws
    This algorithm was heavily modified in 90da19a to allow support for more than just a few if patterns. The old algorithm was also designed for IComponentConnector.Connect only which meant it expected that the method would end in an assignment to _contentLoaded field in the type which would signal the end of the if chain. The IStyleConnector.Connect methods do not use this field which resulted in the algorithm not completing successfully. The new algorithm "visits" every "body"/scope recursively and attempts to traverse the body somewhat simulating the actual control flow at runtime. Hopefully, this new implementation will last and won't need any more updates for BAML -> XAML (WPF is getting dropped by MS)

Hope this somewhat explains the changes made. I've also started to work on improving the way markup extensions are handled in the decompiler to prevent issues with incorrect escaped/unescaped characters and correct handling of whitespace in parameters to markup extensions which should hopefully resolve some more issues with the decompiled XAML not being compilable.

EDIT: I improved the algorithm slightly in commit 675b102, could you test it too to see just to make sure I have not introduced any regressions?

EDIT2: The changes have now been turned into a PR #197

@glenn-slayden
Copy link
Author

Ok, I should be able to sync up to your latest within the next couple days and I'll let you know if I see any problems.

@ElektroKill
Copy link
Member

Ok, I should be able to sync up to your latest within the next couple days and I'll let you know if I see any problems.

@glenn-slayden Any news on this? I would like to get the PR merged in the coming days :P

@glenn-slayden
Copy link
Author

@ElektroKill Sorry for the delay, I will try to take a look within the next 24 hours..

@glenn-slayden
Copy link
Author

Ok, ran through my ad-hoc BAML test suite again with your latest change and it looks great, no problems that I can see. No "green in-line error messages" appear in the decompiled BAML. NIce job!

@ElektroKill
Copy link
Member

Ok, ran through my ad-hoc BAML test suite again with your latest change and it looks great, no problems that I can see. No "green in-line error messages" appear in the decompiled BAML. NIce job!

Great to hear that! I will go ahead and merge the PR now :p

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants