
Flippin' Mixins, how do they work?

Mumfrey edited this page Sep 7, 2017 · 2 revisions

One of the criticisms I see levelled against Mixin from time to time is that it's too complicated or is just overkill for a particular job. The aim of this article is therefore to explain exactly how Mixin functions in order to allay any fears that it's overly complex - in fact I hope to show that it's simply exactly as complex as it needs to be to do its job - by explaining exactly what each part of the subsystem does and how it functions.

The aim here is to give those familiar with bytecode manipulation, and with ASM in particular, a rounded understanding of Mixin's core, to understand its lifecycle and the reasons for certain design decisions which may seem arcane at first glance.

For the purposes of this article not becoming an un-navigable mess, I am going to assume a familiarity with Mixin's feature set, with the structure of Java classes and bytecode, and with ASM. This is a technical article and is really only suitable for those interested in the inner workings and lifecycle of Mixin. I have done my best to write in plain English, but I'm making no promises that this will make any sense if you don't already have a good understanding of the previously mentioned topics.

1. The Principle

Ultimately, at its core, Mixin takes two sets of bytecode, that of a target class and a mixin "class", and merges them together. ASM provides various ways to do this, but quite honestly anyone who has used ASM for more than five minutes will realise a vital truth:

  • Merging bytecode with ASM is trivially easy

In fact it's so easy that we can copy a method from one class to another with 8 lines of ASM:

ClassNode source = new ClassNode();
ClassNode dest = new ClassNode();
new ClassReader(sourceBytes).accept(source, 0);
new ClassReader(destBytes).accept(dest, 0);
source.methods.stream().filter(m -> m.name.equals("foo")).forEach(dest.methods::add);
ClassWriter cw = new ClassWriter(0);
dest.accept(cw);
return cw.toByteArray();

This isn't even the easiest or most efficient way, just the shortest.

Obviously we know that this will only work with a carefully crafted source method, and that things will immediately explode if, for example, a conflicting method already exists. ASM doesn't check that what we're doing is valid; it merely gives us tools to manipulate the underlying class bytes, so it's quite easy for this to go wrong.
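To make the "conflicting method" problem concrete, here is a minimal sketch of the kind of precondition check a merge needs. It deliberately avoids ASM and models a class as a set of name-plus-descriptor keys; MergeCheck, methodKey and merge are hypothetical names for illustration, not part of ASM or Mixin.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model: a class is reduced to a set of "name + descriptor" keys.
final class MergeCheck {

    static String methodKey(String name, String desc) {
        return name + desc;
    }

    // Adds the method to dest, failing fast on a conflict rather than
    // silently producing a class that declares the same method twice.
    static void merge(Set<String> dest, String name, String desc) {
        if (!dest.add(methodKey(name, desc))) {
            throw new IllegalStateException("Conflicting method " + name + desc);
        }
    }

    public static void main(String[] args) {
        Set<String> dest = new LinkedHashSet<>();
        merge(dest, "foo", "()V");
        merge(dest, "foo", "(I)V"); // an overload is fine: the descriptor differs
        System.out.println(dest);
    }
}
```

Even this trivial check is something the 8-line version above does not do, and it is only the first of many.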

But assuming we can check the obvious preconditions, which surely must be pretty easy to check, where does all the complexity come from?

The answer is that Mixin undertakes a lot of work to ensure that incoming mixin code is both sane and correct, and also provides for functionality which extends far beyond the bounds of a single target class. The fact is that

  • Merging bytecode with ASM in a safe and reliable manner is nontrivially hard

The following sections take a look at some of the features Mixin provides and the problems it tries to solve, and explain how it attempts to solve them. They do this largely by taking a walk through Mixin's lifecycle and explaining what it does and how it does it.

2. The Feature Set

Merging bytecode in a dumb way is fine if the mixin author takes full responsibility for their actions, understands the inner workings of the JVM, and can carefully craft the incoming mixin to not break any of the rules. However one of the main goals of Mixin has always been a kind of Transformer Hippocratic Oath to "do no harm". If the mixin is invalid, Mixin should fail-fast and ideally fail with a meaningful error. The onus is then on the mixin author to write valid mixins and on Mixin to turn valid mixins into valid bytecode in the target. We also want to provide features to mixin authors which allow them to write powerful mixins in a straightforward way.

2.1 Target Class Scope

One of the features of Mixin which leads to the most scaffolding is supporting operations which reach outside of the target class, in particular operations which reference superclasses or members added by other mixins to the same class.

One of the most complex aspects of validating hierarchy operations is accessing fields and methods on a superclass of a target class which are themselves added by another mixin, by virtue of having the derived mixin extend the supermixin. This is such a complex topic that it already has its own article.

2.2 "Soft" Operations

Thanks to the workings of the JVM, there are some things we can get away with in mixins which would never be valid Java, but are valid bytecode. Supporting these operations requires defining a "soft" syntax for these operations which can be transformed into real code at application time. Examples include "soft-implementing" interfaces, defining Intrinsic Proxies, and merging methods which have conflicting signatures but differing return types.

2.3 Awareness of Other Mixins

Mixins are designed with awareness of other mixins in mind. Negotiation between mixins is provided by a simple priority system, a more complex companion plugin system, and by contract features such as the @Final annotation.

All mixins provided by all consumers are applied to a particular target class in one pass, so that mixins interact with each other in predictable and ultimately deterministic ways.

2.4 Fail Fast or Fail Safe

Ultimately, we never want to end up in a situation where we allow "bad" bytecode to make it all the way into the ClassLoader. We want our mixins to fail-fast in a predictable way - ideally providing as much information as possible, or at the very least to "fail safe" in that injected code is always valid or just remains wholly un-merged.

2.5 Compatibility and Environmental Awareness

As well as the core functionality of mixin which addresses the notion of obfuscation boundaries and different obfuscation contexts, the actual transformations undertaken by mixin should always take into account actions by other class transformers in the environment at run time. This awareness contributes to the previous goal: if a transformer changes a class in a way we don't expect, we want to detect this situation and again, fail fast or fail safe.

2.6 Efficiency

We want to achieve all of the above in the most efficient way possible. Obviously there are trade-offs with speed and memory usage, and in general Mixin will err on the side of using more memory as an opportunity to gain more speed.

In recent versions of Mixin I have undertaken to profile the different parts of the mixin process, and as expected the vast majority of time is actually spent in side-loading and transforming classes (see the sections on ClassInfo below) which ultimately would need to happen anyway. In general, the core of the Mixin processor is acceptably fast given its capabilities.

3. The Lifecycle

Now that we know what we want to achieve, it's time to dive into the guts of the transformer and see how the different classes interact. This section covers the role of different classes in Mixin's core, so I will take care to link to the relevant source code wherever possible.

I will skip over platform-specific actions in the subsystem and talk only about the Mixin process itself. This is mainly because the platform-specific handlers are responsible for low-level marshalling of jars, tweakers and remapping agents and the whole topic is frankly pretty dull and doesn't add to the understanding of Mixin itself, so I won't digress.

3.1 Startup

Mixin as a subsystem needs to start up somewhere, and most often this is initiated by one or more agents. All consumers in the environment will interact with a single instance of the Mixin subsystem, and thus the party responsible for bootstrapping has little bearing on the operation of Mixin, other than which version is in use.

Besides registering the relevant companion objects, the main task undertaken by consumers is to register their mixin configuration files with the subsystem itself.

It comes as a surprise to some that Mixin requires mixins to be added to a configuration file. "Why" - they ask - "can mixin not just scan for mixin classes in a jar, the same way we scan for mods?" The main reason is speed: scanning jars is quite expensive, and using a configuration file means we can also specify a lot of important corollary information too. It also means that mixins can exist in the jar which are not applied, or are selectively applied by utilising a companion plugin.

When configs are added they are immediately parsed by being internally deserialised to MixinConfig instances. Parsing the config determines the following attributes:

  • The required compatibility level - we can fail fast if the required level is unavailable

  • The required mixin subsystem version - again, fail fast if the version in use is too old

  • The desired phase for the config - more on phases later

The list of mixins in the config is deserialised but not parsed. The mixins themselves are not parsed until the config is activated by the start of its phase.
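The fail-fast checks made when a config is added can be sketched roughly as follows. This is an illustration only: ConfigSketch, its fields and the two version constants are hypothetical stand-ins and do not reflect the real MixinConfig class or its JSON keys.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a parsed config; names are illustrative only.
final class ConfigSketch {
    static final int SUBSYSTEM_VERSION = 7; // assumed version of the running subsystem
    static final int MAX_COMPAT_LEVEL = 8;  // assumed highest supported Java level

    final String phase;
    final List<String> mixinNames; // kept as plain names, not parsed yet

    private ConfigSketch(String phase, List<String> mixinNames) {
        this.phase = phase;
        this.mixinNames = mixinNames;
    }

    // Checks the cheap requirements up front and fails fast if they cannot be met.
    static ConfigSketch parse(int minVersion, int compatLevel, String phase, String... mixins) {
        if (minVersion > SUBSYSTEM_VERSION) {
            throw new IllegalStateException("Config requires subsystem version " + minVersion);
        }
        if (compatLevel > MAX_COMPAT_LEVEL) {
            throw new IllegalStateException("Config requires compatibility level " + compatLevel);
        }
        return new ConfigSketch(phase, Arrays.asList(mixins));
    }

    public static void main(String[] args) {
        ConfigSketch cfg = parse(5, 8, "DEFAULT", "MixinFoo", "MixinBar");
        System.out.println(cfg.phase + " " + cfg.mixinNames);
    }
}
```

Note that the mixin names are stored untouched: nothing has been loaded from disk at this point.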

As well as gathering configs, Mixin registers its own Class Transformer which then becomes the entry point for most of the rest of Mixin's functionality.

3.2 Beginning a Phase

First of all, let's answer the question "Why does Mixin have phases at all?"

Before we begin applying mixins, as we shall discover below, we actually need to gather quite a lot of information. Mixins which depend on supermixins or upon transformations made by other class transformers need to know that the supermixins are available, that their target class(es) are as expected, etc. This means that before we begin applying mixins, we need to know about all the mixins we are planning to apply.

Normally this is fine. Mixins are pumped through the transformer chain just like regular classes, everything is loaded and processed in one go, and then we are free to apply the mixins as the target classes pass through the live transformer chain in the ClassLoader.

However, it may be desirable - especially for other subsystem-level technologies, such as Sponge - to apply mixins to other parts of the loader infrastructure. The simple solution would seem to be "just load the mixins earlier". The problem is that certain mixins are going to rely on being transformed by - for example - the Side Transformer or in certain cases the deobfuscating transformer. At the very start of the game's lifecycle, these transformers are not active since they form part of the framework we wish to transform!

The answer is the introduction of phases. Thus the PREINIT phase comprises mixins which need to mix into infrastructure classes, which need not rely on the transformer chain being completed. Later DEFAULT phase mixins are then loaded once the transformer chain is substantively complete and the game is in its "ready to be loaded" state.

3.2.1 Selecting Configs

We already know which configs wish to participate in which phase, so the start of a phase is triggered by the first class to pass through the transformer chain when the criteria for beginning that phase are met. Thus, on entry, one of the MixinTransformer's first tasks is to check whether a new phase has begun, and select the configs for that phase.

Configs from previous phases are always retained. At the start of a phase, configs for that phase are cherry-picked from the environment's list of available configs and then sorted by priority into a pending list.

MixinConfigs perform some initial self-configuration upon being selected:

  • The config's companion plugin (if specified) is instantiated
  • The config's reference map is loaded and parsed

Configs which fail selection are removed from the pending list and are not processed further.
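The selection step can be pictured with the short sketch below. PhaseSelection and its nested Config type are hypothetical, and ordering by ascending priority value is an assumption made for the example rather than a statement about the real implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class PhaseSelection {
    enum Phase { PREINIT, DEFAULT }

    // Hypothetical, heavily reduced config: just a name, a phase and a priority.
    static final class Config {
        final String name;
        final Phase phase;
        final int priority;

        Config(String name, Phase phase, int priority) {
            this.name = name;
            this.phase = phase;
            this.priority = priority;
        }
    }

    // Cherry-picks the configs registered for the phase, then sorts them by priority.
    static List<String> select(List<Config> available, Phase phase) {
        List<Config> pending = new ArrayList<>();
        for (Config config : available) {
            if (config.phase == phase) {
                pending.add(config);
            }
        }
        pending.sort(Comparator.comparingInt(c -> c.priority)); // assumed: ascending
        List<String> names = new ArrayList<>();
        for (Config config : pending) {
            names.add(config.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Config> all = Arrays.asList(
            new Config("late", Phase.DEFAULT, 1500),
            new Config("infra", Phase.PREINIT, 1000),
            new Config("early", Phase.DEFAULT, 500));
        System.out.println(select(all, Phase.DEFAULT));
    }
}
```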

3.2.2 Preparing Configs

MixinConfigs which pass selection are then prepared. Preparation can be quite a lengthy operation because the mixins in the config themselves must be loaded from disk and parsed.

Each MixinConfig contains 3 discrete mixin sets, server, client and common mixins, with the common mixins being parsed first and then either the client or server sets being parsed next depending on the detected side.

3.2.2.1 Parsing Mixins

During the prepare stage, the mixins in the config are loaded from disk. Each mixin is parsed into a MixinInfo struct which handles the initial parsing of the mixin and acts as storage for all the generated metadata.

If the mixin bytecode is successfully loaded, the MixinInfo proceeds to gather the required information for the mixin:

  • The mixin State is initialised. The State is a struct used to hold the mixin bytecode and unvalidated details parsed from the mixin class. State is maintained as a separate struct within the MixinInfo for validation purposes and as scaffolding to facilitate hot-swapping later in the application lifecycle.

  • A ClassInfo for the mixin is created.

ClassInfo is used extensively throughout Mixin. ClassInfo is used to access and store class metadata without using reflection (which would trigger a class load) by side-loading the class and passing it manually through the transformer chain.

ClassInfo stores similar information to the Java Class object, but with a focus on information required by Mixin. To this end, ClassInfo is mutable, keeping track of which mixins target a particular class and tracking changes which are made by mixins so that the changes are visible to other mixins.

Classes side-loaded for the purposes of ClassInfo are passed through a subset of the transformer chain known as the delegation list. The delegation list is updated when a new phase is entered, and deliberately excludes any transformers known to be re-entrant, and the mixin transformer itself.

Whilst the performance impact of side-loading seems significant, in practice the most expensive operation tends to be the transformers themselves, since the second time the bytecode is loaded "for real" it is usually already present in memory and does not exact any measurable performance cost.

  • The SubType of the mixin is initialised. The criteria for determining the SubType are straightforward:

  • The mixin target classes are then parsed from the @Mixin annotation. Public targets specified as class literals are parsed first, followed by soft targets specified as strings.

    Each target is first checked for eligibility against the mixin config's companion plugin (if any) and a ClassInfo is then fetched for each successful target. If a target is unavailable, the preparation phase will fail-fast at this point.

    Targets which are successfully discovered are checked against the SubType for eligibility (eg. to make sure an interface mixin isn't targeting a class).
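The delegation list used for side-loading, described above, amounts to a filtered view of the transformer chain. A minimal sketch, assuming each transformer can be described by a name and a "known to be re-entrant" flag (both hypothetical simplifications):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DelegationList {

    // Hypothetical descriptor for an entry in the transformer chain.
    static final class Transformer {
        final String name;
        final boolean reEntrant;

        Transformer(String name, boolean reEntrant) {
            this.name = name;
            this.reEntrant = reEntrant;
        }
    }

    // Keeps every transformer except re-entrant ones and the mixin transformer itself.
    static List<String> build(List<Transformer> chain, String self) {
        List<String> delegates = new ArrayList<>();
        for (Transformer transformer : chain) {
            if (!transformer.reEntrant && !transformer.name.equals(self)) {
                delegates.add(transformer.name);
            }
        }
        return delegates;
    }

    public static void main(String[] args) {
        List<Transformer> chain = Arrays.asList(
            new Transformer("deobf", false),
            new Transformer("loader-hook", true),
            new Transformer("mixin", false),
            new Transformer("side", false));
        System.out.println(build(chain, "mixin"));
    }
}
```

Rebuilding this list on each phase change is what lets side-loaded metadata reflect transformers which only become active later.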

3.2.2.2 Completing Config Preparation

As each MixinInfo completes the parsing phase, the parent MixinConfig ingests the target classes into its own local set of targets. This allows the config to determine, for any given target class, whether it has any mixins to apply.
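A target index of this kind could look something like the sketch below. TargetIndex and its methods are hypothetical names; the point is only that answering "do I have mixins for this class?" becomes a single map lookup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-config index from target class name to the mixins targeting it.
final class TargetIndex {
    private final Map<String, List<String>> mixinsByTarget = new HashMap<>();

    // Called as each mixin finishes parsing.
    void ingest(String mixin, String... targets) {
        for (String target : targets) {
            mixinsByTarget.computeIfAbsent(target, k -> new ArrayList<>()).add(mixin);
        }
    }

    boolean hasMixinsFor(String target) {
        return mixinsByTarget.containsKey(target);
    }

    List<String> mixinsFor(String target) {
        return mixinsByTarget.getOrDefault(target, Collections.emptyList());
    }

    public static void main(String[] args) {
        TargetIndex index = new TargetIndex();
        index.ingest("MixinFoo", "com.home.Foo", "com.home.Bar");
        index.ingest("MixinBar", "com.home.Bar");
        System.out.println(index.mixinsFor("com.home.Bar"));
    }
}
```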

3.2.3 Validating Configs

Preparing the MixinConfigs generates the universe of configs and mixins for the phase. At this point we know that all mixins for the phase have been successfully parsed and all intended target classes are known.

At this stage, a validation pass is made which visits each MixinConfig and causes it to trigger validation of its child MixinInfos.

The validation pass for the MixinInfo visits the State which has retained the initial ClassNode created during the parsing phase. The State performs checks on the mixin to ensure that it is sane against its parsed SubType and target classes:

  • A MixinPreProcessor is constructed and the prepare action is executed, followed by a conform for each target class (see the section later on conforming). This has the effect of decorating each target class ClassInfo with conformed injector methods. This is done at an early stage so that injector handlers can be properly discovered by all other mixins in the stage during the application phase.

  • Basic sanity checks are made by the SubType. One of the main checks made is to ensure that the direct superclass of the mixin appears in the superclass hierarchy of each target class. This necessitates recursively navigating the class hierarchy using ClassInfo and may trigger further class side-loading in order to populate the additional metadata required at this stage. For Interface Mixins and Accessor Mixins, a more basic check is performed.

  • Basic sanity checks are made by the State itself.

  • Inner classes in the mixin are visited and internally stored so they can later be processed. More details in later sections.

If a MixinInfo fails the validation pass, an error is logged and the MixinInfo is removed from the MixinConfig's pending mixin set.
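The superclass check mentioned above boils down to walking the target's hierarchy using metadata alone. The following is a rough sketch, with a plain map standing in for the ClassInfo pool; HierarchyCheck and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class HierarchyCheck {
    // Hypothetical metadata pool: class name -> superclass name (null for the root).
    private final Map<String, String> superOf = new HashMap<>();

    void define(String name, String superName) {
        superOf.put(name, superName);
    }

    // Walks up from the target's superclass looking for the wanted class.
    boolean hasSuperClass(String target, String wanted) {
        for (String current = superOf.get(target); current != null; current = superOf.get(current)) {
            if (current.equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        HierarchyCheck pool = new HierarchyCheck();
        pool.define("java/lang/Object", null);
        pool.define("A", "java/lang/Object");
        pool.define("C", "A");
        System.out.println(pool.hasSuperClass("C", "A"));
    }
}
```

In the real system each step up the hierarchy may require side-loading a class that has not been seen before, which is where the cost comes from.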

3.2.4 Activating Configs

MixinConfigs which survive the preparation process are added to the transformer's active collection of configs, which is then re-sorted to ensure that configs maintain priority order.

3.3 Applying Mixins

3.3.1 Initial Steps

The mixin transformer performs duties in addition to simply applying mixins to target classes. Since some classes need to be generated at runtime, the MixinTransformer first delegates to any registered IClassGenerators if the incoming bytecode is null - implying that the specified class needs to be fabricated per the general contract of class transformers.

3.3.1.1 Inner Class Generator

Inner classes in mixins have to be handled quite delicately. It's important to note that inner classes aren't really particularly inner from the JVM's point of view, and a lot of the functionality afforded inner classes is simply syntactic sugar for package-private scaffolding created by the compiler to create the illusion of inner-ness.

This means that for inner classes, the synthetic scaffolding needs to be carefully reconstructed for each mixin target class, as does the inner class itself. Thus if a particular mixin MixinFoo has an inner class MixinFoo$Baz and targets two classes com.home.Foo and com.home.Bar, two copies of the inner class need to be made: one for each target.

I handle this situation at application time by conforming the class reference to the target and appending a unique identifier (to avoid conflicts between multiple mixins which just happen to have the same inner class name!). Thus our conformed class names will be com.home.Foo$Baz$abcdef0123 and com.home.Bar$Baz$defecdb4567.

Each unique name can be easily reverse-inflected via an internal map to a particular target and source class, and thus it is the task of the InnerClassGenerator to synthesise the bytecode for the conformed inner class by reading the original bytecode and transforming references to the mixin (the original outer class) to the desired target class. Note that access transformations are not necessary since any "private" members will have synthetic, package-private accessors to simulate the original inner-class behaviour.
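The naming and reverse lookup can be sketched as below. The counter-based suffix is a hypothetical stand-in for whatever unique identifier is really generated, and InnerClassNames is not a real Mixin class.

```java
import java.util.HashMap;
import java.util.Map;

final class InnerClassNames {
    // conformed name -> { original inner class name, target class name }
    private final Map<String, String[]> origins = new HashMap<>();
    private int counter;

    // Builds a per-target name for the inner class and records where it came from.
    String conform(String innerClass, String target) {
        String simpleName = innerClass.substring(innerClass.lastIndexOf('$') + 1);
        String conformed = target + "$" + simpleName + "$" + Integer.toHexString(counter++);
        origins.put(conformed, new String[] { innerClass, target });
        return conformed;
    }

    String originalOf(String conformed) {
        return origins.get(conformed)[0];
    }

    String targetOf(String conformed) {
        return origins.get(conformed)[1];
    }

    public static void main(String[] args) {
        InnerClassNames names = new InnerClassNames();
        String forFoo = names.conform("MixinFoo$Baz", "com.home.Foo");
        String forBar = names.conform("MixinFoo$Baz", "com.home.Bar");
        System.out.println(forFoo + " and " + forBar + " both come from " + names.originalOf(forBar));
    }
}
```

When a class load is later requested for one of the conformed names, the map is all the generator needs to know which bytecode to read and which target to rewrite references to.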

Note that for pure synthetic inner classes we don't need to do this if the class is static, since currently the only source of pure synthetic inner classes is for the purposes of switch lookups. For these classes we simply pass-through the class bytecode as-is, in order to reduce the processing burden. This is handled by the MixinPostProcessor instead.

3.3.1.2 Args Class Generator

The other class generator currently in use is the ArgsClassGenerator which is used to synthesise subclasses of Args for use in ModifyArgs injectors.

3.3.2 Preparing Mixins for Application

For each incoming candidate class, MixinConfigs in the transformer's active set are visited to check whether they have any available mixins for the specified class. Remember from above that each MixinConfig maintains an internal hit list of target classes it has mixins for, and thus the process of selection is very fast.

Each MixinConfig contributes eligible mixins to a TreeSet which intrinsically sorts the mixins by priority since MixinInfo implements Comparable<MixinInfo> in order that mixin priority describes the natural order of the mixins.
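A minimal sketch of priority as natural order follows. Info is a hypothetical reduction of MixinInfo, and the tie-break on registration order is an assumption added for the example: a TreeSet treats compareTo() == 0 as "same element", so without a tie-break two mixins with equal priority would collapse into one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class PrioritySketch {

    // Hypothetical reduction of a mixin's metadata to the fields needed for ordering.
    static final class Info implements Comparable<Info> {
        final String name;
        final int priority;
        final int order; // assumed: position in which the mixin was registered

        Info(String name, int priority, int order) {
            this.name = name;
            this.priority = priority;
            this.order = order;
        }

        @Override
        public int compareTo(Info other) {
            int byPriority = Integer.compare(priority, other.priority);
            // Never return 0 for distinct mixins, or the TreeSet would drop one of them.
            return byPriority != 0 ? byPriority : Integer.compare(order, other.order);
        }
    }

    static List<String> sorted(Info... infos) {
        TreeSet<Info> set = new TreeSet<>();
        for (Info info : infos) {
            set.add(info);
        }
        List<String> names = new ArrayList<>();
        for (Info info : set) {
            names.add(info.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sorted(
            new Info("b", 1000, 1), new Info("a", 500, 2), new Info("c", 1000, 0)));
    }
}
```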

If any eligible mixins are offered for the target, processing continues and a ClassNode is created for the incoming target class and is wrapped in a TargetClassContext for marshalling through the rest of the pipeline. The ClassInfo for the target is also retrieved from the metadata pool, having been created earlier during the validation phase, and is thus pre-decorated with conformed members from declaring mixins.

3.3.2.1 Creating the Applicator and Pre-Processing Mixins

The first stage in creating the mixin applicator is to pre-process the eligible mixins ready for application. This process blends the two contexts of the mixins themselves (in the form of MixinInfo instances) and the target class (in the form of TargetClassContext) storing the intermediate transformations in a resulting MixinTargetContext object. MixinTargetContext therefore represents a mixin in the context of a specific target as it passes through the applicator.

The pre-processor performs work in three stages to produce the required MixinTargetContext:

  • Prepare First prepares the mixin class in non-context-specific ways, for example stripping the prefixes from shadow methods and soft-implementing methods. These changes decorate the underlying ClassInfo for the mixin so that renamed methods can be tracked.

  • Conform The conforming process is a target-context-specific action which prepares unique methods in the mixin (injector handlers, and any methods explicitly decorated with @Unique) by giving them names which are guaranteed to be unique within the target class hierarchy. Since this process is deterministic within the context of a given target, we can guarantee that the decorated names generated at this stage will match the conformed names generated in the validation pass performed earlier.

  • Attach The attach process is the most involved pre-processor step, and can be thought of as specialising the mixin to its target. It handles each type of method and field differently and thus has multiple responsibilities to discharge:

    • For shadow methods and fields, it ensures that the shadow target exists in the target class

    • For @Overwrite methods it ensures that the target exists

    • For @Unique methods, it handles renaming private methods to avoid clashing with members in the target, and raises an error if public methods conflict

    • For every member, it decides whether the member should remain in the mixin in order to be applied to the target, or discarded because it is not required. For example shadows are discarded after verification and are instead registered with the target context so that they can be visited later.

    Once attach is completed, only members which need to be merged into the target remain present in the mixin, and all members have been conformed to their final signature ready for application.

It's important to note that many of the actions taken during the preprocessing stage decorate the metadata counterparts of the mixin members in the ClassInfo, facilitating actions later in the application process. For example, conformed injector methods can be overridden in derived mixins, but doing so requires a way of communicating the conformed name to the derived mixin, which might - thanks to the magic of classloading - actually get applied before the superclass counterpart. It is for this reason that the conform process is run during the validation stage, in order to fully populate this required metadata. Members in the ClassInfo can be discovered by both their original name and their conformed name, facilitating later calls to isRenamed() to determine the renaming state. This propagation of renaming also applies to prefixed shadow and soft-implements members, which may be over-ridden or called in derived mixins, and will need to know the relevant conformed name in order to have those calls transformed.
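The property that matters for conforming is determinism: running the process twice for the same mixin, method and target must yield the same name. The sketch below illustrates only that property. The naming scheme is entirely hypothetical and does not attempt the uniqueness guarantee within the target hierarchy that the real decoration provides.

```java
final class ConformSketch {

    // Hypothetical scheme: derive the decoration from the inputs alone, so that
    // the validation pass and the application pass arrive at the same name.
    static String conform(String target, String mixin, String method) {
        int hash = (target + "::" + mixin + "::" + method).hashCode();
        return "handler$" + Integer.toHexString(hash) + "$" + method;
    }

    public static void main(String[] args) {
        String duringValidation = conform("com.home.Foo", "MixinFoo", "onTick");
        String duringApplication = conform("com.home.Foo", "MixinFoo", "onTick");
        System.out.println(duringValidation.equals(duringApplication));
    }
}
```

Because nothing but the inputs feeds into the name, a derived mixin processed first can still learn the conformed name of a handler it overrides.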

3.3.3 Applying Mixins

After the preprocessing stage, a sorted list (in priority order) of MixinTargetContexts exists in the applicator. The applicator then proceeds to apply the prepared mixins to the target class.

Application of mixins proceeds in three stages, visiting mixins in their priority order.

3.3.3.1 Main Application Stage

The Main application stage processes the majority of the mixin application logic, the business of taking the prepared mixin code and merging it into the target class. This stage comprises the following tasks:

  • Merging and applying the generic signature of the mixin with the target class. This takes the form of blending declared signature elements on the mixin with declared elements on the target class. This is handled by ClassSignature structs which are lazily evaluated against the ClassInfo metaobjects of the mixins and the target class.

  • Merging interfaces declared on the mixins into the target class. Effectively adding an implements clause for each implementation declared on the mixins.

  • Merging class attributes such as class version and source file. This is done so that if, for example, the target class was compiled with Java 6 but the mixin uses Java 8 features, the output class has the correct major.minor version.

  • Merging annotations at the class level. Some annotations are discarded (for example the @Mixin annotation itself), whilst most others are merged onto the target.

  • Merging fields from the mixin into the target. Most fields are simply added to the target directly, shadows having been already discarded at the preparation stage. Annotations from shadow fields are merged however.

  • Merging methods from the mixin. This is the most involved step since methods require specialisation before merging, and the applicator also has to respect some other contractual obligations established by mixin.

    As each method is processed it is transformed. This transformation process is handled by the MixinTargetContext and broadly consists of walking through each opcode in the mixin method and changing references to the mixin class (for example method and field accesses) into references to the new target class.

    For methods and fields outside of the target class, the ClassInfo is used to lookup the relevant member, which may cause additional class side-loading if new classes need to be inspected. This is where earlier renames are applied to the actual calls to those members in the mixin.

    Type casts and method and field accesses in the mixin also get some special handling with respect to supermixins. Any references to a supermixin within the scope of the mixin code will be transformed to the target of that mixin in the context of the current target. To see how this works it's easier to consider a simple example:

    In this example, we have 3 mixins which target 2 classes each, respecting the hierarchy rules. MixinZ accesses a field F in MixinX. When MixinZ is applied to class C, the field access is transformed to A.F because MixinZ in the context of target C has superclass A. Likewise, in the context of target L, the field is transformed to target J.F. This is the underlying reason for the importance of hierarchy traversal in both validation and in later stages of mixin specialisation.

    It is also sometimes desirable in mixin code to type-cast the mixin to a known target. This requires a cast via Object because the Java compiler does not allow a direct cast to a class which it "knows" in advance will fail. So we will often see code such as ((TargetClass)(Object)this).somePublicMethod(). Mixin detects and removes these double-casts for efficiency, meaning the resulting code has equivalent efficiency to simply calling this.somePublicMethod().

    Once the transformation of the method code is complete, the applicator continues the job of merging the method into the target:

    • In general, methods are expected to displace their targets, unless the target was already merged by another mixin and was decorated with @Final, implying that further overwrite operations are not permitted. Other @Overwrite methods which displace previous overwrites merely log a warning message.

      There's also always the possibility that the method being merged is a synthetic bridge method in both targets. If this is the case, the bridges are compared for equivalence. If the bridges match then they are merged, otherwise an error is raised.

    • @Intrinsic methods are also handled differently, in that they may be required to displace their target if the displace property is true. Otherwise they are simply skipped or merged depending on the presence or absence of their target.

    • Merged methods are also contributed to the target ClassInfo at the point they are merged.

  • Merging initialisers from the mixin. Initialisers are one of the more tricky aspects of merging mixins because constructors in Java classes are a Frankenstein's Monster of different parts of the original class. Compiled constructors can contain:

    • The original constructor code, including explicit or implicit calls to the superconstructor.

    • Field initialisers from the class.

    • Instance initialisers from the class.

    • Initialisation code for synthetic aspects of the class, for example storing a reference to the outer class in a synthetic field.

    Because compiled constructors are such a mess of different parts, mixin limits the functionality afforded to the merging of initialisers. Where possible, field initialisers are detected by a naïve line-number-based algorithm and the detected initialiser ranges are merged into constructors in the target.

    The static initialiser is simply appended to the static initialiser of the target.
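To see why compiled constructors are such a mixture, the plain Java example below (nothing Mixin-specific, and all names invented for the illustration) records the order in which the parts run. The compiler inlines the field initialiser and the instance initialiser into the constructor, after the superconstructor call and before the constructor body.

```java
import java.util.ArrayList;
import java.util.List;

final class InitOrder {
    static final List<String> LOG = new ArrayList<>();

    static class Base {
        Base() {
            LOG.add("super constructor");
        }
    }

    static class Derived extends Base {
        final int value = log("field initialiser");

        {
            LOG.add("instance initialiser");
        }

        Derived() {
            // The implicit super() call and both initialisers above are compiled
            // into this constructor ahead of the statement below.
            LOG.add("constructor body");
        }

        static int log(String message) {
            LOG.add(message);
            return 0;
        }
    }

    static List<String> run() {
        LOG.clear();
        new Derived();
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run());
        // prints [super constructor, field initialiser, instance initialiser, constructor body]
    }
}
```

From the bytecode alone there is no marker separating these regions, which is why a heuristic such as line numbers is needed to pick out the initialiser range.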

Once the Main application stage is completed, all mixins for the class are fully merged with the target. The applicator then makes two further passes over the merged code in order to process injectors:

3.3.3.2 Pre-Injection Application Stage

Since all mixin content is now present in the target class, injectors can be visited to determine which methods need to be modified and to discover their target opcodes in the candidate methods.

Each method is visited and inspected for any valid injection opcodes which are parsed out into InjectionInfo structs. The heavy lifting is handled by the base class InjectionInfo and the derived types - specific to each injector - are simply responsible for instancing the correct Injector (which performs the actual manipulation) and handling any other injection-type-specific parsing actions.

The InjectionInfo is responsible for parsing out the information from the injector annotation, and locating the candidate target methods for the injection based on the parsed data.

It's important to note that during the Pre-Injection stage, all injectors perform their initial scan of the methods in the class in order to locate their target methods, and the desired target opcodes within the methods. The search must be completed for all injectors before injections begin in order to allow preservation of the semantics of ordinal and other code-sensitive operations. In early versions of Mixin, injectors were applied progressively, and opcodes altered by earlier injectors could alter the results of later ones. From version 0.5 onwards, injection is now performed in two passes.
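The difference between progressive and two-pass injection can be shown with a toy model. In the sketch below a method is just a list of instruction objects and an "injector" inserts a call before the n-th INVOKE; every name here is hypothetical and no ASM types are involved.

```java
import java.util.ArrayList;
import java.util.List;

final class TwoPassSketch {

    // Toy instruction; identity (not text) is what the discovery pass holds on to.
    static final class Insn {
        final String text;

        Insn(String text) {
            this.text = text;
        }
    }

    static List<Insn> sample() {
        List<Insn> method = new ArrayList<>();
        for (String text : new String[] { "LOAD this", "INVOKE foo", "INVOKE bar", "RETURN" }) {
            method.add(new Insn(text));
        }
        return method;
    }

    // Finds the instruction with the given prefix and ordinal (0-based).
    static Insn find(List<Insn> method, String prefix, int ordinal) {
        int seen = 0;
        for (Insn insn : method) {
            if (insn.text.startsWith(prefix) && seen++ == ordinal) {
                return insn;
            }
        }
        throw new IllegalStateException("No match for " + prefix + " ordinal " + ordinal);
    }

    static void injectBefore(List<Insn> method, Insn target, String call) {
        method.add(method.indexOf(target), new Insn(call));
    }

    static List<String> texts(List<Insn> method) {
        List<String> out = new ArrayList<>();
        for (Insn insn : method) {
            out.add(insn.text);
        }
        return out;
    }

    // Discovery for both injectors completes before either modifies the method.
    static List<String> twoPass() {
        List<Insn> method = sample();
        Insn first = find(method, "INVOKE", 0);
        Insn second = find(method, "INVOKE", 1);
        injectBefore(method, first, "INVOKE handlerA");
        injectBefore(method, second, "INVOKE handlerB");
        return texts(method);
    }

    // The second search sees the call added by the first injector, shifting ordinal 1.
    static List<String> progressive() {
        List<Insn> method = sample();
        injectBefore(method, find(method, "INVOKE", 0), "INVOKE handlerA");
        injectBefore(method, find(method, "INVOKE", 1), "INVOKE handlerB");
        return texts(method);
    }

    public static void main(String[] args) {
        System.out.println(twoPass());
        System.out.println(progressive());
    }
}
```

With two passes handlerB lands before INVOKE bar as its author intended; applied progressively it lands before INVOKE foo, because the handler call injected first is itself counted as an INVOKE.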

Parsing and discovery of targets begins by parsing out defined slices into a MethodSlices holder. The slices themselves can be retrieved by ID when searching (though single-slice injectors will always use a fixed ID of empty string).

String references are first parsed out of the annotation into MemberInfo structs, with the incoming strings being pumped through the config-specified ReferenceMapper which uses the refMap generated at compile-time to convert the compiled-in strings to their obfuscated counterparts.

The parsed MemberInfos are then used as discriminators to locate matching methods in the class. Matching methods are added to an internal targets queue.

Finally, the declared injection points are parsed from the annotation into InjectionPoint instances which will be used later to match opcodes in the discovered targets. The injector is now fully parsed and ready to be run.

Preparing the Injector

Once parsing is complete, the injector is checked for validity (that it has matched one or more targets) and is discarded from further processing if no targets were identified.

To run the discovery pass, the InjectionInfo visits each of its queued targets and obtains a Target wrapper for each.

Target is a method wrapper obtained from the TargetClassContext which keeps track - for all injectors - of changes made to the method by each injector. This allows aspects such as the max-stack and max-locals to be manipulated in a reliable manner without injectors needing to track this information themselves. It also - as we shall see below - allows injectors to observe the effects of other injectors and act accordingly.

The parsed list of injection points and the Target are then fed into the Injector's find method, which runs the injection points on the target. Instead of storing the matched ASM "Instruction Nodes" (AbstractInsnNode) directly, the Injector wraps each discovered node in a TargetNode which allows nodes to be decorated with their nominating injection points, for later reference. This process returns a list of unique nominee nodes to the Injector, which represents the universe of instructions which are targeted.

Once the nominee nodes have been identified, they are submitted to the Target which creates an InjectionNode for each nominated node. The InjectionNode represents a unique handle to each underlying instruction which can track when the instruction is replaced by an injector.

The returned collection of InjectionNodes is then stored with the Target in the InjectionInfo, ready for application.

The final stage is to strip the injection annotations from the methods, since they are now fully parsed and are no longer required.

3.3.3.3 Injection Application Stage

Now that all injectors have determined their target methods and nominee nodes, the final pass of applying the mixin is the actual Injection stage.

As you might imagine, the parsed InjectionInfos from the previous pass are simply visited in order and allowed to inject their changes into each target. The Injector is called for each combination of Target and nominee InjectionNodes in order to perform its injection.

Importantly, any instruction replacements or additions made by injectors are marshalled through the Target, which decorates the InjectionNode with its replacement instruction if replaced. Later injectors can then decide dynamically whether they care that the instruction was replaced: for example an @Redirect injector can fail deterministically with a useful error; whereas an @Inject callback injector can happily proceed since even though the instruction was replaced, the location is still intact.

Injectors can also use the InjectionNode to marshal metadata to each other. For example the @Redirect injector can respect priority semantics by decorating the injection node with its identity and priority. If it encounters a node which was previously redirected by an injector with lower priority, it can choose to usurp it, or throw an error if the previous injector was @Final. Metadata from the InjectionPoint itself can also be propagated in this way.
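As a rough illustration of the priority handshake, here is a sketch using hypothetical NodeSketch and RedirectSketch classes rather than Mixin's own types. The "usurp" and "throw if final" branches follow the description above. What happens when the earlier redirect has equal or higher priority is an assumption of the sketch: it simply leaves the existing redirect in place.

```java
// Hypothetical stand-in for the shared handle on one target instruction.
final class NodeSketch {
    boolean replaced;      // set once any injector has swapped the instruction
    String redirectedBy;   // identity of the redirect that currently owns the node
    int redirectPriority;
    boolean redirectFinal;
}

// Hypothetical redirect injector carrying an identity, a priority and a "final" flag.
final class RedirectSketch {
    final String id;
    final int priority;
    final boolean isFinal;

    RedirectSketch(String id, int priority, boolean isFinal) {
        this.id = id;
        this.priority = priority;
        this.isFinal = isFinal;
    }

    // Returns true if this redirect ends up owning the node.
    boolean tryRedirect(NodeSketch node) {
        if (node.redirectedBy != null) {
            if (node.redirectPriority >= this.priority) {
                // Assumption: an equal or higher priority redirect is left alone.
                return false;
            }
            if (node.redirectFinal) {
                throw new IllegalStateException(
                    this.id + " cannot usurp final redirect " + node.redirectedBy);
            }
        }
        // Claim (or usurp) the node and decorate it for later injectors to see.
        node.replaced = true;
        node.redirectedBy = this.id;
        node.redirectPriority = this.priority;
        node.redirectFinal = this.isFinal;
        return true;
    }
}
```

Because the decoration lives on the node rather than in either injector, the two redirects never need to know about each other directly; each one only inspects what the previous owner left behind.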

Populating Accessors

In addition to processing the injectors, the final Injection stage also processes all @Accessor and @Invoker methods discovered during the attach phase of the preprocessor, generating the relevant methods with the appropriate proxy instructions and appending them to the target class.

Much like injectors, accessors and invokers are first parsed into AccessorInfo and InvokerInfo respectively, and then the appropriate generator is constructed in a second pass which fabricates the method and adds it to the target class.
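For a feel of what the fabricated method amounts to, the following is a hand-written source-level equivalent, not generated output. CounterSketch and CounterAccessorSketch are hypothetical names, and the @Mixin and @Accessor annotations a real accessor interface would carry are left out so that the sketch compiles without Mixin on the classpath.

```java
// Hypothetical accessor interface. In a real mixin this would be annotated with
// @Mixin (on the interface) and @Accessor (on the method); omitted here.
interface CounterAccessorSketch {
    int getCount();
}

// The target class roughly as it would behave after application: the interface
// has been added and a getter proxies the otherwise inaccessible private field.
final class CounterSketch implements CounterAccessorSketch {
    private int count = 3;

    // Stands in for the fabricated accessor: a field read followed by a return.
    @Override
    public int getCount() {
        return this.count;
    }
}
```

Callers then reach the private field by casting the target instance to the accessor interface, for example `((CounterAccessorSketch) counter).getCount()`.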

3.3.3.4 Upgrade Stage

Upgrading is a final pass over the mixin which deals with a problematic side effect of shadows and overwrites when combined with access transformers.

In general, it is anticipated that the access modifiers of a @Shadow or @Overwrite are going to either match the target or, in the case of shadow methods, possibly be more permissive (it's fully allowable to make a protected abstract shadow rather than a private one since it means the method body can be omitted). This means that in almost all cases, any invocations in the mixin of a private shadow method are going to use the correct INVOKESPECIAL opcode for the invocation.

However, consider the situation where an Access Transformer is used at runtime by a third party to increase the visibility of a base-class method from private to protected so that they can override the method in one of their own custom derived classes. If the mixin code were to silently proceed with its original INVOKESPECIAL, the override contract would be voided because the mixin code would still call the base-class method even in the derived class, because the INVOKESPECIAL would be statically-bound to the base class method and would not employ the vtable.
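The difference in observable behaviour can be shown at source level with two pairs of hypothetical classes. This is only an analogue: in the mixin case the opcode was fixed when the mixin was compiled against a private method and the visibility changed later at runtime, whereas here the compiler sees the final visibility up front. Which opcode javac emits for a private call also varies between Java versions, so the sketch demonstrates the binding semantics rather than specific bytecode.

```java
// Private case: the call in run() is bound to PrivateBaseSketch.describe().
class PrivateBaseSketch {
    private String describe() {
        return "base";
    }

    String run() {
        return describe();
    }
}

class PrivateDerivedSketch extends PrivateBaseSketch {
    // Not an override: a private method in the base class cannot be overridden,
    // so this is an unrelated method that happens to share the name.
    String describe() {
        return "derived";
    }
}

// Protected case: the call in run() is dispatched virtually.
class ProtectedBaseSketch {
    protected String describe() {
        return "base";
    }

    String run() {
        return describe();
    }
}

class ProtectedDerivedSketch extends ProtectedBaseSketch {
    @Override
    protected String describe() {
        return "derived";
    }
}
```

Calling `run()` on a PrivateDerivedSketch yields "base", while calling it on a ProtectedDerivedSketch yields "derived". Mixin code left with a statically-bound invocation after an access transformer has widened the method would behave like the first pair even though the third party expects the second.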

Mixin's upgrade functionality works around this by logging (at attach time) any methods which need to be "upgraded" from private to greater visibility to match their target. A final pass is then made over all invocations in the target class to ensure that each method is invoked using the correct opcode. This change is already made by the access transformer to incoming code and thus it is only mixin code which needs the upgrade performed.
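A much-simplified sketch of such a pass is shown below. InvokeSketch and UpgradeSketch are hypothetical, invocations are modelled as plain records rather than ASM nodes, and upgraded methods are keyed by name only (a real implementation would need the descriptor as well). The two opcode constants are the standard JVM values. Skipping constructors and invocations owned by another class (such as super calls) is an assumption of the sketch.

```java
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a method invocation instruction.
final class InvokeSketch {
    int opcode;
    final String owner;
    final String name;

    InvokeSketch(int opcode, String owner, String name) {
        this.opcode = opcode;
        this.owner = owner;
        this.name = name;
    }
}

final class UpgradeSketch {
    static final int INVOKEVIRTUAL = 182; // standard JVM opcode value
    static final int INVOKESPECIAL = 183; // standard JVM opcode value

    // Rewrites statically-bound calls to upgraded methods; returns the number changed.
    static int upgrade(String targetClass, Set<String> upgradedMethods, List<InvokeSketch> invocations) {
        int changed = 0;
        for (InvokeSketch insn : invocations) {
            boolean candidate = insn.opcode == INVOKESPECIAL
                && insn.owner.equals(targetClass)       // leave super calls alone
                && !insn.name.equals("<init>")          // constructors stay INVOKESPECIAL
                && upgradedMethods.contains(insn.name); // only methods logged at attach time
            if (candidate) {
                insn.opcode = INVOKEVIRTUAL;
                changed++;
            }
        }
        return changed;
    }
}
```

Restricting the rewrite to methods recorded during attach keeps the pass cheap and avoids touching invocations which are legitimately statically bound.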

3.4 Moving to a New Phase

If a new phase is triggered, processing once again returns to the preparation stage and the configs for the phase are ingested by the transformer. No state is released at this point, so existing configs and existing class metadata in the shape of ClassInfo are not discarded. New configs are simply added to the state and processing continues as before.

4. Conclusions

Mixin might at first glance appear to take a simple job - smushing bytecode together - and make it complex by hanging everything in a large amount of unnecessary framework. However, most of the complexity effectively boils down to supporting the following key aspects of applying mixins successfully:

  • Awareness of classes outside the scope of the mixin and its target is vital. This includes changes made to classes by other mixins, which makes intra-mixin functionality a possibility

  • Storing sufficient class meta-information to provide awareness of changes made to classes so that they can be propagated throughout the incoming code; viz. prefixed shadows and soft-implemented methods.

  • Detecting and adapting to changes made by other transformers in the environment: running the transformer chain on incoming code is expensive but unavoidable

  • Specialising mixins to the context of their target is much more complex than "take code from mixin class and merge into target class", especially for aspects such as lambda functions (in Java, effectively anonymous methods) and invocations of methods in supermixins.

  • Scaffolding for other features, such as runtime exporting, bytecode validation, handling different obfuscation types, in-place auditing checks, and compatibility level adaptation all add additional layers of complexity and require careful management

  • Support for Injectors represents an entire subsystem of functionality which rides on the Mixin core.

4.1 Aspects of Mixin Not Covered in This Article

I have, unapologetically, omitted a few topics from this article because they would have bloated it without purpose. I will likely address them in future technical articles:

4.1.1 Platform Handlers and Platform-Specific Functionality

Mixin doesn't exist in a vacuum. As a subsystem, mixin needs co-operation from the application framework in order to load and perform its functions. There is a not inconsiderable amount of functionality devoted to interacting with different target platforms in order to leverage mixin into the target environments. However, given that these features are extremely platform-specific and don't pertain to Mixin's own operations beyond bootstrapping, I have omitted them from discussion in this article.

4.1.2 The Annotation Processor

A huge part of Mixin as a framework, as discussed in the previous introduction-series article "Introduction to Mixins - Obfuscation in Mixins", is the need to traverse the obfuscation boundaries presented by development-time versus production-time. The agent responsible for this traversal is the Mixin Annotation Processor which acts at compile-time to produce the required data for mixins to successfully operate in an obfuscated environment.

Though I touched on the output of this system when mentioning ReferenceMapper above, discussion of the AP itself is outside the scope of this article.

4.1.3 Auditing Functionality

Various aspects of Mixin's operation are designed to provide development-time quality-of-life improvements to mixin authors. In particular exporting and live-decompiling of postprocessed bytecode is a core feature of Mixin, as well as user-enabled modules to run ASM's CheckClassAdapter and to audit the implementation of particular interfaces which are being mixed-in.

Since these parts of the mixin core do not directly pertain to the application of mixins, I have omitted them for clarity.

4.2 A Final Word

Mixin is constantly evolving, and my goal with Mixin is to stick to the core principles outlined above, in particular to do no harm. Nearly all of the framework surrounding the Mixin core is to enable it to do its job reliably and efficiently. Whilst this ultimately increases the complexity of Mixin as a whole, I believe that preserving the effectiveness of the system overall is worth this trade-off. I hope I have successfully demonstrated that everything Mixin does, it does for a reason, and that my motives are altruistic.

If you made it this far, congratulations. Put your comments on the back of a postcard and come visit #spongedev on espernet to say "hi".
