-
-
Notifications
You must be signed in to change notification settings - Fork 767
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
avm2: Update scope design #5249
Conversation
I'm pretty confused what the scope for Does anyone know what the correct behavior is, or have an swf that uses the opcode? |
(master has the same issue, but I thought it made more sense to report here. I can open a separate issue if that's preferable.) https://www.homestuck.com/story/2792?fl=1 seems to have an issue with scopes. This is the relevant decompiled ActionScript: public class Character extends MovieClip
{
public static var dirs:Array = ["Front","Left","Back","Right"];
public var facing:String;
public function Character()
{
super();
this.facing = dirs[int(Math.random() * 4)];
}
}
public dynamic class cg extends Character
{
public function cg()
{
super();
}
} (warning: the error seems to occur in an infinite loop, and Ruffle will peg your CPU until the tab is closed) The |
From further troubleshooting, I don't think this has anything to do with scopes. |
7bd9a7f
to
a3a3b04
Compare
Okay so the root cause of #5292 is that all instance traits of a class should have its class on the top of its outer scope chain. When I was initially designing the scopechain design I was under the assumption that the scopechain was completely controlled by the SWF, and that flash never manually adds scopes to it, except for creating the global script/package. However from my understanding, it looks like flash is actually manually adding the class object on top of the outer scope of its instance traits (and instance initializer), so I was actually wrong here. One issue that this is causing me is that ScopeChain's use copy-on-write, so manually adding the class object on top of the ScopeChain is a bit expensive. Also, the constructor/instance initializer of the class is currently being created before the class exists, but it somehow needs a reference to the class. To solve this problem in the safest way possible, i'm going to store the class object and the scopes separately, and keep the class in a RefCell, that way we can set the class of the scopechain after its creation, and we can also avoid copying the whole scopechain unnecessarily. |
So I ended up actually just allowing small mutations to ScopeChain's so that this will function properly. This seems to also be what AVMPlus does (see here). |
Okay, so, the documentation about the scope design in AVM2 is really lacking, so I can totally understand why this PR may be hard to review. I can also totally see why this may be confusing to other contributors who are used to the AVM1 design. To help make reviewing this PR as easy as possible, I am going to write a detailed "documentation" about the scope design, based on everything that I have seen from working on this. DefinitionsHere are some definitions that I created to help make things as clear as possible. Some are specific to Ruffle, and others are taken from Flash/AVMPlus.
So, both a captured scopechain and an outer scopechain are a ScopeStack vs ScopeChainBoth the ScopeStack and ScopeChain will be used in an activation for resolving properties, and they both store a vector of For example with exceptions, when an exception is encountered and we are in a try block, we need to enter the catch block in a clean state, which includes clearing the local scope stack. The compiler is allowed to assume the local scope stack will be empty here, but it should still have access to the outer scope, so it will generate code based on that. Scope instructionsThe AVM2 docs have a lot of misleading/incorrect information about scope instructions, so I thought it would be a good idea to rewrite these to be more accurate.
Running the script initializerThe execution of a script initializer is the only time where the outer scope of the activation will be empty. The script initializer will setup the outer scope for the classes/functions it contains. This will happen automatically by the SWF, and we are allowed to just trust the SWF to do the right thing here. Generally speaking, we should never need to manually modify a scopechain or scopestack ever. It is up to the SWF to do everything correctly. There is, however, a special case with classes, but we will get into that later. Capturing a ScopeChain for a functionCapturing a ScopeChain for a function occurs when the Capturing a ScopeChain for a function is very straightforward. We simply take the local scope stack of the current activation, and merge it with the outer scope of the current activation. To make things easier to understand, pretend that the outer scope has 2 scopes, A and B, and the local scope stack also has 2 scopes, C and D, and we are capturing the scopechain of this activation, here is what it would look like:
This operation is copy-on-write, because we don't want to actually mutate the outer scope, because other functions unrelated to this may be sharing it (will get more into this later). When the function is executed, this captured scopechain will be used as the outer scope of the activation that was made for the execution. Capturing a ScopeChain for a classCapturing a ScopeChain for a class occurs when the Classes have a separate captured scopechain for both its instance traits and class traits. The captured scopechain created for class traits is created the exact same way as capturing a scopechain for a function. The captured scopechain for instance traits should be the exact same as the captured scopechain for the class traits, but with the class object on the end of it. That way when an instance method is executed, the class object will be on its outer scope, so the method will have access to the class's static properties. Resolving a definitionResolving a definition is actually pretty straightforward. We start with searching for the definition using the current local scope stack of the activation, from right to left. If it can't find it, we search the outer scope instead, also from right to left. If that doesn't find the definition either, we search the current domain instead. The domain is attached to the outer scope, so this will happen automatically (see When we are searching the scopes of the activation (this applies to local scope stack and outer scope), we only search the traits of the scope objects. There are 2 conditions however that allow us to search dynamic properties as well:
|
Would it make sense to put this in rustdoc?
…On Thu, Sep 16, 2021, 3:05 AM EmperorBale ***@***.***> wrote:
Okay, so, the documentation about the scope design in AVM2 is *really
lacking*, so I can totally understand why this PR may be hard to review.
I can also totally see why this may be confusing to other contributors who
are used to the AVM1 design. To help make reviewing this PR as easy as
possible, I am going to write a detailed "documentation" about the scope
design, based on everything that I have seen from working on this.
Definitions
Here are some definitions that I created to help make things as clear as
possible. Some are specific to Ruffle, and others are taken from
Flash/AVMPlus.
-
*Scope* - A scope refers to the Scope struct (avm2::scope::Scope<'gc>).
The Scope struct is a wrapper for an Object. A Scope also holds a
boolean value that represents if this is a with scope (I will get more
into with scopes later).
-
*Scope Object* - A *Scope Object* refers to an object wrapped inside a
Scope.
-
*ScopeStack (scope stack)* - A ScopeStack refers to the ScopeStack
struct (avm2::scope::ScopeStack). It is really just a Vec<Scope<'gc>>.
Scopes can be pushed on a scope stack, and popped off. A *scope stack*
is not meant to be shared - it should be owned only by whatever created it
(usually an activation).
-
*Local ScopeStack (local scope stack)* - This is a high level
definition, meaning it's not actually used in Ruffle, but it makes thinking
about what is happening easier to understand. All activations have their
own *scope stack*, so a *local scope stack* refers to the *scope stack*
that is being used by the current activation.
-
*ScopeChain* - Refers to the ScopeChain struct (avm2::scope::ScopeChain).
A ScopeChain is meant to be an immutable form of a ScopeStack, with
extra abilities. It is technically possible to mutate a ScopeChain,
but there is currently only 1 situation where this is needed, and I will
get into that special case later. Despite having the ability to mutate a
ScopeChain, it should feel immutable. A ScopeChain holds its scopes in
a Vec<Scope<'gc>>, but unlike a ScopeStack, it is wrapped in a gc
pointer, that way it can be shared & copied freely. A ScopeChain also
holds a Domain, which should be used during multiname resolution as a last
resort.
-
*Captured ScopeChain (captured scope)* - A *captured scopechain* is
used for remembering what the *local scope stack* and *outer scope* of
an activation looked like. This is used when functions or classes are
created. When we are creating a *captured scopechain*, we refer to it
as "capturing a scopechain". The technique we use to capture a scopechain
differs depending on whether it is a class or function, so i'll put more
details about this in another section. Behind the scenes, a *captured
scopechain* is really just a ScopeChain.
-
*Outer ScopeChain (outer scope)* - An *outer scopechain* refers to a *captured
scopechain* that is actively being used in an activation. Behind the
scenes, an *outer scopechain* is really just a ScopeChain.
-
*Global scope* - The *global scope* refers to the object at the bottom
of the *outer scope*. If the *outer scope* is empty, use the bottom of
the *local scope stack* instead. If both are empty, use undefined
instead.
So, both a *captured scopechain* and an *outer scopechain* are a
ScopeChain. We give it a different name depending on the context: When
the ScopeChain is being used in an activation that is executing, we refer
to it as the *outer scope* of the activation. In any other case we refer
to it as a *captured scopechain*.
ScopeStack vs ScopeChain
Both the ScopeStack and ScopeChain will be used in an activation for
resolving properties, and they both store a vector of Scope. The *local
scope stack* of an activation is created per activation, and is owned
exclusively by the activation. A ScopeChain (known as the *outer scope*
in an activation) may be shared, so there may be multiple references to it.
The *local scope stack* is constantly mutating, so for example it may
look completely different depending on where we are in the function. On the
other hand, the *outer scope* will always stay the same, and the function
executing does not have the ability to modify it for any reason. The
difference here is very important for getting the correct behavior while
emulating AVM2, because AVM2 compilers will generate code that *depends*
on this.
For example with exceptions, when an exception is encountered and we are
in a try block, we need to enter the catch block in a clean state, which
includes clearing the *local scope stack*. The compiler is allowed to
assume the *local scope stack* will be empty here, but it should still
have access to the outer scope, so it will generate code based on that.
Scope instructions
The AVM2 docs have a lot of misleading/incorrect information about scope
instructions, so I thought it would be a good idea to rewrite these to be
more accurate.
-
pushscope - Pops an object off the stack, and pushes it onto the *local
scope stack*. This object is now referred to as a *Scope Object*.
-
pushwith - Pops an object off the stack, and pushes it onto the *local
scope stack* as a with scope. When we are resolving a property in a
with scope, we allow searching dynamic properties of the *scope object*
.
-
popscope - Pops a scope off the *local scope stack*.
-
getscopeobject - Gets a *scope object* in the *local scope stack*
using a supplied index. Index 0 refers to the bottom of the *local
scope stack*. Push the result on the stack.
-
getouterscope - Gets a *scope object* in the *outer scope* using a
supplied index. Index 0 refers to the bottom of the *outer scope*.
Push the result on the stack.
-
getglobalscope - Gets the global scope of this activation. The global
scope refers to the object at the bottom of the *outer scope*. If the
outer scope is empty, use the bottom of the *local scope stack*
instead. If both are empty, use undefined instead. Push the result on the
stack.
(This same rule applies to all globalscope opcodes, so i'm not going
to write them all here)
Running the script initializer
The execution of a script initializer is the only time where the *outer
scope* of the activation will be empty. The script initializer will setup
the *outer scope* for the classes/functions it contains. This will happen
automatically by the SWF, and we are allowed to just trust the SWF to do
the right thing here. Generally speaking, we should never need to manually
modify a scopechain or scopestack ever. It is up to the SWF to do
everything correctly. There is, however, a special case with classes, but
we will get into that later.
Capturing a ScopeChain for a function
Capturing a ScopeChain for a function occurs when the newfunction opcode
is executed.
Capturing a ScopeChain for a function is very straightforward. We simply
take the *local scope stack* of the current activation, and merge it with
the *outer scope* of the current activation. To make things easier to
understand, pretend that the *outer scope* has 2 scopes, A and B, and the *local
scope stack* also has 2 scopes, C and D, and we are capturing the
scopechain of this activation, here is what it would look like:
OUTER SCOPE LOCAL SCOPE STACK CAPTURED SCOPECHAIN
[A, B] + [C, D] = [A, B, C, D]
(COPYING)
This operation is copy-on-write, because we don't want to actually mutate
the *outer scope*, because other functions unrelated to this may be
sharing it (will get more into this later). When the function is executed,
this *captured scopechain* will be used as the *outer scope* of the
activation that was made for the execution.
Capturing a ScopeChain for a class
Capturing a ScopeChain for a class occurs when the newclass opcode is
executed.
Capturing a ScopeChain for a class starts off the same way as capturing a
scopechain for a function. Once we have our *captured scopechain*, we
give this *captured scopechain* to all instance + class traits of this
class (including the constructor method!). The *captured scopechain*
should be *shared* here, no trait should have its own copy.
Here is where things get a bit complicated. We need to manually add the
class object to the end of this *captured scope*. *This is the only time
where we ever need to manually add a scope to anything.* Before we do
anything however, we first finish setting up the class completely (all
traits should be installed, class initializer should have ran, etc). Now
that we have our fully initialized class, we add it to the *captured
scopechain*, here is what that might look like:
CAPTURED SCOPECHAIN
[A, B, C, D] + class object = [A, B, C, D, class object]
(MUTATING)
Since all traits of the class share the same *captured scopechain*, this
will automatically update for everything.
Resolving a definition
Resolving a definition is actually pretty straightforward. We start with
searching for the definition using the current *local scope stack* of the
activation, from right to left. If it can't find it, we search the *outer
scope* instead, also from right to left. If that doesn't find the
definition either, we search the current domain instead. The domain is
attached to the *outer scope*, so this will happen automatically (see
avm2::scope::ScopeChain::find).
When we are searching the scopes of the activation (this applies to *local
scope stack* and *outer scope*), we only search the traits of the *scope
objects*. There are 2 conditions however that allow us to search dynamic
properties as well:
1. Scope is a with scope.
2. This is the *global scope*.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#5249 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AB7X32MVFP5JURN7LSAZHVTUCGJMTANCNFSM5DTP3XOQ>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
40fb76b
to
94ec203
Compare
I took a look and reviewed the PR. Everything looks OK to merge - er, well, at least after a rebase. I concur with @leo60228 - we probably should include @EmperorBale's documentation in comments. |
avm2: Update scopechain design
So since this is getting closer to getting merged, there are a few more changes that should be made, but i'm not sure if I want to do it in this PR.
|
Also LGTM after the rebase - only thing I'd appreciate to have is some more tests, for example for new opcode and code that used to throw slot errors. |
(And as for caching, let's leave that for later maybe?) |
So I wasn't able to use rabcasm to create the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the difficulty of reproducing the slot errors in custom-compiled code, I guess we can just push this PR and add further tests as we go?
I'm giving the approve, but I'll wait for kmeisthax or Mike to press the button :)
The previous scope design was based on the AVM1 scope design, which works, but it can cause some pretty bizarre bugs. This PR makes a lot of changes to how AVM2 handles resolving properties, so this will probably take a while to be ready to merge. This PR attempts to switch the scope design to match the real AVM2 design. To accomplish this, I did the following:
Separate the global object and domain. Previously Ruffle was combining the global object and domain together and putting that at the beginning of the scope stack/chain. While this technically works for a lot of cases, this isn't what is actually happening in flash. In flash, the global object is just an instance of the global class, whose dynamic properties match the domain it is associated with. Unlike other objects on the scope stack/chain, we search the dynamic properties AND traits of the global object (usually we only search traits). Note that we are supposed to treat any object at the beginning of the scope chain as a global object, even if it not an instance of the global class.
Switch the scope design to a vec-stack design. All activations have 2 scopes: A local scope which is created per activation, and an inherited scope which refers to the "outer scope" of the activation. A local scope is a
ScopeStack
which is owned by the activation, and allows mutations, such as pushing new scopes on, and popping scopes off. The outer scope is aScopeChain
which is similar, except aScopeChain
is read-only, and can be shared freely. AScopeChain
also keeps a reference to the domain that was in use during it's initial creation (usually script initialization). When we lookup a property on a ScopeChain, we use that domain as a last resort when no other scopes can find the property. I left comments with more details inscope.rs
.When a function or class is created, they take the local scope of the current activation (scope stack) and "chain" it on top of the outer scope, creating a brand new outer scope for the function or class. If it is a class, all instance traits will use the scope of the class. If it is a function, it will be executed with the new scope as it's outer scope. This is scope inheritance, Ruffle was also doing this before, just with the wrong design.
The global scope refers to the bottom of the outer scope. If the outer scope is empty (this will only happen during script initialization), the global scope refers to the bottom of the local scope (scope stack). The global scope is almost always going to be an instance of the global class, but this rule is not enforced, and can be any object. When we search the global scope, we also search it's dynamic properties (as I said earlier).
All methods, including builtin methods have an outer scope. Previously when a builtin method was executed, it used the scope of the caller, which should never happen. During the script initialization for the global script (
load_player_globals
), we create a ScopeChain that connects to the global object, and give that to all builtin methods. That way, the global scope of all builtin methods is actually its global scope (before it was the global scope of the caller, which is wrong).Give all builtin methods a
code_context
. The change I made above makes it so builtin methods can't access the Domain of the caller, which is somewhat limiting. AVMPlus solves this by giving all builtin methods aCodeContext
which it can use to gain access to the domain of the AS3 caller, so I did the same. All activation's created usingfrom_builtin
have a specialcode_context
field that can be used by builtin methods to access the Domain of the caller. This is very useful for methods likeflash.utils.getDefinitionByName
andflash.system.ApplicationDomain.currentDomain
, which requires this type of info.Now that Ruffle is using an outer scope and local scope design, it is possible to implement the undocumented
getouterscope
opcode, which I also did in this PR.Implement functionality for
pushwith
. Scopes pushed onto the scope stack using this opcode should also search dynamic properties.