Support obfuscation of symbolic information in the precompiler #30524

Closed
mraleph opened this issue Aug 23, 2017 · 10 comments
Labels
area-front-end Use area-front-end for front end / CFE / kernel format related issues. area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. front-end-kernel P2 A bug or feature request we're likely to work on

Comments

@mraleph
Member

mraleph commented Aug 23, 2017

IMPORTANT: If you are reading this bug because you tried enabling obfuscation and got the following warning

Warning: This VM has been configured to obfuscate symbol information which violates the Dart standard.
See dartbug.com/30524 for more information.

This warning is there just to remind you that you are now building your Dart application in a mode that behaves differently from the normal execution mode. In this mode some things (e.g. printing the string representation of Type objects, stack traces, noSuchMethod) will behave slightly differently than expected from a Dart program running in the default mode, because identifiers are transparently obfuscated.

If you understand what you are doing and how obfuscation affects semantics of your program - you should feel free to ignore this warning.

@mraleph mraleph added the area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. label Aug 23, 2017
@mraleph mraleph self-assigned this Aug 23, 2017
mraleph added a commit that referenced this issue Aug 25, 2017
…precompiler

Obfuscation is controlled by obfuscate flag in Dart_IsolateFlags.

Obfuscation of identifiers is performed during script tokenization - when TokenStream is generated from the source. All kIDENT and kINTERPOL_VAR tokens are renamed consistently using a persistent obfuscation map stored in ObjectStore::obfuscation_map.

Some identifiers (pseudo-keywords, arithmetic operators, builtin recognized methods and entry-points) are not renamed to keep name based lookups from breaking. All other identifiers are renamed.

Constant instances of Symbol-s (both created via literal syntax #ident and using constant constructor const Symbol("ident")) are renamed consistently with corresponding identifiers.

Script urls and Library urls and names are also obfuscated.

Obfuscation map can be dumped as a JSON array at the end of precompilation using Dart_GetObfuscationMap API.

BUG=#30524
R=rmacnak@google.com

Review-Url: https://codereview.chromium.org/3003583002 .
@mraleph
Member Author

mraleph commented Aug 25, 2017

7d52317 provides initial implementation of the obfuscation.

gen_snapshot and dart_bootstrap now support two additional options:

  • --obfuscate which enables obfuscation;
  • --save-obfuscation-map=<filename> which makes VM store a mapping between original names and obfuscated ones in the given filename. The mapping is encoded as a JSON array [original_name_0, obfuscated_name_0, original_name_1, obfuscated_name_1, ...].
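The flat array layout described above can be turned back into a lookup table with a few lines of Dart. This is a minimal sketch: the function names `parseObfuscationMap` and `invert` and the sample contents are made up for illustration, and only the `[original, obfuscated, ...]` pairing is taken from the description above.

```dart
import 'dart:convert';

/// Decodes the flat JSON array written by --save-obfuscation-map into a
/// map from original name to obfuscated name.
Map<String, String> parseObfuscationMap(String json) {
  final entries = (jsonDecode(json) as List).cast<String>();
  if (entries.length.isOdd) {
    throw FormatException('expected an even number of entries');
  }
  final result = <String, String>{};
  for (var i = 0; i < entries.length; i += 2) {
    result[entries[i]] = entries[i + 1];
  }
  return result;
}

/// Flips the map so obfuscated names can be looked up, e.g. when reading
/// an obfuscated stack trace.
Map<String, String> invert(Map<String, String> byOriginal) =>
    {for (final e in byOriginal.entries) e.value: e.key};

void main() {
  // Hypothetical file contents; real obfuscated names will differ.
  const sample = '["secretMember1", "a", "secretMember2", "b"]';
  final byOriginal = parseObfuscationMap(sample);
  print(byOriginal['secretMember1']); // prints: a
  print(invert(byOriginal)['b']); // prints: secretMember2
}
```

In practice the string would come from the file passed to --save-obfuscation-map rather than from a constant.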

Implementation Details

All identifiers (except those that are forced to have an identity renaming) are renamed when TokenStream is created for the Script.

At the end of precompilation we also collect and rename constant Symbol instances, library names, and the URIs of scripts and libraries.
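To picture what "renamed consistently" means, here is a toy model of a persistent renaming map. It is not the VM's implementation: the class `ToyObfuscator`, the `id0`, `id1`, ... naming scheme and the preserved list are all invented for illustration. The only property it tries to show is that the same identifier always maps to the same obfuscated name, while preserved identifiers get an identity renaming.

```dart
/// Toy model of a persistent obfuscation map. The naming scheme and the
/// class itself are hypothetical, not taken from the VM.
class ToyObfuscator {
  ToyObfuscator(Iterable<String> preserved) {
    for (final name in preserved) {
      _renames[name] = name; // identity renaming
    }
  }

  final Map<String, String> _renames = {};
  int _next = 0;

  /// Returns the same result every time it sees the same identifier.
  String rename(String identifier) =>
      _renames.putIfAbsent(identifier, () => 'id${_next++}');
}

void main() {
  final obfuscator = ToyObfuscator(['main', 'toString']);
  final tokens = ['main', 'secretMember1', 'toString', 'secretMember1'];
  // prints: [main, id0, toString, id0]
  print(tokens.map(obfuscator.rename).toList());
}
```

Because the map outlives any single script, an identifier used in two different libraries still ends up with one obfuscated name, which is what keeps cross-library calls working.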

⚠️ Because all identifiers are renamed, methods like Object.runtimeType, Type.toString, Enum.toString, StackTrace.toString and Symbol.toString (for constant symbols or those generated by the runtime system) will return obfuscated results. Any code or tests that rely on these will break.
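As an illustration of the kind of code this affects, the sketch below contrasts a check that depends on the textual name of a type with one that does not. The class `Foo` and both helper functions are hypothetical. The assumption is that the string literal 'Foo' is left untouched while the class name reported by Type.toString is obfuscated.

```dart
class Foo {}

/// Fragile: compares against a source-level name, which an obfuscated build
/// no longer reports from Type.toString.
bool isFooByName(Object o) => o.runtimeType.toString() == 'Foo';

/// Robust: compares Type objects (or use `o is Foo`), no names involved.
bool isFooByType(Object o) => o.runtimeType == Foo;

void main() {
  final foo = Foo();
  print(isFooByName(foo)); // true only when names are not obfuscated
  print(isFooByType(foo)); // true in either mode
}
```

The same reasoning applies to tests that match on the text of stack traces or enum values.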

⚠️ Only constant instances of Symbol are renamed

This means that in the code below const Symbol('secretMember1') will be renamed consistently with secretMember1 identifier in the main(), and similarly #secretMember2 will be renamed consistently with secretMember2 identifier. However new Symbol('secretMember3') will not be renamed in any way because it is not a compile time value. Also notice that resulting snapshot will contain string 'secretMember3' and will not contain strings 'secretMember1' or 'secretMember2'.

class A {
  dynamic noSuchMethod(Invocation inv) {
    if (inv.memberName == const Symbol('secretMember1')) {
      return 'value1';
    } else if (inv.memberName == #secretMember2) {
      return 'value2';
    } else if (inv.memberName == new Symbol('secretMember3')) {
      return 'value3';
    }
    return 'unknown';    
  }
}

main() {
  // The receiver must be dynamic so that the calls reach A.noSuchMethod.
  final dynamic obj = new A();
  print(obj.secretMember1);
  print(obj.secretMember2);
  print(obj.secretMember3);
  // Unobfuscated AOT build will print: value1, value2, value3
  // Obfuscated AOT build will print: value1, value2, unknown
}

Known Limitations

The obfuscator has a list of identifiers it can't rename for implementation reasons, because the compiler, runtime or embedder expects to resolve these identifiers dynamically by their unobfuscated names.

The following symbols are not renamed:

  • entry points specified by embedder and VM internal entry points (e.g. identityHashCode);
  • pseudo-keywords (e.g. show, hide, await, async, etc.);
  • operator names (e.g. *, []=);
  • internal Dart VM symbols (see runtime/vm/symbols.h);
  • names of classes and methods that are recognized by the compilation pipeline and intrinsifier (see runtime/vm/method_recognizer.h).

The full list can be found in the source or extracted from the obfuscation map generated by the VM.
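One way to extract that list from a saved map is to look for entries whose obfuscated name equals the original. This sketch assumes that preserved identifiers show up in the map as identity pairs, which the description suggests but does not spell out. The function name `preservedNames` and the sample data are hypothetical.

```dart
import 'dart:convert';

/// Returns names whose obfuscated form equals the original. Assumes that
/// preserved identifiers appear in the map as identity pairs.
List<String> preservedNames(String obfuscationMapJson) {
  final entries = (jsonDecode(obfuscationMapJson) as List).cast<String>();
  final preserved = <String>[];
  for (var i = 0; i + 1 < entries.length; i += 2) {
    if (entries[i] == entries[i + 1]) preserved.add(entries[i]);
  }
  return preserved;
}

void main() {
  // Hypothetical file contents in the [original, obfuscated, ...] layout.
  const sample = '["show", "show", "secretMember1", "a", "[]=", "[]="]';
  print(preservedNames(sample)); // prints: [show, []=]
}
```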

Future Work

  • This obfuscator is implemented in the classic pipeline; when Kernel becomes the default pipeline, similar obfuscation will have to be implemented there too;
  • Some of the limitations on the obfuscation could be lifted if internal lookups in the bootstrapper and in the compilation pipeline used the obfuscator to mangle the names.

@mraleph
Member Author

mraleph commented Feb 3, 2018

@kmillikin I wonder if we can find somebody to look at hooking up obfuscation in the kernel pipeline. It seems @mehmetf is on track to hook it up internally, which means we would need to support it in Dart 2 so that it does not become a blocker for switching to Kernel. Currently it is all done in the tokenizer :-/ It should probably be done in the Kernel loader.

@mraleph mraleph added area-kernel P2 A bug or feature request we're likely to work on labels Feb 3, 2018
@mraleph
Member Author

mraleph commented Feb 3, 2018

marking it P2 for now - because there are more important things for us to tackle.

@mehmetf
Contributor

mehmetf commented Feb 4, 2018

Thanks for the update.

so that it does not become a blocker for switching to Kernel. Currently it is all done in the tokenizer

I am assuming and hoping that this implies we will not be losing this functionality with switch to Dart 2. Please clarify.

@mraleph
Member Author

mraleph commented Feb 4, 2018

If we were to switch today - then yes, obfuscation would stop working. I don't anticipate us switching before we fix this anyway. I will make sure that this is tracked appropriately.

@mraleph
Member Author

mraleph commented Feb 15, 2018

@jensjoha if you have some cycles, please check out if you can hook up obfuscation in Kernel mode too.

@mraleph mraleph assigned jensjoha and unassigned mraleph Feb 15, 2018
@mehmetf
Contributor

mehmetf commented Feb 15, 2018

@mraleph qq about obfuscation and the Dart entry points that the Android shell uses. We are experimenting with a solution to allow teams to embed Flutter into their existing apps. One way to do this is to create multiple entry points in Dart and call into these from Android. Obfuscation would obviously break this.

My question is how does it work today with a single entrypoint (main)? Does obfuscation system know not to mess with main()? Do you have a proposal on how to support this use case? (e.g. don't obfuscate methods that start with a particular prefix?)

@mraleph
Member Author

mraleph commented Feb 15, 2018

@mehmetf in reality there are already multiple entry points into Flutter program - anything that is looked up by its symbolic name from C++ side is a sort of entry point.

Obfuscation builds on top of the same mechanism that the AOT compiler uses and does not obfuscate entry points that are specified by the embedder. In the case of Flutter this information comes from a file like https://github.com/flutter/engine/blob/master/runtime/dart_vm_entry_points.txt that is fed into the VM through gen_snapshot flags.

You can build on top of this.

Another alternative is to build on top of main method:

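// main0 stands for any additional entry point defined elsewhere in the program.
// The default keeps args non-null when main is invoked without arguments.
void main([List<String> args = const []]) {
  switch (args.length == 1 ? args[0] : 'default') {
    case 'main0': main0(); break;
  }
}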

@jensjoha
Contributor

Kernel support landed yesterday (df92e14). Please test and give any feedback.

@jensjoha jensjoha removed their assignment Feb 28, 2018
@mraleph
Member Author

mraleph commented Feb 28, 2018

I consider this as done, even though there might be still some work to make obfuscator more obfuscaty. Thanks for handling this @jensjoha !
