Diary - JS scope insulation #3

Open
sashafirsov opened this issue Mar 4, 2018 · 25 comments
Labels
diary question Further information is requested

Comments

@sashafirsov
Member

sashafirsov commented Mar 4, 2018

EPA / embed-page notes on JS scope, window and application security of scope insulation.

  • anonymous (default) scope gives complete insulation
  • none is the global scope with no insulation; useful for HTML include
  • scope="xxx" is a named scope; variables and APIs are shared between scopes with the same name

How to hide the page-scope (global) JS objects and substitute them with our own implementation?

eval() with closure-defined global overrides appeared to be a way to insulate embed-page content from the host page. Here are the notes on how the implementation evolved.
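A minimal sketch of the closure-override idea, assuming the Function constructor as a stand-in for eval(). The helper name runInsulated and the fake objects are hypothetical and are not part of embed-page.

```javascript
// Hypothetical helper: runs `code` with `window` and `document` shadowed
// by substitutes, so the code never touches the host page objects by name.
function runInsulated(code, fakeWindow, fakeDocument) {
  // Parameters of the generated function shadow the real globals.
  const runner = new Function('window', 'document', code);
  return runner.call(fakeWindow, fakeWindow, fakeDocument);
}

const fakeDocument = { title: 'embedded page' };
const fakeWindow = { document: fakeDocument };

const seen = runInsulated(
  'window.answer = 42; return document.title;',
  fakeWindow,
  fakeDocument
);
console.log(seen, fakeWindow.answer);
```

Only names passed as parameters are shadowed here; any other host global stays reachable, which is why the later notes look at exposing a whole substitute window.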

@sashafirsov sashafirsov changed the title Diary - scope insulation Diary - JS scope insulation Mar 4, 2018
@sashafirsov
Member Author

sashafirsov commented Mar 4, 2018

How to expose the window properties as global objects (in EPA scope)?
The top-level objects are window and document. The trick is to expose the window properties as part of the scope.

The pill is sweetened by the fact that the initial window content can be used to populate the scope. Later changes to window and global variables still need handling, but that is not the usual case, so it can be ignored for now. From a security standpoint this does not weaken the insulation, apart from making the EPA environment detectable, which is possible in multiple other ways anyway.

The with() statement will do the exposure trick.
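A small sketch of the exposure trick, under the assumption that the script text is compiled through the Function constructor (the with statement is rejected in strict mode and in modules). The names runWithScope, EPA_scope and epaWindow are hypothetical.

```javascript
// Compile `code` so that every bare identifier is first looked up on `scope`.
function runWithScope(code, scope) {
  // Function-constructor code is sloppy mode, so `with` is allowed here.
  const runner = new Function('EPA_scope', 'with (EPA_scope) {' + code + '\n}');
  return runner(scope);
}

// Stand-in for a substitute window object populated with initial content.
const epaWindow = { innerWidth: 320, name: 'epa-1' };

// `name` and `innerWidth` resolve to epaWindow properties, not to host globals.
const result = runWithScope('name = name + ":" + innerWidth; return name;', epaWindow);
console.log(result);
```

Names that are not yet properties of the scope object still fall through to the outer scope, which matches the note above that later additions need separate handling.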

@sashafirsov
Member Author

sashafirsov commented Mar 5, 2018

How to trap global location object assignment?
The location object is tricky because it should be identical between window.location, document.location and the global location object. The need to override assignment of a string to the object itself (to reset the embed-page content) is a special challenge.

While window.location="someUrl" can be overridden with a setter on the window object property, the "global" location object in the EPA scope cannot have a setter associated, unless name resolution involves the with statement:

    with(window) { location='abc'; /* resolves to window.location='abc' */ }
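The following sketch shows the setter being hit through with-based name resolution. The epaWindow object and the navigations array are hypothetical stand-ins; a real implementation would reset the embed-page content inside the setter.

```javascript
const navigations = [];

// Substitute window whose `location` property has an accessor pair.
const epaWindow = {};
Object.defineProperty(epaWindow, 'location', {
  get() {
    return { href: navigations[navigations.length - 1] || 'about:blank' };
  },
  set(url) {
    // A real setter would reload the embed-page content here.
    navigations.push(String(url));
  },
  enumerable: true
});

// Sloppy-mode code compiled at run time, because `with` is not allowed in modules.
const run = new Function('window', 'with (window) { location = "abc"; }');
run(epaWindow);
console.log(navigations);
```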

@sashafirsov sashafirsov added question Further information is requested diary labels Mar 5, 2018
@sashafirsov
Member Author

At this stage the solution chosen for location=url is to convert it into window.location=url during script load. It does not solve the eval( "location=url" ) case, which will complain about an attempt to override a const variable. Other cases like location.href, location.assign, etc. seem to work fine and are covered by unit tests.
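As an illustration of such a load-time conversion, here is a naive regex-based sketch. It is an assumption about how the rewrite could look, not the actual embed-page code; it ignores string literals and comments, and the function name is hypothetical.

```javascript
// Rewrite bare `location = ...` assignments into `window.location = ...`.
// Skips property access (x.location), longer identifiers (relocation),
// comparisons (==, ===) and arrow functions (=>).
function rewriteLocationAssignments(source) {
  return source.replace(/(^|[^.\w$])location\s*=(?![=>])/g, '$1window.location =');
}

console.log(rewriteLocationAssignments('location = "/next";'));
console.log(rewriteLocationAssignments('if (location == "/a") { x.location = 1; }'));
```

Text passed to eval() at run time never goes through this rewrite, which is the unsolved case mentioned above.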

@sashafirsov
Member Author

sashafirsov commented Jul 8, 2018

While thinking about window management in <embed-page>, the convention of cross-app grouping has been revised again. JS frames (aka windows) reference each other via name, or the reference is obtained directly during frame creation by window.open(). Note that the frame content can be (re-)set using a target attribute on <a> or <form> which matches the frame name. Security hole: this lets a 3rd-party app compromise the visual content of an identified session.

Frame names are scoped by the browser identity session, which is usually either global or incognito. That is definitely insufficient, as multiple identities could be in use and should preferably not correlate with each other.

To avoid cross-identity session overlap, in <embed-page> (and eventually in the browser) the target attribute could be used for additional scoping of frames (instances of embed-page). That way a new window/embed-page instance will inherit the target and will be available via window.frames[], window.parent, ... in the same scope. Windows with another target will not have access to the given scope and as a result will not be able to manipulate its content, e.g. by replacing content URLs, closing windows or opening new ones.

The main window will have a collection of top-level targets and the ability to operate on all app instances under the same target: close, freeze, save, (re-)open. It is similar to a group operation over a bookmarks folder, except that it executes on the host web page with microapplication content within embed-page.

@sashafirsov
Member Author

sashafirsov commented Jul 8, 2018

The target attribute on <embed-page> is dedicated to the "identity session".

It does not mean that hyperlinks and forms will target a specific frame; rather, it defines the scope in which those instances resolve frame names, including named frames and special names like _top. That way it would be possible to create two or more "targets" serving different cross-site accounts, e.g. an app authorized by Facebook showing some FB pages along with Disqus threads associated with the same identity. If you need to create another set of windows with a different identity, it would be done by starting an "identity session" with its own target value.
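To make the scoping rule concrete, here is a rough sketch of a frame registry keyed by target. The FrameRegistry class and its methods are hypothetical; the notes do not describe an actual data structure.

```javascript
// Frame names are resolved only within the target ("identity session") they belong to.
class FrameRegistry {
  constructor() {
    this.targets = new Map(); // target -> Map(frame name -> frame)
  }
  register(target, name, frame) {
    if (!this.targets.has(target)) this.targets.set(target, new Map());
    this.targets.get(target).set(name, frame);
  }
  resolve(target, name) {
    const frames = this.targets.get(target);
    return frames ? frames.get(name) : undefined;
  }
  // Group operations (close, freeze, save) would iterate over this list.
  framesOf(target) {
    return [...(this.targets.get(target) || new Map()).values()];
  }
}

const registry = new FrameRegistry();
registry.register('work', 'inbox', { url: '/mail?account=work' });
registry.register('personal', 'inbox', { url: '/mail?account=personal' });

console.log(registry.resolve('work', 'inbox').url);
console.log(registry.resolve('guest', 'inbox'));
```

The same frame name can exist under two targets without either side being able to reach the other.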

@sashafirsov
Member Author

All scripts inside of <embed-page> are meant to run in the same scope, meaning they share the global window.xxx functions and variables. The JS engine does not give access to enumeration of scope variables, so the implementation has to use some tricks to

  1. hide <embed-page> globals from the container window.
  2. enumerate the variables/functions from EPA scripts.
  3. detect changes to variable values, referenced as a global variable or as window.xxx, between the beginning and the end of a script section.

To insulate the scope from the container window, scripts are executed within <script type="module">, which automatically scopes ES6-style declared variables. Still, var or undeclared variable assignments infect the container window. The latter could be trapped by preserving the container window state, adding the container window changes to epa.globals and finally restoring the original window state.

In order to share the same scope, each sub-script or element.onxxx event handler has to be executed within a common script section.

poc/global-scope.html covers that behavior.
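The preserve / collect / restore step could look roughly like the sketch below. A plain object stands in for the container window, and the function name is hypothetical; properties deleted by the script are not handled.

```javascript
// Run `script`, then move everything it added or changed on `hostWindow`
// into `epaGlobals` and put the host object back into its original state.
function runAndCollectGlobals(hostWindow, epaGlobals, script) {
  const before = new Map(Object.entries(hostWindow)); // preserved state
  try {
    script(hostWindow);
  } finally {
    for (const key of Object.keys(hostWindow)) {
      const changed = !before.has(key) || before.get(key) !== hostWindow[key];
      if (!changed) continue;
      epaGlobals[key] = hostWindow[key];
      if (before.has(key)) hostWindow[key] = before.get(key);
      else delete hostWindow[key];
    }
  }
}

const hostWindow = { title: 'host' };
const epaGlobals = {};
runAndCollectGlobals(hostWindow, epaGlobals, w => {
  w.counter = 1;       // leaked "global"
  w.title = 'changed'; // overridden host property
});
console.log(hostWindow, epaGlobals);
```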

@sashafirsov
Member Author

sashafirsov commented Sep 21, 2019

In order to reuse the scope, the content of script tags and event handlers should be executed in the same scope. That could be achieved either by

  • running all code within a single SCRIPT tag associated with embed-page, or
  • wrapping each script individually.

1st method will use a shared set of global variables; only syncing window.xxx assignments with variables is needed.
CONS:

  • each SCRIPT could fail, but that should not prevent the others from running. That requires surrounding each code section with try/catch
  • delay until all scripts are loaded

2nd method
PROS:

  • SCRIPT type=module could be handled differently, without sync to window props. That actually makes the implementation even more difficult in comparison with the 1st method.
  • script execution could be done before the following scripts are loaded. Inapplicable.
  • script execution could be done simultaneously with fetching the following scripts. Performance optimization. Inapplicable due to the need to collect all variables before running any script.

CONS:

  • all variables from each script should be declared in each section

Common:

  • variables from each SCRIPT should be declared ahead of any SCRIPT execution to avoid polluting container window props. That could be achieved by collecting all scripts and extracting all words and window object properties (with the exception of keywords and embed-page specific variables).
  • after execution of each script all "window" properties should be assigned to "globals" in each SCRIPT (one or many) scope.
  • after execution of each script all "global" variables should become window object properties.

@sashafirsov
Member Author

sashafirsov commented Sep 21, 2019

Global script handling
Will try to implement the 1st option:

  1. Collect the code of all scripts and extract all variables as keywords, with the exception of
    • JS keywords
    • "clean" window properties from a blank iframe window
    • EPA_ prefixed variables
    • EpaWindow properties
  2. In the rendered script declare all collected variables.
  3. Clone all EpaWindow properties into variables.

For each script code:

  4. Temporarily clear container window properties to avoid leaking container globals into the embed-page scope
    • preserve "unclean" window properties
    • remove those properties from window
  5. In the try{ } section insert the code.
  6. catch(ex){ console.error(ex) } will permit the following SCRIPTs to run.
  7. finally{}
    • move properties added to the container window into EpaWindow (detect by comparing with the reference iframe)
    • restore container window properties
  8. For each onXXX attribute
    • in the try section set the event handler
    • in the event handler body insert code as in 4-7
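The rendering of the unified script could be sketched as below. It only covers variable declaration and the try/catch/finally wrapping per section; the names buildUnifiedScript, EPA_onError and EPA_afterSection are hypothetical, and the window clean-up done in finally is reduced to a callback.

```javascript
// Render one script text: shared declarations first, then every section
// wrapped so that a run-time failure does not stop the following sections.
function buildUnifiedScript(variableNames, sections) {
  const declarations = variableNames.length ? `var ${variableNames.join(', ')};\n` : '';
  const body = sections
    .map(code =>
      `try {\n${code}\n} catch (ex) {\n  EPA_onError(ex);\n} finally {\n  EPA_afterSection();\n}`)
    .join('\n');
  // Returning the variables is only for inspection in this sketch.
  return declarations + body + `\nreturn { ${variableNames.join(', ')} };`;
}

const errors = [];
let sectionsFinished = 0;
const source = buildUnifiedScript(['counter', 'label'], [
  'counter = 1;',
  'missingFunction();',          // throws at run time
  'label = "count:" + counter;'  // still runs
]);
const run = new Function('EPA_onError', 'EPA_afterSection', source);
const vars = run(ex => errors.push(ex), () => { sectionsFinished += 1; });
console.log(vars, errors.length, sectionsFinished);
```

A syntax error in any section would still invalidate the whole concatenated text, since try/catch only traps run-time exceptions.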

@sashafirsov
Member Author

Async/defer scripts
Worth special treatment. Globals in "detached" scripts most likely would not expose variables for reuse in external scripts. In this case execution could be delayed until after page load.
There are exceptions though. Analytics, on one side, resides in a defer script, while custom code is inside page scripts. Globals in this case serve as the joint. Need to collect the most popular globals to include in the vars list.

@sashafirsov
Member Author

sashafirsov commented Sep 21, 2019

Most popular global variables

  • jQuery, $, query versions
  • '_' ( Lodash )
  • dojo
  • d3
  • Sortable, Effect, ...( http://script.aculo.us )
  • shaka ( Shaka Player )
  • spf ( spfjs )
  • swfobject, THREE, WebFont,
  • React
  • Vue
  • Backbone
  • Hammer
  • $$, Class, Element, Request (Mootools.net)
  • zawgyiDetector,? (Myanmar tools)
  • s, t(), s_account (Adobe analytics)

Other:

  • analytics
  • feedback

A good starting point for collecting JS libs with globals is the
CDN list at https://developers.google.com/speed/libraries

@sashafirsov
Member Author

sashafirsov commented Oct 6, 2019

Unifying JS under a single script allows sharing a single variables list. But concatenation of multiple files conflicts with module import statements, which are meant to be used ONLY at the top level of a JS file, definitely not inside a try{} block (the finally section is needed for container window recovery and for emitting the load event).

What could be done further?

  • try/finally could be substituted with setTimeout(x, 0) at the beginning of the concatenated script
    • problem: a broken script would prevent all other scripts from loading, even if those are valid
  • inject scripts individually; try/finally could be substituted with setTimeout(x, 0)
    • problem: common variable sets have to be synchronized across all scripts
    • a bit of a hassle to collect the completion of all scripts in order to emit the DOMContentLoaded and load events
    • on the positive side, broken JS scripts would not prevent other scripts from executing

An intermediate (though more complex) solution, matching the browser script loading convention:

  • collect all non-module scripts into concatenated text. Such scripts are assumed to have no import statements and could use a try/catch enclosure for each section.
    • still suffers from a broken script preventing others from loading
  • load type="module" scripts individually, synchronizing globals at the beginning of the script.
    • globals modified asynchronously would be changed only in their own scope. That means cross-module data sharing would need to be done over window.XXX instead of direct XXX use. That could be tolerated in most cases.

According to MDN, the forms of static import are limited to:

    import defaultExport from "module-name";
    import * as name from "module-name";
    import { export1 } from "module-name";
    import { export1 as alias1 } from "module-name";
    import { export1 , export2 } from "module-name";
    import { foo , bar } from "module-name/path/to/specific/un-exported/file";
    import { export1 , export2 as alias2 , [...] } from "module-name";
    import defaultExport, { export1 [ , [...] ] } from "module-name";
    import defaultExport, * as name from "module-name";
    import "module-name";
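Since the set of forms is that small, telling scripts with static imports apart from the rest can be approximated with a line-based pattern. The sketch below is an assumption, not embed-page code: it handles single-line import statements only and would be fooled by import-like text inside strings or comments.

```javascript
// Matches the static forms listed above when written on one line;
// dynamic import("...") calls are intentionally not matched.
const STATIC_IMPORT = /^\s*import\b\s*(?:[\w$*{][^'"]*\bfrom\s*)?(['"])[^'"]+\1\s*;?\s*$/;

// Separate static import lines from the remaining script body.
function splitImports(source) {
  const imports = [];
  const body = [];
  for (const line of source.split('\n')) {
    (STATIC_IMPORT.test(line) ? imports : body).push(line);
  }
  return { imports: imports.join('\n'), body: body.join('\n') };
}

const moduleSource = [
  'import { a } from "./a.js";',
  'import "./b.js";',
  'const lazy = import("./c.js");',
  'console.log(a);'
].join('\n');

const parts = splitImports(moduleSource);
console.log(parts.imports);
console.log(parts.body);
```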

@sashafirsov
Member Author

sashafirsov commented Oct 6, 2019

Individual script loading seems to be quite attractive.
PROS:

  • broken scripts would not break others
  • no conflicts on import variables between otherwise concatenated scripts
  • development could use rendered content saved to the FS for easy debugging; in prod the content would be embedded.

CONS: synchronization of globals across all script scopes is needed

  • upon completion of each script.
  • for async/await module code.

With try/finally, or with appending to the end (when the script executes without exceptions), synchronization is not an issue: globals would be immediately populated to epa.globals and propagated into each script scope (meaning the scopes should be exposed to embed-page). Unfortunately try/finally is not an option for scripts with static imports.

SOLUTION: sync code is exposed as a local method and called

  • at the end of the section
  • on timeout(0) to cover the case of an error or async code, with a check whether the sync has already been done.

The list of variables would be collected by loading all scripts before execution, saved into epa.globals, then listed and initialized from epa.globals at the beginning of each script.

The scope would register itself in epa.scripts and expose

  • a syncFrom( globals ) method which would copy globals to scope variables.
  • a syncTo( globals ) method to be called at the end of the script, on timeout, or after async code (event handlers, promises, etc.). An EPA_sync() method would be available in scope for implicit call by epa modules to populate epa.globals and propagate it into each scope.
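A sketch of how a rendered script could carry its own syncFrom / syncTo pair is shown below. The generator name buildScopedScript and the EPA_scope variable are hypothetical, the timeout(0) fallback is left out, and the Function constructor stands in for injecting a SCRIPT tag.

```javascript
// Render a script whose simulated globals are local variables, plus the
// two sync methods that copy them from and to the shared globals object.
function buildScopedScript(names, body) {
  const copyIn = names.map(n => `${n} = g.${n};`).join(' ');
  const copyOut = names.map(n => `g.${n} = ${n};`).join(' ');
  return `
    var ${names.join(', ')};
    var EPA_scope = {
      syncFrom: function (g) { ${copyIn} },
      syncTo: function (g) { ${copyOut} }
    };
    epa.scripts.push(EPA_scope);     // the scope registers itself
    EPA_scope.syncFrom(epa.globals); // initialize locals at the beginning
    ${body}
    EPA_scope.syncTo(epa.globals);   // publish at the end of the section
  `;
}

const epa = { globals: { counter: 1, title: undefined }, scripts: [] };
const names = Object.keys(epa.globals);

new Function('epa', buildScopedScript(names, 'counter += 1; title = "first";'))(epa);
new Function('epa', buildScopedScript(names, 'counter += 10;'))(epa);
console.log(epa.globals, epa.scripts.length);
```

If the body throws, the final syncTo call is skipped, which is the case the timeout-based call described above is meant to cover.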

@sashafirsov
Member Author

Rather than using setTimeout(0), need to evaluate https://github.com/YuzuJS/setImmediate or https://github.com/medikoo/next-tick

@sashafirsov
Member Author

sashafirsov commented Oct 12, 2019

Sequential execution of SCRIPT type="module" saved the hassle of hooking into the last script execution. The 'load' event is emitted on embed-page within a SCRIPT appended to the HTML body.

@sashafirsov
Member Author

Since each SCRIPT has its own context with a simulated list of global variables, functions would reside in their own script closure. When functions are used from another module, globals would be visible as local to the script scope. Which means

  • before the function is invoked, the globals have to be copied into the function scope. But how to retrieve them from the caller scope first?
  • after the function call, the locals should be populated back to globals ( EmbedPage.globals ). But how to populate the caller's local variables from globals?

The top-level SCRIPT functions would be trapped by the SCRIPT wrapper (scriptTemplate) and at the exit point from the script would be wrapped with code which performs globals-to-locals sync before/after the function call.

The question of populating the caller's scope is open for now.
As an idea, an additional wrapper within the scope would populate the locals.

@sashafirsov
Member Author

Apparently, when calling a function from another scope, marshaling of variables is needed in both scopes. Otherwise upon return from caller scope the callee scope variables are not updated.

The call sequence would be:

  • caller scope: EPA_vars2globals()
  • callee(function) scope: EPA_StartScope()
  • call function()
  • finally{ EPA_EndScope() }
  • caller scope: EPA_globals2Vars()
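The sequence above can be sketched as a wrapper around the function. The EPA_ method names come from the list; the wrapCrossScopeCall helper and the logging scopes are hypothetical. Placing EPA_globals2Vars() inside finally, so that it also runs when the function throws, is a choice made for this sketch.

```javascript
// Wrap `fn` so that variables are marshaled in both scopes around the call.
function wrapCrossScopeCall(fn, caller, callee) {
  return function (...args) {
    caller.EPA_vars2globals();   // caller locals -> globals
    callee.EPA_StartScope();     // globals -> callee locals
    try {
      return fn.apply(this, args);
    } finally {
      callee.EPA_EndScope();     // callee locals -> globals
      caller.EPA_globals2Vars(); // globals -> caller locals
    }
  };
}

const log = [];
const callerScope = {
  EPA_vars2globals: () => log.push('vars2globals'),
  EPA_globals2Vars: () => log.push('globals2Vars')
};
const calleeScope = {
  EPA_StartScope: () => log.push('StartScope'),
  EPA_EndScope: () => log.push('EndScope')
};

const add = wrapCrossScopeCall((a, b) => { log.push('call'); return a + b; }, callerScope, calleeScope);
const sum = add(2, 3);
console.log(sum, log.join(' > '));
```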

@sashafirsov
Member Author

sashafirsov commented Oct 13, 2019

Each top-level function (assuming it is not a dynamically changed variable) needs to be surrounded by a wrapper only once in each scope.

  • In the callee (function declaration) scope wrapping is done before script execution (the function is available by name at the beginning of the module).
  • In other (caller) scopes wrapping should be done on demand in the EPA_globals2Vars() call. To avoid populating it back into epa.globals, the wrapper would be marked as EPA_callerWrapper.

@sashafirsov
Member Author

embed-page.loadCount
A SCRIPT, once added to the DOM, is not going to be withdrawn from execution even if it is removed or its content is cleared. That creates difficulties if the embed-page content is changed before the previous content has executed.
The way around is to keep a flag specific to the loaded page within the generated script and compare it to the embed-page current flag value before execution. If the flags differ, the content has been changed and the script is not valid anymore, hence it needs to be skipped.
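A minimal sketch of the flag check, with a closure standing in for the generated script text. The loadCount property name is taken from the heading; buildGuardedScript and the page object are hypothetical.

```javascript
// Capture the flag value at generation time and compare it before running.
function buildGuardedScript(embedPage, body) {
  const stamp = embedPage.loadCount;
  return function guarded() {
    if (embedPage.loadCount !== stamp) return false; // stale content, skip
    body();
    return true;
  };
}

const page = { loadCount: 1 };
const ran = [];

const oldScript = buildGuardedScript(page, () => ran.push('old'));
page.loadCount += 1; // content replaced before the old script got to run
const newScript = buildGuardedScript(page, () => ran.push('new'));

const oldResult = oldScript();
const newResult = newScript();
console.log(oldResult, newResult, ran);
```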

@sashafirsov
Member Author

sashafirsov commented Oct 20, 2019

Scripts inlining
The none scope assumes no insulation, but each microapplication could expect the module script to execute at least once in its life cycle, which conflicts with container behavior where a script runs only once in the page life cycle.
The use of imports and variable insulation are the perks of a module script, but due to the 'run only once' policy it is not possible to reuse such a script in the different contexts of individual microapplications.

embed-page scope="none" would not honor the 'run only once' policy for top-level scripts, module or not, keeping this policy only for modules loaded via the JS import statement.

The scoped embed-page has the same challenges with top-level modules, which have to be invoked in each embed-page instance. The difference is in the implementation of scripts inlining.

@sashafirsov
Member Author

sashafirsov commented Oct 20, 2019

document.currentScript
The browser gives access to the current script only for non-module scripts, which excludes the use of modern ES6 modular dependencies in content-aware scripts.
embed-page breaks this pattern by making document.currentScript available to both module and non-module top-level scripts. Which gives

  • ES6 modules dependencies capabilities
  • access to SCRIPT tag parameters
  • access to DOM location where SCRIPT is injected, (along with each occurrence execution) making scriptlets pattern useful.

Tricks to implement.

  • as script tags are not executed in a synchronous manner, the setting of currentScript should happen in the first lines of code. That is possible only in inlined code, where the source is available as text to be changed. The scripts injection routine knows the sequence, adds the embed-page as a variable and uses the script index in code to set document.currentScript.
  • mode=none would require replacement of the document object with a proxy which overrides currentScript
  • In a scoped embed-page the document is an EpaDocument instance which provides the setCurrentScript() method.
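The proxy variant could be sketched as follows. Plain objects stand in for the document and the SCRIPT element so the example runs outside a browser; the helper name withCurrentScript is hypothetical.

```javascript
// Return a view of `realDocument` whose currentScript is overridden,
// while every other property is forwarded to the original object.
function withCurrentScript(realDocument, scriptElement) {
  return new Proxy(realDocument, {
    get(target, prop) {
      if (prop === 'currentScript') return scriptElement;
      // Use the real object as receiver and bind methods to it,
      // since DOM getters and methods reject a proxy as `this`.
      const value = Reflect.get(target, prop, target);
      return typeof value === 'function' ? value.bind(target) : value;
    }
  });
}

const realDocument = {
  currentScript: null, // what a module script would see
  title: 'host page',
  getTitle() { return this.title; }
};
const scriptElement = { tagName: 'SCRIPT', dataset: { index: '2' } };

const epaDocument = withCurrentScript(realDocument, scriptElement);
console.log(epaDocument.currentScript.dataset.index, epaDocument.getTitle());
```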

@sashafirsov
Member Author

sashafirsov commented Nov 10, 2019

Performance issues due to unified variables treatment

While in rev 0.0.20 the global variables defined in scripts and at the event attribute level are handled more or less properly, the implementation suffers from quite a bit of overhead:

  • the variables list is literally the whole bundle of all variables (internal and global)
  • the variables list is compiled from all scripts
  • all variables are applied even if they are not used in a particular script
  • no distinction is made between VAR, CONST and functions, resulting in exception trapping on each initialization attempt (try/catch is costly if applied to all vars in each script and event handler)

@sashafirsov
Member Author

sashafirsov commented Nov 10, 2019

To make global variables sync more efficient:

  • extract global vars from the AST into separate buckets (per script or event attribute), as opposed to sniffing all top-level and scoped vars:
    • const,
    • let,
    • var,
    • function,
    • import vars
    • html vars
  • extract AST buckets per script and event handler, as opposed to collecting all names for all scripts.
  • in each script/event handler sync only the variables which are actually used by this script, as opposed to syncing all vars.
  • in a script/event handler initialize locally used variables from globals only if they are not
    • locally declared as let, const, function
  • after a script/event handler runs, sync back only locally declared globals

@sashafirsov
Member Author

sashafirsov commented Nov 30, 2019

Variables handling sequence

  1. sanitize epa.globals_removable from epa.globals

  2. load HTML

  3. extract globals from DOM by id="XXX" into epa.globals & epa.globals_removable

  4. extract onXXX event handlers from DOM into a not yet executed SCRIPT

  5. Load scripts body.

  6. extract globals from AST into script.globals & epa.globals

    • from variables declarations( var,let,const,function, )
    • from globals defined as window.XXX and window['XXX'] ( only for valid variable names )
    • from undeclared globals assigned without declaration XXX=abc;
  7. prepare and execute the scripts in the order defined by the DOM, one at a time

    • declare and init var globals var XXX=epa_globals.XXX, YYY=epa_globals.YYY... from epa.globals, except for vars in script.globals ( those are initialized within the script itself )
    • define sync back from vars to epa.globals from the script.globals list
    • define a wrapper for global assignment window.XXX= to fill epa.document.currentScript.globals
    • execute the script
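Step 7 could be rendered roughly as below. The epa_globals name comes from the list above; the wrapScript helper is hypothetical, the window.XXX= wrapper is omitted, and the Function constructor stands in for SCRIPT injection.

```javascript
// Prologue: inherit globals the script does not declare itself.
// Epilogue: publish only the globals this script declares.
function wrapScript(epaGlobalNames, scriptGlobalNames, body) {
  const own = new Set(scriptGlobalNames);
  const inherited = epaGlobalNames.filter(n => !own.has(n));
  const prologue = inherited.length
    ? `var ${inherited.map(n => `${n} = epa_globals.${n}`).join(', ')};`
    : '';
  const epilogue = scriptGlobalNames.map(n => `epa_globals.${n} = ${n};`).join(' ');
  return `${prologue}\n${body}\n${epilogue}`;
}

const epa_globals = { greeting: 'hello' };
const body = 'const target = "world"; var message = greeting + " " + target;';
const rendered = wrapScript(['greeting', 'target', 'message'], ['target', 'message'], body);

new Function('epa_globals', rendered)(epa_globals);
console.log(rendered);
console.log(epa_globals);
```

Excluding the script's own declarations from the prologue matters here: a generated var target next to the script's const target would be a syntax error.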

@sashafirsov
Member Author

sashafirsov commented Nov 30, 2019

Import vars as globals
In many cases an import with variables is sufficient to identify APIs meant to be used in the global scope. In the particular script they are treated in a similar fashion to const variables, i.e. not allowed to be defined before or overridden later.

Access to such APIs is valuable in event handlers, but there is not much use in duplicating the import statement in the SCRIPT and within the inline event handler.

Hence, an event handler could make good use of an import declaration when located

  • below the import statement
  • after all SCRIPT tags are executed

To minimize the number of SCRIPT tags, the event handlers could fit into the end of the last SCRIPT, within the finally section.

@sashafirsov
Member Author

sashafirsov commented Nov 30, 2019

Globals in event handlers
An undeclared variable assignment is a popular case: onclick="isClicked=true"

It means the body of an event handler should be scanned for globals in the same fashion as SCRIPT content.
By injecting the event handler into the tail of the last script, all imports are made available from within it.
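To illustrate why the scan is needed, the sketch below finds assigned names with a naive regex (a stand-in for the AST-based extraction, so it misjudges strings, comments and declarations) and pre-declares them on a globals object before running the handler. All helper names are hypothetical.

```javascript
// Collect identifiers that appear on the left side of a plain assignment.
function findAssignedNames(handlerBody) {
  const names = new Set();
  const pattern = /(^|[^.\w$])([A-Za-z_$][\w$]*)\s*=(?![=>])/g;
  let match;
  while ((match = pattern.exec(handlerBody)) !== null) names.add(match[2]);
  return [...names];
}

// Compile an onXXX attribute body so undeclared assignments land in `globals`.
function compileHandler(handlerBody, globals) {
  for (const name of findAssignedNames(handlerBody)) {
    // `with` only captures names that already exist on the object.
    if (!(name in globals)) globals[name] = undefined;
  }
  const run = new Function('EPA_globals', 'event', `with (EPA_globals) {\n${handlerBody}\n}`);
  return event => run(globals, event);
}

const handlerGlobals = {};
const onclick = compileHandler('isClicked=true', handlerGlobals);
onclick({ type: 'click' });
console.log(handlerGlobals);
```

Without the pre-declaration step the assignment would fall through the with block and create a property on the real global object.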

@sashafirsov sashafirsov mentioned this issue Dec 2, 2019
Merged