Simplify evaluation in iframes #72

Open
martinpitt opened this Issue Nov 24, 2017 · 6 comments

5 participants

martinpitt commented Nov 24, 2017

I'm currently porting Cockpit's integration tests from PhantomJS to the Chrome Debug Protocol. By and large this is going nicely (big thanks!), but testing pages with iframes is excruciatingly hard to get right with Chrome. It took me four rewrites with different approaches and several days to get this right, there is very little Google juice about this, and I figure others might stumble over this as well. So I'm filing this both as a place for discussing improvements to the protocol and for publishing my solution where others can find it.

Cockpit's tests use an abstract Python API such as Browser.open(url), Browser.wait_present(selector), Browser.eval_js(expression), Browser.switch_to_frame(frame_name), and Browser.switch_to_top() (going back to the topmost document), i.e. the current iframe name is a state that needs to be respected by eval_js() or wait_present(); all of these are eventually implemented through Runtime.evaluate() (and formerly in terms of PhantomJS incantations).

If all of your iframes come from the same origin, it's actually fairly simple. One can just remember the frame name and then determine the HTML document to query with

if (current_frame)
    frame_doc = document.querySelector(`iframe[name="${current_frame}"]`).contentDocument.documentElement;
else
    frame_doc = document;

and run the query on frame_doc. However, this doesn't work if the iframe to query has a different origin, as JS that runs on the page cannot look inside the content. Then you have to use the DOM shadow tree. Runtime.evaluate() accepts a contextId to select which iframe document the query gets run in, which works fine. This requires building a frame name → contextId map.
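For the same-origin case, the frame_doc lookup can be folded into the expression string that is sent to Runtime.evaluate(). The following is only a minimal sketch: wrapForFrame is a made-up helper name, and it assumes that the frame name contains no characters that would need CSS escaping.

```javascript
// Hypothetical helper (not part of any library): build an expression string
// for Runtime.evaluate() that binds `frame_doc` before running `expression`.
function wrapForFrame(currentFrame, expression) {
    let frameDoc = "document";
    if (currentFrame) {
        // Assumption: the frame name needs no CSS escaping.
        const selector = `iframe[name="${currentFrame}"]`;
        frameDoc = `document.querySelector(${JSON.stringify(selector)})` +
                   `.contentDocument.documentElement`;
    }
    // The expression can refer to `frame_doc`; it is evaluated in the top page.
    return `(function (frame_doc) { return ${expression}; })(${frameDoc})`;
}

// Possible use with chrome-remote-interface (not executed here):
//   client.Runtime.evaluate({
//       expression: wrapForFrame(current_frame, 'frame_doc.querySelector("#login") !== null')
//   });
```

This still runs in the top-level page, so it fails for cross-origin iframes for the reason given above.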

However, execution context IDs are very transient things which need careful tracking. They get invalidated on page reloads and navigation clicks which switch pages (obviously), but I've also seen jQuery pages that destroy and recreate the execution context when changing an element (not so obvious), so that this could even hit you in the middle of a "wait for a JS condition to become true" query. Also, there is no way to enumerate the current ExecutionContexts, map an execution context to a frame name, or map a frame name to an execution context.

The only thing you can do is to keep track of a frame ID → execution context ID mapping through Runtime.executionContextCreated and Runtime.executionContextDestroyed, and keep another mapping of frame name → frame ID through Page.frameNavigated. These two kinds of events don't arrive in a defined order either, so one has to keep both maps and only do the lookup when querying. On top of that we also need to provide a way to wait for a frame name to load (see the "jQuery can invalidate the entire document" problem above). As there can only be one handler for Page.frameNavigated(), we have to use a chained promise there:

var frameIdToContextId = {};
var frameNameToFrameId = {};
// set these to wait for a frame to be loaded
var frameWaitName = null;
var frameWaitPromiseResolve = null;

client.Page.enable();
client.Runtime.enable();

// map frame names to frame IDs; root frame has no name, no need to track that
client.Page.frameNavigated(info => {
    if (info.frame.name)
        frameNameToFrameId[info.frame.name] = info.frame.id;

    // were we waiting for this frame to be loaded?
    if (frameWaitPromiseResolve && frameWaitName === info.frame.name) {
        frameWaitPromiseResolve();
        frameWaitPromiseResolve = null;
    }
});

// track execution contexts so that we can map between context and frame IDs
client.Runtime.executionContextCreated(info => {
    frameIdToContextId[info.context.auxData.frameId] = info.context.id;
});

client.Runtime.executionContextDestroyed(info => {
    for (let frameId in frameIdToContextId) {
        if (frameIdToContextId[frameId] == info.executionContextId) {
            delete frameIdToContextId[frameId];
            break;
        }
    }
});

function getFrameExecId(frame) {
    var frameId = frameNameToFrameId[frame];
    if (!frameId)
        throw Error(`Frame ${frame} is unknown`);
    var execId = frameIdToContextId[frameId];
    if (!execId)
        throw Error(`Frame ${frame} (${frameId}) has no executionContextId`);
    return execId;
}
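To check that the two maps really tolerate either event order, the same bookkeeping can be replayed against synthetic events without a browser. This is a sketch only: createFrameTracker and its handler names are invented here, and the event payloads contain just the fields that the code above reads.

```javascript
// Hypothetical factory; handler names are made up for this sketch.
function createFrameTracker() {
    const frameIdToContextId = new Map();
    const frameNameToFrameId = new Map();
    return {
        onFrameNavigated(info) {
            if (info.frame.name)
                frameNameToFrameId.set(info.frame.name, info.frame.id);
        },
        onContextCreated(info) {
            const aux = info.context.auxData;
            if (aux && aux.frameId)
                frameIdToContextId.set(aux.frameId, info.context.id);
        },
        onContextDestroyed(info) {
            for (const [frameId, contextId] of frameIdToContextId) {
                if (contextId === info.executionContextId) {
                    frameIdToContextId.delete(frameId);
                    break;
                }
            }
        },
        // Returns undefined until both events for `name` have been seen.
        contextIdFor(name) {
            return frameIdToContextId.get(frameNameToFrameId.get(name));
        },
    };
}

// Replay the same two synthetic events in both orders.
const navigated = { frame: { id: "F1", name: "content" } };
const created = { context: { id: 7, auxData: { frameId: "F1" } } };

const navigatedFirst = createFrameTracker();
navigatedFirst.onFrameNavigated(navigated);
navigatedFirst.onContextCreated(created);

const createdFirst = createFrameTracker();
createdFirst.onContextCreated(created);
createdFirst.onFrameNavigated(navigated);

console.log(navigatedFirst.contextIdFor("content"), createdFirst.contextIdFor("content")); // 7 7
```

The handlers could be registered with client.Page.frameNavigated() and the two Runtime events in the same way as the code above.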

With that under the belt, we can finally do a query in the currently selected frame name:

client.Runtime.evaluate({expression: [...], contextId: getFrameExecId(cur_frame_name)});

and write a helper to wait for a frame to get loaded:

function expectLoadFrame(name, timeout) {
    return new Promise((resolve, reject) => {
        let timedOut = false;
        let tm = setTimeout(() => {
            timedOut = true;  // stops the pollExecId() loop below
            reject(new Error("timed out waiting for frame load"));
        }, timeout);

        // we can only have one Page.frameNavigated() handler, so let our handler above resolve this promise
        frameWaitName = name;
        new Promise((fwpResolve, fwpReject) => { frameWaitPromiseResolve = fwpResolve })
            .then(() => {
                // For the frame to be fully valid for queries, it also needs the corresponding
                // executionContextCreated() signal. This might happen before or after frameNavigated(), so wait in case
                // it happens afterwards.
                function pollExecId() {
                    if (timedOut)
                        return;
                    if (frameIdToContextId[frameNameToFrameId[name]]) {
                        clearTimeout(tm);
                        resolve();
                    } else {
                        setTimeout(pollExecId, 100);
                    }
                }
                pollExecId();
            });
    });
}
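One possible variation avoids the 100 ms poll by re-checking pending waits whenever either event handler has updated its map. This is a sketch under assumptions, not a drop-in replacement: FrameWaiters is a made-up class, and unlike expectLoadFrame() it resolves immediately if the frame is already usable instead of waiting for the next frameNavigated event.

```javascript
// Hypothetical class. `isReady(name)` is supplied by the caller, e.g.
//   name => Boolean(frameIdToContextId[frameNameToFrameId[name]])
class FrameWaiters {
    constructor(isReady) {
        this.isReady = isReady;
        this.pending = [];
    }

    // Resolves once isReady(name) is true, rejects after timeoutMs.
    wait(name, timeoutMs) {
        return new Promise((resolve, reject) => {
            if (this.isReady(name)) {
                resolve();
                return;
            }
            const entry = { name, resolve, timer: null };
            entry.timer = setTimeout(() => {
                this.pending = this.pending.filter(e => e !== entry);
                reject(new Error(`timed out waiting for frame ${name}`));
            }, timeoutMs);
            this.pending.push(entry);
        });
    }

    // Call at the end of both event handlers, after the maps were updated.
    notify() {
        const stillPending = [];
        for (const entry of this.pending) {
            if (this.isReady(entry.name)) {
                clearTimeout(entry.timer);
                entry.resolve();
            } else {
                stillPending.push(entry);
            }
        }
        this.pending = stillPending;
    }
}
```

Whether the "resolve immediately" behaviour is acceptable depends on the test: it would not notice a reload of a frame that was already loaded.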

This finally seems to work well, but I daresay that it's not entirely obvious. Can the API be extended to become simpler? PhantomJS has switch_to_frame(name) which henceforth makes all queries apply to that. This is stateful and thus doesn't directly fit into the CDP API. But these API extensions would help, in descending abstractness/ascending amount of work for the client:

  1. Do the frame name → frameId → contextId tracking internally and have Runtime.evaluate accept a frame name. This would get rid of all of the above code.

  2. Do the frame object → contextId tracking internally and have Runtime.evaluate accept a nodeId for the frame, in whose context the query runs. This would also get rid of all of the above code, just requires an extra DOM.querySelector() to map a frame name to a nodeId, and avoids introducing the rather special "frame name" type as an API parameter.

  3. Provide a way to map a frame name to its current contextId. Almost as easy as above, but doesn't require the API to constantly track the mapping itself, it can just be called right before each Runtime.evaluate(). This introduces the need to have expectLoadFrame() though, or handle "unknown context ID" errors from it and retry in a loop.

  4. Do the frameId → contextId tracking internally and have Runtime.evaluate accept a frameId. These are as transient as contextIds, but unlike contextIds they can be queried from the DOM tree (which is quite laborious, but avoids having to track all events).
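The "retry in a loop" alternative mentioned in option 3 could look roughly like the sketch below. evaluateInFrame is a hypothetical helper; lookupContextId stands in for something like getFrameExecId() above, and the sketch assumes that any failure of the lookup or of Runtime.evaluate() might be caused by a stale or not yet known context, so it retries on every error.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Hypothetical helper: evaluate `expression` in the named frame, retrying a
// few times because the context ID may be replaced at any moment.
async function evaluateInFrame(client, lookupContextId, frameName, expression,
                               attempts = 5, delayMs = 100) {
    let lastError;
    for (let i = 0; i < attempts; i++) {
        try {
            // Look the ID up again on every attempt; it may have changed.
            const contextId = lookupContextId(frameName);
            return await client.Runtime.evaluate({ expression, contextId });
        } catch (err) {
            // Assumption: treat every failure as possibly transient.
            lastError = err;
            if (i + 1 < attempts)
                await sleep(delayMs);
        }
    }
    throw lastError;
}
```

A real test driver would probably want to distinguish context errors from other protocol failures instead of retrying blindly.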

Thanks in advance!

JoelEinbinder commented Nov 29, 2017

Accepting frameId or nodeId in Runtime.evaluate is interesting. It feels a bit weird because there isn't a one-to-one mapping of ExecutionContexts to Frames. Frames can have multiple execution contexts (with extensions and workers), or no execution contexts at all. Additionally, the Runtime domain is also used for Node.js, so it shouldn't need to know about frames. This might be one of the situations where things are complicated because they are complicated.
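Because a frame can have several execution contexts, a frameId → contextId map that blindly stores the latest one may end up pointing at, say, an extension's context. A minimal sketch of a filter follows; it assumes that Chrome fills auxData with frameId and isDefault fields, which the protocol only describes as embedder-specific data, and defaultFrameIdOf is a made-up name.

```javascript
// Hypothetical helper: return the frame ID only if `context` (the payload of
// Runtime.executionContextCreated) is the frame's default (page) context.
// Assumption: auxData carries `frameId` and `isDefault`, as observed in Chrome.
function defaultFrameIdOf(context) {
    const aux = context.auxData;
    return aux && aux.isDefault && aux.frameId ? aux.frameId : null;
}

// Possible use in the tracking handler (not executed here):
//   client.Runtime.executionContextCreated(info => {
//       const frameId = defaultFrameIdOf(info.context);
//       if (frameId)
//           frameIdToContextId[frameId] = info.context.id;
//   });
```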

pavelfeldman commented Nov 29, 2017

martinpitt commented Nov 30, 2017

@pavelfeldman : I did look at puppeteer before starting this, but it's not suitable for us:

  • It does not work with an already installed/packaged Chromium; it bundles its own. That makes test runs an order of magnitude slower in CI and also much less convenient for developers.
  • It does not actually help much with abstracting frame handling (perhaps unsurprisingly, as the Debug Protocol itself doesn't provide good APIs for that). It just re-exposes all of the frameNavigated and execution context concepts into its interface, and moreover you now need to differentiate between page.evaluate() and frame.$eval(). The latter makes things a bit easier as you don't have to go through the execution context first, but you still have to track appearance and change of frame objects.
  • Our actual tests are written in Python, so Cockpit has a test driver which basically translates abstract Python functions to PhantomJS or Chrome calls. So I can't benefit from the relatively nice code flow and test structure of puppeteer, and it just becomes a significantly bigger intermediate layer compared to the simple chrome-remote-interface npm module.

That's not to say that I generally consider it a bad project - to the contrary, for a new project I'd actually wholeheartedly recommend people to consider it. It just doesn't fit well into Cockpit's particular needs.

pavelfeldman commented Nov 30, 2017

yanivefraim commented Jan 3, 2018

I believe that I have a similar problem. My goal is to emulate touch on my page / iframe (I have code that looks for window.ontouchstart inside my iframe).

I found that Emulation.setTouchEmulationEnabled is not useful for this case, because it only affects the top window.

As I understand, puppeteer uses something like injecting those methods by evaluating scripts (https://github.com/GoogleChrome/puppeteer/blob/master/lib/EmulationManager.js#L62).

Is there a better solution for emulating stuff inside my iframe? Something like "switch to context"?

geekscrapy commented Apr 1, 2018

I've also got a similar requirement to resolve a frameId to a contextId: I need to grab every frame's document.documentElement.outerHTML using Runtime.evaluate. For now I have to do as @martinpitt does with the code below, and then loop over all contextIds, running the JS in each one:

// track execution contexts so that we can map between context and frame IDs
client.Runtime.executionContextCreated(info => {
    frameIdToContextId[info.context.auxData.frameId] = info.context.id;
});
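The loop over all context IDs might be sketched as follows. collectFrameHtml is a hypothetical helper; it assumes a frameIdToContextId object maintained as in the snippet above and a chrome-remote-interface style client whose Runtime.evaluate() returns a promise. Contexts that disappear during the loop are simply skipped.

```javascript
// Hypothetical helper: return { frameId: outerHTML } for every tracked context.
async function collectFrameHtml(client, frameIdToContextId) {
    const html = {};
    for (const [frameId, contextId] of Object.entries(frameIdToContextId)) {
        try {
            const reply = await client.Runtime.evaluate({
                expression: "document.documentElement.outerHTML",
                contextId,
                returnByValue: true,
            });
            // Skip frames where the expression itself threw.
            if (!reply.exceptionDetails)
                html[frameId] = reply.result.value;
        } catch (err) {
            // The context was probably destroyed in the meantime; skip it.
        }
    }
    return html;
}
```

The result is keyed by frame ID, so mapping it back to frame names still needs the frame name → frame ID map from the original post.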
