Feature request: create isolated world #2671

Open
furstenheim opened this issue Jun 4, 2018 · 8 comments
Labels
chromium Issues with Puppeteer-Chromium feature upstream

Comments

@furstenheim

I'd like to create an isolated context to execute some JS on the website I'm visiting without polluting the main context. This is possible right now, but it requires accessing a lot of internal methods:

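  // NOTE: page._client, frame._id and page._frameManager are Puppeteer internals, not public API.
  // ExecutionContext also has to be imported from Puppeteer's internals.
  const mainFrame = page.mainFrame()

  // Ask Chromium to create a new isolated world in the main frame.
  const isolatedWorldInfo = await page._client.send('Page.createIsolatedWorld', {frameId: mainFrame._id, worldName: 'new-isolated-world'})
  const executionContextId = isolatedWorldInfo.executionContextId
  const JsHandleFactory = page._frameManager.createJSHandle.bind(page._frameManager, executionContextId)

  // Wrap the new context id so that evaluate() targets the isolated world.
  const executionContext = new ExecutionContext(page._client, {id: executionContextId}, JsHandleFactory)
  await executionContext.evaluate(..)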

It would be nice if puppeteer exposed this as a method. Something like: page.createNewIsolatedContext

If necessary I could write up the code and the tests
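
For comparison, here is a minimal sketch of the same idea that goes only through a raw DevTools protocol session instead of Puppeteer internals. The helper names `createIsolatedWorld` and `evaluateInWorld` are hypothetical, and `client` is assumed to be any object with a CDP-style `send(method, params)` method. The protocol methods used are `Page.getFrameTree`, `Page.createIsolatedWorld` and `Runtime.evaluate`.

```javascript
// Sketch only: helper names are illustrative, not Puppeteer API.
// `client` is assumed to expose send(method, params) like a CDP session.

async function createIsolatedWorld(client, worldName) {
  // Page.getFrameTree yields the main frame id without reading frame._id.
  const { frameTree } = await client.send('Page.getFrameTree')
  const { executionContextId } = await client.send('Page.createIsolatedWorld', {
    frameId: frameTree.frame.id,
    worldName
  })
  return executionContextId
}

async function evaluateInWorld(client, executionContextId, expression) {
  // contextId routes the evaluation to the isolated world instead of the main one.
  const { result, exceptionDetails } = await client.send('Runtime.evaluate', {
    expression,
    contextId: executionContextId,
    returnByValue: true,
    awaitPromise: true
  })
  if (exceptionDetails) {
    throw new Error('Evaluation failed: ' + exceptionDetails.text)
  }
  return result.value
}
```

With Puppeteer, a session obtained from `page.target().createCDPSession()` could serve as `client`. This sketch only returns JSON-serializable values; it does not provide JSHandle or ElementHandle support.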

@aslushnikov
Contributor

@furstenheim this is interesting. Can you please share your usecase?

@furstenheim
Author

@aslushnikov I'm using it in webscraper-headless: https://github.com/geoblink/web-scraper-chrome-extension
Basically, it was an existing Chrome extension used for scraping. In a normal browser it executes a bunch of code that accesses the DOM to scrape the website. In the browser this is safe because the extension runs in a separate context. For example, there cannot be collisions with jQuery, since the extension's copy won't be defined globally on the page.

To scrape on the server, I open a new context as in the snippet above and inject the extension's code there.

https://github.com/geoblink/web-scraper-chrome-extension/blob/headless-mode/extension/scripts/ChromeHeadlessBrowser.js#L47
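
The collision argument can be illustrated without a browser. The following is only an analogy built on Node's `vm` module, not Chromium's isolated worlds: two contexts share one object (standing in for the DOM) while each keeps its own globals. The `sharedDocument` object and the `$` values are made up for the illustration.

```javascript
const vm = require('vm')

// Stand-in for the DOM: a single object visible from both "worlds".
const sharedDocument = { title: 'Example page' }

// "Main world": the page defines its own global $.
const mainWorld = vm.createContext({ document: sharedDocument })
vm.runInContext('var $ = "page jQuery"', mainWorld)

// "Isolated world": the scraper defines a different $ and edits the shared document.
const isolatedWorld = vm.createContext({ document: sharedDocument })
vm.runInContext('var $ = "scraper jQuery"; document.title = "Scraped"', isolatedWorld)

// Each world still sees its own $, but both see the same document.
const pageDollar = vm.runInContext('$', mainWorld)
const scraperDollar = vm.runInContext('$', isolatedWorld)
const titleSeenByPage = vm.runInContext('document.title', mainWorld)
```

Here `pageDollar` and `scraperDollar` stay distinct, while the title change made from the second context is visible from the first. That is the property the extension relies on.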

@aslushnikov
Contributor

@furstenheim this looks very reasonable; once #2812 lands we'll add this one as well.

@aslushnikov
Contributor

> If necessary I could write up the code and the tests

I'd be happy to review a PR! Please make sure to add a reference to https://developer.chrome.com/extensions/content_scripts when explaining isolated worlds in the documentation.

furstenheim added a commit to furstenheim/puppeteer that referenced this issue Jun 30, 2018
furstenheim added a commit to furstenheim/puppeteer that referenced this issue Jun 30, 2018
furstenheim added a commit to furstenheim/puppeteer that referenced this issue Jul 1, 2018
@aslushnikov
Contributor

(cross-posting from #2829 (comment))

I've played with this for a while. A few observations:

  • Execution contexts per se are not very handy. Once I have an isolated world, I want to be able to run querySelector and all the ElementHandle goodness. This means we need to have an IsolatedWorld class in our API.
  • Once I have querySelector and everything, I want isolatedWorld.waitForSelector to work just fine.
  • isolatedWorld.waitForSelector should keep working if my frame navigates away (this is what we have today for default frame worlds). It means that I should be able to create "persistent" worlds that get recreated automatically once the frame navigates.

In order to have this, we need to support "persistent" isolated worlds on the protocol level. It would also be nice for the exposeFunction and evaluateOnNewDocument methods to work with isolated worlds.

This will make a complete story around isolated worlds. With this level of isolation, things like #609 will be automatically addressed from the user-land.
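
As a rough sketch of what protocol-level persistence could look like: the Chromium roll referenced in this thread lets `Page.addScriptToEvaluateOnNewDocument` take an optional `worldName`. Assuming that parameter, a script can be registered so that it runs in a named isolated world for every new document. The helper names below are hypothetical, and `client` is assumed to be any object with a CDP-style `send(method, params)` method.

```javascript
// Sketch only: helper names are illustrative, not Puppeteer API.

// Registers `source` to run in the isolated world `worldName` on every new
// document, so the world's setup is repeated after each navigation.
async function installPersistentWorldScript(client, worldName, source) {
  const { identifier } = await client.send('Page.addScriptToEvaluateOnNewDocument', {
    source,
    worldName
  })
  // Keep the identifier so the script can be unregistered later.
  return identifier
}

// Unregisters a script previously installed with the helper above.
async function removePersistentWorldScript(client, identifier) {
  await client.send('Page.removeScriptToEvaluateOnNewDocument', { identifier })
}
```

This covers only the script-injection half. Finding the execution context of the recreated world after each navigation (for example to run waitForSelector in it) would still need extra bookkeeping that this sketch does not attempt.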

aslushnikov added a commit to aslushnikov/puppeteer that referenced this issue Sep 26, 2018
This roll includes:
- https://crrev.com/593256 - Support fetching missing intermediate certificates in headless
- https://crrev.com/594161 - DevTools: allow addScriptToEvaluateOnNewDocument accept optional worldName parameter.

References puppeteer#2671.
Fixes puppeteer#2377.
aslushnikov added a commit that referenced this issue Sep 26, 2018
This roll includes:
- https://crrev.com/593256 - Support fetching missing intermediate certificates in headless
- https://crrev.com/594161 - DevTools: allow addScriptToEvaluateOnNewDocument accept optional worldName parameter.

References #2671.
Fixes #2377.
@aslushnikov
Contributor

Once we have isolated worlds, we should move certain PPTR functions into a clean isolated world.
One example of this is #3327: the page.select method is currently implemented in the page and uses the globally defined Event class. If the website happens to override it, we're screwed.

@Janpot
Contributor

Janpot commented Oct 23, 2018

Yep, we're having issues as well running code on websites that override globals, have badly implemented polyfills, or run MooTools/prototype.js, ...
Isolated worlds would solve these for us. A possible API could be:

await page.createIsolatedWorld().evaluate(...)
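
To make the proposed shape concrete, here is a hedged sketch of such a wrapper. Neither `IsolatedWorldSketch` nor `createIsolatedWorldSketch` exists in Puppeteer; both names are hypothetical. The sketch serializes a function to source and runs it in the isolated world through the protocol's `Runtime.callFunctionOn`, assuming `client` is any object with a CDP-style `send(method, params)` method and that all arguments are JSON-serializable.

```javascript
// Hypothetical wrapper: not part of Puppeteer's API.
class IsolatedWorldSketch {
  constructor(client, executionContextId) {
    this._client = client
    this._executionContextId = executionContextId
  }

  // Sends the source of `fn` to the browser and calls it in the isolated world.
  // Arguments and the return value must be JSON-serializable in this sketch.
  async evaluate(fn, ...args) {
    const { result, exceptionDetails } = await this._client.send('Runtime.callFunctionOn', {
      functionDeclaration: fn.toString(),
      executionContextId: this._executionContextId,
      arguments: args.map(value => ({ value })),
      returnByValue: true,
      awaitPromise: true
    })
    if (exceptionDetails) {
      throw new Error('Evaluation failed: ' + exceptionDetails.text)
    }
    return result.value
  }
}

// Creates a named isolated world in the given frame and wraps its context id.
async function createIsolatedWorldSketch(client, frameId, worldName) {
  const { executionContextId } = await client.send('Page.createIsolatedWorld', { frameId, worldName })
  return new IsolatedWorldSketch(client, executionContextId)
}
```

Because the function is sent as source text, it cannot close over variables from the Node side; anything it needs has to be passed as an argument.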

@aslushnikov aslushnikov added the chromium Issues with Puppeteer-Chromium label Dec 6, 2018
@jennycai0807

Hi @aslushnikov, could you share when you plan to fix this issue? We still hit the page.select() issue #3327 when writing automation scripts.

aslushnikov added a commit to aslushnikov/puppeteer that referenced this issue Jan 16, 2019
This patch splits out `IsolatedWorld` class from Frame.
The `IsolatedWorld` abstraction is an execution context
with a designated set of DOM wrappers.

References puppeteer#2671
aslushnikov added a commit that referenced this issue Jan 16, 2019
This patch splits out `IsolatedWorld` class from Frame.
The `IsolatedWorld` abstraction is an execution context
with a designated set of DOM wrappers.

References #2671
aslushnikov added a commit to aslushnikov/puppeteer that referenced this issue Jan 22, 2019