Feature request: run puppeteer in the browser #2119
Comments
@Janpot this definitely sounds interesting. I don't think it makes sense to have it as a part of this repository; it deserves a separate project. Have you had any success with this? |
I tinkered a bit with it a month ago. I think it's feasible to create a build of puppeteer that runs in the browser but I haven't picked it back up. I think a separate project would only complicate things as the goal would be to not add code at all, just refactor here and there and add a build target. |
This would be ideal. @JoelEinbinder had a prototype as well and refactored pptr codebase to simplify things there. However, iirc he still needed to mock |
@aslushnikov Ok, so I did a quick and very very dirty test again, building
Then with running browserless locally (
```js
const puppeteer = require('puppeteer');
puppeteer.connect({ browserWSEndpoint: 'ws://localhost:3000' })
  .then(async browser => {
    const page = await browser.newPage();
    await page.goto('https://example.com');
    console.log(await page.content());
  });
```
work in the browser. So to list the main problems:
Will see if one of these days I can find some time to clean this up a bit, and find better solutions to these problems. |
Thanks @Janpot for the follow-up
Right, I'd expect both
This is an interesting approach. Another option might be implementing |
@aslushnikov I added my proof of concept as a PR #2374 |
The developer tools can overcome some browser security enforcements like CORS. |
A nice summary on what's anti-bundleable in pptr was given here: #2245 (comment) We should:
|
correct me if I'm wrong, but now that |
@Janpot: puppeteer-core is the same codebase as puppeteer; but yes, you'd probably want to depend on
Also the dynamic imports - I have a promising draft to cleanup these. |
This patch removes all dynamic requires in Puppeteer. This should make it much simpler to bundle puppeteer/puppeteer-core packages.

We used dynamic requires in a few places in lib/:
- BrowserFetcher was choosing between `http` and `https` based on some runtime value. This was easy to fix with explicit `require`.
- BrowserFetcher and Launcher needed to know project root to store chromium revisions and to read package name and chromium revision from package.json. (projectRoot value would be different in node6). Instead of doing a backwards logic to infer these variables, we now pass them directly from `//index.js`.

With this patch, I was able to bundle Puppeteer using browserify and the following config in `package.json`:

```json
"browser": {
  "./lib/BrowserFetcher.js": false,
  "ws": "./lib/BrowserWebSocket",
  "fs": false,
  "child_process": false,
  "rimraf": false,
  "readline": false
}
```

(where `lib/BrowserWebSocket.js` is a courtesy of @Janpot from puppeteer#2374)

And command:

```sh
$ browserify -r puppeteer:./index.js > ppweb.js
```

References puppeteer#2119
Currently connection assumes that transport is a websocket and tries to handle websocket-related errors.

This patch:
- moves ConnectionTransport interface to use callbacks instead of events. This way it could be used in browser context as well.
- introduces WebSocketTransport that implements ConnectionTransport interface for ws.

This is a preparation step for 2 things:
- exposing `transport` option in the `puppeteer.connect` method
- better support for `browserify`

References puppeteer#2119
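To make the callback-based transport idea from the commit above more concrete, here is a minimal sketch of what such a wrapper could look like around a browser-style WebSocket. The class name `BrowserWebSocketTransport` is hypothetical, and the exact shape of Puppeteer's `ConnectionTransport` should be checked against the version in use; the sketch only assumes the consumer assigns `onmessage` and `onclose` callbacks and calls `send` and `close`.

```javascript
// Sketch only: `BrowserWebSocketTransport` is a hypothetical name, not a
// Puppeteer export. It adapts a browser-style WebSocket to plain callbacks.
class BrowserWebSocketTransport {
  /**
   * @param {WebSocket} socket an already-open socket exposing the
   *   browser-style addEventListener/send/close interface
   */
  constructor(socket) {
    this._socket = socket;
    // Callbacks that the consumer (for example a connection object) assigns.
    this.onmessage = null;
    this.onclose = null;
    socket.addEventListener('message', event => {
      if (this.onmessage) this.onmessage(event.data);
    });
    socket.addEventListener('close', () => {
      if (this.onclose) this.onclose();
    });
  }

  // Forward a serialized protocol message to the socket.
  send(message) {
    this._socket.send(message);
  }

  // Close the underlying socket; `onclose` fires when the socket reports it.
  close() {
    this._socket.close();
  }
}
```

Because the wrapper relies only on `addEventListener`, `send` and `close`, the same class should work with the native browser `WebSocket` and with any object that mimics that interface.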
Bundled version of Puppeteer should rely on native WebSocket. Luckily, 'ws' module supports the same interface as the native browser websockets. This patch switches WebSocketTransport to use the browser-compliant interface of 'ws'.

After this patch, I was able to bundle Puppeteer for browser using the following config in `package.json`:

```json
"browser": {
  "./lib/BrowserFetcher.js": false,
  "ws": "./lib/BrowserWebSocket",
  "fs": false,
  "child_process": false,
  "rimraf": false,
  "readline": false
}
```

where `./lib/BrowserWebSocket` is:

```js
module.exports = WebSocket;
```

and the bundling command is:

```sh
$ browserify -r ./index.js:puppeteer > ppweb.js
```

References puppeteer#2119
When will the updated version be out? |
@noamalffasy The next release is scheduled for October 4 (you can see the next release date at the very beginning of our documentation). |
Is there a way I can get this version without waiting until the next release? |
@noamalffasy you can either clone from GitHub directly, or install the tip-of-tree release with |
That worked! |
Okay I have an issue now with bundling,
I'm using webpack |
@noamalffasy I'm not sure what's the Note though: we don't currently publish bits we use to bundle, but you can |
So this feature is only available if you clone the repository? Or is it temporary? |
@noamalffasy we're not shipping any bundled version of puppeteer for web, but we made sure that there are no obstacles in bundling puppeteer. |
But there is an issue, maybe I need to change my webpack config?
```js
const path = require("path");
module.exports = {
  entry: "./src/main.ts",
  mode: "production",
  module: {
    rules: [
      {
        test: /\.ts$/,
        loaders: "babel-loader",
        exclude: /node_modules/
      },
      {
        test: /\.js$/,
        use: ["source-map-loader"],
        enforce: "pre"
      }
    ]
  },
  resolve: {
    extensions: [".ts", ".js", ".json"]
  },
  output: {
    filename: "bundle.js",
    path: path.resolve(__dirname, "dist")
  }
};
```
|
I wrote some code that scrapes some web pages. It doesn't do too well in a cloud-hosted environment like DigitalOcean. It'd be neat if a user could load a page served by my API that would then allow their browser tabs to be controlled through the regular puppeteer API (if they permitted it, etc.). This is the opposite of me having to spend server resources on running a web browser, while still allowing me to do scriptable things like user input, clicking, evaluating scripts, etc. Was that kind of the vision here? Is that possible, or am I misunderstanding? |
I think this is possible using the bundled version of puppeteer and extension's Tip: you can pass a custom transport to puppeteer connect using the |
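As a rough illustration of what a custom transport for an extension might look like, the sketch below forwards protocol messages through an object shaped like `chrome.debugger`. Note the assumptions: the extension API is elided in the comment above, and `chrome.debugger` is taken from later comments in this thread; `DebuggerApiTransport` is a hypothetical name; error handling via `chrome.runtime.lastError` is left out. It shows the message-forwarding shape only and is not evidence that Puppeteer works end to end over this channel.

```javascript
// Hypothetical adapter, not part of Puppeteer. `debuggerApi` is expected to
// look like chrome.debugger: sendCommand(), detach() and an onEvent emitter.
class DebuggerApiTransport {
  constructor(debuggerApi, debuggee) {
    this._api = debuggerApi;
    this._debuggee = debuggee;
    this.onmessage = null;
    this.onclose = null;
    // Protocol events arrive as (source, method, params); keep only the
    // ones that belong to the tab this transport is bound to.
    this._onEvent = (source, method, params) => {
      if (source.tabId !== debuggee.tabId) return;
      this._emit({ method, params });
    };
    debuggerApi.onEvent.addListener(this._onEvent);
  }

  // Incoming messages are JSON strings of the form {id, method, params}.
  send(message) {
    const { id, method, params } = JSON.parse(message);
    this._api.sendCommand(this._debuggee, method, params, result => {
      // Assumption: errors (chrome.runtime.lastError) are not handled here.
      this._emit({ id, result });
    });
  }

  close() {
    this._api.onEvent.removeListener(this._onEvent);
    this._api.detach(this._debuggee, () => {
      if (this.onclose) this.onclose();
    });
  }

  // Hand a response or event back to the consumer as a JSON string.
  _emit(payload) {
    if (this.onmessage) this.onmessage(JSON.stringify(payload));
  }
}
```

In an extension, one would presumably attach to the tab first and then pass an instance as the `transport` option of `puppeteer.connect`. Whether the commands Puppeteer sends are accepted over a tab-level debugger session is a separate question, which later comments in this thread touch on.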
@aslushnikov Q: Can I use this to run puppeteer inside an already opened browser? For example, if I'm already logged into Facebook, can I execute a Puppeteer script inside the same browser so I don't have to log in again? I thought it wouldn't work because if I had to launch a new headless browser, the cookies would be gone, but the comments on this issue and the merged PR give me some hope. Looking forward to your reply! |
@woniesong92 You can connect puppeteer to any browser that talks the devtools protocol. For that you'll first need to start chrome with an extra CLI flag |
@Janpot do you have any documentation or guidelines on how to run puppeteer in the browser without running Node.js? |
@aslushnikov since chrome.debugger provides a The reason why I ask this is because I cannot manage to get the
```js
chrome.tabs.getCurrent((tab) => {
  let currentTabTarget = {tabId: tab.id};
  chrome.debugger.attach(currentTabTarget, '1.3', () => {
    if(chrome.runtime.lastError) {
      alert(chrome.runtime.lastError.message);
    }
  });
  chrome.debugger.getTargets((targets) => {
    currentTarget = targets.find((info) => { return info.url == tab.url });
    chrome.debugger.sendCommand(currentTabTarget, 'Target.exposeDevToolsProtocol', {targetId: currentTarget.id});
    chrome.debugger.detach(currentTabTarget, () => {
      if(chrome.runtime.lastError) {
        alert(chrome.runtime.lastError.message);
      }else{
        alert(window.cdp)
      }
    });
  });
});
```
|
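One thing worth noting about the snippet above, independent of whether `Target.exposeDevToolsProtocol` is available through `chrome.debugger`: `attach`, `sendCommand` and `detach` are all callback-based and asynchronous, so the commands there may be issued before the attach has completed, and the detach may run before the command has finished. A minimal sketch of sequencing the calls with promises follows. The helper names `callAsync` and `withDebugger` are hypothetical, and `chromeApi` is passed in explicitly only so the sketch can be exercised outside an extension (inside one it would be the global `chrome`).

```javascript
// Wrap a callback-style chrome.* call in a promise, rejecting when
// chrome.runtime.lastError is set at callback time.
function callAsync(chromeApi, fn, ...args) {
  return new Promise((resolve, reject) => {
    fn(...args, result => {
      const error = chromeApi.runtime.lastError;
      if (error) reject(new Error(error.message));
      else resolve(result);
    });
  });
}

// Attach, run `work` with a promise-returning send function, then detach,
// each step waiting for the previous one to finish.
async function withDebugger(chromeApi, debuggee, protocolVersion, work) {
  const dbg = chromeApi.debugger;
  await callAsync(chromeApi, dbg.attach.bind(dbg), debuggee, protocolVersion);
  try {
    return await work((method, params) =>
      callAsync(chromeApi, dbg.sendCommand.bind(dbg), debuggee, method, params));
  } finally {
    await callAsync(chromeApi, dbg.detach.bind(dbg), debuggee);
  }
}
```

A call might then look like `withDebugger(chrome, {tabId: tab.id}, '1.3', send => send('Page.enable', {}))`. This only fixes the ordering; it does not change which protocol methods the debugger session accepts.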
IIRC the
Yeah, I think this is because |
@aslushnikov many thanks for getting back to me on this. I can't seem to find any information about the DevTools version that chrome.debugger exposes, but since it's an experimental feature, I think you're right that this is probably just not possible at the moment. In the future, if chrome.debugger becomes more stable and up-to-date, being able to run puppeteer-web using the chrome.debugger interface would be very useful for me. This could allow us to write Chrome extensions which could use puppeteer without the need for users to launch the browser from the command line. |
Hi, we are trying to use the @aslushnikov if I understand well, the devtools exposed by |
Is there any update on launching the browser from a client-side JS page rather than from the command line? Basically, I want to deliver an HTML page with some JavaScript to the user that will launch the browser (currently, we launch it from the command line). Once the browser is launched, I will get "webSocketDebuggerURL" from http://127.0.0.1:9222/json/version to connect. Any help on how I can achieve this without using a server or the command line? |
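The discovery step described in the comment above can be sketched as follows. This does not address launching the browser; it assumes a Chrome instance is already running with remote debugging enabled on port 9222. The function name `getBrowserWSEndpoint` and the injectable `fetchImpl` parameter are hypothetical. In the JSON returned by `/json/version` the field is spelled `webSocketDebuggerUrl`.

```javascript
// Read the browser-level WebSocket endpoint from the DevTools HTTP endpoint.
// `fetchImpl` defaults to the global fetch and exists only to allow testing.
async function getBrowserWSEndpoint(baseUrl, fetchImpl = fetch) {
  const versionUrl = new URL('/json/version', baseUrl).href;
  const response = await fetchImpl(versionUrl);
  if (!response.ok) {
    throw new Error(`Unexpected status ${response.status} from ${versionUrl}`);
  }
  const info = await response.json();
  if (!info.webSocketDebuggerUrl) {
    throw new Error('No webSocketDebuggerUrl in /json/version response');
  }
  return info.webSocketDebuggerUrl;
}
```

The result would then be passed along the lines of `puppeteer.connect({ browserWSEndpoint: await getBrowserWSEndpoint('http://127.0.0.1:9222') })`. A regular web page may be prevented by the browser's same-origin rules from reading this endpoint, so it is probably worth trying from Node.js or an extension context first.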
…entEmitter. I noticed this while trying to actually use the TypeScript client in a browser context and went digging a bit into this `isomorphic-ws` module. Spoiler: it's a lie! The module is merely a switch which selects the right import at runtime. Yet, it does not attempt to fill the gaps between the Node.js and the browser-based WebSocket.

The main issue is that, on Node.js, the WebSocket is an instance of [EventEmitter](https://nodejs.org/api/events.html#events_class_eventemitter), which comes with fairly useful methods like `once`, `removeAllListeners` and so forth. In the browser, however, we are stuck with [EventTarget](https://developer.mozilla.org/en-US/docs/Web/API/EventTarget) and its crappy API :'( ...

One particularly surprising thing is that Puppeteer and the supposed cross-platform testing aren't of any use here, since Puppeteer seems to be using its own emulation of the WebSocket, which isn't at all the one used by the browser but has an API closer to the Node.js one. So while the browser tests are all passing, they do not actually pass on a real browser 🤦 puppeteer/puppeteer#2119 (comment)

This PR introduces a slightly better `IsomorphicWebSocket` interface as a drop-in replacement for our internal use. It only covers the `on`, `once`, `removeListener` and `removeAllListeners` methods, which we use internally. I had to resort to a JavaScript module for that because I couldn't get the TypeScript compiler to cooperate. As a consequence, the .js module does not get copied into `dist` by default; I had to manually copy it as part of the build command, which _seems wrong_, but I am too unfamiliar with the TypeScript tooling :/
Just testing the waters here, but it seems to me that apart from downloading chrome and launching a browser, puppeteer isn't really doing anything that can't be done in a browser. I'm thinking `puppeteer.connect()` in a webpage. Would there be any interest in supporting this? Am I overlooking any barriers that are in the way of achieving this? I can probably make some time to look into it.