request.continue no longer working as expected with latest Puppeteer #4030
Comments
I still haven't got a fix for this. Why would the above code correctly intercept and alter the request URL in Puppeteer <= 1.6.2 but not from Puppeteer 1.7.0 onwards?
@leem32 What exactly is not working for you?
I've added a script below that shows my problem. This works perfectly fine in Puppeteer <= 1.6.2. You can see in the
const puppeteer = require('puppeteer');
const fs = require("fs");

// var url = "http://example.com";
var url = 'https://gazellegames.net/login.php/';

(async() => {
  const browser = await puppeteer.launch({args: ['--no-sandbox']});
  const page = await browser.newPage();

  // make sure urls dont have double forward slashes except after protocol
  url = url.replace(/([^:])(\/\/+)/g, '$1/');

  // note: add trailing slash since chrome adds it
  if (!url.endsWith('/')) {
    url = url + '/';
  }

  // urls hold redirect chain
  let urls = [url];

  const client = await page.target().createCDPSession();
  await client.send('Page.enable');
  await page.setRequestInterception(true);

  page.on('request', request => {
    let blocked = false;
    let hasForwardSlash = false;

    // remove duplicate paths e.g example.com/foobar/foobar/ -> example.com/foobar/
    if (request.resourceType() == "document") {
      console.log("document request");
      console.log(request.url());
      let requestUrl = request.url();

      if (requestUrl.endsWith("/")) {
        hasForwardSlash = true; // readd slashes later
        requestUrl = requestUrl.replace(/(\/)*$/, '');
        // console.log("has forward slash");
      }

      let urlIntoArray = requestUrl.split("/");
      if (urlIntoArray[urlIntoArray.length - 1] == urlIntoArray[urlIntoArray.length - 2]) {
        let paths = urlIntoArray[urlIntoArray.length - 1];
        let newRequestUrl = requestUrl.replace("/" + paths, "");

        if (hasForwardSlash) {
          newRequestUrl = newRequestUrl + "/";
          // console.log("add back forward slash");
        }

        console.log("newRequestUrl: " + newRequestUrl);
        request.continue({
          url: newRequestUrl
        });
        return; // prevent calling continue twice
      }
    }

    // console.log(request);
    request.continue();
    // return;
  });

  // get client side navigation redirect urls
  await client.on('Page.frameNavigated', (e) => {
    if (!e.frame.parentId) {
      let lastUrl = urls[urls.length - 1];
      let frameUrl = e.frame.url;

      // intercepted request has removed duplicate paths in puppeteer <= 1.6.2 but not in puppeteer >= 1.7.0
      console.log("frame url should now show duplicate paths are removed: ");
      console.log(frameUrl);

      // note: add trailing slash since chrome adds it
      if (!frameUrl.endsWith('/')) {
        frameUrl = e.frame.url + '/';
      }

      console.log("last url");
      console.log(lastUrl);
      if (!lastUrl.endsWith('/')) {
        lastUrl = urls[urls.length - 1] + '/';
      }

      if (frameUrl !== lastUrl && frameUrl !== "chrome-error://chromewebdata/") {
        urls.push(e.frame.url);
      }
    }
  });

  await page.goto(url);
  browser.close();
  console.log("Redirects: " + urls);
})();
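The duplicate-path check in the script above can also be pulled out into a small pure function, which makes it easier to test separately from Puppeteer. What follows is only a sketch: `dedupeLastPathSegment` is a hypothetical name, not part of Puppeteer, and it assumes a Node version where the WHATWG `URL` class is available as a global.

```javascript
// Hypothetical helper (not a Puppeteer API): collapses a repeated final
// path segment, e.g. /foobar/foobar/ -> /foobar/.
function dedupeLastPathSegment(rawUrl) {
  const parsed = new URL(rawUrl);
  // Remember whether the path ended with "/" so it can be restored.
  const hadTrailingSlash = parsed.pathname.endsWith('/');
  // Split the path into non-empty segments.
  const segments = parsed.pathname.split('/').filter(Boolean);
  const n = segments.length;
  if (n >= 2 && segments[n - 1] === segments[n - 2]) {
    segments.pop(); // drop the duplicated last segment
    parsed.pathname = '/' + segments.join('/') + (hadTrailingSlash ? '/' : '');
  }
  return parsed.toString();
}
```

Because it compares parsed path segments only, the host and query string are left alone.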
@leem32 ah, the old behavior was a buggy one - when you continue to a new URL, the web page should not know there's been a "redirect". What you probably want instead is a real redirect. Instead of calling request.continue with a new url, respond with a 302:
request.respond({
  status: 302,
  headers: {
    location: newRequestUrl
  },
});
Does this help?
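For anyone landing here later, this is roughly how the suggestion could slot into a request handler. It is a minimal sketch, not the maintainers' code: `redirectOrContinue` and its `rewriteUrl` callback are hypothetical names, and `request` is assumed to be the object Puppeteer passes to `page.on('request', ...)` once `page.setRequestInterception(true)` has been called.

```javascript
// Hypothetical handler (not a Puppeteer API). `rewriteUrl` is any function
// that maps a URL string to the URL that should be loaded instead.
function redirectOrContinue(request, rewriteUrl) {
  if (request.resourceType() === 'document') {
    const original = request.url();
    const target = rewriteUrl(original);
    if (target !== original) {
      // Answer with a real 302 so the browser itself follows the redirect.
      request.respond({
        status: 302,
        headers: { location: target },
      });
      return; // a request must be handled exactly once
    }
  }
  request.continue();
}

// Wiring it up (assumes `page` is a Puppeteer Page and `myRewrite` is yours):
// await page.setRequestInterception(true);
// page.on('request', request => redirectOrContinue(request, myRewrite));
```

Since the browser sees an actual redirect response, the new URL should then show up in events such as Page.frameNavigated, which is what the original script was relying on.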
Yep, that's fixed the issue. Thanks :)
Drive-by: add clarification to docs/api.md regarding changing "URL". References puppeteer#4030
So, is there any way to rewrite the request URL using Puppeteer >= 1.7.0?
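Going by the maintainer's reply above and the clarification added to docs/api.md, passing `url` to `request.continue` still appears to forward the request to the new URL in newer versions. It is just not a redirect, so the page, `request.url()` and the address bar keep showing the original URL. A minimal sketch under that assumption follows; `continueWithRewrite` and `rewriteUrl` are hypothetical names, not Puppeteer APIs.

```javascript
// Hypothetical handler (not a Puppeteer API): silently forwards a request
// to a rewritten URL. The page is not told that the URL changed.
function continueWithRewrite(request, rewriteUrl) {
  const original = request.url();
  const target = rewriteUrl(original);
  if (target !== original) {
    // Override only the URL; other request properties stay as they were.
    request.continue({ url: target });
  } else {
    request.continue();
  }
}

// Wiring it up (assumes `page` is a Puppeteer Page and `myRewrite` is yours):
// await page.setRequestInterception(true);
// page.on('request', request => continueWithRewrite(request, myRewrite));
```

If the script needs to observe the new URL, as the redirect-tracking code in this issue does, a real 302 via `request.respond` is the better fit.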
Puppeteer version: 1.12.2
I've just updated to the latest version of puppeteer and noticed part of my script has stopped working.
The code below intercepts each request and, if it is a document request, checks the URL for duplicate paths, e.g. 'example.com/login.php/login.php'. The code used to work fine with older versions of Puppeteer, but now the request.continue part no longer seems to be working: it just doesn't alter the request URL. Has something changed in the request.continue syntax in a recent version of Puppeteer?
remove duplicate paths e.g example.com/foobar/foobar/ -> example.com/foobar/
EDIT: Just wanted to add that I've just tried the above code in Puppeteer 1.3.0 and can confirm it does indeed change the request URL. So why not in the latest Puppeteer?
EDIT 2: I've managed to narrow down when it stopped working to between Puppeteer 1.6.0 and 1.7.0. It worked in Puppeteer 1.6.0, but stops working by 1.7.0.
EDIT 3: The problem starts in Puppeteer 1.7.0; in Puppeteer 1.6.2 it works. Has something changed with the request interception syntax in 1.7.0? I've taken a look at the docs but couldn't find anything.