net::ERR_ABORTED during headless testing #2794

Closed
maximkoshelenko opened this issue Jun 23, 2018 · 22 comments

@maximkoshelenko

maximkoshelenko commented Jun 23, 2018

Hello team. I need help with an issue that only reproduces in headless mode. I have a test that checks for a 200 or 206 status after following each footer link of the site. After a PDF file was added as a footer link, the test started crashing in headless mode (in debug mode it works). Please help solve this problem :).

Steps to reproduce

Tell us about your environment:

Please include code that reproduces the issue.

  await page.goto('http://www.cancernetwork.com/', { waitUntil: 'domcontentloaded' });
  await page.waitForSelector('.expanded .menu');
  // Collect the absolute URL of every footer link.
  const footerLinks = await page.evaluate(
    () => Array.from(document.body.querySelectorAll('.expanded a[href]'), ({ href }) => href)
  );
  for (const link of footerLinks) {
    // page.goto() already waits for the navigation and resolves with the main
    // response, so a separate page.waitForNavigation() is not needed.
    const response = await page.goto(link, { waitUntil: 'load' });

    // Accept either a full (200) or a partial-content (206) response.
    // Use the public status() accessor rather than the private _status field.
    expect([200, 206]).toContain(response.status());

    console.log('Footer link: ' + link + ' was checked successfully');
  }

What is the expected result?
Navigating to http://marketing.advanstar.info/mediakits/TC_MK_2016.pdf succeeds with a 200 or 206 status.
What happens instead?
Without headless mode the code works, but in headless mode it fails with an error:

net::ERR_ABORTED at http://marketing.advanstar.info/mediakits/TC_MK_2016.pdf
at navigate (node_modules/puppeteer/lib/Page.js:592:37)

I also tried running the code on the https://try-puppeteer.appspot.com/ site.
Code:
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('http://marketing.advanstar.info/mediakits/TC_MK_2016.pdf');
await browser.close();

Result:
Error running your code. Error: net::ERR_ABORTED at http://marketing.advanstar.info/mediakits/TC_MK_2016.pdf

@aslushnikov
Contributor

Hi @maximkoshelenko,

Unlike headful Chrome, headless Chrome doesn't know how to navigate to PDFs. I think it issues a download instead, but downloads are not supported yet: #299

@maximkoshelenko
Author

Thanks. Hope downloads will be supported ASAP :)

@gsouf

gsouf commented Feb 21, 2019

@aslushnikov is there any way to know that net::ERR_ABORTED was raised because of a download when we catch the error?

We could use request interception and look at the content type, but ideally I want to be able to handle it in the catch block.

@matej2

matej2 commented Dec 16, 2019

Why is this closed? My issue occurs when redirecting...

@ghost

ghost commented Jan 17, 2020

How can this error caused by a PDF file be distinguished from other errors? I also hit this problem and could try-catch it, but I am afraid I will overlook other errors unrelated to PDFs.
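
A minimal sketch of a narrower catch, so that unrelated errors are not swallowed. The helper names `isAbortedNavigation` and `gotoOrNull` are hypothetical, and the check relies only on the `net::ERR_ABORTED` text that appears in the error messages quoted in this thread. It cannot tell by itself whether the abort came from a download or from something else.

```javascript
// Hypothetical helper: true only for the navigation-abort error from this thread.
function isAbortedNavigation(error) {
  return error instanceof Error && error.message.includes('net::ERR_ABORTED');
}

// Hypothetical wrapper around page.goto(): resolves with null when the
// navigation was aborted and re-throws every other error unchanged.
async function gotoOrNull(page, url, options) {
  try {
    return await page.goto(url, options);
  } catch (error) {
    if (isAbortedNavigation(error)) {
      return null;
    }
    throw error;
  }
}
```

A caller can then treat `null` as "navigation aborted, possibly a download" while timeouts and other failures still fail the test.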

@deansg

deansg commented Aug 6, 2020

@gsouf Can you expand on how to implement that? The content type isn't known until there is a response, and the resourceType() for pdf pages is simply "document"

@gsouf

gsouf commented Aug 6, 2020

@deansg I'm doing something like this. I cannot share more because the rest is part of a more complex thing but this is the gist of it:

const page = openSomePuppeteerPage();

async function pageOnResponseRequest(response) {
  if (response.frame() === page.mainFrame() && response.request().isNavigationRequest()) {
    const statusCode = response.status();
    const headers = response.headers();

    // At this point you have access to status code and headers which you can use to detect that it's an html document, an image, a downloadable document, etc...

  }
}

page.on('response', pageOnResponseRequest);
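
The listener above can be combined with a try/catch around the navigation. The following is only a sketch: `inspectNavigation` and the shape of its return value are hypothetical, and it assumes the `response` event for the main document fires before the navigation is aborted, which should be verified against the Puppeteer version in use.

```javascript
// Sketch only: `inspectNavigation` is a hypothetical helper, not a Puppeteer API.
async function inspectNavigation(page, url, options = {}) {
  let mainResponse = null;

  // Same filter as in the snippet above: main-frame navigation responses only.
  const onResponse = (response) => {
    if (response.frame() === page.mainFrame() && response.request().isNavigationRequest()) {
      mainResponse = {
        status: response.status(),
        // response.headers() uses lower-case header names.
        contentType: response.headers()['content-type'] || '',
      };
    }
  };

  page.on('response', onResponse);
  try {
    await page.goto(url, options);
    return { aborted: false, mainResponse };
  } catch (error) {
    // Anything other than the abort discussed in this thread is re-thrown.
    if (!String(error.message).includes('net::ERR_ABORTED')) {
      throw error;
    }
    return { aborted: true, mainResponse };
  } finally {
    // Always detach the listener so repeated calls do not accumulate handlers.
    page.off('response', onResponse);
  }
}
```

If `aborted` is true and `mainResponse.contentType` is something like `application/pdf`, the abort was most likely caused by a download rather than by a network failure.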

@deansg

deansg commented Aug 18, 2020

@aslushnikov Do you know which content types experience this behaviour? So that I can use gsouf's idea to filter them out

@gsouf

gsouf commented Aug 18, 2020

@deansg everything that is a download. I would say everything that is not displayable text (HTML, maybe XML and JSON?) or an image. Maybe video and audio too? I haven't tested all of those.

@deansg

deansg commented Aug 18, 2020

@gsouf My problem is that I have encountered websites that return a JavaScript content type and then continue loading until proper HTML is loaded. If I intercept the first response and immediately decide that the website isn't relevant because that content type isn't HTML-based, I lose valuable information (I need to extract content only from websites that return HTML). I'm not sure whether it's better to build a blacklist of content types or a whitelist.

@gsouf

gsouf commented Aug 18, 2020

@deansg the solution I proposed filters only navigation requests for the main frame. What you want to do is to process the request only if content type (from response headers) is html
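
A small sketch of that content-type check, written as a whitelist. The helper name `isHtmlResponse` is hypothetical, and treating `application/xhtml+xml` as HTML is an assumption. The input is expected to be the object returned by `response.headers()`, which uses lower-case header names.

```javascript
// Hypothetical helper: decides from response headers whether the document is HTML.
function isHtmlResponse(headers) {
  const contentType = (headers['content-type'] || '').toLowerCase();
  // Drop parameters such as "; charset=utf-8" before comparing.
  const mimeType = contentType.split(';')[0].trim();
  return mimeType === 'text/html' || mimeType === 'application/xhtml+xml';
}
```

Inside the `response` handler, after the main-frame and navigation-request checks, the page would be processed only when `isHtmlResponse(response.headers())` returns true.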

@gsouf

gsouf commented Aug 18, 2020

@deansg I have never seen a website return a JavaScript content type for its main document and still run properly. The browser won't process the JavaScript. It will just display it on screen as plain text.

@deansg

deansg commented Aug 18, 2020

@gsouf when I try to navigate to the following website:
https://atelierhaussmann.de/en/
The 'response' event is called several times, and in several of the cases the content type is application/javascript

@gsouf

gsouf commented Aug 18, 2020

@deansg you'll have to figure out what's wrong because it does not occur for me. Even with your website, I confirm it has content-type text/html

@pepsiamir

I get the same error when I want to get the redirect chain of a URL. Is there any solution to exit the process after fetching the data?

Error: net::ERR_ABORTED at https://www.example.com/vip-dl/?filename=23309907.rar
at navigate (C:\Users\noora\AppData\Roaming\npm\node_modules\puppeteer\lib\FrameManager.js:120:37)

@vaishnavravi33

> I get the same error when I want to get the redirect chain of a URL. Is there any solution to exit the process after fetching the data?
>
> Error: net::ERR_ABORTED at https://www.example.com/vip-dl/?filename=23309907.rar
> at navigate (C:\Users\noora\AppData\Roaming\npm\node_modules\puppeteer\lib\FrameManager.js:120:37)

I am also facing a similar error. Any help appreciated.

@arnasledev

I have been running into the same issue for the past few days. It was working fine before, and nothing has changed...

@chinmaey

I need to test the download speed for video downloads, and I face the same issue.

@Rbrandao7

Is this issue related to a bad connection? I am facing the same issue: sometimes this error appears and other times it does not.

@StoneCypher

In my case, the reason for this was that I was trying to run several jobs in parallel with promises, and I hadn't noticed that they were all using the same page object.
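
A sketch of the fix for that situation, giving every parallel job its own page. The names `visitAll` and `job` are hypothetical, and `browser` is assumed to be a Puppeteer `Browser` instance. The idea is that a second `page.goto()` on a shared page would interrupt the navigation already in flight.

```javascript
// Sketch only: `visitAll` and `job` are hypothetical names.
async function visitAll(browser, urls, job) {
  return Promise.all(
    urls.map(async (url) => {
      // Each job opens its own page instead of sharing one page object.
      const page = await browser.newPage();
      try {
        await page.goto(url);
        return await job(page);
      } finally {
        // Close the page even if the navigation or the job throws.
        await page.close();
      }
    })
  );
}
```

For example, `await visitAll(browser, urls, (page) => page.title())` would collect the titles of all URLs concurrently, one page per URL.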

@smashah

smashah commented Dec 19, 2022

Another reason you may be experiencing this error consistently is that your system is telling chrome to use jemalloc.

Turn off system-wide jemalloc to get rid of this error.

See: #8246 (comment)

@yunnysunny

yunnysunny commented May 15, 2024

> Another reason you may be experiencing this error consistently is that your system is telling chrome to use jemalloc.
>
> Turn off system-wide jemalloc to get rid of this error.
>
> See: #8246 (comment)

I use Windows 10. When I call setCookie first, I get the same error.
When I remove the cookie code, it runs OK.
I captured the packets with Wireshark when the request was aborted:
[screenshot: 2024-05-16 145414]
The IP address starting with 10.254 is my Windows 10 PC.
