[Bug]: Using $eval or evaluate selector to get child elements results in missing HTML properties #9382

Closed
JohnnyRacer opened this issue Dec 8, 2022 · 8 comments


Bug description

Hello, when I use await page.$eval('.container', e => e.children) to select the children of a nested element like the one below:

<div class="container">
  <div class="child">A</div>
  <div class="child">B</div>
  <div class="child">C</div>
</div>

I do not get the elements' properties. For example, when I select an element with children and want each child's innerHTML or innerText, all I end up with after converting the HTMLCollection to an array is:

[
  {  vei: { onClick: [Object] } },
  {  vei: { onClick: [Object] } },
  {  vei: { onClick: [Object] } },
  {  vei: { onClick: [Object] } },
  {  vei: { onClick: [Object] } },
  {  vei: { onClick: [Object] } },
  {  vei: { onClick: [Object] } },
  {  vei: { onClick: [Object] } },
  {},
  { vei: { onClick: [Object] } },
  { vei: { onClick: [Object] } }
]

This is a list of buttons for a paginated navigation element, and I would like to get the innerText property of each element so I can access the page numbers, but only the onClick event handler property is accessible. When I select the elements individually with a selector, I can access all of their properties. I've also tried:

await queryLoadedPage.evaluate(() => { 
      let elem = document.querySelector('.container');
      let children = elem?.children
      console.log(children)
   });

But this just returns undefined for the children. Any help would be greatly appreciated!

Puppeteer version

19.4.0

Node.js version

v18.12.1

npm version

9.1.3

What operating system are you seeing the problem on?

Linux, Windows

Configuration file

No response

Relevant log output

No response

@JohnnyRacer JohnnyRacer added the bug label Dec 8, 2022
@OrKoN
Collaborator

OrKoN commented Dec 8, 2022

@JohnnyRacer you'd need to serialize the data inside the evaluate function yourself (for example, await page.$eval('.container', e => e.innerHTML) for the inner HTML content). I don't think there is a bug here. Otherwise, could you provide an executable reproduction?

@JohnnyRacer
Author

JohnnyRacer commented Dec 8, 2022

@OrKoN Could you explain what you mean by serializing the data? The docs don't really cover how to handle child elements properly. I've looked at some Stack Overflow solutions, but none seem to address the issue I'm facing.

For the example below (a static HTML file named test_child.html, served locally):

<!DOCTYPE html>
<html lang="en">
<body>
  <div class="container">
    <div class="child">A</div>
    <div class="child">B</div>
    <div class="child">C</div>
  </div>
</body>
</html>

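Using the following snippet:

// Assumption: both names come from the puppeteer package (the post omits its imports).
const puppeteer = require('puppeteer');
const { executablePath } = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({
        headless: true,
        executablePath: executablePath(),
    });
    const page = await browser.newPage();
    await page.goto('http://localhost:5000/test_child.html', {
        waitUntil: 'networkidle2',
    });
    // Returns the HTMLCollection itself, which is not serialized property by property.
    let child = await page.$eval('.container', e => e.children);
    console.log(child);
    await browser.close();
})();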

I get the following output from console.log(child):

{ '0': {}, '1': {}, '2': {} }

There are no properties that are accessible within the objects.
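
One likely explanation for the empty objects, assuming Puppeteer hands the return value of $eval back to Node by value (roughly as JSON): DOM properties such as innerText are accessors on the element's prototype rather than own enumerable properties, so nothing survives the trip. The sketch below imitates that in plain Node with a hypothetical FakeElement class. It is an analogy, not Puppeteer's actual serializer.

```javascript
// FakeElement is a hypothetical stand-in for a DOM element: like a real
// element, it exposes innerText through a getter on the prototype.
class FakeElement {
  #text;
  constructor(text) {
    this.#text = text;
  }
  get innerText() {
    return this.#text;
  }
}

// Shaped like the HTMLCollection from the report: index keys -> elements.
const fakeChildren = {
  0: new FakeElement('A'),
  1: new FakeElement('B'),
  2: new FakeElement('C'),
};

// Assumption: a JSON round trip approximates by-value serialization.
const roundTrip = (value) => JSON.parse(JSON.stringify(value));

// Returning the elements directly loses everything.
const asReturned = roundTrip(fakeChildren);

// Reading the properties before serializing keeps the data.
const extracted = roundTrip(
  Object.values(fakeChildren).map((el) => ({ innerText: el.innerText }))
);

console.log(asReturned);
console.log(extracted);
```

The first log mirrors the shape reported above, while the second keeps the text because plain values were copied out before serialization.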

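Using evaluate with the following snippet:

// Assumption: both names come from the puppeteer package (the post omits its imports).
const puppeteer = require('puppeteer');
const { executablePath } = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({
        headless: true,
        executablePath: executablePath(),
    });
    const page = await browser.newPage();
    await page.goto('http://localhost:5000/test_child.html', {
        waitUntil: 'networkidle2',
    });
    await page.evaluate(() => {
        let element = document.querySelector('.container');
        let child = element?.children;
        // These calls log to the browser page's console, not to the Node terminal.
        console.log(element);
        console.log(child);
    });

    await browser.close();
})();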

The element itself is null.
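
Note that console.log inside page.evaluate runs in the browser page, not in Node, and a callback that returns nothing makes page.evaluate resolve to undefined, which may account for what was observed here. A minimal sketch of relaying page console messages to Node follows. The names forwardConsole and sink are hypothetical, and it assumes the page object emits 'console' events whose message has a text() method, as Puppeteer's Page does.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical helper: relay browser-side console messages to a Node callback.
// `page` only needs `.on('console', handler)`.
function forwardConsole(page, sink = console.log) {
  page.on('console', (msg) => sink(`[page] ${msg.text()}`));
}

// Stand-in for a Puppeteer page so the sketch runs without a browser.
const fakePage = new EventEmitter();
const seen = [];
forwardConsole(fakePage, (line) => seen.push(line));

// Simulate the browser emitting one console message.
fakePage.emit('console', { text: () => 'HTMLCollection(3)' });
console.log(seen);
```

With a real page, forwardConsole(page) would be called before page.evaluate(...). To get data back into Node, return a serializable value from the callback instead of logging it.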

@OrKoN
Collaborator

OrKoN commented Dec 8, 2022

What I mean is:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({
        headless: true,
    });
    const page = await browser.newPage();
    await page.setContent(`
    <div class="container">
        <div class="child">A</div>
        <div class="child">B</div>
        <div class="child">C</div>
    </div>
    `)
    let children = await page.$eval('.container', e => {
        const data = [];
        for (const child of e.children) {
            data.push({ tagName: child.tagName, innerText: child.innerText });
        }
        return data;
    });
    console.log(children); // [ { tagName: 'DIV', innerText: 'A' }, { tagName: 'DIV', innerText: 'B' }, { tagName: 'DIV', innerText: 'C' }]
    await browser.close();
})()  

otherwise you get the default representation of the DOM Element as a string.
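
An alternative sketch for the same task, assuming page.$$eval (which passes an array of all matching elements to the callback) together with the child selector '.container > *'. The callback is written as a standalone function, hypothetically named toTextList, so it can be exercised with plain objects. It must not reference variables outside its own body, since it runs inside the page.

```javascript
// Hypothetical callback for page.$$eval: receives an array of elements and
// returns plain, serializable data.
function toTextList(elements) {
  return elements.map((el) => ({
    tagName: el.tagName,
    innerText: el.innerText,
  }));
}

// Plain objects standing in for DOM elements, to show the returned shape.
const fakeElements = [
  { tagName: 'DIV', innerText: 'A' },
  { tagName: 'DIV', innerText: 'B' },
  { tagName: 'DIV', innerText: 'C' },
];

const listed = toTextList(fakeElements);
console.log(listed);
```

With a real page this would be called as await page.$$eval('.container > *', toTextList).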

@JohnnyRacer
Author

Can you explain why there is an await page.setContent call? It kinda doesn't make sense to me, since Puppeteer is trying to get the content from the page rather than modify it. I'm sorry if this is a rhetorical question.

@OrKoN
Collaborator

OrKoN commented Dec 8, 2022

@JohnnyRacer it's just easier so that I don't need to create an HTML for every bug report. Should be no difference whether you navigate or setContent.

@JohnnyRacer
Author

Oh I see, thank you for the explanation. I just tested your snippet and it worked as intended!

@JohnnyRacer
Author

JohnnyRacer commented Dec 8, 2022

@OrKoN For nested HTML elements, how would I recurse through them to obtain the inner elements? For example, how would I get the properties of the nested-child and inner-nested-child elements in the example below?

<div class="child">
  <div class="nested-child">
    nested A
    <div class="inner-nested-child">
      inner nested A
    </div>
  </div>
  <div class="nested-child">
    nested B
    <div class="inner-nested-child">
      inner nested B
    </div>
  </div>
  <div class="nested-child">
    nested C
    <div class="inner-nested-child">
      inner nested C
    </div>
  </div>
</div>

@OrKoN
Collaborator

OrKoN commented Dec 8, 2022

@JohnnyRacer better ask on StackOverflow. Short answer: the same way, either using selectors or DOM APIs in the evaluated function.
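
One way the "DOM APIs in the evaluated function" route might look is a recursive walk that copies the wanted fields into plain objects. This is a sketch, not something confirmed in this thread: serializeTree, visit and fakeEl are hypothetical names, and the recursion lives entirely inside the function body so that it stays self-contained when run in the page.

```javascript
// Hypothetical callback for page.$eval: turns an element and its descendants
// into a plain, serializable tree.
function serializeTree(root) {
  const visit = (el) => ({
    className: el.className,
    // Own text only: direct text nodes (nodeType 3), trimmed, blanks dropped.
    text: Array.from(el.childNodes)
      .filter((node) => node.nodeType === 3)
      .map((node) => node.textContent.trim())
      .filter(Boolean)
      .join(' '),
    children: Array.from(el.children).map(visit),
  });
  return visit(root);
}

// Plain-object stand-in for a DOM element, so the sketch runs without a browser.
const fakeEl = (className, text, children = []) => ({
  nodeType: 1,
  className,
  children,
  childNodes: [{ nodeType: 3, textContent: `\n  ${text}\n  ` }, ...children],
});

const fakeRoot = fakeEl('nested-child', 'nested A', [
  fakeEl('inner-nested-child', 'inner nested A'),
]);

const tree = serializeTree(fakeRoot);
console.log(JSON.stringify(tree, null, 2));
```

With a real page this would be called as await page.$eval('.child', serializeTree), which returns the nested structure for the outer .child element in the example above.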

OrKoN added a commit that referenced this issue Dec 9, 2022
OrKoN added a commit that referenced this issue Dec 9, 2022
OrKoN added a commit that referenced this issue Dec 9, 2022
@OrKoN OrKoN closed this as completed Dec 9, 2022
2 participants