Is it possible to load and render files? #1120

Closed
grandemestre opened this issue Oct 5, 2023 · 7 comments
Labels
question Further information is requested

Comments

@grandemestre

Hello, I have a question. I'm not sure whether this feature is missing or whether I just don't know how to use it. My goal is to scrape a dynamic page. I was using jsdom, but this project looks like a better fit because it appears to be compatible with fetch.

I would like to know how to load an .html file and render it, including running its JavaScript. The equivalent of this in jsdom:

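const { JSDOM } = require('jsdom');

// Run inline scripts and load external resources
const options = { runScripts: "dangerously", resources: "usable" };

JSDOM.fromFile('index.html', options).then(dom => {
  // serialize() is a method, so it has to be called
  console.log(dom.serialize());
});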

index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Fetch Text Error</title>
</head>
<body>
    <div class="valor"></div>
    
</body>

<script>
let valor = document.getElementsByClassName('valor')[0];

fetch('https://api.github.com/users/xiaotian/repos').then(
  resp => resp.json() // this returns a promise
).then(repos => {
  for (const repo of repos) {

    
    valor.innerHTML += repo.name + '<br>';
    //console.log(repo.name);
  }
}).catch(ex => {
  console.error(ex);
})

</script>
</html>
@capricorn86 capricorn86 added the question Further information is requested label Oct 5, 2023
@capricorn86
Owner

capricorn86 commented Oct 5, 2023

Hi @grandemestre! 🙂

You would do something like this:

import { Window } from 'happy-dom';

async function main() {
    const url = 'https://google.com';
    const window = new Window({ url });
    const document = window.document;
    const response = await window.fetch(url);
    const html = await response.text();

    document.write(html);

    // Wait for async tasks such as scripts, styles, fetches and timers to complete
    await window.happyDOM.whenAsyncComplete();

    // Output HTML of page
    console.log(document.documentElement.outerHTML);
}

main();

I can also mention that I'm working right now on improving this.

After the new update it will look something like this:

import { Browser } from 'happy-dom';

async function main() {
    const browser = new Browser();
    const page = browser.newPage();
  
    await page.goto('https://github.com');
    await page.whenComplete();
}

main();

// Do something with the result

@capricorn86
Owner

capricorn86 commented Oct 5, 2023

With a file:

import { Window } from 'happy-dom';
import FS from 'fs';

const url = 'https://google.com';
const window = new Window({ url });
const document = window.document;
const html = (await FS.promises.readFile('index.html')).toString();
   
document.write(html);
   
// Wait for async tasks such as scripts, styles, fetches and timers to complete
await window.happyDOM.whenAsyncComplete();

// Do something with the result
// index.html above uses the class "valor", so select that element
const valor = document.querySelector('.valor');
console.log(valor.textContent);
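
The file-based snippet above can be wrapped in a small reusable function. This is only a sketch: `renderFile` is a hypothetical name, not part of Happy DOM, and it assumes the window passed in exposes `document.write()` and `happyDOM.whenAsyncComplete()` exactly as used in the example above. The wait method may be named differently in other Happy DOM releases, so check it against the installed version.

```javascript
// Hypothetical helper, not part of Happy DOM. `window` is expected to be a
// Happy DOM Window created by the caller, e.g. new Window({ url }).
async function renderFile(window, path) {
    // A dynamic import works from both CommonJS and ES module files.
    const { readFile } = await import('node:fs/promises');
    const html = await readFile(path, 'utf8');

    // Parse the markup; scripts in the page start running from here.
    window.document.write(html);

    // Same wait call as in the example above (assumed to exist on the window).
    await window.happyDOM.whenAsyncComplete();

    // Return the rendered markup after scripts, fetches and timers have settled.
    return window.document.documentElement.outerHTML;
}

// Usage (assumes happy-dom is installed and index.html exists):
// import { Window } from 'happy-dom';
// const html = await renderFile(new Window({ url: 'https://localhost:8080' }), 'index.html');
// console.log(html);
```

Taking the window as a parameter keeps the URL and any other window options in the caller's hands.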

@grandemestre
Author

I don't know why, but neither of the two examples works for me. Neither one returns an error; they just don't do anything.

@capricorn86
Owner

@grandemestre there was a problem in the example, but I have fixed it now. I also changed the URL to https://google.com as https://github.com is using a function that is not supported in Happy DOM yet.

Here is a working example:
https://runkit.com/capricorn86/651f36ea2eb0d90008446a82

@grandemestre
Author

Thanks a lot for the help. Now it worked correctly. Out of curiosity, what functionality is not yet supported?

@capricorn86
Owner

> Thanks a lot for the help. Now it worked correctly. Out of curiosity, what functionality is not yet supported?

Sorry, I forgot to answer this and I don't remember anymore.

@capricorn86
Owner

There is a new way now using the Happy DOM Browser API:
https://github.com/capricorn86/happy-dom/wiki/Browser

I will close this ticket now.
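
For readers landing here, a rough sketch of how the file case might look with the Browser API. The names `page.url`, `page.content`, `page.waitUntilComplete()`, `page.mainFrame` and `page.close()` are assumptions based on the Browser wiki page linked above, so verify them against the installed version. `renderWithBrowser` is a hypothetical helper name, not part of Happy DOM.

```javascript
// Hypothetical helper, not part of Happy DOM. `browser` is expected to be a
// Happy DOM Browser instance created by the caller, e.g. new Browser().
async function renderWithBrowser(browser, html, url) {
    const page = browser.newPage();
    try {
        // Assumed setters: the URL is used to resolve relative resources,
        // and setting the content parses the HTML and starts its scripts.
        page.url = url;
        page.content = html;

        // Assumed wait call: resolves once scripts, fetches and timers settle.
        await page.waitUntilComplete();

        return page.mainFrame.document.documentElement.outerHTML;
    } finally {
        // Close the page even if rendering throws; the caller closes the browser.
        await page.close();
    }
}

// Usage (assumes happy-dom is installed and index.html exists):
// import { Browser } from 'happy-dom';
// import FS from 'fs';
// const browser = new Browser();
// const html = (await FS.promises.readFile('index.html')).toString();
// console.log(await renderWithBrowser(browser, html, 'https://localhost:8080'));
// await browser.close();
```

For a remote page, the planned `page.goto(url)` call shown earlier in the thread would take the place of setting `url` and `content`.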
