feat: allow passing htmlDom #711

Merged
merged 4 commits into from
Jun 24, 2024

Conversation

masylum (Contributor) commented Jun 23, 2024

This allows customizing the parser (for example, using htmlparser2 instead of the default parse5, which is stricter and slower). Also, if you do any post-processing, you can reuse the parsed object and avoid parsing twice, which is expensive.

@@ -14,6 +14,7 @@ module.exports = rules => {
return async ({
url,
html = '',
dom,
Member

What do you think about renaming it to htmlDom?

That is how it's passed down.

Kikobeats changed the title from "Performance: Allow passing the dom directly" to "feat: allow passing htmlDom" on Jun 24, 2024
Kikobeats (Member) commented

Thanks for this!

When you pass htmlDom, keep in mind that you should also take care of the url so that relative URLs resolve correctly:

const { load } = require('cheerio')
const htmlDom = load(html, { baseURI: url })
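
To illustrate why the base URL matters, here is a minimal sketch using Node's built-in `URL` class. It is not how cheerio handles `baseURI` internally; `resolveHref` is a hypothetical helper that only shows that a relative link found in a document cannot be made absolute unless the page URL is known.

```javascript
// Hypothetical helper: resolve an href found in a document against the page URL.
// Without a base URL, a relative href has to be returned unchanged.
const resolveHref = (href, baseURI) => {
  if (!baseURI) return href
  return new URL(href, baseURI).href
}

const pageUrl = 'https://example.com/blog/post'

// Root-relative path: resolved against the origin of the page URL.
const rootRelative = resolveHref('/img/cover.png', pageUrl)

// Path-relative reference: resolved against the directory of the page URL.
const pathRelative = resolveHref('cover.png', pageUrl)

// Already absolute: the base is ignored.
const absolute = resolveHref('https://cdn.example.com/a.png', pageUrl)

// No base supplied: the relative href stays relative.
const unresolved = resolveHref('/img/cover.png')
```

If the DOM is built without the page URL, anything downstream that reads links or image sources sees the last case, which is why the comment above asks callers to pass the url when loading.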

Kikobeats merged commit 75e1d8e into microlinkhq:master on Jun 24, 2024
30 of 31 checks passed