Replies: 2 comments
-
Yeah, plus one! Is there really no way to do this? The crawler defaults to storing its output in storage provided by Crawlee, writing it to disk, but ideally the crawler would just return the data from the handler function and leave the developer the choice of how and where to persist it.
-
You just need to collect the results into a variable yourself. The crawler cannot simply return the data, because it does not hold arbitrary results in memory.

```js
import { PlaywrightCrawler, Configuration } from 'crawlee';

// disable writing to disk
Configuration.getGlobalConfig().set('persistStorage', false);

let result;
const playwrightCrawler = new PlaywrightCrawler({
    // some options,
    async requestHandler({ request, page, log, parseWithCheerio }) {
        result = getData(); // getData() stands in for your own extraction logic
    },
});
await playwrightCrawler.run([{ url: 'https://example.com' }]);
return result; // or resolve(result)
```
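Since the handler above runs once per request, a single `result` variable would be overwritten when crawling several URLs. Here is a crawlee-free sketch of the same pattern using an array instead; the `crawl` helper is hypothetical and merely stands in for `playwrightCrawler.run()`, which in reality drives a browser and passes `page` etc. to the handler.

```javascript
// Hypothetical stand-in for crawler.run(): invokes the request handler
// once per URL, the way a crawler does for each enqueued request.
async function crawl(urls, requestHandler) {
  for (const url of urls) {
    await requestHandler({ request: { url } });
  }
}

// Collect one result object per crawled URL in memory,
// instead of letting the crawler persist anything to disk.
async function collectResults(urls) {
  const results = [];
  await crawl(urls, async ({ request }) => {
    // in a real handler this would be data parsed from the page
    results.push({ url: request.url, title: `title of ${request.url}` });
  });
  return results; // caller decides how and where to persist
}
```

The same idea works unchanged with the real `PlaywrightCrawler`: declare the array before constructing the crawler, push into it from `requestHandler`, and read it after `run()` resolves.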
-
I have an application that does the following:
I've read the docs, and there is an example of how to parse a single URL, but no way to parse a single URL with the help of Puppeteer/Playwright. It would be very nice if it were possible to parse a single URL, for example like this: