This issue was moved to a discussion.

Huge sitemap takes forever to load #2384

Closed
1 task
teammakdi opened this issue Mar 18, 2024 · 0 comments
Labels
bug Something isn't working.

Comments

@teammakdi

Which package is this bug report for? If unsure which one to select, leave blank

None

Issue description

https://dev.to/sitemap-index.xml contains a very large number of links, so loading it takes extremely long. Maybe we could add a timeout or URL-limit parameter.

Code sample

import { Sitemap } from 'crawlee';

const { urls } = await Sitemap.load('https://dev.to/sitemap-index.xml');
console.log(urls.length);
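
As a caller-side stopgap for the timeout half of the request, the load promise can be raced against a timer. This is only a sketch: `withTimeout` is a hypothetical helper, not part of crawlee, and the 30-second budget is an arbitrary assumption. It only stops waiting; it does not cancel the underlying requests, and it does not cap the number of URLs.

```javascript
// Hypothetical helper (not a crawlee API): rejects if `promise` has not
// settled within `ms` milliseconds.
function withTimeout(promise, ms, label = 'operation') {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(
            () => reject(new Error(`${label} timed out after ${ms} ms`)),
            ms,
        );
    });
    // Whichever settles first wins; the timer is cleared either way so it
    // does not keep the process alive.
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage with the call from this report (assumes crawlee is installed and
// the code runs as an ES module):
//
// import { Sitemap } from 'crawlee';
// const { urls } = await withTimeout(
//     Sitemap.load('https://dev.to/sitemap-index.xml'),
//     30_000, // assumed budget
//     'Sitemap.load',
// );
```

Because the original load keeps running in the background after a timeout, this bounds the wait but not the work, which is why a built-in option would still be preferable.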

Package version

3.8.1

Node.js version

20.11.1

Operating system

Ubuntu

Apify platform

  • Tick me if you encountered this issue on the Apify platform

I have tested this on the next release

No response

Other context

No response

@teammakdi teammakdi added the bug Something isn't working. label Mar 18, 2024
@apify apify locked and limited conversation to collaborators Mar 18, 2024
@janbuchar janbuchar converted this issue into discussion #2385 Mar 18, 2024
