Cannot resume if the request queue is too big #1990
dragospopa420 started this conversation in General
Replies: 1 comment
-
What do you mean by pausing? And how do you resume? Can you provide an actual repro? There were various small improvements in recent versions, so things might behave better now. The memory storage persists things to the file system, and is generally not suited for millions of requests. In the long run, we'd like to introduce more storage clients, including some backed by regular databases like postgres, which should generally help. You could also try https://github.com/apify/apify-storage-local-js which uses sqlite as the backend.
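The sqlite-backed client mentioned above can be plugged into a crawler roughly like this. This is a sketch based on the two packages' documented APIs, assuming both `crawlee` and `@apify/storage-local` are installed; the start URL is illustrative:

```javascript
// Sketch: replace Crawlee's default memory storage with the
// sqlite-backed @apify/storage-local client, so the request queue
// lives in a database file instead of memory + JSON files.
import { CheerioCrawler, Configuration } from 'crawlee';
import { ApifyStorageLocal } from '@apify/storage-local';

// Route all storage operations (request queues, datasets,
// key-value stores) through the sqlite client.
Configuration.getGlobalConfig().useStorageClient(new ApifyStorageLocal());

const crawler = new CheerioCrawler({
    async requestHandler({ request, enqueueLinks, log }) {
        log.info(`Processing ${request.url}`);
        await enqueueLinks();
    },
});

await crawler.run(['https://example.com']); // illustrative URL
```

With the queue in sqlite, its size is no longer bounded by process memory, which is the failure mode the discussion describes.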
-
Which package is this bug report for? If unsure which one to select, leave blank
@crawlee/memory-storage
Issue description
Steps to reproduce:
It doesn't matter whether the scraper uses Cheerio, Playwright, or Puppeteer; I tested all three. (I'll add sample code from a Cheerio one.)
From what I saw, after roughly 1 million requests, if I pause the scraper for any reason it cannot be resumed... it never resumes.
Maybe it would be useful to have the option to use a Redis backend for the request queues.
Code sample
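The sample code itself was not captured in this page. Below is a minimal sketch of the kind of Cheerio crawl described, assuming `crawlee` is installed; the start URL is illustrative, and "pausing" here means interrupting the process, with resuming done by re-running it with `CRAWLEE_PURGE_ON_START=0` so the persisted queue is reused rather than purged:

```javascript
// Sketch of a repro: a CheerioCrawler driving a large request queue.
// To "pause", interrupt the process (Ctrl+C); to resume, re-run with
// CRAWLEE_PURGE_ON_START=0 so the stored queue is not purged on start.
// The start URL is illustrative, not from the original report.
import { CheerioCrawler } from 'crawlee';

const crawler = new CheerioCrawler({
    async requestHandler({ request, enqueueLinks, log }) {
        log.info(`Processing ${request.url}`);
        // Keep enqueueing discovered links so the queue grows
        // toward the ~1M requests at which resuming reportedly fails.
        await enqueueLinks();
    },
});

await crawler.run(['https://example.com']);
```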
Package version
3.3.0
Node.js version
18.14.2
Operating system
Fedora 37
Apify platform
I have tested this on the next release
No response
Other context
No response