Is your feature request related to a problem? Please describe.
The record of the last scraped item is updated too infrequently. Currently, if the script crashes for any reason, one has to start from the beginning and scrape all the items over again.
Describe the solution you'd like
Change the "update last item downloaded" function so that it runs every other item, or every five or so items.
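To make the request concrete, here is a rough sketch of what checkpointing every few items could look like. It is not the project's actual code: the function name `scrape_with_checkpoints`, the `download` callback, the JSON config file, and the `last_message_id` key are all hypothetical, and it assumes items are processed one at a time, in order.

```python
import json
from pathlib import Path


def scrape_with_checkpoints(message_ids, download, config_path, checkpoint_every=5):
    """Process ids in order, saving the last finished id every few items.

    Hypothetical sketch: `download` stands in for whatever fetches one item,
    and the JSON layout is an assumption, not the project's config format.
    """
    config_path = Path(config_path)
    last_done = None
    for count, message_id in enumerate(message_ids, start=1):
        download(message_id)      # may raise; nothing below runs if it does
        last_done = message_id
        if count % checkpoint_every == 0:
            # Persist progress so a crash loses at most checkpoint_every - 1 items.
            config_path.write_text(json.dumps({"last_message_id": last_done}))
    # Final save so a clean run always records the very last item.
    if last_done is not None:
        config_path.write_text(json.dumps({"last_message_id": last_done}))
    return last_done
```

With `checkpoint_every=5`, a crash on the eighth item would leave item 5 recorded, so only items 6 and 7 would need to be scraped again.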
Describe alternatives you've considered
You could also handle this as an exception: when the script fails and crashes, it updates the last item scraped before exiting.
Additional context
Will take a small bit of coding, but will save a lot of time.
The config file is updated per batch of messages processed. Currently, a batch of 100 messages is processed concurrently, and once the batch is done the config is updated.
I like the graceful-exit solution, i.e., when the script crashes, it updates the config with the latest message_id before exiting. I will add it to the next set of features.
A batch of 100 messages? That is a lot, especially if you don't have the fastest network and you're downloading rather large files. I am assuming that is set in the following line:
begin_import(config, pagination_limit=100)
Regardless, the graceful stop will accomplish the same end result, although I am completely clueless about how to implement it.
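For what it's worth, one common way to get a graceful stop in Python is a `try`/`finally` around the loop, so the save runs whether the loop finishes, raises, or is interrupted with Ctrl+C. The sketch below is only an illustration under assumptions: `scrape_with_graceful_exit`, the `download` callback, the JSON config file, and the `last_message_id` key are hypothetical names, and it processes items sequentially rather than in concurrent batches.

```python
import json
from pathlib import Path


def scrape_with_graceful_exit(message_ids, download, config_path):
    """Process ids in order and always record the last finished id on the way out.

    Hypothetical sketch: `download` stands in for whatever fetches one item,
    and the JSON layout is an assumption, not the project's config format.
    """
    config_path = Path(config_path)
    last_done = None
    try:
        for message_id in message_ids:
            download(message_id)
            last_done = message_id   # only advanced after a successful download
    finally:
        # Runs on normal completion, on an exception, and on KeyboardInterrupt.
        if last_done is not None:
            config_path.write_text(json.dumps({"last_message_id": last_done}))
    return last_done
```

The exception still propagates after the save, so the failure stays visible. A `finally` block cannot help if the process is killed outright or the machine loses power, and with concurrent batches the value worth saving would be the highest id below which every message has finished, not simply the most recent one to complete.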
That works. I will leave it for you to close out the issue when desired.