feat: add keepAlive to crawler options #1452

Merged
merged 1 commit into from
Aug 11, 2022
B4nan
Member

@B4nan B4nan commented Aug 10, 2022

Adds a `keepAlive` crawler option that keeps the crawler running even when the queue is empty. Use `crawler.teardown()` to stop it.

Closes #1436
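
To illustrate the behaviour described above, here is a minimal sketch of how a `keepAlive` flag might change a crawler's finish condition. It is a toy model, not Crawlee's implementation: `MiniCrawler`, `handleNext()`, and `isFinished()` are hypothetical names invented for this example, and only the `keepAlive` option and `teardown()` come from this PR.

```typescript
// Hypothetical toy model for illustration; not Crawlee's actual internals.
class MiniCrawler {
    private queue: string[] = [];
    private stopped = false;
    private readonly keepAlive: boolean;
    readonly handled: string[] = [];

    constructor(options: { keepAlive?: boolean } = {}) {
        this.keepAlive = options.keepAlive ?? false;
    }

    addRequests(urls: string[]): void {
        this.queue.push(...urls);
    }

    // Processes one queued URL, if any, and reports whether work was done.
    handleNext(): boolean {
        const url = this.queue.shift();
        if (url === undefined) return false;
        this.handled.push(url); // stand-in for a real request handler
        return true;
    }

    // Without keepAlive, an empty queue means the crawl is finished.
    // With keepAlive, only an explicit teardown() ends it.
    isFinished(): boolean {
        if (this.stopped) return true;
        if (this.keepAlive) return false;
        return this.queue.length === 0;
    }

    teardown(): void {
        this.stopped = true;
    }
}

const crawler = new MiniCrawler({ keepAlive: true });
crawler.addRequests(['https://example.com']);
while (crawler.handleNext()) {
    // drain the queue
}
console.log(crawler.isFinished()); // false: queue is empty, but keepAlive holds the crawler open
crawler.teardown();
console.log(crawler.isFinished()); // true
```

In this sketch the only change `keepAlive` makes is to the finish check, which is why an explicit `teardown()` call is needed to stop the crawler.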

@B4nan B4nan requested a review from vladfrangu August 10, 2022 14:10
@szmarczak
Contributor

Wouldn't it be better to make the crawlers always on standby? Then there would be no need to call `crawler.run`: just add requests, and the crawler would keep running as long as the Node.js event loop is busy.

@B4nan
Member Author

B4nan commented Aug 10, 2022

Many things happen in the `run` method, so it would need to stay anyway. The only thing I can imagine is calling it automatically in the background (ensuring it was run) from inside methods like `addRequests()`. Or do you see some easy to do refactor? I would rather not fall into a huge rebuild of this part right now; we are already out in stable.

@vladfrangu
Member

I was more wondering whether this means the crawler will keep running and poll the request queue every so often to see if there are new requests, and run them if so (so you can externally add requests to be handled, and make the crawler be always on).

@szmarczak
Contributor

> Or do you see some easy to do refactor?

No idea, just asking.

@B4nan
Member Author

B4nan commented Aug 10, 2022

> so you can externally add requests to be handled, and make the crawler be always on

Yes, I believe it should work like that.
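
The "always on" usage discussed here could look roughly like the sketch below: a crawler that is started in the background, idles while its queue is empty, and picks up requests added from outside until it is told to stop. This is only an illustration under assumptions. `PollingCrawler`, the fixed poll interval, and `demo()` are hypothetical, and the thread does not confirm how Crawlee actually waits for new requests.

```typescript
// Hypothetical sketch of an always-on crawler; not Crawlee's implementation.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

class PollingCrawler {
    private queue: string[] = [];
    private stopped = false;
    private readonly pollIntervalMs: number;
    readonly handled: string[] = [];

    constructor(pollIntervalMs = 10) {
        this.pollIntervalMs = pollIntervalMs;
    }

    addRequests(urls: string[]): void {
        this.queue.push(...urls);
    }

    teardown(): void {
        this.stopped = true;
    }

    // Resolves only after teardown(); an empty queue means "wait and check again".
    async run(): Promise<void> {
        while (!this.stopped) {
            const url = this.queue.shift();
            if (url === undefined) {
                await sleep(this.pollIntervalMs);
                continue;
            }
            this.handled.push(url); // stand-in for a real request handler
        }
    }
}

async function demo(): Promise<string[]> {
    const crawler = new PollingCrawler();
    const finished = crawler.run(); // start in the background, do not await yet
    crawler.addRequests(['https://example.com/a']);
    await sleep(30); // the queue drains and the crawler idles
    crawler.addRequests(['https://example.com/b']); // added externally while idle
    await sleep(30);
    crawler.teardown();
    await finished;
    return crawler.handled;
}
```

Calling `demo()` resolves with the URLs in the order they were handled; the second URL is processed even though the queue was already empty when it arrived.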

cc @metalwarrior665, is there anything more to this problem than what I already did?

@B4nan B4nan merged commit 084b6b2 into master Aug 11, 2022
@B4nan B4nan deleted the keep-alive branch August 11, 2022 07:37
Successfully merging this pull request may close these issues.

Add keepAlive to crawler options