Drain queue #10

Closed
grafana-dee opened this issue Jul 8, 2015 · 8 comments

Comments

@grafana-dee

I have the following scenario:

  1. Start crawling a large website
  2. Queue 1,000 URLs
  3. Need to cancel the crawl ASAP (for whatever reason)

q.Close() does not help: it prevents new items from being added to the queue, but it offers no way to drain the items that are already queued.

I'm not sure how to approach it, otherwise I'd offer a patch. Could a q.Drain() be added? Or, better still, a q.Cancel() that first calls q.Close() and then drains the queue before returning and releasing q.Block()?
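
For anyone following along, here is a rough sketch of the semantics being asked for: Cancel closes the queue, discards whatever is still pending, and releases Block. The workQueue type and its fields are hypothetical and only illustrate the idea; this is not the library's actual Queue implementation, and the example URLs are made up.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// workQueue is a hypothetical stand-in for a crawler queue.
// It only illustrates the proposed Close/Cancel/Block semantics.
type workQueue struct {
	mu     sync.Mutex
	items  []string
	closed bool
	done   chan struct{}
	once   sync.Once
}

func newWorkQueue() *workQueue {
	return &workQueue{done: make(chan struct{})}
}

// Send enqueues a URL, and refuses once the queue is closed.
func (q *workQueue) Send(u string) error {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.closed {
		return errors.New("queue closed")
	}
	q.items = append(q.items, u)
	return nil
}

// Close stops new items from being accepted; pending items stay queued.
func (q *workQueue) Close() {
	q.mu.Lock()
	q.closed = true
	q.mu.Unlock()
}

// Cancel closes the queue, discards every pending item, and releases
// Block. It returns the number of items that were dropped.
func (q *workQueue) Cancel() int {
	q.mu.Lock()
	q.closed = true
	n := len(q.items)
	q.items = nil
	q.mu.Unlock()
	q.once.Do(func() { close(q.done) })
	return n
}

// Block waits until the queue has been cancelled.
func (q *workQueue) Block() { <-q.done }

func main() {
	q := newWorkQueue()
	for i := 0; i < 1000; i++ {
		_ = q.Send(fmt.Sprintf("http://example.com/page/%d", i))
	}
	dropped := q.Cancel()
	q.Block() // returns at once because Cancel released it
	fmt.Println("dropped:", dropped)
}
```

The point of the sketch is that Close alone leaves the 1,000 pending items in place, while Cancel empties the queue in one step so that Block can return right away.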

@mna
Member

mna commented Jul 8, 2015

q.Cancel sounds like a good idea, yeah. I may not get to it before a few weeks, though.

@mna
Member

mna commented Jul 8, 2015

Actually, I had a little free time and implemented an experiment on the cancel branch, if you want to take a look. I may have missed a couple of things, as I haven't looked at that code in a while, but if that's all there is to it, it would be pretty simple. I will test it when I have a bit more time.

@mna
Member

mna commented Jul 23, 2015

Did you get a chance to try it out? Was it working as expected?

@grafana-dee
Author

I got a chance to try it out, but not to debug it (got on a plane and am now in a different city doing different things).

It appears to work (it cancels further crawling), but never actually unblocks.

Stops here:

q.wg.Wait()

My queue was fairly long at the time of cancellation (several thousand URLs at a default rate of 2 reqs per second), but it hangs indefinitely, waiting for something else to mark the waitgroup as done.
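
The thread does not say what the actual bug in the cancel branch was, so the following is only a sketch of one common way a hang like this arises: if every enqueued item does a wg.Add(1), then every item discarded on cancel also needs a matching wg.Done(), or wg.Wait() never returns. The trackedQueue type is hypothetical and is not the library's code.

```go
package main

import (
	"fmt"
	"sync"
)

// trackedQueue is a hypothetical queue that counts pending items with
// a sync.WaitGroup, so Block returns only once the counter reaches zero.
type trackedQueue struct {
	mu      sync.Mutex
	wg      sync.WaitGroup
	pending []string
}

// Send records one unit of outstanding work per queued URL.
func (q *trackedQueue) Send(u string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.wg.Add(1)
	q.pending = append(q.pending, u)
}

// Cancel discards all pending items. Each discarded item must still be
// marked done; skipping this loop would leave Block waiting forever.
func (q *trackedQueue) Cancel() int {
	q.mu.Lock()
	n := len(q.pending)
	q.pending = nil
	q.mu.Unlock()
	for i := 0; i < n; i++ {
		q.wg.Done()
	}
	return n
}

// Block waits until every queued item has been processed or discarded.
func (q *trackedQueue) Block() { q.wg.Wait() }

func main() {
	q := &trackedQueue{}
	for i := 0; i < 3000; i++ {
		q.Send(fmt.Sprintf("http://example.com/%d", i))
	}
	dropped := q.Cancel()
	q.Block() // returns because the counter was balanced in Cancel
	fmt.Println("dropped:", dropped)
}
```

With several thousand items queued, a single missed Done is enough to keep Wait blocked, which would match the symptom described above.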

@mna
Member

mna commented Jul 23, 2015

Cool, I'll try to look at it today.

@grafana-dee
Author

Much appreciated. In fact, if you have a PayPal email, let me know. Beer will be bought.

@mna
Member

mna commented Jul 23, 2015

I found the bug and added a test for the Cancel method. It should now work as expected and finish quickly. Let me know how it goes, and I'll merge once you confirm it works for you too (if you're able to test it, of course; otherwise I'll just merge right away).
