Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rewritten Mojolicious::Guides::Cookbook Non-blocking User-Agent recipe #411

Closed
wants to merge 1 commit into from

Conversation

creaktive
Copy link

Do not revisit links and constrain traversal to a single domain.
Improved the parallelization, also.

@kraih
Copy link
Member

kraih commented Nov 8, 2012

-1 from me i'm afraid, would like to see the example become smaller, not bigger.

@kraih
Copy link
Member

kraih commented Nov 8, 2012

There are also a few style issues.

a) Multiple statements on a single line separated by semicolon.
b) No empty line before comment lines.
c) Comments that just sound wrong, like "Constraint our crawler".
d) "use" statements got removed, even though the class is used.

@jberger
Copy link
Member

jberger commented Nov 8, 2012

I have to agree with @kraih, examples in documentation need not be the most effective, they just need to highlight functionality. Showing good algorithms or effective use of logic is probably better reserved for a blog post. Note that I don't say that dismissively, I blog like this often: http://blogs.perl.org/users/joel_berger/

@marcusramberg
Copy link
Member

-1 from me too, for the above reasons.

@kraih
Copy link
Member

kraih commented Nov 8, 2012

It is in fact my least favorite example in the whole cookbook, so i would very much welcome a more elegant alternative that also brings the point across.

@kraih kraih closed this in 8b30830 Nov 8, 2012
@creaktive
Copy link
Author

I'm sorry, I really messed up forgetting to state the real purpose of non-blocking crawler recipe rewrite. I've got too enthusiastic with the coding process, I suppose :)
The former recipe never halts, even with a finite queue. At one point, this was misleading for me.
Also, as @kraih stated, the recipe isn't quite elegant, specially this part:

  # Dequeue or wait 2 seconds for more URLs
  return Mojo::IOLoop->timer(2 => sub { crawl($id) })
    unless my $url = shift @urls;

So, I've thrown away the URL de-duplication& domain constraint stuff, and the halting fix itself (which I originally called "Improved the parallelization") boiled down to this:
https://github.com/creaktive/mojo/commit/b609c1dcee83bc2ac3e096d795e1bdcfc4697df5
Please tell me what you think.

@kraih
Copy link
Member

kraih commented Nov 8, 2012

While i like that version a lot more, i'm not sure removing the timer is a good idea, it was added intentionally to demonstrate that timers can be used together with a non-blocking user agent.

@creaktive
Copy link
Author

Just sayin': it might be more didactical to add something like a "ticker" in this case:

Mojo::IOLoop->recurring(1 => sub { printf STDERR "\n%d URLs to go\n\n", scalar @urls });

Timer which fires only on the beginning and the end of the queue isn't very funny, IMHO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants