Fix closespider_itemcount/timeout features #79
@Ziinc so far my test spiders are running normally. I noticed two problems, both of which are addressed in this PR.
Another question: could you advise whether there is a way to include the contents of the quickstart from documentation/quickstart.md in README.md? It would be better to have just one source...
Curious. I think the tests don't cover scrape speed. One of them should be item_count though. For the docs, do this:
Then remove the quickstart.md file from the docs folder. This makes the README.md file the source of truth for the quickstart.
lib/crawly/manager.ex
Outdated
@@ -99,7 +99,7 @@ defmodule Crawly.Manager do

    maybe_stop_spider_by_itemcount_limit(
      state.name,
      items_count,
      delta,
Incorrect: items_count should be passed, as we are comparing items scraped, not scrape speed.
Good point, fixed
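To illustrate the point of this review thread, here is a minimal sketch of why the accumulated count, not the per-check delta, has to be compared against the limit. The module `ItemCountSketch`, the function `stop_by_itemcount?/2`, the `:disabled` value, and the sample numbers are all hypothetical and chosen for illustration; this is not the code in lib/crawly/manager.ex.

```elixir
defmodule ItemCountSketch do
  # Hypothetical helper, not Crawly's actual implementation.
  # `count` is whatever value the caller chooses to compare against the limit.
  # `:disabled` is an assumed way of switching the check off.
  def stop_by_itemcount?(_count, :disabled), do: false

  def stop_by_itemcount?(count, limit) when is_integer(limit) do
    count >= limit
  end
end

# Assumed scenario: three consecutive checks, 40 new items scraped between
# each check, and an item-count limit of 100.
limit = 100
deltas = [40, 40, 40]

# Running totals of items scraped so far: [40, 80, 120]
totals = Enum.scan(deltas, &(&1 + &2))

# Comparing the accumulated total stops the spider on the third check.
by_total = Enum.map(totals, &ItemCountSketch.stop_by_itemcount?(&1, limit))

# Comparing the delta (scrape speed) never reaches the limit in this scenario.
by_delta = Enum.map(deltas, &ItemCountSketch.stop_by_itemcount?(&1, limit))

IO.inspect(by_total, label: "stop when comparing items_count")
IO.inspect(by_delta, label: "stop when comparing delta")
```

In this scenario the spider scrapes 120 items in total, yet a check based on the delta never fires because no single interval produces 100 items.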
It is a bit tricky, indeed. The item count is an accumulated value, so testing it requires the spider to run for a while. In general, I think I need to plan more time for improving these tests (the first implementation was quite basic, and these mutational changes showed that coverage is not as good as it should be). One of them should be item_count though. Regarding the documentation, as I understand it, we can't restrict the include to the Quickstart only, and I would not want to include the full README.md there :(
What's wrong with merging the current introduction.md with the README.md? The content is mostly the same, and in the SERPs, if people were to click on the hexdocs link first (instead of the GitHub repo), at least they would still get the same breakdown of what the project is.
It looks like a good idea. Maybe, in this case, we can replace the example from the introduction with the example from the quickstart. I will then remove the quickstart section from hex.
Sorry, could you please have another look? I have made the suggested changes to the documentation.
Deleted quickstart.md
Just did a side-by-side check of the example; the content is the same.
Changed to correct extension, removed unnecessary port setting.
Thanks! I will plan the new release for April 14!