Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

On page error #29

Merged
merged 11 commits into from Jun 1, 2014
Merged

On page error #29

merged 11 commits into from Jun 1, 2014

Conversation

taganaka
Copy link
Owner

@taganaka taganaka commented Jun 1, 2014

No description provided.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.01%) when pulling 1cf9863 on on_page_error into 637c70c on master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.01%) when pulling 5344b07 on on_page_error into 637c70c on master.

@@ -190,28 +191,33 @@ def takeover
else
page = pages.last
end

if page.nil?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When can it be nil?
http.fetch_pages always returns an Array

Or am I overlooking something?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right 👍

@coveralls
Copy link

Coverage Status

Coverage increased (+0.18%) when pulling ba5876e on on_page_error into 637c70c on master.

@tmaier
Copy link
Contributor

tmaier commented Jun 1, 2014

👍

@taganaka
Copy link
Owner Author

taganaka commented Jun 1, 2014

Not sure if at this point we should skip the call to on_page_downloaded when a page has error.
Kinda of breaking change but more consistent though

@tmaier
Copy link
Contributor

tmaier commented Jun 1, 2014

Well.... There is no header and no body...
This will most likely lead to errors when processing the page with xpath or other ways.
Second, the method is called on_page_downloaded. So one should expect a page downloaded.

But on the other hand, you might have code you want to run always.
And the method is not called on_page_success or on_page_sucesssfully_downloaded...

So I think you are right.
We should run on_page_downloaded regardless of the error.

I would just wish for an example which shows the possibilities of error handling with polipus.

As a side note:
I just thought about @on_before_save.each {|e| e.call(page)}
Should it run right before we store the page? And thus, skip it when storable? returns false?
My conclusion was the same. The on_before_save block is the only place where one could manipulate storable? if necessary

@taganaka
Copy link
Owner Author

taganaka commented Jun 1, 2014

I agree we should not change the current workflow.
on_page_error is a better semantic way to encapsulate network errors.
A better documentation plus more comprehensive examples should help the user to understand better how to proper use DSL exposed.

re on_before_save : agree to run it just before page store and after on_page_error

@coveralls
Copy link

Coverage Status

Coverage increased (+0.18%) when pulling ad44653 on on_page_error into 637c70c on master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.21%) when pulling 68a2e43 on on_page_error into 637c70c on master.

taganaka added a commit that referenced this pull request Jun 1, 2014
@taganaka taganaka merged commit 9e1fe6e into master Jun 1, 2014
@taganaka taganaka deleted the on_page_error branch June 1, 2014 20:49
@coveralls
Copy link

Coverage Status

Coverage increased (+0.21%) when pulling 68a2e43 on on_page_error into 637c70c on master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants