Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve gather stage error handling #21

Closed
amercader opened this issue Mar 14, 2013 · 3 comments
Closed

Improve gather stage error handling #21

amercader opened this issue Mar 14, 2013 · 3 comments

Comments

@amercader
Copy link
Member

The way queue.py handles the gather stage is very inconvenient, as it captures all exceptions that may happen in the harvester gather_stage, preventing all debugging and leaving the job in a half state.

Similarly to the refactoring on the fetch stage, there should be no exceptions caught, as harvesters should be the ones being robust enough (if there are exceptions happening, we want to see them). If gather stage returns anything that is not a list of ids, the process stops. If the harvest object ids list is empty, the process stops. gather_finished is always set up, allowing the "run" command to flag the job as Finished (we want the "run" command to do it so the harvest source status is reindexed).

Also the debug messages with thousands of harvest object ids are not very useful.

On a later stage, we could investigate into implementing retry times as in the fetch stage.

amercader added a commit that referenced this issue Mar 14, 2013
See issue for full details. Basically we don't want to catch any
exception at the queue.py level, as they prevent debugging. Harvesters
should deal with them and return a list of ids or an empty list if no
objects need to be fetched.
Also improved the debug messages.
@davidread
Copy link
Contributor

You say that one's harvester's gather_stage/fetch_stage "should be the ones being robust enough" but I think that is most unhelpful for harvester writers.

A harvester writer should be most concerned about the normal operation. They should be able to then deploy a harvester and not have to worry about having to catch every last exception type. They may not be able to simulate everything that could go wrong. For example a socket time-out raises a different exception between Python 2.6 and 2.7.

So exceptions will sometimes happen when these things are deployed - we must accept that. As it is, this will cause the harvest to stop in an incomplete state, leaving it in limbo, unable to be rerun and display no error message in the web interface.

The reason you give is to aid development. Well how about printing the exception in the log and saving a HarvestGatherError/HarvestObjectError so that the user can see there was a problem. This should be quite sufficient for a harvester developer, reduces the burden on them and much more helpful to general users.

I have some code to do this - what do you think?

@amercader
Copy link
Member Author

I said these things a long time ago :) I guess that my reasoning at the time is that for instance your harvester gathers objects from a remote API and fails on the last page because you messed up with the loop code you want to notice this to fix it. But you are right that if the issue is flagged as new harvest error that should make it apparent.

Happy to review and merge your changes.

@davidread
Copy link
Contributor

Cool. I have various things to be more defensive with harvesters so that we don't end up in limbo states or without errors showing.

amercader pushed a commit that referenced this issue Oct 8, 2021
fix for py3 also works for py2;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants