Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cron thread #12

Merged
merged 64 commits into from
Jan 15, 2018
Merged

Cron thread #12

merged 64 commits into from
Jan 15, 2018

Conversation

prophile
Copy link
Contributor

@prophile prophile commented Dec 21, 2017

@danpalmer
Copy link
Contributor

I hope this isn't too much of a spanner in the works, but I've added a third type of trigger, an 'interval' trigger, that works on an interval rather than at a specific time. It seemed necessary to implement the full Iterable replacement. Happy to chat if you don't think it's necessary, and also happy to do write the cron code to make it work as well, once this is in a state where I can safely build off it and extend it.

This breaks out the actual scheduling logic to helper functions
outside the cron thread class, making it possible to test the
scheduling without needing to actually run the thread.
@coveralls
Copy link

Coverage Status

Coverage decreased (-1.07%) to 98.649% when pulling dcc83ab on cron into c1767c2 on master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.3%) to 99.455% when pulling 36f4a32 on cron into c1767c2 on master.

@danpalmer
Copy link
Contributor

We chatted IRL about this, we decided that the thread should gracefully exit.

@danpalmer danpalmer changed the title [WIP] cron thread Cron thread Jan 10, 2018
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.2%) to 99.229% when pulling b3ac5b1 on cron into 0811a5b on master.

Copy link
Contributor

@PeterJCLaw PeterJCLaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a bit too much magic in the way that the _inner behaves, though is very clever.

except Exception as e:
# Something went wrong in the processing of a single item in the
# cron queue, log it and allow the processor to continue.
print(e)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be in favour of using logging.exception here, even if we configure logging to go to stdout by default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes agreed, this (and other parts) are not yet complete.

@@ -2,34 +2,43 @@

import time
import threading
import contextlib
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether it's a result of looking at the diff; unfortunately this file feels rather spaghetti-like to me at the moment. Maybe rearrange the functions so they're more clearly grouped? (I tend to enjoy ordering the file to roughly match an upside-down call stack -- detail/utils at the top, higher level logic at the bottom)

return self._terminating


class StopCronProcessing(Exception):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we pull this class and the following type definitions to the top of the file? (I think I'd expect the type definitions first, then the exception, but I'm not sure there's a clear standard yet)


with app.db.begin() as conn:
try:
with app.db.begin() as conn:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional that the process_transitions(app, label) call below is outside this context?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I strongly suspect this is an error since process_one_job is an instance of routemaster.cron._process_cron_job._inner which silences exceptions thrown within the context. This means that if this context errors we'll run process_transitions(app, label) for the label anyway. I'm fairly sure we don't want that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've discussed this IRL and resolved in a refactor I think.

) -> None:

@contextlib.contextmanager
def _inner() -> Iterator[None]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is a very neat mechanism, it's a bit surprising that a) it's a context manager, and b) a context manager throws an exception as part of its normal operations. It seems plausible that the exception could accidentally get caught by one of our blanket except Exceptions (which we're using quite a lot of in this project).

Did you consider using something like a threading.Event, which would have clearer semantics?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're going to want to document whatever mechanism we go for somewhere so that it's clear what needs to be done to use it within other parts of the system.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having just realised that this manager also silences exceptions and spotted a bug with its usage in routemaster.state_machines.actions.process_retries (see comments there), I don't think that this is a good mechanism to use.
I think there's just a bit too much magic happening here for it to be safe in general use.

_trigger_gate,
state,
)
elif isinstance(trigger, MetadataTrigger):
scheduler.every(60).seconds.do(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with the action triggering, this might be clearer as scheduler.every().minute.do(.

lock_label(label, conn)
current_state = get_current_state(label, state_machine, conn)

if current_state != action:
continue
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this need to change from continue to return? Surely we don't want to stop processing for all the pending users just because one of them is in the wrong state?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, leftover from refactoring. d77e59c

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.2%) to 99.232% when pulling 2d9fd45 on cron into 0811a5b on master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.2%) to 99.195% when pulling aad2c63 on cron into 0811a5b on master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.2%) to 99.248% when pulling 0400436 on cron into 0811a5b on master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.2%) to 99.146% when pulling e682446 on cron into 7202f4d on master.

@danpalmer danpalmer merged commit 922a96f into master Jan 15, 2018
@danpalmer danpalmer deleted the cron branch January 15, 2018 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants