Skip to content

When using context.create_timer in monitor pattern, timer doesn't fire after several (usually 6 to 8) iterations #168

@dave-read

Description

@dave-read

We're trying to implement the monitoring pattern. The target scenario had a polling interval of 30 seconds with expiry time of start + 10 minutes that would result in about 20 polling attempts with a 30 second timer to delay retry. We consistently saw the function run correctly for a while, but after 6 to 8 iterations, the function is never continued.

We increase the delay to 60 seconds which we saw in the ML Orchestration sample to see if 30 seconds was too small, but we saw the same behavior of only getting through 6 to 8 iterations.

There's no error in the History table, and logs just stop from the runtime. We created a more narrow case from the Timer Usage example and saw the same behavior. Then narrowed it even more to just invoke the timer multiple times.

The function has a dedicated hubName and the number of history rows didn't seem to warrant the use of ContinueAsNew. Our assumption was that if memory pressure was causing this we'd see something in the runtime logs or History table.

Here's the narrow example.

def orchestrator_function(context: df.DurableOrchestrationContext):
    
    for i in range(0, 10):
        deadline = context.current_utc_datetime + timedelta(seconds=30)
        logging.info(">>>>>>>>>>>>>>>>>>> creating timer [timer:{}][deadline:{:%Y-%m-%d %H:%M:%S}][is_replaying:{}]".format(i,deadline,context.is_replaying))
        yield context.create_timer(deadline)
        logging.info("<<<<<<<<<<<<<<<<<<< timer fired    [timer:{}][deadline:{:%Y-%m-%d %H:%M:%S}][is_replaying:{}]".format(i,deadline,context.is_replaying))
    print("################### loop completed")
    
main = df.Orchestrator.create(orchestrator_function)

This runs until the 8th iteration. Then that timer never continues in the function.

[7/17/2020 3:39:32 AM] >>>>>>>>>>>>>>>>>>> creating timer [timer:0][deadline:2020-07-17 03:33:15][is_replaying:True]
[7/17/2020 3:39:32 AM] <<<<<<<<<<<<<<<<<<< timer fired    [timer:0][deadline:2020-07-17 03:33:15][is_replaying:True]
[7/17/2020 3:39:32 AM] >>>>>>>>>>>>>>>>>>> creating timer [timer:1][deadline:2020-07-17 03:33:45][is_replaying:True]
[7/17/2020 3:39:32 AM] <<<<<<<<<<<<<<<<<<< timer fired    [timer:1][deadline:2020-07-17 03:33:45][is_replaying:True]
[7/17/2020 3:39:32 AM] >>>>>>>>>>>>>>>>>>> creating timer [timer:2][deadline:2020-07-17 03:34:37][is_replaying:True]
[7/17/2020 3:39:32 AM] <<<<<<<<<<<<<<<<<<< timer fired    [timer:2][deadline:2020-07-17 03:34:37][is_replaying:True]
[7/17/2020 3:39:32 AM] >>>>>>>>>>>>>>>>>>> creating timer [timer:3][deadline:2020-07-17 03:35:26][is_replaying:True]
[7/17/2020 3:39:32 AM] <<<<<<<<<<<<<<<<<<< timer fired    [timer:3][deadline:2020-07-17 03:35:26][is_replaying:True]
[7/17/2020 3:39:32 AM] >>>>>>>>>>>>>>>>>>> creating timer [timer:4][deadline:2020-07-17 03:35:55][is_replaying:True]
[7/17/2020 3:39:32 AM] <<<<<<<<<<<<<<<<<<< timer fired    [timer:4][deadline:2020-07-17 03:35:55][is_replaying:True]
[7/17/2020 3:39:32 AM] >>>>>>>>>>>>>>>>>>> creating timer [timer:5][deadline:2020-07-17 03:36:44][is_replaying:True]
[7/17/2020 3:39:32 AM] <<<<<<<<<<<<<<<<<<< timer fired    [timer:5][deadline:2020-07-17 03:36:44][is_replaying:True]
[7/17/2020 3:39:32 AM] >>>>>>>>>>>>>>>>>>> creating timer [timer:6][deadline:2020-07-17 03:37:32][is_replaying:True]
[7/17/2020 3:39:32 AM] <<<<<<<<<<<<<<<<<<< timer fired    [timer:6][deadline:2020-07-17 03:37:32][is_replaying:True]
[7/17/2020 3:39:32 AM] >>>>>>>>>>>>>>>>>>> creating timer [timer:7][deadline:2020-07-17 03:38:21][is_replaying:True]
[7/17/2020 3:39:32 AM] <<<<<<<<<<<<<<<<<<< timer fired    [timer:7][deadline:2020-07-17 03:38:21][is_replaying:True]
[7/17/2020 3:39:32 AM] >>>>>>>>>>>>>>>>>>> creating timer [timer:8][deadline:2020-07-17 03:39:11][is_replaying:True]

After [7/17/2020 3:39:32 AM] there are no more runtime logs generated for the function and that's where the history stops


4c4c22878e044d679401719202af2842 | 24 | 2020-07-17T03:39:32.304Z | -1 | Edm.Int32 | OrchestratorStarted | Edm.String | 58ef97d79c884c63a4131467e96e3324 | Edm.String | FALSE | Edm.Boolean | 2020-07-17T03:39:32.258Z | Edm.DateTime
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
4c4c22878e044d679401719202af2842 | 25 | 2020-07-17T03:39:32.304Z | -1 | Edm.Int32 | TimerFired | Edm.String | 58ef97d79c884c63a4131467e96e3324 | Edm.String | TRUE | Edm.Boolean | 2020-07-17T03:38:42.018Z | Edm.DateTime
4c4c22878e044d679401719202af2842 | 26 | 2020-07-17T03:39:32.304Z | -1 | Edm.Int32 | OrchestratorCompleted | Edm.String | 58ef97d79c884c63a4131467e96e3324 | Edm.String | FALSE | Edm.Boolean | 2020-07-17T03:39:32.279Z | Edm.DateTime
4c4c22878e044d679401719202af2842 | sentinel | 2020-07-17T03:39:32.304Z |   |   |   |   | 58ef97d79c884c63a4131467e96e3324 | Edm.String |   |   |   |  

One other thing that we noticed is that the context property is_replaying is alway true which doesn't match our expectation or what's in the History table.

We've implemented the same pattern from the samples with C# and is seems to work fine. So this seems to be something in the Python implementation.

Anything obvious that we've missed?

Thanks

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions