-
Notifications
You must be signed in to change notification settings - Fork 63
Fix duplicate key error when exceptions occur #323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| private readonly ILogger _logger; | ||
|
|
||
| private readonly IDictionary<TelemetryPropertyName, string> _telemetryProps; | ||
| private readonly IDictionary<TelemetryPropertyName, string> _telemetryProps = new Dictionary<TelemetryPropertyName, string>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would suggest instantiating a new dictionary every time inside StartAsync method. It is possible that the WebJobs SDK calls StartAsync while the previously called StartAsync method is still running. We have checks to ensure proper functionality for such cases. See line 82 in this revision. With current code, we can have a situation where the second method instance clears the dictionary while the first instance (including class SqlTableChangeMonitor) is still logging messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm kind of surprised that's expected behavior - what scenario does that happen in?
And instead I'm just going to move the telemetry calls below so they're only called when it's actually successfully starting. We don't want to be tracking listener starts when it doesn't actually start.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I did this in my open PR for additional debug logging - leave a comment there if you see any other issues)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm kind of surprised that's expected behavior - what scenario does that happen in?
I am making a guess by looking at the WebJobs SDK source code: If there are multiple listeners registered in the function app, the CompositeListener will call StartAsync on all listeners). If either one of the listeners throw exception, it seems all the listeners will be called again. The listeners which have already started are supposed to be skipping performing the startup operation in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, thanks for digging into that!
Noticed that if an exception occurred it would log the exception the first time but subsequent times would throw a top-level error about a dictionary key already existing and exit the function.
This is because the listener was keeping a list of common properties but never clearing them, which meant that if a failure occurred it would try to start the function up again (without reloading) so the properties in the dictionary were still populated and thus throw when they were attempted to be added again.
(There's a companion PR that I'll be sending out to fix another issue here, but that's in the core telemetry code so I'll be making that in main and porting up here)