If metrics workers error, ensure socket listeners are stopped #8112

Merged
merged 1 commit into from Nov 21, 2017

Conversation

Projects
None yet
3 participants
Owner

wallyworld commented Nov 21, 2017

Description of change

Bad data, eg zero length files in /var/lib/juju/metricspool/, cause the metric workers to bounce. But upon doing so, they don't close their socket. We add error handling to the "run" func passed to the periodic workers so that if there's an error, stop is called on the particular metrics structs, which stops the listener.

Also, if there's an error creating the content of the metric file, don't try and use it.

QA steps

https://bugs.launchpad.net/juju/+bug/1733469/comments/6
Create zero length files and check for leaks.

Bug reference

https://bugs.launchpad.net/juju/+bug/1733469

Given the need to do the defer in multiple places, it feels a bit like we should be incorporating something into the overall Periodic Worker interface. Maybe something like a TearDown or a Cleanup ?

@@ -55,6 +56,10 @@ func (f *metricFile) Close() error {
if err != nil {
return errors.Trace(err)
}
+ // If the file contents are garbage, don't try and use it.
+ if f.encodeErr != nil {
+ return nil
@jameinel

jameinel Nov 21, 2017

Owner

shouldn't this be returning an error rather than nil?

@wallyworld

wallyworld Nov 21, 2017

Owner

No because this is the Close(). The close itself worked and is done as cleanup as part of a defer.
The actual error is reported outside this. It's only here as a flag.

Owner

wallyworld commented Nov 21, 2017

Agree the periodic worker should provide a facility to better recover from errors invoking call().
I wanted to get the fix done in time for rc2. We certainly should follow up with an evaluation of the call sites for periodic worker and redo things a bit - periodic worker is upstream do it will require a new v2 branch which would make getting this done for rc2 a challenge.

Owner

wallyworld commented Nov 21, 2017

$$merge$$

Contributor

jujubot commented Nov 21, 2017

Status: merge request accepted. Url: http://ci.jujucharms.com/job/github-merge-juju

@jujubot jujubot merged commit 432cd56 into juju:develop Nov 21, 2017

1 check passed

continuous-integration/jenkins/pr-merge This commit looks good
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment