Suppress errors when sending to a nonexistent receiver #6
I specifically added the existing logging to catch this case (to help me and others debug why things are not working). How about we add a counter to the loop and only log every Nth time (e.g. every 100th) this happens? That should reduce log spam but still be helpful.
Fair enough. Stopping after a set number of messages is a little tricky, particularly with this implementation. For instance, when do you reset your error counter back to 0? How about using |
Sorry, misread your message. Thought you said log the first 100 rather than every 100th error! My question about changing the behaviour when |
Like this, for instance?
loop :: Metrics.Store   -- ^ Metric store
     -> Metrics.Sample  -- ^ Last sampled metrics
     -> Socket.Socket   -- ^ Connected socket
     -> StatsdOptions   -- ^ Options
If you add an Int parameter to this function, pass it to flushSample and return it from there, you can do something like
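-- On success the failure count is returned unchanged.
(Socket.sendAll socket msg >> return failCount) `catch` \ (e :: IOException) -> do
    -- Only report every 100th failure to limit log spam.
    when (failCount `mod` 100 == 0) $
        T.hPutStrLn stderr $ "ERROR: Couldn't send message: " <>
            T.pack (show e) <> " (happened " <> T.pack (show failCount) <> " times)"
    return $! failCount + 1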
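For anyone who wants to try the throttling idea in isolation, here is a minimal, self-contained sketch using only base. The names sendThrottled, logEvery and failingSend are hypothetical and not part of ekg-statsd; the failing send is simulated with ioError rather than a real socket.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
module Main (main) where

import Control.Exception (IOException, catch)
import Control.Monad (foldM, when)
import System.IO (hPutStrLn, stderr)

-- | Hypothetical helper: run a send action and return the updated failure
-- count. A failure is reported on stderr only when the number of earlier
-- failures is a multiple of 'logEvery' (which must be positive).
sendThrottled :: Int -> Int -> IO () -> IO Int
sendThrottled logEvery failCount send =
    (send >> return failCount) `catch` \(e :: IOException) -> do
        when (failCount `mod` logEvery == 0) $
            hPutStrLn stderr $ "ERROR: Couldn't send message: " ++ show e
                ++ " (" ++ show failCount ++ " earlier failures)"
        return $! failCount + 1

main :: IO ()
main = do
    -- Stand-in for a send to a receiver that is not running.
    let failingSend = ioError (userError "connection refused")
    total <- foldM (\n _ -> sendThrottled 100 n failingSend) 0 [1 .. 250 :: Int]
    print total
```

With 250 simulated failures this writes three lines to stderr (at 0, 100 and 200 earlier failures). The sketch never resets the counter, which is the open question raised earlier in this thread.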
Yes, got that, but it wouldn't help here as you don't get these errors if you never call |
Sorry, I haven't found the time to think about this hard enough. Could we make this work in a way that the user gets periodically notified if things aren't working without spamming them too much?
Sure. Your suggestion works for that. Again I would think different behaviour between But I really would like a way to avoid calling |
We ran into exactly this issue at work and we would like to fix it. I can pick up and continue where this pull request left off if nobody objects.
Fine with me.
What's the status here? We also ran into this and would prefer not to see the output. Would it be enough to just wrap lines 207 to 208 of ekg-statsd/System/Remote/Monitoring/Statsd.hs (at 66520aa) in a when isDebug guard?
Sorry, I'm unlikely to get to this in the near future. @Gabriel439 did you get anywhere?
I also haven't gotten to this yet, but I'm still willing to work on it. I had just forgotten about it.
What is it that needs to be done?
@joneshf: I'm reviewing the code right now. My understanding is that the intent of the PR in its current form is:
The part I'm still wrapping my head around is how Along the way @tibbe asked to log every 100 failures instead of logging every time, but I believe that is orthogonal to this pull request. Whether to log at all is a separate issue from how frequently we log when we do choose to log them. As far as I can tell, all three of @DaveCTurner, @joneshf and I want the errors silenced completely, not reduced in frequency. So my approach to this is to first verify whether or not |
I'm struggling to find an authoritative reference for this, but here's the explanation. It's not about Following on from this, if the socket is not |
Resolve merge conflict
@tibbe: Could you merge this in its present state now that the merge conflicts have been resolved?
If the receiving statsd isn't running I think it'd be preferable to quietly drop the stats on the floor rather than report an error on each metric sent.
This PR changes the code to use sendAllTo instead of sendAll to achieve this end: as the socket is not connected, it doesn't receive any destination-unreachable responses.
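As a rough illustration of the approach described above, the following sketch sends one datagram with sendAllTo on a UDP socket that is never connected. It assumes the network and bytestring packages; the host 127.0.0.1, port 8125 and the metric payload are placeholders for illustration and are not taken from this PR.

```haskell
module Main (main) where

import qualified Data.ByteString.Char8 as B8
import Network.Socket
import Network.Socket.ByteString (sendAllTo)

main :: IO ()
main = do
    -- Resolve the (assumed) statsd address as a datagram endpoint.
    let hints = defaultHints { addrSocketType = Datagram }
    addr : _ <- getAddrInfo (Just hints) (Just "127.0.0.1") (Just "8125")
    sock <- socket (addrFamily addr) (addrSocketType addr) (addrProtocol addr)
    -- No call to 'connect' here: the destination is passed with every send,
    -- so the socket stays unconnected.
    sendAllTo sock (B8.pack "example.counter:1|c") (addrAddress addr)
    close sock
```

By contrast, the sendAll variant requires calling connect on the socket first, and that is the configuration in which the errors discussed in this thread show up on later sends when nothing is listening.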