Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suppress errors when sending to a nonexistent receiver #6

Merged
merged 6 commits into from May 1, 2017
Merged

Suppress errors when sending to a nonexistent receiver #6

merged 6 commits into from May 1, 2017

Conversation

DaveCTurner
Copy link
Contributor

If the receiving statsd isn't running I think it'd be preferable to quietly drop the stats on the floor rather than report an error on each metric sent.

This PR changes the code to use sendAllTo instead of sendAll to achieve this end: as the socket is not connected it doesn't receive any destination-unreachable responses.

@tibbe
Copy link
Owner

tibbe commented Nov 26, 2015

I specifically added the existing logging to catch this case (to help me and other debug why things are not working). How about we add a counter to the loop and only log every N (e.g. 100) times this happens? Should reduce log spam but still be helpful.

@DaveCTurner
Copy link
Contributor Author

Fair enough.

Stopping after a set number of messages is a little tricky, particularly with this implementation. For instance, when do you reset your error counter back to 0?

How about using connect if the debug flag is set in StatsdOptions and not otherwise?

@DaveCTurner
Copy link
Contributor Author

Sorry, misread your message. Thought you said log the first 100 rather than every 100th error!

My question about changing the behaviour when debug is set still stands.

@DaveCTurner
Copy link
Contributor Author

Like this, for instance?

-> Metrics.Sample -- ^ Last sampled metrics
-> Socket.Socket -- ^ Connected socket
-> StatsdOptions -- ^ Options
loop :: Metrics.Store -- ^ Metric store
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you add an Int parameter to this function and pass it to and return it from flushSample you can do something like

Socket.sendAll socket msg `catch` \ (e :: IOException) -> do
    when failCount `mod` 100 == 0 $ do
        T.hPutStrLn stderr $ "ERROR: Couldn't send message: " <>
            T.pack (show e) <> "(happened " <> show failCount <> " times)"
    return $! failCount + 1

@DaveCTurner
Copy link
Contributor Author

Yes, got that, but it wouldn't help here as you don't get these errors if you never call connect on the socket in the first place.

@tibbe
Copy link
Owner

tibbe commented Nov 28, 2015

Sorry, I haven't found the time to think about this hard enough. Could we make this work in a way that the user gets periodically notified if things aren't working without spamming them too much?

@DaveCTurner
Copy link
Contributor Author

Sure. Your suggestion works for that. Again I would think different behaviour between debug = True and debug = False makes sense: when True I reckon every message should be shown. I'll add that to the PR in the next week when I find the time

But I really would like a way to avoid calling connect on the socket so the failures are dropped in the kernel rather than coming all the way back to the app. Maybe another config option if not a function of the debug setting?

@Gabriella439
Copy link
Contributor

We ran into exactly this issue at work and we would like to fix it. I can pick up and continue where this pull request left off if nobody objects

@tibbe
Copy link
Owner

tibbe commented Oct 11, 2016

Fine with me.

@joneshf
Copy link

joneshf commented Apr 24, 2017

What's the status here? We also ran into this and would prefer not to see the output.

Would it be enough to just wrap

T.hPutStrLn stderr $ "ERROR: Couldn't send message: " <>
T.pack (show e)
in a when isDebug?

@DaveCTurner
Copy link
Contributor Author

Sorry, I'm unlikely to get to this in the near future. @Gabriel439 did you get anywhere?

@Gabriella439
Copy link
Contributor

I also haven't gotten to this yet but I'm still willing to work on this. I had just forgotten about this

@joneshf
Copy link

joneshf commented Apr 24, 2017

What is it that needs to be done?

@Gabriella439
Copy link
Contributor

@joneshf: I'm reviewing the code right now. My understanding is that the intent of the PR in its current form is:

  • When debug = True, preserve the original behavior and log every failed attempt
  • When debug = False, don't log failed attempts

The part I'm still wrapping my head around is how sendAllTo is supposed to silence failed attempts. I'm not still not clear on that

Along the way @tibbe asked to log every 100 failures instead of logging every time, but I believe that is orthogonal to this pull request. Whether to log at all is a separate issue from how frequently we log when we do choose to log them. As far as I can tell, all three of @DaveCTurner, @joneshf and I want the errors silenced completely, not reduced in frequency.

So my approach to this is to first verify whether or not sendAllTo is the correct and idiomatic way to silence these errors and if that's the case then I will just resolve merge conflicts and push to merge this as is. If that's not the idiomatic solution, then I will update the PR accordingly

@DaveCTurner
Copy link
Contributor Author

I'm struggling to find an authoritative reference for this, but here's the explanation. It's not about send() vs sendTo(), it's about whether the socket is connect()ed or not. If it's connected then undeliverable-message errors are reported as errors to the application, which end up as exceptions by the time they get to this code. If it's not connected then undeliverable-message errors are dropped in the kernel before they ever make it to userspace, so the application sees nothing.

Following on from this, if the socket is not connect()ed then you have to supply the destination address with each datagram, hence the need for sendTo() rather than send().

@Gabriella439
Copy link
Contributor

@tibbe: Could you merge this in its present state now that the merge conflicts have been resolved?

@tibbe tibbe merged commit 5cbb692 into tibbe:master May 1, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants