Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add a non-blocking implementation for UDS #81

Merged
merged 10 commits into from
Apr 5, 2019
Merged

Conversation

arbll
Copy link
Member

@arbll arbll commented Mar 7, 2019

What does this PR do

Use a non-blocking UDS client implementation by default. Another PR will follow this one to allow choosing between the old blocking behavior that allowed error checking and this one.

Non-blocking UDS client

  • Write to the UDS in a single thread. Use a channel to buffer datagrams to send.
  • You no longer get an error back if there was an issue writing the datagram (like UDP) but you get one if the channel buffer is full. I think this is a fair tradeoff but this is breaking the original guarantees.

Motivation

More details are available in the original POC #79.

Notes

The old blocking implementation is kept in case someone started relying on error checking.

Being able to configure the client to use the blocking implementation requires some configuration refactoring. It will be done in #82.

You should probably review this PR commit by commit.

@arbll arbll changed the title add an non-blocking implementation for UDS add a non-blocking implementation for UDS Mar 7, 2019
@arbll arbll marked this pull request as ready for review March 11, 2019 13:30
@arbll arbll assigned truthbk and unassigned truthbk Mar 13, 2019
@arbll arbll requested a review from truthbk March 13, 2019 14:29
Copy link
Member

@truthbk truthbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Textbook approach for the asynchronous implementation, nice stuff. Added a few comments, mostly related to the interface and configurability, and an observation related to the blocking vs async clients (the latter can behave like the former).

statsd/statsd.go Outdated Show resolved Hide resolved
// asyncUdsWriter is an internal class wrapping around management of UDS connection
type asyncUdsWriter struct {
// Address to send metrics to, needed to allow reconnection on error
addr net.Addr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably put addr, conn and writeTimeout in a separate structure that both the async and blocking writer structs would include in their definitions.

edit: see note regarding using the async writer as a blocking writer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss https://github.com/DataDog/datadog-go/pull/81/files#r266391555 and come back to this if it still make sense

addr: udsAddr,
conn: nil,
writeTimeout: defaultUDSTimeout,
// 8192 * 8KB = 65.5MB
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make sure this is clearly documented - this will have an immediate impact in memory footprint on higher throughput clients.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I am still missing a PR to add a bunch of docs before releasing this and #82

}

// Looks like we might need to connect - try again with write locking.
if w.conn != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems redundant with the above (https://github.com/DataDog/datadog-go/pull/81/files#diff-95ff682af2f28c4955d9ba7de9d1ef8dR100), am I missing something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be good now

}
}

func (w *asyncUdsWriter) write(data []byte) (int, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is concurrency unsafe, we should maybe add a note in case it ever occurs to us we want multiple writers for added performance.... Not sure we'd ever be able to do that with deadlines anyways, so feel free to ignore this comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment

statsd/uds_async.go Show resolved Hide resolved
}

// New returns a pointer to a new blockingUdsWriter given a socket file path as addr.
func newBlockingUdsWriter(addr string) (*blockingUdsWriter, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we need a specific implementation for a blocking writer, I believe in effect the async writer with a datagram queue depth of 1 is essentially the same thing. There might be some details we want to look at - channel blocking vs mutex locking, etc, but I think it'd behave exactly as expected, it would perhaps reduce the amount of duped code here by thinking about that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is error checking. Unlike UDP, UDS will error out if the agent does not receive the payload. With the blocking implementation you would get alerted if your message did not reach the agent.

However, in my opinion it's pretty flawed especially if the client buffers. In this case you only get one error on the call that actually sends the payload.

I think the best path might be:

WDYT ?
cc @xvello you might have an opinion

@truthbk
Copy link
Member

truthbk commented Mar 15, 2019

OK, I see #82 addresses some of my observations here, feel free to ignore anything you've addressed there ;)

@arbll
Copy link
Member Author

arbll commented Mar 18, 2019

Just noticed tests are failing. It seems travis was not working previously.
Seems like I can't use testify because it only supports Go 1.7+.

I guess we need to maintain support for those EOL'd version in 2.x ?

Copy link
Member

@truthbk truthbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're good to go 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants