Raphael/redigo #51

Merged
merged 28 commits into from
May 1, 2017

raphaelgavache (Member) commented Apr 11, 2017

Summary

Example span:

Name:
- redis.command
Meta:
- redis.raw_command: "SET fleet truck "
- redis.args_length: 2
- out.host: 127.0.0.1
- out.port: 6379
- out.db: 0
- out.network: tcp

Patching guide:

Using Dial()/DialURL() and Do

Instead of

client, err := redis.Dial("tcp", "127.0.0.1:6379", ...)
client.Do("SET", key, value, ...)

Use

client, err := tracedredis.TracedDial(service_name, tracer, "tcp", "127.0.0.1:6379", ...)

If you have a context, pass it as the last argument:

client.Do("SET", key, value, ..., context)

Otherwise, use the same command as usual:

client.Do("SET", key, value, ...)

Using Pools

Instead of

pool := &redis.Pool{
        (...)
        Dial: func() (redis.Conn, error) {
                return redis.Dial("tcp", "127.0.0.1:6379")
        },
}
c := pool.Get()
c.Do(cmds)

Use

pool := &redis.Pool{
        (...)
        Dial: func() (redis.Conn, error) {
                return tracedredis.TracedDial(service_name, tracer, "tcp", "127.0.0.1:6379")
        },
}
c := pool.Get()
c.Do(cmds, ..., context)

host, port, err := net.SplitHostPort(u.Host)
if err != nil {
host = u.Host
port = "6379"
Contributor

this seems incorrect, shouldn't we only do this if port is ""?

Member Author

We set port to 6379 only when we can't get the port via net.SplitHostPort (that is the err handled in the if): https://github.com/garyburd/redigo/blob/master/redis/conn.go#L226

Contributor

ok add a comment with that link
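A minimal sketch of the port fallback discussed in this thread, covering both cases raised by the reviewers: net.SplitHostPort failing because the address has no port, and an address such as "localhost:" that parses but yields an empty port. The helper name splitHostPort and the constant defaultRedisPort are hypothetical, not names from the PR.

```go
package main

import "net"

// defaultRedisPort is the port assumed when the address carries none.
// (Hypothetical name; 6379 is the value used in the diff above.)
const defaultRedisPort = "6379"

// splitHostPort returns the host and port of addr, falling back to the
// default port when addr has no port or an empty one.
func splitHostPort(addr string) (string, string) {
	host, port, err := net.SplitHostPort(addr)
	if err != nil {
		// No port could be parsed: treat the whole string as the host.
		return addr, defaultRedisPort
	}
	if port == "" {
		// "host:" parses without error but leaves the port empty.
		port = defaultRedisPort
	}
	return host, port
}
```

Unlike indexing the result of strings.Split, this never panics on an address without a colon.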


func (tc TracedConn) Do(commandName string, args ...interface{}) (reply interface{}, err error) {
ctx := context.Background()
ok := false
Contributor

don't need this line. just do a declaration on Line 62 ctx, ok := ...

}
span.SetMeta("redis.raw_command", raw_command)
ret, err := tc.Conn.Do(commandName, args...)
return ret, err
Contributor

just return tc.Conn.Do(...)

span.Resource = "redigo.Conn.Flush"
}
raw_command := commandName
for _, arg := range args {
Contributor

this can be slow and create a lot of garbage. use a bytes.Buffer
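A sketch of what the bytes.Buffer suggestion could look like. The function name rawCommand is hypothetical, and the exact formatting of redis.raw_command in the PR may differ (this version joins with single spaces and adds no trailing space).

```go
package main

import (
	"bytes"
	"fmt"
)

// rawCommand joins the command name and its arguments with single spaces,
// writing into one buffer instead of concatenating strings in a loop.
func rawCommand(commandName string, args ...interface{}) string {
	var b bytes.Buffer
	b.WriteString(commandName)
	for _, arg := range args {
		b.WriteByte(' ')
		// Fprint formats each argument with its default representation.
		fmt.Fprint(&b, arg)
	}
	return b.String()
}
```

Repeated string concatenation allocates a new string on every iteration; the buffer grows in place and allocates far less for long argument lists.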

span.Resource = commandName
} else {
// According to redigo doc: when the command argument to the Do method is "",
// then the Do method will flush the output buffer
Contributor

add a link to the doc please

}

span := tc.tracer.NewChildSpanFromContext("redis.command", ctx)
defer func() {
Contributor

actually we have a more idiomatic way to do this: defer span.FinishWithErr(err) https://github.com/DataDog/dd-trace-go/blob/master/tracer/span.go#L171
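One thing worth checking before switching to the one-line form: Go evaluates the arguments of a deferred call when the defer statement executes, not when the function returns. With a named result err, a direct `defer span.FinishWithErr(err)` would therefore capture the value err has at that moment. The sketch below illustrates this with a hypothetical span type that only records the error it receives; it is not the dd-trace-go Span.

```go
package main

import "errors"

// span is a hypothetical stand-in for a tracer span.
type span struct{ err error }

// FinishWithErr records the error the span was finished with.
func (s *span) FinishWithErr(err error) { s.err = err }

var errBoom = errors.New("boom")

// finishDirect defers the method call directly: err is evaluated at the
// defer statement, while it is still nil.
func finishDirect(s *span) (err error) {
	defer s.FinishWithErr(err)
	return errBoom
}

// finishClosure defers a closure, which reads err only when it runs,
// after the return statement has set the named result.
func finishClosure(s *span) (err error) {
	defer func() { s.FinishWithErr(err) }()
	return errBoom
}
```

In this sketch only finishClosure passes the returned error to the span, which may be why the diff wraps the call in `defer func() { ... }()`.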


type TracedConn struct {
redis.Conn
tracer *tracer.Tracer
Contributor

@talwai talwai Apr 19, 2017

wrap host, port, network, service, tracer in a TraceParams struct and pass it to TracedConn . avoids conflicts in field names
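A rough sketch of the grouping suggested here. The field set is assumed from the comment, the tracer field is left out to keep the example self-contained, and Conn is a cut-down stand-in for redis.Conn; the method name spanMeta is hypothetical.

```go
package main

// Conn is a hypothetical, cut-down stand-in for redis.Conn.
type Conn interface {
	Do(commandName string, args ...interface{}) (interface{}, error)
}

// TraceParams groups the tracing-related settings in one place.
// (The tracer itself is omitted from this sketch.)
type TraceParams struct {
	Service string
	Network string
	Host    string
	Port    string
}

// TracedConn embeds the connection and keeps its tracing settings in a
// single field, so they cannot collide with names promoted from Conn.
type TracedConn struct {
	Conn
	params TraceParams
}

// spanMeta returns the connection metadata to attach to each span,
// using the meta keys listed in the PR summary.
func (tc TracedConn) spanMeta() map[string]string {
	return map[string]string{
		"out.network": tc.params.Network,
		"out.host":    tc.params.Host,
		"out.port":    tc.params.Port,
	}
}
```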

var ctx context.Context
var ok bool
if len(args) > 0 {
ctx, ok = args[len(args)-1].(context.Context)
Contributor

I think it would be more idiomatic to use the sql style ExecContext https://golang.org/pkg/database/sql/#DB.ExecContext

so maybe another function DoContext that takes a context as the first arg. Thoughts?

Contributor

I'm not familiar with that API, but indeed having ctx as a last optional arg can lead to surprising behaviors. The reason for doing this is keeping the API unchanged whether or not tracing is used, right? This works one way: if you replace the Conn by a TracedConn, the code still works. But then all redis traces are lone traces, not related to their parents, so you have to pass an extra arg. And once you do, it no longer works the other way: the day you go back and replace the TracedConn by a Conn, the code will fail because the Redis lib receives an extra param it does not know what to do with. Dig into it, but the DoContext suggested by @talwai would probably be fine; it would cost some extra typing/refactoring for people instrumenting the code, but would make it safer. Just my 2c.

Contributor

@ufoot ufoot left a comment

Good approach, but I think a few small fixes are needed for general stability.

ddagent:
image: datadog/docker-dd-agent
environment:
- DD_APM_ENABLED=true
Contributor

could be useless now I think, since trace agent should be running by default: DataDog/datadog-trace-agent@a51dd0e#diff-b7db7d3e3d014cfcce344902b2d07a78

c, err := redis.Dial(network, address)
addr := strings.Split(address, ":")
host := addr[0]
port := addr[1]
Contributor

Beware here: you need to handle the case where there is no : in the string (no port given, which I think is a real-world case, just letting the lib pick up the default port). Otherwise addr[1] will panic() in our code and interrupt user code.

assert.Equal(span.GetMeta("redis.args_length"), "2")
}

func TestError(t *testing.T) {
Contributor

You probably need another test that yields an error when connecting to the service, to make sure that if the underlying library cannot connect to redis (bad credentials, bad host, whatever), the error bubbles up to the caller.

@ufoot ufoot mentioned this pull request Apr 24, 2017
if len(addr) == 2 && addr[1] != "" {
port = addr[1]
} else {
port = "6379"
Contributor

👍

_, err := TracedDial("redis-service", testTracer, "tcp", "000.0.0:1111")

assert.Contains(err.Error(), "dial tcp: lookup 000.0.0:")
Contributor

💯

@raphaelgavache
Member Author

About the DoContext suggestion:
The TracedConn struct is encapsulated in other structures within the redigo library. Those structures implement the Conn interface, and calling DoContext on them won't work.
For example, we use redigo pools for our dd-go redis connections. If I add the TracedDial function, those pools are traced, but the TracedConn is encapsulated in this struct https://github.com/garyburd/redigo/blob/master/redis/pool.go#L324 that implements Conn. We can't call DoContext on that struct, nor can we get the TracedConn from it to call DoContext.

I guess we should add a warning for people passing a Context: they have to make sure they've used TracedDial before calling Do with a context in the args.

@ufoot
Contributor

ufoot commented Apr 26, 2017

OK @furmmon I see your point with pools & co. What about writing a func that would make a traced redis.Conn out of anything implementing redis.Conn? Something like func TracedConn(ctx context.Context, conn redis.Conn) redis.Conn. The example:

c := pool.Get()
c.Do(cmds, ...)

Becomes:

c := tracedredis.TracedConn(ctx, pool.Get())
c.Do(cmds, ...)

@talwai does this make sense?

@raphaelgavache
Member Author

raphaelgavache commented Apr 27, 2017

Problem raised

With the current code, to pass the context for the creation of the redigo span we use the fact that the redis.Conn Do method takes the following args: commandName string, args ...interface{}. If a context is slipped in at the end of the args, the spans created will inherit from it.
The risk is that if you stop using a TracedConn and keep the context at the end of the Do call, the redis call will crash. The redigo library doesn't support context yet, and Do isn't supposed to receive one in its args.

The ideal solution would be to have a DoContext(ctx context.Context, commandName string, args ...interface{}) function.

In our back-end we use a lot of redis.Pool for our redigo transactions, so let's dig into this use case.

General overview of redigo patching

Data we want:

  • connection metadata (host, port, ...)
  • commands to be executed and info about their args
  • inheritance (who executed that command)

Where to get those:

  • all connection data is in the Dial function and is not accessible from Pools
  • commands appear at the Do function
  • for inheritance we have to pass the context at span creation time so that the span has parents (if any)

Transactions with redis are made in the Do function, so this is where we want our spans to be created and finished.

Current state

By using the TracedDial or TracedDialURL function we get a TracedConn structure inside redigo.
This structure contains all the connection metadata from the Dial parameters and implements the redis.Conn interface. func (tc TracedConn) Do overrides the Do function so that the redis transaction is traced.

About the proposed solutions

1. Add a DoContext function for TracedConn

When we do

connection := pool.Get()
connection.Do(cmds, ..)

pool.Get() returns a pooledConnection. Our TracedConn is encapsulated in that pooledConnection in a private variable c. We can't cast that back to a TracedConn so adding a method DoContext is impossible at that level.
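The obstacle can be reproduced in miniature. In the sketch below all types are hypothetical stand-ins: tracedConn has a DoContext method, and pooledConn mimics the pool's wrapper by holding the connection in a private field c and exposing only Do. A type assertion for DoContext succeeds on the traced connection but fails on the wrapper.

```go
package main

import "context"

// Conn is a hypothetical, cut-down stand-in for redis.Conn.
type Conn interface {
	Do(commandName string, args ...interface{}) (interface{}, error)
}

// contextDoer is the extra capability a DoContext method would add.
type contextDoer interface {
	DoContext(ctx context.Context, commandName string, args ...interface{}) (interface{}, error)
}

// tracedConn stands in for a traced connection offering both methods.
type tracedConn struct{}

func (tracedConn) Do(commandName string, args ...interface{}) (interface{}, error) {
	return nil, nil
}

func (tracedConn) DoContext(ctx context.Context, commandName string, args ...interface{}) (interface{}, error) {
	return nil, nil
}

// pooledConn mimics the pool wrapper: the real connection sits in a
// private field and only the Conn methods are forwarded.
type pooledConn struct{ c Conn }

func (p pooledConn) Do(commandName string, args ...interface{}) (interface{}, error) {
	return p.c.Do(commandName, args...)
}

// supportsDoContext reports whether c exposes DoContext.
func supportsDoContext(c Conn) bool {
	_, ok := c.(contextDoer)
	return ok
}
```

Here supportsDoContext(tracedConn{}) is true, while supportsDoContext(pooledConn{c: tracedConn{}}) is false: the wrapper hides every method that is not part of Conn.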

2. Insert context upstream

We can easily set the context at the pool creation level without messing with the Do args. But then parent spans will be irrelevant: we usually create a pool and then make various calls on it, and those calls can't all reasonably be linked to the same trace.

@raphaelgavache
Member Author

I'm open to any solution to the original problem; I didn't find any better than the current one.

@talwai talwai merged commit c1aa041 into master May 1, 2017
@raphaelgavache raphaelgavache deleted the raphael/redigo branch June 20, 2017 15:19