<img src="images/dd_logo.png" />

# Distributed Tracing with Datadog APM

Now that we've got the basics for how traces work, it's time to trace our first distributed system. 

In our case, we'll use `docker-compose` to bring up two Flask APIs, a Datadog Agent, and a Redis server. We'll first instrument our application manually, and then see how to enable distributed tracing in Datadog's tracer so that our API is instrumented automatically.

Before we get started, be sure to check out the repo that goes along with this. It'll have everything you need.

If you're running this notebook locally, you should already be good. Otherwise, you'll want to:

```bash
$ git clone https://github.com/burningion/dash-apm-workshop
$ cd dash-apm-workshop
$ jupyter notebook
```

Finally, stop running the existing Datadog Agent container if you're continuing from the Quickstart.

You can do this with:

```bash
$ docker ps
$ docker kill <PSNAMEOFAGENT>
```

Once you've stopped the Datadog Agent container, you'll be able to move on to the example project.

## Becoming Acquainted with the Example Project

<img src="images/architecture.png" />

Before we instrument our example project, let's become familiar with its architecture.

Our example is an API that sends off requests to a `thinker` microservice. It runs via a `docker-compose.yml` file that spins up four containers.

Right now, we have two Flask apps (the Think API and the Thinker service), an instance of the Datadog Agent container, and a Redis container.

Requests flow from the Think API to the Thinker microservice, and the redis instance is not yet hooked up to anything. We'll edit our code in a later step, and use it as a datastore.

The Datadog Agent is set up to receive traces on its default port, 8126.

Open up a new terminal, and bring up the repo's containers with `docker-compose`:

```bash
$ DD_API_KEY=<YOUR_API_KEY> STEP=1 docker-compose up
```
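
Once the containers are up, a plain TCP connect is a quick way to confirm that something is listening on the Agent's trace port. The following is a minimal sketch, assuming the compose file publishes port 8126 on `localhost` (an assumption; the repo's `docker-compose.yml` is the authority). `port_is_open` is a hypothetical helper name, not part of the workshop code.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts, and unreachable hosts all land here.
        return False


if __name__ == "__main__":
    # Assumption: the Agent's trace port is published on localhost.
    print("agent trace port reachable:", port_is_open("localhost", 8126))
```

A `False` result only says the port is not reachable from where the script runs; the Agent may still be reachable from the other containers on the compose network.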


## Sending Test Requests to our API
Running `docker-compose up` spins up all the containers for our infrastructure. For now, the part we care about is the API that sits in front of our microservice and, of course, the microservice itself.

The first example is already set up with a basic tracer initialized, so once we supply our API key, we can already see traces being sent.

Let's try our first `curl` request to the API, and see if we can trace our request across both services:

```bash
$ curl "http://localhost:5000/think/?subject=war"
$ curl "http://localhost:5000/think/?subject=mankind"
$ curl "http://localhost:5000/think/?subject=music"
```

## Viewing Default Traces Across Systems
Now that a few requests have been sent, we can take a look at the Datadog APM dashboard, and see what's going on with our service.

<img src="images/first-thinker-api.png" />

Looking at the dashboard, what should be a single trace appears as two separate traces, one per service.

<img src="images/first-thinker-micro.png" />

Our customer-facing API is hitting the `thinker` microservice, but the trace started in the `api` service isn't being propagated to it.

By default, Datadog's APM implementation doesn't send or look for the request headers that carry trace context between applications.

This is because traces can carry potentially private information. It's better if we only pass our trace headers along to infrastructure that we know is our own.

Let's walk through adding our trace headers to our APIs, first manually, and then automatically with the `distributed_tracing` flag.


## Manually Continuing Our Trace Across Systems

If we look at the `thinker.py` file, we can see that even though our `think` function is wrapped in a trace, we're not continuing or checking for any existing spans.

To do that within Flask, we add `X-Datadog-Trace-Id` and `X-Datadog-Parent-Id` headers to the requests that go to our private `thinker` API, carrying the current span's `trace_id` and `span_id` (which becomes the `parent_id` on the receiving side).

Once our request headers make it to the private `thinker` service, we then check to see if they exist, and add them into our current span context.

Our Python code for the `thinker` service becomes the following:

```python
@app.route('/')
def think_microservice():
    # continue the span from the called service
    trace_id = flask_request.headers.get("X-Datadog-Trace-Id")
    parent_id = flask_request.headers.get("X-Datadog-Parent-Id")
    if trace_id and parent_id:
        span = tracer.current_span()
        span.trace_id = int(trace_id)
        span.parent_id = int(parent_id)

    subject = flask_request.args.get('subject')
    thoughts = think(subject)
    return Response(thoughts, mimetype='application/json')
```

Notice that the `think` function that gets called has a Python decorator. It wraps the function call in a span, and records the `subject` of the think call as a tag on that span:


```python
@tracer.wrap(name='think')
def think(subject):
    tracer.current_span().set_tag('subject', subject)

    sleep(0.5)
    return thoughts[subject]
```
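
To make concrete what a decorator like `tracer.wrap` is doing, here is a toy stand-in written from scratch. It is not ddtrace's implementation: `wrap`, `current_span`, and `finished_spans` are hypothetical names for this sketch, and spans are plain dicts kept in a list instead of being sent to the Agent.

```python
import functools
import time

# Finished spans collected by this toy tracer (a stand-in for the Agent).
finished_spans = []
# Stack of currently open spans. Module-level state, so not thread-safe.
_active = []


def current_span():
    """Return the innermost open span, or None if no span is open."""
    return _active[-1] if _active else None


def wrap(name):
    """Toy span-creating decorator (illustrative, not ddtrace's code)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            span = {"name": name, "tags": {}, "error": False}
            _active.append(span)
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                # Mark the span, then let the error propagate to the caller.
                span["error"] = True
                raise
            finally:
                span["duration"] = time.perf_counter() - start
                _active.pop()
                finished_spans.append(span)
        return wrapper
    return decorator


@wrap(name="think")
def think(subject):
    # Same shape as the workshop function: tag the active span, then work.
    current_span()["tags"]["subject"] = subject
    return f"thoughts about {subject}"
```

Calling `think("war")` leaves one finished span named `think` with a `subject` tag. If the wrapped function raises, the span is marked as an error and the exception still reaches the caller.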

Going back to our original `API` application, we also need to instrument and send our trace information in the part where we make our web request:

```python
@app.route('/think/')
def think_handler():
    thoughts = requests.get('http://thinker:5001/', headers={
        'x-datadog-trace-id': str(tracer.current_span().trace_id),
        'x-datadog-parent-id': str(tracer.current_span().span_id),
    }, params={
        'subject': flask_request.args.getlist('subject', str),
    }).content
    return Response(thoughts, mimetype='application/json')
```
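
The two halves of this hand-off can be sketched without Flask or ddtrace. In this minimal illustration, `inject` and `extract` are hypothetical helper names (not part of the workshop code or the ddtrace API); only the two header names come from the code above. Unlike the `thinker` handler, the sketch ignores malformed ids instead of raising.

```python
from typing import Mapping, Optional, Tuple

TRACE_HEADER = "x-datadog-trace-id"
PARENT_HEADER = "x-datadog-parent-id"


def inject(trace_id: int, span_id: int) -> dict:
    """Caller side: serialize the active span's ids into request headers."""
    return {TRACE_HEADER: str(trace_id), PARENT_HEADER: str(span_id)}


def extract(headers: Mapping[str, str]) -> Optional[Tuple[int, int]]:
    """Callee side: return (trace_id, parent_id), or None to start a new trace."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    try:
        return int(lowered[TRACE_HEADER]), int(lowered[PARENT_HEADER])
    except (KeyError, ValueError):
        # Missing or non-numeric ids: ignore them rather than fail the request.
        return None


if __name__ == "__main__":
    headers = inject(trace_id=1234, span_id=5678)
    print(headers)
    print(extract(headers))
```

The caller's `span_id` is what the callee treats as its `parent_id`, which is how the two services' spans end up in the same tree.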

If we want, we can restart our containers now, and see how things look with requests being passed across services:

```bash
$ docker-compose down
$ DD_API_KEY=<YOUR_API_KEY> STEP=2 docker-compose up
```

## Viewing Cross Service Spans

In order to view our cross service spans, we'll first need to generate some more requests, creating new traces to be sent back to Datadog.

Let's do that now:

```bash
$ curl "http://localhost:5000/think/?subject=war"
$ curl "http://localhost:5000/think/?subject=mankind"
```

Let's try generating an error in our application:

```bash
$ curl "http://localhost:5000/think/?subject=peace"
```
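
To generate traffic repeatedly, including the failing request, a small script can stand in for the individual `curl` calls. This is a sketch using only the standard library; `think_url` and `fetch` are hypothetical helper names, and the base URL is assumed to match the one used in the requests above.

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the API is reachable at the same address as the curl requests.
API_URL = "http://localhost:5000/think/"


def think_url(subject: str, base: str = API_URL) -> str:
    """Build the request URL for one subject, URL-encoding the query string."""
    return f"{base}?{urlencode({'subject': subject})}"


def fetch(subject: str, base: str = API_URL, timeout: float = 5.0):
    """Return (status, body); status is None if the API could not be reached."""
    try:
        with urlopen(think_url(subject, base), timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except HTTPError as err:
        # The server answered with an error status (expected for 'peace').
        return err.code, err.read().decode("utf-8", errors="replace")
    except OSError as err:
        # Connection refused, timeout, DNS failure, and similar.
        return None, str(err)


if __name__ == "__main__":
    for subject in ["war", "mankind", "peace"]:
        status, _body = fetch(subject)
        print(subject, "->", status)
```

Each call produces one trace, so the error request shows up in Datadog alongside the successful ones.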

Now, when we switch over to view our traces in Datadog, we see each request coming in as a single trace, with spans that traverse both of our microservices.

<img src="images/second-thinker-api.png" />

But if you look closely, you'll see that we're using a library that Datadog can instrument, but that isn't instrumented yet.

That's the `requests` library, which we use to send requests from one microservice to the other.

## Automatic Distributed Tracing 

Now that we've seen how to manually add distributed tracing headers to our internal infrastructure, let's set things up the easy way.

If you're following along with the code, we're now in `step03`.

We can add automatic distributed tracing to Datadog's [supported libraries](https://docs.datadoghq.com/tracing/setup/python/#compatibility) by passing `distributed_tracing=True` to our `TraceMiddleware`.

This adds checks for the headers from before, and automatically continues as a child span where necessary.
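
For reference, enabling the flag looks roughly like the following. This is a configuration sketch that assumes the ddtrace 0.x release line this workshop was written against, where `ddtrace.contrib.flask.TraceMiddleware` is available (newer releases favor patch-based Flask instrumentation instead). The service name `thinker-api` is illustrative, so check the repo's `step03` code for the actual value.

```python
from flask import Flask
from ddtrace import tracer
from ddtrace.contrib.flask import TraceMiddleware

app = Flask(__name__)

# Assumption: 'thinker-api' is an illustrative service name.
# distributed_tracing=True makes the middleware read incoming trace headers
# and continue the caller's trace instead of starting a new one.
traced_app = TraceMiddleware(
    app, tracer, service="thinker-api", distributed_tracing=True
)
```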

If we use the `patch` function from Datadog's Python library, we can also automatically instrument the `requests` library, along with calls to the `redis` server we have running.

To send our headers along with the automatically instrumented `requests` library, we must also import `config` from `ddtrace`, and add the following lines:

```python
from ddtrace import tracer, patch, config

# Tracer configuration
tracer.configure(hostname='agent')
patch(requests=True)

# enable distributed tracing for requests
# to send headers (globally)
config.requests['distributed_tracing'] = True
```

By using the Datadog patch, we get more default metadata about each request, in addition to the information we set ourselves.

Now we can see our traces as they propagate across our entire distributed system.

<img src="images/automatic-distributed.png" />

But we're still running a simplified system. Let's add a datastore and see how that changes what distributed tracing shows us.

## Adding and Instrumenting a Datastore

## Where to go from here?

From here, you can bring tracing into your own organization, and explore other repositories with example code.

Once again, see the great work Andrew McBurney did with Homebrew while an intern at Datadog. It's a great example of using tracing to instrument a monolithic application:

https://www.datadoghq.com/blog/engineering/using-datadog-apm-to-find-bottlenecks-and-performance-benchmarking/