Commit fb9d30a

feat(performance): Add queues product page and js instrumentation (#9950)

Add a product page for the new queues module and js docs for custom instrumentation.

lizokm authored May 22, 2024
1 parent 4590472 commit fb9d30a
Showing 4 changed files with 184 additions and 0 deletions.
---
title: Instrument Queues
description: "Learn how to manually instrument your code to use Sentry's Queues module"
sidebar_order: 20
---

To ensure that you have performance data about your messaging queues, you'll need to instrument custom spans and transactions around your queue producers and consumers.

## Producer Instrumentation

To start capturing performance metrics, use the `Sentry.startSpan` function to wrap your queue producer events. Your span `op` must be set to `queue.publish`. Include the following attributes to enrich your producer spans with queue metrics:

| Attribute | Type | Description |
|:--|:--|:--|
| `messaging.message.id` | string | The message identifier |
| `messaging.destination.name` | string | The queue or topic name |
| `messaging.message.body.size` | number | Size of the message body in bytes |

You must also include trace headers in your message using `spanToTraceHeader` and `spanToBaggageHeader` so that your consumers can continue your trace once your message is picked up.

<Note>

Your `queue.publish` span must exist as a child within a parent span in order to be recognized as a producer span. If you don't already have a parent producer span, you can start a new one using `Sentry.startSpan`.

</Note>

```javascript
import * as Sentry from '@sentry/node';
import { spanToTraceHeader, spanToBaggageHeader } from '@sentry/core';

// Assumes `redisClient`, `messageId`, and `messageBodySize` are defined elsewhere.
app.post('/publish', async (req, res) => {
  // The route handler is auto-instrumented, providing the parent span.
  await Sentry.startSpan(
    {
      name: 'queue_producer',
      op: 'queue.publish',
      attributes: {
        'messaging.message.id': messageId,
        'messaging.destination.name': 'messages',
        'messaging.message.body.size': messageBodySize,
      },
    },
    async (span) => {
      // Serialize the trace context into the message so the consumer
      // can continue the trace once it picks the message up.
      const traceHeader = spanToTraceHeader(span);
      const baggageHeader = spanToBaggageHeader(span);
      await redisClient.lPush('messages', JSON.stringify({
        traceHeader,
        baggageHeader,
        timestamp: Date.now(),
        messageId,
      }));
    },
  );
});
```

## Consumer Instrumentation

To start capturing performance metrics, use the `Sentry.startSpan` function to wrap your queue consumers. Your span `op` must be set to `queue.process`. Include the following attributes to enrich your consumer spans with queue metrics:

| Attribute | Type | Description |
|:--|:--|:--|
| `messaging.message.id` | string | The message identifier |
| `messaging.destination.name` | string | The queue or topic name |
| `messaging.message.body.size` | number | Size of the message body in bytes |
| `messaging.message.retry.count` | number | The number of times processing of the message was attempted |
| `messaging.message.receive.latency` | number | The time in milliseconds that the message waited in the queue before processing |

Use `Sentry.continueTrace` to connect your consumer spans to their associated producer spans, and `setStatus` to mark the trace of your message as succeeded or failed.


<Note>

Your `queue.process` span must exist as a child within a parent span in order to be recognized as a consumer span. If you don't already have a parent consumer span, you can start a new one using `Sentry.startSpan`.

</Note>

```javascript
// Pull the next message off the queue and measure how long it waited.
const message = JSON.parse(await redisClient.lPop(QUEUE_KEY));
const latency = Date.now() - message.timestamp;

Sentry.continueTrace(
  { sentryTrace: message.traceHeader, baggage: message.baggageHeader },
  (transactionContext) => {
    Sentry.startSpan(
      {
        ...transactionContext,
        name: 'queue_consumer_transaction',
      },
      (parent) => {
        Sentry.startSpan(
          {
            name: 'queue_consumer',
            op: 'queue.process',
            attributes: {
              'messaging.message.id': message.messageId,
              'messaging.destination.name': 'messages',
              'messaging.message.body.size': message.messageBodySize,
              'messaging.message.receive.latency': latency,
              'messaging.message.retry.count': 0,
            },
          },
          (span) => {
            // ... continue message processing
            parent.setStatus({ code: 1, message: 'ok' });
          },
        );
      },
    );
  },
);
```
29 changes: 29 additions & 0 deletions docs/product/performance/queue-monitoring/index.mdx
---
title: "Queue Monitoring"
sidebar_order: 70
description: "Learn how to monitor your queues with Sentry for improved application performance and health."
---

<Include name="feature-stage-alpha.mdx" />

Message queues make asynchronous service-to-service communication possible in distributed architectures. Queues make work that sometimes fails more resilient, and are therefore a building block for distributed applications. Examples of what queues can help with include handling webhooks from third-party APIs and running periodic tasks, such as calculating daily metrics for your users.

If you have performance monitoring enabled and your application interacts with message queue systems, you can configure Sentry to monitor their performance and health.

Queue monitoring allows you to monitor both the performance and error rates of your queue consumers and producers, providing observability into your distributed system.
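Queue monitoring builds on tracing data, so the SDK in each service must have tracing enabled. As a rough sketch, the relevant configuration in a JavaScript service might look like the following (the DSN is a placeholder and the sample rate is illustrative, not a recommendation):

```javascript
// Hedged sketch: enable tracing so queue spans are captured.
// In a real app you would pass this object to Sentry.init().
const sentryConfig = {
  dsn: "https://examplePublicKey@o0.ingest.sentry.io/0", // placeholder DSN
  tracesSampleRate: 1.0, // capture all transactions; lower this in production
};

// e.g. const Sentry = require("@sentry/node"); Sentry.init(sentryConfig);
```

With tracing on, the producer and consumer spans described in the SDK-specific guides below show up on the **Queues** page.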

The **Queues** page gives you a high-level overview of where messages are being written to. (You may see topic names or actual queue names, depending on the messaging system.) Clicking a transaction takes you to the **Destination Summary** page, which provides metrics about the specific endpoints within your application that write to, or read from, the destination. From there you can dig into individual producer and consumer endpoints and see actual traces representing messages processed by your application.

### Prerequisites and Limitations

Queue monitoring currently supports [auto instrumentation](/platform-redirect/?next=%2Fperformance%2Finstrumentation%2Fautomatic-instrumentation) for the [Celery Distributed Task Queue](https://docs.celeryq.dev/en/stable/) in Python. Other messaging systems can be monitored using custom instrumentation.

Instructions for custom instrumentation in various languages are linked below:

- [Python SDK](/platforms/python/performance/instrumentation/custom-instrumentation/queues-module/)
- [JavaScript SDK](/platforms/javascript/performance/instrumentation/custom-instrumentation/queues-module/)
- [Laravel SDK](/platforms/php/guides/laravel/performance/instrumentation/custom-instrumentation/)
- [Java SDK](/platforms/java/distributed-tracing/custom-instrumentation/)
- [Ruby SDK](/platforms/ruby/)
- [.NET SDK](/platforms/dotnet/distributed-tracing/custom-instrumentation/)
- [Symfony SDK](/platforms/php/guides/symfony/distributed-tracing/custom-instrumentation/)
48 changes: 48 additions & 0 deletions docs/product/performance/queue-monitoring/queues-page.mdx
---
title: "Queues Page"
sidebar_order: 10
description: "Learn how to use Sentry's Queues page to get an overview of queue performance and investigate potential problems."
---

The **Queues** page provides a breakdown of queue performance by destination (the topic name or queue name). Use it as a starting point to investigate potential problems with queues, such as higher than expected processing latency.

At the top of the page, the Average Latency graph shows the total time messages take to complete. The Published versus Processed graph shows how many messages are being written to the queue versus how many are being completed. To investigate an anomaly or a narrower time range, click and drag to select a range directly in the graph; you'll then see data for that time range only.

The destination table shows where messages are being published to, along with:

- Average latency (the time messages spend both waiting in and being processed by the queue)
- Error rate (how often jobs fail to complete)
- Published versus processed count
- Time spent (the total time your application spent processing jobs)

If you want to dig deeper into the behavior of a specific destination, click the destination name to view the **Destination Summary** page.

## Destination Summary Page

To view the **Destination Summary** page, click a [destination](https://opentelemetry.io/docs/specs/semconv/messaging/messaging-spans/#destinations) on the **Queues** page.

At the top of the page you'll see the average time in queue, average processing latency, error rate, published versus processed counts, and the total time your application spent processing jobs. These metrics are scoped to the selected destination, whereas the **Queues** page shows metrics summed across all destinations. Below the summary, you can view graphs of average latency and published versus processed counts.

At the bottom of the page, a table lists the transactions that either published or processed queue messages.

If a problem with a specific endpoint jumps out at you, click the transaction to view sample spans and to navigate to the corresponding trace.

### Sample List

Click on an endpoint to open a list of sample spans. This side panel varies depending on whether you click on a producer (your application writing messages to the queue) or consumer transaction.

### Producer Sample List

Sentry automatically finds a variety of samples to help you investigate performance problems. The chosen spans cover the entire selected time range, as well as a range of durations and failure statuses.

The Producer panel shows the number of messages the producer has published, the error rate (representing errors that occur while publishing the message), and the average duration of the transaction that publishes the message to the queue.

To dig even deeper, click on a span identifier to view a detailed trace.

### Consumer Sample List

Sentry automatically finds a variety of samples to help you investigate performance problems. The chosen spans cover the entire selected time range, as well as a range of durations and failure statuses.

The Consumer panel shows the number of messages the consumer has processed, the error rate, the average time a message spends in a queue, and the average amount of time spent processing the message.

To dig even deeper, click on a span identifier to view a detailed trace.
