2023-05-31 performance and tracing update
mgmeier committed Jun 1, 2023
1 parent 0b86329 commit ac8179a
Showing 1 changed file with 43 additions and 0 deletions.
43 changes: 43 additions & 0 deletions blog/2023-05-31-performance-and-tracing.md
@@ -0,0 +1,43 @@
---
title: Performance & tracing update
slug: 2023-05-31-performance-and-tracing
authors: mgmeier
tags: [performance-tracing]
hide_table_of_contents: false
---

## High level summary

* Benchmarking: We've performed and analysed the first benchmarks with GHC9.2 builds. Additionally, we have developed an early indicator for how build config changes might affect metrics from our model cluster.
* New tracing: Collaboration with Galois has equipped the new tracing system with a re-forwarding mechanism.
* Nomad backend: Porting the 52-node model cluster to the Nomad cloud backend is ongoing, with the focus on deployment and health checks.


## Low level overview

### Benchmarking

The first set of runs with GHC9.2 as a build platform is in. We've discovered a significant difference in the resource usage profile compared to GHC8.10. Further investigation uncovered the need to benchmark another parameter change in the build
configuration: as it stands, the `ghc-bignum` package uses the Haskell `native-backend` by default. We aim
to benchmark a build with the `gmp-backend` next.

A variant of our `forge-stress` local benchmark has been set up to serve as an early indicator for the resource usage profile
we'd expect to observe on the model cluster. This gives us a much tighter feedback loop, as a local run takes far less time
than a cluster run. The indicator is specific to changes in the build configuration and the runtime system, and will be of great
help when incrementally evaluating different compiler versions or RTS flags.

### Tracing

The hub of the new tracing system, `cardano-tracer`, was designed with a fixed output behaviour, limited to various
logging options. Thanks to the contribution from Galois, that design has now been extended so it can re-forward all, or a pre-filtered portion, of the traces from the node in a configurable manner. This will enable downstream applications to
directly receive the set of trace values relevant to their logic, without any additional cost for the node itself.
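
To illustrate the idea of re-forwarding either all traces or a pre-filtered portion of them, here is a minimal Python sketch. It does not reflect the actual `cardano-tracer` configuration or API: the `Trace` type, the `reforward` function, prefix-based filtering, and the namespace strings are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Sequence


@dataclass(frozen=True)
class Trace:
    """A hypothetical trace message: a dotted namespace plus a payload."""
    namespace: str
    payload: dict


def reforward(traces: Iterable[Trace], prefixes: Sequence[str] = ()) -> Iterator[Trace]:
    """Yield the traces that should be passed on to a downstream consumer.

    An empty `prefixes` forwards everything; otherwise only traces whose
    namespace starts with one of the given prefixes are forwarded.
    """
    for trace in traces:
        if not prefixes or any(trace.namespace.startswith(p) for p in prefixes):
            yield trace


# Made-up example traces; the namespaces are illustrative, not real ones.
incoming = [
    Trace("Forge.BlockForged", {"slot": 1}),
    Trace("Mempool.TxAdded", {"txs": 3}),
    Trace("Forge.LeadershipCheck", {"slot": 2}),
]

forge_only = list(reforward(incoming, prefixes=["Forge."]))
everything = list(reforward(incoming))
```

In this sketch the filtering happens in the forwarding component rather than at the source, which mirrors the point made above: the downstream application only sees what it asked for, and the node does no extra work.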


### Nomad backend
We're currently working out the details of efficiently deploying and monitoring a fleet of 50+ nodes, along with
job definitions for tracing and transaction generation. Scaling up to that many instances, and monitoring an ongoing
benchmarking run, required us to fine-tune communications with the Nomad server.

Related to that, the new cloud backend will provide a monitoring and health-checking mechanism that is far more flexible
and offers more detailed insight than the previous iteration in `cardano-ops`. The backend will let you formulate
very specific conditions for an ongoing run to be considered healthy, and will offer automation of certain actions should these conditions not be met.
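
As a rough illustration of what "conditions plus automated actions" could look like, the following Python sketch pairs each health condition with an action label to trigger when the condition fails. This is not the backend's actual interface: the `HealthCheck` type, the metric names, the thresholds, and the action labels are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical snapshot of run metrics, keyed by metric name.
Metrics = Dict[str, float]


@dataclass(frozen=True)
class HealthCheck:
    """A named condition and the action to take when it is not met."""
    name: str
    is_healthy: Callable[[Metrics], bool]
    on_failure: str  # label of the automated action (illustrative only)


def evaluate(checks: List[HealthCheck], metrics: Metrics) -> List[str]:
    """Return the actions triggered by all checks that fail on `metrics`."""
    return [c.on_failure for c in checks if not c.is_healthy(metrics)]


# Thresholds below are invented for the sketch, not taken from a real run.
checks = [
    HealthCheck("all nodes running", lambda m: m["running_nodes"] >= 52, "abort-run"),
    HealthCheck("chain is progressing", lambda m: m["seconds_since_last_block"] < 120, "collect-logs"),
]

healthy_run = {"running_nodes": 52, "seconds_since_last_block": 20}
stalled_run = {"running_nodes": 52, "seconds_since_last_block": 300}

actions_when_healthy = evaluate(checks, healthy_run)
actions_when_stalled = evaluate(checks, stalled_run)
```

Keeping conditions as plain predicates over a metrics snapshot makes it easy to add run-specific checks without touching the evaluation logic.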
