
RFC: Tracing #14618

Closed

ysbaddaden opened this issue May 23, 2024 · 8 comments · Fixed by #14659
Comments

@ysbaddaden (Contributor) commented May 23, 2024:

I propose to introduce a mechanism to trace GC and scheduler operations as they happen: when they happen, what they did, and sometimes how long the operation took (e.g. GC.malloc).

This has proven invaluable for understanding multi-threaded synchronization issues that would have been impossible to reproduce using a debugger (you can't step manually and hope to hit a race condition: you need brute force and then analysis). In addition, tracing the GC can yield interesting stats about how much time is spent in GC (malloc, collect, ...), or how many allocations happened and when. Tracing both the GC and the scheduler, we can cross-analyze the lifetime of a program.

The tracing itself is relatively simple: each trace is a line with its section (gc, sched), the operation (malloc, enqueue, ...), then context metadata (thread, fiber, time, duration) and possibly metadata about the operation itself. The trace is meant to be easy to parse and read, to be grepped, searched, copy pasted, you name it.

I propose to have tracing enabled with a compile-time flag (-Dtracing) and to build the feature directly into the stdlib. It could be implemented in the perf-tools shard, but it would be harder to patch in (especially into the GC collection), harder to maintain when the stdlib changes, and harder to use, since you would have to add the shard & require it before you could start using it.

Once compiled with tracing enabled, each section should be enabled manually (by default they're all disabled) using the CRYSTAL_TRACE environment variable, which is a list of section names separated by a comma (,). For example CRYSTAL_TRACE=gc,sched would log everything, while CRYSTAL_TRACE=gc would only log the GC.
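A minimal sketch of how such a section list could be parsed (the names here are hypothetical and `ENV` is only used for brevity; as noted in the technical discussion below, the real implementation has to read the variable at a much lower level):

    # Hypothetical sketch: parsing CRYSTAL_TRACE into a set of enabled sections.
    # ENV is used only to keep the example short; the real implementation has to
    # read the variable before the stdlib is initialized.
    require "set"

    enum TraceSection
      GC
      Sched
    end

    def parse_trace_sections(value : String?) : Set(TraceSection)
      sections = Set(TraceSection).new
      return sections unless value

      value.split(',') do |name|
        case name.strip.downcase
        when "gc"    then sections << TraceSection::GC
        when "sched" then sections << TraceSection::Sched
        end
      end
      sections
    end

    enabled = parse_trace_sections(ENV["CRYSTAL_TRACE"]?)
    enabled.includes?(TraceSection::GC) # => true for CRYSTAL_TRACE=gc,sched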

Evolutions

We could augment the tracing to report more/less data about the program, for example only outputting GC collection stats (with more before/after details). We could also listen for a signal (and/or a keyboard shortcut) to stop the world and print details about the threads and schedulers (that one would fit nicely into perf-tools for starters).

Technical notes

See #14599 and #14659 for a proposed implementation and technical discussions.

@straight-shoota (Member):

To clarify/for the record: This is about logging for basic components of the Crystal runtime. We cannot use the sophisticated Log API because that itself builds on top of these runtime features. This has to be very low level.

The trace is meant to be easy to parse and read, to be grepped, searched, copy pasted, you name it.

Single-line log messages are great. But I'd suggest using a standardized format instead of a kind of custom key=value sequence. JSON could be a good choice. It's maybe not as great in terms of human readability, but still quite okay. It's unambiguous and efficient to generate and parse. And it's super easy to pipe into all kinds of log analytics engines.

Dumping the trace unconditionally to stderr could mix up with application output going to the same file descriptor. We need to be able to isolate these different purposes from each other, i.e. write traces to a different location.
It might be fine to use stderr by default, but allow configuration of a specific file descriptor or a path to a file, via an environment variable. Tooling such as crystal tool trace could set that up automatically.

@ysbaddaden (Contributor, Author):

This is about logging for basic components of the Crystal runtime.

It's also meant to have the least impact on the program execution, so edge cases and race conditions can still be reproduced (not every time, it will still have some impact).

I'd suggest using a standardized format instead of a custom sequence.

Good point on using a standard format that could be easily ingested by external tools and databases (nice)! I hope I can find a better format than JSON, which would hurt human readability (my main use case).

Dumping the trace unconditionally to stderr could mix up with application output going to the same file descriptor

Being able to trace to a file instead of stderr would be ❤️. I'm just afraid of losing the atomic write guarantee of pipes (writes of up to PIPE_BUF bytes are atomic), which would lead to interleaved, garbled traces when multiple threads are tracing.
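For what it's worth, the usual way to keep that guarantee is to format the whole line into one buffer and emit it with a single write(2), so lines from different threads cannot interleave as long as they stay under PIPE_BUF. A rough sketch (my illustration, not the proposed implementation):

    # Sketch: emit a complete trace line with one write(2) call so it cannot
    # interleave with another thread's line (atomic up to PIPE_BUF on pipes).
    def write_trace_line(fd : Int32, line : String) : Nil
      bytes = line.to_slice
      LibC.write(fd, bytes, bytes.size)
    end

    write_trace_line(2, "gc.malloc 1716470000.123 thread=0x1234:main size=128\n")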

@straight-shoota (Member):

It should always be an option to pass a file descriptor to dump the trace into. And that file descriptor can be a pipe, so it keeps that guarantee. It should be easy enough to set up externally so that the pipe dumps into a file.

@yxhuvud (Contributor) commented May 25, 2024:

This is great to have and will reduce the need for manual printf debugging. It is probably not needed in the first implementation, but at some point it would be nice if the handling of the events were pluggable with a pretty locked-down interface (i.e. no strings except for things like fiber names, just records with enums and measurements in the recording steps). I'm thinking of shards providing support for something like https://github.com/mikesart/gpuvis/wiki , or pushing the values into hdrhistogram to be processed later, or whatever, without having to monkeypatch anything.

That also changes the perspective a bit, as it immediately turns the discussion into one about event generation and processing/presentation as separate activities, where the latter is up to the user. But perhaps that can be added later, as long as the door isn't closed by any implementation choices.

@ysbaddaden (Contributor, Author):

Researching formats:

  • the key=value notation is very popular in line protocols (Syslog, OpenTSDB, InfluxDB);
  • traces add up quickly. For example the crystal compiler compiling itself leads to roughly 60 million events. The more complex the message, the more impact it has (speed, log size).

I'd thus like to have a couple of formats: by default, a format that is as compact & readable as possible. I propose to follow the OpenTSDB line protocol:

    <section>.<operation> <timestamp> thread=<0x1234>:<name> fiber=<0x9876>:<name> [key=value]
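To make that layout concrete, here is a rough sketch of a formatter (a hypothetical helper of my own; the real tracer would avoid heap allocations):

    # Rough sketch: render one event in the proposed line format.
    def format_trace(section : String, operation : String, timestamp : Float64,
                     thread_id : UInt64, thread_name : String,
                     fiber_id : UInt64, fiber_name : String, **metadata) : String
      String.build do |io|
        io << section << '.' << operation << ' ' << timestamp
        io << " thread=0x" << thread_id.to_s(16) << ':' << thread_name
        io << " fiber=0x" << fiber_id.to_s(16) << ':' << fiber_name
        metadata.each { |key, value| io << ' ' << key << '=' << value }
      end
    end

    format_trace("gc", "malloc", 1716470000.123, 0x1234_u64, "main", 0x9876_u64, "fiber-1", size: 128)
    # => "gc.malloc 1716470000.123 thread=0x1234:main fiber=0x9876:fiber-1 size=128"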

Then have CRYSTAL_TRACE_FORMAT=json to output a bloated JSON string of the following form (pretty printed for the sake of the format description; the actual trace will be on one line):

    {
      "event": "<section.operation>",
      "timestamp": <timestamp>,
      "thread_id": "0x1234",
      "thread_name": "<name>",
      "fiber_id": "0x5678",
      "fiber_name": "<name>",
      "key": "value"
    }
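One nice property of the JSON form is that post-processing can lean on the stdlib. A sketch, assuming one JSON object per line in a `trace.log` file and a `duration` attribute expressed in seconds (both assumptions on my part):

    # Sketch: total time spent in GC malloc, read from a JSON-format trace.
    require "json"

    total = 0.0
    count = 0

    File.each_line("trace.log") do |line|
      event = JSON.parse(line)
      next unless event["event"]? == "gc.malloc"
      total += event["duration"]?.try(&.as_f?) || 0.0
      count += 1
    end

    puts "#{count} mallocs, #{total} seconds in GC.malloc"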

Notes:

  • The duration becomes a simple metadata attribute;
  • Strings should be enclosed within double quotes (especially for JSON), which means we must add support for escaping characters to Crystal::System.printf (see the sketch below).
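For the second note, a sketch of the kind of escaping that would be needed when a string value is wrapped in double quotes (hypothetical helper; the real code would presumably format into a fixed buffer instead of building a String):

    # Sketch of the escaping mentioned above: quote the value and escape the
    # characters that would otherwise break the line or the JSON output.
    def escape_trace_value(value : String) : String
      String.build do |io|
        io << '"'
        value.each_char do |char|
          case char
          when '"'  then io << "\\\""
          when '\\' then io << "\\\\"
          when '\n' then io << "\\n"
          else           io << char
          end
        end
        io << '"'
      end
    end

    escape_trace_value(%(fiber "main")) # => "fiber \"main\"" (quotes included)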

@philipp-kempgen:

Single-line log messages are great. But I'd suggest using a standardized format instead

Did you consider the OpenTelemetry Protocol (OTLP)? OTLP/HTTP maybe?
That could be fed into the Jaeger tracing platform, for example.
You can log time spans and log messages that way.

Ruby:
https://opentelemetry.io/docs/languages/ruby/
https://github.com/open-telemetry/opentelemetry-ruby

Line-based output would still be helpful for quick debugging, as OTLP can be slightly tricky to set up.
Maybe one of the OpenTelemetry exporters can do that.

@philipp-kempgen:

  • the key=value notation is very popular in line protocols (Syslog, OpenTSDB, InfluxDB);

Yes, that's "logfmt".
https://grafana.com/docs/loki/latest/send-data/promtail/stages/logfmt/
https://brandur.org/logfmt
imho that's good for log messages, not so much for nested time spans though.

@ysbaddaden (Contributor, Author):

Reading the OTLP spec, it looks far too complex for the bare metal level we want here. It's not about tracing an application, but tracing the core runtime, down to each GC malloc for example.

ysbaddaden linked a pull request Jun 3, 2024 that will close this issue
straight-shoota added a commit that referenced this issue Jun 15, 2024
Implements tracing of the garbage collector and the scheduler as per #14618

Tracing is enabled by compiling with `-Dtracing`; individual sections must then be enabled at runtime with the `CRYSTAL_TRACE` environment variable, a comma-separated list of sections to enable, for example:

- ` ` (empty value) or `none` to disable any tracing (default)
- `gc`
- `sched`
- `gc,sched`
- `all` to enable everything

The traces are printed to the standard error by default, but this can be changed at runtime with the `CRYSTAL_TRACE_FILE` environment variable, for example `trace.log`. You can also redirect the standard error to a file (e.g. `2> trace.log` in a UNIX shell).

Example tracing calls:

    Crystal.trace :sched, "spawn", fiber: fiber
    Crystal.trace :gc, "malloc", size: size, atomic: 1

**Technical note:** tracing happens before the stdlib is initialized, so the implementation must rely on some `LibC` methods directly (i.e. reading environment variables, writing to a file descriptor) and can't use the core/stdlib abstractions.

Co-authored-by: Johannes Müller <straightshoota@gmail.com>
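For the record, a minimal illustration of what "rely on some `LibC` methods directly" can look like (my own sketch, not the merged implementation):

    # Illustration only: read an environment variable and write a trace line
    # without going through the ENV / STDERR / IO abstractions.
    ptr = LibC.getenv("CRYSTAL_TRACE")
    sections = ptr ? String.new(ptr) : ""

    line = "sched.spawn 1716470000.123 thread=0x1234:main\n"
    LibC.write(2, line, line.bytesize) # fd 2 is the standard error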