Add Perfetto trace output to profiling.sampling

# Feature or enhancement

### Proposal:

## Feature

Add a Perfetto trace output backend to `profiling.sampling` (target: 3.16), alongside the existing output formats. This would let users open Python sampling profiles directly in the [Perfetto UI](https://ui.perfetto.dev) and analyse them with PerfettoSQL in the [trace processor](https://perfetto.dev/docs/analysis/trace-processor).

Opening this for design sanity-check before sending a PR, as suggested by @pablogsal in offline discussion. The most load-bearing question (how to emit protobuf without a runtime dependency) is covered below.

## Motivation

- Perfetto is a trace viewer oriented towards systems-performance work. Viewing Python samples there lets a Python profile sit on the same timeline as scheduler events, native CPU samples, and application instrumentation, rather than living in a separate tool.
- Once the trace processor can read the output, users get SQL-driven analysis of Python samples (top-N functions, per-thread breakdowns, custom aggregations).

## Proposed design

### Output format

Emit a Perfetto trace (`perfetto.protos.Trace`) containing:
- A `ProcessDescriptor` / `ThreadDescriptor` per observed process/thread.
- `InternedData` carrying `Frame`, `Mapping`, and `Callstack` entries.
- One `TracePacket` per sample containing a `StackSample` message that references the interned callstack.

`StackSample` is part of a **new set of public profiling protos** I'm landing in Perfetto specifically so that producers like this one have a stable, transport-neutral surface to target (rather than reusing `PerfSample`, which is shaped by `perf_event_open` and leaks producer diagnostics into the data). The RFC is at https://github.com/google/perfetto/discussions/6027.

Concretely, each Python sample maps to roughly:

```
TracePacket {
  timestamp: <ns>
  trusted_packet_sequence_id: <seq>
  interned_data { ... }  // first packet only, or as new frames appear
  stack_sample {
    task_context_iid: <thread>
    execution_context_iid: <cpu/mode>     // optional
    callstack_iid: <interned callstack>
    primary_descriptor_iid: <"wall_time_ns" counter>
    primary_weight: <ns since last sample>
  }
}
```
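
One way to handle the "first packet only, or as new frames appear" behaviour is an interning table that hands out a small integer ID the first time it sees a value and remembers which entries have not been written out yet. The sketch below is illustrative only: `InternTable`, `intern`, `take_pending`, and `intern_stack` are hypothetical names, frames are modelled as plain `(function, filename, lineno)` tuples rather than real proto messages, and starting IDs at 1 is an assumption made so that 0 can be read as "unset".

```python
class InternTable:
    """Assigns stable integer IDs (iids) to hashable values, starting at 1."""

    def __init__(self):
        self._iids = {}      # value -> iid
        self._pending = []   # (iid, value) pairs not yet emitted in a packet

    def intern(self, value):
        """Return the iid for value, allocating a new one on first sight."""
        iid = self._iids.get(value)
        if iid is None:
            iid = len(self._iids) + 1   # assumption: 0 is left to mean "unset"
            self._iids[value] = iid
            self._pending.append((iid, value))
        return iid

    def take_pending(self):
        """Return entries added since the last call and clear the backlog."""
        pending, self._pending = self._pending, []
        return pending


frames = InternTable()
callstacks = InternTable()


def intern_stack(stack):
    """Intern each frame, then intern the tuple of frame iids as a callstack."""
    frame_iids = tuple(frames.intern(frame) for frame in stack)
    return callstacks.intern(frame_iids)


# Two distinct stacks that share a frame; frames are (function, filename, lineno).
main = ("main", "app.py", 10)
work = ("work", "app.py", 42)

first = intern_stack([main, work])
new_frames_1 = frames.take_pending()   # both frames are new at this point
second = intern_stack([main])
new_frames_2 = frames.take_pending()   # empty: "main" was already interned
third = intern_stack([main, work])     # same stack as the first sample
```

With a table like this, a packet only needs an `interned_data` section when `take_pending()` returns something, and repeated stacks reduce to a single integer per sample.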

### Proto serialisation without a runtime dependency

The stdlib can't depend on `protobuf`, so I'd hand-roll the wire format for the specific message types we emit. This is tractable because:

- **Proto wire format is tiny.** It's varints + length-delimited + fixed32/64, with a varint tag (field number plus wire type) in front of each field. The whole encoder for the messages we need is on the order of a couple hundred lines of Python.
- **We only encode, never decode.** Decoding protobuf is significantly harder than encoding.
- **The set of message types is small and stable.** In the upstream RFC I've explicitly designated the protos I'm adding as "eternally stable": we won't change their wire format or semantics in a backwards-incompatible way.
- **Precedent.** Perfetto itself ships an ad-hoc proto encoder/decoder (`protozero`) in C++ for similar reasons.

Sketch of the shape:

```python
# Hand-rolled, no runtime deps.
def _varint(buf, n): ...
def _tag(buf, field_no, wire_type): ...
def _string(buf, field_no, s): ...
def _message(buf, field_no, payload): ...

def encode_stack_sample(buf, sample):
    _tag(buf, 1, WIRE_VARINT); _varint(buf, sample.task_context_iid)
    _tag(buf, 6, WIRE_VARINT); _varint(buf, sample.callstack_iid)
    # ...
```

Field numbers and wire types come straight from the `.proto` definitions in the Perfetto RFC. Any wire type constants would be inlined as Python integers.

### CLI surface

Add a new `--format perfetto` CLI flag to the `profiling.sampling` module, accepted in all the same places `--gecko` is accepted today.

## Target

Python 3.16.

## Prerequisites

- [Perfetto RFC-0027](https://github.com/google/perfetto/discussions/6027): public stack-sampling and heap-profiling protos. Required before this lands so we're targeting the stable protos.

cc @pablogsal

### Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

### Links to previous discussion of this feature:

_No response_
