[KeyInstr] Add docs #137991

OCHyams · 2025-04-30T16:48:38Z

These documents explain the core ideas and some implementation details of the Key Instructions project. The LLVM document also outlines the two main limitations of our approach.

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

Main patch stack starts here: #133477

Is this the right level of detail? Happy to take suggestions.

I also wasn't sure whether to split this (as I have done) between LLVM and Clang docs. Does this split look right?

llvmbot · 2025-04-30T16:49:12Z

@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-clang

Author: Orlando Cazalet-Hyams (OCHyams)

Changes

These documents explain the core ideas and some implementation details of the Key Instructions project. The LLVM document also outlines the two main limitations of our approach.

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

Main patch stack starts here: #133477

Is this the right level of detail? Happy to take suggestions.

I also wasn't sure whether to split this (as I have done) between LLVM and Clang docs. Does this split look right?

Full diff: https://github.com/llvm/llvm-project/pull/137991.diff

2 Files Affected:

(added) clang/docs/KeyInstructionsClang.md (+120)
(added) llvm/docs/KeyInstructionsDebugInfo.md (+114)

diff --git a/clang/docs/KeyInstructionsClang.md b/clang/docs/KeyInstructionsClang.md
new file mode 100644
index 0000000000000..441fab41a60ee
--- /dev/null
+++ b/clang/docs/KeyInstructionsClang.md
@@ -0,0 +1,120 @@
+# Key Instructions in Clang
+
+Key Instructions is an LLVM feature that reduces the jumpiness of optimized code debug stepping. This document explains how Clang applies the necessary metadata.
+
+## Implementation
+
+See the [LLVM docs](../../llvm/docs/KeyInstructionsDebugInfo.md) for general info about the feature (and LLVM implementation details).
+
+Clang needs to annotate key instructions with the new metadata. Variable assignments (stores, memory intrinsics), control flow (branches and their conditions, some unconditional branches), and exception handling instructions are annotated. Calls are ignored as they're unconditionally marked `is_stmt`. This is achieved with a few simple constructs:
+
+Class `ApplyAtomGroup` - This is a scoped helper similar to `ApplyDebugLocation` that creates a new source atom group which instructions can be added to. It's used during CodeGen to declare that a new source atom has started, e.g. in `CodeGenFunction::EmitBinaryOperatorLValue`.
+
+`CodeGenFunction::addInstToCurrentSourceAtom(llvm::Instruction *KeyInstruction, llvm::Value *Backup)` adds an instruction (and a backup instruction if non-null) to the current "atom group" defined with `ApplyAtomGroup`. The Key Instruction gets rank 1, and backup instructions get higher ranks (the function looks through casts, applying increasing rank as it goes). There are a lot of sites in Clang that need to call this (mostly stores and store-like instructions). FIXME?: Currently it's called at the CGBuilderTy callsites; it could instead make sense to always call the function inside the CGBuilderTy calls, with some calls opting out.
+
+`CodeGenFunction::addInstToNewSourceAtom(llvm::Instruction *KeyInstruction, llvm::Value *Backup)` adds an instruction (and a backup instruction if non-null) to a new "atom group". Currently mostly used in loop handling code.
+
+There are a couple of other helpers, including `addRetToOverrideOrNewSourceAtom` used for `rets` which is covered in the examples below.
+
+## Examples
+
+A simple example walk through:
+```
+void fun(int a) {
+  int b = a;
+}
+```
+
+There are two key instructions here, the assignment and the implicit return. We want to emit metadata that looks like this:
+
+```
+define hidden void @_Z3funi(i32 noundef %a) #0 !dbg !11 {
+entry:
+  %a.addr = alloca i32, align 4
+  %b = alloca i32, align 4
+  store i32 %a, ptr %a.addr, align 4
+  %0 = load i32, ptr %a.addr, align 4, !dbg !DILocation(line: 2, scope: !11, atomGroup: 1, atomRank: 2)
+  store i32 %0, ptr %b, align 4,       !dbg !DILocation(line: 2, scope: !11, atomGroup: 1, atomRank: 1)
+  ret void,                            !dbg !DILocation(line: 3, scope: !11, atomGroup: 2, atomRank: 1)
+}
+```
+
+The store is the key instruction for the assignment (`atomGroup` 1). The instruction corresponding to the final (and in this case only) RHS value, the load from `%a.addr`, is a good backup location for `is_stmt` if the store gets optimized away. It's part of the same source atom, but has lower `is_stmt` precedence, so it gets a higher `atomRank`.
+
+This is all handled during CodeGen. The atom group is set here:
+```
+>  clang::CodeGen::ApplyAtomGroup::ApplyAtomGroup(clang::CodeGen::CGDebugInfo * DI) Line 187
+   clang::CodeGen::CodeGenFunction::EmitAutoVarInit(const clang::CodeGen::CodeGenFunction::AutoVarEmission & emission) Line 1961
+   clang::CodeGen::CodeGenFunction::EmitAutoVarDecl(const clang::VarDecl & D) Line 1361
+   clang::CodeGen::CodeGenFunction::EmitVarDecl(const clang::VarDecl & D) Line 219
+   clang::CodeGen::CodeGenFunction::EmitDecl(const clang::Decl & D) Line 164
+   clang::CodeGen::CodeGenFunction::EmitDeclStmt(const clang::DeclStmt & S) Line 1611
+   clang::CodeGen::CodeGenFunction::EmitSimpleStmt(const clang::Stmt * S, llvm::ArrayRef<clang::Attr const *> Attrs) Line 466
+   clang::CodeGen::CodeGenFunction::EmitStmt(const clang::Stmt * S, llvm::ArrayRef<clang::Attr const *> Attrs) Line 72
+   clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(const clang::CompoundStmt & S, bool GetLast, clang::CodeGen::AggValueSlot AggSlot) Line 556+
+   clang::CodeGen::CodeGenFunction::EmitFunctionBody(const clang::Stmt * Body) Line 1307
+```
+
+And the DILocations are updated here:
+```
+>  clang::CodeGen::CodeGenFunction::addInstToCurrentSourceAtom(llvm::Instruction * KeyInstruction, llvm::Value * Backup, unsigned char KeyInstRank) Line 2551
+   clang::CodeGen::CodeGenFunction::EmitStoreOfScalar(llvm::Value * Value, clang::CodeGen::Address Addr, bool Volatile, clang::QualType Ty, clang::CodeGen::LValueBaseInfo BaseInfo, clang::CodeGen::TBAAAccessInfo TBAAInfo, bool isInit, bool isNontemporal) Line 2133
+   clang::CodeGen::CodeGenFunction::EmitStoreOfScalar(llvm::Value * value, clang::CodeGen::LValue lvalue, bool isInit) Line 2152
+   clang::CodeGen::CodeGenFunction::EmitStoreThroughLValue(clang::CodeGen::RValue Src, clang::CodeGen::LValue Dst, bool isInit) Line 2478
+   clang::CodeGen::CodeGenFunction::EmitScalarInit(const clang::Expr * init, const clang::ValueDecl * D, clang::CodeGen::LValue lvalue, bool capturedByInit) Line 805
+   clang::CodeGen::CodeGenFunction::EmitExprAsInit(const clang::Expr * init, const clang::ValueDecl * D, clang::CodeGen::LValue lvalue, bool capturedByInit) Line 2088
+   clang::CodeGen::CodeGenFunction::EmitAutoVarInit(const clang::CodeGen::CodeGenFunction::AutoVarEmission & emission) Line 2050
+   clang::CodeGen::CodeGenFunction::EmitAutoVarDecl(const clang::VarDecl & D) Line 1361
+   clang::CodeGen::CodeGenFunction::EmitVarDecl(const clang::VarDecl & D) Line 219
+   clang::CodeGen::CodeGenFunction::EmitDecl(const clang::Decl & D) Line 164
+   clang::CodeGen::CodeGenFunction::EmitDeclStmt(const clang::DeclStmt & S) Line 1611
+   clang::CodeGen::CodeGenFunction::EmitSimpleStmt(const clang::Stmt * S, llvm::ArrayRef<clang::Attr const *> Attrs) Line 466
+   clang::CodeGen::CodeGenFunction::EmitStmt(const clang::Stmt * S, llvm::ArrayRef<clang::Attr const *> Attrs) Line 72
+   clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(const clang::CompoundStmt & S, bool GetLast, clang::CodeGen::AggValueSlot AggSlot) Line 556
+   clang::CodeGen::CodeGenFunction::EmitFunctionBody(const clang::Stmt * Body) Line 1307
+
+```
+
+The implicit return is also key (`atomGroup` 2) so that it's stepped on, to match existing non-key-instructions behaviour.
+
+```
+>  clang::CodeGen::CodeGenFunction::addRetToOverrideOrNewSourceAtom(llvm::ReturnInst * Ret, llvm::Value * Backup, unsigned char KeyInstRank) Line 2567
+   clang::CodeGen::CodeGenFunction::EmitFunctionEpilog(const clang::CodeGen::CGFunctionInfo & FI, bool EmitRetDbgLoc, clang::SourceLocation EndLoc) Line 3839
+   clang::CodeGen::CodeGenFunction::FinishFunction(clang::SourceLocation EndLoc) Line 433
+```
+
+`addRetToOverrideOrNewSourceAtom` is a special function used for handling `ret`s. In this case it simply replaces the DILocation with the `atomGroup` and `atomRank` set, adding it to its own atom.
+
+To demonstrate why `ret`s need special handling, we need to look at a more "complex" example, below.
+
+```
+int fun(int a) {
+  return a;
+}
+```
+
+Rather than emit a `ret` for each `return` Clang, in all but the simplest cases (as in the first example) emits a branch to a dedicated block with a single `ret`. That branch is the key instruction for the return statement. If there's only one branch to that block, because there's only one `return` (as in this example), Clang folds the block into its only predecessor. We need to do some accounting to transfer the `atomGroup` number to the `ret` when that happens:
+
+When we hit the special-casing code that knows we've only got one block (the IR looks like this):
+```
+entry:
+  %a.addr = alloca i32, align 4
+  %allocapt = bitcast i32 undef to i32
+  store i32 %a, ptr %a.addr, align 4
+  br label %return, !dbg !6
+```
+
+...remember the branch-to-return-block's `atomGroup`:
+
+```
+>  clang::CodeGen::CGDebugInfo::setRetInstSourceAtomOverride(unsigned __int64 Group) Line 168
+   clang::CodeGen::CodeGenFunction::EmitReturnBlock() Line 332
+   clang::CodeGen::CodeGenFunction::FinishFunction(clang::SourceLocation EndLoc) Line 415
+```
+
+And apply it to the `ret` when it's added:
+```
+>  clang::CodeGen::CodeGenFunction::addRetToOverrideOrNewSourceAtom(llvm::ReturnInst * Ret, llvm::Value * Backup, unsigned char KeyInstRank) Line 2567
+   clang::CodeGen::CodeGenFunction::EmitFunctionEpilog(const clang::CodeGen::CGFunctionInfo & FI, bool EmitRetDbgLoc, clang::SourceLocation EndLoc) Line 3839
+   clang::CodeGen::CodeGenFunction::FinishFunction(clang::SourceLocation EndLoc) Line 433
+```
diff --git a/llvm/docs/KeyInstructionsDebugInfo.md b/llvm/docs/KeyInstructionsDebugInfo.md
new file mode 100644
index 0000000000000..1b2acfb2bfb29
--- /dev/null
+++ b/llvm/docs/KeyInstructionsDebugInfo.md
@@ -0,0 +1,114 @@
+# Key Instructions debug info in LLVM
+
+Key Instructions reduces the jumpiness of optimized code debug stepping. This document explains the feature and how it is implemented in LLVM. For Clang support please see the Clang docs.
+
+## Status
+
+In development - some details may change with little notice.
+
+Tell Clang [not] to produce Key Instructions metadata with `-g[no-]key-instructions`. See the Clang docs for implementation info.
+
+Use LLVM flag `-dwarf-use-key-instructions` to interpret Key Instructions metadata when producing the DWARF line table (Clang passes the flag to LLVM). The behaviour of this flag may change.
+
+The feature improves optimized code stepping; it's intended for the feature to be used with optimisations enabled. Although the feature works at O0 it is not recommended because in some cases the effect of editing variables may not always be immediately realised. (This is a quirk of the current implementation, rather than fundemental limitation, covered in more detail later).
+
+There is currently no plan to support CodeView.
+
+## Problem statement
+
+A lot of the noise in stepping comes from code motion and instruction scheduling. Consider a long expression on a single line. It may involve multiple operations that optimisations move, re-order, and interleave with other instructions that have different line numbers.
+
+DWARF provides a helpful tool the compiler can employ to mitigate this jumpiness, the is_stmt flag, which indicates that an instruction is a recommended breakpoint location. However, LLVM's current approach to deciding is_stmt placement essentially reduces down to "is the associated line number different to the previous instruction's?".
+
+(Note: It's up to the debugger if it wants to interpret is_stmt or not, and at time of writing LLDB doesn't; possibly because LLVM's is_stmts convey no information that can't already be deduced from the rest of the line table.)
+
+## Solution overview
+
+Taking ideas from two papers [1][2] that explore the issue, especially C. Tice's:
+
+From the perspective of a source-level debugger user:
+
+* Source code is made up of interesting constructs; the level of granularity for “interesting” while stepping is typically assignments, calls, control flow. We’ll call these interesting constructs Atoms.
+
+* Atoms usually have one instruction that implements the functionality that a user can observe; once they step “off” that instruction, the atom is finalised. We’ll call that a Key Instruction.
+
+* Communicating where the key instructions are to the debugger (using DWARF’s is_stmt) avoids jumpiness introduced by scheduling non-key instructions without losing source attribution (because non-key instructions retain an associated source location, they’re just ignored for stepping).
+
+## Solution implementation
+
+1. `DILocation` has 2 new fields, `atomGroup` and `atomRank`.
+2. Clang creates `DILocations` using the new fields to communicate which instructions are "interesting".
+3. There’s some bookkeeping required by optimisations that duplicate control flow.
+4. During DWARF emission, the new metadata is collected (linear scan over instructions) to decide is_stmt placements.
+
+1. *The metadata* - The two new `DILocation` fields are `atomGroup` and `atomRank`. Both are unsigned integers. Instructions in the same function with the same `(atomGroup, inlinedAt)` pair are part of the same source atom. `atomRank` determines is_stmt preference within that group, where a lower number is higher precedence. Higher rank instructions act as "backup" is_stmt locations, providing good fallback locations if/when the primary candidate gets optimized away. The default values of 0 indicate the instruction isn’t interesting - it's not an is_stmt candidate.
+
+2. *Clang annotates key instructions* with the new metadata. Variable assignments (stores, memory intrinsics), control flow (branches and their conditions, some unconditional branches), and exception handling instructions are annotated. Calls are ignored as they're unconditionally marked is_stmt.
+
+3. *Throughout optimisation*, the DILocation is propagated normally. Cloned instructions get the original’s DILocation, the new fields get merged in getMergedLocation, etc. However, pass writers need to intercede in cases where a code path is duplicated, e.g. unrolling, jump-threading. In these cases we want to emit key instructions in both the original and duplicated code, so the duplicated must be assigned new `atomGroup` numbers, in a similar way that instruction operands must get remapped. There’s facilities to help this: `mapAtomInstance(const DebugLoc &DL, ValueToValueMapTy &VMap)` adds an entry to `VMap` which can later be used for remapping using `llvm::RemapSourceAtom(Instruction *I, ValueToValueMapTy &VM)`. `mapAtomInstance` is called from `llvm::CloneBasicBlock` and `llvm::RemapSourceAtom` is called from `llvm::RemapInstruction` so in many cases no additional effort is actually needed.
+
+`mapAtomInstance` ensures `LLVMContextImpl::NextAtomGroup` is kept up to date, which is the global “next available atom number”.
+
+The `DILocations` carry over from IR to MIR as normal, without any changes.
+
+4. *DWARF emission* - Iterate over all instructions in a function. For each `(atomGroup, inlinedAt)` pair we find the set of instructions sharing the lowest rank. Only the last of these instructions in each basic block is included in the set. The instructions in this set get is_stmt applied to their source locations. That `is_stmt` then "floats" to the top of contiguous sequence of instructions with the same line number in the same block. That has two benefits when optimisations are enabled. First, this floats `is_stmt` to the top of epilogue instructions (rather than applying it to the `ret` instruction itself) which is important to avoid losing variable location coverage at return statements. Second, it reduces the difference in optimized code stepping behaviour between when Key Instructions is enabled and disabled in “uninteresting” cases. I.e., it appears to generally reduce unnecessary changes in stepping.
+
+We’ve used contiguous line numbers rather than atom membership as the test there because of our choice to represent source atoms with a single integer ID. We can’t have instructions belonging to multiple atom groups or represent any kind of grouping hierarchy. That means we can’t rely on all the call setup instructions being in the same group currently (e.g., if one of the argument expressions contains key functionality such as a store, it will be in its own group).
+
+## Adding the feature to a front end
+
+Front ends that want to use the feature need to do some heavy lifting; they need to annotate Key Instructions and their backups with `DILocations` with the necessary `atomGroup` and `atomRank` values. Currently they also need to tell LLVM to interpret the metadata by passing the `-dwarf-use-key-instructions` flag.
+
+The prototype had LLVM annotate instructions (instead of Clang) using simple heuristics (just looking at kind of instructions). This doesn't exist anywhere upstream, but could be shared if there's interest (e.g., so another front end can try it out before committing to a full implementation ), feel fre to reach out on Discourse (@OCHyams).
+
+## Limitations
+
+### Lack of multiple atom membership
+
+Using a number to represent atom membership is limiting; currently an instruction cannot belong to multiple atoms. Does this come up in practice? Yes. Both in the front end and during optimisations. Consider this C code:
+```c
+a = b = c;
+```
+Clang generates this IR:
+```llvm
+  %0 = load i32, ptr %c.addr, align 4
+  store i32 %0, ptr %b.addr, align 4
+  store i32 %0, ptr %a.addr, align 4
+```
+The load of `c` is used by both stores (which are the Key Instructions for each assignment respectively). We can only use it as a backup location for one of the two atoms.
+
+Certain optimisations merge source locations, which presents another case where it might make sense to be able to represent an instruction belonging to multiple atoms. Currently we deterministically pick one (choosing to keep the lower rank one if there is one).
+
+### Disabled at O0
+
+Consider the following code without optimisations:
+```
+int c =
+    a + b;
+```
+In the current implementation an `is_stmt` won't be generated for the `a + b` instruction, meaning debuggers will likely step over the `add` and stop at the `store` of the result into `c` (which does get `is_stmt`). A user might have hoped to edit `a` or `b` on the previous line in order to alter the result stored to `c`, which they now won't have the chance to do (they'd need to edit the variables on a previous line instead). If the expression was all on one line then they would be able to edit the values before the `add`. For these reasons we're choosing to recommend that the feature should not be enabled at O0.
+
+It should be possible to fix this case if we make a few changes: add all the instructions in the statement (i.e., including the loads) to the atom, and tweak the DwarfEmission code to understand this situation (same atom, different line). So there is room to persue this in the future. Though that gets tricky in some cases due to the [other limitation mentioned above](#lack-of-multiple-atom-membership), e.g.:
+```c
+int e =        // atom 1
+    (a + b)    // atom 1
+  * (c = d);   // - atom 2
+```
+```llvm
+  %0 = load i32, ptr %a.addr, align 4     ; atom 1
+  %1 = load i32, ptr %b.addr, align 4     ; atom 1
+  %add = add nsw i32 %0, %1               ; atom 1
+  %2 = load i32, ptr %d.addr, align 4     ; - atom 2
+  store i32 %2, ptr %c.addr, align 4      ; - atom 2
+  %mul = mul nsw i32 %add, %2             ; atom 1
+  store i32 %mul, ptr %e, align 4         ; atom 1
+```
+Without multiple-atom-membership or some kind of atom hierarchy it's not apparent how to get the `is_stmt` to stick to `a + b`, given the other rules the `is_stmt` placement follows.
+
+O0 isn't a key use-case so solving this is not a priority for the initial implementation. The trade off, smoother stepping at the cost of not being able to edit variables to affect an expression in some cases (and at particular stop points), becomes more attractive when optimisations are enabled (we find that editing variables in the debugger in optimized code often produces unexpected effects, so it's not a big concern that Key Instructions makes it harder sometimes).
+
+---
+
+**References**
+* [1] Key Instructions: Solving the Code Location Problem for Optimized Code (C. Tice, . S. L. Graham, 2000)
+* [2] Debugging Optimized Code: Concepts and Implementation on DIGITAL Alpha Systems (R. F. Brender et al)

OCHyams · 2025-06-16T08:03:44Z

Ping, and updated based on PR #144104

nikic · 2025-06-16T13:03:31Z

clang/docs/KeyInstructionsClang.md

+   clang::CodeGen::CodeGenFunction::EmitSimpleStmt(const clang::Stmt * S, llvm::ArrayRef<clang::Attr const *> Attrs) Line 466
+   clang::CodeGen::CodeGenFunction::EmitStmt(const clang::Stmt * S, llvm::ArrayRef<clang::Attr const *> Attrs) Line 72
+   clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(const clang::CompoundStmt & S, bool GetLast, clang::CodeGen::AggValueSlot AggSlot) Line 556+
+   clang::CodeGen::CodeGenFunction::EmitFunctionBody(const clang::Stmt * Body) Line 1307


Uh, I don't think this listing (especially the line numbers!) is going to age well..

Hmm, yeah that's true. I think it's best to just remove the stack traces.

nikic · 2025-06-16T13:06:00Z

llvm/docs/KeyInstructionsDebugInfo.md

I think you need to add this to either UserGuides.rst or Reference.rst for it to actually appear in the docs?

OCHyams · 2025-07-11T16:37:23Z

Sorry for the wait... I've addressed the feedback - how does this look now?

jryans

Overall, looks good. Most of my comments are stylistic nits.

I do think it's important to link to this new feature from the "how to updated debug info" page, as without that, pass writers (esp. in the future) are quite unlikely to know they need to do some work for this feature.

llvm/docs/UserGuides.rst

llvm/docs/KeyInstructionsDebugInfo.md

jryans · 2025-07-14T11:48:43Z

llvm/docs/KeyInstructionsDebugInfo.md

+
+## Status
+
+In development, but mostly complete. The feature is currently disabled for coroutines.


Perhaps briefly explain why it's disabled for coroutines...? Seems a bit random without any explanation.

I guess it essentially is "random"; it's just that at the time of writing, and the branch date, I haven't got to them yet. There's no fundamental reason why they're not supported and we intend to get to them soon. Mixing functions with different key-instructions-modes is fully supported - it just means currently coros use the non-key-instructions handling.

I've added a bit to the sentence, how does this sound?

Looks good to me, thanks!

llvm/docs/KeyInstructionsDebugInfo.md

clang/docs/KeyInstructionsClang.md

jmorse

Various recommendations, but looking good.

jmorse · 2025-07-14T11:12:19Z

clang/docs/KeyInstructionsClang.md

@@ -0,0 +1,46 @@
+# Key Instructions in Clang
+
+Key Instructions is an LLVM feature that reduces the jumpiness of optimized code debug stepping. This document explains how Clang applies the necessary metadata.


This provides a negative reason (reduces jumpiness), IMO we also want to provide a few words about the things positively added by KIs. "reduces jumpiness [...] by discriminating the 'significance' of the instructions making up a statement", perhaps?

Couldn't find better wording than yours, how does this look? I fear it's become a little bit of a word salad.

jmorse · 2025-07-14T11:20:19Z

clang/docs/KeyInstructionsClang.md

+
+Class `ApplyAtomGroup` - This is a scoped helper similar to `ApplyDebugLocation` that creates a new source atom group which instructions can be added to. It's used during CodeGen to declare that a new source atom has started, e.g. in `CodeGenFunction::EmitBinaryOperatorLValue`.
+
+`CodeGenFunction::addInstToCurrentSourceAtom(llvm::Instruction *KeyInstruction, llvm::Value *Backup)` adds an instruction (and a backup instruction if non-null) to the current "atom group" defined with `ApplyAtomGroup`. The Key Instruction gets rank 1, and backup instructions get higher ranks (the function looks through casts, applying increasing rank as it goes). There are a lot of sites in Clang that need to call this (mostly stores and store-like instructions). FIXME?: Currently it's called at the CGBuilderTy callsites; it could instead make sense to always call the function inside the CGBuilderTy calls, with some calls opting out.


IMO we can ditch the FIXME sentence to avoid un-necessary uncertainty. It could be re-shaped into "Most calls to $blah are annotated with addInstToCurrentSourceAtom, with a few unannotated when they shouldn't be 'key'". That signals to the reader that if they're using $blah, they should probably consider making this annotation, which is what they're looking to learn from this documentation.

jmorse · 2025-07-14T11:24:51Z

clang/docs/KeyInstructionsClang.md

+
+`CodeGenFunction::addInstToNewSourceAtom(llvm::Instruction *KeyInstruction, llvm::Value *Backup)` adds an instruction (and a backup instruction if non-null) to a new "atom group". Currently mostly used in loop handling code.
+
+There are a couple of other helpers, including `addInstToSpecificSourceAtom` used for `rets` which is covered in the examples below.


Signalling the code-construct where the other helps are needed with assist the reader, i.e., "Other helpers for more abnormal scenarios where any instruction is put into an atom that is not the 'current' atom" and then describing when that might occur? This alerts the reader to the possibility that that's a code-construct they'll encounter themselves and can be handled / expressed.

Deleted that part, not sure what other functions I was alluding to. How does the new para sound?

jmorse · 2025-07-14T11:28:15Z

clang/docs/KeyInstructionsClang.md

+}
+```
+
+The store is the key instruction for the assignment (`atomGroup` 1). The instruction corresponding to the final (and in this case only) RHS value, the load from `%a.addr`, is a good backup location for `is_stmt` if the store gets optimized away. It's part of the same source atom, but has lower `is_stmt` precedence, so it gets a higher `atomRank`.


Indicating what functions are called to generate these annotations, as you do for the second example, would be good.

jmorse · 2025-07-14T11:29:25Z

clang/docs/KeyInstructionsClang.md

+
+The implicit return is also key (`atomGroup` 2) so that it's stepped on, to match existing non-key-instructions behaviour. This is achieved by calling  `addInstToNewSourceAtom` from within `EmitFunctionEpilog`.
+
+Explicit return statements are handled uniquely. Rather than emit a `ret` for each `return` Clang, in all but the simplest cases (as in the first example) emits a branch to a dedicated block with a single `ret`. That branch is the key instruction for the return statement. If there's only one branch to that block, because there's only one `return` (as in this example), Clang folds the block into its only predecessor. Handily `EmitReturnBlock` returns the `DebugLoc` associated with the single branch in that case, which is fed into `addInstToSpecificSourceAtom` to ensure the `ret` gets the right group.


Also document how the other "returns" get expressed / annotated?

I did that in the para above, or are you referring to something else?

jmorse · 2025-07-14T11:37:28Z

llvm/docs/KeyInstructionsDebugInfo.md

+
+Tell Clang [not] to produce Key Instructions metadata with `-g[no-]key-instructions`. See the Clang docs for implementation info.
+
+The feature improves optimized code stepping; it's intended for the feature to be used with optimisations enabled. Although the feature works at O0 it is not recommended because in some cases the effect of editing variables may not always be immediately realised. (This is a quirk of the current implementation, rather than fundemental limitation, covered in more detail later).


IMO "effect of editing variables" needs to become more explicit: "debuggers might place a breakpoint after an expression has been evaluated, which harms editing variables", or something.

How does this look?

jmorse · 2025-07-14T13:50:31Z

llvm/docs/KeyInstructionsDebugInfo.md

+
+## Adding the feature to a front end
+
+Front ends that want to use the feature need to do some heavy lifting; they need to annotate Key Instructions and their backups with `DILocations` with the necessary `atomGroup` and `atomRank` values. They also need to set `keyInstructions` true in `DISubprogram`s to tell LLVM to interpret the new metadata in those functions.


IMO "heavy lifting" is a bit off-putting (it's true of C++, but might not be true of other languages). Instead, "Frontends [...] need to group and rank instructions according to their interesting-ness, and set the appropriate fields of [...]".

jmorse · 2025-07-14T13:51:43Z

llvm/docs/KeyInstructionsDebugInfo.md

+
+Front ends that want to use the feature need to do some heavy lifting; they need to annotate Key Instructions and their backups with `DILocations` with the necessary `atomGroup` and `atomRank` values. They also need to set `keyInstructions` true in `DISubprogram`s to tell LLVM to interpret the new metadata in those functions.
+
+The prototype had LLVM annotate instructions (instead of Clang) using simple heuristics (just looking at kind of instructions). This doesn't exist anywhere upstream, but could be shared if there's interest (e.g., so another front end can try it out before committing to a full implementation), feel fre to reach out on Discourse (@OCHyams, @jmorse).


IMO we want to shoe-horn the word "store" into this sentence as it signals what that prototype was doing (i.e., you step through all local assignments and all heap-stores).

How does this look?

jmorse · 2025-07-14T13:52:49Z

llvm/docs/KeyInstructionsDebugInfo.md

+
+### Lack of multiple atom membership
+
+Using a number to represent atom membership is limiting; currently an instruction cannot belong to multiple atoms. Does this come up in practice? Yes. Both in the front end and during optimisations. Consider this C code:


"Does this come up in practice? Yes" -> "This does occur in practice [...]".

How about this?

jmorse · 2025-07-14T13:53:38Z

llvm/docs/KeyInstructionsDebugInfo.md

+int c =
+    a + b;
+```
+In the current implementation an `is_stmt` won't be generated for the `a + b` instruction, meaning debuggers will likely step over the `add` and stop at the `store` of the result into `c` (which does get `is_stmt`). A user might have hoped to edit `a` or `b` on the previous line in order to alter the result stored to `c`, which they now won't have the chance to do (they'd need to edit the variables on a previous line instead). If the expression was all on one line then they would be able to edit the values before the `add`. For these reasons we're choosing to recommend that the feature should not be enabled at O0.


IMO, s/hoped/wished/

OCHyams

Thanks for for the comments, I think I've addressed everything. I've merged both files into the LLVM docs as I couldn't get a link to the Clang docs working properly. How does this look?

OCHyams · 2025-07-14T15:55:15Z

clang/docs/KeyInstructionsClang.md

@@ -0,0 +1,46 @@
+# Key Instructions in Clang
+
+Key Instructions is an LLVM feature that reduces the jumpiness of optimized code debug stepping. This document explains how Clang applies the necessary metadata.


Couldn't find better wording than yours, how does this look? I fear it's become a little bit of a word salad.

clang/docs/KeyInstructionsClang.md

OCHyams · 2025-07-14T16:02:03Z

clang/docs/KeyInstructionsClang.md

+
+Class `ApplyAtomGroup` - This is a scoped helper similar to `ApplyDebugLocation` that creates a new source atom group which instructions can be added to. It's used during CodeGen to declare that a new source atom has started, e.g. in `CodeGenFunction::EmitBinaryOperatorLValue`.
+
+`CodeGenFunction::addInstToCurrentSourceAtom(llvm::Instruction *KeyInstruction, llvm::Value *Backup)` adds an instruction (and a backup instruction if non-null) to the current "atom group" defined with `ApplyAtomGroup`. The Key Instruction gets rank 1, and backup instructions get higher ranks (the function looks through casts, applying increasing rank as it goes). There are a lot of sites in Clang that need to call this (mostly stores and store-like instructions). FIXME?: Currently it's called at the CGBuilderTy callsites; it could instead make sense to always call the function inside the CGBuilderTy calls, with some calls opting out.


OCHyams · 2025-07-14T16:11:50Z

clang/docs/KeyInstructionsClang.md

+
+`CodeGenFunction::addInstToNewSourceAtom(llvm::Instruction *KeyInstruction, llvm::Value *Backup)` adds an instruction (and a backup instruction if non-null) to a new "atom group". Currently mostly used in loop handling code.
+
+There are a couple of other helpers, including `addInstToSpecificSourceAtom` used for `rets` which is covered in the examples below.


Deleted that part, not sure what other functions I was alluding to. How does the new para sound?

OCHyams · 2025-07-15T08:42:48Z

clang/docs/KeyInstructionsClang.md

+}
+```
+
+The store is the key instruction for the assignment (`atomGroup` 1). The instruction corresponding to the final (and in this case only) RHS value, the load from `%a.addr`, is a good backup location for `is_stmt` if the store gets optimized away. It's part of the same source atom, but has lower `is_stmt` precedence, so it gets a higher `atomRank`.


OCHyams · 2025-07-15T10:04:03Z

llvm/docs/KeyInstructionsDebugInfo.md

+
+### Lack of multiple atom membership
+
+Using a number to represent atom membership is limiting; currently an instruction cannot belong to multiple atoms. Does this come up in practice? Yes. Both in the front end and during optimisations. Consider this C code:


How about this?

llvm/docs/KeyInstructionsDebugInfo.md

llvm/docs/UserGuides.rst

llvm/docs/KeyInstructionsDebugInfo.md

jmorse · 2025-07-15T12:11:51Z

Well, github has thrown away my review comments again, so I've no idea 🤷 . Phabricator was better.

jmorse

Soft LGTM, awaiting views from @jryans too

jmorse · 2025-07-15T12:17:21Z

llvm/docs/KeyInstructionsDebugInfo.md

@@ -0,0 +1,160 @@
+# Key Instructions debug info in LLVM and Clang
+
+Key Instructions is an LLVM feature that reduces the jumpiness of optimized code debug stepping by discriminating the significance of instrucions that make up source language statements. This document explains the feature and how it is implemented in [LLVM](#llvm) and [Clang](#clang-and-other-front-ends).


instructions

"distinguishing" instead of "discriminating" perhaps to make it less-word-soupy? Or "ranking the significance"?

jmorse · 2025-07-15T12:19:09Z

llvm/docs/KeyInstructionsDebugInfo.md

+
+## Status
+
+Feature complete except for coroutines, which fall back to not-key-instructions handling for now but will get support soon (there is no fundemental reason why they cannot be supported, we've just not got to it at time of writing).


fundamental

jryans · 2025-07-15T12:28:47Z

Well, github has thrown away my review comments again, so I've no idea 🤷 .

Hmm, I think it's just that most of your comments were on KeyInstructionsClang.md, and the latest version has deleted that file.

I can still see your comments in the "Conversations" menu in the "Files Changed" tab. And they're also visible in this default "Conversation" tab.

Phabricator was better.

I agree. 😢

jryans

Thanks, this version looks good to me. 😄

Since @jmorse already reviewed as well, I'll give formal approval.

jryans · 2025-07-15T12:35:23Z

llvm/docs/KeyInstructionsDebugInfo.md

+
+## Status
+
+In development, but mostly complete. The feature is currently disabled for coroutines.


Looks good to me, thanks!

OCHyams · 2025-07-15T12:36:39Z

Rebased to resolve conflict

Thanks for bearing with, despite the tooling :p

llvmbot added the clang Clang issues not falling into any other category label Apr 30, 2025

OCHyams requested review from nikic, jmorse and SLTozer April 30, 2025 16:49

OCHyams mentioned this pull request May 1, 2025

[KeyInstr] Add fields to DILocation behind compile time flag #133477

Merged

jryans added the debuginfo label May 1, 2025

nikic reviewed Jun 16, 2025

View reviewed changes

jryans reviewed Jul 14, 2025

View reviewed changes

jmorse reviewed Jul 14, 2025

View reviewed changes

OCHyams commented Jul 15, 2025

View reviewed changes

jmorse reviewed Jul 15, 2025

View reviewed changes

OCHyams added 10 commits July 15, 2025 13:32

[KeyInstr] Add docs

5515a9a

add worked example to Clang docs

0fdd098

self nits

b59f004

update docs

30dcf49

tidy up, address review comments

d0f3e2a

style and spelling nits etc

b7b18f4

fix layout, remove broken link to clang docs

f68adce

merge clang into llvm doc

a0ca3bf

add ref from HowToUpdateDebugInfo

f4b908e

spellllllling

05fa4ea

OCHyams force-pushed the ki-docs branch from 00e0f93 to 05fa4ea Compare July 15, 2025 12:34

jryans approved these changes Jul 15, 2025

View reviewed changes

OCHyams merged commit acffe83 into llvm:main Jul 15, 2025
8 of 10 checks passed

OCHyams added a commit that referenced this pull request Jul 15, 2025

Fix link in docs after #137991

9eb0fc8


		## Status

		In development, but mostly complete. The feature is currently disabled for coroutines.

		@@ -0,0 +1,46 @@
		# Key Instructions in Clang

		Key Instructions is an LLVM feature that reduces the jumpiness of optimized code debug stepping. This document explains how Clang applies the necessary metadata.


		Class `ApplyAtomGroup` - This is a scoped helper similar to `ApplyDebugLocation` that creates a new source atom group which instructions can be added to. It's used during CodeGen to declare that a new source atom has started, e.g. in `CodeGenFunction::EmitBinaryOperatorLValue`.

		`CodeGenFunction::addInstToCurrentSourceAtom(llvm::Instruction KeyInstruction, llvm::Value Backup)` adds an instruction (and a backup instruction if non-null) to the current "atom group" defined with `ApplyAtomGroup`. The Key Instruction gets rank 1, and backup instructions get higher ranks (the function looks through casts, applying increasing rank as it goes). There are a lot of sites in Clang that need to call this (mostly stores and store-like instructions). FIXME?: Currently it's called at the CGBuilderTy callsites; it could instead make sense to always call the function inside the CGBuilderTy calls, with some calls opting out.


		`CodeGenFunction::addInstToNewSourceAtom(llvm::Instruction KeyInstruction, llvm::Value Backup)` adds an instruction (and a backup instruction if non-null) to a new "atom group". Currently mostly used in loop handling code.

		There are a couple of other helpers, including `addInstToSpecificSourceAtom` used for `rets` which is covered in the examples below.


		The implicit return is also key (`atomGroup` 2) so that it's stepped on, to match existing non-key-instructions behaviour. This is achieved by calling `addInstToNewSourceAtom` from within `EmitFunctionEpilog`.

		Explicit return statements are handled uniquely. Rather than emit a `ret` for each `return` Clang, in all but the simplest cases (as in the first example) emits a branch to a dedicated block with a single `ret`. That branch is the key instruction for the return statement. If there's only one branch to that block, because there's only one `return` (as in this example), Clang folds the block into its only predecessor. Handily `EmitReturnBlock` returns the `DebugLoc` associated with the single branch in that case, which is fed into `addInstToSpecificSourceAtom` to ensure the `ret` gets the right group.


		Tell Clang [not] to produce Key Instructions metadata with `-g[no-]key-instructions`. See the Clang docs for implementation info.

		The feature improves optimized code stepping; it's intended for the feature to be used with optimisations enabled. Although the feature works at O0 it is not recommended because in some cases the effect of editing variables may not always be immediately realised. (This is a quirk of the current implementation, rather than fundemental limitation, covered in more detail later).


		## Adding the feature to a front end

		Front ends that want to use the feature need to do some heavy lifting; they need to annotate Key Instructions and their backups with `DILocations` with the necessary `atomGroup` and `atomRank` values. They also need to set `keyInstructions` true in `DISubprogram`s to tell LLVM to interpret the new metadata in those functions.


		Front ends that want to use the feature need to do some heavy lifting; they need to annotate Key Instructions and their backups with `DILocations` with the necessary `atomGroup` and `atomRank` values. They also need to set `keyInstructions` true in `DISubprogram`s to tell LLVM to interpret the new metadata in those functions.

		The prototype had LLVM annotate instructions (instead of Clang) using simple heuristics (just looking at kind of instructions). This doesn't exist anywhere upstream, but could be shared if there's interest (e.g., so another front end can try it out before committing to a full implementation), feel fre to reach out on Discourse (@OCHyams, @jmorse).


		### Lack of multiple atom membership

		Using a number to represent atom membership is limiting; currently an instruction cannot belong to multiple atoms. Does this come up in practice? Yes. Both in the front end and during optimisations. Consider this C code:

		@@ -0,0 +1,160 @@
		# Key Instructions debug info in LLVM and Clang

		Key Instructions is an LLVM feature that reduces the jumpiness of optimized code debug stepping by discriminating the significance of instrucions that make up source language statements. This document explains the feature and how it is implemented in [LLVM](#llvm) and [Clang](#clang-and-other-front-ends).


		## Status

		Feature complete except for coroutines, which fall back to not-key-instructions handling for now but will get support soon (there is no fundemental reason why they cannot be supported, we've just not got to it at time of writing).

[KeyInstr] Add docs #137991

[KeyInstr] Add docs #137991

Uh oh!

Conversation

OCHyams commented Apr 30, 2025

Uh oh!

llvmbot commented Apr 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

OCHyams commented Jun 16, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

OCHyams commented Jul 11, 2025

Uh oh!

jryans left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jmorse left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

OCHyams left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

llvmbot commented Apr 30, 2025 •

edited

Loading

jryans commented Jul 15, 2025 •

edited

Loading