Commit 66943c3

[DebugInfo][Docs] Document how dbg.value intrinsics are interpreted in optimized code
This patch adds a section, "Object lifetime in optimized code", that documents how such intrinsics are supposed to be handled. It sets out some of the principles of how they specify variable locations, and how long those locations are valid for. This patch also documents one of the objectives behind the variable-location design, that we should never allow the debugger to observe a state of the program that would not have appeared without optimization.

Differential Revision: https://reviews.llvm.org/D58726

llvm-svn: 356041
1 parent 360ce82 commit 66943c3

1 file changed

+125
-0
lines changed

llvm/docs/SourceLevelDebugging.rst

Lines changed: 125 additions & 0 deletions
@@ -391,6 +391,131 @@ inside of subprogram ``!4`` described above.
The scope information attached with each instruction provides a straightforward
way to find instructions covered by a scope.

Object lifetime in optimized code
=================================

In the example above, every variable assignment uniquely corresponds to a
memory store to the variable's position on the stack. However, in heavily
optimized code LLVM promotes most variables into SSA values, which can
eventually be placed in physical registers or memory locations. To track SSA
values through compilation, when objects are promoted to SSA values an
``llvm.dbg.value`` intrinsic is created for each assignment, recording the
variable's new location. Compared with the ``llvm.dbg.declare`` intrinsic:

* A dbg.value terminates the effect of any preceding dbg.values for (any
  overlapping fragments of) the specified variable.
* The dbg.value's position in the IR defines where in the instruction stream
  the variable's value changes.
* Operands can be constants, indicating the variable is assigned a
  constant value.

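The points above can be sketched with a small, hypothetical fragment. It is
written in the same abbreviated style as the other examples in this section:
the function ``@baz`` and the value ``%sum`` are invented for illustration, and
``!1`` and ``!2`` stand for metadata nodes that are not shown.

.. code-block:: llvm

  define i32 @baz(i32 %x) {
  entry:
    ; From this point the variable !1 is described as the constant 0.
    call void @llvm.dbg.value(metadata i32 0, metadata !1, metadata !2)
    %sum = add i32 %x, 5
    ; This dbg.value terminates the constant location above; from here
    ; onwards !1 is described by the SSA value %sum.
    call void @llvm.dbg.value(metadata i32 %sum, metadata !1, metadata !2)
    ret i32 %sum
  }

Under these assumptions, a debugger stopped between the two intrinsics would
report ``0`` for the variable, and one stopped after the second would report
the value of ``%sum``.
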
Care must be taken to update ``llvm.dbg.value`` intrinsics when optimization
passes alter or move instructions and blocks, because the developer could
observe such changes reflected in the values of variables when debugging the
program. For any execution of the optimized program, the set of variable values
presented to the developer by the debugger should not show a state that would
never have existed in the execution of the unoptimized program, given the same
input. Doing so risks misleading the developer by reporting a state that does
not exist, damaging their understanding of the optimized program and
undermining their trust in the debugger.

Sometimes perfectly preserving variable locations is not possible, often when a
redundant calculation is optimized out. In such cases, an ``llvm.dbg.value``
with operand ``undef`` should be used to terminate earlier variable locations
and let the debugger present ``optimized out`` to the developer. Withholding
these potentially stale variable values from the developer diminishes the
amount of available debug information, but increases the reliability of the
remaining information.

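As a minimal sketch of this situation, consider the hypothetical fragment
below, again in the abbreviated style used in this section. The function
``@qux`` and the value ``%prod`` are invented for illustration, and ``!1`` and
``!2`` stand for metadata nodes that are not shown. It assumes ``%prod`` was
deleted as dead code and that the pass could not re-express it in terms of
values that remain.

.. code-block:: llvm

  define i32 @qux(i32 %x, i32 %y) {
  entry:
    call void @llvm.dbg.value(metadata i32 %x, metadata !1, metadata !2)
    ; Deleted as dead code:  %prod = mul i32 %x, %y
    ; The dbg.value that referred to %prod now carries undef, so that the
    ; earlier location (%x) does not appear to remain current.
    call void @llvm.dbg.value(metadata i32 undef, metadata !1, metadata !2)
    ret i32 %x
  }

Without the undef dbg.value, a debugger would keep reporting ``%x`` for the
variable up to the return, even though the unoptimized program had already
assigned it the product by that point.
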
To illustrate some potential issues, consider the following example:

.. code-block:: llvm

  define i32 @foo(i32 %bar, i1 %cond) {
  entry:
    call void @llvm.dbg.value(metadata i32 0, metadata !1, metadata !2)
    br i1 %cond, label %truebr, label %falsebr
  truebr:
    %tval = add i32 %bar, 1
    call void @llvm.dbg.value(metadata i32 %tval, metadata !1, metadata !2)
    %g1 = call i32 @gazonk()
    br label %exit
  falsebr:
    %fval = add i32 %bar, 2
    call void @llvm.dbg.value(metadata i32 %fval, metadata !1, metadata !2)
    %g2 = call i32 @gazonk()
    br label %exit
  exit:
    %merge = phi i32 [ %tval, %truebr ], [ %fval, %falsebr ]
    %g = phi i32 [ %g1, %truebr ], [ %g2, %falsebr ]
    call void @llvm.dbg.value(metadata i32 %merge, metadata !1, metadata !2)
    call void @llvm.dbg.value(metadata i32 %g, metadata !3, metadata !2)
    %plusten = add i32 %merge, 10
    %toret = add i32 %plusten, %g
    call void @llvm.dbg.value(metadata i32 %toret, metadata !1, metadata !2)
    ret i32 %toret
  }

This function contains two source-level variables, ``!1`` and ``!3``. The
function could, perhaps, be optimized into the following code:

.. code-block:: llvm

  define i32 @foo(i32 %bar, i1 %cond) {
  entry:
    %g = call i32 @gazonk()
    %addoper = select i1 %cond, i32 11, i32 12
    %plusten = add i32 %bar, %addoper
    %toret = add i32 %plusten, %g
    ret i32 %toret
  }

What ``llvm.dbg.value`` intrinsics should be placed to represent the original
variable locations in this code? Unfortunately, the second, third and fourth
dbg.values for ``!1`` in the source function have had their operands
(``%tval``, ``%fval``, ``%merge``) optimized out. Assuming we cannot recover
them, we might consider this placement of dbg.values:

.. code-block:: llvm

  define i32 @foo(i32 %bar, i1 %cond) {
  entry:
    call void @llvm.dbg.value(metadata i32 0, metadata !1, metadata !2)
    %g = call i32 @gazonk()
    call void @llvm.dbg.value(metadata i32 %g, metadata !3, metadata !2)
    %addoper = select i1 %cond, i32 11, i32 12
    %plusten = add i32 %bar, %addoper
    %toret = add i32 %plusten, %g
    call void @llvm.dbg.value(metadata i32 %toret, metadata !1, metadata !2)
    ret i32 %toret
  }

However, this will cause ``!3`` to have the return value of ``@gazonk()`` at
the same time as ``!1`` has the constant value zero, a pair of assignments
that never occurred in the unoptimized program. To avoid this, we must
terminate the range in which ``!1`` holds the constant value by inserting an
undef dbg.value before the dbg.value for ``!3``:

.. code-block:: llvm

  define i32 @foo(i32 %bar, i1 %cond) {
  entry:
    call void @llvm.dbg.value(metadata i32 0, metadata !1, metadata !2)
    %g = call i32 @gazonk()
    call void @llvm.dbg.value(metadata i32 undef, metadata !1, metadata !2)
    call void @llvm.dbg.value(metadata i32 %g, metadata !3, metadata !2)
    %addoper = select i1 %cond, i32 11, i32 12
    %plusten = add i32 %bar, %addoper
    %toret = add i32 %plusten, %g
    call void @llvm.dbg.value(metadata i32 %toret, metadata !1, metadata !2)
    ret i32 %toret
  }

In general, if any dbg.value has its operand optimized out and cannot be
recovered, then an undef dbg.value is necessary to terminate earlier variable
locations. Additional undef dbg.values may be necessary when the debugger can
observe re-ordering of assignments.

.. _ccxx_frontend:

C/C++ front-end specific debug information
