Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RemoveDIs] C API: Add before-dbg-record versions of IRBuilder position funcs #92417

Merged
merged 10 commits into from
Jun 10, 2024

Conversation

OCHyams
Copy link
Contributor

@OCHyams OCHyams commented May 16, 2024

Add LLVMPositionBuilderBeforeDbgRecords and
LLVMPositionBuilderBeforeInstrAndDbgRecords to llvm/include/llvm-c/Core.h
which behave the same as LLVMPositionBuilder and LVMPositionBuilderBefore
except that the position is set before debug records attached to the target
instruction (the existing functions set the insertion point to after any
attached debug records).

More info on debug records and the migration towards using them can be found
here: https://llvm.org/docs/RemoveDIsDebugInfo.html

The distinction is important in some situations. An important example is when
inserting a phi before another instruction which has debug records attached to
it (these come "before" the instruction). Inserting before the instruction but
after the debug records would result in having debug records before a phi,
which is illegal. That results in an assertion failure:

llvm/lib/IR/Instruction.cpp:166: Assertion '!isa<PHINode>(this) && "Inserting PHI after debug-records!"' failed.

In llvm (C++) we've added bit to instruction iterators that carries around the
extra information. Adding dedicated functions seemed like the least invasive
and least suprising way to update the C API.

Update llvm/tools/llvm-c-test/debuginfo.c to test this functionality.

Update the OCaml bindings, the migration docs and release notes.

@OCHyams OCHyams requested review from jryans and SLTozer May 16, 2024 15:44
@OCHyams OCHyams requested a review from nikic as a code owner May 16, 2024 15:44
@llvmbot
Copy link

llvmbot commented May 16, 2024

@llvm/pr-subscribers-llvm-ir

Author: Orlando Cazalet-Hyams (OCHyams)

Changes

Add a new parameter LLVMBool BeforeDbgRecords to LLVMPositionBuilderBefore
and LLVMPositionBuilder in llvm/include/llvm-c/Core.h. It determines whether
the insertion point is before debug records attached to Instr.

More info on debug records and the migration towards using them can be found
here: https://llvm.org/docs/RemoveDIsDebugInfo.html

The distinction is important in some situations. An important example is when
inserting a phi before another instruction which has debug records attached to
it (these come "before" the instruction). Inserting before the instruction but
after the debug records would result in having debug records before a phi,
which is illegal. That results in an assertion failure:

llvm/lib/IR/Instruction.cpp:166: Assertion '!isa&lt;PHINode&gt;(this) &amp;&amp; "Inserting PHI after debug-records!"' failed.

In llvm (C++) we've added bit to instruction iterators that carries around the
extra information. Adding a bool to two functions seemed like the least
invasive and least suprising way to update the C API.

The new PHI is tested in llvm/tools/llvm-c-test/debuginfo.c.

Update the OCaml bindings, the migration docs and release notes.


Full diff: https://github.com/llvm/llvm-project/pull/92417.diff

12 Files Affected:

  • (modified) llvm/bindings/ocaml/llvm/llvm.ml (+8-8)
  • (modified) llvm/bindings/ocaml/llvm/llvm.mli (+16-9)
  • (modified) llvm/bindings/ocaml/llvm/llvm_ocaml.c (+3-3)
  • (modified) llvm/docs/ReleaseNotes.rst (+5)
  • (modified) llvm/docs/RemoveDIsDebugInfo.md (+10)
  • (modified) llvm/include/llvm-c/Core.h (+9-2)
  • (modified) llvm/lib/IR/Core.cpp (+7-3)
  • (modified) llvm/test/Bindings/OCaml/core.ml (+2-2)
  • (modified) llvm/test/Bindings/OCaml/debuginfo.ml (+1-1)
  • (modified) llvm/test/Bindings/llvm-c/debug_info.ll (+3)
  • (modified) llvm/test/Bindings/llvm-c/debug_info_new_format.ll (+3)
  • (modified) llvm/tools/llvm-c-test/debuginfo.c (+18-1)
diff --git a/llvm/bindings/ocaml/llvm/llvm.ml b/llvm/bindings/ocaml/llvm/llvm.ml
index 003fd750cd9f8..f339ea1bae1a4 100644
--- a/llvm/bindings/ocaml/llvm/llvm.ml
+++ b/llvm/bindings/ocaml/llvm/llvm.ml
@@ -1137,22 +1137,22 @@ external delete_instruction : llvalue -> unit = "llvm_delete_instruction"
 
 (*===-- Instruction builders ----------------------------------------------===*)
 external builder : llcontext -> llbuilder = "llvm_builder"
-external position_builder : (llbasicblock, llvalue) llpos -> llbuilder -> unit
-                          = "llvm_position_builder"
+external position_builder : (llbasicblock, llvalue) llpos -> bool -> llbuilder ->
+                            unit = "llvm_position_builder"
 external insertion_block : llbuilder -> llbasicblock = "llvm_insertion_block"
 external insert_into_builder : llvalue -> string -> llbuilder -> unit
                              = "llvm_insert_into_builder"
 
-let builder_at context ip =
+let builder_at context ip before_dbg_records =
   let b = builder context in
-  position_builder ip b;
+  position_builder ip before_dbg_records b;
   b
 
-let builder_before context i = builder_at context (Before i)
-let builder_at_end context bb = builder_at context (At_end bb)
+let builder_before context i before_dbg_records = builder_at context (Before i) before_dbg_records
+let builder_at_end context bb = builder_at context (At_end bb) false
 
-let position_before i = position_builder (Before i)
-let position_at_end bb = position_builder (At_end bb)
+let position_before i before_dbg_records = position_builder (Before i) before_dbg_records
+let position_at_end bb = position_builder (At_end bb) false
 
 
 (*--... Metadata ...........................................................--*)
diff --git a/llvm/bindings/ocaml/llvm/llvm.mli b/llvm/bindings/ocaml/llvm/llvm.mli
index 93540c619efba..e2999fae656b2 100644
--- a/llvm/bindings/ocaml/llvm/llvm.mli
+++ b/llvm/bindings/ocaml/llvm/llvm.mli
@@ -1867,26 +1867,33 @@ val incoming : llvalue -> (llvalue * llbasicblock) list
     for [llvm::LLVMBuilder]. *)
 val builder : llcontext -> llbuilder
 
-(** [builder_at ip] creates an instruction builder positioned at [ip].
+(** [builder_at ip before_dbg_records] creates an instruction builder
+    positioned at [ip]. [before_dbg_records] determines whether the insertion
+    point is before debug records attached to [ip].
     See the constructor for [llvm::LLVMBuilder]. *)
-val builder_at : llcontext -> (llbasicblock, llvalue) llpos -> llbuilder
+val builder_at : llcontext -> (llbasicblock, llvalue) llpos ->
+                 bool -> llbuilder
 
-(** [builder_before ins] creates an instruction builder positioned before the
-    instruction [isn]. See the constructor for [llvm::LLVMBuilder]. *)
-val builder_before : llcontext -> llvalue -> llbuilder
+(** [builder_before ins before_dbg_records] creates an instruction builder
+    positioned before the instruction [isn]. [before_dbg_records] determines
+    whether the insertion point is before debug records attached to [isn].
+    See the constructor for [llvm::LLVMBuilder]. *)
+val builder_before : llcontext -> llvalue -> bool -> llbuilder
 
 (** [builder_at_end bb] creates an instruction builder positioned at the end of
     the basic block [bb]. See the constructor for [llvm::LLVMBuilder]. *)
 val builder_at_end : llcontext -> llbasicblock -> llbuilder
 
-(** [position_builder ip bb] moves the instruction builder [bb] to the position
-    [ip].
+(** [position_builder ip bb before_dbg_records] moves the instruction builder
+    [bb] to the position [ip]. [before_dbg_records] determines whether the
+    insertion point is before debug records attached to [ip].
     See the constructor for [llvm::LLVMBuilder]. *)
-val position_builder : (llbasicblock, llvalue) llpos -> llbuilder -> unit
+val position_builder : (llbasicblock, llvalue) llpos -> bool -> llbuilder ->
+                        unit
 
 (** [position_before ins b] moves the instruction builder [b] to before the
     instruction [isn]. See the method [llvm::LLVMBuilder::SetInsertPoint]. *)
-val position_before : llvalue -> llbuilder -> unit
+val position_before : llvalue -> bool -> llbuilder -> unit
 
 (** [position_at_end bb b] moves the instruction builder [b] to the end of the
     basic block [bb]. See the method [llvm::LLVMBuilder::SetInsertPoint]. *)
diff --git a/llvm/bindings/ocaml/llvm/llvm_ocaml.c b/llvm/bindings/ocaml/llvm/llvm_ocaml.c
index 6d08d78b84455..cb0f6f8ce71c9 100644
--- a/llvm/bindings/ocaml/llvm/llvm_ocaml.c
+++ b/llvm/bindings/ocaml/llvm/llvm_ocaml.c
@@ -2016,14 +2016,14 @@ value llvm_builder(value C) {
   return alloc_builder(LLVMCreateBuilderInContext(Context_val(C)));
 }
 
-/* (llbasicblock, llvalue) llpos -> llbuilder -> unit */
-value llvm_position_builder(value Pos, value B) {
+/* (llbasicblock, llvalue) llpos -> bool -> llbuilder -> unit */
+value llvm_position_builder(value Pos, value BeforeDbgRecords, value B) {
   if (Tag_val(Pos) == 0) {
     LLVMBasicBlockRef BB = BasicBlock_val(Field(Pos, 0));
     LLVMPositionBuilderAtEnd(Builder_val(B), BB);
   } else {
     LLVMValueRef I = Value_val(Field(Pos, 0));
-    LLVMPositionBuilderBefore(Builder_val(B), I);
+    LLVMPositionBuilderBefore(Builder_val(B), I, Bool_val(BeforeDbgRecords));
   }
   return Val_unit;
 }
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index 84320461fa9e1..37e2827ed76cd 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -180,6 +180,11 @@ Changes to the C API
   * ``LLVMGetCallBrNumIndirectDests``
   * ``LLVMGetCallBrIndirectDest``
 
+* Modified the following functions, adding a new `BeforeDbgRecords` parameter which indicates whether or not the insertion position should also be before the debug records that precede the instruction. See the `debug info migration guide <https://llvm.org/docs/RemoveDIsDebugInfo.html>`_ for more info.
+
+  * ``LLVMPositionBuilder``
+  * ``LLVMPositionBuilderBefore``
+
 Changes to the CodeGen infrastructure
 -------------------------------------
 
diff --git a/llvm/docs/RemoveDIsDebugInfo.md b/llvm/docs/RemoveDIsDebugInfo.md
index 9e50a2a604aa6..7036ead9e092c 100644
--- a/llvm/docs/RemoveDIsDebugInfo.md
+++ b/llvm/docs/RemoveDIsDebugInfo.md
@@ -56,8 +56,18 @@ LLVMDIBuilderInsertDeclareBefore   # Insert a debug record (new debug info forma
 LLVMDIBuilderInsertDeclareAtEnd    # Same as above.
 LLVMDIBuilderInsertDbgValueBefore  # Same as above.
 LLVMDIBuilderInsertDbgValueAtEnd   # Same as above.
+
+Existing functions (new parameter)
+----------------------------------
+LLVMPositionBuilder               # Added BeforeDbgRecords parameter. See info below.
+LLVMPositionBuilderBefore         # Same as above.
 ```
 
+`LLVMPositionBuilder` and `LLVMPositionBuilderBefore` have gained a parameter `BeforeDbgRecords` which indicates whether or not the insertion position should also be before the debug records that precede the instruction. Note that this doesn't mean that debug intrinsics before the chosen instruction are skipped, only debug records (which unlike debug records are not themselves instructions).
+
+If you don't know whether it should be true or false then follow this rule:
+If you are trying to insert at the start of a block, or purposfully skip debug intrinsics to determine the insertion point for any other reason, then set it to true. Otherwise set it to false.
+
 # Anything else?
 
 Not really, but here's an "old vs new" comparison of how to do certain things and quickstart for how this "new" debug info is structured.
diff --git a/llvm/include/llvm-c/Core.h b/llvm/include/llvm-c/Core.h
index 9d09546513f0e..12f9b5e2e3cd2 100644
--- a/llvm/include/llvm-c/Core.h
+++ b/llvm/include/llvm-c/Core.h
@@ -3952,14 +3952,21 @@ const unsigned *LLVMGetIndices(LLVMValueRef Inst);
  * An instruction builder represents a point within a basic block and is
  * the exclusive means of building instructions using the C interface.
  *
+ * Some insertion-position-setting methods have a "BeforeDbgRecords"
+ * parameter. Set to true if the insertion point should be in front of debug
+ * records, for example when inserting a phi. If you are trying to insert at the
+ * start of a block, or purposfully skip debug intrinsics to determine the
+ * insertion point for any other reason, then set it to true. Otherwise set it
+ * to false.
  * @{
  */
 
 LLVMBuilderRef LLVMCreateBuilderInContext(LLVMContextRef C);
 LLVMBuilderRef LLVMCreateBuilder(void);
 void LLVMPositionBuilder(LLVMBuilderRef Builder, LLVMBasicBlockRef Block,
-                         LLVMValueRef Instr);
-void LLVMPositionBuilderBefore(LLVMBuilderRef Builder, LLVMValueRef Instr);
+                         LLVMValueRef Instr, LLVMBool BeforeDbgRecords);
+void LLVMPositionBuilderBefore(LLVMBuilderRef Builder, LLVMValueRef Instr,
+                               LLVMBool BeforeDbgRecords);
 void LLVMPositionBuilderAtEnd(LLVMBuilderRef Builder, LLVMBasicBlockRef Block);
 LLVMBasicBlockRef LLVMGetInsertBlock(LLVMBuilderRef Builder);
 void LLVMClearInsertionPosition(LLVMBuilderRef Builder);
diff --git a/llvm/lib/IR/Core.cpp b/llvm/lib/IR/Core.cpp
index df90b88341123..6e97a4222a7dc 100644
--- a/llvm/lib/IR/Core.cpp
+++ b/llvm/lib/IR/Core.cpp
@@ -3158,15 +3158,19 @@ LLVMBuilderRef LLVMCreateBuilder(void) {
 }
 
 void LLVMPositionBuilder(LLVMBuilderRef Builder, LLVMBasicBlockRef Block,
-                         LLVMValueRef Instr) {
+                         LLVMValueRef Instr, LLVMBool BeforeDbgRecords) {
   BasicBlock *BB = unwrap(Block);
   auto I = Instr ? unwrap<Instruction>(Instr)->getIterator() : BB->end();
+  I.setHeadBit(BeforeDbgRecords);
   unwrap(Builder)->SetInsertPoint(BB, I);
 }
 
-void LLVMPositionBuilderBefore(LLVMBuilderRef Builder, LLVMValueRef Instr) {
+void LLVMPositionBuilderBefore(LLVMBuilderRef Builder, LLVMValueRef Instr,
+                               LLVMBool BeforeDbgRecords) {
   Instruction *I = unwrap<Instruction>(Instr);
-  unwrap(Builder)->SetInsertPoint(I->getParent(), I->getIterator());
+  BasicBlock::iterator It = I->getIterator();
+  It.setHeadBit(BeforeDbgRecords);
+  unwrap(Builder)->SetInsertPoint(I->getParent(), It);
 }
 
 void LLVMPositionBuilderAtEnd(LLVMBuilderRef Builder, LLVMBasicBlockRef Block) {
diff --git a/llvm/test/Bindings/OCaml/core.ml b/llvm/test/Bindings/OCaml/core.ml
index 64bfa8ee412df..cd3e87c00628c 100644
--- a/llvm/test/Bindings/OCaml/core.ml
+++ b/llvm/test/Bindings/OCaml/core.ml
@@ -832,7 +832,7 @@ let test_instructions () =
     let fty = function_type void_type [| i32_type; i32_type |] in
     let f = define_function "f" fty m in
     let bb = entry_block f in
-    let b = builder_at context (At_end bb) in
+    let b = builder_at context (At_end bb) false in
 
     insist (At_end bb = instr_begin bb);
     insist (At_start bb = instr_end bb);
@@ -1152,7 +1152,7 @@ let test_builder () =
     (* CHECK: ret{{.*}}P1
      *)
     let ret = build_ret p1 atentry in
-    position_before ret atentry
+    position_before ret false atentry
   end;
 
   (* see test/Feature/exception.ll *)
diff --git a/llvm/test/Bindings/OCaml/debuginfo.ml b/llvm/test/Bindings/OCaml/debuginfo.ml
index f95800dfcb025..4f779743adaa0 100644
--- a/llvm/test/Bindings/OCaml/debuginfo.ml
+++ b/llvm/test/Bindings/OCaml/debuginfo.ml
@@ -273,7 +273,7 @@ let test_variables f dibuilder file_di fun_di =
   stdout_metadata auto_var;
   (* CHECK: [[LOCAL_VAR_PTR:<0x[0-9a-f]*>]] = !DILocalVariable(name: "my_local", scope: <{{0x[0-9a-f]*}}>, file: <{{0x[0-9a-f]*}}>, line: 10, type: [[INT64TY_PTR]])
   *)
-  let builder = Llvm.builder_before context entry_term in
+  let builder = Llvm.builder_before context entry_term false in
   let all = Llvm.build_alloca (Llvm.i64_type context)  "my_alloca" builder in
   let scope =
     Llvm_debuginfo.dibuild_create_lexical_block dibuilder ~scope:fun_di
diff --git a/llvm/test/Bindings/llvm-c/debug_info.ll b/llvm/test/Bindings/llvm-c/debug_info.ll
index 9358bac59bd21..71986fbb2348f 100644
--- a/llvm/test/Bindings/llvm-c/debug_info.ll
+++ b/llvm/test/Bindings/llvm-c/debug_info.ll
@@ -10,7 +10,10 @@
 ; CHECK-NEXT:   call void @llvm.dbg.declare(metadata i64 0, metadata !40, metadata !DIExpression()), !dbg !43
 ; CHECK-NEXT:   br label %vars
 ; CHECK:      vars:
+; CHECK-NEXT:   %p1 = phi i64 [ 0, %entry ]
+; CHECK-NEXT:   %p2 = phi i64 [ 0, %entry ]
 ; CHECK-NEXT:   call void @llvm.dbg.value(metadata i64 0, metadata !41, metadata !DIExpression(DW_OP_constu, 0, DW_OP_stack_value)), !dbg !44
+; CHECK-NEXT:   %a = add i64 %p1, %p2
 ; CHECK-NEXT:   ret i64 0
 ; CHECK-NEXT: }
 
diff --git a/llvm/test/Bindings/llvm-c/debug_info_new_format.ll b/llvm/test/Bindings/llvm-c/debug_info_new_format.ll
index 05b6ef4de9adb..1b6f2c4ea403a 100644
--- a/llvm/test/Bindings/llvm-c/debug_info_new_format.ll
+++ b/llvm/test/Bindings/llvm-c/debug_info_new_format.ll
@@ -11,7 +11,10 @@
 ; CHECK-NEXT:     #dbg_declare(i64 0, !40, !DIExpression(), !43)
 ; CHECK-NEXT:   br label %vars
 ; CHECK:      vars:
+; CHECK-NEXT:   %p1 = phi i64 [ 0, %entry ]
+; CHECK-NEXT:   %p2 = phi i64 [ 0, %entry ]
 ; CHECK-NEXT:     #dbg_value(i64 0, !41, !DIExpression(DW_OP_constu, 0, DW_OP_stack_value), !44)
+; CHECK-NEXT:   %a = add i64 %p1, %p2
 ; CHECK-NEXT:   ret i64 0
 ; CHECK-NEXT: }
 
diff --git a/llvm/tools/llvm-c-test/debuginfo.c b/llvm/tools/llvm-c-test/debuginfo.c
index 35c65f885af32..ed1cb4e30223a 100644
--- a/llvm/tools/llvm-c-test/debuginfo.c
+++ b/llvm/tools/llvm-c-test/debuginfo.c
@@ -228,7 +228,24 @@ int llvm_test_dibuilder(bool NewDebugInfoFormat) {
   LLVMPositionBuilderAtEnd(Builder, FooVarBlock);
   LLVMTypeRef I64 = LLVMInt64TypeInContext(Ctx);
   LLVMValueRef Zero = LLVMConstInt(I64, 0, false);
-  LLVMBuildRet(Builder, Zero);
+  LLVMValueRef Ret = LLVMBuildRet(Builder, Zero);
+
+  // Insert a `phi` before the `ret`. In the new debug info mode we need to
+  // be careful to insert before debug records too, else the debug records
+  // will come before the `phi` (and be absorbed onto it) which is an invalid
+  // state.
+  LLVMValueRef InsertPos = LLVMGetFirstInstruction(FooVarBlock);
+  LLVMPositionBuilderBefore(Builder, InsertPos, true);
+  LLVMValueRef Phi1 = LLVMBuildPhi(Builder, I64, "p1");
+  LLVMAddIncoming(Phi1, &Zero, &FooEntryBlock, 1);
+  // Do the same again using the other position-setting function.
+  LLVMPositionBuilder(Builder, FooVarBlock, InsertPos, true);
+  LLVMValueRef Phi2 = LLVMBuildPhi(Builder, I64, "p2");
+  LLVMAddIncoming(Phi2, &Zero, &FooEntryBlock, 1);
+  // Insert a non-phi before the `ret` but not before the debug records to
+  // test that works as expected.
+  LLVMPositionBuilder(Builder, FooVarBlock, Ret, false);
+  LLVMBuildAdd(Builder, Phi1, Phi2, "a");
 
   char *MStr = LLVMPrintModuleToString(M);
   puts(MStr);

@alan-j-hu
Copy link
Contributor

Just ran the OCaml binding unit tests, they all pass.

@alan-j-hu
Copy link
Contributor

I just want to fix some lines that go past 80 columns in llvm.ml. Although the file already contains a few lines that do go past 80 columns, most of the file is formatted with newlines inserted to avoid this.

Copy link
Contributor

@alan-j-hu alan-j-hu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add newline to limit to 80 columns

llvm/bindings/ocaml/llvm/llvm.ml Outdated Show resolved Hide resolved
llvm/bindings/ocaml/llvm/llvm.ml Outdated Show resolved Hide resolved
llvm/bindings/ocaml/llvm/llvm.ml Outdated Show resolved Hide resolved
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not possible to change the signature of a function in the C API. You can only add a new function with the new signature.

The usual pattern would be to add LLVMPositionBuilder2 (in this case an alternative would be to add a new function with the parameter set to the non-default value).

@OCHyams
Copy link
Contributor Author

OCHyams commented May 20, 2024

Thanks for taking a look @nikic

It is not possible to change the signature of a function in the C API. You can only add a new function with the new signature.

Oh, I can't see this rule in the developer policy. But that would make the changes more stable.

The usual pattern would be to add LLVMPositionBuilder2 (in this case an alternative would be to add a new function with the parameter set to the non-default value).

The reason we need to modify the API is that at the moment it's possible to insert PHIs after debug records which is illegal and leads to an assertion (edit: more specifically, there's no way to avoid that behaviour using the C API with the new debug info format - inserting a phi before an instruction with debug records attached will always cause the assertion without this patch). I chose to modify existing functions because I figured would help reduce nasty surprises, but if stability is preferable this isn't the right tactic.

We could add new functions with the extra parameter, keep the existing functions untouched and forward args and a "default" value for the new parameter to the new functions (possibly this is what you're suggesting already)? I guess this would be most "stable" (and I think then we can omit most of the OCaml changes then).

I wonder if some of the other changes we made need re-reviewing in this case:
#84915 - adds some new functions (I imagine this patch is fine)
#85657 - changes return type of existing functions (is this one ok?)

@nikic
Copy link
Contributor

nikic commented May 20, 2024

Thanks for taking a look @nikic

It is not possible to change the signature of a function in the C API. You can only add a new function with the new signature.

Oh, I can't see this rule in the developer policy. But that would make the changes more stable.

The phrasing here as a bit unclear as to what "best effort stability" means, but the rules are basically:

  • You are allowed to add new functions.
  • You shouldn't remove functions, but are allowed to do so if there's a strong reason (e.g. the underlying feature has been removed from LLVM).
  • You are never allowed to make an ABI-breaking change to an existing function.

Changes to the C API should, at worst, produce a linker error, but never introduce an FFI mismatch.

The usual pattern would be to add LLVMPositionBuilder2 (in this case an alternative would be to add a new function with the parameter set to the non-default value).

The reason we need to modify the API is that at the moment it's possible to insert PHIs after debug records which is illegal and leads to an assertion (edit: more specifically, there's no way to avoid that behaviour using the C API with the new debug info format - inserting a phi before an instruction with debug records attached will always cause the assertion without this patch). I chose to modify existing functions because I figured would help reduce nasty surprises, but if stability is preferable this isn't the right tactic.

I think this does not affect typical use of the C API (where a frontend emits instructions in order -- it will not go back to a previous position), so I wouldn't be overly concerned about accidental misuse.

We could add new functions with the extra parameter, keep the existing functions untouched and forward args and a "default" value for the new parameter to the new functions (possibly this is what you're suggesting already)? I guess this would be most "stable" (and I think then we can omit most of the OCaml changes then).

Yes, that's what I would suggest.

I wonder if some of the other changes we made need re-reviewing in this case: #84915 - adds some new functions (I imagine this patch is fine)

This is fine.

#85657 - changes return type of existing functions (is this one ok?)

Yeah, this is not allowed :(

I think what I'd do for the C API is to keep just the new functions you added for dbg records and remove the old functions for intrinsics. Keeping the old name around and making it create intrinsics would be too much of a footgun. And I don't think that we actually need support for the old format in the C API.

@OCHyams
Copy link
Contributor Author

OCHyams commented May 20, 2024

Ok thanks for the clarifications and explanations. I'll make those changes to this patch and then follow up with a patch to fix my previous changes too.

Yeah, this is not allowed :(
I think what I'd do for the C API is to keep just the new functions you added for dbg records and remove the old functions for intrinsics. Keeping the old name around and making it create intrinsics would be too much of a footgun. And I don't think that we actually need support for the old format in the C API.

For the intrinsic-inserting functions, do you think we should:
a) mark as deprecated but otherwise leave them alone for now
b) mark as deprecated and change their behaviour so that they do nothing and return nullptr
c) outright delete them (it sounds like you're voting this one, but just wanted to double check)

@OCHyams OCHyams force-pushed the c-api-iterator-debug-bit-upstream branch from bc9d4af to 568e5b6 Compare May 20, 2024 14:37
@OCHyams
Copy link
Contributor Author

OCHyams commented May 20, 2024

We could add new functions with the extra parameter, keep the existing functions untouched and forward args and a "default" value for the new parameter to the new functions (possibly this is what you're suggesting already)? I guess this would be most "stable" (and I think then we can omit most of the OCaml changes then).

Yes, that's what I would suggest.

Done, how does this look now?

@alan-j-hu sorry for the turbulence but could you re-review the OCaml changes? They're much smaller this time.

Copy link
Contributor

@alan-j-hu alan-j-hu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, you wrote a second declaration for position_before in llvm.mli where I assume you meant to declare position_before_dbg_records. Consequently, the tests do not pass.

Are LLVMPositionBuilder and LLVMPositionBuilder2 intended to co-exist forever, or will the latter eventually deprecate the former?

If LLVMPositionBuilder will be deprecated, I would rather change position_builder right now to take the bool, and call the underlying C function LLVMPositionBuilder2 than do the song and dance with the 2. As far as I know, OCaml libraries don't have these kinds of ABI compatibility concerns, it's much better to rip off the band-aid once, between two major releases, than go through multiple steps of deprecation across multiple major releases.

Are position_before and position_before_dbg_records intended to co-exist, or will the latter eventually deprecate the former? By inserting after debug records, does position_before have "surprising" behavior that may cause errors?

If position_before has "surprising" behavior, or behavior that shouldn't be the "default," I would rather have it behave as position_before_dbg_records, i.e. inserting before debug records be the default action unless the programmer says otherwise.

I'm not an expert on the debug record migration, so please let me know if:

  • If I didn't explain any of my suggestions clearly
  • Any of my suggestions don't make sense
  • Any of my suggestions may cause issues for users

llvm/bindings/ocaml/llvm/llvm.mli Outdated Show resolved Hide resolved
Copy link
Contributor

@alan-j-hu alan-j-hu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attaching comments that I forgot to explicitly add to the code review.

llvm/bindings/ocaml/llvm/llvm_ocaml.c Outdated Show resolved Hide resolved
llvm/bindings/ocaml/llvm/llvm.ml Outdated Show resolved Hide resolved
@OCHyams
Copy link
Contributor Author

OCHyams commented May 21, 2024

First of all, you wrote a second declaration for position_before in llvm.mli where I assume you meant to declare position_before_dbg_records. Consequently, the tests do not pass.

Oops, I'll fix that.

Are LLVMPositionBuilder and LLVMPositionBuilder2 intended to co-exist forever, or will the latter eventually deprecate the former?

With the latest changes - intended to co-exist forever.

If LLVMPositionBuilder will be deprecated, I would rather change position_builder right now to take the bool, and call the underlying C function LLVMPositionBuilder2 than do the song and dance with the 2. As far as I know, OCaml libraries don't have these kinds of ABI compatibility concerns, it's much better to rip off the band-aid once, between two major releases, than go through multiple steps of deprecation across multiple major releases.

Are position_before and position_before_dbg_records intended to co-exist, or will the latter eventually deprecate the former? By inserting after debug records, does position_before have "surprising" behavior that may cause errors?

There is some room for surprising behaviour imo which is why I took my initial approach, however, see @nikic's comment - typical api usage probably won't run into it.

Assuming a front end emits instructions in order, most of the time it'll just be inserting at the end of a block or before a terminator. In both these scenarios there's no surprising behaviour and position_before does the right thing. It would actually be a mistake to use position_before_dbg_records in the case of inserting in order before a terminator, because then all the debug records would end up collecting above the terminator rather than in order of insertion.

If a tool using this api inserts instructions in reverse order by repeatedly setting the "insert before" position then position_before would be incorrect - all the debug records would get pushed up to the top of the block. There are other situations where position_before is wrong too, which I've described in other comments (e.g. inserting phis).

If position_before has "surprising" behavior, or behavior that shouldn't be the "default," I would rather have it behave as position_before_dbg_records, i.e. inserting before debug records be the default action unless the programmer says otherwise.

The position_before behaviour is closer to the default in LLVM (LLVM uses a different mechanism to determine "before dbg records", a flag in instruction iterators, and most of the time that is false).


I'll wait for @nikic to comment before making further changes - or is it worth opening a discourse discussion at this point?

@alan-j-hu
Copy link
Contributor

  • If LLVMPositionBuilder and LLVMPositionBuilder2 are intended to co-exist - i.e., the latter is not intended to replace the former in any future release, I'm wondering if LLVMPositionBuilder2 should have a more descriptive name. (The main case where I've seen the usage of 2 is the migration from typed to opaque pointers, where the old functions were eventually removed. Have there been any instances where a 2 function was added that didn't deprecate the older function?) For both the C and OCaml APIs, just adding a 2 at the end of the name doesn't indicate to the user what this function does differently from the other function.

  • I would rather see position_builder implemented in OCaml code, not C FFI code. i.e.

    let position_builder ip = position_builder2 ip false
  • Don't forget to replace the errorneous second declaration of position_before with the intended declaration of position_before_dbg_records in llvm.mli.

These are my only comments now; you have justified your design choices about position_before and position_before_dbg_records.

@OCHyams
Copy link
Contributor Author

OCHyams commented May 28, 2024

Ping @nikic wdyt of the discussion above?

@alan-j-hu

If LLVMPositionBuilder and LLVMPositionBuilder2 are intended to co-exist - i.e., the latter is not intended to replace the former in any future release, I'm wondering if LLVMPositionBuilder2 should have a more descriptive name. [...]

I think that makes sense but I'm struggling to think of a suitable name (any ideas?). An alternative could be to drop the bool argument from the new functions and add a "BeforeDbgRecords" suffix to the names (though this leaves with with PositionBuilderBeforeDbgRecords and PositionBuilderBeforeBeforeDbgRecords, the latter of which is ugly and looks like a typo). @nikic do you have an opinion on using 2 in the name for functions that are not getting deprecated?

@nikic
Copy link
Contributor

nikic commented Jun 3, 2024

For the intrinsic-inserting functions, do you think we should: a) mark as deprecated but otherwise leave them alone for now b) mark as deprecated and change their behaviour so that they do nothing and return nullptr c) outright delete them (it sounds like you're voting this one, but just wanted to double check)

I'd go for c), yes.

  • If LLVMPositionBuilder and LLVMPositionBuilder2 are intended to co-exist - i.e., the latter is not intended to replace the former in any future release, I'm wondering if LLVMPositionBuilder2 should have a more descriptive name. (The main case where I've seen the usage of 2 is the migration from typed to opaque pointers, where the old functions were eventually removed. Have there been any instances where a 2 function was added that didn't deprecate the older function?) For both the C and OCaml APIs, just adding a 2 at the end of the name doesn't indicate to the user what this function does differently from the other function.

You are right that 2 usually indicates that the old function is deprecated -- though in some cases it is soft-deprecated without intent to remove (e.g. when we're just changing parameter types from unsigned to uint64_t or similar).

The naming pattern that often gets used for cases like this is "With", so we have both LLVMBuildCall2 and LLVMBuildCallWithOperandBundles, where the latter just provides additional functionality, but most use-cases can still use the former just fine. This is similar to the situation here.

So I guess you could call it LLVMPositionBuilderWithDbgRecordControl, which is, uh, quite the mouthful :) I don't really have strong opinions on the precise naming here.

@OCHyams
Copy link
Contributor Author

OCHyams commented Jun 6, 2024

Would either of you be opposed to the new functions unconditionally inserting before debug records instead of providing the option through a bool parameter (given you can choose the "insert after debug records" behaviour by calling the existing functions which we don't plan to deprecate)?

@nikic
Copy link
Contributor

nikic commented Jun 6, 2024

@OCHyams That's fine by me.

@OCHyams
Copy link
Contributor Author

OCHyams commented Jun 6, 2024

Ok I've reworked the patch:

  • C: Added LLVMPositionBuilderBeforeDbgRecords and LLVMPositionBuilderBeforeInstrAndDbgRecords. Same as LLVMPositionBuilder and LLVMPositionBuilderBefore except the insertion position is set to before the debug records that precede the target instruction (the bool parameter has been removed from the new functions).
  • OCaml: Added position_builder_before_dbg_records which does the same thing as LLVMPositionBuilderBeforeDbgRecords and position_before_dbg_records which does the same thing as LLVMPositionBuilderBeforeInstrAndDbgRecords. The OCaml names are different to and more sensible than the C ones because unlike the C ones the existing OCaml function names lend themselves to simply appending "before_dbg_records".

@OCHyams OCHyams changed the title [RemoveDIs] Add a 'BeforeDbgRecords' parameter to C API isnt insertion functions [RemoveDIs] C API: Add before-dbg-record versions of IRBuilder position funcs Jun 6, 2024
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C API changes look fine to me. Would be good if @alan-j-hu takes a look at the new OCaml changes.

llvm/include/llvm-c/Core.h Show resolved Hide resolved
Inserting a PHI before an instruction with debug records attached results in
the following assert because the PHI needs to be inserted before its debug
records too.

llvm/lib/IR/Instruction.cpp:166: void llvm::Instruction::insertBefore(llvm::BasicBlock&, llvm::iplist_impl<llvm::simple_ilist<llvm::Instruction, llvm::ilist_iterator_bits<true> >, llvm::SymbolTableListTraits<llvm::Instruction, llvm::ilist_iterator_bits<true> > >::iterator): Assertion `!isa<PHINode>(this) && "Inserting PHI after debug-records!"' failed.
* C: add new versions without bool param
* ocaml: position_before_dbg_records stays the same (equ. to LLVMPositionBuilderBeforeInstrAndDbgRecords)
* ocaml: position_builder2 -> position_builder_before_dbg_records (equ. to LLVMPositionBuilderBeforeDbgRecords)
@OCHyams OCHyams force-pushed the c-api-iterator-debug-bit-upstream branch from d16ab0f to 902f282 Compare June 10, 2024 08:04
@OCHyams
Copy link
Contributor Author

OCHyams commented Jun 10, 2024

Rebase + fix typo. I'll land this shortly.

Thanks both for your patience and reviews.

@OCHyams OCHyams merged commit d732a32 into llvm:main Jun 10, 2024
5 of 7 checks passed
OCHyams added a commit that referenced this pull request Jun 10, 2024
@HerrCai0907 HerrCai0907 mentioned this pull request Jun 13, 2024
CarlosAlbertoEnciso added a commit to CarlosAlbertoEnciso/llvm-project that referenced this pull request Jun 14, 2024
llvm#84915
llvm#85657
llvm#92417 (comment)

As discussed by @nikic and @OCHyams:

llvm#92417 (comment)

> For the intrinsic-inserting functions, do you think we should:
>
> a) mark as deprecated but otherwise leave them alone for now.
> b) mark as deprecated and change their behaviour so that they
>    do nothing and return nullptr.
> c) outright delete them (it sounds like you're voting this one,
>    but just wanted to double check).

This patch implements option (c).
CarlosAlbertoEnciso added a commit that referenced this pull request Jun 18, 2024
The DIBuilder C API was changed to deal with DbgRecord functions:

#84915
#85657
#92417 (comment)

As discussed by @nikic and @OCHyams:

#92417 (comment)

> For the intrinsic-inserting functions, do you think we should:
>
> a) mark as deprecated but otherwise leave them alone for now.
> b) mark as deprecated and change their behaviour so that they
>    do nothing and return nullptr.
> c) outright delete them (it sounds like you're voting this one,
>    but just wanted to double check).

This patch implements option (c).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants