Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang] Add design document for debug info generation. #86939

Merged
merged 6 commits into from
Apr 11, 2024

Conversation

abidh
Copy link
Contributor

@abidh abidh commented Mar 28, 2024

This document discusses some options where the debug metadata can be generated.
It also goes through various language constructs and explains how the debug metadata
will look like for that construct and how we can extract that information.

The real point of discussion is how and where to extract the information about various
language features to generate the debug metadata. The structure of the metadata itself
is mostly settled as that is dictated by the DWARF and structure of LLVM IR metadata.
The classic flang and gfortran generate quite similar DWARF for the various language constructs.

This document is based on what Kiran posted in https://reviews.llvm.org/D138534.

This document discusses some options where the debug metadata can be generated.
 It also goes through various language constructs and explains how the debug metadata
 will look like for that construct and how we can extract that information.
@llvmbot llvmbot added the flang Flang issues not falling into any other category label Mar 28, 2024
Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for writing this comprehensive design document.

I have a few questions but on the whole I like the approach here.

flang/docs/DebugGeneration.md Outdated Show resolved Hide resolved
flang/docs/DebugGeneration.md Show resolved Hide resolved
flang/docs/DebugGeneration.md Outdated Show resolved Hide resolved
flang/docs/DebugGeneration.md Show resolved Hide resolved
flang/docs/DebugGeneration.md Outdated Show resolved Hide resolved
flang/docs/DebugGeneration.md Outdated Show resolved Hide resolved
flang/docs/DebugGeneration.md Outdated Show resolved Hide resolved
flang/docs/DebugGeneration.md Outdated Show resolved Hide resolved
flang/docs/DebugGeneration.md Outdated Show resolved Hide resolved
Co-authored-by: Tom Eccles <t@freedommail.info>
Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing this down, this very useful!
I am in line with the approach. Few questions/notes below.

members of the derived type.

2. Use of derived type causes the generation of many other global variables.
Do they need to be retained in debug info?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They would be meaningless to the programmer, I do not think so.

still some open questions about derived types.

1. What is the correct way to get the offset and alignment information for the
members of the derived type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can generate that info from the FIR RecordType type directly + mlir datalayout (there is currently no utility generating this info since this is left-up to LLVM, but this is doable (similar logic as what happens in fir::getTypeSizeAndAlignment.


3. The location where a component of the derived type is declared is not
available in RecordType or TypeInfoOp. This probably will need to be
added.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it could be added as FusedLoc on the TypeInfoOp?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can work. I looped over all components in createTypeInfoOp and added the locations as FusedLoc. Later in the AddDebugInfo pass, I can see the following.

fir.type_info @_QMhelperTt_pair noinit nodestroy nofinal : !fir.type<_QMhelperTt_pair{i:i32,i1:i32,x:f32}> loc(fused<loc("derived.f90":6:13)>[fused<loc("derived.f90":5:16)>[fused<loc("derived.f90":4:16)>["derived.f90":3:11]]])

In FIR, the `DeclareOp` points to a `ShapeOp` and we can walk the chain
to get the value that represents the array bound in any dimension. We will
create a `DILocalVariableAttr` that will point to that location. This
variable will be used in the `DISubrangeAttr`. Note that this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be a variable as in some memory location, or can a DILocalVariableAttr point to any operation producing an SSA value (like an arith.addi)?

Lowering does not put n+1 in memory when evaluating specification expressions of x(n+1).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first argument to DebugDeclareOp needs to be an llvm pointer type so an SSA value may not work. I looked at the IR from the classic flang and it does seem to put such expressions to memory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could llvm.dbg.value be used instead of llvm.dbg.declare here?
If not, I guess the debug info pass will have to insert allocas for bounds that are non constant SSA values (even if lowering did lowered them in memory, we could not guarantee they would not be promoted to values by some MLIR optimization).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The llvm.dbg.value could work. As you mentioned, there is option of adding allocas during debug info generation. The issue may be to keep the value alive throughout the function.

#### Assumed Size

This is treated as raw array. Debug information will not provide any bounds
information.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will not provide any bounds information.

Is this true for any bounds or only the upper bound in the last dimension?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upper bound in last dimension.


### Namelists

FIR does not seem to have a way to extract information about namelists.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Namelists are handled by creating runtime data structures that describes them. These data structures are globals when possible, and locals otherwise (without fir.declare currently). They can be identified via their mangling using InternalNames.h.

It may be non trivial to generate the debug info from the global/local. Maybe we could abstract that more in FIR/Lowering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the information. Although when I tried a simple namelist example, I did not see a local or global with this name mangling scheme in fir. Ideally we will need something that represent namelist and then refer to DeclareOp of all the variables that are part of the given namelist.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These data structures are created lazily when encountering the namelist in an IO statement, so if your test had no IO statements using it, you would indeed not have seen it. I understand this is not convenient from a debug info point of view, again more abstraction might be needed in lowering here to generate the debug info.


They will be treated like normal variables. Although we may require to handle
the case where the `DeclareOp` of one variable points to the `DeclareOp` of
another variable (e.g. a => b).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently inlining is not done in FIR, but that would also create DeclareOp chains.

available in RecordType or TypeInfoOp. This probably will need to be
added.

### CommonBlocks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if something special is needed to deal with equivalence outside of Common Blocks, or can the output of the fir.coordinate_of be used for the DILocalVariableAttr?

e.g:

subroutine test_locals()
  real :: x(100), y
  equivalence (x(10), y)
end subroutine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out this case. The DbgDeclareOp takes a 3rd expression argument which we can use to handle cases like this by using something like DIExpression(DW_OP_plus_uconst, 36) for the example case.


### Strings

Fixed sized strings will be treated like fixed sizes arrays and
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Starting from LLVM 12, the Fortran string types (fixed length, assumed length, deferred length) LLVM debug metadata are represented by DIStringType (with name, size, stringLength, stringLengthExpression fields).

Copy link
Contributor Author

@abidh abidh Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out. The DIStringType also has stringLocationExpression field which will result in very clean implementation for strings. I will update the document and add the example IR. We will also need to add DIStringTypeAttr in the MLIR.

1. Mention the upper bound of the last dimension in
   the 'Assumed Size' section.
2. Redo 'Strings' section using 'DIStringType'.
3. Mention 'DIStringType' in 'Missing metadata in MLIR'
Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the document and the answers to the comments. Looks great!

Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for all of your work on this!

Copy link
Contributor

@kiranchandramohan kiranchandramohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. Thanks for the writing this down and the initial work done for this. I have a few questions. But no need to wait for further replies from my side.

flang/docs/DebugGeneration.md Outdated Show resolved Hide resolved

The AddDebugFoundationPass will be renamed to AddDebugInfo Pass. The
information mentioned in the line info section above will be passed to it from
the driver. This pass will run quite late in the pipeline but before
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does FIRDeclare have all the information? HLFIRDeclare probably has more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you kindly give an example of thing which will be present in HLFIRDeclare and not in FIRDeclare. I don't mind using it but then we will have to run the AddDebugInfo pass early when hlfir is still there.

- Break at functions.
- print type (ptype) of function names.
- print values and types (ptype) of various type of variables
- Manually run `GDB`'s gdb.fortran testsuite with llvm-flang.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add the gdb.fortran tests to the llvm-testsuite just like the gfortran tests are added?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GDB tests require its tcl based framework to issue commands and look for results. It may not be feasible to add them to llvm-testsuite. Adding a bot that builds flang and then run gdb.fortran tests with it may be a possibility.

Comment on lines +40 to +41
`DISubroutineTypeAttr` currently has a fixed type. This will be changed to
match the signature of the actual function/subroutine.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you plan to get the actual signature since this is going to be different from what is there in FIR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is meant by actual signature and how this will be different to the information that is in the FIR. My plan was to create the DISubroutineTypeAttr based on the information about arguments and results in FuncOp. A few simple cases that I tried, the results of the ptype command of the GDB were similar for executable generated with flang-new and gfortran.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in https://reviews.llvm.org/D138534, for the following program, what would be the ptype for the function foo?

 program mn
  integer :: arr(10), cnt
  call foo(arr, cnt)
contains
  subroutine foo( a, n)
    integer  :: a(:), n
  end subroutine foo
end program

gfortran (Actual signature)

ptype foo
type = void (integer(kind=4) (:), integer(kind=4))

Classic Flang (LLVM IR signature)

(gdb) ptype foo
type = void (integer*8, integer, integer (187651372366496), uinteger*8)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect it to be similar to gfortran.

Comment on lines +307 to +311
1. There will be a `datalocation` field which will be an expression. This will
enable debugger to get the data pointer from array descriptor.

2. The field in `DISubrangeAttr` for array bounds will be expression which will
allow the debugger to get the bounds from descriptor.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will all these be generated during FIRToLLVM conversion? How will you get the right offsets from the descriptor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These will be generated during normal debug attribute generation in AddDebugInfoPass. I am not clear about the 2nd part of the question. If you mean how we will get the offsets of this information in descriptor, I think we know the layout of the descriptor in the memory and can tell where various things are located. If you mean how debugger will get this information, we will encode them in DWARF's expressions which can be decoded by the debuggers.

Please note that most of the examples that I included in the document are known to work. Here are the steps I followed to verify the example debug metadata.

  1. Use flang-new to create LLVM IR file from the sample fortran file.
  2. Manually edit the .ll file to add the example debug metadata
  3. Build the edited file into an executable
  4. Run the executable under GDB and check that debugger is able to extract the right information.

Copy link
Contributor

@kiranchandramohan kiranchandramohan Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean how we will get the offsets of this information in descriptor

It is this part. The offsets are only available when the descriptor is converted to a structure in the FIRToLLVM conversion pass. So how can you get this info in the AddDebugInfo pass?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the layout was known. For example, the FIRToLLVM seems to rely on the hardcoded values like kDimsPosInBox. We could have similar thing in AddDebugInfo pass but with offsets.

I also inspected the layout of the descriptor at runtime in the debugger and then looked at the CFI_cdesc_t and it matched.

Comment on lines +326 to +327
In assumed shape case, the rank can be determined from the FIR's `SequenceType`.
This allows us to generate a `DISubrangeAttr` in each dimension.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this also available from the descriptor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean to say that DISubrangeAttr can encode where the information about the array bounds is located in the descriptor for each dimension.

abidh and others added 2 commits April 11, 2024 14:29
Review Suggestion.

Co-authored-by: Kiran Chandramohan <kiranchandramohan@gmail.com>
@abidh abidh merged commit 357f6c7 into llvm:main Apr 11, 2024
4 of 5 checks passed
@abidh abidh deleted the debug_design branch April 11, 2024 16:54
abidh added a commit to abidh/llvm-project that referenced this pull request May 2, 2024
Currently, cg-rewrite removes the DeclareOp. As AddDebugInfo runs after
that, it cannot process the DeclareOp. My initial plan was to make the
AddDebugInfo pass run before the cg-rewrite but that has few issues.

1. Initially I was thinking to use the memref op to carry the variable
attr. But as @tblah suggested in the llvm#86939, it makes more sense to
carry that information on DeclareOp. It also makes it easy to handle it
in codegen and there is no special handling needed for arguments. For
that, we need to preserve the DeclareOp till the codegen.

2. Running earlier, we will miss the changes in passes that run between
cg-rewrite and codegen.

But not removing the DeclareOp in cg-rewrite has the issue that ShapeOp
remains and it causes errors during codegen. To solve this problem, I
convert DeclareOp to XDeclareOp in cg-rewrite instead of removing
it. This was mentioned as possible solution by @jeanPerier in
https://reviews.llvm.org/D136254

The conversion follows similar logic as used for other operators in that
file. The FortranAttr and CudaAttr are currently not converted but left
as TODO when the need arise. A later commit will use the XDeclareOp to
extract the variable information.
abidh added a commit that referenced this pull request May 15, 2024
We need the information in the `DeclareOp` to generate debug information
for variables.  Currently, cg-rewrite removes the `DeclareOp`. As
`AddDebugInfo` runs after that, it cannot process the `DeclareOp`. My
initial plan was to make the `AddDebugInfo` pass run before the cg-rewrite
but that has few issues.
    
1. Initially I was thinking to use the memref op to carry the variable
attr. But as @tblah suggested in the #86939, it makes more sense to
carry that information on `DeclareOp`. It also makes it easy to handle
it in codegen and there is no special handling needed for arguments. For
this reason, we need to preserve the `DeclareOp` till the codegen.
    
2. Running earlier, we will miss the changes in passes that run between
cg-rewrite and codegen.
    
But not removing the DeclareOp in cg-rewrite has the issue that ShapeOp
remains and it causes errors during codegen. To solve this problem, I
convert DeclareOp to XDeclareOp in cg-rewrite instead of removing
it. This was mentioned as possible solution by @jeanPerier in
https://reviews.llvm.org/D136254
    
The conversion follows similar logic as used for other operators in that
file. The FortranAttr and CudaAttr are currently not converted but left
as TODO when the need arise.

Now `AddDebugInfo` pass can extracts information about local variables
from `XDeclareOp` and creates `DILocalVariableAttr`. These are attached
to `XDeclareOp` using `FusedLoc` approach. Codegen can use them to
create `DbgDeclareOp`.  I have added tests that checks the debug
information in mlir from and also in llvm ir.

Currently we only handle very limited types. Rest are given a place
holder type. The previous placeholder type was basic type with
`DW_ATE_address` encoding. When variables are added, it started
causing assertions in the llvm debug info generation logic for some
types. It has been changed to an interger type to prevent these issues
until we handle those types properly.
abidh added a commit to abidh/llvm-project that referenced this pull request May 15, 2024
Currently, cg-rewrite removes the DeclareOp. As AddDebugInfo runs after
that, it cannot process the DeclareOp. My initial plan was to make the
AddDebugInfo pass run before the cg-rewrite but that has few issues.

1. Initially I was thinking to use the memref op to carry the variable
attr. But as @tblah suggested in the llvm#86939, it makes more sense to
carry that information on DeclareOp. It also makes it easy to handle it
in codegen and there is no special handling needed for arguments. For
that, we need to preserve the DeclareOp till the codegen.

2. Running earlier, we will miss the changes in passes that run between
cg-rewrite and codegen.

But not removing the DeclareOp in cg-rewrite has the issue that ShapeOp
remains and it causes errors during codegen. To solve this problem, I
convert DeclareOp to XDeclareOp in cg-rewrite instead of removing
it. This was mentioned as possible solution by @jeanPerier in
https://reviews.llvm.org/D136254

The conversion follows similar logic as used for other operators in that
file. The FortranAttr and CudaAttr are currently not converted but left
as TODO when the need arise. A later commit will use the XDeclareOp to
extract the variable information.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants