Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support opaque pointers #102

Closed
langston-barrett opened this issue Sep 27, 2022 · 4 comments · Fixed by #110
Closed

Support opaque pointers #102

langston-barrett opened this issue Sep 27, 2022 · 4 comments · Fixed by #110

Comments

@langston-barrett
Copy link
Collaborator

LLVM is in the process of migrating all of its pointer types from i8*, i32*, etc. to an opaque ptr type, as described here. LLVM 13 takes an important step in that direction, as it is the first LLVM release to include ptr in the LLVM AST. See llvm/llvm-project@2155dc5.

We should support opaque pointers in the LLVM AST and pretty-printer. In particular, we will have to either add a new OpaquePointer constructor that doesn't take a pointee type as an argument, or make PtrTo take a Maybe.

See also: GaloisInc/llvm-pretty-bc-parser#177

@RyanGlScott
Copy link
Collaborator

Hah, I just learned that LLVM also has an opaque structure type that is completely unrelated to opaque pointers (just in case you thought there weren't enough concepts with "opaque" in the name), which llvm-pretty represents as Opaque. I wonder if it would be confusing to have both Opaque and OpaquePointer.

@RyanGlScott
Copy link
Collaborator

To echo similar sentiments as in GaloisInc/llvm-pretty-bc-parser#177 (comment), I've discovered that adding support the opaque ptr type in llvm-pretty itself is actually quite straightforward. The tricky part is making sure that instructions which use opaque pointers behave as expected.

In particular, I've recently discovered that the way that llvm-pretty-bc-parser treats call and callbr instructions is subtly broken, and I only realized this when trying to make them work with opaque pointers. See GaloisInc/llvm-pretty-bc-parser#189 (comment) for the full story, but the tl;dr version is that call is using a pointer type in a place where it should actually be using a function type. We didn't notice this problem when using pointers with explicit pointee types, but once we switch over to opaque pointers, we will no longer be able to "look through" pointer types.

This convention of using a pointer type instead of a function type in call instructions can also be found in llvm-pretty:

https://github.com/elliottt/llvm-pretty/blob/a272e380bbd44d6af5afba63a70772b2dab6a2d0/src/Text/LLVM/PP.hs#L695-L697

I will need to adjust this code as well. I'll open a patch for this.

There may be more instructions that need similar adjustments, but I haven't done a thorough audit of all pointer-aware instructions yet.

RyanGlScott added a commit that referenced this issue Mar 21, 2023
This adds basic support for representing and pretty-printing opaque `ptr`
types, which were introduced in LLVM 13 as part of a longer-term plan to stop
using non-opaque pointer types (e.g., `i8*`). See
https://llvm.org/docs/OpaquePointers.html for further details.

The next step in `llvm-pretty` is to make sure that all of the instructions
that involve pointers in some way (e.g., `call`) have all the information that
they need to work with opaque pointers. I will be addressing this in subsequent
commits.

This is one step towards #102.
RyanGlScott added a commit that referenced this issue Mar 21, 2023
Previously, the code in `ppCallSym` assumed that the function type stored in a
`call` instruction was a pointer type, but this should actually be a raw
function type. See the discussion at
GaloisInc/llvm-pretty-bc-parser#189 (comment)
for more on this.  We now assume the convention of raw function types in `call`
instructions, as well as the related `callbr` and `invoke` instructions.

This addresses one part of #102.
RyanGlScott added a commit that referenced this issue Apr 3, 2023
This adds basic support for representing and pretty-printing opaque `ptr`
types, which were introduced in LLVM 13 as part of a longer-term plan to stop
using non-opaque pointer types (e.g., `i8*`). See
https://llvm.org/docs/OpaquePointers.html for further details.

The next step in `llvm-pretty` is to make sure that all of the instructions
that involve pointers in some way (e.g., `call`) have all the information that
they need to work with opaque pointers. I will be addressing this in subsequent
commits.

This is one step towards #102.
RyanGlScott added a commit that referenced this issue Apr 3, 2023
Previously, the code in `ppCallSym` assumed that the function type stored in a
`call` instruction was a pointer type, but this should actually be a raw
function type. See the discussion at
GaloisInc/llvm-pretty-bc-parser#189 (comment)
for more on this.  We now assume the convention of raw function types in `call`
instructions, as well as the related `callbr` and `invoke` instructions.

This addresses one part of #102.
@RyanGlScott
Copy link
Collaborator

RyanGlScott commented Apr 5, 2023

I did a slightly more thorough audit of Text.LLVM.PP, and I found three opaque pointer "violations" that we will need to address:

@RyanGlScott
Copy link
Collaborator

Another issue that I ran into in GaloisInc/llvm-pretty-bc-parser#177 (comment) was needing to compare types modulo pointer opaqueness. This is because many parts of llvm-pretty and its surrounding ecosystem still use PtrTo (and will likely do so for the foreseeable future), and as a result, there are often cases where you have a type equality between a PtrTo and a PtrOpaque. For this reason, I ended up implementing the following functions to make llvm-pretty-bc-parser work with opaque pointers:

eqTypeModuloOpaquePtrs :: Eq ident => Type' ident -> Type' ident -> Bool
cmpTypeModuloOpaquePtrs :: Ord ident => Type' ident -> Type' ident -> Ordering

There's nothing parser-specific about these functions, however, so I wonder if we should put these in llvm-pretty.


Another challenge is that LLVM itself does not allow mixing opaque and non-opaque pointers in a single file, so if you try to use Text.LLVM.AST to pretty-print such a file, then llvm-as will reject it. I wrote another utility function that checks if a file contains at least one opaque pointer, and if so, replaces all non-opaque pointers with opaque ones to make is llvm-as–compliant:

fixupOpaquePtrs :: forall a. Data a => a -> a

Perhaps this is something that llvm-pretty's pretty-printer should always do. On the other hand, it requires an extra traversal of the entire AST using syb, so this has potential performance implications. Perhaps we should not do this by default, but offer the function to those who want to use it?

RyanGlScott added a commit that referenced this issue Apr 6, 2023
Previously, the code for pretty-printing `callbr` instructins was inspecting a
pointer's pointee type, which won't work for opaque pointers. Thankfully, the
fix is simple: just use the return type that is stored separately in the
`CallBr` data constructor.

See #102.
RyanGlScott added a commit that referenced this issue Apr 6, 2023
With opaque pointers, one cannot tell what the type being loaded is by
inspecting the pointer in the `load` instruction. As a result, we now store the
load type separately in the `Load` instruction so that it can be determined
regardless of whether opaque pointers are used or not.

See #102.
RyanGlScott added a commit that referenced this issue Apr 6, 2023
With opaque pointers, one cannot tell what the basis type for a `getelementptr`
instruction (or constant expression) is by inspecting the parent pointer. As a
result, we now store the basis type separately in `GEP`/`ConstGEP` so that it
can be determined regardless of whether opaque pointers are used or not.

See #102.
RyanGlScott added a commit that referenced this issue Apr 6, 2023
With opaque pointers, one cannot tell what the type being loaded is by
inspecting the pointer in the `load` instruction. As a result, we now store the
load type separately in the `Load` instruction so that it can be determined
regardless of whether opaque pointers are used or not.

See #102.
RyanGlScott added a commit that referenced this issue Apr 6, 2023
With opaque pointers, one cannot tell what the basis type for a `getelementptr`
instruction (or constant expression) is by inspecting the parent pointer. As a
result, we now store the basis type separately in `GEP`/`ConstGEP` so that it
can be determined regardless of whether opaque pointers are used or not.

Because this requires making a backwards-incompatible change to `ConstGEP`, I
took the opportunity to include the parent pointer value as a distinguished
field in `ConstGEP`. The `GEP` data constructor already does this, and it seems
oddly asymmetric to not have `ConstGEP` do the same, especially since LLVM
requires it to be present.

See #102.
RyanGlScott added a commit that referenced this issue Apr 8, 2023
With opaque pointers, one cannot tell what the basis type for a `getelementptr`
instruction (or constant expression) is by inspecting the parent pointer. As a
result, we now store the basis type separately in `GEP`/`ConstGEP` so that it
can be determined regardless of whether opaque pointers are used or not.

This also requires tweaking the types of the `resolveGepFull` and `resolveGep`
functions to make the basis type separate from the parent pointer type.

Because this requires making a backwards-incompatible change to `ConstGEP`, I
took the opportunity to include the parent pointer value as a distinguished
field in `ConstGEP`. The `GEP` data constructor already does this, and it seems
oddly asymmetric to not have `ConstGEP` do the same, especially since LLVM
requires it to be present.

See #102.
RyanGlScott added a commit that referenced this issue Apr 8, 2023
This patch implements `eqTypeModuloOpaquePtrs` and `cmpTypeModuloOpaquePtrs`
functions, which provide coarser notions of type equality and comparison that
do not distinguish between `PtrTo` and `PtrOpaque`, unlike the `Eq` and `Ord`
instances for `Type`. This is useful to support code in which `PtrTo` and
`PtrOpaque` exist in the same file.

See #102.
RyanGlScott added a commit that referenced this issue Apr 8, 2023
…ointers

This is useful for ensuring that pretty-printing an `llvm-pretty` AST produces
something that will be accepted by `llvm-as`, which unlike `llvm-pretty`, does
_not_ support mixing opaque and non-opaque pointers.

See #102.
RyanGlScott added a commit that referenced this issue Apr 21, 2023
Previously, the code for pretty-printing `callbr` instructins was inspecting a
pointer's pointee type, which won't work for opaque pointers. Thankfully, the
fix is simple: just use the return type that is stored separately in the
`CallBr` data constructor.

See #102.
RyanGlScott added a commit that referenced this issue Apr 21, 2023
With opaque pointers, one cannot tell what the type being loaded is by
inspecting the pointer in the `load` instruction. As a result, we now store the
load type separately in the `Load` instruction so that it can be determined
regardless of whether opaque pointers are used or not.

See #102.
RyanGlScott added a commit that referenced this issue Apr 21, 2023
With opaque pointers, one cannot tell what the basis type for a `getelementptr`
instruction (or constant expression) is by inspecting the parent pointer. As a
result, we now store the basis type separately in `GEP`/`ConstGEP` so that it
can be determined regardless of whether opaque pointers are used or not.

This also requires tweaking the types of the `resolveGepFull` and `resolveGep`
functions to make the basis type separate from the parent pointer type.

Because this requires making a backwards-incompatible change to `ConstGEP`, I
took the opportunity to include the parent pointer value as a distinguished
field in `ConstGEP`. The `GEP` data constructor already does this, and it seems
oddly asymmetric to not have `ConstGEP` do the same, especially since LLVM
requires it to be present.

See #102.
RyanGlScott added a commit that referenced this issue Apr 21, 2023
This patch implements `eqTypeModuloOpaquePtrs` and `cmpTypeModuloOpaquePtrs`
functions, which provide coarser notions of type equality and comparison that
do not distinguish between `PtrTo` and `PtrOpaque`, unlike the `Eq` and `Ord`
instances for `Type`. This is useful to support code in which `PtrTo` and
`PtrOpaque` exist in the same file.

See #102.
RyanGlScott added a commit that referenced this issue Apr 21, 2023
…ointers

This is useful for ensuring that pretty-printing an `llvm-pretty` AST produces
something that will be accepted by `llvm-as`, which unlike `llvm-pretty`, does
_not_ support mixing opaque and non-opaque pointers.

See #102.
RyanGlScott added a commit that referenced this issue Apr 24, 2023
This patch implements `eqTypeModuloOpaquePtrs` and `cmpTypeModuloOpaquePtrs`
functions, which provide coarser notions of type equality and comparison that
do not distinguish between `PtrTo` and `PtrOpaque`, unlike the `Eq` and `Ord`
instances for `Type`. This is useful to support code in which `PtrTo` and
`PtrOpaque` exist in the same file.

See #102.
RyanGlScott added a commit that referenced this issue Apr 24, 2023
…ointers

This is useful for ensuring that pretty-printing an `llvm-pretty` AST produces
something that will be accepted by `llvm-as`, which unlike `llvm-pretty`, does
_not_ support mixing opaque and non-opaque pointers.

See #102.
RyanGlScott added a commit that referenced this issue May 30, 2023
Previously, the code for pretty-printing `callbr` instructins was inspecting a
pointer's pointee type, which won't work for opaque pointers. Thankfully, the
fix is simple: just use the return type that is stored separately in the
`CallBr` data constructor.

See #102.
RyanGlScott added a commit that referenced this issue May 30, 2023
With opaque pointers, one cannot tell what the type being loaded is by
inspecting the pointer in the `load` instruction. As a result, we now store the
load type separately in the `Load` instruction so that it can be determined
regardless of whether opaque pointers are used or not.

See #102.
RyanGlScott added a commit that referenced this issue May 30, 2023
With opaque pointers, one cannot tell what the basis type for a `getelementptr`
instruction (or constant expression) is by inspecting the parent pointer. As a
result, we now store the basis type separately in `GEP`/`ConstGEP` so that it
can be determined regardless of whether opaque pointers are used or not.

This also requires tweaking the types of the `resolveGepFull` and `resolveGep`
functions to make the basis type separate from the parent pointer type.

Because this requires making a backwards-incompatible change to `ConstGEP`, I
took the opportunity to include the parent pointer value as a distinguished
field in `ConstGEP`. The `GEP` data constructor already does this, and it seems
oddly asymmetric to not have `ConstGEP` do the same, especially since LLVM
requires it to be present.

See #102.
RyanGlScott added a commit that referenced this issue May 30, 2023
This patch implements `eqTypeModuloOpaquePtrs` and `cmpTypeModuloOpaquePtrs`
functions, which provide coarser notions of type equality and comparison that
do not distinguish between `PtrTo` and `PtrOpaque`, unlike the `Eq` and `Ord`
instances for `Type`. This is useful to support code in which `PtrTo` and
`PtrOpaque` exist in the same file.

See #102.
RyanGlScott added a commit that referenced this issue May 30, 2023
…ointers

This is useful for ensuring that pretty-printing an `llvm-pretty` AST produces
something that will be accepted by `llvm-as`, which unlike `llvm-pretty`, does
_not_ support mixing opaque and non-opaque pointers.

See #102.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants