[LangRef] Always allow getelementptr inbounds with zero offset
Currently, our GEP specification has a special case that makes
gep inbounds (null, 0) legal. This patch proposes to expand this
special case to all gep inbounds (ptr, 0), where ptr is no longer
required to point to an allocated object.

This was previously discussed in some detail at
https://discourse.llvm.org/t/question-about-getelementptr-inbounds-with-offset-0/62533.

The motivation for this change is twofold:

 * Rust relies on getelementptr inbounds with zero offset being
   legal for arbitrary pointers to support zero-sized types. The
   current rules are unclear on whether this is legal or not
   (saying that there is a zero-size "allocated object" at every
   address may be consistent with our current rules, but more
   clarity is desired here).
 * The current semantics require us to drop the inbounds flag
   when materializing zero-index GEPs, which is done by some
   InstCombine transforms. Preserving the inbounds flag can
   substantially improve optimization quality in some cases, as
   illustrated in D154055.
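
To make the first motivation concrete, here is a minimal Rust sketch of the
zero-sized-type case. It assumes a recent Rust toolchain, where a pointer
offset of zero bytes is documented as allowed for any pointer, and it assumes
that rustc lowers `add` to a `getelementptr inbounds` (the usual lowering, but
not something this commit states). `Marker`, `dangling_marker` and
`offset_by_zero` are hypothetical names used only for illustration.

```rust
use std::ptr::NonNull;

/// A zero-sized type: values occupy no bytes, so no allocation backs them.
struct Marker;

/// Returns a well-aligned but dangling pointer to `Marker`.
/// It does not point into any allocated object.
fn dangling_marker() -> *const Marker {
    NonNull::<Marker>::dangling().as_ptr()
}

/// Offsets a `Marker` pointer by zero elements.
fn offset_by_zero(p: *const Marker) -> *const Marker {
    // SAFETY (assumption): the byte offset is 0 * size_of::<Marker>() == 0,
    // and zero-byte offsets are documented as allowed on any pointer in
    // recent Rust versions.
    unsafe { p.add(0) }
}
```

Under the old wording it was unclear whether the resulting zero-offset
`getelementptr inbounds` on a dangling pointer was legal; under the new wording
it is, because all indices are zero.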

As far as I know, the only analyses/transforms affected by this
semantic change are:

 * A special case for comparisons with null in CaptureTracking,
   which is fixed by D154054. As far as I can tell, that special
   case is not particularly valuable and should be recovered by
   other transforms.
 * Folding gep inbounds undef, idx to poison. We now need to fold
   to undef instead (D154215).

Differential Revision: https://reviews.llvm.org/D154051
nikic committed Jul 6, 2023
1 parent 5b666cf commit 2de812f
Showing 1 changed file with 9 additions and 6 deletions.
15 changes: 9 additions & 6 deletions llvm/docs/LangRef.rst
@@ -10923,14 +10923,12 @@ for the given testcase is equivalent to:
ret ptr %t5
}

-If the ``inbounds`` keyword is present, the result value of the
-``getelementptr`` is a :ref:`poison value <poisonvalues>` if one of the
-following rules is violated:
+If the ``inbounds`` keyword is present, the result value of a
+``getelementptr`` with any non-zero indices is a
+:ref:`poison value <poisonvalues>` if one of the following rules is violated:

* The base pointer has an *in bounds* address of an allocated object, which
-means that it points into an allocated object, or to its end. The only
-*in bounds* address for a null pointer in the default address-space is the
-null pointer itself.
+means that it points into an allocated object, or to its end.
* If the type of an index is larger than the pointer index type, the
truncation to the pointer index type preserves the signed value.
* The multiplication of an index by the type size does not wrap the pointer
@@ -10945,6 +10943,11 @@ following rules is violated:
* In cases where the base is a vector of pointers, the ``inbounds`` keyword
applies to each of the computations element-wise.

+Note that ``getelementptr`` with all-zero indices is always considered to be
+``inbounds``, even if the base pointer does not point to an allocated object.
+As a corollary, the only pointer in bounds of the null pointer in the default
+address space is the null pointer itself.
+
These rules are based on the assumption that no allocated object may cross
the unsigned address space boundary, and no allocated object may be larger
than half the pointer index type space.
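
The revised rule can be illustrated with a toy model. This is a sketch under
stated assumptions, not LLVM's implementation: it models a single-index GEP
with a 64-bit pointer index type, omits the index-truncation rule, and includes
the requirement that the resulting address stays in bounds, which comes from
the part of the rule list this diff does not show. `Allocation`, `Gep` and
`gep_inbounds` are hypothetical names.

```rust
/// Toy model of an allocated object: addresses `base..=base + size` are in
/// bounds (pointing into the object, or to its end).
#[derive(Clone, Copy, Debug)]
pub struct Allocation {
    pub base: u64,
    pub size: u64,
}

impl Allocation {
    pub fn in_bounds(&self, addr: u64) -> bool {
        addr >= self.base && addr - self.base <= self.size
    }
}

/// Outcome of the modeled `getelementptr inbounds`.
#[derive(Debug, PartialEq, Eq)]
pub enum Gep {
    Ptr(u64),
    Poison,
}

/// `obj` is the allocated object the base pointer is based on, if any.
pub fn gep_inbounds(obj: Option<Allocation>, base: u64, index: i64, elem_size: i64) -> Gep {
    // All-zero indices: always considered inbounds, no allocated object needed.
    if index == 0 {
        return Gep::Ptr(base);
    }
    // Non-zero index: the base must have an in-bounds address of an object.
    let obj = match obj {
        Some(o) if o.in_bounds(base) => o,
        _ => return Gep::Poison,
    };
    // The multiplication of the index by the type size must not wrap (nsw).
    let offset = match index.checked_mul(elem_size) {
        Some(off) => off,
        None => return Gep::Poison,
    };
    // Assumption (rule elided from this diff): the result must stay in bounds.
    match base.checked_add_signed(offset) {
        Some(addr) if obj.in_bounds(addr) => Gep::Ptr(addr),
        _ => Gep::Poison,
    }
}
```

In this model, a null base pointer with no allocated object yields a valid
result only for index zero, which matches the corollary in the note: the only
pointer in bounds of the null pointer is the null pointer itself.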