Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Figure out how to handle index vectors in getelementptr instructions and constant expressions #35

Closed
mewmew opened this issue Nov 9, 2018 · 6 comments

Comments

@mewmew
Copy link
Member

mewmew commented Nov 9, 2018

Today I noticed a strange use of getelementptr instructions, that I have yet to find any official documentation describing its semantics. Rather than integer values being used as indices of gep, I found an instruction which uses integer vectors. And the resulting type of the gep instruction is not a pointer type but a vector of pointers type.

From ls.ll of Coreutils:

%37 = getelementptr inbounds %struct.fileinfo, %struct.fileinfo* %20, <2 x i64> %34, !dbg !4706
...
%40 = bitcast i8** %39 to <2 x %struct.fileinfo*>*, !dbg !4708
store <2 x %struct.fileinfo*> %37, <2 x %struct.fileinfo*>* %40, align 8, !dbg !4708, !tbaa !1793

Notice that the first index of gep is <2 x i64> %34, a vector value and not an integer value.

Furthermore, notice that the type of %37 is <2 x %struct.fileinfo*>, a vector of pointers type, and not a pointer type.

@pwaller Have you seen this before, and do you know how the result type of gep is calculated?

I skimmed through https://llvm.org/docs/GetElementPtr.html and found no reference of this behaviour.

Cheers!
Robin

mewmew added a commit that referenced this issue Nov 9, 2018
Note, the caluclation of the resulting type of `gep` is
a best effort guess, as no official documentation has yet
been located which describes the semantics of vector indices
in gep instructions.

Updates #35.
@pwaller
Copy link
Member

pwaller commented Nov 9, 2018

The pointer is just a vector of whatever it would ordinarily be, no? Which bit is presenting a difficulty?

I found a few references in the LLVM source by grepping for getelementptr.*vec, but nothing terribly interesting.

It also brings to mind scatter/gather instructions like VGATHER* and VSCATTER* that are found in AVX512 for example.

@mewmew
Copy link
Member Author

mewmew commented Nov 9, 2018

The pointer is just a vector of whatever it would ordinarily be, no? Which bit is presenting a difficulty?

Yes, this seems to be the case. What I would like to verify is what happens (or is supposed to happen), if you pass two or more vector indices instead of just one. Is that valid? If so, what is the result type?

Given the type definition.

%struct.fileinfo = type { i8*, i8*, i8*, %struct.stat, i32, i32, i8*, i8, i8, i32, i8, i32 }

For the following example, the type of %37 is <2 x %struct.fileinfo*> (this is from ls.ll):

%37 = getelementptr inbounds %struct.fileinfo, %struct.fileinfo* %20, <2 x i64> %34

For the following example, is the type of %37 <2 x i32*>? Is this even valid?

%37 = getelementptr inbounds %struct.fileinfo, %struct.fileinfo* %20, <2 x i64> %34, i32 4

For the following example, is the type of %37 <2 x <3 x i32*>>? Is this even valid?

%37 = getelementptr inbounds %struct.fileinfo, %struct.fileinfo* %20, <2 x i64> %34, <3 x i32> <i32 4, i32 4, i32 4>

Edit: also, are we ever required to inspect the contents of the vector index to determine the result type of the gep instruction?

For reference, this is how we currently handle result type calculation for gep instructions:

// gepType returns the pointer type to the element at the position in the type
// specified by the given indices, as calculated by the getelementptr
// instruction.
func gepType(elemType types.Type, indices []value.Value) types.Type {
	e := elemType
	for i, index := range indices {
		if i == 0 {
			// Ignore checking the 0th index as it simply follows the pointer of
			// src.
			//
			// ref: http://llvm.org/docs/GetElementPtr.html#why-is-the-extra-0-index-required
			continue
		}
		switch t := e.(type) {
		case *types.PointerType:
			// ref: http://llvm.org/docs/GetElementPtr.html#what-is-dereferenced-by-gep
			panic(fmt.Errorf("unable to index into element of pointer type `%v`; for more information, see http://llvm.org/docs/GetElementPtr.html#what-is-dereferenced-by-gep", elemType))
		case *types.VectorType:
			e = t.ElemType
		case *types.ArrayType:
			e = t.ElemType
		case *types.StructType:
			idx, ok := index.(*constant.Int)
			if !ok {
				panic(fmt.Errorf("invalid index type for structure element; expected *constant.Int, got %T", index))
			}
			e = t.Fields[idx.X.Int64()]
		default:
			panic(fmt.Errorf("support for indexing element type %T not yet implemented", e))
		}
	}
	// TODO: Validate how index vectors in gep are supposed to work.
	//
	// Example from dir.ll:
	//    %113 = getelementptr inbounds %struct.fileinfo, %struct.fileinfo* %96, <2 x i64> %110, !dbg !4736
	//    %116 = bitcast i8** %115 to <2 x %struct.fileinfo*>*, !dbg !4738
	//    store <2 x %struct.fileinfo*> %113, <2 x %struct.fileinfo*>* %116, align 8, !dbg !4738, !tbaa !1793
	if len(indices) > 0 {
		if t, ok := indices[0].Type().(*types.VectorType); ok {
			return types.NewVector(t.Len, types.NewPointer(e))
		}
	}
	return types.NewPointer(e)
}

@pwaller
Copy link
Member

pwaller commented Nov 11, 2018

For the following example, is the type of %37 <2 x i32*>? Is this even valid?

This one seems reasonable to me.

For the following example, is the type of %37 <2 x <3 x i32*>>? Is this even valid?

I guess it would be, in theory. But I wouldn't have high expectations of this being supported.

@mewmew mewmew added this to the v0.3 milestone Nov 12, 2018
@mewmew
Copy link
Member Author

mewmew commented Nov 22, 2018

For reference, I found a test case from LLVM which contains gep with vector indices:

From llvm/test/Assembler/getelementptr.ll:

; Verify that constant expression vector GEPs work.

@z = global <2 x i32*> getelementptr ([3 x {i32, i32}], <2 x [3 x {i32, i32}]*> zeroinitializer, <2 x i32> <i32 1, i32 2>, <2 x i32> <i32 2, i32 3>, <2 x i32> <i32 1, i32 1>)

; Verify that struct GEP works with a vector of pointers.
define <2 x i32*> @test7(<2 x {i32, i32}*> %a) {
  %w = getelementptr {i32, i32}, <2 x {i32, i32}*> %a, <2 x i32> <i32 5, i32 9>, <2 x i32> zeroinitializer
  ret <2 x i32*> %w
}

; Verify that array GEP works with a vector of pointers.
define <2 x i8*> @test8(<2 x [2 x i8]*> %a) {
  %w = getelementptr  [2 x i8], <2 x  [2 x i8]*> %a, <2 x i32> <i32 0, i32 0>, <2 x i8> <i8 0, i8 1>
  ret <2 x i8*> %w
}

Also, found a good test case for invalid cases.

From llvm/test/Assembler/getelementptr_vec_struct.ll:

; Test that a vector struct index with non-equal elements is rejected.

; CHECK: invalid getelementptr indices

define <2 x i32*> @test7(<2 x {i32, i32}*> %a) {
  %w = getelementptr {i32, i32}, <2 x {i32, i32}*> %a, <2 x i32> <i32 5, i32 9>, <2 x i32> <i32 0, i32 1>
  ret <2 x i32*> %w
}

mewmew added a commit that referenced this issue Nov 22, 2018
@mewmew
Copy link
Member Author

mewmew commented Nov 25, 2018

After enabling the Assembler test cases from the LLVM project, we now have a few test cases for gep vector indices that will help us get the implementation correct. Currently 6 or so test cases are failing, so we still have work to do. I really wish there was an official documentation somewhere for how the result type of gep is calculated.

Test cases from llvm/asm/asm_test.go:

// TODO: investigate why we are able to parse `return i32* %gep`, should not be possible as there is no `return` token in the grammar.
//{path: "../testdata/llvm/test/Assembler/getelementptr_invalid_ptr.ll"},
// TODO: fix computation of gep type. We currently do not update the return type to vector if the first gep index is a scalar, but the second (or later) indices is a vector. As such, we currently compute the type `i32*` where `<4 x i32*>` should have been computed.
//{path: "../testdata/llvm/test/Assembler/getelementptr_vec_ce.ll"},
// TODO: fix computation of gep type. We currently do not update the return type to vector if the first gep index is a scalar, but the second (or later) indices is a vector. As such, we currently compute the type `i32*` where `<4 x i32*>` should have been computed.
//{path: "../testdata/llvm/test/Assembler/getelementptr_vec_ce2.ll"},
// TODO: fix gep type computation.
//{path: "../testdata/llvm/test/Assembler/getelementptr_vec_idx1.ll"},
// TODO: fix gep type computation.
//{path: "../testdata/llvm/test/Assembler/getelementptr_vec_idx2.ll"},
// TODO: fix gep type computation.
//{path: "../testdata/llvm/test/Assembler/getelementptr_vec_idx3.ll"},

mewmew added a commit that referenced this issue Nov 30, 2018
Note, the caluclation of the resulting type of `gep` is
a best effort guess, as no official documentation has yet
been located which describes the semantics of vector indices
in gep instructions.

Updates #35.


Former-commit-id: 5a1fcec
mewmew added a commit that referenced this issue Nov 30, 2018
@mewmew mewmew added the bug label Mar 14, 2019
@mewmew
Copy link
Member Author

mewmew commented Dec 6, 2019

resolved as of rev d35d3c5.

@mewmew mewmew closed this as completed Dec 6, 2019
mewmew added a commit that referenced this issue Dec 6, 2019
The issue was twofold, firstly, the vector length was only checked for
index when indexing into struct types, it should always be checked
(including for the 0:th index!). The second issue was that the conversion
from constant.Constant, value.Value and ast.TypeValue omited the index
vector length for indices not considered suited for use to index into
structure types. However, as already concluded, the vector length is
relevent for indexing into other fields and just struct types.

Updates #35.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants