[LLVM] Fix a possible tbaa issue #11181
That makes sense to me. By passing a scalar type into The one potential issue would be for vectorized types that contain alignment padding (this conditional in I can see two possible options to avoid this issue. Option 1 would be to swap the order of the |
Thanks for the notes! After a re look-through, IIUC, I think the "index" for alias info should keep the same element unit for all accesses, so we may not change it either the However, before any changes, I found another concern about aliased buffers. Since, in the new convention, two buffer objects (maybe of different datatypes) with the same buffer var alias each other, it may not be safe to use the current buffer's dtype as the index unit for alias info.
eg1:
# A[16] and B[4] may be inferred as NoAlias by tbaa
A = T.allocate([64], "int8")
B = T.buffer_decl([16], "int32", data=A.data)
A[16] = 1 # tag: (A.data, idx=16)
B[4] = 1 # tag: (A.data, idx=4)
eg2:
# A and B are inferred as NoAlias since they have different buffer vars
A = T.allocate([64], "int8")
B_data = T.address_of(A[4]) # usmp style alias
B = T.buffer_decl([64], "int8", data=B_data)
A[7] = 1 # tag: (A.data, idx=7)
B[3] = 2 # tag: (B.data, idx=3)
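To make eg1 concrete, here is a minimal sketch in plain Python (no TVM required). It compares tags built from element indices, in the style of the `# tag:` comments above, with the byte ranges the two stores actually touch. The helper names `element_tag`, `byte_range`, and `ranges_overlap` are hypothetical and exist only for this illustration.

```python
def element_tag(buffer_var, index):
    """Tag in the style of the '# tag:' comments: (buffer var, element index)."""
    return (buffer_var, index)


def byte_range(index, dtype_bits, lanes=1):
    """Half-open byte range [start, stop) touched by a single access."""
    nbytes = dtype_bits * lanes // 8
    start = index * nbytes
    return (start, start + nbytes)


def ranges_overlap(a, b):
    """True when two half-open ranges share at least one byte."""
    return a[0] < b[1] and b[0] < a[1]


# eg1: A is int8, B is an int32 view of the same data pointer.
tag_a = element_tag("A.data", 16)
tag_b = element_tag("A.data", 4)
bytes_a = byte_range(16, dtype_bits=8)   # store A[16]
bytes_b = byte_range(4, dtype_bits=32)   # store B[4]

tags_differ = tag_a != tag_b                       # element tags look unrelated
really_overlap = ranges_overlap(bytes_a, bytes_b)  # but the bytes do overlap
```

With element-unit tags the two stores look independent, while in byte units `A[16]` falls inside the four bytes written by `B[4]`.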
That makes sense to me, so long as all access uses the vectorized data type or the scalar data type, but not both.
Good point, and thank you for the examples. I agree with the conclusion that this can impact any aliased buffer that has a different element type, differing either in the number of lanes or in the scalar datatype. There are a few cases I can think of where that could occur in practice, such as What are the restrictions on the element type presented to the tbaa annotations? If we write the alias information using byte-based indexing, I think that would solve both the vector size and the dtype size issues.
@wrongtest in your particular case and all other cases, we should make sure the TBAA info is indexed by buffer->data instead of the buffer itself, which would resolve the problem of buffer re-declaration.
The current implementation uses the data field: (eg2) is a case where, although the two buffer objects have totally different buffer data, the accesses may still overlap (by memory pool reuse, IIUC). I also found a related PR by @kparzysz-quic, #6046, by keyword search, but it is when the backend tir use
force-pushed from 99128b4 to 60e7760
@wrongtest when no-alias is set to True, we should ensure that aliasing is only indicated by the |
You mean in LLVM? TBAA has two kinds of types: scalars and structs. Scalars are elementary, i.e. are not composed from other types, while structs are. What you present as a "scalar" to TBAA is up to you, there are no links there to any actual LLVM IR types. |
Thanks! So we do not need to worry about (eg2) form. |
force-pushed from eb8378f to 092142e
Changed the tbaa index to be based on the "underlying datatype", inspired by #6046, or to fall back to bytes. The tag node on the buffer dtype is removed because it seems that type tree paths with different dtype tags should not exist on the same buffer. Could you kindly take another look? @Lunderberg Unfortunately I still fail to construct a counterexample that produces a wrong runtime result on CPU, though I can provide LLVM IR with suspicious, illegal tbaa.
Lunderberg left a comment
Based on @kparzysz-quic 's comment about scalar types in LLVM's TBAA, I think there's a couple of potential improvements if we change the scalar type from buffer_var->type_annotation->element_type to just bits or bytes. That would avoid needing to check the type annotations, and would avoid marking some types of non-aliased access as aliasing. That said, this PR is a huge improvement over the current state of incorrect annotations, and so those changes could also be a separate PR entirely.
I took a glance at the CI failure, and it looks like a timeout on the Windows build that just needs to be restarted.
src/target/llvm/codegen_llvm.cc
Outdated
  arith::PVar<int> planes;
  // create meta-data for alias analysis
  // Use a group of binary tree ranges of memory banks.
  if (index.defined()) {
Tangentially-related cleanup: I think we can remove the check on index.defined(). AddAliasInfo is only called from BufferAccessHelper, which provides a defined index.
src/target/llvm/codegen_llvm.cc
Outdated
  // Extract the underlying element bit width of the allocated buffer.
  // fallback to byte type if no type annotation present.
  int64_t buffer_elem_bits = 8;
I don't think we need the size from the type annotation. The type annotation on the Var would only include the buffer's type as allocated, and may not be correlated with the type used for accessing it. When accessing the buffer in CodeGenLLVM::CreateBufferPtr, if the allocation type and access type differ, the buffer var is cast to the access type. So the bytes being accessed by a load/store should only depend on the access type and the access index.
src/target/llvm/codegen_llvm.cc
Outdated
    xwith = 1;
  }
  if (buffer_elem_bits != access_elem_bits) {
    base = base * access_elem_bits / buffer_elem_bits;
Would this cause false positives for aliasing of a buffer whose access type is smaller than the allocation type? I'm picturing something like the following:
@T.prim_func
def func():
    A = T.alloc_buffer(32, dtype='int32')
    A_bytes = T.buffer_decl(128, dtype='int8', data=A.data)
    A_bytes[0] = 42
    A_bytes[3] = 42

By scaling the alias information to the size of the original allocation, both A_bytes[0] and A_bytes[3] are treated as access of A[0]. This would treat it as an alias even though they are accessing different addresses.
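The arithmetic behind this concern can be checked with a small sketch in plain Python. `scaled_index` mirrors the quoted line `base = base * access_elem_bits / buffer_elem_bits` using integer division, and `byte_index` is the byte-unit alternative discussed in this thread. Both function names are hypothetical and are not part of the PR.

```python
def scaled_index(index, access_bits, buffer_bits):
    """Index rescaled to the allocation's element unit (truncating division)."""
    return index * access_bits // buffer_bits


def byte_index(index, access_bits):
    """Index rescaled to bytes, independent of the allocation's element type."""
    return index * access_bits // 8


# A is allocated as int32; A_bytes accesses the same data as int8.
s0 = scaled_index(0, access_bits=8, buffer_bits=32)
s3 = scaled_index(3, access_bits=8, buffer_bits=32)
b0 = byte_index(0, access_bits=8)
b3 = byte_index(3, access_bits=8)

collapsed = s0 == s3   # both int8 stores land on element 0 of the int32 view
distinct = b0 != b3    # in byte units they stay distinguishable
```

Rescaling to the allocation unit is conservative (it never hides an overlap) but it merges the two byte stores into one tag, which is the false positive described above.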
to just bits or bytes
I agree and would like to follow that in the current PR. It makes the code much cleaner and avoids this sort of false positive.
Note there is a magic width number, 1024, above which the access falls back to a full-region access (the root tag for the current buffer var). Thus the type tree depth will decrease for huge vectors compared to the original version, but I think this is a minor issue and we can revisit it if a performance regression is detected.
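As a rough model of the behavior described here, the sketch below builds a path of nested, power-of-two byte ranges for each access and treats two accesses as possibly aliasing when one path is a prefix of the other. It is loosely based on the "binary tree ranges" comment in the quoted diff. The node layout, the rounding loop, and the names `aligned_range`, `tag_path`, and `may_alias` are assumptions made for illustration, not a transcription of `codegen_llvm.cc`.

```python
MAX_WIDTH = 1024  # the "magic width" mentioned above


def aligned_range(base, width):
    """Smallest power-of-two-sized, self-aligned range covering [base, base + width)."""
    w = 1
    while w < width:
        w *= 2
    start = base - base % w
    while start + w < base + width:
        w *= 2
        start = base - base % w
    return start, w


def tag_path(buffer_var, base, width):
    """Path from the buffer root down to the narrowest range node for the access."""
    path = [buffer_var]
    start, w = aligned_range(base, width)
    cur = MAX_WIDTH
    while cur >= w:  # never entered when w > MAX_WIDTH: only the root tag remains
        path.append((cur, (start // cur) * cur))
        cur //= 2
    return path


def may_alias(p, q):
    """Tree rule: two tags may alias when one path is a prefix of the other."""
    n = min(len(p), len(q))
    return p[:n] == q[:n]


byte0 = tag_path("A", base=0, width=1)
byte3 = tag_path("A", base=3, width=1)
word0 = tag_path("A", base=0, width=4)
huge = tag_path("A", base=0, width=2048)
```

In this model the two single-byte accesses are kept apart, the 4-byte access may alias both of them, and the 2048-byte access collapses to the root tag, so it may alias every access on the same buffer var.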
force-pushed from 092142e to adeff51
Before the "typed buffers" were introduced, it was possible to use all kinds of types to access the memory (for example by redeclaring buffers with the same buffer variable, but with different types). So, you could read/write |
https://github.com/vinx13/tvm-rfcs/blob/clarify-buffer-access/rfcs/0063-clarifying-buffer-declaration-and-access.md Currently I understand this to mean that all accesses with the same buffer data must alias (irrelevant to dtype) and the word
Thank you for the link. I guess the document should also specify that in eg2 in your earlier comment, the buffers will not be aliased. |
force-pushed from adeff51 to b536400
* fix a possible tbaa issue
* Correct tbaa index unit by underlying buffer elemtype
* always use byte as index unit in tbaa
Hi, we encountered a weird problem in LLVM-generated code that seems to be caused by the current LLVM tbaa annotations.
The function AddAliasInfo distinguishes between the scalar form and the vectorized form of the index. If we pass a scalar index that is actually just the head of a ramp access, overlapping accesses may be unsafely inferred as NoAlias by the tbaa analysis.
However, I am not sure how to reproduce the problem on a common target like an x86 CPU. Glad to see any suggestions :)
cc @Lunderberg
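The ramp-head problem from the description can be illustrated with a minimal sketch in plain Python. It lists the element indices a ramp access touches and compares them with what a tag built from only the ramp's first index would cover. `touched_elements` is a hypothetical helper, and the concrete indices are chosen only for this example.

```python
def touched_elements(base, stride=1, lanes=1):
    """Element indices covered by a ramp(base, stride, lanes) access."""
    return {base + i * stride for i in range(lanes)}


vector_store = touched_elements(base=0, stride=1, lanes=4)  # elements 0..3
head_only = touched_elements(base=0)   # what a scalar tag for the ramp head covers
scalar_load = touched_elements(base=2)

looks_disjoint = head_only.isdisjoint(scalar_load)          # tag says NoAlias
actually_overlaps = not vector_store.isdisjoint(scalar_load)
```

If the vector store is tagged as though it only touched element 0, a later load of element 2 appears independent and could be reordered across the store, even though the store writes that element.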