Standalone/stdx-ops.mlir crash #45
This does not seem to be a random crash. I have executed that line a dozen thousand times (on both my desktop and the cluster - head node and compute node) and I can never reproduce it. We're seeing this error on our CI, correct? Can you reproduce it locally, on your machine? A quick look at the error messages shows the pattern:
But we are not calling safe_realloc directly, not even manipulating the |
On
with
So, the I can see no fault in this code. :( |
Turns out it was miscompilation with |
Side-note: we do not use |
Problem was ccache on a local machine. Fixes #45
This has happened on the cluster a while ago and is now happening on my machine as well, always when using GCC 12.2. I don't use We need to understand what's the problem, because most people will just try and build with the system compiler, which on Linux happens to be GCC. |
Spoke too soon, now I get the same problem with clang... I'll try to reduce some test. |
Before, it was crashing when parsing the
|
The problem is still the same. In addition, the generated parser for the assembly format parses everything and only crashes when it gets to parsing the return type. The reduced logic in it is:
It did not reach Need to build the whole of LLVM in |
For completeness, just adding the error I have encountered. My setup: WSL2 Ubuntu-20.04, LLVM compiled with the default Ubuntu clang 10.0.0, tpp-sandbox compiled with the default Ubuntu gcc 9.4.0. The following four tests consistently end up throwing the above
Also, in my case all benchmarks produce stack dumps. For example,
Compiling tpp-sandbox with the same default clang 10.0.0 appears to address these problems as the errors are no longer present when running |
Ok, the benchmark one is a slightly different error, it's not in the parsing of the instruction, but in the |
As a test of patience, when LLVM is built in Debug mode, the error doesn't happen, so I can't step through the code and will have to resort to printfs inside LLVM. Yay! But this also tells me the bug is more likely in LLVM (as UB) or GCC (as an optimisation bug). |
Just to report my findings, I see the following tests fail consistently in my setup:
The call stack looks the same as what Adam shared. I am using gcc-9.4.0 to compile LLVM and tpp-sandbox. |
I think these failures are because we do not properly register the operations in the jitter. |
If you mean the benchmark failures, that's tracked by #158. Unfortunately, this issue got mixed up with those failures, but this one is about the |
Building the tpp-mlir with GCC goes smoothly. However, running tests crashes with
For the record, the following debugs into the issue: ( gdb-oneapi --args /path/to/tpp-mlir/build/bin/tpp-opt -transform-dialect-interpreter -verify-diagnostics -split-input-file /path/to/tpp-mlir/test/TPP/transform/transform-propagation.mlir Note: our standard environment needs a newer |
Just did a new test and here are the ops failing in the same way (
which have the following
Which is exactly the same problem as our
But it also fails on variadic arguments, which is the case for Perf but not VNNI. However, the failure isn't when parsing the result types, but when parsing the argument types. And both But if |
Looking at The possible undefined behaviour here is that in the case of the |
The only two ops that have this problem as of today are The problem seems to be when there actually is a return value. This works:
func.func @vnni_dialect(%arg0: memref<4x256x512xbf16>,
%arg1: memref<4x256x1024x2xbf16>,
%arg2: memref<256x1024xbf16>,
%arg3: memref<512x2048x2xbf16>,
%arg4: memref<256x2048xbf16>) {
vnni.brgemm ins(%arg0 : memref<4x256x512xbf16>, %arg1 : memref<4x256x1024x2xbf16>) out(%arg2 : memref<256x1024xbf16>)
vnni.matmul ins(%arg2: memref<256x1024xbf16>, %arg3: memref<512x2048x2xbf16>) out(%arg4: memref<256x2048xbf16>)
return
}
While this crashes:
func.func @vnni_dialect(%arg0: memref<4x256x512xbf16>,
%arg1: memref<4x256x1024x2xbf16>,
%arg2: memref<256x1024xbf16>,
%arg3: memref<512x2048x2xbf16>,
%arg4: memref<256x2048xbf16>) -> memref<256x2048xbf16> {
vnni.brgemm ins(%arg0 : memref<4x256x512xbf16>, %arg1 : memref<4x256x1024x2xbf16>) out(%arg2 : memref<256x1024xbf16>)
%ret = vnni.matmul ins(%arg2: memref<256x1024xbf16>, %arg3: memref<512x2048x2xbf16>) out(%arg4: memref<256x2048xbf16>) -> memref<256x2048xbf16>
return %ret : memref<256x2048xbf16>
}
Same thing on |
Just tried on VNNI dialect and using This should be fine for most of our ops ( |
Adds a parser and printer for both bench and yield ops, as parseTypeList was crashing the parser with `grow_pod` on `emplace_back`. This is likely an upstream problem that isn't being hit by upstream tests because no one uses `Variadic<>` quite like we do with the inline assembly (and TableGen code). So we add a custom parser/printer for both (mainly stolen from the TableGen implementation, replacing parseTypeList with parseCommaSeparatedList and a similar lambda). This takes care of the GCC code generation crash. Fixes #45
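For readers unfamiliar with the pattern the commit describes, here is a minimal, self-contained sketch of "parse a comma-separated list by calling a lambda once per element". It deliberately does not use MLIR: `ToyParser`, its `parseType`, and `parseTypes` are hypothetical stand-ins that only mimic the shape of the calls named in the commit message, and the toy returns `true` on success, whereas MLIR's `ParseResult` uses the opposite convention. It is not the code from the actual fix.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy parser; it only mimics the shape of the MLIR parser calls.
class ToyParser {
public:
  explicit ToyParser(std::string text) : text_(std::move(text)) {}

  // Parse one "type": a run of characters up to the next comma or space.
  // Returns true on success (unlike MLIR's ParseResult, where a truthy
  // value means failure).
  bool parseType(std::string &result) {
    skipSpaces();
    std::size_t start = pos_;
    while (pos_ < text_.size() && text_[pos_] != ',' &&
           !std::isspace(static_cast<unsigned char>(text_[pos_])))
      ++pos_;
    if (pos_ == start)
      return false; // nothing to parse
    result = text_.substr(start, pos_ - start);
    return true;
  }

  // Invoke parseElement once per comma-separated element.
  bool parseCommaSeparatedList(const std::function<bool()> &parseElement) {
    do {
      if (!parseElement())
        return false;
    } while (consumeIf(','));
    return true;
  }

private:
  void skipSpaces() {
    while (pos_ < text_.size() &&
           std::isspace(static_cast<unsigned char>(text_[pos_])))
      ++pos_;
  }

  // Consume character c (after optional spaces) if it is next in the input.
  bool consumeIf(char c) {
    skipSpaces();
    if (pos_ < text_.size() && text_[pos_] == c) {
      ++pos_;
      return true;
    }
    return false;
  }

  std::string text_;
  std::size_t pos_ = 0;
};

// Collect the types one by one through a lambda, rather than handing the
// output vector to a bulk "parse type list" helper.
bool parseTypes(const std::string &text, std::vector<std::string> &types) {
  ToyParser parser(text);
  return parser.parseCommaSeparatedList([&]() {
    std::string type;
    if (!parser.parseType(type))
      return false;
    types.push_back(type);
    return true;
  });
}
```

Calling `parseTypes("memref<256x1024xbf16>, memref<256x2048xbf16>", types)` fills `types` with the two entries. The point of the design is that each element is parsed into a local and pushed explicitly, so the list helper never grows the caller's container itself.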