Skip to content

Conversation

JDPailleux
Copy link
Member

@JDPailleux JDPailleux commented May 23, 2025

Hello,

This PR is a discussion around a dialect to define “coarrray” operations in Flang. This is a draft and a few operations have been proposed to start with and see if they might be suitable.

@JDPailleux JDPailleux force-pushed the prif-dialect-doc branch 3 times, most recently from 9b0a827 to 3c1dd2c Compare June 2, 2025 12:50
@JDPailleux JDPailleux changed the title [PRIF] Dialect PRIF [flang][Coarray] Dialect Coarray Jun 2, 2025
jvillette38 pushed a commit that referenced this pull request Jun 9, 2025
Fixes llvm#123300

What is seen 
```
clang-repl> int x = 42;
clang-repl> auto capture = [&]() { return x * 2; };
In file included from <<< inputs >>>:1:
input_line_4:1:17: error: non-local lambda expression cannot have a capture-default
    1 | auto capture = [&]() { return x * 2; };
      |                 ^
zsh: segmentation fault  clang-repl --Xcc="-v"

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
  * frame #0: 0x0000000107b4f8b8 libclang-cpp.19.1.dylib`clang::IncrementalParser::CleanUpPTU(clang::PartialTranslationUnit&) + 988
    frame #1: 0x0000000107b4f1b4 libclang-cpp.19.1.dylib`clang::IncrementalParser::ParseOrWrapTopLevelDecl() + 416
    frame #2: 0x0000000107b4fb94 libclang-cpp.19.1.dylib`clang::IncrementalParser::Parse(llvm::StringRef) + 612
    frame #3: 0x0000000107b52fec libclang-cpp.19.1.dylib`clang::Interpreter::ParseAndExecute(llvm::StringRef, clang::Value*) + 180
    frame #4: 0x0000000100003498 clang-repl`main + 3560
    frame #5: 0x000000018d39a0e0 dyld`start + 2360
```

Though the error is justified, we shouldn't be interested in exiting
through a segfault in such cases.

The issue is that empty named decls weren't being taken care of
resulting into this assert


https://github.com/llvm/llvm-project/blob/c1a229252617ed58f943bf3f4698bd8204ee0f04/clang/include/clang/AST/DeclarationName.h#L503

Can also be seen when the example is attempted through xeus-cpp-lite.


![image](https://github.com/user-attachments/assets/9b0e6ead-138e-4b06-9ad9-fcb9f8d5bf6e)
jvillette38 pushed a commit that referenced this pull request Jun 9, 2025
…142952)

This was removed in llvm#135343 in
favour of making it a format variable, which we do here. This follows
the precedent of the `[opt]` and `[artificial]` markers.

Before:
```
 thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
 * frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3
   frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
   frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43
   frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
   frame #4: 0x0000000186345be4 dyld`start + 7040
```

After (note the `[inlined]` markers):
```
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
* frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3 [inlined]
  frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
  frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43 [inlined]
  frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
  frame #4: 0x0000000186345be4 dyld`start + 7040
```

rdar://152642178
@JDPailleux JDPailleux force-pushed the prif-dialect-doc branch 2 times, most recently from 044ebee to 96b5bb5 Compare June 12, 2025 08:45
@JDPailleux JDPailleux force-pushed the prif branch 3 times, most recently from 5f82bf1 to fdf02a9 Compare June 17, 2025 12:43
Copy link

@bonachea bonachea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for putting this together! I'm very glad to see this effort progressing.
I've added some initial questions based on a quick skim.

I will probably have more feedback once I've done a detailed review.

Comment on lines +727 to +730
//===----------------------------------------------------------------------===//
// Collective Operations
//===----------------------------------------------------------------------===//

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do Collective Subroutines really belong in this dialect?

These are often conflated via informal conversations, but Collective Subroutines have absolutely nothing to do with coarrays. They are a completely orthogonal multi-image feature. In fact the argument to collective subroutines is even prohibited from being a coindexed object.

I'm not sure these belong in ANY dialect - for most static purposes, collective subroutines are just a regular intrinsic subroutine. IMHO they don't belong lumped into a coarray dialect any more than say, the SQRT intrinsic.

Comment on lines +39 to +86
def coarray_StopOp : coarray_Op<"stop", [AttrSizedOperandSegments]> {
let summary = "Operation that synchronizes all execution images, clean up the"
"parallel runtime environment, and terminates the program.";
let description = [{
This operation synchronizes all executing images, cleans up the parallel
runtime environment, and terminates the program. Calls to this operation do
not return. This operation supports both normal termination at the end of a
program, as well as any STOP statements from the user source code.
}];

let arguments = (ins Optional<BoolLike>:$quiet,
Optional<AnyIntegerType>:$stop_code_int,
Arg<Optional<AnyRefOrBoxType>, "", [MemWrite]>:$stop_code_char,
UnitAttr:$is_error);

let hasVerifier = 1;
}

def coarray_FailImageOp : coarray_Op<"fail_image", [NoMemoryEffect]> {
let summary =
"Causes the executing image to cease participating in program execution";
}

//===----------------------------------------------------------------------===//
// Image Queries
//===----------------------------------------------------------------------===//

def coarray_NumImagesOp
: coarray_Op<"num_images", [NoMemoryEffect, AttrSizedOperandSegments]> {
let summary = "Query the number of images in the specified or current team";
let description = [{
This operation query the number of images in the specified or current
team and can be called with 3 differents way :
- `num_images()`
- `num_images(team)`
- `num_images(team_number)`

Argumemnts:
- `team` : Shall be a scalar of type `team_type` from the `ISO_FORTRAN_ENV`
module with a value that identifies the current or ancestor team.
- `team_number` : Shall be an integer scalar. It shall identify the
initial team or a sibling team of the current team.

Result Value: The number of images in the specified team, or in the current
team if no team is specified.
}];

let arguments = (ins Optional<AnyInteger>:$team_number,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The STOP and FAIL IMAGE statements and intrinsics NUM_IMAGES, STOPPED_IMAGES, FAILED_IMAGES and IMAGE_STATUS also have nothing to do with coarrays. Neither does SYNC ALL, nor (arguably) all the TEAM support routines.

Perhaps this dialect should be renamed to the multi-image dialect? That name could encompass all the features related to multi-image parallel execution, of which coarrays are an important subset.

Comment on lines +800 to +826
//===----------------------------------------------------------------------===//
// Coarray Access Operations
//===----------------------------------------------------------------------===//

def coarray_GetOp : coarray_Op<"get", []> {
let summary = "Fetch data in a coarray from an image number";

let arguments = (ins AnyIntegerType:$image_num,
Arg<fir_BoxType, "", [MemRead]>:$coarray,
I64:$offset,
I64:$size_in_bytes,
Arg<AnyRefOrBoxType, "", [MemWrite]>:$dest,
Arg<Optional<AnyRefOrBoxType>, "", [MemWrite]>:$errmsg);
let results = (outs AnyIntegerType:$stat);
}

def coarray_PutOp : coarray_Op<"put", []> {
let summary = "Assign data to a coarray to an image number";

let arguments = (ins AnyIntegerType:$image_num,
Arg<fir_BoxType, "", [MemWrite]>:$coarray,
I64:$offset,
I64:$size_in_bytes,
Arg<AnyRefOrBoxType, "", [MemRead]>:$src,
Arg<Optional<AnyRefOrBoxType>, "", [MemWrite]>:$errmsg);
let results = (outs AnyIntegerType:$stat);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first glance, the coarray access ops appear incomplete.

  1. How do you encode strided access?
  2. How do you encode access to pointer/allocatable components of derived types in a coarray?

Perhaps I'm missing something here?

@JDPailleux JDPailleux closed this Jul 10, 2025
@JDPailleux JDPailleux deleted the prif-dialect-doc branch July 10, 2025 14:20
JDPailleux pushed a commit that referenced this pull request Aug 8, 2025
…erver (llvm#148774)

Summary:
There was a deadlock was introduced by [PR
llvm#146441](llvm#146441) which changed
`CurrentThreadIsPrivateStateThread()` to
`CurrentThreadPosesAsPrivateStateThread()`. This change caused the
execution path in
[`ExecutionContextRef::SetTargetPtr()`](https://github.com/llvm/llvm-project/blob/10b5558b61baab59c7d3dff37ffdf0861c0cc67a/lldb/source/Target/ExecutionContext.cpp#L513)
to now enter a code block that was previously skipped, triggering
[`GetSelectedFrame()`](https://github.com/llvm/llvm-project/blob/10b5558b61baab59c7d3dff37ffdf0861c0cc67a/lldb/source/Target/ExecutionContext.cpp#L522)
which leads to a deadlock.

Thread 1 gets m_modules_mutex in
[`ModuleList::AppendImpl`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Core/ModuleList.cpp#L218),
Thread 3 gets m_language_runtimes_mutex in
[`GetLanguageRuntime`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Process.cpp#L1501),
but then Thread 1 waits for m_language_runtimes_mutex in
[`GetLanguageRuntime`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Process.cpp#L1501)
while Thread 3 waits for m_modules_mutex in
[`ScanForGNUstepObjCLibraryCandidate`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Plugins/LanguageRuntime/ObjC/GNUstepObjCRuntime/GNUstepObjCRuntime.cpp#L57).

This fixes the deadlock by adding a scoped block around the mutex lock
before the call to the notifier, and moved the notifier call outside of
the mutex-guarded section. The notifier call
[`NotifyModuleAdded`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Target.cpp#L1810)
should be thread-safe, since the module should be added to the
`ModuleList` before the mutex is released, and the notifier doesn't
modify the module list further, and the call is operates on local state
and the `Target` instance.

### Deadlocked Thread backtraces:
```
* thread #3, name = 'dbg.evt-handler', stop reason = signal SIGSTOP
  * frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563786bd5f40) at    futex-internal.h:146:13
   /*... a bunch of mutex related bt ... */    
   liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007f2f0f1927b0, __m=0x0000563786bd5f40) at std_mutex.h:229:19
    frame llvm#8: 0x00007f2f27946eb7 liblldb.so.21.0git`ScanForGNUstepObjCLibraryCandidate(modules=0x0000563786bd5f28, TT=0x0000563786bd5eb8) at GNUstepObjCRuntime.cpp:60:41
    frame llvm#9: 0x00007f2f27946c80 liblldb.so.21.0git`lldb_private::GNUstepObjCRuntime::CreateInstance(process=0x0000563785e1d360, language=eLanguageTypeObjC) at GNUstepObjCRuntime.cpp:87:8
    frame llvm#10: 0x00007f2f2746fca5 liblldb.so.21.0git`lldb_private::LanguageRuntime::FindPlugin(process=0x0000563785e1d360, language=eLanguageTypeObjC) at LanguageRuntime.cpp:210:36
    frame llvm#11: 0x00007f2f2742c9e3 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeObjC) at Process.cpp:1516:9
    ...
    frame llvm#21: 0x00007f2f2750b5cc liblldb.so.21.0git`lldb_private::Thread::GetSelectedFrame(this=0x0000563785e064d0, select_most_relevant=DoNoSelectMostRelevantFrame) at Thread.cpp:274:48
    frame llvm#22: 0x00007f2f273f9957 liblldb.so.21.0git`lldb_private::ExecutionContextRef::SetTargetPtr(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:525:32
    frame llvm#23: 0x00007f2f273f9714 liblldb.so.21.0git`lldb_private::ExecutionContextRef::ExecutionContextRef(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:413:3
    frame llvm#24: 0x00007f2f270e80af liblldb.so.21.0git`lldb_private::Debugger::GetSelectedExecutionContext(this=0x0000563785d83bc0) at Debugger.cpp:1225:23
    frame llvm#25: 0x00007f2f271bb7fd liblldb.so.21.0git`lldb_private::Statusline::Redraw(this=0x0000563785d83f30, update=true) at Statusline.cpp:136:41
    ...
* thread #1, name = 'lldb', stop reason = signal SIGSTOP
  * frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563785e1dd98) at futex-internal.h:146:13
   /*... a bunch of mutex related bt ... */    
   liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007ffe62be0488, __m=0x0000563785e1dd98) at std_mutex.h:229:19
    frame llvm#8: 0x00007f2f2742c8d1 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeC_plus_plus) at Process.cpp:1510:41
    frame llvm#9: 0x00007f2f2743c46f liblldb.so.21.0git`lldb_private::Process::ModulesDidLoad(this=0x0000563785e1d360, module_list=0x00007ffe62be06a0) at Process.cpp:6082:36
    ...
    frame llvm#13: 0x00007f2f2715cf03 liblldb.so.21.0git`lldb_private::ModuleList::AppendImpl(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, use_notifier=true) at ModuleList.cpp:246:19
    frame llvm#14: 0x00007f2f2715cf4c liblldb.so.21.0git`lldb_private::ModuleList::Append(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, notify=true) at ModuleList.cpp:251:3
    ...
    frame llvm#19: 0x00007f2f274349b3 liblldb.so.21.0git`lldb_private::Process::ConnectRemote(this=0x0000563785e1d360, remote_url=(Data = "connect://localhost:1234", Length = 24)) at Process.cpp:3250:9
    frame llvm#20: 0x00007f2f27411e0e liblldb.so.21.0git`lldb_private::Platform::DoConnectProcess(this=0x0000563785c59990, connect_url=(Data = "connect://localhost:1234", Length = 24), plugin_name=(Data = "gdb-remote", Length = 10), debugger=0x0000563785d83bc0, stream=0x00007ffe62be3128, target=0x0000563786bd5be0, error=0x00007ffe62be1ca0) at Platform.cpp:1926:23
```

## Test Plan:
Built a hello world a.out
Run server in one terminal:
```
~/llvm/build/Debug/bin/lldb-server g :1234 a.out
```
Run client in another terminal
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3"
```

Before:
Client hangs indefinitely
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b main"
(lldb) gdb-remote 1234

^C^C
```

After:
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3"
(lldb) gdb-remote 1234
Process 837068 stopped
* thread #1, name = 'a.out', stop reason = signal SIGSTOP
    frame #0: 0x00007ffff7fe4a60
ld-linux-x86-64.so.2`_start:
->  0x7ffff7fe4a60 <+0>: movq   %rsp, %rdi
    0x7ffff7fe4a63 <+3>: callq  0x7ffff7fe5780 ; _dl_start at rtld.c:522:1

ld-linux-x86-64.so.2`_dl_start_user:
    0x7ffff7fe4a68 <+0>: movq   %rax, %r12
    0x7ffff7fe4a6b <+3>: movl   0x18067(%rip), %eax ; _dl_skip_args
(lldb) b hello.cc:3
Breakpoint 1: where = a.out`main + 15 at hello.cc:4:13, address = 0x00005555555551bf
(lldb) c
Process 837068 resuming
Process 837068 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x00005555555551bf a.out`main at hello.cc:4:13
   1   	#include <iostream>
   2
   3   	int main() {
-> 4   	  std::cout << "Hello World" << std::endl;
   5   	  return 0;
   6   	}
```
JDPailleux pushed a commit that referenced this pull request Aug 14, 2025
## Problem

When the new setting

```
set target.parallel-module-load true
```
was added, lldb began fetching modules from the devices from multiple
threads simultaneously. This caused crashes of lldb when debugging on
android devices.

The top of the stack in the crash look something like this:
```
#0 0x0000555aaf2b27fe llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/llvm/bin/lldb-dap+0xb87fe)
 #1 0x0000555aaf2b0a99 llvm::sys::RunSignalHandlers() (/opt/llvm/bin/lldb-dap+0xb6a99)
 #2 0x0000555aaf2b2fda SignalHandler(int, siginfo_t*, void*) (/opt/llvm/bin/lldb-dap+0xb8fda)
 #3 0x00007f9c02444560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0
 #4 0x00007f9c04ea7707 lldb_private::ConnectionFileDescriptor::Disconnect(lldb_private::Status*) (usr/bin/../lib/liblldb.so.15+0x22a7707)
 #5 0x00007f9c04ea5b41 lldb_private::ConnectionFileDescriptor::~ConnectionFileDescriptor() (usr/bin/../lib/liblldb.so.15+0x22a5b41)
 #6 0x00007f9c04ea5c1e lldb_private::ConnectionFileDescriptor::~ConnectionFileDescriptor() (usr/bin/../lib/liblldb.so.15+0x22a5c1e)
 llvm#7 0x00007f9c052916ff lldb_private::platform_android::AdbClient::SyncService::Stat(lldb_private::FileSpec const&, unsigned int&, unsigned int&, unsigned int&) (usr/bin/../lib/liblldb.so.15+0x26916ff)
 llvm#8 0x00007f9c0528b9dc lldb_private::platform_android::PlatformAndroid::GetFile(lldb_private::FileSpec const&, lldb_private::FileSpec const&) (usr/bin/../lib/liblldb.so.15+0x268b9dc)
```
Our workaround was to set `set target.parallel-module-load ` to `false`
to avoid the crash.

## Background

PlatformAndroid creates two different classes with one stateful adb
connection shared between the two -- one through AdbClient and another
through AdbClient::SyncService. The connection management and state is
complex, and seems to be responsible for the segfault we are seeing. The
AdbClient code resets these connections at times, and re-establishes
connections if they are not active. Similarly, PlatformAndroid caches
its SyncService, which uses an AdbClient class, but the SyncService puts
its connection into a different 'sync' state that is incompatible with a
standard connection.

## Changes in this diff

* This diff refactors the code to (hopefully) have clearer ownership of
the connection, clearer separation of AdbClient and SyncService by
making a new class for clearer separations of concerns, called
AdbSyncService.
* New unit tests are added
* Additional logs were added (see
llvm#145382 (comment)
for details)
JDPailleux pushed a commit that referenced this pull request Aug 14, 2025
…namic (llvm#153420)

Canonicalizing the following IR:

```
func.func @mul_zero_dynamic_nofold(%arg0: tensor<?x17xf32>) -> tensor<?x17xf32> {
  %0 = "tosa.const"() <{values = dense<0.000000e+00> : tensor<1x1xf32>}> : () -> tensor<1x1xf32>
  %1 = "tosa.const"() <{values = dense<0> : tensor<1xi8>}> : () -> tensor<1xi8>
  %2 = tosa.mul %arg0, %0, %1 : (tensor<?x17xf32>, tensor<1x1xf32>, tensor<1xi8>) -> tensor<?x17xf32>
  return %2 : tensor<?x17xf32>
}
```

resulted in a crash

```
#0 0x000056513187e8db backtrace (./build-release/bin/mlir-opt+0x9d698db)                                                                                                                                                                                                                                                                                                                   
 #1 0x0000565131b17737 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:838:8                                                                                                                                                                                                                
 #2 0x0000565131b187f3 PrintStackTraceSignalHandler(void*) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:918:1                                                                                                                                                                                                                                
 #3 0x0000565131b18c30 llvm::sys::RunSignalHandlers() /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Signals.cpp:105:18                                                                                                                                                                                                                                         
 #4 0x0000565131b18c30 SignalHandler(int, siginfo_t*, void*) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:409:3                                                                                                                                                                                                                              
 #5 0x00007f2e4165b050 (/lib/x86_64-linux-gnu/libc.so.6+0x3c050)                                                                                                                                                                                                                                                                                                                            
 #6 0x00007f2e416a9eec __pthread_kill_implementation ./nptl/pthread_kill.c:44:76                                                                                                                                                                                                                                                                                                            
 llvm#7 0x00007f2e4165afb2 raise ./signal/../sysdeps/posix/raise.c:27:6                                                                                                                                                                                                                                                                                                                         
 llvm#8 0x00007f2e41645472 abort ./stdlib/abort.c:81:7                                                                                                                                                                                                                                                                                                                                          
 llvm#9 0x00007f2e41645395 _nl_load_domain ./intl/loadmsgcat.c:1177:9                                                                                                                                                                                                                                                                                                                           
llvm#10 0x00007f2e41653ec2 (/lib/x86_64-linux-gnu/libc.so.6+0x34ec2)                                                                                                                                                                                                                                                                                                                            
llvm#11 0x00005651443ec4ba mlir::DenseIntOrFPElementsAttr::getRaw(mlir::ShapedType, llvm::ArrayRef<char>) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/IR/BuiltinAttributes.cpp:1361:3                                                                                                                                                                                    
llvm#12 0x00005651443f1209 mlir::DenseElementsAttr::resizeSplat(mlir::ShapedType) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/IR/BuiltinAttributes.cpp:0:10                                                                                                                                                                                                              
llvm#13 0x000056513f76f2b6 mlir::tosa::MulOp::fold(mlir::tosa::MulOpGenericAdaptor<llvm::ArrayRef<mlir::Attribute>>) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/Dialect/Tosa/IR/TosaCanonicalizations.cpp:0:0
```

from the folder for `tosa::mul` since the zero value was being reshaped
to `?x17` size which isn't supported. AFAIK, `tosa.const` requires all
dimensions to be static. So in this case, the fix is to not to fold the
op.
JDPailleux pushed a commit that referenced this pull request Aug 20, 2025
This can happen when JIT code is run, and we can't symbolize those
frames, but they should remain numbered in the stack. An example
spidermonkey trace:

```
    #0 0x564ac90fb80f  (/builds/worker/dist/bin/js+0x240e80f) (BuildId: 5d053c76aad4cfbd08259f8832e7ac78bbeeab58)
    #1 0x564ac9223a64  (/builds/worker/dist/bin/js+0x2536a64) (BuildId: 5d053c76aad4cfbd08259f8832e7ac78bbeeab58)
    #2 0x564ac922316f  (/builds/worker/dist/bin/js+0x253616f) (BuildId: 5d053c76aad4cfbd08259f8832e7ac78bbeeab58)
    #3 0x564ac9eac032  (/builds/worker/dist/bin/js+0x31bf032) (BuildId: 5d053c76aad4cfbd08259f8832e7ac78bbeeab58)
    #4 0x0dec477ca22e  (<unknown module>)
```

Without this change, the following symbolization is output:

```
    #0 0x55a6d72f980f in MOZ_CrashSequence /builds/worker/workspace/obj-build/dist/include/mozilla/Assertions.h:248:3
    #1 0x55a6d72f980f in Crash(JSContext*, unsigned int, JS::Value*) /builds/worker/checkouts/gecko/js/src/shell/js.cpp:4223:5
    #2 0x55a6d7421a64 in CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), js::CallReason, JS::CallArgs const&) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:501:13
    #3 0x55a6d742116f in js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:597:12
    #4 0x55a6d80aa032 in js::jit::DoCallFallback(JSContext*, js::jit::BaselineFrame*, js::jit::ICFallbackStub*, unsigned int, JS::Value*, JS::MutableHandle<JS::Value>) /builds/worker/checkouts/gecko/js/src/jit/BaselineIC.cpp:1705:10
    #4 0x2c803bd8f22e  (<unknown module>)
```

The last frame has a duplicate number. With this change the numbering is
correct:

```
    #0 0x5620c58ec80f in MOZ_CrashSequence /builds/worker/workspace/obj-build/dist/include/mozilla/Assertions.h:248:3
    #1 0x5620c58ec80f in Crash(JSContext*, unsigned int, JS::Value*) /builds/worker/checkouts/gecko/js/src/shell/js.cpp:4223:5
    #2 0x5620c5a14a64 in CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), js::CallReason, JS::CallArgs const&) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:501:13
    #3 0x5620c5a1416f in js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:597:12
    #4 0x5620c669d032 in js::jit::DoCallFallback(JSContext*, js::jit::BaselineFrame*, js::jit::ICFallbackStub*, unsigned int, JS::Value*, JS::MutableHandle<JS::Value>) /builds/worker/checkouts/gecko/js/src/jit/BaselineIC.cpp:1705:10
    #5 0x349f24c7022e  (<unknown module>)
```
JDPailleux pushed a commit that referenced this pull request Sep 10, 2025
Reverts llvm#154949 due to suspected buildbot breakage
(https://lab.llvm.org/buildbot/#/builders/55/builds/16630/steps/11/logs/stdio).
Previously commented on the original pull request:
llvm#154949 (comment)

```
******************** TEST 'MLIR :: Dialect/XeGPU/subgroup-distribute.mlir' FAILED ********************
...
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
# | Stack dump:
# | 0.	Program arguments: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/mlir-opt -xegpu-subgroup-distribute -allow-unregistered-dialect -canonicalize -cse -split-input-file /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/test/Dialect/XeGPU/subgroup-distribute.mlir
# |  #0 0x0000c0af4b066df0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:834:13
# |  #1 0x0000c0af4b060e20 llvm::sys::RunSignalHandlers() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Signals.cpp:105:18
# |  #2 0x0000c0af4b0691b4 SignalHandler(int, siginfo_t*, void*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:426:38
# |  #3 0x0000ee25a3dcb8f8 (linux-vdso.so.1+0x8f8)
# |  #4 0x0000ee25a36c7608 (/lib/aarch64-linux-gnu/libc.so.6+0x87608)
# |  #5 0x0000ee25a367cb3c raise (/lib/aarch64-linux-gnu/libc.so.6+0x3cb3c)
# |  #6 0x0000ee25a3667e00 abort (/lib/aarch64-linux-gnu/libc.so.6+0x27e00)
# |  llvm#7 0x0000c0af4ae7e4b0 __sanitizer::Atexit(void (*)()) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp:168:10
# |  llvm#8 0x0000c0af4ae7c354 __sanitizer::Die() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:52:5
# |  llvm#9 0x0000c0af4ae66a30 Unlock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_mutex.h:250:16
# | llvm#10 0x0000c0af4ae66a30 ~GenericScopedLock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_mutex.h:386:51
# | llvm#11 0x0000c0af4ae66a30 __hwasan::ScopedReport::~ScopedReport() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:54:5
# | llvm#12 0x0000c0af4ae661b8 __hwasan::(anonymous namespace)::BaseReport::~BaseReport() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:477:7
# | llvm#13 0x0000c0af4ae63f5c __hwasan::ReportTagMismatch(__sanitizer::StackTrace*, unsigned long, unsigned long, bool, bool, unsigned long*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:1094:1
# | llvm#14 0x0000c0af4ae4f8e0 Destroy /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_common.h:532:31
# | llvm#15 0x0000c0af4ae4f8e0 ~InternalMmapVector /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_common.h:642:56
# | llvm#16 0x0000c0af4ae4f8e0 __hwasan::HandleTagMismatch(__hwasan::AccessInfo, unsigned long, unsigned long, void*, unsigned long*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan.cpp:245:1
# | llvm#17 0x0000c0af4ae51e8c __hwasan_tag_mismatch4 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan.cpp:764:1
# | llvm#18 0x0000c0af4ae67b30 __interception::InterceptFunction(char const*, unsigned long*, unsigned long, unsigned long) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/interception/interception_linux.cpp:60:0
# | llvm#19 0x0000c0af5641cd24 getNumResults /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/include/mlir/IR/Operation.h:404:37
# | llvm#20 0x0000c0af5641cd24 getOpResultImpl /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/include/mlir/IR/Operation.h:1010:5
# | llvm#21 0x0000c0af5641cd24 getResult /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/include/mlir/IR/Operation.h:407:54
# | llvm#22 0x0000c0af5641cd24 mlir::OpTrait::detail::MultiResultTraitBase<mlir::gpu::WarpExecuteOnLane0Op, mlir::OpTrait::VariadicResults>::getResult(unsigned int) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/include/mlir/IR/OpDefinition.h:638:62
# | llvm#23 0x0000c0af56426b60 getType /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/include/mlir/IR/Value.h:63:33
# | llvm#24 0x0000c0af56426b60 getType /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/include/mlir/IR/Value.h:105:39
# | llvm#25 0x0000c0af56426b60 (anonymous namespace)::LoadDistribution::matchAndRewrite(mlir::gpu::WarpExecuteOnLane0Op, mlir::PatternRewriter&) const /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/lib/Dialect/XeGPU/Transforms/XeGPUSubgroupDistribute.cpp:991:55
...
```
JDPailleux pushed a commit that referenced this pull request Sep 22, 2025
A few improvements to logging when lldb-dap is started in **Server
Mode** AND when the **`lldb-dap.logFolder`** setting is used (not
`lldb-dap.log-path`).

### Improvement #1
**Avoid the prompt of restarting the server when starting each debug
session.**

That prompt is caused by the combination of the following facts:
1. The log filename changes every time a new debug session is starting
(see
[here](https://github.com/llvm/llvm-project/blob/9d6062c490548a5e6fea103e010ab3c9bc73a86d/lldb/tools/lldb-dap/src-ts/logging.ts#L47))
2. The log filename is passed to the server via an environment variable
called "LLDBDAP_LOG" (see
[here](https://github.com/llvm/llvm-project/blob/9d6062c490548a5e6fea103e010ab3c9bc73a86d/lldb/tools/lldb-dap/src-ts/debug-adapter-factory.ts#L263-L269))
3. All environment variables are put into the "spawn info" variable (see
[here](https://github.com/llvm/llvm-project/blob/9d6062c490548a5e6fea103e010ab3c9bc73a86d/lldb/tools/lldb-dap/src-ts/lldb-dap-server.ts#L170-L172)).
4. The old and new "spawn info" are compared to decide if a prompt
should show (see
[here](https://github.com/llvm/llvm-project/blob/9d6062c490548a5e6fea103e010ab3c9bc73a86d/lldb/tools/lldb-dap/src-ts/lldb-dap-server.ts#L107-L110)).

The fix is to remove the "LLDBDAP_LOG" from the "spawn info" variable,
so that the same server can be reused if the log path is the only thing
that has changed.

### Improvement #2
**Avoid log file conflict when multiple users share a machine and start
server in the same second.**

The problem: If two users start lldb-dap server in the same second, they
will share the same log path. The first user will create the log file.
The second user will find that they cannot access the same file, so
their server will fail to start.

The fix is to add a part of the VS Code session ID to the log filename.

### Improvement #3
**Avoid restarting the server when the order of environment variables
changed.**

This is done by sorting the environment variables before putting them
into the "spawn info".
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants