Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[browser][mt] RangeError: Invalid atomic access index #96545

Closed
pavelsavara opened this issue Jan 5, 2024 · 8 comments
Closed

[browser][mt] RangeError: Invalid atomic access index #96545

pavelsavara opened this issue Jan 5, 2024 · 8 comments
Assignees
Labels
arch-wasm WebAssembly architecture area-VM-threading-mono blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' Known Build Error Use this to report build issues in the .NET Helix tab os-browser Browser variant of arch-wasm
Milestone

Comments

@pavelsavara
Copy link
Member

pavelsavara commented Jan 5, 2024

Error Blob

{
  "ErrorMessage": "RangeError: Invalid atomic access index",
  "BuildRetry": false,
  "ErrorPattern": "",
  "ExcludeConsoleLog": false
}

Reproduction Steps

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=513268

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=513268
Error message validated: RangeError: Invalid atomic access index
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 1/5/2024 12:40:40 PM UTC

Report

Build Definition Test Pull Request
569971 dotnet/runtime WasmTestOnBrowser-System.Transactions.Local.Tests.WorkItemExecution #98492
567613 dotnet/runtime WasmTestOnBrowser-System.Threading.Tasks.Dataflow.Tests.WorkItemExecution #98482
566720 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.Logging.Console.Tests.WorkItemExecution
565264 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.Configuration.Tests.WorkItemExecution #98199
565182 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.DependencyInjection.Tests.WorkItemExecution #98336
564782 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.Logging.Console.Tests.WorkItemExecution #98336
563571 dotnet/runtime WasmTestOnBrowser-System.Net.Http.Functional.Tests.WorkItemExecution #98128
563337 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.Logging.Console.Tests.WorkItemExecution
562985 dotnet/runtime WasmTestOnBrowser-System.IO.Hashing.Tests.WorkItemExecution #97213
562540 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.Logging.Console.Tests.WorkItemExecution #97508
560637 dotnet/runtime WasmTestOnBrowser-System.Threading.Tests.WorkItemExecution #98230
559787 dotnet/runtime WasmTestOnBrowser-System.Net.Http.Json.Functional.Tests.WorkItemExecution
559039 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.Logging.Console.Tests.WorkItemExecution
556974 dotnet/runtime WasmTestOnBrowser-System.Runtime.InteropServices.JavaScript.Tests.WorkItemExecution #95001
555167 dotnet/runtime WasmTestOnBrowser-System.Threading.Tasks.Parallel.Tests.WorkItemExecution #98027
555069 dotnet/runtime WasmTestOnBrowser-System.Runtime.InteropServices.JavaScript.Tests.WorkItemExecution #97021
549104 dotnet/runtime WasmTestOnBrowser-Microsoft.Extensions.HostFactoryResolver.Tests.WorkItemExecution #97560
549003 dotnet/runtime WasmTestOnBrowser-System.Runtime.InteropServices.JavaScript.Tests.WorkItemExecution #97806
548143 dotnet/runtime WasmTestOnBrowser-System.Runtime.InteropServices.JavaScript.Tests.WorkItemExecution #94764
540864 dotnet/runtime WasmTestOnBrowser-System.Net.Http.Json.Functional.Tests.WorkItemExecution #97441
536247 dotnet/runtime WasmTestOnBrowser-System.Threading.Tests.WorkItemExecution #97270

Summary

24-Hour Hit Count 7-Day Hit Count 1-Month Count
1 10 21
@pavelsavara pavelsavara added arch-wasm WebAssembly architecture blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' area-VM-threading-mono Known Build Error Use this to report build issues in the .NET Helix tab os-browser Browser variant of arch-wasm labels Jan 5, 2024
@pavelsavara pavelsavara added this to the 9.0.0 milestone Jan 5, 2024
@pavelsavara pavelsavara self-assigned this Jan 5, 2024
@ghost
Copy link

ghost commented Jan 5, 2024

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

Issue Details

Error Blob

{
  "ErrorMessage": "RangeError: Invalid atomic access index",
  "BuildRetry": false,
  "ErrorPattern": "",
  "ExcludeConsoleLog": false
}

Reproduction Steps

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=513268

Author: pavelsavara
Assignees: pavelsavara
Labels:

arch-wasm, blocking-clean-ci, area-VM-threading-mono, Known Build Error, os-browser

Milestone: 9.0.0

@pavelsavara
Copy link
Member Author

Log

[22:52:59] info: [2024-01-22T22:52:59.062Z] [PASS] System.Threading.Tests.MonitorTests.IsEntered_Invalid
[22:52:59] warn: MONO_WASM [0xdc19c-main]: Failed to find loaded WebWorker, this may deadlock. Please increase the pthreadPoolSize. Running threads 25. Loading workers: 8
[22:52:59] warn: worker sent an error! http://127.0.0.1:39799/_framework/dotnet.native.worker.js:1: Uncaught RangeError: Invalid atomic access index
[22:52:59] fail: MONO_WASM: Error
                     at Ze (http://127.0.0.1:39799/_framework/dotnet.js:3:19870)
                     at e (http://127.0.0.1:39799/_framework/dotnet.js:3:33924)
                     at http://127.0.0.1:39799/_framework/dotnet.js:3:34069
[22:52:59] info: WASM EXIT 1

@lambdageek
Copy link
Member

Not getting very far with trying to repro this locally. See #96191 (comment) for my methodology

@pavelsavara
Copy link
Member Author

Log
Build

[15:34:43] info: Starting:    System.Threading.Tasks.Parallel.Tests.dll
[15:34:46] fail: [worker-idle 15:34:46.377] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[15:34:46] fail: [worker-idle 15:34:46.377] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[15:34:46] fail: [worker-idle 15:34:46.378] MONO_WASM: Invalid atomic access index
                 RangeError: Invalid atomic access index
                     at Atomics.waitAsync (<anonymous>)
                     at Object.__emscripten_thread_mailbox_await (http://127.0.0.1:39079/_framework/dotnet.native.js:8:102771)
                     at handleMessage (http://127.0.0.1:39079/_framework/dotnet.native.worker.js:1:2103)
[15:34:46] fail: [0x000dc33c 15:34:46.381] MONO_WASM: Uncaught RangeError: Invalid atomic access index
                 Error
                     at Object.Ke [as mono_exit] (http://127.0.0.1:39079/_framework/dotnet.js:3:20623)
                     at ft.config.exitOnUnhandledError.e.onerror (http://127.0.0.1:39079/_framework/dotnet.runtime.js:3:37630)
[15:34:46] warn: [0x000dc33c 15:34:46.384] MONO_WASM: mono_wasm_exit failed: unwind
[15:34:46] fail: [out of order message from the browser]: http://127.0.0.1:39079/_framework/dotnet.native.js 7:1295 Uncaught ErrorEvent: Uncaught RangeError: Invalid atomic access index
[15:34:46] fail: [out of order message from the browser]: http://127.0.0.1:39079/_framework/dotnet.js 2 Uncaught (in promise) RangeError: Invalid atomic access index

@pavelsavara
Copy link
Member Author

Log
Build

[14:20:00] info: [STRT] System.Runtime.InteropServices.JavaScript.Tests.WebWorkerHttpTest.HttpClient_ContentInDifferentThread(executor1: ThreadPool, executor2: JSWebWorker)
[14:20:00] info: [0x04cde380-jsww 14:20:00.231] MONO_WASM: forceDisposeProxies done: 0 imports, 0 exports, 0 GCHandles, 0 JSHandles.
[14:20:00] fail: [0x05410000-long=>detached 14:20:00.337] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[14:20:00] fail: [0x05410000-long=>detached 14:20:00.337] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[14:20:00] fail: [0x05410000-long=>detached 14:20:00.337] MONO_WASM: Invalid atomic access index
                 RangeError: Invalid atomic access index
                     at Atomics.waitAsync (<anonymous>)
                     at Object.__emscripten_thread_mailbox_await (http://127.0.0.1:44739/_framework/dotnet.native.js:8:99729)
                     at handleMessage (http://127.0.0.1:44739/_framework/dotnet.native.worker.js:1:2103)
[14:20:00] info: [STRT] System.Runtime.InteropServices.JavaScript.Tests.WebWorkerHttpTest.HttpClient_ContentInDifferentThread(executor1: NewThread, executor2: Main)
[14:20:00] fail: [0x000dd05c 14:20:00.346] MONO_WASM: Uncaught RangeError: Invalid atomic access index
                 Error
                     at Object.Ke [as mono_exit] (http://127.0.0.1:44739/_framework/dotnet.js:3:20623)
                     at ft.config.exitOnUnhandledError.e.onerror (http://127.0.0.1:44739/_framework/dotnet.runtime.js:3:37630)
[14:20:00] fail: [out of order message from the browser]: http://127.0.0.1:44739/_framework/dotnet.native.js 7:1295 Uncaught ErrorEvent: Uncaught RangeError: Invalid atomic access index

@pavelsavara
Copy link
Member Author

In both last cases it's the setup of await for next SAB emscripten messsage.
It seems from the message prefix, that the thread is not attached to mono and possibly not even yet pthread.
So this could be during worker becoming pthread.

Simplified emscripten code

var pthread_ptr = _pthread_self();
var wait = Atomics.waitAsync(GROWABLE_HEAP_I32(), pthread_ptr >> 2, pthread_ptr);

The method is updating the memory view via GROWABLE_HEAP_I32(), so this is not memory growth problem.

The pthread_id in out console message prefix could be obsolete, from the last run of the worker as a pthread.

We are trying to await adress _pthread_self(). I wonder how that could become un-aligned ?
Especially with pthread_ptr >> 2 being there.

@pavelsavara
Copy link
Member Author

The memory allocation for the pthread is (latest version, not the one we have)

https://github.com/emscripten-core/emscripten/blob/21f25339df9c6aa8c288297de73236e6aaa79582/system/lib/pthread/pthread_create.c#L165-L172

There is some ASAN dance above it and I'm not sure if this is ASAN opt-out ?

@pavelsavara
Copy link
Member Author

Again, this seems to be thread which is not currently attached. Probably memory growth ?

08:44:17] info: [PASS][14] System.Transactions.Tests.NonMsdtcPromoterTests.PSPENonMSDTCWithDependentClones
[08:44:17] fail: [worker-idle 08:44:17.406] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[08:44:17] fail: [worker-idle 08:44:17.406] MONO_WASM: Unhandled error: RangeError: Invalid atomic access index
[08:44:17] fail: [worker-idle 08:44:17.407] MONO_WASM: Invalid atomic access index
                 RangeError: Invalid atomic access index
                     at Atomics.waitAsync (<anonymous>)
                     at Object.__emscripten_thread_mailbox_await (http://127.0.0.1:33189/_framework/dotnet.native.js:8:102771)
                     at handleMessage (http://127.0.0.1:33189/_framework/dotnet.native.worker.js:1:2103)
[08:44:17] info: [0x000dc93c--UI- 08:44:17.415] MONO_WASM: Dumping web worker info as seen by UI thread, it could be stale:
[08:44:17] info: 00 | 0x000dc93c--UI-: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - UI Thread
[08:44:17] info: 01 | 0x02f9b1f8-norm: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - Finalizer
[08:44:17] info: 02 | 0x032df290-pool: isRunning: true isAttached: true isEventLoop: true reuseCount:  1 - .NET TP Worker
[08:44:17] info: 03 | 0x035ed010-gate: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET TP Gate
[08:44:17] info: 04 | 0x0436da68-pool: isRunning: true isAttached: true isEventLoop: true reuseCount:  1 - .NET TP Worker
[08:44:17] info: 05 | 0x03fe2048-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 06 | 0x04a80030-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 07 | 0x04c802e0-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 08 | 0x04f90040-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 09 | 0x052a0040-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 10 | 0x056c0040-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 11 | 0x058c02f0-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 12 | 0x05bd0040-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  2 - .NET Long Running Task
[08:44:17] info: 13 | 0x061303b0-timr: isRunning: true isAttached: true isEventLoop:false reuseCount:  2 - .NET Timer
[08:44:17] info: 14 | 0x06627ea8-pool: isRunning: true isAttached: true isEventLoop: true reuseCount:  1 - .NET TP Worker
[08:44:17] info: 15 | 0x06db0000-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 16 | 0x07069bc8-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 17 | 0x07370040-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 18 | 0x07680040-long: isRunning: true isAttached: true isEventLoop:false reuseCount:  1 - .NET Long Running Task
[08:44:17] info: 19 |           -    : isRunning:false isAttached:false isEventLoop:false reuseCount:  0 - emscripten-pool
[08:44:17] info: 20 |           -    : isRunning:false isAttached:false isEventLoop:false reuseCount:  0 - emscripten-pool
[08:44:17] info: 21 | --06ba8030-emsc: isRunning:false isAttached:false isEventLoop:false reuseCount:  1 - unregistered:(.NET Long Running Task)
[08:44:17] info: 22 | --06ba8030-emsc: isRunning:false isAttached:false isEventLoop:false reuseCount:  1 - unregistered:(.NET Long Running Task)
[08:44:17] info: 23 |           -    : isRunning:false isAttached:false isEventLoop:false reuseCount:  0 - emscripten-pool
[08:44:17] info: 24 |           -    : isRunning:false isAttached:false isEventLoop:false reuseCount:  0 - emscripten-pool
[08:44:17] info: 25 |           -    : isRunning:false isAttached:false isEventLoop:false reuseCount:  0 - emscripten-pool
[08:44:17] info: 26 |           -    : isRunning:false isAttached:false isEventLoop:false reuseCount:  0 - emscripten-pool

pavelsavara added a commit to pavelsavara/runtime that referenced this issue Feb 17, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Mar 19, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-wasm WebAssembly architecture area-VM-threading-mono blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' Known Build Error Use this to report build issues in the .NET Helix tab os-browser Browser variant of arch-wasm
Projects
None yet
Development

No branches or pull requests

2 participants