-
-
Notifications
You must be signed in to change notification settings - Fork 787
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory access out of bounds in Chrome #1384
Comments
I can confirm this behavior for 0.17.0a2. I'm getting quite a few crashes when trying to load pandas. It seems like the dev version consumes more memory than the previous version ? |
I've seen this a couple of times too, but not enough times to have any idea of a pattern for when it happens. If anyone can figure out a way to reproduce it, that would be super helpful. |
Also it would be useful to know whether it is browser dependent, so if anyone has seen a similar crash in firefox that'd be interesting info. |
Weirdly enough it happens less when I have dev tools open. I'll try to reproduce with a minimal setup/MRE. |
Thanks! |
+1 on open dev tool makes this happen less frequently. Another pattern I observe is that this seems to happen more frequently on Mac (chrome) than Windows (chrome). |
I can now say that this happens constantly when Chrome DEV tools is NOT open. The issue doesn't happen in Firefox. I'm able to reproduce with the following codepen: https://codepen.io/dmondev/pen/eYgdBmm This is as close as possible to how I'm using pyodide (in a worker, with custom namespaces). Also, the error goes away if I remove the imports import pandas as pd
import numpy as np |
@dmondev I can confirm the reproduction on my computer, thanks! |
Note that the problem goes away here if you import numpy before pandas. Have you tried that in your actual use case? Presumably the error just appears further down the road? |
Done. updated the codepen with output to the document itself. Yes, the error happens further down the road ...so dependency order issues ? |
It's confusing because pandas depends on numpy so by the time pandas is imported, numpy must have been imported too. Does the error always happen inside of |
It app
It appears so. Removed pandas as well. Issue seems to be with loading numpy. |
Additional note: Not happening in v0.17.0a1 |
I don't think that's it, since #887 is not included in 0.17.0a2.
@leopsidom 0.17.0a1 (or the dev version) should consume less. Doing memory snapshots in Firefox 87,
In what code did you see a memory increase? |
We might be able to git bisect it between 0.17.0a1 and 0.17.0a2 using CircleCI artifacts of commits on master, though those are only kept for a month I think (#648). Personally I am not able to reproduce with the provided codepen in Chrome 88 on Linux (with the dev tools closed). |
Can confirm both codepens also crash consistently with dev tools closed on MacOS. MacOS: Version 10.15.7 (19H524) |
@rth I don't think my assumption was correct now. I had the belief based on page crash info given by chrome, which I think is too general to identify the issue. But I double checked the memory on chrome. It was somewhere around 50 ~ 60MB IIRC after pandas is loaded, which should be far below the per tab memory limit (I believe it's at least GB level based on my research). So I guess the issue is due to something else. |
The memory limit might be lower in a webworker, though I have seen this a couple of times on the main thread. Also, I am no longer getting either codepen to reproduce the issue, I only ever got the original version to show the error. So it seems to be pretty inconsistent. |
I pasted the codepen twice, but it's just one. As I made a few changes to cleanup, just want to make sure we are all seeing the same thing: |
I'm seeing higher than that in Chrome Task Manager: Also, Chrome sometimes groups multiple tabs per process, which could possible push the memory up. Googling a bit about this issue, one common cause of "memory access out of bounds" is apparently an object being garbage-collected before use. "Intp" stands for "integer pointer", which I'm guessing numpy uses for things like allowing a slice of an array to point to the same memory location as the original array. Perhaps there is some over-eager garbage collection happening? EDIT: Possibly relevant: https://old.reddit.com/r/WebAssembly/comments/g44ldd/what_are_some_possible_causes_of_runtimeerror/fnw9kw2/ Since it happens when numpy is imported, there may be some platform tests that happen on load (e.g., "try creating such and such arrays, and depending on which ones fail, set the following parameters"). |
The specific function in which the error occurs is defined here: |
I should update the fatal error handling so that it prints the current Python stack frame if possible. |
Okay using
Links to numpy source code: |
So here's where in |
Looking at line 853 in numerictypes.py: https://github.com/numpy/numpy/blob/v1.15.4/numpy/core/numerictypes.py#L853
we see a call to "empty", which is imported from numpy.core.multiarray (compiled from C).
We see a use of PyArray_IntpConverter (in this case, it looks like it is called by this function PyArg_ParseTupleAndKeywords). Taking a look inside conversion_utils.c, at the function PyArray_IntpConverter: https://github.com/numpy/numpy/blob/v1.15.4/numpy/core/src/multiarray/conversion_utils.c#L131
Which finally gives us our call to PyArray_IntpFromIndexSequence: https://github.com/numpy/numpy/blob/v1.15.4/numpy/core/src/multiarray/conversion_utils.c#L902 The only relevant changes I could see from 0.17.0a1 to 0.17.0a2 in numpy itself are in make_int_return_values.patch: diff here I think this change should only affect calls to Fortran code, and in a cursory glance I don't see any of those in PyArray_IntpFromIndexSequence (but I haven't traced back every function call here, it's kind of hard to follow!). EDIT: I see hoodmane beat me to the punch! |
Yeah, |
I opened PR #1390 to dump stack on fatal error so this will be easier in the future. |
@joemarshall you have any ideas about what's going on? |
I'm still able to reproduce building this way. |
Well I built with debug symbols using: export EXTRA_CFLAGS="-g4"
export EXTRA_LD_FLAGS="-g4 --source-map-base=localhost:8000"
export PYODIDE_PACKAGES="numpy"
make Following @joemarshall's advice on #1102, I made a custom server to fix up the source map urls: import socket
import socketserver
from http.server import SimpleHTTPRequestHandler
import pathlib
alternative_bases=["cpython/build/Python-3.8.2/","src/", "build/"]
def fixup_url(path):
if pathlib.Path("." + path).exists():
return path
for base in alternative_bases:
q = pathlib.Path(base + path)
if q.exists():
return str(q)
dir = list(
pathlib.Path("packages/numpy/build/numpy-1.17.5/").glob("**" + path),
)
if dir:
return str(dir[0])
print("not found:", path)
return path
class MyTCPServer(socketserver.TCPServer):
def server_bind(self):
"""Use socket.SO_REUSEADDR to allow faster restart"""
self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
self.socket.bind(self.server_address)
class Handler(SimpleHTTPRequestHandler):
def end_headers(self):
# Enable Cross-Origin Resource Sharing (CORS)
self.send_header('Access-Control-Allow-Origin', '*')
super().end_headers()
def do_GET(self):
self.path = fixup_url(self.path)
super().do_GET()
if __name__ == '__main__':
port = 8000
with MyTCPServer(("", port), Handler) as httpd:
print("Serving at: http://127.0.0.1:{}".format(port))
httpd.serve_forever() Using this, the source map identified that the fatal error happens on line 960: vals[i] = PyArray_PyIntAsIntp(op);
Py_DECREF(op);
if(vals[i] == -1) { // error happened here |
Okay here's the WAT from I think this shows that this is a v8 bug. A pointer is stored into (func $PyArray_IntpFromIndexSequence (;1429;)
(export "PyArray_IntpFromIndexSequence")
(param $var0 i32) ;; seq
(param $var1 i32) ;; vals
(param $var2 i32) ;; maxvals, reused for op and err
(result i32)
(local $var3 i32) ;; i / loop variable
(local $var4 i32) ;; return value (so sometimes nd)
(local $var5 i32) ;; nd
(local $var6 i32) ;; min(nd, maxvals)
(local $var7 i32) ;; &vals[i]
(local $var8 i32) ;; used for op->obj_refcnt (in code from Py_DECREF)
block $label0 ;; wraps the whole body, we use jump to $label0 to return $var4
;; line 931
local.get $var0
call $PySequence_Size
local.tee $var5 ;; nd 5 = PySequence_Size(seq 0)
;; line 932
i32.const -1
i32.ne
if ;; no error occurred
;; line 952, computing min(nd, maxvals)
local.get $var5
local.get $var2
local.get $var2
local.get $var5
i32.gt_s
select
local.tee $var6 ;; set var6 = min(nd, maxvals)
i32.const 0
i32.gt_s
if ;; min(nd, maxvals) > 0, (otherwise loop skipped entirely)
i32.const -1
local.set $var4 ;; return value is -1
loop $label2
;; line 953
local.get $var0 ;; seq
local.get $var3 ;; i (loop variable)
call $PySequence_GetItem ;; GetItem(seq, i)
local.tee $var2 ;; op = GetItem(seq, i) (overwrite maxvals)
;; line 954
i32.eqz ;; if(op == NULL)
;; line 955
br_if $label0 ;; return -1 (in $var4)
;; line 958, start with LHS
local.get $var1 ;; vals
local.get $var3 ;; i
i32.const 2
i32.shl ;; i << 2
i32.add ;; &vals[i] == vals + 4*i
local.tee $var7 ;; v7 = &vals[i]
;; inlined PyArray_PyIntAsIntp here
local.get $var2 ;; op
;; fetch static string "an integer is required"
global.get $__memory_base
i32.const 9084
i32.add
call $PyArray_PyIntAsIntp_ErrMsg ;; PyArray_PyIntAsIntp_ErrMsg(o, "an integer is required")
i32.store ;; vals[i] = PyArray_PyIntAsIntp(op)
;; line 959, Py_DECREF macro
local.get $var2
local.get $var2
i32.load ;; op->ob_refcnt
i32.const 1
i32.sub ;;
local.tee $var8
i32.store ;; store back op->ob_refcnt -= 1
local.get $var8
i32.eqz
if ;; if new refcount is 0, dealloc op
local.get $var2
call $_Py_Dealloc
end ;; Py_DECREF is done now
block $label1
;; line 960 -----------------------------------------------------------------------------------
local.get $var7 ;; &vals[i]
i32.load ;; vals[i] ;; <================================== crashes here!
i32.const -1
i32.ne ;; vals[i] != -1
br_if $label1 ;; no error, continue loop
;; line 961 ------------------------------------------------------------------------------------
call $PyErr_Occurred
local.tee $var2 ;; store err into $var2
i32.eqz
br_if $label1 ;; no error, continue loop
;; Now we know there was an error, replace overflow error message
local.get $var2
global.get $PyExc_OverflowError
i32.load
call $PyErr_GivenExceptionMatches
i32.eqz
br_if $label0 ;; Not an overflow error, return -1 ($var4)
;; An overflow error, replace error message
;; lines 964 / 965
global.get $PyExc_ValueError
i32.load
;; Look up static string "Maximum allowed dimension exceeded"
global.get $__memory_base
local.tee $var2
i32.const 9107
i32.add
call $PyErr_SetString
i32.const -1
return
end $label1
;; Now we need to decide whether to loop again
;; line 952 again. increment i then test whether we are done
local.get $var3 ;; i / loop variable
i32.const 1
i32.add ;; i + 1
local.tee $var3 ;; store back
local.get $var6 ;; MIN(nd, maxvals)
i32.ne ;; i != MIN(nd, maxvals)
br_if $label2 ;; loop unless we're done (?)
end $label2
end
local.get $var5
local.set $var4 ;; move nd from $var5 into $var4 (now that we succeeded)
br $label0 ;; return nd (in $var4)
end
;; 933 (an error occurred in PySequenceLength(seq))
call $PyErr_Occurred
if
;; line 934
call $PyErr_Clear
end
;; line 937
;; inlined PyArray_PyIntAsIntp again
local.get $var1 ;; vals (used by i32.store for assign to vals[0])
;; fetch static string "an integer is required"
local.get $var0
global.get $__memory_base
i32.const 9084
i32.add
call $PyArray_PyIntAsIntp_ErrMsg
;; PyArray_PyIntAsIntp_ErrMsg(o, "an integer is required")
local.tee $var2 ;; override maxvals with result
i32.store ;; vals[0] = PyArray_PyIntAsIntp_ErrMsg(o, "an integer is required")
i32.const 1
;; line 949
local.set $var4 ;; nd = 1
;; line 938
local.get $var2 ;; vals[0]
i32.const -1
i32.ne ;; vals[0] != -1
br_if $label0 ;; if vals[0] != -1 return nd (in $var4)
;; line 939
call $PyErr_Occurred
local.tee $var2 ;; use $var2 as err
;; line 940/941
i32.eqz ;; err == 0
br_if $label0 ;; if(err == 0) return nd (in $var4)
;; Now there was an error.
i32.const -1
local.set $var4 ;; there was an error, we're going to return -1
local.get $var2
global.get $PyExc_OverflowError
i32.load
call $PyErr_GivenExceptionMatches ;; PyErr_GivenExceptionMatches(err, PyExc_OverflowError)
i32.eqz
br_if $label0 ;; not an overflow error, return -1 (in $var4)
;; An overflow error, replace error message
;; lines 942/943
global.get $PyExc_ValueError
i32.load
;; get static string "Maximum allowed dimension exceeded"
global.get $__memory_base
local.tee $var2
i32.const 9107
i32.add
call $PyErr_SetString
i32.const -1
return
end $label0
local.get $var4
) |
I guess I'll patch numpy a bit to remove the offending |
Okay I think I may have fixed it. |
If anyone is willing to build my branch in #1449 and import numpy, I'd appreciate verification that it works. It seems to fix the problem on my machine. We don't have any test coverage for this so it's a bit hard to be sure. |
Okay, so on my computer the version that I built locally doesn't crash but the version built on CI does crash. I wonder if it's because the CPython that I'm using locally was built with debug symbols? Very frustrating... Here's the link to the CI build: |
Update: generated WASM from the CI was identical to the WASM generated from the original code without my numpy patch. |
I've had similar problems (seemingly solved by deleting everything in the numpy build folder to force recompile). I'll check if your branch fixes it for me later tonight. |
Yup I've been doing that a lot, can never be too safe. The problem I was having though is that I was able to load the circle-ci build from console.html using a local override for console.html replacing the cdn url with the circle-ci one. When I did this it worked for me as expected. With the local override I was able to avoid the CORS proxy issue they talk about here: |
Thanks! |
Ran a few tests, seems to be working! |
Very pleased to hear the fix! |
Still have crash problem using Chrome 90.0.4430.72 in Windows with the latest master branch pyodide code. @hoodmane @jmsmdy (I've substituded the domain name)
|
Yes, this fatal error occurs even in the benchmarks, I haven't minimized the reproduction on this one yet would appreciate it if anyone can provide a simple way to produce it. |
Okay I can sometimes trigger this with
but it's a bit inconsistent. |
Crash happens on line 1177: it = mit->iters[j]; A bit perplexing... |
Okay I'm going to close this in favor of a clean issue, since this discussion is already quite long. (#1473) |
I'm getting the following intermittent error running the develop version in Chrome, running from a web worker.
Chrome Version 89.0.4389.90 (Official Build) (64-bit)
Windows 10 1909 build 18363.1379
pyodide.js:406 The cause of the fatal error was:
RuntimeError: memory access out of bounds
at PyArray_IntpFromIndexSequence (:wasm-function[991]:0x737a1)
at PyArray_IntpConverter (:wasm-function[990]:0x73702)
at byn$fpcast-emu$PyArray_IntpConverter (:wasm-function[3415]:0x15bfeb)
The text was updated successfully, but these errors were encountered: