BUG: Got error log like "** On entry to XXXXparameter number xxx had an illegal value" on RISCV platform #20423
Comments
What are the inputs to the lstsq routine, X and y? |
These values are identical on my x86 and RISC-V platforms. When I debug into the /usr/local/lib/python3.10/dist-packages/scipy/linalg/lapack.py routine in the code below,
the RISC-V platform returns a -7 error, but the x86 platform has no problem. |
Can you just run this on your platform?
import numpy as np
import scipy.linalg as la
X = np.zeros([90, 1], dtype=np.float64)
y = np.array([
-1220.63333333, -1168.63333333, -904.63333333, -618.63333333, -14.63333333, 1331.36666667, 2438.36666667,
4453.36666667, 3460.36666667, 1087.36666667, -966.63333333, -1391.63333333, -1305.63333333, -1210.63333333,
-1080.63333333, 795.36666667, 1195.36666667, 1919.36666667, 334.36666667, -1080.63333333, -1338.63333333,
-1444.63333333, -1421.63333333, -1276.63333333, -943.63333333, -456.63333333, 639.36666667, 1046.36666667,
-532.63333333, -1128.63333333, -1112.63333333, -1264.63333333, -1129.63333333, -758.63333333, 148.36666667,
1235.36666667, 1381.36666667, 629.36666667, -805.63333333, -1190.63333333, -1253.63333333, -1244.63333333,
-937.63333333, 133.36666667, 1821.36666667, 5231.36666667, 2764.36666667, -802.63333333, -1234.63333333,
-1016.63333333, -1131.63333333, -705.63333333, 104.36666667, 186.36666667, 761.36666667, -63.63333333,
-733.63333333, -1190.63333333, -1288.63333333, -1260.63333333, -1020.63333333, -753.63333333, 552.36666667,
1321.36666667, 2941.36666667, 1021.36666667, -1100.63333333, -1416.63333333, -1450.63333333, -1440.63333333,
-1430.63333333, -1301.63333333, -1112.63333333, -197.63333333, 2541.36666667, 2005.36666667, -902.63333333,
-1384.63333333, -1336.63333333, -1102.63333333, -731.63333333, -182.63333333, 1975.36666667, 5501.36666667,
4823.36666667, 2304.36666667, 346.36666667, -1144.63333333, -1107.63333333, -681.63333333])
la.lstsq(X, y) |
I also have the issue. Below is the error log:
** On entry to DGELSD parameter number 7 had an illegal value |
Yes, nice. That code should give you
as a result. So something is not right with your NumPy/SciPy installation, because -7 indicates that the array passed to the LAPACK function does not have the right shape through |
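For readers unfamiliar with the convention: LAPACK routines report a rejected argument as a negative `info`, where `info = -i` means the i-th argument was illegal. The sketch below decodes such a value against DGELSD's argument order from the reference LAPACK documentation. The helper name `describe_info` is hypothetical and is not part of SciPy or LAPACK.

```python
# Argument order of DGELSD as given in the reference LAPACK documentation.
DGELSD_ARGS = (
    "M", "N", "NRHS", "A", "LDA", "B", "LDB",
    "S", "RCOND", "RANK", "WORK", "LWORK", "IWORK", "INFO",
)


def describe_info(info, arg_names=DGELSD_ARGS):
    """Translate a LAPACK info code into a readable message (hypothetical helper)."""
    if info == 0:
        return "success"
    if info < 0:
        position = -info  # 1-based position of the rejected argument
        if position > len(arg_names):
            return f"argument {position} had an illegal value"
        return f"argument {position} ({arg_names[position - 1]}) had an illegal value"
    # Positive values are routine-specific; for DGELSD they signal that the
    # SVD iteration did not converge.
    return f"routine-specific failure (info={info})"


print(describe_info(-7))
```

For DGELSD, position 7 is `LDB`, the leading dimension of the right-hand-side array `B`, which LAPACK requires to be at least `max(1, M, N)`. That is consistent with the "not the right shape" reading above, but it does not by itself say where the bad value comes from.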
Any suggestion to correct my installation? |
I don't know anything about RISC versions. But, as an obvious and crude first attempt, I would uninstall numpy scipy scikit-learn and Sometimes things get mixed up when you try to use a locally installed package with a system-wide Python and/or virtual environments (which it seems you are not using). |
Is there anywhere I can force the program to pass the array to the LAPACK function through f2py? |
I will try uninstalling numpy, scipy, and scikit-learn, and then reinstalling them in order. |
If this persists for a clean reinstall (you don't need scikit-learn, just numpy and scipy), I'd
It's possible that the problem is indeed within the RISC-V OpenBLAS kernels. Switching to a different LAPACK library could help isolate the issue if so. See this page: https://scipy.github.io/devdocs/dev/contributor/debugging_linalg_issues.html#debugging-linalg-issues |
Using the GDB method, I can't step into routine(*args, **kwargs) in the code below. Is there a simple way to tell whether the error is in the f2py-generated C wrappers or in LAPACK itself?
|
One trick is to add a pair of breakpoints: a Python one + a C breakpoint:
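The snippet that accompanied this comment is not preserved here. As a minimal sketch of what such a pair might look like (not the commenter's original code), the script below sets a Python breakpoint just before the `lstsq` call so that a C breakpoint on `dgelsd_` can be added from gdb once the shared libraries are loaded. The file name `repro.py`, the function name `run_repro`, and the placeholder right-hand side are assumptions.

```python
# Assumed workflow, with this file saved as repro.py (hypothetical name) and a
# run_repro() call added at the bottom:
#   1. gdb --args python3 repro.py
#   2. (gdb) run
#   3. at the (Pdb) prompt, press Ctrl-C to drop back into gdb
#   4. (gdb) break dgelsd_
#   5. (gdb) continue      then, back at the (Pdb) prompt:  c
import numpy as np
import scipy.linalg as la


def run_repro():
    # Same shapes as the reproducer in this thread; the right-hand side is a
    # placeholder (assumption) instead of the original 90 values.
    X = np.zeros([90, 1], dtype=np.float64)
    y = np.ones(90, dtype=np.float64)

    # Python-side breakpoint: pauses in pdb after SciPy's extension modules
    # (and the LAPACK library they link to) have already been loaded.
    breakpoint()

    x, residues, rank, singular_values = la.lstsq(X, y)
    return x
```

Pausing in Python first matters because a C breakpoint on `dgelsd_` can only be resolved once the library that defines it is loaded, which may not be the case when gdb starts.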
|
Thanks for the help. Now I can stop at the dgelsd_ entry in the .so. Does the message below give any clue?
For help, type "help".
Thread 1 "python3" hit Breakpoint 1, 0x0000003ff669e990 in dgelsd_ () from /usr/lib/riscv64-linux-gnu/openblas-pthread/libopenblas.so.0 |
My debug script is:
blas_dep = scipy.show_config(mode='dicts')['Build Dependencies']['blas']
X = np.zeros([90, 1], dtype=np.float64)
la.lstsq(X, y) |
Step through the python code of lstsq, find which python line fails. From it, find the corresponding C symbol and add a gdb breakpoint on it. It might be that the failure is at the lwork call, so the C symbol may be |
Per my debugging, it gets the dgelsd_lwork function attribute from _flapack.cpython-310-riscv64-linux-gnu.so, then falls into dgelsd_lwork and reports the error "** On entry to DGELSD parameter number 7 had an illegal value". I don't know what happened.
Also, if I set a breakpoint with "b dgelsd_lwork", it is never triggered. Below is the .so info about dgelsd_lwork.
So can it be an f2py-generated C wrapper problem? Where can I get the source code of _flapack.cpython-310-riscv64-linux-gnu.so? Thank you so much @ev-br |
Close |
Probably best to leave this open if the problem is still unresolved. Maintainers are generally busy and don't always have the time to reply quickly to every issue :) |
The point is, RISC-V is currently supported on a best-effort basis. Meaning, we are happy to accept patches which do not break our supported set of platforms (e.g., do not break our CI). We are also happy to offer advice and otherwise discuss problems which arise for "non-standard" platforms like RISC-V. But at the end of the day somebody needs to provide a patch, to either SciPy or OpenBLAS, depending on where the problem is. ATM it's not even clear this is a SciPy issue. |
Describe your issue.
Hi, I am running a pmdarima example (https://github.com/alkaline-ml/pmdarima/blob/master/examples/arima/example_auto_arima.py) on my RISC-V platform. It has a call tree like the one below:
But this example runs well on an x86 Linux platform with the same package versions. When I debug into the script, I find that it finally calls the _compute_lwork function in lapack.py, enters the code below, and gets a return value of -7.
Here I print out the values of *args. They are:
90 1 1 2.220446049250313e-16
It looks like it calls functions such as DGELSD from _flapack.cpython-310-riscv64-linux-gnu.so and then gets the error. The same steps work well on the x86 Linux platform.
Can anyone indicate how I can continue debugging such an "On entry to xxxxx parameter number xx had an illegal value" error?
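One way to narrow this down, sketched here under the assumption that `scipy.linalg.lapack.dgelsd_lwork` is the wrapper `_compute_lwork` ends up calling (as the comments in this thread suggest), is to issue the workspace query directly with the four values printed above, outside of pmdarima and `lstsq`. The function name `query_dgelsd_workspace` is hypothetical.

```python
from scipy.linalg import lapack


def query_dgelsd_workspace(m, n, nrhs, cond):
    """Run only the DGELSD workspace-size query and return its raw outputs."""
    # dgelsd_lwork returns the suggested sizes of the WORK and IWORK arrays
    # together with LAPACK's info code (0 means the query was accepted).
    work, iwork, info = lapack.dgelsd_lwork(m, n, nrhs, cond)
    return work, iwork, info


# The values printed from *args in this report: m, n, nrhs, cond.
work, iwork, info = query_dgelsd_workspace(90, 1, 1, 2.220446049250313e-16)
print("work =", work, "iwork =", iwork, "info =", info)
```

If this call alone already prints a nonzero `info` on the affected machine, the failure is reproducible without any user data, which makes it easier to report to either SciPy or OpenBLAS.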
Reproducing Code Example
Error message
SciPy/NumPy/Python version and system information