The functions scipy.sparse.linalg.eigsh and scipy.sparse.linalg.eigs raise the misleading error message
scipy.sparse.linalg.eigen.arpack.arpack.ArpackError: ARPACK error -4: The maximum number of Arnoldi update iterations allowed must be greater than zero.
if the parameter maxiter exceeds the maximum value a 32-bit integer can hold (see example below).
The use of 32-bit integers is probably enforced by the ARPACK interface and the parameter maxiter is apparently passed without checking for overflow. An overflowing integer then appears to ARPACK as a negative value, thereby provoking the error quoted above.
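The wrap-around can be illustrated with a small sketch (the cast below only simulates what presumably happens at the Fortran interface; the value of `n` is the one from the application described further down):

```python
import numpy as np

n = 300_540_195                 # problem dimension from the application below
maxiter = n * 10                # the current default choice, 3_005_401_950

# Simulate the narrowing cast to a 32-bit integer that presumably happens
# when maxiter is handed to the Fortran ARPACK routines:
wrapped = np.array(maxiter, dtype=np.int64).astype(np.int32)
print(int(wrapped))             # a negative value, triggering ARPACK error -4
```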
In my application, I am using scipy.sparse.linalg.eigsh to find the quantum mechanical ground state of a Hamiltonian represented as an instance of scipy.sparse.linalg.LinearOperator. The default value of the option maxiter is n * 10, where n is the dimension of the linear operator. Now, for large systems, the default value of maxiter can easily exceed the maximum value a 32-bit integer can hold. The above error message is then very confusing, especially if the parameter maxiter is not even set explicitly.
As a minimum fix, I suggest checking the variable maxiter for overflow and, if it overflows, raising an error with a meaningful message. This could be realised by adding the following conditional statement to the __init__ method of the class _ArpackParams in the file arpack.py:
```python
if maxiter <= 0:  # this check is already done
    raise ValueError("maxiter must be positive, maxiter=%d" % maxiter)
elif maxiter > np.iinfo(np.int32).max:  # this check should be added
    raise ValueError("maxiter must not exceed the maximum value a 32-bit "
                     "integer can hold, maxiter=%d" % maxiter)
```
In addition, it would be good to choose the default value of maxiter more carefully, respecting the limit set by the use of 32-bit integers. I'm not sure, though, whether it is easy to come up with a better general heuristic for a reasonable cutoff value.
For a very large system, choosing the default value as a multiple of the dimension of the linear operator, as is currently the case, practically amounts to setting no cutoff at all (since the computation will take forever if it doesn't converge after a reasonable number of iterations). One solution would therefore be to use maxiter=np.iinfo(np.int32).max as the default value and leave it to the user to come up with a reasonable cutoff depending on the respective use case. Or, in order to be more in line with the current behaviour, use maxiter=min(n * 10, np.iinfo(np.int32).max) as the default value.
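The second variant could be sketched as follows (default_maxiter is a hypothetical helper name used here for illustration, not existing SciPy code):

```python
import numpy as np

def default_maxiter(n):
    # Hypothetical helper illustrating the proposed default:
    # keep the current n * 10 rule, but cap it at the int32 limit
    # so the value always fits the ARPACK interface.
    return min(n * 10, np.iinfo(np.int32).max)

print(default_maxiter(1_000))        # small problems are unaffected: 10000
print(default_maxiter(300_540_195))  # large problems are capped at 2**31 - 1
```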
Error message:

```
Traceback (most recent call last):
  File "scipy_arpack_maxiter_overflow.py", line 9, in <module>
    evals, evecs = eigsh(identity, maxiter=maxiter)
  File "/usr/local/lib/python3.7/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 1687, in eigsh
    params.iterate()
  File "/usr/local/lib/python3.7/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 573, in iterate
    raise ArpackError(self.info, infodict=self.iterate_infodict)
scipy.sparse.linalg.eigen.arpack.arpack.ArpackError: ARPACK error -4: The maximum number of Arnoldi update iterations allowed must be greater than zero.
```
@rlucas7 In my application, n = 300540195 (corresponding to the Hilbert space dimension of a Bose-Hubbard system with 16 sites and 16 particles). This value is about 0.14 * np.iinfo(np.int32).max and, although not far from the limits of a 32-bit integer, it still fits. However, the default choice maxiter = n * 10 already overflows.
I support your proposal #11302 to move BLAS/LAPACK/ARPACK etc. to ILP64. This is particularly important for scientific computing, where state-of-the-art applications are touching the limits of 32-bit integers. In particular, the use of 64-bit integers in the ARPACK interface would eliminate the problem I encountered, since n * 10 would easily fit into a 64-bit integer for reasonable problem sizes n.
Nonetheless, in my opinion, one should in general always check for overflow when copying an arbitrary-size Python integer to some finite-size integer datatype.
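A generic version of such a check might look like this (checked_int32 is an illustrative helper, not an existing SciPy or NumPy function):

```python
import numpy as np

def checked_int32(value, name="value"):
    # Illustrative helper: validate the range before narrowing to int32,
    # instead of letting the value wrap around silently.
    info = np.iinfo(np.int32)
    if not (info.min <= value <= info.max):
        raise OverflowError(
            "%s=%d does not fit a 32-bit integer" % (name, value))
    return np.int32(value)

print(checked_int32(10**6))            # fine: 1000000
# checked_int32(300_540_195 * 10)      # would raise OverflowError
```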