BUG: numpy.loadtxt crash for texts of specific length and trailing whitespace #22833

Closed
horsti371 opened this issue Dec 19, 2022 · 1 comment · Fixed by #22836

horsti371 commented Dec 19, 2022

Describe the issue:

Loading data using numpy.loadtxt crashes in some specific cases. E.g., loading a string like



fails on Windows and Ubuntu. This seems to depend on the length of the string and the trailing whitespace. When changing the end of the text above to

"... 22 33 1 "
or
"... 22 33 11"
or
"... 22 33 111"
or
"... 22 33 111 "

loading succeeds.
This happens with NumPy 1.23.0 and 1.24.0, but not with NumPy 1.22.0.
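
To check whether an installed NumPy falls in the range mentioned here, a rough sketch follows. It assumes that every release from 1.23.0 up to (but not including) 1.24.1 is affected, which this thread does not confirm: only 1.23.0 and 1.24.0 were observed failing, and 1.24.1 is the release the fix is targeted at. The helper name `is_possibly_affected` is made up for illustration.

```python
import re

# Assumed bounds: the first version reported failing, and the release
# the fix is targeted at. Versions in between were not all tested.
FIRST_REPORTED_BAD = (1, 23, 0)
FIX_TARGET = (1, 24, 1)


def is_possibly_affected(version: str) -> bool:
    """Return True if `version` lies in the assumed range [1.23.0, 1.24.1)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parsed = tuple(int(part) for part in match.groups())
    return FIRST_REPORTED_BAD <= parsed < FIX_TARGET
```

With NumPy installed, the call would be `is_possibly_affected(numpy.__version__)`.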

Reproduce the code example:

from io import StringIO
import numpy

text = "..."  # placeholder: the failing input string from the report goes here
string_io = StringIO(text)
numpy.loadtxt(string_io)

Error message:

On Windows, Python crashes without any error message when the string is loaded a second time. On Ubuntu, the following error message appears immediately:

python3: malloc.c:2617: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.
Aborted (core dumped)

NumPy/Python version information:

Ubuntu:
1.24.0 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]

Windows:
1.23.0 3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64 bit (AMD64)]

Context for the issue:

Can't load arbitrary text-based vectors.

seberg added this to the 1.24.1 release milestone Dec 19, 2022
seberg added a commit to seberg/numpy that referenced this issue Dec 19, 2022
If a row ends in a delimiter, `add_fields` can be called twice without
any field actually being parsed.  This causes issues with the field
buffer setup.
Basically, I tried to be too smart, and now it needs a small fixup...

closes numpygh-22833
rossbar pushed a commit that referenced this issue Dec 19, 2022
…22836)

seberg (Member) commented Dec 19, 2022

Thanks for the report @horsti371! Sorry about that bug; it will be fixed in the 1.24.1 release. In case it helps (a bit): it can only be hit if the line ends with a delimiter.
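
Going by the remark above that the bug can only be hit when a line ends with a delimiter, one possible stopgap on affected versions is to strip trailing whitespace from each line before handing the text to `numpy.loadtxt`. This is only a sketch, not something proposed in this thread: it assumes the default whitespace delimiter, and the helper names `strip_trailing_whitespace` and `load_whitespace_delimited` are invented for illustration.

```python
from io import StringIO


def strip_trailing_whitespace(text: str) -> str:
    """Remove trailing whitespace from every line so no row ends in a delimiter."""
    return "".join(line.rstrip() + "\n" for line in text.splitlines())


def load_whitespace_delimited(text: str):
    """Load whitespace-delimited `text` with numpy.loadtxt after cleaning each line."""
    import numpy  # imported lazily so the helper above has no dependencies

    return numpy.loadtxt(StringIO(strip_trailing_whitespace(text)))
```

With an explicit delimiter such as `","`, a trailing delimiter may be meaningful, so this approach should not be carried over without checking the data. Upgrading to a release that contains the fix is the cleaner option.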

charris pushed a commit to charris/numpy that referenced this issue Dec 21, 2022
…umpy#22836)
