BUG: numpy.loadtxt crash for texts of specific length and trailing whitespace #22833

horsti371 · 2022-12-19T15:18:35Z

Describe the issue:

Loading data using numpy.loadtxt crashes in some specific cases. E.g., loading a string like

"1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 11111 11111 11111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 111 111 111 111 111 111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 1 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11111 11 11 11111 11 11 11 11 11 22 33 11 "

fails on Windows and Ubuntu. This seems to depend on the length of the string and the trailing whitespace. When changing the end of the text above to

"... 22 33 1 "
or
"... 22 33 11"
or
"... 22 33 111"
or
"... 22 33 111 "

loading succeeds.
This happens with NumPy 1.23.0 and 1.24.0, but not with NumPy 1.22.0.

Reproduce the code example:

from io import StringIO
import numpy

text = "1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 11111 11111 11111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 111 111 111 111 111 111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 1 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11111 11 11 11111 11 11 11 11 11 11 11 11 "
string_io = StringIO(text)
numpy.loadtxt(string_io)

Error message:

On Windows, Python crashes without any error message when the string is loaded a second time. On Ubuntu, the following error message appears immediately:

python3: malloc.c:2617: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.
Aborted (core dumped)
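Because the failure aborts the whole interpreter, it can be convenient to run the reproduction in a child process and inspect its exit status. The following is only a sketch: `run_isolated` is a hypothetical helper name, and the short `text` in the snippet is a placeholder that has to be replaced by the full string from the report to actually trigger the crash.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=60):
    """Run a Python snippet in a child interpreter.

    Returns (returncode, stderr). A hard crash in the child shows up as a
    non-zero return code (negative on POSIX when the child is killed by a
    signal) instead of taking down the calling process.
    """
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stderr


# Placeholder reproduction: substitute the full string from the report.
REPRO_SNIPPET = """
from io import StringIO
import numpy

text = "1 2 3 "  # placeholder, not the failing input
numpy.loadtxt(StringIO(text))
"""

if __name__ == "__main__":
    returncode, stderr = run_isolated(REPRO_SNIPPET)
    print("child exit status:", returncode)
    if stderr:
        print(stderr)
```

With the failing input on an affected NumPy version, the child would be expected to exit with a non-zero status, and on Ubuntu the `sysmalloc` assertion text should appear in the captured stderr.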

NumPy/Python version information:

Ubuntu:
1.24.0 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]

Windows:
1.23.0 3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64 bit (AMD64)]

Context for the issue:

Can't load arbitrary text-based vectors.


If a row ends in a delimiter, `add_fields` can be called twice without any field actually being parsed. This causes issues with the field buffer setup. Basically, I tried to be too smart, and now it needs a small fixup... closes numpygh-22833


seberg · 2022-12-19T19:14:15Z

Thanks for the report @horsti371! Sorry about that bug, will be fixed in the 1.24.1 release. In case it helps (a bit): it can only be hit if the line ends with a delimiter.
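Until a fixed release is installed, one possible workaround for whitespace-delimited input follows from the observation above: strip trailing whitespace from every row so that no line ends with a delimiter. This is a minimal sketch, not part of the fix in #22836; `strip_trailing_whitespace` and `load_whitespace_delimited` are hypothetical helper names, and the approach assumes the default whitespace delimiter (with an explicit delimiter such as `,`, a trailing delimiter may be meaningful and should not be removed blindly).

```python
from io import StringIO


def strip_trailing_whitespace(text):
    """Return `text` with trailing whitespace removed from every line."""
    return "\n".join(line.rstrip() for line in text.splitlines())


def load_whitespace_delimited(text):
    """Load whitespace-delimited numbers after cleaning each row.

    Assumes the default delimiter of numpy.loadtxt (any whitespace), so
    removing trailing whitespace does not change the number of fields.
    """
    # Imported here so the cleaning helper above works without NumPy.
    import numpy

    return numpy.loadtxt(StringIO(strip_trailing_whitespace(text)))
```

For example, `strip_trailing_whitespace("... 22 33 11 ")` yields a string ending in `"22 33 11"`, which is one of the endings reported to load successfully.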


horsti371 added the 00 - Bug label Dec 19, 2022

seberg added this to the 1.24.1 release milestone Dec 19, 2022

seberg mentioned this issue Dec 19, 2022

BUG: Ensure correct behavior for rows ending in delimiter in loadtxt #22836

Merged

rossbar closed this as completed in #22836 Dec 19, 2022

charris mentioned this issue Dec 21, 2022

BUG: Ensure correct behavior for rows ending in delimiter in loadtxt #22847

Merged
