-
-
Notifications
You must be signed in to change notification settings - Fork 9.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Find dtype in genfromtxt for single column of data #16908
Conversation
…, StringConverter was not given the right dtype. The bug caused version 1.19 to fail to load single column text files. However, a fix for the failure to load has already been provided in the master branch, so that the dtype is found. The fix proposed here is consistent with the 2D behaviour.
mtest = np.genfromtxt(TextIO(data), skip_header=1, delimiter=",", names=True, usecols=0) | ||
# Note that squeeze doesn't work when specifying names | ||
ctrl = np.array([(5,), (6,)], | ||
dtype=[('a_header', int)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without commenting on the nature of the regression, this passed for me locally with this minor adjustment:
ctrl = np.array([(5,), (6,)],
- dtype=[('a_header', int)])
+ dtype=[('a_header', float)])
assert_equal(mtest, ctrl)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sorry the correction you have applied is needed. Actually the test passes without the change that I have made. But at line 1971
converters = [StringConverter(dtype, locked=True
, parameter dtype
has the header name in it and therefore the StringConverter does not find the right type and the converter[0].type is void, whereas it should be float
. This only happens when you have a single column. While for multiple columns of data the dtype_flat
parameter is passed in, which has the header names removed.
The correction I have provided makes it consistent with the multiple column approach and means the data type is found within the StringConverter object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this test is incorrect: note that the default value for dtype
is float, so you should expect a float type here.
The dtype-inference-machinery is triggered with dtype=None
, which gives the expected result:
>>> np.genfromtxt(
... StringIO(data), skip_header=1, delimiter=",", names=True, usecols=0, dtype=None
... )
array([(5,), (6,)], dtype=[('a_header', '<i8')])
AFAICT the regression has been fixed and the current behavior is correct --- I think this can be closed (unless I'm missing something @millen1m )
Yes - the regression has been fixed in another way - can close. Cheers. |
BUG: Find dtype in genfromtxt for single column of data
When single column of data is given to genfromtxt and names=True,
StringConverter was not given the right dtype.
The bug caused version 1.19 to fail to load single column text files.
However, a fix for the failure to load has already been provided in
the master branch, so that the dtype is found.
The fix proposed here is consistent with the 2D behaviour.