When building a DataFrame with specified column names and dtypes, one might expect one of two possible behaviours:
1. The column names and dtypes specs are perfectly cromulent, and Pandas goes on to build the object.
2. The column names or dtypes don't match the data shape, or the dtypes are badly specified, and Pandas gives an error message.
Instead, I have encountered a segmentation fault.
I am not sure whether my column names spec and dtypes are written correctly, or whether my data is valid (see the example below). Either way, it should not crash.
I have found that the script below always crashes on my machine. It fails in one of two ways:
First mode of failure: hanging
Python 2.7.5 (default, Sep 6 2013, 09:55:21)
[GCC 4.8.1 20130725 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import datetime as dt
>>> import itertools as it
>>>
>>> df_test = pd.DataFrame(data = list(it.repeat((dt.datetime(2001, 1, 1), "aa", 20), 9)),
... columns=["A", "B", "C"],
... dtype=[("A","datetime64[h]"), ("B","str"), ("C","int32")])
*** Error in `python': corrupted double-linked list: 0x0000000001bfd8e0 ***
After that line, the terminal is dead.
Second mode of failure: segfault
Note that in the line that creates the data, list(it.repeat((dt.datetime(2001, 1, 1), "aa", 20), 9)), the number of rows influences whether Python crashes. With fewer than 9 rows, there is the output:
This output doesn't make much sense to me, as it doesn't seem to respect the dtype spec that I give. It is very possible, though, that I don't understand the dtype spec well and that the output is actually perfectly sensible.
Specifying a dtype will try to coerce to that dtype, but it must be a singular (not compound) type. Issue #4464 may allow this at some point.
Works fine w/o specifying a dtype.
In [8]: df_test = pd.DataFrame(data = list(it.repeat((dt.datetime(2001, 1, 1), "aa", 20), 9)),
columns=["A", "B", "C"])
In [9]: df_test
Out[9]:
                     A   B   C
0  2001-01-01 00:00:00  aa  20
1  2001-01-01 00:00:00  aa  20
2  2001-01-01 00:00:00  aa  20
3  2001-01-01 00:00:00  aa  20
4  2001-01-01 00:00:00  aa  20
5  2001-01-01 00:00:00  aa  20
6  2001-01-01 00:00:00  aa  20
7  2001-01-01 00:00:00  aa  20
8  2001-01-01 00:00:00  aa  20
In [10]: df_test.dtypes
Out[10]:
A    datetime64[ns]
B            object
C             int64
dtype: object
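To illustrate the point about `dtype` having to be a single type, here is a minimal sketch with made-up, all-numeric data (the names `df_num`, `x` and `y` are hypothetical and not from the report). It assumes a reasonably recent pandas; the one `dtype` passed to the constructor is applied to every column.

```python
import pandas as pd

# Hypothetical all-numeric data: the single dtype below is applied to
# every column, which is why a per-column (compound) spec is not accepted.
df_num = pd.DataFrame([[1, 2], [3, 4]], columns=["x", "y"], dtype="float32")

# Both columns should report float32.
print(df_num.dtypes)
```

When columns need different types, the constructor's `dtype` argument is not the right tool; per-column casts after construction are.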
You can cast a specific column if you want (np here is numpy, imported as np):
In [11]: df_test['C'] = df_test['C'].astype(np.int32)
In [12]: df_test.dtypes
Out[12]:
A    datetime64[ns]
B            object
C             int32
dtype: object
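As a self-contained variant of the cast above, the sketch below builds the same data without a `dtype` argument and then casts only column C. It assumes a pandas version in which `DataFrame.astype` accepts a column-to-dtype mapping, which may not be true of the version used in this thread; `rows` and `df_cast` are names introduced only for illustration.

```python
import datetime as dt
import itertools as it

import pandas as pd

# Same data as in the report: 9 identical rows.
rows = list(it.repeat((dt.datetime(2001, 1, 1), "aa", 20), 9))

# Let pandas infer the dtypes instead of passing a compound dtype spec.
df_test = pd.DataFrame(rows, columns=["A", "B", "C"])

# Cast only column C; A and B keep whatever dtypes pandas inferred.
df_cast = df_test.astype({"C": "int32"})
print(df_cast.dtypes)
```

The mapping form keeps the per-column intent of the original spec in one place, without handing the constructor a compound dtype.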
datetime64[h] is not a valid dtype in pandas, nor is it necessary. str is converted to object and is not necessary either.