PandasCursor doesn't automatically convert int columns with NA's to floats #60

austinlostinboston · 2019-01-16T17:23:54Z

I'm querying a large Athena table and can successfully run a query using the code below; however, it's really slow (for reasons covered in #46).

conn = pyathena.connect(**at.athena_creds)
df = pd.read_sql(sql, conn)

I would really like to take advantage of the performance boost that PandasCursor offers; however, when I run the code below, I get a ValueError.

from pyathena.pandas_cursor import PandasCursor

conn = pyathena.connect(**at.athena_creds, cursor_class=PandasCursor)
cursor = conn.cursor()
df = cursor.execute(sql).as_pandas()

>>> ValueError: Integer column has NA values in column 18

Now I understand why I'm getting this ValueError. I have an int column in my Athena table that contains NA values, which pandas notoriously doesn't handle well (NaN is a float in pandas' eyes, not an int). pd.read_sql() seems to handle this gracefully: it recognizes that an int column contains NaNs and converts it to a float column. It would be great if PyAthena did the same thing.
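The two behaviors described above can be reproduced with pandas alone. This is only an illustrative sketch, not PyAthena's code: it assumes the result is parsed from CSV text, and the CSV content and column names are made up for the example.

```python
import io

import pandas as pd

# Made-up CSV standing in for a query result with one missing integer value.
CSV_TEXT = "row_num,zipcode_plus_four\n276,2938\n277,\n279,4000\n"

# With no dtype given, pandas infers float64 and stores the gap as NaN.
inferred = pd.read_csv(io.StringIO(CSV_TEXT))
print(inferred["zipcode_plus_four"].dtype)

# Forcing a NumPy integer dtype leaves no way to represent the gap,
# so the parser raises a ValueError instead.
try:
    pd.read_csv(io.StringIO(CSV_TEXT), dtype={"zipcode_plus_four": "int64"})
    forced_error = None
except ValueError as exc:
    forced_error = str(exc)
print(forced_error)
```

The first read falls back to float64 because nothing constrains the dtype; the second fails because the integer dtype is imposed before the missing value is seen.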

laughingman7743 · 2019-01-17T14:16:49Z

I have a int column in my athena table which has NA values in it, which Pandas notoriously doesn't handle well (NaN's are floats in Pandas eyes, not ints).

I'm not sure what kind of data this is. Could you share some sample data?

austinlostinboston · 2019-01-17T14:58:24Z

Sure. The column that's giving me issues captures the 4 digits after a zip/postal code. The snippet below shows that this data is sometimes missing from that column. Pandas must use pd.to_numeric() (or a similar conversion) on columns like these. When I use pyathena without PandasCursor, I end up with this column being converted to float64, with NaNs where the data are missing.

row_num | zipcode_plus_four (int)
268 | 4005
269 | 1447
270 | 1447
271 | 1447
272 | 2938
273 | 2938
274 | 2938
275 | 2938
276 | 2938
277 |  
278 |  
279 | 4000
280 | 4000
281 | 4000
282 | 6183
283 | 6183
284 | 9702

mckeown12 · 2019-03-06T19:11:01Z

I'm experiencing this same issue. Is it possible to explicitly tell the PandasCursor to cast the column to floats?

laughingman7743 · 2019-03-07T12:07:49Z

Converting with CAST seems like a good approach.
https://prestodb.github.io/docs/0.172/functions/conversion.html
Do you have a good idea for how to implement this?

mckeown12 · 2019-03-10T14:09:42Z

Thanks @laughingman7743, I used cast(intColumnName AS double) in my select statement and it worked like a charm. Maybe it's cleanest to simply change the ValueError to suggest modifying the query in this way? Something like:

 ValueError: Integer column has NA values in column 18.
   Consider replacing `column18` with `cast(column18 AS double)` in your sql statement
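
As a rough sketch of the workaround, the casts can be generated rather than typed by hand. The helper below is hypothetical (it is not part of PyAthena), and the table and column names are placeholders. It only builds the SELECT text and does no identifier quoting or validation.

```python
def build_select(table, columns, nullable_int_columns):
    """Build a SELECT that casts the given integer columns to double.

    Hypothetical helper for illustration only: identifiers are inserted
    as-is, so use it only with trusted, already-valid names.
    """
    to_cast = set(nullable_int_columns)
    select_items = [
        "CAST({0} AS double) AS {0}".format(col) if col in to_cast else col
        for col in columns
    ]
    return "SELECT {} FROM {}".format(", ".join(select_items), table)


sql = build_select(
    "addresses",  # placeholder table name
    ["row_num", "zipcode_plus_four"],
    ["zipcode_plus_four"],
)
print(sql)
```

Aliasing each cast back to the original column name keeps the DataFrame's column labels unchanged, so downstream code does not need to know the cast happened.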

laughingman7743 · 2019-03-10T14:36:48Z

Thanks @mckeown12.
The error is raised by pandas itself, so I think it would be difficult to customize.
I think it would be nice to document this in the README.

Add about ValueError of integer column in Dataframe. (close #60)

daniel1608 · 2019-03-11T08:33:38Z

Couldn't PyAthena handle the cast for its users?
I have a lot of SQL statements that use double columns with NAs. Changing them is a) time-consuming and b) not very elegant.

laughingman7743 · 2019-03-11T08:42:56Z

Pull requests welcome!

xinluo-gogovan · 2019-04-04T09:00:47Z

Pandas 0.24+ has support for nullable ints, so I was able to keep my int columns as ints (rather than converting to double) by changing converter.py like so:

import pandas as pd

PANDAS_DTYPES = {
    'boolean': bool,
    'tinyint': pd.Int64Dtype(),
    'smallint': pd.Int64Dtype(),
    'integer': pd.Int64Dtype(),
    'bigint': pd.Int64Dtype(),
    'float': float,
    'real': float,
    'double': float,
    'char': str,
    'varchar': str,
    'array': str,
    'map': str,
    'row': str,
}

If you're willing to set the minimum requirements to pandas >=0.24, I think this fix would be cleaner than converting to double.
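
To illustrate what the nullable dtype buys, here is a small pandas-only sketch. It is not PyAthena's converter: the CSV text and column names are invented, and it assumes a pandas version recent enough to accept the "Int64" string alias in read_csv.

```python
import io

import pandas as pd

# Made-up CSV standing in for a query result with one missing integer value.
CSV_TEXT = "row_num,zipcode_plus_four\n276,2938\n277,\n279,4000\n"

# "Int64" (capital I) is the nullable integer dtype, unlike NumPy's "int64".
df = pd.read_csv(io.StringIO(CSV_TEXT), dtype={"zipcode_plus_four": "Int64"})

print(df["zipcode_plus_four"].dtype)
print(df["zipcode_plus_four"].isna().tolist())
```

The present values stay integers and the gap is stored as a missing marker, so no conversion to double is needed.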

laughingman7743 · 2019-04-04T14:37:53Z

@xinluo-gogovan Thank you for the information!
https://pandas.pydata.org/pandas-docs/stable/whatsnew/v0.24.0.html#optional-integer-na-support

laughingman7743 · 2019-04-05T15:24:08Z

https://travis-ci.org/laughingman7743/PyAthena/jobs/516226474

error details


=================================== FAILURES ===================================
_______________________ TestPandasCursor.test_arraysize ________________________
values = array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])
dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>, mask = None
copy = False

    def coerce_to_array(values, dtype, mask=None, copy=False):
        """
        Coerce the input values array to numpy arrays with a mask

        Parameters
        ----------
        values : 1D list-like
        dtype : integer dtype
        mask : boolean 1D array, optional
        copy : boolean, default False
            if True, copy the input

        Returns
        -------
        tuple of (values, mask)
        """
        # if values is integer numpy array, preserve it's dtype
        if dtype is None and hasattr(values, 'dtype'):
            if is_integer_dtype(values.dtype):
                dtype = values.dtype

        if dtype is not None:
            if (isinstance(dtype, string_types) and
                    (dtype.startswith("Int") or dtype.startswith("UInt"))):
                # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
                # https://github.com/numpy/numpy/pull/7476
                dtype = dtype.lower()

            if not issubclass(type(dtype), _IntegerDtype):
                try:
                    dtype = _dtypes[str(np.dtype(dtype))]
E                   KeyError: 'object'

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError

During handling of the above exception, another exception occurred:

self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_arraysize>

cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f195364ba90>

@with_pandas_cursor

def test_arraysize(self, cursor):

cursor.arraysize = 5

  cursor.execute('SELECT * FROM many_rows LIMIT 20')


tests/test_pandas_cursor.py:63:

pyathena/util.py:28: in _wrapper

return wrapped(*args, **kwargs)

pyathena/pandas_cursor.py:55: in execute

self._retry_config)

pyathena/result_set.py:335: in __init__

self._df = self._as_pandas()

pyathena/result_set.py:424: in _as_pandas

infer_datetime_format=True)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f

return _read(filepath_or_buffer, kwds)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read

data = parser.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read

ret = self._engine.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read

data = self._reader.read(nrows)

pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read

???

pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory

???

pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows

???

pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data

???

pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens

???

pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype

???

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings

return cls._from_sequence(scalars, dtype, copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence

return integer_array(scalars, dtype=dtype, copy=copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array

values, mask = coerce_to_array(values, dtype=dtype, copy=copy)

values = array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,

17, 18, 19])

dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>, mask = None

copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:


              raise ValueError("invalid dtype specified {}".format(dtype))


E                   ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError

_______________________ TestPandasCursor.test_as_pandas ________________________

values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>

mask = None, copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:


              dtype = _dtypes[str(np.dtype(dtype))]


E                   KeyError: 'object'

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError

During handling of the above exception, another exception occurred:

self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_as_pandas>

cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f195307f828>

@with_pandas_cursor

def test_as_pandas(self, cursor):

  df = cursor.execute('SELECT * FROM one_row').as_pandas()


tests/test_pandas_cursor.py:153:

pyathena/util.py:28: in _wrapper

return wrapped(*args, **kwargs)

pyathena/pandas_cursor.py:55: in execute

self._retry_config)

pyathena/result_set.py:335: in __init__

self._df = self._as_pandas()

pyathena/result_set.py:424: in _as_pandas

infer_datetime_format=True)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f

return _read(filepath_or_buffer, kwds)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read

data = parser.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read

ret = self._engine.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read

data = self._reader.read(nrows)

pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read

???

pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory

???

pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows

???

pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data

???

pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens

???

pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype

???

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings

return cls._from_sequence(scalars, dtype, copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence

return integer_array(scalars, dtype=dtype, copy=copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array

values, mask = coerce_to_array(values, dtype=dtype, copy=copy)

values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>

mask = None, copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:


              raise ValueError("invalid dtype specified {}".format(dtype))


E                   ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError

________________________ TestPandasCursor.test_complex _________________________

values = array([127]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>

mask = None, copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:


              dtype = _dtypes[str(np.dtype(dtype))]


E                   KeyError: 'object'

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError

During handling of the above exception, another exception occurred:

self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_complex>

cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f19533a56d8>

@with_pandas_cursor

def test_complex(self, cursor):

cursor.execute("""

SELECT

col_boolean

,col_tinyint

,col_smallint

,col_int

,col_bigint

,col_float

,col_double

,col_string

,col_timestamp

,CAST(col_timestamp AS time) AS col_time

,col_date

,col_binary

,col_array

,CAST(col_array AS json) AS col_array_json

,col_map

,CAST(col_map AS json) AS col_map_json

,col_struct

,col_decimal

FROM one_row_complex

  """)


tests/test_pandas_cursor.py:100:

pyathena/util.py:28: in _wrapper

return wrapped(*args, **kwargs)

pyathena/pandas_cursor.py:55: in execute

self._retry_config)

pyathena/result_set.py:335: in __init__

self._df = self._as_pandas()

pyathena/result_set.py:424: in _as_pandas

infer_datetime_format=True)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f

return _read(filepath_or_buffer, kwds)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read

data = parser.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read

ret = self._engine.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read

data = self._reader.read(nrows)

pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read

???

pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory

???

pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows

???

pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data

???

pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens

???

pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype

???

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings

return cls._from_sequence(scalars, dtype, copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence

return integer_array(scalars, dtype=dtype, copy=copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array

values, mask = coerce_to_array(values, dtype=dtype, copy=copy)

values = array([127]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>

mask = None, copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:


              raise ValueError("invalid dtype specified {}".format(dtype))


E                   ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError

___________________ TestPandasCursor.test_complex_as_pandas ____________________

values = array([127]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>

mask = None, copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:


              dtype = _dtypes[str(np.dtype(dtype))]


E                   KeyError: 'object'

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError

During handling of the above exception, another exception occurred:

self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_complex_as_pandas>

cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f1953462390>

@with_pandas_cursor

def test_complex_as_pandas(self, cursor):

df = cursor.execute("""

SELECT

col_boolean

,col_tinyint

,col_smallint

,col_int

,col_bigint

,col_float

,col_double

,col_string

,col_timestamp

,CAST(col_timestamp AS time) AS col_time

,col_date

,col_binary

,col_array

,CAST(col_array AS json) AS col_array_json

,col_map

,CAST(col_map AS json) AS col_map_json

,col_struct

,col_decimal

FROM one_row_complex

  """).as_pandas()


tests/test_pandas_cursor.py:200:

pyathena/util.py:28: in _wrapper

return wrapped(*args, **kwargs)

pyathena/pandas_cursor.py:55: in execute

self._retry_config)

pyathena/result_set.py:335: in __init__

self._df = self._as_pandas()

pyathena/result_set.py:424: in _as_pandas

infer_datetime_format=True)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f

return _read(filepath_or_buffer, kwds)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read

data = parser.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read

ret = self._engine.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read

data = self._reader.read(nrows)

pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read

???

pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory

???

pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows

???

pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data

???

pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens

???

pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype

???

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings

return cls._from_sequence(scalars, dtype, copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence

return integer_array(scalars, dtype=dtype, copy=copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array

values, mask = coerce_to_array(values, dtype=dtype, copy=copy)

values = array([127]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>

mask = None, copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:


              raise ValueError("invalid dtype specified {}".format(dtype))


E                   ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError

________________________ TestPandasCursor.test_fetchall ________________________

values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>

mask = None, copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:


              dtype = _dtypes[str(np.dtype(dtype))]


E                   KeyError: 'object'

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError

During handling of the above exception, another exception occurred:

self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_fetchall>

cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f1959b72470>

@with_pandas_cursor

def test_fetchall(self, cursor):

  cursor.execute('SELECT * FROM one_row')


tests/test_pandas_cursor.py:49:

pyathena/util.py:28: in _wrapper

return wrapped(*args, **kwargs)

pyathena/pandas_cursor.py:55: in execute

self._retry_config)

pyathena/result_set.py:335: in __init__

self._df = self._as_pandas()

pyathena/result_set.py:424: in _as_pandas

infer_datetime_format=True)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f

return _read(filepath_or_buffer, kwds)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read

data = parser.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read

ret = self._engine.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read

data = self._reader.read(nrows)

pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read

???

pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory

???

pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows

???

pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data

???

pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens

???

pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype

???

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings

return cls._from_sequence(scalars, dtype, copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence

return integer_array(scalars, dtype=dtype, copy=copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array

values, mask = coerce_to_array(values, dtype=dtype, copy=copy)

values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>

mask = None, copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:


              raise ValueError("invalid dtype specified {}".format(dtype))


E                   ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError

_______________________ TestPandasCursor.test_fetchmany ________________________

values = array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>, mask = None

copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:


              dtype = _dtypes[str(np.dtype(dtype))]


E                   KeyError: 'object'

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError

During handling of the above exception, another exception occurred:

self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_fetchmany>

cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f19529e95c0>

@with_pandas_cursor

def test_fetchmany(self, cursor):

  cursor.execute('SELECT * FROM many_rows LIMIT 15')


tests/test_pandas_cursor.py:43:

pyathena/util.py:28: in _wrapper

return wrapped(*args, **kwargs)

pyathena/pandas_cursor.py:55: in execute

self._retry_config)

pyathena/result_set.py:335: in __init__

self._df = self._as_pandas()

pyathena/result_set.py:424: in _as_pandas

infer_datetime_format=True)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f

return _read(filepath_or_buffer, kwds)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read

data = parser.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read

ret = self._engine.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read

data = self._reader.read(nrows)

pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read

???

pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory

???

pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows

???

pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data

???

pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens

???

pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype

???

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings

return cls._from_sequence(scalars, dtype, copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence

return integer_array(scalars, dtype=dtype, copy=copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array

values, mask = coerce_to_array(values, dtype=dtype, copy=copy)

values = array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>, mask = None

copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:


              raise ValueError("invalid dtype specified {}".format(dtype))


E                   ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError

________________________ TestPandasCursor.test_fetchone ________________________

values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>

mask = None, copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:


              dtype = _dtypes[str(np.dtype(dtype))]


E                   KeyError: 'object'

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError

During handling of the above exception, another exception occurred:

self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_fetchone>

cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f1953095160>

@with_pandas_cursor

def test_fetchone(self, cursor):

  cursor.execute('SELECT * FROM one_row')


tests/test_pandas_cursor.py:35:

pyathena/util.py:28: in _wrapper

return wrapped(*args, **kwargs)

pyathena/pandas_cursor.py:55: in execute

self._retry_config)

pyathena/result_set.py:335: in init

self._df = self._as_pandas()

pyathena/result_set.py:424: in _as_pandas

infer_datetime_format=True)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f

return _read(filepath_or_buffer, kwds)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read

data = parser.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read

ret = self._engine.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read

data = self._reader.read(nrows)

pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read

???

pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory

???

pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows

???

pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data

???

pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens

???

pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype

???

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings

return cls._from_sequence(scalars, dtype, copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence

return integer_array(scalars, dtype=dtype, copy=copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array

values, mask = coerce_to_array(values, dtype=dtype, copy=copy)

values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>

mask = None, copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:


              raise ValueError("invalid dtype specified {}".format(dtype))


E                   ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError

________________________ TestPandasCursor.test_iterator ________________________

values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>

mask = None, copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:


              dtype = _dtypes[str(np.dtype(dtype))]


E                   KeyError: 'object'

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError

During handling of the above exception, another exception occurred:

self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_iterator>

cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f19536880b8>

@with_pandas_cursor

def test_iterator(self, cursor):

  cursor.execute('SELECT * FROM one_row')


tests/test_pandas_cursor.py:56:

pyathena/util.py:28: in _wrapper

return wrapped(*args, **kwargs)

pyathena/pandas_cursor.py:55: in execute

self._retry_config)

pyathena/result_set.py:335: in init

self._df = self._as_pandas()

pyathena/result_set.py:424: in _as_pandas

infer_datetime_format=True)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f

return _read(filepath_or_buffer, kwds)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read

data = parser.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read

ret = self._engine.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read

data = self._reader.read(nrows)

pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read

???

pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory

???

pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows

???

pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data

???

pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens

???

pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype

???

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings

return cls._from_sequence(scalars, dtype, copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence

return integer_array(scalars, dtype=dtype, copy=copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array

values, mask = coerce_to_array(values, dtype=dtype, copy=copy)

values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>

mask = None, copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:


              raise ValueError("invalid dtype specified {}".format(dtype))


E                   ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError

_____________________ TestPandasCursor.test_many_as_pandas _____________________

values = array([   0,    1,    2, ..., 9997, 9998, 9999])

dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>, mask = None

copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:


              dtype = _dtypes[str(np.dtype(dtype))]


E                   KeyError: 'object'

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError

During handling of the above exception, another exception occurred:

self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_many_as_pandas>

cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f19529e3b38>

@with_pandas_cursor

def test_many_as_pandas(self, cursor):

  df = cursor.execute('SELECT * FROM many_rows').as_pandas()


tests/test_pandas_cursor.py:171:

pyathena/util.py:28: in _wrapper

return wrapped(*args, **kwargs)

pyathena/pandas_cursor.py:55: in execute

self._retry_config)

pyathena/result_set.py:335: in init

self._df = self._as_pandas()

pyathena/result_set.py:424: in _as_pandas

infer_datetime_format=True)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f

return _read(filepath_or_buffer, kwds)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read

data = parser.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read

ret = self._engine.read(nrows)

.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read

data = self._reader.read(nrows)

pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read

???

pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory

???

pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows

???

pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data

???

pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens

???

pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype

???

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings

return cls._from_sequence(scalars, dtype, copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence

return integer_array(scalars, dtype=dtype, copy=copy)

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array

values, mask = coerce_to_array(values, dtype=dtype, copy=copy)

values = array([   0,    1,    2, ..., 9997, 9998, 9999])

dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>, mask = None

copy = False

def coerce_to_array(values, dtype, mask=None, copy=False):

"""

Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:


              raise ValueError("invalid dtype specified {}".format(dtype))


E                   ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>

.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError
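
Reading the tracebacks above, every failure reports `dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>`, i.e. the dtype class itself rather than an instance, and the `issubclass(type(dtype), _IntegerDtype)` check is where things go wrong. The following is only a minimal sketch of why that check behaves differently for a class and for an instance; the two classes are hypothetical stand-ins, not the real pandas classes.

```python
class _IntegerDtype:
    """Hypothetical stand-in for the private pandas base class in the traceback."""


class Int64Dtype(_IntegerDtype):
    """Hypothetical stand-in for pandas' Int64Dtype."""


def passes_dtype_check(dtype):
    # Same shape as the check shown in the traceback: it inspects the *type*
    # of the argument, so it only succeeds when it is given an instance.
    return issubclass(type(dtype), _IntegerDtype)


# The class object: type(Int64Dtype) is `type`, which is not an _IntegerDtype.
class_check = passes_dtype_check(Int64Dtype)

# An instance: type(Int64Dtype()) is Int64Dtype, which is an _IntegerDtype.
instance_check = passes_dtype_check(Int64Dtype())

print(class_check, instance_check)
```

When the check fails, the code in the traceback falls through to the `np.dtype(...)` lookup, which is where the `KeyError: 'object'` and the final `ValueError` come from.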

xinluo-gogovan · 2019-04-08T08:08:21Z

I'm not sure what those errors are about, since that branch seems to have a lot of refactoring going on. On master, though, I ran the tests with just my aforementioned change plus the following one, and they all passed:

    def _trunc_date(self, df):
        times = [d[0] for d in self.description if d[1] in ('time', 'time with time zone')]
        if times:
            df.loc[:, times] = df.loc[:, times].apply(lambda r: r.dt.time)
        return df

=================================================== warnings summary ===================================================
tests/test_async_cursor.py::TestAsyncCursor::test_arraysize
  /XXX/.local/share/virtualenvs/PyAthena-exb1nwsV/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/_collections.py:1: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
    from collections import Mapping, MutableMapping

tests/test_sqlalchemy_athena.py::TestSQLAlchemyAthena::test_reflect_select
  /XXX/.local/share/virtualenvs/PyAthena-exb1nwsV/lib/python3.7/site-packages/sqlalchemy/sql/sqltypes.py:639: SAWarning: Dialect awsathena+rest does *not* support Decimal objects natively, and SQLAlchemy must convert from floating point - rounding errors and other issues may occur. Please consider storing Decimal numbers as strings or integers on this platform for lossless storage.
    "storage." % (dialect.name, dialect.driver)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
======================================= 103 passed, 2 warnings in 179.49 seconds =======================================
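
For anyone who wants to try the `_trunc_date` idea outside of PyAthena, here is a rough standalone sketch. It assumes `description` is a list of `(column_name, type_name, ...)` tuples like a DB-API cursor description; the function name `trunc_time_columns`, the sample column names, and the sample values are all made up for illustration. It assigns each column back individually instead of using `df.loc[:, times] = ...`, which is a deliberate simplification and not the change described above.

```python
import datetime

import pandas as pd


def trunc_time_columns(df, description):
    """Replace datetime values with datetime.time in columns typed as time."""
    # Pick the columns whose declared type is a time type.
    time_cols = [d[0] for d in description
                 if d[1] in ('time', 'time with time zone')]
    for col in time_cols:
        # .dt.time drops the date part and leaves datetime.time objects.
        df[col] = df[col].dt.time
    return df


# Hypothetical description and data, for illustration only.
description = [('id', 'integer'), ('opened_at', 'time')]
df = pd.DataFrame({
    'id': [1, 2],
    'opened_at': pd.to_datetime(['2019-04-08 08:08:21',
                                 '2019-04-08 15:11:16']),
})

result = trunc_time_columns(df, description)
print(result['opened_at'].tolist())
```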

laughingman7743 · 2019-04-08T08:45:43Z

@xinluo-gogovan Thanks! I will investigate.

laughingman7743 · 2019-04-09T15:11:16Z

The tests pass when I run them in my local environment.
The error only occurs when they run on TravisCI. 🤔

Laughingman7743-no-MacBook-Air:PyAthena laughingman7743$ pipenv run pytest -k test_pandas_cursor
============================= test session starts ==============================
platform darwin -- Python 3.6.5, pytest-4.4.0, py-1.8.0, pluggy-0.9.0
rootdir: /Users/laughingman7743/github/PyAthena, inifile: setup.cfg
plugins: flake8-1.0.4, cov-2.6.1
collected 103 items / 86 deselected / 17 selected

tests/test_pandas_cursor.py .................                            [100%]

================= 17 passed, 86 deselected in 61.95 seconds ==================

laughingman7743 · 2019-04-14T05:48:31Z

pandas-dev/pandas#24326

You need to call the dtype, i.e. pass an instance rather than the class itself.
dat = integer_array(d, dtype=dtype())

All tests passed. 🎉
#80
This drops Python 3.4 support. It will still work with Python 3.4 unless you use PandasCursor.
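
To illustrate the difference between the default behaviour and the nullable integer dtype, here is a small sketch using `pd.read_csv` directly. The CSV text is made up (the column name follows the sample data earlier in this thread, the missing value is invented), and this is not the PyAthena implementation, only an example of passing a dtype instance, with the parentheses.

```python
import io

import pandas as pd

# Hypothetical CSV with one missing value in an integer column.
csv_text = "row_num,zipcode_plus_four\n268,4005\n269,\n270,1447\n"

# Default inference: the missing value forces the column to float64.
default_df = pd.read_csv(io.StringIO(csv_text))

# Nullable integer dtype: note that an *instance* is passed, Int64Dtype().
nullable_df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"zipcode_plus_four": pd.Int64Dtype()},
)

print(default_df["zipcode_plus_four"].dtype)
print(nullable_df["zipcode_plus_four"].dtype)
```

With the nullable dtype the present values stay integers and the missing entry is kept as a missing value instead of turning the whole column into floats.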


austinlostinboston changed the title ~~PandasCursor doesn't automatically convert int columns to floats~~ PandasCursor doesn't automatically convert int columns with NA's to floats Jan 16, 2019

laughingman7743 added a commit that referenced this issue Mar 10, 2019

Update README:

e596715

Add about ValueError of integer column in Dataframe. (close #60)

laughingman7743 mentioned this issue Mar 10, 2019

Update README #76

Closed

laughingman7743 closed this as completed in df8c51c Apr 14, 2019

laughingman7743 added a commit that referenced this issue Apr 14, 2019

Merge pull request #80 from laughingman7743/support_integer_na_value_…

7d952bf

…in_pandas_cursor Support integer NA values in PandasCursor (fix #60)

laughingman7743 mentioned this issue Oct 25, 2019

as_pandas() function from andasCursor automatically convert 'NA' to NaN #92

Closed
