PandasCursor doesn't automatically convert int columns with NA's to floats #60

Closed
austinlostinboston opened this issue Jan 16, 2019 · 15 comments

Comments

@austinlostinboston

I'm querying a large Athena table and can successfully run a query using the code below; however, it's really slow (for reasons covered in #46).

conn = pyathena.connect(**at.athena_creds)
df = pd.read_sql(sql, conn)

I would really like to take advantage of the performance boost that PandasCursor offers. However, when I run the code below, I get a ValueError.

conn = pyathena.connect(**at.athena_creds, cursor_class=PandasCursor)
cursor = conn.cursor()
df = cursor.execute(sql).as_pandas()

>>> ValueError: Integer column has NA values in column 18

Now I understand why I'm getting this ValueError. I have an int column in my Athena table which has NA values in it, which pandas notoriously doesn't handle well (NaNs are floats in pandas' eyes, not ints). pd.read_sql() seems to handle this gracefully: it recognizes there is an int column with NaNs and converts it to a float column. It would be great if pyathena did the same thing.
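The behavior described above can be reproduced with pandas alone. This is a minimal sketch, not PyAthena's code: it assumes the error comes from reading a CSV with an explicit integer dtype, and the CSV text is a hypothetical stand-in for the sample rows in this thread.

```python
import io

import pandas as pd

# Hypothetical CSV mirroring the sample data in this thread:
# rows 277 and 278 have no zipcode_plus_four value.
csv_text = "row_num,zipcode_plus_four\n276,2938\n277,\n278,\n279,4000\n"

# Default inference: the column with gaps becomes float64, gaps become NaN.
inferred = pd.read_csv(io.StringIO(csv_text))
inferred_dtype = inferred["zipcode_plus_four"].dtype

# Forcing a plain integer dtype on the same column raises ValueError.
forced_error = None
try:
    pd.read_csv(io.StringIO(csv_text), dtype={"zipcode_plus_four": "int64"})
except ValueError as exc:
    forced_error = exc

print(inferred_dtype)
print(repr(forced_error))
```

Letting pandas infer the dtype trades integer semantics for the ability to hold NaN, which is why the plain connection returns float64 for this column.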

@austinlostinboston austinlostinboston changed the title PandasCursor doesn't automatically convert int columns to floats PandasCursor doesn't automatically convert int columns with NA's to floats Jan 16, 2019
@laughingman7743
Owner

I have an int column in my Athena table which has NA values in it, which pandas notoriously doesn't handle well (NaNs are floats in pandas' eyes, not ints).

I'm not sure what kind of data this is. Could you share some sample data?

@austinlostinboston
Author

Sure. The column that's giving me issues captures the 4 digits after a zip/postal code. The snippet below shows that this data is sometimes missing from that column. Pandas must use pd.to_numeric() (or a similar conversion) on columns like these. When I use pyathena without PandasCursor, this column ends up converted to float64, with NaNs where the data is missing.

row_num | zipcode_plus_four (int)
268 | 4005
269 | 1447
270 | 1447
271 | 1447
272 | 2938
273 | 2938
274 | 2938
275 | 2938
276 | 2938
277 |  
278 |  
279 | 4000
280 | 4000
281 | 4000
282 | 6183
283 | 6183
284 | 9702

@mckeown12

I'm experiencing this same issue. Is it possible to explicitly tell the PandasCursor to cast the column to floats?

@laughingman7743
Owner

Converting with CAST seems like a good approach.
https://prestodb.github.io/docs/0.172/functions/conversion.html
Do you have a good implementation idea?

@mckeown12

Thanks @laughingman7743, I used cast(intColumnName AS double) in my select statement and it worked like a charm. Maybe it's cleanest to simply change the ValueError to suggest modifying the query in this way? Something like:

 ValueError: Integer column has NA values in column 18.
   Consider replacing `column18` with `cast(column18 AS double)` in your sql statement
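The cast workaround, plus an optional conversion back to integers, might look roughly like this. It is only a sketch: the table name `my_table` and the DataFrame contents are hypothetical, the query is not executed here, and the `astype("Int64")` step assumes pandas 0.24 or later (nullable integers, mentioned later in this thread).

```python
import numpy as np
import pandas as pd

# Query text using the cast suggested above; "my_table" is a hypothetical name.
sql = (
    "SELECT row_num, CAST(zipcode_plus_four AS double) AS zipcode_plus_four "
    "FROM my_table"
)

# Hypothetical stand-in for the DataFrame such a query would return:
# the cast column arrives as float64 with NaN for missing values.
df = pd.DataFrame({"zipcode_plus_four": [2938.0, np.nan, np.nan, 4000.0]})

# Optional: convert back to a nullable integer column on the pandas side.
df["zipcode_plus_four"] = df["zipcode_plus_four"].astype("Int64")

print(df.dtypes)
```

The conversion only succeeds when every non-missing value is a whole number, which holds for values that started out as integers in the table.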

@laughingman7743
Owner

Thanks @mckeown12.
The error is raised by pandas itself when it builds the DataFrame, so I think it is difficult to customize.
I think it would be nice to document this in the README.

laughingman7743 added a commit that referenced this issue Mar 10, 2019
Add about ValueError of integer column in Dataframe. (close #60)
@daniel1608

Isn't it possible that PyAthena handles the cast for us users?
I have a lot of SQL statements that use double columns with NAs. Changing them is a) time consuming b) not very elegant.

@laughingman7743
Owner

Pull requests welcome!

@xinluo-gogovan

xinluo-gogovan commented Apr 4, 2019

Pandas 0.24+ has support for nullable ints, so I was able to keep my int columns as ints (rather than converting to double) by changing converter.py like so:

import pandas as pd

PANDAS_DTYPES = {
    'boolean': bool,
    'tinyint': pd.Int64Dtype(),
    'smallint': pd.Int64Dtype(),
    'integer': pd.Int64Dtype(),
    'bigint': pd.Int64Dtype(),
    'float': float,
    'real': float,
    'double': float,
    'char': str,
    'varchar': str,
    'array': str,
    'map': str,
    'row': str,
}

If you're willing to set the minimum requirements to pandas >=0.24, I think this fix would be cleaner than converting to double.
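To illustrate what the nullable integer dtype buys, here is a small pandas-only sketch. It assumes a recent pandas release and uses a hypothetical CSV in place of an Athena result; the CI log later in this thread shows a failure with nullable integer dtypes in the maintainer's test run at the time, so behavior may differ on older versions.

```python
import io

import pandas as pd

# Hypothetical CSV mirroring the sample data in this thread.
csv_text = "row_num,zipcode_plus_four\n276,2938\n277,\n278,\n279,4000\n"

# Read the column with gaps as a nullable integer instead of float64.
nullable = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"zipcode_plus_four": pd.Int64Dtype()},
)

# Non-missing values stay integers; missing values are kept as NA.
present = nullable["zipcode_plus_four"].dropna().tolist()

print(nullable.dtypes)
print(present)
```

Compared with casting to double, this keeps integer values exact and distinguishes "missing" from any real number.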

@laughingman7743
Owner

laughingman7743 commented Apr 5, 2019

https://travis-ci.org/laughingman7743/PyAthena/jobs/516226474

error details

=================================== FAILURES ===================================
_______________________ TestPandasCursor.test_arraysize ________________________
values = array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])
dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>, mask = None
copy = False
    def coerce_to_array(values, dtype, mask=None, copy=False):
        """
        Coerce the input values array to numpy arrays with a mask
    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
              dtype = _dtypes[str(np.dtype(dtype))]

E KeyError: 'object'
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError
During handling of the above exception, another exception occurred:
self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_arraysize>
cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f195364ba90>
@with_pandas_cursor
def test_arraysize(self, cursor):
cursor.arraysize = 5

  cursor.execute('SELECT * FROM many_rows LIMIT 20')

tests/test_pandas_cursor.py:63:


pyathena/util.py:28: in _wrapper
return wrapped(*args, **kwargs)
pyathena/pandas_cursor.py:55: in execute
self._retry_config)
pyathena/result_set.py:335: in __init__
self._df = self._as_pandas()
pyathena/result_set.py:424: in _as_pandas
infer_datetime_format=True)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f
return _read(filepath_or_buffer, kwds)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read
data = parser.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read
ret = self._engine.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read
data = self._reader.read(nrows)
pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read
???
pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory
???
pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows
???
pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data
???
pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens
???
pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype
???
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings
return cls._from_sequence(scalars, dtype, copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence
return integer_array(scalars, dtype=dtype, copy=copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array
values, mask = coerce_to_array(values, dtype=dtype, copy=copy)


values = array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19])
dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>, mask = None
copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:
              raise ValueError("invalid dtype specified {}".format(dtype))

E ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError
_______________________ TestPandasCursor.test_as_pandas ________________________
values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>
mask = None, copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
              dtype = _dtypes[str(np.dtype(dtype))]

E KeyError: 'object'
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError
During handling of the above exception, another exception occurred:
self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_as_pandas>
cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f195307f828>
@with_pandas_cursor
def test_as_pandas(self, cursor):

  df = cursor.execute('SELECT * FROM one_row').as_pandas()

tests/test_pandas_cursor.py:153:


pyathena/util.py:28: in _wrapper
return wrapped(*args, **kwargs)
pyathena/pandas_cursor.py:55: in execute
self._retry_config)
pyathena/result_set.py:335: in __init__
self._df = self._as_pandas()
pyathena/result_set.py:424: in _as_pandas
infer_datetime_format=True)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f
return _read(filepath_or_buffer, kwds)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read
data = parser.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read
ret = self._engine.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read
data = self._reader.read(nrows)
pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read
???
pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory
???
pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows
???
pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data
???
pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens
???
pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype
???
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings
return cls._from_sequence(scalars, dtype, copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence
return integer_array(scalars, dtype=dtype, copy=copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array
values, mask = coerce_to_array(values, dtype=dtype, copy=copy)


values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>
mask = None, copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:
              raise ValueError("invalid dtype specified {}".format(dtype))

E ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError
________________________ TestPandasCursor.test_complex _________________________
values = array([127]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>
mask = None, copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
              dtype = _dtypes[str(np.dtype(dtype))]

E KeyError: 'object'
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError
During handling of the above exception, another exception occurred:
self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_complex>
cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f19533a56d8>
@with_pandas_cursor
def test_complex(self, cursor):
cursor.execute("""
SELECT
col_boolean
,col_tinyint
,col_smallint
,col_int
,col_bigint
,col_float
,col_double
,col_string
,col_timestamp
,CAST(col_timestamp AS time) AS col_time
,col_date
,col_binary
,col_array
,CAST(col_array AS json) AS col_array_json
,col_map
,CAST(col_map AS json) AS col_map_json
,col_struct
,col_decimal
FROM one_row_complex

  """)

tests/test_pandas_cursor.py:100:


pyathena/util.py:28: in _wrapper
return wrapped(*args, **kwargs)
pyathena/pandas_cursor.py:55: in execute
self._retry_config)
pyathena/result_set.py:335: in __init__
self._df = self._as_pandas()
pyathena/result_set.py:424: in _as_pandas
infer_datetime_format=True)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f
return _read(filepath_or_buffer, kwds)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read
data = parser.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read
ret = self._engine.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read
data = self._reader.read(nrows)
pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read
???
pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory
???
pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows
???
pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data
???
pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens
???
pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype
???
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings
return cls._from_sequence(scalars, dtype, copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence
return integer_array(scalars, dtype=dtype, copy=copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array
values, mask = coerce_to_array(values, dtype=dtype, copy=copy)


values = array([127]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>
mask = None, copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:
              raise ValueError("invalid dtype specified {}".format(dtype))

E ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError
___________________ TestPandasCursor.test_complex_as_pandas ____________________
values = array([127]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>
mask = None, copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
              dtype = _dtypes[str(np.dtype(dtype))]

E KeyError: 'object'
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError
During handling of the above exception, another exception occurred:
self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_complex_as_pandas>
cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f1953462390>
@with_pandas_cursor
def test_complex_as_pandas(self, cursor):
df = cursor.execute("""
SELECT
col_boolean
,col_tinyint
,col_smallint
,col_int
,col_bigint
,col_float
,col_double
,col_string
,col_timestamp
,CAST(col_timestamp AS time) AS col_time
,col_date
,col_binary
,col_array
,CAST(col_array AS json) AS col_array_json
,col_map
,CAST(col_map AS json) AS col_map_json
,col_struct
,col_decimal
FROM one_row_complex

  """).as_pandas()

tests/test_pandas_cursor.py:200:


pyathena/util.py:28: in _wrapper
return wrapped(*args, **kwargs)
pyathena/pandas_cursor.py:55: in execute
self._retry_config)
pyathena/result_set.py:335: in __init__
self._df = self._as_pandas()
pyathena/result_set.py:424: in _as_pandas
infer_datetime_format=True)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f
return _read(filepath_or_buffer, kwds)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read
data = parser.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read
ret = self._engine.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read
data = self._reader.read(nrows)
pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read
???
pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory
???
pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows
???
pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data
???
pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens
???
pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype
???
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings
return cls._from_sequence(scalars, dtype, copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence
return integer_array(scalars, dtype=dtype, copy=copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array
values, mask = coerce_to_array(values, dtype=dtype, copy=copy)


values = array([127]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>
mask = None, copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:
              raise ValueError("invalid dtype specified {}".format(dtype))

E ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError
________________________ TestPandasCursor.test_fetchall ________________________
values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>
mask = None, copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
              dtype = _dtypes[str(np.dtype(dtype))]

E KeyError: 'object'
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError
During handling of the above exception, another exception occurred:
self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_fetchall>
cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f1959b72470>
@with_pandas_cursor
def test_fetchall(self, cursor):

  cursor.execute('SELECT * FROM one_row')

tests/test_pandas_cursor.py:49:


pyathena/util.py:28: in _wrapper
return wrapped(*args, **kwargs)
pyathena/pandas_cursor.py:55: in execute
self._retry_config)
pyathena/result_set.py:335: in __init__
self._df = self._as_pandas()
pyathena/result_set.py:424: in _as_pandas
infer_datetime_format=True)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f
return _read(filepath_or_buffer, kwds)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read
data = parser.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read
ret = self._engine.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read
data = self._reader.read(nrows)
pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read
???
pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory
???
pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows
???
pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data
???
pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens
???
pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype
???
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings
return cls._from_sequence(scalars, dtype, copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence
return integer_array(scalars, dtype=dtype, copy=copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array
values, mask = coerce_to_array(values, dtype=dtype, copy=copy)


values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>
mask = None, copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:
              raise ValueError("invalid dtype specified {}".format(dtype))

E ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError
_______________________ TestPandasCursor.test_fetchmany ________________________
values = array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>, mask = None
copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
              dtype = _dtypes[str(np.dtype(dtype))]

E KeyError: 'object'
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError
During handling of the above exception, another exception occurred:
self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_fetchmany>
cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f19529e95c0>
@with_pandas_cursor
def test_fetchmany(self, cursor):

  cursor.execute('SELECT * FROM many_rows LIMIT 15')

tests/test_pandas_cursor.py:43:


pyathena/util.py:28: in _wrapper
return wrapped(*args, **kwargs)
pyathena/pandas_cursor.py:55: in execute
self._retry_config)
pyathena/result_set.py:335: in __init__
self._df = self._as_pandas()
pyathena/result_set.py:424: in _as_pandas
infer_datetime_format=True)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f
return _read(filepath_or_buffer, kwds)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read
data = parser.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read
ret = self._engine.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read
data = self._reader.read(nrows)
pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read
???
pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory
???
pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows
???
pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data
???
pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens
???
pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype
???
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings
return cls._from_sequence(scalars, dtype, copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence
return integer_array(scalars, dtype=dtype, copy=copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array
values, mask = coerce_to_array(values, dtype=dtype, copy=copy)


values = array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>, mask = None
copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:
              raise ValueError("invalid dtype specified {}".format(dtype))

E ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError
________________________ TestPandasCursor.test_fetchone ________________________
values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>
mask = None, copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
              dtype = _dtypes[str(np.dtype(dtype))]

E KeyError: 'object'
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError
During handling of the above exception, another exception occurred:
self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_fetchone>
cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f1953095160>
@with_pandas_cursor
def test_fetchone(self, cursor):

  cursor.execute('SELECT * FROM one_row')

tests/test_pandas_cursor.py:35:


pyathena/util.py:28: in _wrapper
return wrapped(*args, **kwargs)
pyathena/pandas_cursor.py:55: in execute
self._retry_config)
pyathena/result_set.py:335: in init
self._df = self._as_pandas()
pyathena/result_set.py:424: in _as_pandas
infer_datetime_format=True)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f
return _read(filepath_or_buffer, kwds)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read
data = parser.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read
ret = self._engine.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read
data = self._reader.read(nrows)
pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read
???
pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory
???
pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows
???
pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data
???
pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens
???
pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype
???
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings
return cls._from_sequence(scalars, dtype, copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence
return integer_array(scalars, dtype=dtype, copy=copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array
values, mask = coerce_to_array(values, dtype=dtype, copy=copy)


values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>
mask = None, copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:
              raise ValueError("invalid dtype specified {}".format(dtype))

E ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError
________________________ TestPandasCursor.test_iterator ________________________
values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>
mask = None, copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
              dtype = _dtypes[str(np.dtype(dtype))]

E KeyError: 'object'
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError
During handling of the above exception, another exception occurred:
self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_iterator>
cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f19536880b8>
@with_pandas_cursor
def test_iterator(self, cursor):

  cursor.execute('SELECT * FROM one_row')

tests/test_pandas_cursor.py:56:


pyathena/util.py:28: in _wrapper
return wrapped(*args, **kwargs)
pyathena/pandas_cursor.py:55: in execute
self._retry_config)
pyathena/result_set.py:335: in init
self._df = self._as_pandas()
pyathena/result_set.py:424: in _as_pandas
infer_datetime_format=True)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f
return _read(filepath_or_buffer, kwds)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read
data = parser.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read
ret = self._engine.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read
data = self._reader.read(nrows)
pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read
???
pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory
???
pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows
???
pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data
???
pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens
???
pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype
???
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings
return cls._from_sequence(scalars, dtype, copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence
return integer_array(scalars, dtype=dtype, copy=copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array
values, mask = coerce_to_array(values, dtype=dtype, copy=copy)


values = array([1]), dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>
mask = None, copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:
              raise ValueError("invalid dtype specified {}".format(dtype))

E ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError
_____________________ TestPandasCursor.test_many_as_pandas _____________________
values = array([ 0, 1, 2, ..., 9997, 9998, 9999])
dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>, mask = None
copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
              dtype = _dtypes[str(np.dtype(dtype))]

E KeyError: 'object'
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:165: KeyError
During handling of the above exception, another exception occurred:
self = <tests.test_pandas_cursor.TestPandasCursor testMethod=test_many_as_pandas>
cursor = <pyathena.pandas_cursor.PandasCursor object at 0x7f19529e3b38>
@with_pandas_cursor
def test_many_as_pandas(self, cursor):

  df = cursor.execute('SELECT * FROM many_rows').as_pandas()

tests/test_pandas_cursor.py:171:


pyathena/util.py:28: in _wrapper
return wrapped(*args, **kwargs)
pyathena/pandas_cursor.py:55: in execute
self._retry_config)
pyathena/result_set.py:335: in init
self._df = self._as_pandas()
pyathena/result_set.py:424: in _as_pandas
infer_datetime_format=True)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:702: in parser_f
return _read(filepath_or_buffer, kwds)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:435: in _read
data = parser.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1139: in read
ret = self._engine.read(nrows)
.tox/py37/lib/python3.7/site-packages/pandas/io/parsers.py:1995: in read
data = self._reader.read(nrows)
pandas/_libs/parsers.pyx:899: in pandas._libs.parsers.TextReader.read
???
pandas/_libs/parsers.pyx:914: in pandas._libs.parsers.TextReader._read_low_memory
???
pandas/_libs/parsers.pyx:991: in pandas._libs.parsers.TextReader._read_rows
???
pandas/_libs/parsers.pyx:1123: in pandas._libs.parsers.TextReader._convert_column_data
???
pandas/_libs/parsers.pyx:1154: in pandas._libs.parsers.TextReader._convert_tokens
???
pandas/_libs/parsers.pyx:1234: in pandas._libs.parsers.TextReader._convert_with_dtype
???
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:308: in _from_sequence_of_strings
return cls._from_sequence(scalars, dtype, copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:303: in _from_sequence
return integer_array(scalars, dtype=dtype, copy=copy)
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:111: in integer_array
values, mask = coerce_to_array(values, dtype=dtype, copy=copy)


values = array([ 0, 1, 2, ..., 9997, 9998, 9999])
dtype = <class 'pandas.core.arrays.integer.Int64Dtype'>, mask = None
copy = False
def coerce_to_array(values, dtype, mask=None, copy=False):
"""
Coerce the input values array to numpy arrays with a mask

    Parameters
    ----------
    values : 1D list-like
    dtype : integer dtype
    mask : boolean 1D array, optional
    copy : boolean, default False
        if True, copy the input

    Returns
    -------
    tuple of (values, mask)
    """
    # if values is integer numpy array, preserve it's dtype
    if dtype is None and hasattr(values, 'dtype'):
        if is_integer_dtype(values.dtype):
            dtype = values.dtype

    if dtype is not None:
        if (isinstance(dtype, string_types) and
                (dtype.startswith("Int") or dtype.startswith("UInt"))):
            # Avoid DeprecationWarning from NumPy about np.dtype("Int64")
            # https://github.com/numpy/numpy/pull/7476
            dtype = dtype.lower()

        if not issubclass(type(dtype), _IntegerDtype):
            try:
                dtype = _dtypes[str(np.dtype(dtype))]
            except KeyError:
              raise ValueError("invalid dtype specified {}".format(dtype))

E ValueError: invalid dtype specified <class 'pandas.core.arrays.integer.Int64Dtype'>
.tox/py37/lib/python3.7/site-packages/pandas/core/arrays/integer.py:167: ValueError


@xinluo-gogovan

I'm not sure what those errors are about, since that branch seems to have a lot of refactoring going on. I ran the tests on master with just my aforementioned change plus the following one, and all the tests passed:

    def _trunc_date(self, df):
        times = [d[0] for d in self.description if d[1] in ('time', 'time with time zone')]
        if times:
            df.loc[:, times] = df.loc[:, times].apply(lambda r: r.dt.time)
        return df
=================================================== warnings summary ===================================================
tests/test_async_cursor.py::TestAsyncCursor::test_arraysize
  /XXX/.local/share/virtualenvs/PyAthena-exb1nwsV/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/_collections.py:1: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
    from collections import Mapping, MutableMapping

tests/test_sqlalchemy_athena.py::TestSQLAlchemyAthena::test_reflect_select
  /XXX/.local/share/virtualenvs/PyAthena-exb1nwsV/lib/python3.7/site-packages/sqlalchemy/sql/sqltypes.py:639: SAWarning: Dialect awsathena+rest does *not* support Decimal objects natively, and SQLAlchemy must convert from floating point - rounding errors and other issues may occur. Please consider storing Decimal numbers as strings or integers on this platform for lossless storage.
    "storage." % (dialect.name, dialect.driver)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
======================================= 103 passed, 2 warnings in 179.49 seconds =======================================
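
The `_trunc_date` change above can be tried in isolation. The following is only a sketch under stated assumptions: `trunc_time_columns` is a hypothetical standalone version of the method, the `description` list is made-up sample data in the DB-API `(name, type_code, ...)` layout, and it assigns each column directly instead of using `.loc`, which is a substitution made for this example and not what the snippet above does.

```python
import datetime

import pandas as pd


def trunc_time_columns(df, description):
    """Replace timestamps in TIME columns with datetime.time objects."""
    # Collect the names of columns whose Athena type is a time type.
    times = [d[0] for d in description if d[1] in ("time", "time with time zone")]
    for name in times:
        # .dt.time drops the date part and keeps only the time of day.
        df[name] = df[name].dt.time
    return df


# Hypothetical result set: a TIME column that was parsed as a full timestamp.
description = [("id", "integer"), ("start_at", "time")]
df = pd.DataFrame({
    "id": [1, 2],
    "start_at": pd.to_datetime(["2019-01-16 12:34:56", "2019-01-16 01:02:03"]),
})
df = trunc_time_columns(df, description)
print(df)
```

Columns that are not listed as a time type in `description` are left untouched.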

@laughingman7743
Owner

@xinluo-gogovan Thanks! I will investigate.

@laughingman7743
Owner

The tests pass when I run them in my local environment.
The error only occurs when they run on Travis CI. 🤔

Laughingman7743-no-MacBook-Air:PyAthena laughingman7743$ pipenv run pytest -k test_pandas_cursor
========================================================================================================================= test session starts ==========================================================================================================================
platform darwin -- Python 3.6.5, pytest-4.4.0, py-1.8.0, pluggy-0.9.0
rootdir: /Users/laughingman7743/github/PyAthena, inifile: setup.cfg
plugins: flake8-1.0.4, cov-2.6.1
collected 103 items / 86 deselected / 17 selected                                                                                                                                                                                                                      

tests/test_pandas_cursor.py .................                                                                                                                                                                                                                    [100%]

============================================================================================================== 17 passed, 86 deselected in 61.95 seconds ===============================================================================================================

@laughingman7743
Owner

pandas-dev/pandas#24326

You need to call the dtype, i.e. pass an instance of it rather than the class:
dat = integer_array(d, dtype=dtype())
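
The same point can be illustrated with the public pandas API. This is a minimal sketch, assuming a pandas version with nullable integer support in `read_csv`; the CSV text and the column names `id` and `count` are made up for the example. It passes `pd.Int64Dtype()` (an instance, note the parentheses) so that an integer column with a missing value stays an integer column.

```python
import io

import pandas as pd

# Hypothetical CSV with a missing value in the integer column "count".
csv_text = "id,count\n1,10\n2,\n3,30\n"

# Pass an instance of the extension dtype, not the class itself.
df = pd.read_csv(io.StringIO(csv_text), dtype={"count": pd.Int64Dtype()})

print(df.dtypes)
print(df["count"].isna().tolist())
```

The string alias `"Int64"` (capital I) should select the same nullable dtype, which avoids the class-versus-instance question altogether.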

All tests passed. 🎉
#80
Drop Python 3.4 support. PyAthena will still work with Python 3.4 unless you use PandasCursor.
