read_sql: TypeError: Argument 'rows' has incorrect type #11522

Closed
PetitLepton opened this Issue Nov 5, 2015 · 5 comments

Comments

Projects
None yet
3 participants

Hi,
I encountered a problem when using read_sql with a remote mySQL database.

After loading the modules,

import pymysql
import pandas as pd
import sqlalchemy

I create both a pymysql connector and a sqlalchemy engine with pymysql for the same database.

connector = pymysql.connect(...)
engine = sqlalchemy.create_engine('mysql+pymysql://...')

Everything is fine with the engine

df = pd.DataFrame()
query = " SELECT id, created_at FROM themes"
for chunk in pd.read_sql(query, engine, chunksize=100):
    df = df.append(chunk)

but fails with the connector

df = pd.DataFrame()
query = " SELECT id, created_at FROM themes"
for chunk in pd.read_sql(query, connector, chunksize=100):
    df = df.append(chunk)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-9a0ee2188a78> in <module>()
      1 df = pd.DataFrame()
      2 query = " SELECT id, created_at FROM themes"
----> 3 for chunk in pd.read_sql(query, connector_mySQL, chunksize=100):
      4     df = df.append(chunk)

/usr/local/lib/python3.5/site-packages/pandas/io/sql.py in _query_iterator(cursor, chunksize, columns, index_col, coerce_float, parse_dates)
   1563                 yield _wrap_result(data, columns, index_col=index_col,
   1564                                    coerce_float=coerce_float,
-> 1565                                    parse_dates=parse_dates)
   1566 
   1567     def read_query(self, sql, index_col=None, coerce_float=True, params=None,

/usr/local/lib/python3.5/site-packages/pandas/io/sql.py in _wrap_result(data, columns, index_col, coerce_float, parse_dates)
    135 
    136     frame = DataFrame.from_records(data, columns=columns,
--> 137                                    coerce_float=coerce_float)
    138 
    139     _parse_date_columns(frame, parse_dates)

/usr/local/lib/python3.5/site-packages/pandas/core/frame.py in from_records(cls, data, index, exclude, columns, coerce_float, nrows)
    928         else:
    929             arrays, arr_columns = _to_arrays(data, columns,
--> 930                                              coerce_float=coerce_float)
    931 
    932             arr_columns = _ensure_index(arr_columns)

/usr/local/lib/python3.5/site-packages/pandas/core/frame.py in _to_arrays(data, columns, coerce_float, dtype)
   5211     if isinstance(data[0], (list, tuple)):
   5212         return _list_to_arrays(data, columns, coerce_float=coerce_float,
-> 5213                                dtype=dtype)
   5214     elif isinstance(data[0], collections.Mapping):
   5215         return _list_of_dict_to_arrays(data, columns,

/usr/local/lib/python3.5/site-packages/pandas/core/frame.py in _list_to_arrays(data, columns, coerce_float, dtype)
   5289 def _list_to_arrays(data, columns, coerce_float=False, dtype=None):
   5290     if len(data) > 0 and isinstance(data[0], tuple):
-> 5291         content = list(lib.to_object_array_tuples(data).T)
   5292     else:
   5293         # list of lists

TypeError: Argument 'rows' has incorrect type (expected list, got tuple)

Here are the versions of the modules

print(pymysql.__version__)
print(sqlalchemy.__version__)
print(pd.__version__)

0.6.7.None
1.0.9
0.17.0

Best regards,
Flavien.

@PetitLepton Thanks for the report.

Can you show the output of?

cur = connector.cursor()
cur.execute("SELECT id, created_at FROM themes").fetchall()

@jorisvandenbossche Hi, this one works fine, I was using it so far

cur = connector.cursor()
cur.execute("SELECT id, created_at FROM themes")
cur.fetchall()
((1, datetime.datetime(2012, 12, 2, 4, 6, 17)),
 (2, datetime.datetime(2012, 12, 2, 4, 20, 56)),
 (4, datetime.datetime(2012, 12, 2, 4, 57, 39)),
 (5, datetime.datetime(2012, 12, 2, 4, 59, 31)),
 (6, datetime.datetime(2012, 12, 2, 5, 3, 7)),
 ...

Thanks! As I suspected, the problem is that it returns a tuple of tuples instead of a list of tuples, and DataFrame.from_records fails on this:

In [1]: pd.DataFrame.from_records([(1,2), (3,4)])
Out[1]:
   0  1
0  1  2
1  3  4

In [2]: pd.DataFrame.from_records(((1,2), (3,4)))
...
c:\users\vdbosscj\scipy\pandas-joris\pandas\core\frame.pyc in _list_to_arrays(da
ta, columns, coerce_float, dtype)
   5300 def _list_to_arrays(data, columns, coerce_float=False, dtype=None):
   5301     if len(data) > 0 and isinstance(data[0], tuple):
-> 5302         content = list(lib.to_object_array_tuples(data).T)
   5303     else:
   5304         # list of lists

TypeError: Argument 'rows' has incorrect type (expected list, got tuple)

This has always been the case in pandas, so I am wondering if this is a change in pymysql

@jreback This is easily solved in the sql code by ensuring it is a list before passing to DataFrame.from_records, or do you think this is something from_records should handle itself?

jorisvandenbossche added this to the 0.17.1 milestone Nov 6, 2015

Contributor

jreback commented Nov 7, 2015

@jorisvandenbossche no, list-of-tuples is the specified type, tuple-of-tuple is not allowed as I think it can signify nested types that would require more parsing (its not allowed in the DataFrame constructor either).

I suppose these might be able to be relaxed but would be a separate issue

In [3]: DataFrame([('a','b')])
Out[3]: 
   0  1
0  a  b

In [4]: DataFrame((('a','b')))
PandasError: DataFrame constructor not properly called!

In [5]: DataFrame(np.array([('a','b')]))
Out[5]: 
   0  1
0  a  b

OK, no problem. It is easy to ensure in the SQL code that it is a list of tuples and not a tuple of tuples

@jreback jreback modified the milestone: Next Major Release, 0.17.1 Nov 15, 2015

@jorisvandenbossche jorisvandenbossche added a commit that referenced this issue Dec 19, 2015

@jorisvandenbossche jorisvandenbossche Merge pull request #11861 from grahamjeffries/bugfix-11522
BUG: force list type for tuples from chunked sql table reads #11522
1357321
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment