read.sql fails with adodbapi connection #27547

ParfaitG · 2019-07-23T20:47:40Z

Code Sample

import pandas as pd
import adodbapi

conn = adodbapi.connect(r'Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Path\To\MS\Access\DB.accdb;')

df = pd.read_sql("SELECT TOP 10 * FROM myTable", conn)
print(df)
	
conn.close()

Problem description

Similar to this recent python-forum.io post, pandas.read_sql does not work with the older module, adodbapi, which adheres to Python's DB-API 2.0 (PEP-249) standard and allows interfacing with OLEDB providers, a recognized, industry API for data sources. Above makes a successful connection to an MS Access database in Windows environment. However, the pandas.read_sql results in following error:

Traceback (most recent call last):
  File "OleDBConnection.py", line 14, in <module>
    df = pd.read_sql("SELECT TOP 10 * FROM Random_Data", conn)
  File "C:\Users\ParfaitG\AppData\Roaming\Python\Python37\site-packages\pandas\io\sql.py", line 380, in read_sql
    chunksize=chunksize)
  File "C:\Users\ParfaitG\AppData\Roaming\Python\Python37\site-packages\pandas\io\sql.py", line 1482, in read_query
    parse_dates=parse_dates)
  File "C:\Users\ParfaitG\AppData\Roaming\Python\Python37\site-packages\pandas\io\sql.py", line 141, in _wrap_result
    coerce_float=coerce_float)
  File "C:\Users\ParfaitG\AppData\Roaming\Python\Python37\site-packages\pandas\core\frame.py", line 1512, in from_records
    coerce_float=coerce_float)
  File "C:\Users\ParfaitG\AppData\Roaming\Python\Python37\site-packages\pandas\core\internals\construction.py", line 408, in to_arrays
    elif isinstance(data[0], ABCSeries):
  File "C:\Users\ParfaitG\AppData\Roaming\Python\Python37\site-packages\pandas\core\dtypes\generic.py", line 9, in _check
    return getattr(inst, attr, '_typ') in comp
  File "C:\Users\ParfaitG\AppData\Roaming\Python\Python37\site-packages\adodbapi\apibase.py", line 474, in __getattr__
    return self._getValue(self.rows.columnNames[name.lower()])
KeyError: '_typ'

Do note: a regular cursor.execute and cursor.fetchall works successfully without issue using adodbapi with or without parameters:

conn = adodbapi.connect(r'Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Path\To\MS\Access\DB.accdb;')

cur = conn.cursor()
cur.execute("SELECT TOP 10 * FROM myTable WHERE [value] >= ? AND [value] <= ?", (10, 50))

for row in cur.fetchall():
    print(row)

Expected Output

Desired result would be exact data frame construction from read_sql as rendered successfully with pyodbc which makes an ODBC connection (another industry recognized API to data sources) to the same database:

import pandas as pd
import pyodbc

conn = pyodbc.connect(r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:\Path\To\MS\Access\DB.accdb;')

df = pd.read_sql("SELECT TOP 10 * FROM myTable", conn)
print(df)

conn.close()

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.7.2.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.24.1
pytest: None
pip: 19.1.1
setuptools: 40.6.2
Cython: None
numpy: 1.15.4
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.2
openpyxl: 2.6.2
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 4.3.3
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10.1
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None
None

The text was updated successfully, but these errors were encountered:

TomAugspurger · 2019-07-29T21:49:02Z

Pandas only supports sqlite3 for in its DBAPI2 mode. Does a sqlalchemy engine for adodbapi exist?

jorisvandenbossche · 2019-07-30T07:56:24Z

Pandas only supports sqlite3 for in its DBAPI2 mode. Does a sqlalchemy engine for adodbapi exist?

It seems this is supported in sqlalchemy: https://docs.sqlalchemy.org/en/13/dialects/mssql.html#module-sqlalchemy.dialects.mssql.adodbapi

In general this is true, but in practice, read_sql with a query has mostly worked with raw connections as well (as opposed to read_sql_table / read_sql with a table name / to_sql which for sure require sqlalchemy). So I am still curious what is going wrong here.

jorisvandenbossche · 2019-07-30T08:05:13Z

It seems the isinstance(data[0], ABCSeries) check is failing where data[0] is some kind of adodbapi object

I think this is because the adodbapi SQLrow.__getattr__ is returning a KeyError instead of AttributeError, which is arguably a bug in adodbapi and not in pandas (as in general, isinstance checks should never result in an exception, but rather in False)

ParfaitG · 2019-07-30T14:02:07Z

Correct @jorisvandenbossche. AFAIK, read_sql works for raw connections and does not need SQLAlchemy (unlike to_sql). While MS Access (also a file-level, non -server DBMS like SQLite) is not supported in SQLAlchemy, as mentioned above read_sql can successfully run Access SQL queries via pyodbc connection. The problem here is a raw connection for adodbapi does not work with read_sql.

ParfaitG · 2019-07-30T14:17:54Z

Also, a surface-level read of __getattr__ in adodbapi source code (assuming this is latest version) shows no call to raise a KeyError only AttributeError. Though, possibly upstream this error is raised.

Additionally, as mentioned above basic cursor.execute and cursor.fetchall calls work successfully in adodbapi. And under the hood, pandas' read_sql source code appears to be doing the same cursor handling.

jorisvandenbossche · 2019-07-30T14:59:44Z

As mentioned above, it is the isinstance(obj, ABCSeries) that fails (in the later conversion of the set of results from fetchall into a DataFrame). So after cursor.fetchall has succeeded. I suppose this returns a list of adodbapi SQLrow objects, and it is getattr on those that is raising an incorrect error.

TomAubrunner · 2019-09-09T12:12:23Z

I updated pandas and now I get the same KeyError.
Some tests showed that pandas 0.23.4 works well with adodbapi. The problem exists since pandas 0.24.0.
Maybe that helps to locate the cause of the issue.

jorisvandenbossche · 2019-09-09T18:49:09Z

As mentioned above, this is a bug in adodbapi IMO. Someone feel free to report / fix it over there.

TomAubrunner · 2019-09-10T11:33:28Z

ok.
I replaced line 466 in adodbapi/apibase.py from:

        return self._getValue(self.rows.columnNames[name.lower()])

to:

       try:
            return self._getValue(self.rows.columnNames[name.lower()])
        except:
            return False

This works for me.
I'll try to propose this change to adodbapi. Any clue where/how to?

jorisvandenbossche · 2019-09-10T11:45:44Z

It seems the project is on sourceforge: https://sourceforge.net/projects/adodbapi/. The https://pypi.org/project/adodbapi/ (at the bottom) mentions that the pywin32 mailing list can be used for this project.
Or maybe you can directly use the pywin32 github repo which includes adodbapi (https://github.com/mhammond/pywin32), I don't know.

ParfaitG · 2019-09-17T19:50:56Z

Awesome @TomAubrunner! The fix worked with pandas read_sql even with parameters. I have cited you on a StackOverflow question! Let me know if I should re-ask this ticket on the pywin32 repo, or you can add a link if you already attempted a PR. I see some recent tickets for adodbapi, so authors may still be actively monitoring. Possibly the authors may have a different suggestion, too.

Thanks @jorisvandenbossche for looking into this! I believe we can close this ticket on pandas side, migrate, or I can backlink here.

TomAubrunner · 2019-09-18T06:27:46Z

@ParfaitG, I’m not used to place requests, so it would be great if you re-ask this ticket on the pywin32 repo.

jorisvandenbossche · 2019-09-18T07:17:20Z

OK, thanks for the follow-up, closing then on the pandas side.

vernondcole · 2019-10-02T22:03:33Z

Thank you. Seen on pywin32 mailing list. I will look into the problem. The suggested fix looks appropriate.

jbrockmendel added the IO SQL to_sql, read_sql, read_sql_query label Jul 25, 2019

TomAugspurger added the Needs Info Clarification about behavior needed to assess issue label Jul 29, 2019

jorisvandenbossche closed this as completed Sep 18, 2019

jorisvandenbossche removed the Needs Info Clarification about behavior needed to assess issue label Sep 18, 2019

jorisvandenbossche added this to the No action milestone Sep 18, 2019

ParfaitG mentioned this issue Oct 2, 2019

Pandas's read.sql fails with adodbapi connection mhammond/pywin32#1423

Closed

leandermaben mentioned this issue Feb 4, 2020

Replaced .format with f- strings #31660

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

read.sql fails with adodbapi connection #27547

read.sql fails with adodbapi connection #27547

ParfaitG commented Jul 23, 2019 •

edited

Loading

INSTALLED VERSIONS

TomAugspurger commented Jul 29, 2019

jorisvandenbossche commented Jul 30, 2019 •

edited

Loading

jorisvandenbossche commented Jul 30, 2019

ParfaitG commented Jul 30, 2019

ParfaitG commented Jul 30, 2019 •

edited

Loading

jorisvandenbossche commented Jul 30, 2019

TomAubrunner commented Sep 9, 2019

jorisvandenbossche commented Sep 9, 2019

TomAubrunner commented Sep 10, 2019 •

edited

Loading

jorisvandenbossche commented Sep 10, 2019

ParfaitG commented Sep 17, 2019

TomAubrunner commented Sep 18, 2019 via email •

edited

Loading

jorisvandenbossche commented Sep 18, 2019

vernondcole commented Oct 2, 2019

read.sql fails with adodbapi connection #27547

read.sql fails with adodbapi connection #27547

Comments

ParfaitG commented Jul 23, 2019 • edited Loading

Code Sample

Problem description

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS

TomAugspurger commented Jul 29, 2019

jorisvandenbossche commented Jul 30, 2019 • edited Loading

jorisvandenbossche commented Jul 30, 2019

ParfaitG commented Jul 30, 2019

ParfaitG commented Jul 30, 2019 • edited Loading

jorisvandenbossche commented Jul 30, 2019

TomAubrunner commented Sep 9, 2019

jorisvandenbossche commented Sep 9, 2019

TomAubrunner commented Sep 10, 2019 • edited Loading

jorisvandenbossche commented Sep 10, 2019

ParfaitG commented Sep 17, 2019

TomAubrunner commented Sep 18, 2019 via email • edited Loading

jorisvandenbossche commented Sep 18, 2019

vernondcole commented Oct 2, 2019

ParfaitG commented Jul 23, 2019 •

edited

Loading

Output of `pd.show_versions()`

jorisvandenbossche commented Jul 30, 2019 •

edited

Loading

ParfaitG commented Jul 30, 2019 •

edited

Loading

TomAubrunner commented Sep 10, 2019 •

edited

Loading

TomAubrunner commented Sep 18, 2019 via email •

edited

Loading