Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: read_sql no longer works simply with SqlAlchemy selectables and a quick fix #35484

Open
3 tasks done
machow opened this issue Jul 30, 2020 · 10 comments
Open
3 tasks done
Assignees
Labels
good first issue IO SQL to_sql, read_sql, read_sql_query Needs Tests Unit test(s) needed to prevent regressions

Comments

@machow
Copy link

machow commented Jul 30, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

Hey--I noticed while running siuba's SQL unit tests that queries using the modulo operator are failing.

The issue is due to a recent change for issue #34211 in pandas setting the no_parameters argument before executing via a sqlalchemy engine. This was done to allow queries like SELECT 1 % 2, but causes SqlAlchemy expressions handling % to not always work with read_sql.

Solution. Rather than executing in a special way (no_parameters), I think you want to wrap a string query in SqlAlchemy.sql.text (see here). This will allow both queries with % and the full range of SqlAlchemy expressions. WDYT?

from sqlalchemy import sql, create_engine
import pandas as pd

engine = create_engine('postgresql://postgres:@localhost:5433/postgres', echo=False)
#engine = create_engine('postgresql://USERNAME:PASSWORD@localhost:PORT/DBNAME', echo=False)

# doesn't work, original issue in pandas: 'dict' object does not support indexing
engine.execute("SELECT 1 % 2")

# works, ideal solution
engine.execute(sql.text("SELECT 1 % 2" ))

# queries below broken by no_parameters change ----
pd.read_sql(sql.text("SELECT 1 % 2"), engine)

pd.read_sql(sql.select([sql.literal(1) % sql.literal(2)]), engine)

Here's a gist of the error for the last two queries...

ProgrammingError: (psycopg2.errors.UndefinedFunction) operator does not exist: integer %% integer
LINE 1: SELECT 1 %% 2
                 ^
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.

[SQL: SELECT 1 %% 2]
(Background on this error at: http://sqlalche.me/e/13/f405)

Full traceback in the details

``` --------------------------------------------------------------------------- UndefinedFunction Traceback (most recent call last) ~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args) 1267 self.dialect.do_execute_no_params( -> 1268 cursor, statement, context 1269 )

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/sqlalchemy/engine/default.py in do_execute_no_params(self, cursor, statement, context)
595 def do_execute_no_params(self, cursor, statement, context=None):
--> 596 cursor.execute(statement)
597

UndefinedFunction: operator does not exist: integer %% integer
LINE 1: SELECT 1 %% 2
^
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.

The above exception was the direct cause of the following exception:

ProgrammingError Traceback (most recent call last)
in
----> 1 pd.read_sql(sql.text("SELECT 1 % 2"), engine)

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/pandas/io/sql.py in read_sql(sql, con, index_col, coerce_float, params, parse_dates, columns, chunksize)
513 coerce_float=coerce_float,
514 parse_dates=parse_dates,
--> 515 chunksize=chunksize,
516 )
517

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/pandas/io/sql.py in read_query(self, sql, index_col, coerce_float, parse_dates, params, chunksize)
1293 args = _convert_params(sql, params)
1294
-> 1295 result = self.execute(*args)
1296 columns = result.keys()
1297

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/pandas/io/sql.py in execute(self, *args, **kwargs)
1160 """Simple passthrough to SQLAlchemy connectable"""
1161 return self.connectable.execution_options(no_parameters=True).execute(
-> 1162 *args, **kwargs
1163 )
1164

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py in execute(self, statement, *multiparams, **params)
2236
2237 connection = self._contextual_connect(close_with_result=True)
-> 2238 return connection.execute(statement, *multiparams, **params)
2239
2240 def scalar(self, statement, *multiparams, **params):

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py in execute(self, object_, *multiparams, **params)
1012 )
1013 else:
-> 1014 return meth(self, multiparams, params)
1015
1016 def _execute_function(self, func, multiparams, params):

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/sqlalchemy/sql/elements.py in _execute_on_connection(self, connection, multiparams, params)
296 def _execute_on_connection(self, connection, multiparams, params):
297 if self.supports_execution:
--> 298 return connection._execute_clauseelement(self, multiparams, params)
299 else:
300 raise exc.ObjectNotExecutableError(self)

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py in _execute_clauseelement(self, elem, multiparams, params)
1131 distilled_params,
1132 compiled_sql,
-> 1133 distilled_params,
1134 )
1135 if self._has_events or self.engine._has_events:

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
1316 except BaseException as e:
1317 self._handle_dbapi_exception(
-> 1318 e, statement, parameters, cursor, context
1319 )
1320

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py in handle_dbapi_exception(self, e, statement, parameters, cursor, context)
1510 elif should_wrap:
1511 util.raise
(
-> 1512 sqlalchemy_exception, with_traceback=exc_info[2], from_=e
1513 )
1514 else:

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/sqlalchemy/util/compat.py in raise_(failed resolving arguments)
176
177 try:
--> 178 raise exception
179 finally:
180 # credit to

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
1266 if not evt_handled:
1267 self.dialect.do_execute_no_params(
-> 1268 cursor, statement, context
1269 )
1270 else:

~/Dropbox/Repo/siuba/env/lib/python3.6/site-packages/sqlalchemy/engine/default.py in do_execute_no_params(self, cursor, statement, context)
594
595 def do_execute_no_params(self, cursor, statement, context=None):
--> 596 cursor.execute(statement)
597
598 def is_disconnect(self, e, connection, cursor):

ProgrammingError: (psycopg2.errors.UndefinedFunction) operator does not exist: integer %% integer
LINE 1: SELECT 1 %% 2
^
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.

[SQL: SELECT 1 %% 2]
(Background on this error at: http://sqlalche.me/e/13/f405)


</details>

@jbrockmendel
Copy link
Member

@machow want to make a PR for this?

@jbrockmendel jbrockmendel added IO SQL to_sql, read_sql, read_sql_query and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Sep 2, 2020
@machow
Copy link
Author

machow commented Sep 10, 2020

@jbrockmendel sure! I have the changes running and passing tests in the pandas docker image. I'm noticing that the tests cover only either queries with parameters or using % as an operator.

I'll try adding two new kinds of tests:

  • queries with %% (should pass after changes)
  • queries with a combination of parameters and % as operator. (should fail before and after changes?).

Will add tests and open a PR tomorrow.

@jreback
Copy link
Contributor

jreback commented Dec 28, 2020

might be closed by #37534 (e.g. working on master).

@mroeschke mroeschke added good first issue Needs Tests Unit test(s) needed to prevent regressions and removed Bug labels Aug 8, 2021
@machow
Copy link
Author

machow commented Jan 13, 2022

Hey--that PR (#37534) just changed the code in PR #34211 back, so reverted back to pandas v1.05 behavior that I listed in my PR (#36275).

Here is the table I made in that PR for reference

query v1.05 v.1.1 my PR #36275 issue
no params, % #34211 (merged)
no params, %% #35871
params, % psycopg/psycopg2#827
params, %%
sqla declarative, % #35484

That means this code worked in v1.1, but fails now (as it did in v1.05; it's the no params, % case):

import pandas as pd
from sqlalchemy import create_engine

# 1.3.5
pd.__version__

engine = create_engine("postgresql://postgres:@localhost:5432/postgres", echo=False)

# TypeError: dict is not a sequence
pd.read_sql("SELECT 1 % 2", engine)

I'm not sure whether y'all consider this a bug or not, but pandas' behavior as it exists now IMO aligns well with sqlalchemy's API, since technically you shouldn't pass a sql string to engine.execute, but wrap it in sqlalchemy.sql.text (which is how you can support people passing SQL as a string).

@adejumoridwan
Copy link

@machow I would like to take this issue, I am a new contributor. Probably need some guidance.

@machow
Copy link
Author

machow commented Jul 3, 2023

Hey! I haven't looked at this issue for a while, and am not sure if it's intended behavior in pandas or not. I wonder if another issue with the label "good first issue" might be a bit easier to pick up?

@Anikhet Anikhet removed their assignment Jul 7, 2023
@VedangPokharkar
Copy link

take

@VedangPokharkar
Copy link

hello, @machow , is this issue still being pursued?

@sandervh14
Copy link

I just encountered this too. This bug prevents executing SELECT ... WHERE col like '...%...' statements.

@emilenstrom
Copy link

I just encountered this too. This bug prevents executing SELECT ... WHERE col like '...%...' statements.

The fix for this is to use two %, example:

SELECT ... WHERE fullname like 'John%%'
instead of SELECT ... WHERE fullname like 'John%'

Read about it here:
https://stackoverflow.com/questions/64252764/sql-case-when-x-like-t-in-python-script-resulting-in-typeerror-dict-is-not

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue IO SQL to_sql, read_sql, read_sql_query Needs Tests Unit test(s) needed to prevent regressions
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants