Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use of bind variables not supported in GROUP BY clause #252

Open
v-kox opened this issue May 27, 2022 · 1 comment
Open

Use of bind variables not supported in GROUP BY clause #252

v-kox opened this issue May 27, 2022 · 1 comment
Assignees

Comments

@v-kox
Copy link

v-kox commented May 27, 2022

as recommended by @Brooke-white (here), I am opening a copy of issue #104 for the redshift_connector on this repository.

When trying to group query results on the outcome of a case statement, driver complains that the original column needs to be part of the GROUP BY clause or needs to be used in an aggregate function.

Code below illustrates the issue. There is a case statement on the data of

import pandas as pd
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.sql.expression import case, func, select

Base = declarative_base()


class TestTable(Base):
    __tablename__ = "test_table"

    col1 = Column(String, primary_key=True, nullable=False)
    col2 = Column(Integer, nullable=False)
    col3 = Column(String)


def main():
    new_col = case(
        [
            (
                TestTable.col1.in_(["a", "b", "c"]),
                TestTable.col1,
            )
        ],
        else_="Other",
    ).label("new_col")

    stmt = select(new_col, func.sum(TestTable.col2).label("sum_col2")).group_by(new_col)

    engine = create_engine(
        "redshift+redshift_connector://{user}:{password}@{cluster}.redshift.amazonaws.com:5439/{db}"
    )

    engine.execute(stmt)
    # with engine.connect() as conn:
    #     df = pd.read_sql(stmt, conn)
    #     print(df.shape)


if __name__ == "__main__":
    main()

Stacktrace:

Traceback (most recent call last):
  File "/{pythondir}/lib/python3.8/site-packages/redshift_connector/core.py", line 1648, in execute
    ps = cache["ps"][key]
KeyError: ('SELECT CASE WHEN (adhoc.test_table.col1 IN (%s, %s, %s)) THEN adhoc.test_table.col1 ELSE %s END AS new_col, sum(adhoc.test_table.col2) AS sum_col2 \nFROM adhoc.test_table GROUP BY CASE WHEN (adhoc.test_table.col1 IN (%s, %s, %s)) THEN adhoc.test_table.col1 ELSE %s END', ((705, 0, <function text_out at 0x7f1970a1be50>), (705, 0, <function text_out at 0x7f1970a1be50>), (705, 0, <function text_out at 0x7f1970a1be50>), (705, 0, <function text_out at 0x7f1970a1be50>), (705, 0, <function text_out at 0x7f1970a1be50>), (705, 0, <function text_out at 0x7f1970a1be50>), (705, 0, <function text_out at 0x7f1970a1be50>), (705, 0, <function text_out at 0x7f1970a1be50>)))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/{pythondir}/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/{pythondir}/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
  File "/{pythondir}/lib/python3.8/site-packages/redshift_connector/cursor.py", line 231, in execute
    self._c.execute(self, operation, args)
  File "/{pythondir}/lib/python3.8/site-packages/redshift_connector/core.py", line 1718, in execute
    self.handle_messages(cursor)
  File "/{pythondir}/lib/python3.8/site-packages/redshift_connector/core.py", line 1986, in handle_messages
    raise self.error
redshift_connector.error.ProgrammingError: {'S': 'ERROR', 'C': '42803', 'M': 'column "test_table.col1" must appear in the GROUP BY clause or be used in an aggregate function', 'F': '../src/pg/src/backend/parser/parse_agg.c', 'L': '486', 'R': 'check_ungrouped_columns_walker'}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/{localdir}//test.py", line 42, in <module>
    main()
  File "/{localdir}//test.py", line 35, in main
    engine.execute(stmt)
  File "<string>", line 2, in execute
  File "/{pythondir}/lib/python3.8/site-packages/sqlalchemy/util/deprecations.py", line 401, in warned
    return fn(*args, **kwargs)
  File "/{pythondir}/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3176, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/{pythondir}/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1306, in execute
    return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
  File "/{pythondir}/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 325, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/{pythondir}/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1498, in _execute_clauseelement
    ret = self._execute_context(
  File "/{pythondir}/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1862, in _execute_context
    self._handle_dbapi_exception(
  File "/{pythondir}/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2043, in _handle_dbapi_exception
    util.raise_(
  File "/{pythondir}/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/{pythondir}/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/{pythondir}/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
  File "/{pythondir}/lib/python3.8/site-packages/redshift_connector/cursor.py", line 231, in execute
    self._c.execute(self, operation, args)
  File "/{pythondir}/lib/python3.8/site-packages/redshift_connector/core.py", line 1718, in execute
    self.handle_messages(cursor)
  File "/{pythondir}/lib/python3.8/site-packages/redshift_connector/core.py", line 1986, in handle_messages
    raise self.error
sqlalchemy.exc.ProgrammingError: (redshift_connector.error.ProgrammingError) {'S': 'ERROR', 'C': '42803', 'M': 'column "test_table.col1" must appear in the GROUP BY clause or be used in an aggregate function', 'F': '../src/pg/src/backend/parser/parse_agg.c', 'L': '486', 'R': 'check_ungrouped_columns_walker'}
[SQL: SELECT CASE WHEN (adhoc.test_table.col1 IN (%s, %s, %s)) THEN adhoc.test_table.col1 ELSE %s END AS new_col, sum(adhoc.test_table.col2) AS sum_col2 
FROM adhoc.test_table GROUP BY CASE WHEN (adhoc.test_table.col1 IN (%s, %s, %s)) THEN adhoc.test_table.col1 ELSE %s END]
[parameters: ('a', 'b', 'c', 'Other', 'a', 'b', 'c', 'Other')]
(Background on this error at: https://sqlalche.me/e/14/f405)
@Brooke-white
Copy link
Contributor

Thanks @v-kox -- I've assigned this issue to myself and will look into getting a fix for this :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants