Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Redshift #122

Open
matthewgdv opened this issue Oct 20, 2021 · 15 comments
Open

Support Redshift #122

matthewgdv opened this issue Oct 20, 2021 · 15 comments
Labels
enhancement New feature or request funding requested The feature can be developed if there is financial support

Comments

@matthewgdv
Copy link

matthewgdv commented Oct 20, 2021

Unlike psycopg2, which is usable with Redshift out-of-the-box (in fact, the official sqlalchemy dialect for Redshift uses psycopg2 under-the-hood), psycopg3 seems to error out when calling Cursor.execute() against a Redshift cluster.

The error I'm getting is the following (after locally fixing the broken f-string on line 91 of psycopg._encodings, which is where the exception is raised from):

psycopg.NotSupportedError:
codec not available in Python: b'UNICODE'

I have no idea how much work it would be to allow psycopg3 to be used against Redshift, but I can say I'd massively appreciate it.

I've currently got a use-case for running COPY commands on several hundred tables, and if I could async-ify it that would literally transform an hours-long ETL into a <10 minute length one. The idea of spawning hundreds of python interpreters to parallelize it, each with their own error reporting, is nowhere near as appealing as using asyncio-based concurrency.

@dvarrazzo
Copy link
Member

I'll fix the broken f-string, thank you.

Redshift is a commercial product, so I would be open to add its support if there is sponsorship for the feature.

dvarrazzo added a commit that referenced this issue Oct 20, 2021
@auvipy
Copy link

auvipy commented Oct 20, 2021

can be closed now i guess

@dvarrazzo
Copy link
Member

@auvipy why?

@dvarrazzo dvarrazzo added enhancement New feature or request funding requested The feature can be developed if there is financial support labels Oct 20, 2021
@dvarrazzo dvarrazzo changed the title [FEATURE] Support Redshift Support Redshift Oct 20, 2021
@zeenewton
Copy link

I was able to get Redshift working by overriding client_encoding to UTF8

I think redshift is reporting its encoding as UNICODE which is throwing the error, so perhaps you could add a mapping from UNICODE to utf-8

@zeenewton
Copy link

One other note for Redshift compatibility:
DEALLOCATE ALL is only supported on PG >= 8.3
Prior to that you have to deallocate each plan explicitly

@dvarrazzo
Copy link
Member

You might also see that interval should fail to be transferred to/from the database. At least in text mode, no idea about binary mode.

@daturkel
Copy link

daturkel commented Feb 3, 2022

@zeenewton Is there a workaround for the DEALLOCATE ALL issue? I'm running several queries in a pscyopg with block and getting

SyntaxError: syntax error at or near "ALL"
LINE 1: DEALLOCATE ALL

during the last query.

I should note that my query does not contain DEALLOCATE ALL and that instead it appears to be something psycopg is sending.

@zeenewton
Copy link

Yeah, the workaround is to just disable prepared statements when you get connection

conn​.​prepare_threshold​ ​=​ ​None

@johnpatrickroach
Copy link

I'll fix the broken f-string, thank you.

Redshift is a commercial product, so I would be open to add its support if there is sponsorship for the feature.

This works for me:

import psycopg
psycopg._encodings._py_codecs["UNICODE"] = "utf-8"
psycopg._encodings.py_codecs.update(
    (k.encode(), v) for k, v in psycopg._encodings._py_codecs.items()
)

@rockpunk
Copy link

rockpunk commented Oct 8, 2022

An easy fix for this is to set export PGCLIENTENCODING=utf-8 in your environment before creating the connection.

@jybaek
Copy link

jybaek commented Oct 18, 2022

An easy fix for this is to set export PGCLIENTENCODING=utf-8 in your environment before creating the connection.

works fine for me.

os.environ["PGCLIENTENCODING"] = "utf-8"

@7E6D4309-D2A9-4155-9D4F-287B8CDA14C1

You are really refusing to add one line of code to handle this codec because AWS isn't giving you money to add that one line of code? You aren't hurting AWS. You're hurting your users.

@dvarrazzo
Copy link
Member

I keep your opinion in the utmost consideration, @7E6D4309-D2A9-4155-9D4F-287B8CDA14C1

@adityatoshniwal
Copy link

We're facing this issue in pgAdmin with pgBouncer pgadmin-org/pgadmin4#6340

@matanper
Copy link

An easy fix for this is to set export PGCLIENTENCODING=utf-8 in your environment before creating the connection.

An alternative which doesn't require changing the encoding globally, is setting it on the connection string:

pool = ConnectionPool("user=admin port=5439 host=host client_encoding=utf-8")
OR
pool = ConnectionPool("postgresql://admin@host:5439/db?client_encoding=utf-8")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request funding requested The feature can be developed if there is financial support
Projects
None yet
Development

No branches or pull requests