Closed
Description
Driver version
2.0.918
Redshift version
PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.63282
Client Operating System
Docker container: Debian GNU/Linux 11 (bullseye) on python3.9 docker image
CodeBuild: Using aws/codebuild/standard:7.0
Python version
python3.9
Table schema
Problem description
- Expected behaviour: The CodeBuild machine connects to Redshift correctly.
- Actual behaviour: The CodeBuild machine does not connect to Redshift.
- Error message/stack trace:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/redshift_connector/core.py", line 626, in __init__
    self._usock.connect(hostport)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/analytics-dbt/test_redshift_connector.py", line 29, in <module>
    with redshift_connector.connect(
  File "/usr/local/lib/python3.9/site-packages/redshift_connector/__init__.py", line 376, in connect
    return Connection(
  File "/usr/local/lib/python3.9/site-packages/redshift_connector/core.py", line 689, in __init__
    raise InterfaceError("communication error", e)
redshift_connector.error.InterfaceError: ('communication error', TimeoutError(110, 'Connection timed out'))
- Any other details that can be helpful:
  - We have been using dbt, which is a framework for data modeling. For now we are using dbt 1.4.5, which uses psycopg2.
  - We use CodeBuild as our CI and so far, when running dbt commands such as dbt compile or dbt run, we have had no problems connecting to Redshift from our CodeBuild environment.
  - We have tried to upgrade dbt multiple times, to versions from 1.5 to 1.7.8 (the latest), which use redshift-connector, and we have not been successful because we repeatedly get timeouts.
  - IMPORTANT: Sometimes we are able to connect, but most of the time we aren't.
Things we've tried:
- Using a custom redshift-connector script to do a simple query (to rule out dbt itself as the cause) -> Same result: connection timeout
- Using a custom psycopg2 script to do a simple query (to rule out dbt itself as the cause) -> THIS WORKS!
- Running psql -h <host> -p 5439 -U <user> -d <db> -> Same result: connection timeout
It is also important to note that we have no problems when running any of the above approaches on our local machines.
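Since the traceback fails at the socket connect step and psql times out as well, a driver-independent check may help narrow things down. The following is a minimal sketch, using only the Python standard library, that attempts a plain TCP connection several times and reports how many attempts succeed. The function names probe_tcp and summarize, and the default attempt count, timeout, and pause, are hypothetical choices for illustration and are not part of redshift-connector.

```python
import socket
import time


def probe_tcp(host, port=5439, attempts=5, timeout=10.0, pause=1.0):
    """Attempt a plain TCP connect several times.

    Returns a list of (succeeded, elapsed_seconds, error_repr) tuples,
    one per attempt. No Redshift protocol traffic is sent.
    """
    results = []
    for i in range(attempts):
        started = time.monotonic()
        try:
            # create_connection resolves the host and opens a TCP socket
            with socket.create_connection((host, port), timeout=timeout):
                results.append((True, time.monotonic() - started, None))
        except OSError as exc:  # includes timeouts and refused connections
            results.append((False, time.monotonic() - started, repr(exc)))
        if i < attempts - 1:
            time.sleep(pause)
    return results


def summarize(results):
    """Return a one-line summary of how many attempts succeeded."""
    ok = sum(1 for succeeded, _, _ in results if succeeded)
    return f"{ok}/{len(results)} TCP connects succeeded"
```

Running print(summarize(probe_tcp("<host>"))) inside the CodeBuild container would show whether the intermittent failures already appear at the TCP level, before any driver code is involved.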
Python Driver trace logs
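No driver logs were captured for this report. As a sketch of how they could be collected, the snippet below turns on DEBUG output through Python's standard logging module. It assumes the driver emits its messages under a logger named redshift_connector (the package name); that name and the helper enable_driver_logging are assumptions for illustration, not something confirmed in this issue.

```python
import logging
import sys


def enable_driver_logging(logger_name="redshift_connector", level=logging.DEBUG):
    """Send log records for the given logger (and its children) to stderr.

    logger_name is an assumption: it presumes the driver logs under its
    package name via the standard logging module.
    """
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Calling enable_driver_logging() before redshift_connector.connect(...) in the reproduction script should, under the assumption above, print the driver's debug messages to the CodeBuild log.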
Reproduction code
import redshift_connector
import os
import time

schema = os.environ.get("DBT_TARGET_SCHEMA")

query = f"""
    select
        table_catalog as database,
        table_name as name,
        table_schema as schema,
        'table' as type
    from information_schema.tables
    where table_schema ilike '{schema}'
    and table_type = 'BASE TABLE'
    union all
    select
        table_catalog as database,
        table_name as name,
        table_schema as schema,
        case
            when view_definition ilike '%create materialized view%'
                then 'materialized_view'
            else 'view'
        end as type
    from information_schema.views
    where table_schema ilike '{schema}'
"""

# Establish a connection to the database
with redshift_connector.connect(
    host="<host>",
    database="dbt_ci",
    # user=os.environ.get("DBT_PROFILE_USER"),
    user="dbt_ci",
    password=os.environ.get("DBT_PROFILE_PASSWORD"),
    timeout=999999,
) as conn:
    # Create a new cursor
    with conn.cursor() as cursor:
        start_time = time.time()
        # Execute the SQL query
        cursor.execute(query)
        rows = cursor.fetchall()
        for row in rows:
            print(row)
This does not work, giving the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/redshift_connector/core.py", line 626, in __init__
    self._usock.connect(hostport)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/analytics-dbt/test_redshift_connector.py", line 29, in <module>
    with redshift_connector.connect(
  File "/usr/local/lib/python3.9/site-packages/redshift_connector/__init__.py", line 376, in connect
    return Connection(
  File "/usr/local/lib/python3.9/site-packages/redshift_connector/core.py", line 689, in __init__
    raise InterfaceError("communication error", e)
redshift_connector.error.InterfaceError: ('communication error', TimeoutError(110, 'Connection timed out'))
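Because the connection sometimes succeeds, a retry loop with exponential backoff is one possible stopgap while the root cause is investigated. The sketch below is deliberately driver-agnostic: it takes any zero-argument callable that opens a connection. The helper name connect_with_retry and its parameters are hypothetical and are not part of redshift-connector or dbt.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=1.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call connect() until it succeeds or the attempts are exhausted.

    connect:    zero-argument callable that returns a connection
    retry_on:   exception types that should trigger another attempt
    base_delay: first wait in seconds; doubled after each failure
    sleep:      injectable for testing; defaults to time.sleep
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # Every attempt failed: surface the last error to the caller
    raise last_error
```

With the reproduction script, this could be used as connect_with_retry(lambda: redshift_connector.connect(host="<host>", database="dbt_ci", user="dbt_ci", password=os.environ.get("DBT_PROFILE_PASSWORD")), retry_on=(redshift_connector.error.InterfaceError,)). This only masks the intermittent timeouts and does not explain them.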
Whereas the following snippet works:
import psycopg2
import os

schema = os.environ.get("DBT_TARGET_SCHEMA")

query = f"""
    select
        table_catalog as database,
        table_name as name,
        table_schema as schema,
        'table' as type
    from information_schema.tables
    where table_schema ilike '{schema}'
    and table_type = 'BASE TABLE'
    union all
    select
        table_catalog as database,
        table_name as name,
        table_schema as schema,
        case
            when view_definition ilike '%create materialized view%'
                then 'materialized_view'
            else 'view'
        end as type
    from information_schema.views
    where table_schema ilike '{schema}'
"""

# Establish a connection to the database
conn = psycopg2.connect(
    dbname="dbt_ci",
    host="<host>",
    port="5439",
    user="dbt_ci",
    password=os.environ.get("DBT_PROFILE_PASSWORD"),
)

# Create a cursor object
cur = conn.cursor()
print(query)

# Execute a query
cur.execute(query)

# Fetch all the rows
rows = cur.fetchall()
for row in rows:
    print(row)

# Close the cursor and connection
cur.close()
conn.close()