Closed
Description
Driver version
2.0.907
Redshift version
Redshift 1.0.38698
Client Operating System
Docker python:3.10.2 image
Python version
3.10.2
Table schema
test_column: INTEGER
Problem description
When copying a file from S3 into Redshift via awswrangler, a data type mismatch correctly raises an error. However, the error message is truncated (the context field in the trace below is cut off partway through the S3 path), which makes the issue hard to debug in non-trivial applications.
Python Driver trace logs
redshift_connector.error.ProgrammingError: {'S': 'ERROR', 'C': 'XX000', 'M': 'Spectrum Scan Error', 'D': "
error: Spectrum Scan Error
code: 15007
context: File 'https://s3.region.amazonaws.com/bucket/bucket_directory/subdirectory/afile.snappy.parquet' has an incompatible Parquet schema for column 's3://bucket/bucket_director
query: 1234567
location: dory_util.cpp:1226
process: worker_thread [pid=12345]
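To make the truncation easier to spot, the detail text can be split into its "key: value" fields. The following is a minimal sketch, not part of the driver: the helper names parse_error_detail and detail_from_exception are hypothetical, and reading the payload from exc.args[0] is an assumption based on the dict shown in the trace above. It only exposes the fields as received; it cannot recover text the server or driver has already cut off.

```python
def detail_from_exception(exc: Exception) -> str:
    """Return the 'D' (detail) entry of the error payload, if present.

    Assumption: the driver passes the error dict as the first exception
    argument, as the trace above suggests.
    """
    payload = exc.args[0] if exc.args else {}
    if isinstance(payload, dict):
        return payload.get("D", "")
    return str(payload)


def parse_error_detail(detail: str) -> dict[str, str]:
    """Split a multi-line detail string into its 'key: value' fields."""
    fields: dict[str, str] = {}
    for line in detail.splitlines():
        line = line.strip()
        # Skip blank lines and separator lines made only of dashes.
        if not line or set(line) <= {"-"}:
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Detail text copied from the trace in this issue.
sample_detail = """
error: Spectrum Scan Error
code: 15007
context: File 'https://s3.region.amazonaws.com/bucket/bucket_directory/subdirectory/afile.snappy.parquet' has an incompatible Parquet schema for column 's3://bucket/bucket_director
query: 1234567
location: dory_util.cpp:1226
process: worker_thread [pid=12345]
"""

fields = parse_error_detail(sample_detail)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Printing each field on its own line shows that only the context value ends mid-path, while code, query, and location arrive complete.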
Reproduction code
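import pandas as pd
import awswrangler as wr

# Placeholder Glue connection name (assumption; not given in the report).
con = wr.redshift.connect(connection="my-glue-connection")

# The target column is INTEGER, but the DataFrame holds a float.
df = pd.DataFrame([[1.23]], columns=["test_column"])
wr.redshift.copy(
    df=df,
    path="s3://bucket/bucket_directory/subdirectory/",
    con=con,
    table="test",
    schema="public",  # assumed schema; not given in the report
)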