@tiechengsu

When we call the pandas.to_redshift function with a column name that contains a double underscore, such as __dc_timelabel, it internally calls to_s3, which applies normalize_columns_names_athena. The column ends up named dc_timelabel in Redshift, which is unexpected behavior.

The issue is that Athena actually supports repeated underscores, so we should not remove them.

This PR disables the rule that removes repeated underscores in the function normalize_columns_names_athena in athena.py.
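For illustration only, here is a minimal sketch of the rule in question. It is not the library's code: `normalize_column_name` and its `collapse_underscores` flag are hypothetical names, and the sketch isolates just the underscore-collapsing step, so its output for the collapsed case will not necessarily match what the real normalize_columns_names_athena produced.

```python
import re


def normalize_column_name(name: str, collapse_underscores: bool = False) -> str:
    """Hypothetical, simplified normalizer (not the awswrangler implementation)."""
    # Replace common separators with underscores and lowercase the result.
    name = name.replace(" ", "_").replace("-", "_").replace(".", "_").lower()
    if collapse_underscores:
        # The kind of rule this PR disables: squeeze runs of underscores into one.
        name = re.sub(r"_+", "_", name)
    return name


# With the rule disabled (the default here), the double underscore survives.
preserved = normalize_column_name("__dc_timelabel")
# With the rule enabled, the double underscore is lost.
collapsed = normalize_column_name("__dc_timelabel", collapse_underscores=True)
print(preserved, collapsed)
```

Keeping the collapse step behind a flag is just a convenience for comparing the two behaviors side by side; the PR itself simply turns the rule off.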

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@igorborgest igorborgest self-assigned this Apr 8, 2020
@igorborgest igorborgest added the bug (Something isn't working), major release (Will be addressed in the next major release), and WIP (Work in progress) labels Apr 8, 2020
@igorborgest
Contributor

igorborgest commented Apr 8, 2020

Hi @tiechengsu, thanks for opening this.

@igorborgest igorborgest merged commit 9c95d12 into aws:master Apr 8, 2020
@igorborgest igorborgest removed the WIP (Work in progress) label Apr 8, 2020
@tiechengsu tiechengsu deleted the normalize_name_repeated_underscores branch April 8, 2020 04:52