Fixing nested struct bug in athena2pyarrow method #612

Merged (4 commits) on Apr 6, 2021

jaidisido
Contributor

Issue #, if available:
#585

Description of changes:
The athena2pyarrow method calls itself recursively for array, struct and map types. Each recursive call should therefore receive the original dtype, not the modified (lowercased) dtype. Otherwise, the field names in the returned pyarrow type won't match the ones in the table.

Example:
For a column 'onboardcountdetails': 'array<struct<CarRef:string,OnboardCount:int>>', the current implementation of athena2pyarrow returns the pyarrow type list<item: struct<carref: string, onboardcount: int32>> (note the lowercase carref and onboardcount), which causes a column mismatch.

With this fix, it returns list<item: struct<CarRef: string, OnboardCount: int32>>, which is correct.
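
The difference between the two behaviours can be sketched without pyarrow. The snippet below is a minimal, hypothetical stand-in, not the library's actual code: `convert`, `split_fields`, `PRIMITIVES` and the `pass_original` flag are names invented for illustration, and the function returns strings that mimic pyarrow's type notation instead of real pyarrow types. It assumes the "modified dtype" is a lowercased, whitespace-stripped copy used for keyword matching.

```python
from typing import List

# Hypothetical subset of primitive mappings, for illustration only.
PRIMITIVES = {"string": "string", "int": "int32", "bigint": "int64", "double": "double"}


def split_fields(body: str) -> List[str]:
    """Split on commas that are not nested inside <...>."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(body[start:i])
            start = i + 1
    parts.append(body[start:])
    return parts


def convert(dtype: str, pass_original: bool) -> str:
    original = dtype.replace(" ", "")
    lowered = original.lower()  # used only to recognise type keywords
    # The string whose slices are handed to the recursive calls:
    # the original (fixed behaviour) or the lowercased copy (buggy behaviour).
    inner = original if pass_original else lowered
    if lowered in PRIMITIVES:
        return PRIMITIVES[lowered]
    if lowered.startswith("array<"):
        return f"list<item: {convert(inner[6:-1], pass_original)}>"
    if lowered.startswith("struct<"):
        fields = []
        for field in split_fields(inner[7:-1]):
            name, ftype = field.split(":", 1)
            fields.append(f"{name}: {convert(ftype, pass_original)}")
        return "struct<" + ", ".join(fields) + ">"
    if lowered.startswith("map<"):
        key, value = split_fields(inner[4:-1])
        return f"map<{convert(key, pass_original)}, {convert(value, pass_original)}>"
    raise ValueError(f"Unsupported type: {dtype}")


example = "array<struct<CarRef:string,OnboardCount:int>>"
print(convert(example, pass_original=False))
# list<item: struct<carref: string, onboardcount: int32>>
print(convert(example, pass_original=True))
# list<item: struct<CarRef: string, OnboardCount: int32>>
```

Matching keywords against the lowercased copy while slicing the original keeps type detection case-insensitive without touching the field names.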

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

jaidisido self-assigned this on Mar 19, 2021
jaidisido added the "bug" and "minor release" labels on Mar 19, 2021
jaidisido added this to the 2.7.0 milestone on Mar 19, 2021
@jaidisido
Contributor Author

AWS CodeBuild CI Report

  • CodeBuild project: GitHubCodeBuild8756EF16-vPH6gKH92ax6
  • Commit ID: 0559620
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@jaidisido
Contributor Author

AWS CodeBuild CI Report

  • CodeBuild project: GitHubCodeBuild8756EF16-sDRE8Pq0duHT
  • Commit ID: 2cea8d4
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

jaidisido merged commit 6b93548 into main on Apr 6, 2021
jaidisido deleted the bug-585-nested-struct branch on April 6, 2021 10:05