Skip to content

Conversation

@kukushking
Copy link
Contributor

Issue #765

Description of changes:
Add header support for CSV datasets.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@kukushking kukushking requested a review from jaidisido June 30, 2021 14:32
wr.s3.read_csv(path=paths, iterator=True)


@pytest.mark.parametrize("header", [True, ["identifier"]])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that the code change you made only applies when there is a database and table specified as params in the call. Yet the to_csv call in this test does not reference a Glue table. Should we test that instead? Also can we check that if we make more than one to_csv call to the same Glue table and read it back there isn't a duplicate header line in the resulting dataframe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, completely forgot about that in the test. I changed the test case + added a new one testing multiple modes.

@jaidisido
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubCodeBuild8756EF16-pDO66x4b9gEu
  • Commit ID: 5c19be0
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@jaidisido jaidisido merged commit 777e04f into main Jul 1, 2021
@jaidisido jaidisido deleted the feat-s3-csv-header branch July 1, 2021 11:53
@kukushking kukushking self-assigned this Nov 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants