Skip to content

Conversation

@bryanyang0528
Copy link
Contributor

@bryanyang0528 bryanyang0528 commented Jun 1, 2020

Issue #, if available:
add unit test for #271

Description of changes:
Add a unit test for testing pandas argument and encoding and newline argument for the fs.open command(https://github.com/awslabs/aws-data-wrangler/blob/b57e2b7245e999191918022341943884e49b2e4e/awswrangler/s3.py#L1654)

p.s. Because pandas.read_csv is in the pandas.concat section, a TypeError will raise when concat get a mock object returned from pandas.read_csv. So I used a try catch block to aviod the exception.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@igorborgest
Copy link
Contributor

Thanks a lot @bryanyang0528!

You can ignore the failed check above, it's my fault, I will fix it.

@igorborgest igorborgest self-assigned this Jun 1, 2020
@igorborgest igorborgest self-requested a review June 1, 2020 16:15
@igorborgest igorborgest added the enhancement New feature or request label Jun 1, 2020
@bryanyang0528
Copy link
Contributor Author

Thanks!

Copy link
Contributor

@igorborgest igorborgest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the TypeError expected? Pandas only support a single character as line terminator. So are we using "\r\n" to force the exceptions? I think I didn't get the point here.

In addition to that, if the exception was intentional, what you think about make pytest expect it like in this example?

@bryanyang0528
Copy link
Contributor Author

@igorborgest yes, you are right. I forgot this way.
This is the output if I don't catch the exception, but it shout be caught by pytest.raises :
Screen Shot 2020-06-02 at 00 49 17

@igorborgest
Copy link
Contributor

@bryanyang0528 this error is weird. Seems that the mocking are not working as expected.

Btw, what is the purpose of mocking these function? Why not let it simply run against moto as usual?

@bryanyang0528
Copy link
Contributor Author

bryanyang0528 commented Jun 1, 2020

@igorborgest I saw that you parsed parameters like encoding and newline from parameters which will pass to pandas and used them in fs.open. The logic here is a little complex for me. So I would like to use mock functions to make sure these parameters will be sent to both functions as expected.

@igorborgest
Copy link
Contributor

@bryanyang0528 Oh I see. And are you expecting this error raised from pandas.concat? it seems a little bit weird to me. But if you are OK with that I will move on and merge it. :)

@bryanyang0528
Copy link
Contributor Author

bryanyang0528 commented Jun 1, 2020

@igorborgest This exception is intentional for me.

  1. pd.read_csv will be called in this function . https://github.com/awslabs/aws-data-wrangler/blob/b57e2b7245e999191918022341943884e49b2e4e/awswrangler/s3.py#L1641

  2. And the result of _read_text_full will be a part of pd.concat here https://github.com/awslabs/aws-data-wrangler/blob/b57e2b7245e999191918022341943884e49b2e4e/awswrangler/s3.py#L1606

So when I mock read_csv function. _read_text_full will return a mocked object to pd.concat, a TypeError will raise because of mocked object is not Series and DataFrame object for pd.concat.

@igorborgest
Copy link
Contributor

@bryanyang0528 Understood. Thanks for the extended explanation!

@igorborgest igorborgest merged commit a137e07 into aws:dev Jun 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants