Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply "unify bucket and key" before "provide bucket" #28710

Merged
merged 19 commits into from
Jan 6, 2023

Conversation

dstandish
Copy link
Contributor

@dstandish dstandish commented Jan 3, 2023

Previously if user provided full key it may be overwritten by conn bucket but now we fix ordering of decorators and this won't happen.

This fix doesn't seem to break backcompat because previously you'd get call(Bucket='bucket', Key='s3://other-bucket/file.txt') which should fail anyway.

depends on #28707

Copy link
Member

@feluelle feluelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but to me the tests are looking quite complex because of all the branches.

airflow/providers/amazon/aws/hooks/s3.py Outdated Show resolved Hide resolved
tests/providers/amazon/aws/hooks/test_s3.py Outdated Show resolved Hide resolved
tests/providers/amazon/aws/hooks/test_s3.py Outdated Show resolved Hide resolved
@dstandish
Copy link
Contributor Author

ok @o-nikolas and @feluelle i have updated these tests to move the "tokens" to individual params. it's not quite "putting the test values in the tuple", which, if i understood correctly, is what yall were thinking of (and what i more or less strenuously object to doing, myself anyway) but hopefully it's "orthodox enough" to not raise anyone's hackles (i kid) while still being relatively compact and readable. 🥳

Comment on lines +923 to +924
def test_s3_head_object_decorated_behavior(mock_conn, has_conn, has_bucket, key_kind, expected):
if has_conn == "with_conn":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making these changes @dstandish! It's very much appreciated 🙏 and is indeed what I was describing before. The only other improvement is that you can actually just make many of the params here simply booleans instead of strings. This would allow you to get rid of the string comparisons in the conditionals of the test body.

But of course feel free to do this or not :)

P.S. I tried, but unfortunately cannot, add a full suggested change for the above because the snippet includes deleted lines 927/928 and the GitHub UI does not allow that, and I didn't want to include a partial one so as to avoid confusion.

@o-nikolas
Copy link
Contributor

ok @o-nikolas and @feluelle i have updated these tests to move the "tokens" to individual params. it's not quite "putting the test values in the tuple", which, if i understood correctly, is what yall were thinking of (and what i more or less strenuously object to doing, myself anyway)

Thanks @dstandish! And this was actually exactly what I was intending in the previous CR, apologies if we miscommunicated there 🙏
I only left one comment for a way to optimize it further (using booleans as the param values instead of strings for everything but expected). Feel free to do so or not 😃

@feluelle
Copy link
Member

feluelle commented Jan 5, 2023

Thanks @dstandish. I appreciate that. :)
What I actually meant is 729e275. Feel free to revert if you don't like it, but this is how I would do it.

I think writing tests is not about being DRY. It much more important that they are readable and maintainable.

@potiuk
Copy link
Member

potiuk commented Jan 5, 2023

Glad to see we got to common conclusions even after some initial strugles in communication :)

BTW.

I think writing tests is not about being DRY. It much more important that they are readable and maintainable.

Yep. 100% agree. DRY is important but for tests DAMP is importanter :D :D

https://enterprisecraftsmanship.com/posts/dry-damp-unit-tests

@dstandish
Copy link
Contributor Author

Thanks @dstandish. I appreciate that. :)
What I actually meant is 729e275. Feel free to revert if you don't like it, but this is how I would do it.

By all means, gold plate this to your hearts content 😜

Now if we can figure out why tests failing....

Meanwhile I'm in the mountains and checked out for a few days 🌧️

- resolve "no conn", "no bucket", "rel key" args before running the test
@dstandish
Copy link
Contributor Author

dstandish commented Jan 6, 2023

ok i think i have gotten the test issue fixed now... it was the result of reloading the s3 module. after that, mocks don't work.

separately i had a chance to look at the change @feluelle and i reverted, in small part cus it was failing :) but mainly because...

this is sort of why i objected to doing it that way.
with your way, we see a bunch of test values but we don't really know what they are there for. sure you could use param class and add an ID but that is a lot more noise. and it risks that the params differ from the description (id).
meanwhile, my params, to borrow a phrase, are "descriptive and meaningful phrases", e.g.
"unify", "no_conn", "no_bucket", "full_key"
this scenario covers the case unify first, with no connection (or no schema in connection), and no bucket provided as kwarg, and with a full key.
meanwhile, with yours we see
None, {"key": "s3://key_bucket/key.txt"}
if we look at this, how are we to know what it's testing? and when we look at all the params together, how do we know if we have missed a case?
with mine, we just need to verify that we all combinations of the 4 binary choices. when using the values directly, they don't quite combine that way.
and, this is also related to why i didn't want to make them booleans, which @o-nikolas suggested. by using the string values, we get good test names. test names that clearly indicate the scenario tested.
anyway, not that these other approaches aren't fine, just explaining my choices here.

thanks

@feluelle
Copy link
Member

feluelle commented Jan 6, 2023

sure you could use param class and add an ID but that is a lot more noise.

Not to me, but okay.

Instead of

# full key
# no conn - no bucket - full key
(None, {"key": "s3://key_bucket/key.txt"}, ["key_bucket", "key.txt"]),

you would do:

param(None, {"key": "s3://key_bucket/key.txt"}, ["key_bucket", "key.txt"], id="unify-no_conn-no_bucket-full_key"),

and you get more descriptive output in pytest.

and it risks that the params differ from the description (id).

Isn't this the same as with comments (we have above each line)? And comments also can be valuable. You just need to make sure to update them accordingly.

@dstandish
Copy link
Contributor Author

Isn't this the same as with comments (we have above each line)? And comments also can be valuable. You just need to make sure to update them accordingly.

yes absolutely it is, and this is a good reminder to remove them, which i've now done. i also rearranged things so that the truth table is built in the normal, sane way, and unify vs provide remain innermost (since those are what we are actually comparing here). now it's much easier to see that we have everything covered, and we no longer need the comments.

@dstandish dstandish merged commit 3eee33a into apache:main Jan 6, 2023
@dstandish dstandish deleted the actually-fix-aws-s3-dec-pref branch January 6, 2023 19:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers provider:amazon-aws AWS/Amazon - related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants