
Fix AWS DataSync tests failing (#10985) #11020

Merged: 19 commits, Nov 25, 2020

Conversation

baolsen (Contributor) commented Sep 18, 2020

Simple fix for the DataSync operator tests failing.

The moto library was not returning a response containing the 'Result' key when calling describe_task_execution, which the real DataSync service normally returns. The Result was only used for logging within the DataSync operator, so it can simply be skipped during testing if it is not found in the response.
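In other words, the change amounts to treating 'Result' as optional in the describe_task_execution response. A minimal sketch of that pattern, assuming a hook with a boto3 DataSync client; the function and argument names here are illustrative placeholders, not the exact operator code:

```python
def _log_task_execution_result(hook, task_execution_arn, log):
    """Log the task execution result only if the API (or moto) returns one."""
    response = hook.get_conn().describe_task_execution(
        TaskExecutionArn=task_execution_arn
    )
    # Older moto releases omit the 'Result' key, so treat it as optional.
    result = response.get("Result")
    if result is not None:
        log.info("Task execution result: %s", result)
    else:
        log.info("No Result returned for %s", task_execution_arn)
```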

boring-cyborg (bot) added the provider:amazon-aws (AWS/Amazon - related issues) label on Sep 18, 2020
baolsen (Contributor, Author) commented Sep 18, 2020

My GitHub Actions runs were timing out on my repo, so I decided to try creating a PR upstream and see whether the CI checks work correctly here.

(Review comment on setup.py; outdated, resolved)
kaxil (Member) left a review comment

Tests were skipped; updated the version 🤞

baolsen (Contributor, Author) commented Sep 19, 2020

The build is failing, and I get the same failure pattern and errors locally if I pip install moto==1.3.14. 1.3.14 does not have the DataSync mocks, so Airflow shouldn't be using it; not sure why that is happening.

moto==1.3.16 works much better on my side, so I think there is still a reference to 1.3.14 somewhere in Airflow.

baolsen (Contributor, Author) commented Sep 19, 2020

I think it may have still been pinned; something went wrong with my force push. Retrying.

baolsen (Contributor, Author) commented Sep 19, 2020

Looks like it is still pinned to moto 1.3.14 instead of 1.3.16. I can't find any reason for this in our codebase: setup.py looks fine and there are no other references to specific moto versions.

@kaxil, any suggestions? :) Perhaps another package Airflow installs is limiting the moto version? That is the only thing I can think of.

kaxil (Member) commented Sep 21, 2020

baolsen (Contributor, Author) commented Sep 21, 2020

@kaxil I see the same failures locally if I install moto 1.3.14, so I think 1.3.14 is still pinned somewhere, somehow:
"NotImplementedError: The delete_task action has not been implemented"

With moto 1.3.16 most of the tests pass when I run them locally. (There is still a little work to do on one test.)

@feluelle, would love to get your take on this... :)

kaxil (Member) commented Sep 21, 2020

Hmm. @potiuk might be able to help.

baolsen (Contributor, Author) commented Sep 21, 2020

FYI, the latest push fixes the last of the failing tests (I ran test_datasync.py locally using pytest with moto 1.3.16 installed).

Still, per the comments above, we need to figure out why and where moto==1.3.14 is being pinned by Airflow.

baolsen (Contributor, Author) commented Sep 21, 2020

@kaxil, maybe this is because of constraints? I'm not sure exactly how these constraints work, but I can see they are referenced in the Dockerfile and in Breeze, and the constraints files themselves pin moto==1.3.14. Example:
https://raw.githubusercontent.com/apache/airflow/constraints-master/constraints-3.6.txt

potiuk (Member) commented Sep 21, 2020

This is precisely why it happened, @baolsen @kaxil. This is all expected.

This is the raw log of the "Build Image" workflow corresponding to your build:

https://pipelines.actions.githubusercontent.com/dG18C1r9Veh0Ng8eypyIHvAXk1Mz6Bvz6mj1myj6BZplhvMhGo/_apis/pipelines/1/runs/17875/signedlogcontent/19?urlExpires=2020-09-21T19%3A09%3A38.6094424Z&urlSigningMethod=HMACV1&urlSignature=MVvgOsdVUakamt%2BsTq0HxIU7BQJyTNPzmUsz2u3aySE%3D

The line from log:

2020-09-21T13:32:24.8603720Z Requirement already satisfied: moto==1.3.14 in /usr/local/lib/python3.6/site-packages (from -c https://raw.githubusercontent.com/apache/airflow/constraints-master/constraints-3.6.txt (line 236)) (1.3.14)

This is perfectly fine. If your change opens up a newer version, it might not be used yet. The constraints are only updated after the merge, when all tests succeed for the upgraded version. This is to protect PR authors from accidentally failing on seemingly unrelated upgrades in requirements. Imagine you add a new dependency (say presto) and want to make a PR with it, but in the meantime the latest moto version starts breaking the build. If we did not use constraint files, such things could happen easily and you would have to fix unrelated problems in your change.

The workflow works so that only AFTER your change is merged, the master build will attempt to upgrade all dependencies eagerly, run all the tests, and modify the constraints only if all the tests succeed. Then all builds after that constraints push will start using the upgraded moto.

So in your case, if you have locally tested that your change works with the latest moto, all is fine. After the merge and successful tests in master, everyone will be greeted with the updated moto :)

potiuk (Member) commented Sep 21, 2020

And if you really want to make a change where you bump moto and (for example) introduce a limit of version >1.3.15, that will also work. setup.py has precedence over constraints, so if setup.py has >1.3.15 and the constraints say ==1.3.14, the 1.3.16 version will be installed.
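As a rough illustration of where such a lower bound would live, a simplified setup.py sketch is below; the package name, extra name, and list structure are stand-ins, not the actual Airflow setup.py, which organises its dependencies differently:

```python
# Simplified sketch: declaring a minimum moto version in a setup.py extra.
from setuptools import setup

devel_requirements = [
    "moto>=1.3.16",  # DataSync mocks (e.g. delete_task) only exist from 1.3.16
]

setup(
    name="example-package",
    extras_require={"devel": devel_requirements},
)
```

As the later comments in this thread note, the CI image build at the time still ended up with the constrained 1.3.14, so this precedence did not work as expected in practice.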

(Review comments on setup.py; outdated, resolved)
potiuk (Member) commented Sep 22, 2020

I see now that we do want to get >=1.3.16 :). Good. :)

baolsen (Contributor, Author) commented Sep 22, 2020

Hey @potiuk, thanks for the feedback. It looks like moto 1.3.14 is still pinned despite our changes in setup.py.

The test errors are:
"NotImplementedError: The delete_task action has not been implemented"
which shows that moto 1.3.14 is in use; I can reproduce this locally with that version.

Your link expired, but I tried to use "Search the logs" on the build and can't find anything for "moto". Not sure what to check or do next :)

potiuk (Member) commented Sep 22, 2020

Yeah, I see it. It seems that setup.py does NOT take precedence over constraints after all. I have to figure something out then; I have an idea for that. Sorry for the trouble :(

feluelle (Member) left a review comment

LGTM 👍 (once we have fixed the issue with setup.py)

baolsen (Contributor, Author) commented Sep 29, 2020

Hey @potiuk, trust all is well :) Is there anything more I can do to assist? Not to nag, but I've got another DataSync PR waiting in the wings for this one to be merged :) Please let me know if and how I can help here.

kaxil (Member) commented Oct 6, 2020

@baolsen I think if you can confirm again that all the tests failing here pass for you locally with the fixes, we can just go ahead, merge, and see.

cc @potiuk

baolsen (Contributor, Author) commented Oct 7, 2020

Awesome. I ran the tests again, rebased, and pushed. Local results attached:
hook tests.txt
operator tests.txt

kaxil (Member) commented Oct 8, 2020

#11363 should fix the issue with testing this kind of thing within the PR.

baolsen (Contributor, Author) commented Nov 18, 2020

I have reverted the changes made to the test_batch_waiters hook and rebased onto master. If the failure is still present, I intend to disable the problematic test so that at least the remaining work in this PR can be merged.

I'm not sure what the process is to get that test fixed afterwards; perhaps log a new issue. But I feel that resolving this timeout / deadlock issue is not in the scope of this PR. We've already included the moto version upgrade, fixed DataSync, and fixed a bunch of other tests.

baolsen (Contributor, Author) commented Nov 18, 2020

I've skipped the problematic test. Should be good to merge now.

potiuk (Member) commented Nov 22, 2020

> I've skipped the problematic test. Should be good to merge now.

I think a better way to do it would be to mark it as pytest.xfail. That way we will see it in the output and will likely fix it some day when we do a cleanup. @kaxil WDYT? This could also help @turbaszek with fixing #12508.
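For reference, marking a test as an expected failure rather than skipping it looks roughly like this in pytest; the test name and reason below are illustrative placeholders, not the exact test touched in this PR:

```python
import pytest


@pytest.mark.xfail(
    reason="Intermittent timeout / deadlock; see the discussion in this PR",
)
def test_batch_waiters_example():
    # Hypothetical stand-in for the problematic moto-backed test body.
    ...
```

Unlike a plain skip, an xfail still shows up in the test summary, which is what makes it easier to remember to fix later.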

kaxil (Member) commented Nov 23, 2020

> > I've skipped the problematic test. Should be good to merge now.
>
> I think a better way to do it would be to mark it as pytest.xfail. That way we will see it in the output and will likely fix it some day when we do a cleanup. @kaxil WDYT? This could also help @turbaszek with fixing #12508.

Ya, I agree 👍

baolsen (Contributor, Author) commented Nov 24, 2020

Neat, I didn't know about that. Done and pushed.

mik-laj (Member) commented Nov 24, 2020

@baolsen CI is sad. Can you look at it?

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
kaxil merged commit 663259d into apache:master on Nov 25, 2020
baolsen (Contributor, Author) commented Nov 25, 2020

Thanks @kaxil and @mik-laj , glad to see this one merged!
