Use Qemu emulation on GHA for Vagrant tests #503
mikedep333 commented on Dec 4, 2020
To ensure that git recognizes that they are existing ones rather than new ones. This would likely be a problem as we are changing them heavily in the next commit. [noissue]
fi

sed -i -e 's/memory: 10500/memory: 5500/g' vagrant/boxes.d/*
sed -i -e 's/cpus: 4/cpus: 2/g' vagrant/boxes.d/*
does GHA not handle 4?
GHA only has 2 virtual CPUs, just like Travis.
These 2 lines only have an effect when we run pulp2-nightly-pulp3-source-centos7 on CI, which we currently do not do.
@@ -99,3 +99,31 @@ jobs:
          PY_COLORS: '1'
          ANSIBLE_FORCE_COLOR: '1'
        shell: bash
  vagrant:
I'm wondering if we should run for branches instead of PRs
So we'd test immediately after merging?
I'm open to that. I figured we'd adjust the schedule according to dev feedback, or look into allowing merges before these tests have completed.
sounds good to me
# timeout 90 is needed Pulp to service its 1st request on extremely slow
# machines, such as qemu-emulated 2-core machines
I think it needs a new changelog entry
Yeah, I agree. I'll make it a separate commit.
@gerrod3 this is the change I mentioned
# timeout 90 is needed Pulp to service its 1st request on extremely slow
# machines, such as qemu-emulated 2-core machines
I think it needs a new changelog entry
I'll combine it with pulpcore-api's commit/changelog entry, yeah. It may not even be needed for pulpcore-content, but I did it at the same time.
@gerrod3 same here
- name: Ensure Pulp is up and healthy
  uri:
    url: "http://{{ __pulp_default_host if pulp_api_bind.startswith('unix:') else pulp_api_bind }}/pulp/api/v3/status/"
    status_code: "200"
    unix_socket: "{{ pulp_api_bind | regex_replace('^unix:', '') if pulp_api_bind.startswith('unix:') else omit }}"
    # Alternate between 30 second timeouts and 5 second timeouts when handling the situation of
why these magic numbers?
30 is the default. It handles hypothetical extremely long times for the webserver to produce the page.
We intermix it with 5 second timeouts because, if the server isn't up yet and suddenly is, 5 seconds is usually sufficient, I would think. Just my guesstimate.
If your question is about why I combine a short and a long timeout, the purpose is twofold:
- To lower the total amount of time that the CI or the user may wait on the health check when we get "connection timed out". 38 tries * (30 second timeout + 6 second wait) + 37 tries * (5 second timeout + 6 second wait) = about 29.6 minutes, rather than 45 minutes, while still passing a theoretical 29.9 second response time.
- To ensure a large number of checks over a medium period of time (75 tries * 6 second wait = 7.5 min) when the server responds with "connection refused", which takes under a second.
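The arithmetic in this comment can be checked with a short Python sketch. It is only an illustration of the numbers quoted above, not code from the PR. The constant and function names are hypothetical. It assumes that odd-numbered tries get the 30 second timeout, which is inferred from the 38/37 split, and it counts a 6 second wait after every try, as the comment does.

```python
# Illustrative check of the health-check timing discussed above.
# All names here are hypothetical; the values come from the comment.
RETRIES = 75          # total number of tries
DELAY = 6             # seconds waited between tries
LONG_TIMEOUT = 30     # default timeout, in seconds
SHORT_TIMEOUT = 5     # shorter timeout, in seconds


def timeout_for_attempt(attempt: int) -> int:
    """Return the timeout for a 1-based attempt number.

    Assumption: odd attempts use the long timeout, even attempts the
    short one, which yields 38 long and 37 short tries out of 75.
    """
    return LONG_TIMEOUT if attempt % 2 == 1 else SHORT_TIMEOUT


def worst_case_alternating(retries: int = RETRIES, delay: int = DELAY) -> int:
    """Seconds spent if every try ends in "connection timed out"."""
    return sum(timeout_for_attempt(i) + delay for i in range(1, retries + 1))


def worst_case_fixed(retries: int = RETRIES, delay: int = DELAY,
                     timeout: int = LONG_TIMEOUT) -> int:
    """Seconds spent if every try used the same timeout and timed out."""
    return retries * (timeout + delay)


def worst_case_refused(retries: int = RETRIES, delay: int = DELAY) -> int:
    """Seconds spent if every try fails almost instantly with
    "connection refused", so only the waits between tries count."""
    return retries * delay


if __name__ == "__main__":
    print(worst_case_alternating() / 60)  # alternating 30 s / 5 s timeouts
    print(worst_case_fixed() / 60)        # always the 30 s timeout
    print(worst_case_refused() / 60)      # connection refused every time
```

Under these assumptions the alternating scheme totals 1775 seconds (just under 30 minutes), compared with 2700 seconds (45 minutes) for a fixed 30 second timeout and 450 seconds (7.5 minutes) when every try is refused.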
It was more about where they came from. Thank you for the full explanation.
Solution: Raise the Pulp server's gunicorn worker timeout to 90 seconds.

fixes: #8228 Pulp Connection Timed Out on slow emulated machines
https://pulp.plan.io/issues/8228
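For context, gunicorn's worker timeout (30 seconds by default) is how long a worker may stay silent before it is killed and restarted. The sketch below shows the setting as it would look in a gunicorn Python configuration file. The file name is hypothetical, and this thread does not show whether pulp_installer sets the value through a config file or through the `--timeout` command-line option, so treat it as an illustration of the setting only.

```python
# gunicorn.conf.py (hypothetical file name, used here for illustration only)

# Seconds a worker may stay silent before gunicorn kills and restarts it.
# gunicorn's default is 30; the commit message above raises it to 90 so that
# slow, emulated machines have time to serve their first request.
timeout = 90
```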
Adapted from: pulp/pulplift#66 "RFC: Testing nested Virtualization"

Implementation includes:
1. Upgrade Qemu from 4.4 to 5.2 from our PPA to address a severe bug affecting CentOS 7 guests: they could not even validate SSL certs with curl / yum or create the Pulp postgres database.
2. Upgrade the rest of the virtualization stack on Ubuntu.
3. Address the EL8 vagrant-sshfs workaround task failing due to a GPG signature mismatch.
4. Work around a bug with VM storage on the newer virtualization stack.
5. Switch the boxes used on CentOS 7 for more recent updates.
6. Reduce how long the Pulp health check may take, particularly when there is a "connection timed out".

workaround #8095: FIPS failure in geerlingguy.postgresql by using an old version.
https://pulp.plan.io/issues/8095
workaround #7993: pulp_installer fails to create the database on EL7 when LANG=C.UTF-8
https://pulp.plan.io/issues/7993
fixes: #7884 Move the pulp_installer Vagrant tests off Travis
https://pulp.plan.io/issues/7884
Thank you Mike!
Attached issue: https://pulp.plan.io/issues/8228
Attached issue: https://pulp.plan.io/issues/7884