Skip to content
This repository has been archived by the owner on May 6, 2020. It is now read-only.

qa: stability: check multi launch and rm stability #491

Merged

Conversation

grahamwhaley
Copy link
Contributor

launch a number of containers, and then 'rm -f' them all in one go.
Check we have the right number of 'components' running at each
stage.

Fixes: #490

Signed-off-by: Graham whaley graham.whaley@intel.com

@clearcontainersbot
Copy link

Popular Images qa-passed 👍

}

function init() {
kill_all_containers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be good to restart docker/cc-proxy services at this point

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yea, I guess. It almost seem backwards - make the system 'more stable' to try and 'find instabilities', but for repeatability it makes a lot of sense. Let me see if I can do something by re-using the stuff we have in the metrics lib files (would be good if we could share some of that across metrics and QA for instance).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it is time to add a lib directory in the root of the repository, and start populating it and using it. But this can be another PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An idea from @chavafg - we can also check that there are no pending docker mounts left after the test - something else we have spotted as a potential issue (clearcontainers/packaging#110)

}

function check_all_running() {
echo Checking ${how_many} containers have all relevant components
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing quotes in this echo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strictly not needed, well not unless ${how_many} might expand into a command line argument or something ;-), but sure, good practice to have - I'll go add

}

function kill_all_containers() {
present=$(docker ps -qa | wc -l)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe count_containers() could be used here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, good point!

how_many_proxys=$(ps --no-header -C ${PROXY_NAME} | wc -l)
if (( ${how_many_running} >= 1 )); then
# If we have any containers, then we expect to have a single proxy
if (( ${how_many_proxys} != 1 )); then
Copy link
Contributor

@MarioCarrilloA MarioCarrilloA Aug 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this tests is designed for runc as a RUNTIME, could be ${how_many_proxys} > 1, this is due to runc does not use cc-proxy or something like that, then how_many_proxys will be 0 and this is an expected result

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Right now the primary goal is to run with CC, but if we were to support 'other' runtimes, then yes, we need to make the script more sentient.
I'm probably about to code something similar up in another script - I'll see if I get to apply that here as well.

@grahamwhaley
Copy link
Contributor Author

I've rebase and pushed - let's see if we can get the CI to pass and ack/merge this?
We can then open an issue to add the mount-checks.
Oh, and Travis CI will fail as I will have rebased on the PR that just landed that removed the Travis config...

@clearcontainersbot
Copy link

kubernetes qa-passed 👍

@clearcontainersbot
Copy link

kubernetes qa-passed 👍

@grahamwhaley
Copy link
Contributor Author

@MarioCarrilloA updated with changes
@devimc added a check for 'dangling mounts'. It is a bit of a simple check - just check we have the same number of mounts in the system after we've deleted all the containers that we had before we launched them. I did that as checking specifically for mounts related to docker and to check after each container launch and delete I believe would need fine knowledge of which graph driver was being used etc. This simple check should be good enough to catch any dangling mounts - it will just catch them at the end of each test cycle rather than at the exact point of failure.
@mcastelino if you are looking at some of the 'parts of the runtime' or 'mount points' left dangling issues then this test might help you reproduce them in a stable manner.

@grahamwhaley
Copy link
Contributor Author

@MarioCarrilloA I should have said - I also made it so you should be able to run with other RUNTIMEs as well (which helped me sanity check the script ;-) )

@clearcontainersbot
Copy link

kubernetes qa-passed 👍

@chavafg
Copy link
Contributor

chavafg commented Oct 2, 2017

lgtm

Approved with PullApprove Approved with PullApprove

# The goals are two fold:
# - spot any stuck or non-started components
# - catch any hang ups

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to have a set -e in here. Note that I checked with that locally, but the script then exited prematurely so I had to change a few of the functions to add || true and also this...

((how_many++))

... had to change to:

how_many=$((how_many + 1))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm. I'm a bit wary of setting -e here - if then docker or something fails because we have a bug then the script would just quit, without a chance of dumping any diagnostics. I think I'd rather have the script find that something 'had not happened', and then give it the chance to dump out the diagnostics.
Let me push an update to the script btw, which checks some other things (that cc-runtime list and /var/lib/virtcontainers/pods both have the correct number of items - and also I might have the 'check number of mounts' in as well - but I need to stare at that a bit)... hold on, I'll do a push - but, be warned, with the runtime as it is at the minute the test will fail on the vc/pods check I believe)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given what this script is doing, yeah that's a fair point.

launch a number of containers, and then 'rm -f' them all in one go.
Check we have the right number of 'components' running at each
stage.

Fixes: clearcontainers#490

Signed-off-by: Graham whaley <graham.whaley@intel.com>
@grahamwhaley
Copy link
Contributor Author

@jodh-intel I've pushed the new version - it does also check the mount count.

@clearcontainersbot
Copy link

kubernetes qa-passed 👍

# The goals are two fold:
# - spot any stuck or non-started components
# - catch any hang ups

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given what this script is doing, yeah that's a fair point.

for ((i=1; i<= ITERATIONS; i++)); do {
echo "Start loop $i"
#spin them up
go
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this can cause clashes with the go command

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh - fair comment. I suspect there is some bash parse/ordering/precedent rule, but it's going to be easier for me to just rename the func ...

@jodh-intel jodh-intel merged commit 380375f into clearcontainers:master Oct 13, 2017
mcastelino pushed a commit to mcastelino/tests that referenced this pull request Jan 23, 2019
…xists

k8s: check cni inteface exists before delete it
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants