This repository has been archived by the owner on Mar 9, 2022. It is now read-only.

Add node e2e test CI. #145

Merged
merged 1 commit into from
Aug 21, 2017

Conversation

Random-Liu
Member

@Random-Liu Random-Liu commented Aug 20, 2017

This PR:

  1. Added node e2e test CI. I've enabled the daily build cron job. The node e2e test usually takes about 20 minutes, so for now let's run it only in the cron job to keep development fast.
  2. Changed Travis to build with golang 1.8.x and tip, but to test only with 1.8.x. @mikebrow
  3. Clearly defined the dependencies between the different Make targets.

Please note that:

  1. I've commented out the TRAVIS_EVENT_TYPE-related code for now, so that the node e2e test is still triggered in this PR. I'll uncomment it after node e2e passes.
  2. I'm also working on setting up a node e2e test framework in our test infra, which will test against different node images.
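
A minimal sketch of the cron-only gating described in the notes above, assuming the check is done in a shell step. `TRAVIS_EVENT_TYPE` is the environment variable Travis sets for each build (`cron` for cron jobs); the function names `should_run_node_e2e` and `maybe_run_node_e2e` are hypothetical and are not taken from this PR.

```shell
# should_run_node_e2e: succeed only when Travis reports a cron-triggered build.
should_run_node_e2e() {
  [ "${TRAVIS_EVENT_TYPE:-}" = "cron" ]
}

# maybe_run_node_e2e: hypothetical wrapper around the slow e2e step.
maybe_run_node_e2e() {
  if should_run_node_e2e; then
    echo "cron build: running node e2e"
    # make test-e2e-node   # left commented so the sketch has no side effects
  else
    echo "not a cron build: skipping node e2e"
  fi
}
```

Using `${TRAVIS_EVENT_TYPE:-}` keeps the check safe under `set -o nounset` when the script is run outside Travis.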

/cc @kubernetes-incubator/maintainers-cri-containerd
Signed-off-by: Lantao Liu <lantaol@google.com>

@Random-Liu
Member Author

I've run all the non-slow, non-serial and non-flaky tests locally. Here is the test result:

Summarizing 10 Failures:

[Fail] [k8s.io] Security Context When creating a container with runAsUser [It] should run the container with uid 65534 
/home/lantaol/workspace/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/pods.go:202

[Fail] [k8s.io] Kubelet Cgroup Manager Pod containers On scheduling a Guaranteed Pod [It] Pod containers should have been created under the cgroup-root 
/home/lantaol/workspace/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e_node/pods_container_manager_test.go:205

[Fail] [k8s.io] Kubelet Cgroup Manager Pod containers On scheduling a Burstable Pod [It] Pod containers should have been created under the Burstable cgroup 
/home/lantaol/workspace/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e_node/pods_container_manager_test.go:293

[Fail] [k8s.io] Security Context when creating containers with AllowPrivilegeEscalation [It] should not allow privilege escalation when false 
/home/lantaol/workspace/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/pods.go:202

[Fail] [k8s.io] Kubelet Cgroup Manager Pod containers On scheduling a BestEffort Pod [It] Pod containers should have been created under the BestEffort cgroup 
/home/lantaol/workspace/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e_node/pods_container_manager_test.go:249

[Fail] [k8s.io] Summary API when querying /stats/summary [It] should report resource usage through the stats api 
/home/lantaol/workspace/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e_node/summary_test.go:263

[Fail] [k8s.io] ImageID [It] should be set to the manifest digest (from RepoDigests) when available 
/home/lantaol/workspace/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e_node/image_id_test.go:64

[Fail] [k8s.io] AppArmor [Feature:AppArmor] when running with AppArmor [It] should enforce a permissive profile 
/home/lantaol/workspace/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e_node/apparmor_test.go:149

[Fail] [k8s.io] Kubelet Cgroup Manager QOS containers On enabling QOS cgroup hierarchy [It] Top level QoS containers should have been created 
/home/lantaol/workspace/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e_node/pods_container_manager_test.go:167

[Fail] [k8s.io] AppArmor [Feature:AppArmor] when running with AppArmor [It] should enforce a profile blocking writes 
/home/lantaol/workspace/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e_node/apparmor_test.go:149

Ran 183 of 246 Specs in 1372.085 seconds
FAIL! -- 173 Passed | 10 Failed | 0 Pending | 63 Skipped 

Ginkgo ran 1 suite in 22m52.377204165s
Test Suite Failed

Failures are mainly caused by several missing features:

  1. RunAsUser
  2. AppArmor
  3. Metrics
  4. Cgroup hierarchy

script:
- make install.deps
- make test
- make binaries
Member

@mikebrow mikebrow Aug 20, 2017

make test-cri will build anyway so removing make binaries just postpones the step..

Suggest keeping the name Verify.. and removing the make binaries from the 1.8.x since that will be done anyway in the test step for 1.8.3. But keep the make binaries in tip as an extra step since build and test isn't being run for it. Suggest keeping build and test since it actually does build then test...

Member Author

I feel like Build and Test is better because it is clearer. :)

Building for tip but not for 1.8.x would confuse people. The build doesn't actually take long, so an extra build seems acceptable?

Member

mostly just pointing out that deleting the -make binaries line did not remove the call to make binaries. Yeah an extra build is acceptable since it's so fast. Just pointing out it's also unnecessary on the 1.8 build because you'll be doing it twice... not sure why it's confusing to have a build in a verify step but not verify in a build step :-)

Member

@mikebrow mikebrow left a comment

See comments. Overall great new feature!

.travis.yml Outdated
go: 1.8.x
- script:
Member

ok sure, not doing the long test twice is ok, the verify & build step for tip should be enough to make sure most of the golang issues are addressed. Will only miss runtime issues.

Member Author

ACK.

verify: lint gofmt boiler

version:
@echo $(VERSION)

lint: check-gopath
Member

It was only necessary to do this test prior to go 1.8, and we require 1.8, so.. yeah, ok to remove :-) Maybe do a go version check?

Member Author

@Random-Liu Random-Liu Aug 21, 2017

The problem is that check-gopath is a PHONY target, so any target that depends on it will run every time.
I want to avoid rebuilding when the binaries are already built and the source code has not changed, so I have to remove this.
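
A minimal sketch of the effect described above, assuming GNU make is installed. The Makefile it generates is hypothetical (the names `check`, `src`, `out` and the helper `count_builds` are made up for illustration) and is not the cri-containerd Makefile.

```shell
# count_builds MODE: run "make out" twice in a scratch directory and print
# how many times the recipe for "out" actually ran.
# MODE "phony" gives "out" a .PHONY prerequisite; any other value omits it.
count_builds() {
  local mode="$1"
  local dir
  dir="$(mktemp -d)"
  touch "${dir}/src"
  {
    if [ "${mode}" = "phony" ]; then
      printf '.PHONY: check\ncheck:\n\t@true\nout: src check\n'
    else
      printf 'out: src\n'
    fi
    # Recipe for "out": record the build, then create the target file.
    printf '\t@echo built >> build.log\n\t@touch out\n'
  } > "${dir}/Makefile"
  make -s -C "${dir}" out > /dev/null
  make -s -C "${dir}" out > /dev/null
  wc -l < "${dir}/build.log" | tr -d ' '
  rm -rf "${dir}"
}
```

With the phony prerequisite the recipe runs on both invocations (`count_builds phony` prints 2); without it the second invocation finds `out` up to date (`count_builds plain` prints 1).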

clean:
rm -f $(BUILD_DIR)/cri-containerd
rm -rf $(BUILD_DIR)/*
Member

drf?

Member Author

d seems not required if r is specified.

Member

d is even if the directory is empty..

Member Author

IIUC, r is a superset of d. :)

Member

interesting.. from quick test I see you are correct. Thx
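
The quick test mentioned here could look roughly like the sketch below. It uses a scratch directory as a hypothetical stand-in for `$(BUILD_DIR)`; the helper name `leftover_after_rm_rf` and the file names are made up for illustration.

```shell
# leftover_after_rm_rf: populate a scratch directory with an empty subdirectory,
# a non-empty subdirectory and a plain file, run "rm -rf DIR/*" without -d,
# and print how many entries survive.
leftover_after_rm_rf() {
  local build_dir
  build_dir="$(mktemp -d)"
  mkdir "${build_dir}/empty" "${build_dir}/full"
  touch "${build_dir}/full/cri-containerd" "${build_dir}/top-level-file"
  rm -rf "${build_dir}"/*
  ls -A "${build_dir}" | wc -l | tr -d ' '
  rm -rf "${build_dir}"
}
```

The function prints 0: `-r` removes directories whether or not they are empty, so adding `-d` changes nothing here.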

@./hack/test-cri.sh

test-e2e-node: binaries
Member

need to add test-e2e-node to the above help section...

Member Author

Done.

@@ -0,0 +1,53 @@
#!/bin/bash
Member

looks good

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -o nounset
Member

# prevent unset variables

Member Author

All scripts have this; I don't think we want to comment all of them. :)

# See the License for the specific language governing permissions and
# limitations under the License.
set -o nounset
set -o pipefail
Member

# force script to exit if any command in a pipeline errors

Member Author

ditto.

Member

:)
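
For readers unfamiliar with the two options discussed in this thread, here is a small sketch of what they change, assuming `bash` is on the PATH. The helper names and the variable `HYPOTHETICAL_UNSET_VAR` are made up for illustration. Note that `pipefail` on its own only changes the pipeline's exit status; whether the script then exits depends on `errexit`, which is not visible in the quoted hunk.

```shell
# pipeline_status MODE: print the exit status of "false | true",
# with pipefail when MODE is "pipefail", without it otherwise.
pipeline_status() {
  if [ "$1" = "pipefail" ]; then
    bash -c 'set -o pipefail; false | true; echo $?'
  else
    bash -c 'false | true; echo $?'
  fi
}

# unset_var_outcome: report whether a script that reads an unset variable
# under nounset runs to completion or is aborted by the shell.
unset_var_outcome() {
  if bash -c 'unset HYPOTHETICAL_UNSET_VAR; set -o nounset; echo "${HYPOTHETICAL_UNSET_VAR}"' > /dev/null 2>&1; then
    echo "ran"
  else
    echo "aborted"
  fi
}
```

Without `pipefail` the pipeline reports the status of its last command (0); with it, the failing `false` makes the whole pipeline report 1.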

export SKIP=${SKIP:-${DEFAULT_SKIP}}
REPORT_DIR=${REPORT_DIR:-"/tmp/test-e2e-node"}

if [[ -z "${GOPATH}" ]]; then
Member

That's ironic :-)

Member Author

See comment above.
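
A side note on the quoted hunk, offered as a sketch rather than a change request: with `set -o nounset` active, expanding `${GOPATH}` when the variable is entirely unset makes bash stop with an "unbound variable" error before the `-z` test is evaluated. The `${VAR:-default}` form already used for `SKIP` and `REPORT_DIR` avoids that. The function names below are hypothetical.

```shell
# resolve_report_dir: print REPORT_DIR, falling back to the default path
# that appears in the quoted hunk.
resolve_report_dir() {
  echo "${REPORT_DIR:-/tmp/test-e2e-node}"
}

# gopath_state: classify GOPATH as "missing" or "set" without tripping
# nounset when the variable is not defined at all.
gopath_state() {
  if [ -z "${GOPATH:-}" ]; then
    echo "missing"
  else
    echo "set"
  fi
}
```

`${GOPATH:-}` expands to an empty string when the variable is unset or empty, so the check reaches its own error branch instead of relying on the shell's generic message.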

Member

@mikebrow mikebrow left a comment

/LGTM your call on the nits.

Cheers!

@Random-Liu
Member Author

@mikebrow Let's wait for the test to finish; so far everything looks good. :)
With this PR, we'll have node e2e CI test running. We should be more confident with cri-containerd then. :)

@kubernetes-incubator/maintainers-cri-containerd

@mikebrow
Member

Travis is in a bad state. The last test seems to have overrun the 4 MB log limit, and CI did not manage to stop the jobs.

@Random-Liu
Member Author

Random-Liu commented Aug 21, 2017

@mikebrow Yeah, we are printing the logs directly into the Travis test output. It seems that we have to upload them somewhere else.

@Random-Liu Random-Liu force-pushed the add-node-e2e-ci branch 2 times, most recently from 56199a0 to c8814d2 Compare August 21, 2017 18:04
@Random-Liu
Member Author

@mikebrow Separated the e2e test into another stage, and only print out cri-containerd log for now.

.travis.yml Outdated
- script:
- stage: E2E Test
script:
# TODO(random-liu): Uncomment after test passes.
Member

What's the cron job schedule?

Member Author

Daily for now. There are only 3 options: daily, weekly and monthly.

.travis.yml Outdated
- make test-e2e-node
after_script:
# TODO(random-liu): Upload log to GCS.
- test -f /tmp/test-e2e-node/cri-containerd.log && cat /tmp/test-e2e-node/cri-containerd.log
Member

@mikebrow mikebrow Aug 21, 2017

probably a todo or issue here to figure out how to filter the logs a bit more so there's room in our 4m limit to get the containerd log...

Member Author

@mikebrow I've changed the log level. :)
I think the best solution is to upload the log to GCS, or to run the test in the Kubernetes test infra, which I'm also working on.
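
Until the log is uploaded elsewhere, one stopgap is to print only the tail of the file so the output stays within the Travis log limit. The sketch below is an assumption about how that could be done, not what this PR does; `print_log_tail` and its byte budget are hypothetical.

```shell
# print_log_tail FILE MAX_BYTES: print at most the last MAX_BYTES bytes of FILE.
# A missing file is reported on stderr and is not treated as an error.
print_log_tail() {
  local log_file="$1"
  local max_bytes="$2"
  if [ -f "${log_file}" ]; then
    tail -c "${max_bytes}" "${log_file}"
  else
    echo "no log found at ${log_file}" >&2
  fi
}
```

It would be called as, for example, `print_log_tail /tmp/test-e2e-node/cri-containerd.log 1000000`. Keeping the end of the log rather than the beginning is a design choice: the last lines are usually the ones closest to a failure.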

Member

@mikebrow mikebrow left a comment

/LGTM just a nit for a todo comment and a question

@mikebrow
Member

All pass, so it's good to uncomment the "skip if not a cron job" check for the e2e bucket :-)

@mikebrow mikebrow added the lgtm label Aug 21, 2017
@Random-Liu
Member Author

Random-Liu commented Aug 21, 2017 via email

@Random-Liu
Member Author

@mikebrow Actually, there are still tests failing that succeed in my environment:

Summarizing 6 Failures:
[Fail] [k8s.io] Container Lifecycle Hook when create a pod with lifecycle hook [It] should execute prestop exec hook properly [Conformance] 
/home/travis/gopath/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e_node/lifecycle_hook_test.go:83
[Fail] [k8s.io] Networking [k8s.io] Granular Checks: Pods [It] should function for intra-pod communication: http [Conformance] 
/home/travis/gopath/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/networking_utils.go:217
[Fail] [k8s.io] Projected [It] should project all components that make up the projection API [Conformance] [Volume] [Projection] 
/home/travis/gopath/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/util.go:2224
[Fail] [k8s.io] Container Lifecycle Hook when create a pod with lifecycle hook [It] should execute poststart exec hook properly [Conformance] 
/home/travis/gopath/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e_node/lifecycle_hook_test.go:74
[Fail] [k8s.io] Networking [k8s.io] Granular Checks: Pods [It] should function for intra-pod communication: udp [Conformance] 
/home/travis/gopath/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/networking_utils.go:217
[Fail] [k8s.io] Container Runtime Conformance Test container runtime conformance blackbox test when running a container with a new image [It] should be able to pull from private registry with secret [Conformance] 
/home/travis/gopath/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e_node/runtime_conformance_test.go:385
Ran 163 of 223 Specs in 1305.905 seconds
FAIL! -- 157 Passed | 6 Failed | 0 Pending | 60 Skipped 

Because I didn't return that error code, the test passes.

However, I think that's OK for now. Let's get the test running first, and then fix these test failures. @mikebrow
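
Once those failures are fixed, the test script will need to propagate the test runner's exit code for CI to turn red. A minimal sketch of that pattern follows, assuming the script still has cleanup to do after the tests; `run_then_cleanup` is a hypothetical helper, not a function from this PR.

```shell
# run_then_cleanup CMD [ARGS...]: run the command, always perform cleanup,
# then return the command's original exit status.
run_then_cleanup() {
  local status=0
  "$@" || status=$?
  # Placeholder for real cleanup, e.g. stopping daemons started for the test.
  echo "cleanup done" >&2
  return "${status}"
}
```

Capturing the status with `|| status=$?` keeps the cleanup running even under `set -o errexit`, and the final `return` hands the failure back to the caller.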

Signed-off-by: Lantao Liu <lantaol@google.com>
@Random-Liu Random-Liu merged commit 010a562 into containerd:master Aug 21, 2017
@Random-Liu Random-Liu deleted the add-node-e2e-ci branch August 21, 2017 21:41
lanchongyizu pushed a commit to lanchongyizu/cri-containerd that referenced this pull request Sep 3, 2017