This repository has been archived by the owner on Jul 30, 2021. It is now read-only.

e2e: make checkpointer tests more robust. #784

Merged
3 commits merged into kubernetes-retired:master from checkpointer-e2e on Dec 11, 2017

@diegs diegs commented Nov 28, 2017

Instead of extending the existing checkpointer daemonset to run on
worker nodes, start a new checkpointer daemonset. This ensures that
the existing checkpointer (on the apiserver) is not disturbed.

Furthermore, this allows the checkpointer tests to work with the new
checkpointer that is only allowed to look at its own namespace.

@k8s-ci-robot k8s-ci-robot added the "cncf-cla: yes" label (indicates the PR's author has signed the CNCF CLA) Nov 28, 2017
@diegs diegs self-assigned this Nov 28, 2017
@k8s-ci-robot k8s-ci-robot added the "size/L" label (denotes a PR that changes 100-499 lines, ignoring generated files) Nov 28, 2017
@diegs diegs force-pushed the checkpointer-e2e branch 2 times, most recently from dae5bf1 to dfa7ae6, November 29, 2017 00:03
@k8s-ci-robot k8s-ci-robot added the "size/XL" label (denotes a PR that changes 500-999 lines, ignoring generated files) and removed the "size/L" label Nov 29, 2017
@diegs diegs force-pushed the checkpointer-e2e branch 2 times, most recently from e3860b0 to 3cd7975, December 5, 2017 10:49

diegs commented Dec 5, 2017

Note: includes commit from #778


diegs commented Dec 5, 2017

cc @ericchiang

@diegs diegs force-pushed the checkpointer-e2e branch 2 times, most recently from 164b62e to d3a9f98, December 6, 2017 20:09
ericchiang and others added 3 commits December 7, 2017 13:52
The checkpointer now only watches pods in kube-system (kubernetes-retired#774), so it
doesn't need cluster wide permissions.

diegs commented Dec 8, 2017

coreosbot run e2e


@dghubble dghubble left a comment


Looks like there is still flakiness, but this is valuable to unblock the checkpointer change to only watch a single namespace.

dghubble commented

coreosbot run e2e


diegs commented Dec 11, 2017

In the most recent run the reboot test failed. Looks like the apiserver never came back. That is pretty concerning. I'll run some stress tests on that locally.


diegs commented Dec 11, 2017

coreosbot run e2e


diegs commented Dec 11, 2017

e2e didn't even run this time: ssh: connect to host 54.209.210.25 port 22: Connection timed out


diegs commented Dec 11, 2017

coreosbot run e2e

@dghubble dghubble merged commit 64c66a3 into kubernetes-retired:master Dec 11, 2017
@diegs diegs deleted the checkpointer-e2e branch December 11, 2017 23:28
dghubble added a commit to poseidon/terraform-render-bootstrap that referenced this pull request Dec 12, 2017
* pod-checkpointer no longer needs to watch pods in all namespaces;
it should only have permission to watch kube-system
* kubernetes-retired/bootkube#784
Labels: "cncf-cla: yes" (indicates the PR's author has signed the CNCF CLA), "size/XL" (denotes a PR that changes 500-999 lines, ignoring generated files)