Skip to content
This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

Fix kubelet-to-apiserver connection checks on controller nodes not to fail in certain cases #1015

Merged
merged 3 commits into from
Nov 22, 2017

Conversation

mumoshu
Copy link
Contributor

@mumoshu mumoshu commented Nov 21, 2017

  • check-worker-connection was failing when audit logging is disabled
  • In addition to that, it was failing since K8S v1.8 due to the change in K8S that no audits are logged when missing an audit policy file

Follow-up for #991 and #996
Fix for #991 (comment)
Depends on #1014

@mumoshu mumoshu added this to the v0.9.9-rc.3 milestone Nov 21, 2017
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 21, 2017
if _, err := os.Stat(filePath); os.IsNotExist(err) {
if defaultValue == nil {
return nil, fmt.Errorf("%s must exist. Please confirm that you have not deleted the file manually", filePath)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the contents of #1014 got mixed in here as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out!
Sorry but I had no time to resolving possible conflicts and testing them in isolations.
Marking this to depend on #1014 anyway.

resources: ["tokenreviews"]
omitStages:
- "RequestReceived"
# Get repsonses can be large; skip them.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in repsonses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Fixed.

# Get repsonses can be large; skip them.
- level: Request
verbs: ["get", "list", "watch"]
resources: ${known_apis}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not substituting the known_apis variable, like it's done here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Manually substituted them anyway 😄


auditlogs() {
if [ "$AUDIT_LOG_PATH" == "/dev/stdout" ]; then
docker logs --since 11s ${DOCKERIMAGE} |& cat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason behind this magic number here (11s)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

queryserver is called every 10+ seconds below. I wanted to gather logs for periods slightly longer than that not to drop any lines(hopefully)

@codecov-io
Copy link

codecov-io commented Nov 21, 2017

Codecov Report

Merging #1015 into master will decrease coverage by 0.02%.
The diff coverage is 61.25%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1015      +/-   ##
==========================================
- Coverage   34.88%   34.86%   -0.03%     
==========================================
  Files          59       59              
  Lines        4133     4156      +23     
==========================================
+ Hits         1442     1449       +7     
- Misses       2532     2545      +13     
- Partials      159      162       +3
Impacted Files Coverage Δ
core/controlplane/config/credential.go 51.26% <14.28%> (-2.73%) ⬇️
core/controlplane/config/encrypted_assets.go 67.22% <65.75%> (-0.86%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6a0a4dc...3c30fd7. Read the comment docs.

… fail in certain cases

- check-worker-connection was failing when audit logging is disabled
- In addition to that, it was failing since K8S v1.8 due to the change in K8S that no audits are logged when missing an audit policy file

Follow-up for kubernetes-retired#991 and kubernetes-retired#996
Fix for kubernetes-retired#991 (comment)
@mumoshu
Copy link
Contributor Author

mumoshu commented Nov 21, 2017

I've manually verified this to work by updating my cluster few times:

  • From no auditing to auditing enabled while setting logpath of /dev/stdout
  • And then to /var/log/apiserver-audit.log

@danielfm
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 21, 2017
@mumoshu
Copy link
Contributor Author

mumoshu commented Nov 22, 2017

@danielfm Thanks for reviewing!

@mumoshu mumoshu merged commit 650a25c into kubernetes-retired:master Nov 22, 2017
@mumoshu mumoshu deleted the fix-worker-conn-check branch November 22, 2017 01:40
kylehodgetts pushed a commit to HotelsDotCom/kube-aws that referenced this pull request Mar 27, 2018
…nn-check

 Fix kubelet-to-apiserver connection checks on controller nodes not to fail in certain cases
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants