-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] - Add fix-certs cmd #444
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: tnozicka The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
f030372
to
1063a46
Compare
272ef65
to
40fc9d5
Compare
40fc9d5
to
a3c5f6e
Compare
recovery-apiserver got it's own PR #448 |
a3c5f6e
to
f005ff1
Compare
Commit message is bad of second commit. |
} | ||
|
||
cmd := &cobra.Command{ | ||
Use: "fix-certs", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description is missing. What is this command doing?
KubeApiserverImage string | ||
PodManifestDir string | ||
StaticPodResourcesDir string | ||
Timeout time.Duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timeout of what?
pkg/cmd/fixcerts/fixcerts.go
Outdated
cmd.Flags().StringVar(&o.KubeApiserverImage, "kube-apiserver-image", o.KubeApiserverImage, "") | ||
cmd.Flags().StringVar(&o.StaticPodResourcesDir, "static-pod-resources", o.StaticPodResourcesDir, "path to store a directory containing static pod resources for recovery apiserver manifest") | ||
cmd.Flags().StringVar(&o.PodManifestDir, "pod-manifest-dir", o.PodManifestDir, "directory for the static pod manifes") | ||
cmd.Flags().DurationVar(&o.Timeout, "timeout", o.Timeout, "timeout, 0 means infinite") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timeout of what?
yes, this one is in progress still |
pkg/cmd/fixcerts/fixcerts.go
Outdated
PodManifestDir: o.PodManifestDir, | ||
StaticPodResourcesDir: o.StaticPodResourcesDir, | ||
} | ||
defer apiserver.Destroy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be after creation?
return fmt.Errorf("failed to create config client: %v", err) | ||
} | ||
|
||
operatorConfigInformers := operatorexternalinformers.NewSharedInformerFactory(operatorConfigClient, 10*time.Minute) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing the start of both shared informer factories.
APIVersion: "v1", | ||
Kind: "namespace", | ||
Name: "kube-system", | ||
}) // fake |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is fake?
pkg/cmd/fixcerts/fixcerts.go
Outdated
|
||
certRotationWg := sync.WaitGroup{} | ||
go func() { | ||
certRotationWg.Add(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add outside of the loop. This can race.
@sttts this is WIP still, I'll let you know when it is ready to review, I have not even went through it myself yet :) |
pkg/cmd/fixcerts/fixcerts.go
Outdated
return fmt.Errorf("failed to find kube-apiserver certs dir: %v", err) | ||
} | ||
|
||
kubeControllerManagerManifest := path.Join(o.PodManifestDir, KubeControllerManagerStaticPodFileName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
filepath
pkg/cmd/fixcerts/fixcerts.go
Outdated
return fmt.Errorf("unknown object type %q", def.objectType) | ||
} | ||
|
||
dir := path.Join(def.toplevelDir, def.objectType, def.name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
filepath
|
||
for name, bytes := range data { | ||
filePath := path.Join(dir, name) | ||
err := recovery.EnsureFileContent(filePath, bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this use 0600 for secrets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keeps the perm of what was there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and fallback is 600
f005ff1
to
d50803f
Compare
@@ -11,3 +11,6 @@ COPY --from=builder /go/src/github.com/openshift/cluster-kube-apiserver-operator | |||
COPY manifests/*.yaml /manifests | |||
COPY manifests/image-references /manifests | |||
LABEL io.openshift.release.operator true | |||
# FIXME: entrypoint shouldn't be bash but the binary (needs fixing the chain) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mfojtik I'd expect a reason
@@ -11,3 +11,5 @@ COPY --from=builder /go/src/github.com/openshift/cluster-kube-apiserver-operator | |||
COPY manifests/*.yaml /manifests | |||
COPY manifests/image-references /manifests | |||
LABEL io.openshift.release.operator true | |||
# FIXME: entrypoint shouldn't be bash but the binary (needs fixing the chain) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove
pkg/cmd/recoveryapiserver/create.go
Outdated
func NewCreateCommand() *cobra.Command { | ||
o := &CreateOptions{ | ||
Options: NewDefaultOptions(), | ||
KubeApiserverImage: "", // TODO: set the public pullspec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove, error out in Validate() ?
pkg/cmd/recoveryapiserver/create.go
Outdated
} | ||
|
||
func (o *CreateOptions) Run() error { | ||
ctx, cancel := watch.ContextWithOptionalTimeout(context.Background(), o.Timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
context.TODO()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in future we will pass the context via Run() that context will be wired to TERM signal handler
pkg/cmd/recoveryapiserver/create.go
Outdated
func (o *CreateOptions) Run() error { | ||
ctx, cancel := watch.ContextWithOptionalTimeout(context.Background(), o.Timeout) | ||
defer cancel() | ||
// TODO: hook up signals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
termHandler := server.SetupSignalHandler()
ctx, shutdown := context.WithCancel(context.TODO())
go func() {
defer shutdown()
<-termHandler
}()
pkg/cmd/recoveryapiserver/create.go
Outdated
return nil | ||
} | ||
|
||
fmt.Printf("Waiting for recovery apiserver to come up.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use klog or at least add new line ;-)
return nil, err | ||
} | ||
|
||
return &clientcmdapiv1.Config{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest.CopyConfig
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from where? you are also commenting on the wrong PR, this one is just picking up #448
}, nil | ||
} | ||
|
||
func (s *Apiserver) GetKubeClientset() (*kubernetes.Clientset, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why you need clientset?
pkg/cmd/fixcerts/fixcerts.go
Outdated
} | ||
|
||
// TODO: add/prettify descriptions | ||
cmd.Flags().StringVar(&o.KubeApiserverImage, "kube-apiserver-image", o.KubeApiserverImage, "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as other pull, I don't want to choose images or directories.
pkg/cmd/fixcerts/fixcerts.go
Outdated
defer cancel() | ||
// TODO: hook up signals | ||
|
||
apiserver := &recovery.Apiserver{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks weird.
pkg/cmd/fixcerts/fixcerts.go
Outdated
StaticPodResourcesDir: o.StaticPodResourcesDir, | ||
} | ||
|
||
err := apiserver.Create() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not do this. Fixing the certs should use the admin.kubeconfig
from it's known location. It should not try to start a kube-apiserver
|
||
operatorConfigInformers := operatorexternalinformers.NewSharedInformerFactory(operatorConfigClient, 10*time.Minute) | ||
certRotationWg.Add(1) | ||
go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an odd structure. Start
is non-blocking. I don't see a reason to wrap it in a gofunc
d50803f
to
237621a
Compare
certRotationWg.Add(1) | ||
go func() { | ||
defer certRotationWg.Done() | ||
kubeApiserverInformersForNamespaces.Start(certRotationCtx.Done()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same coment about gofunc
eventRecorder := events.NewKubeRecorder(kubeClient.CoreV1().Events(""), "fix-certs (CLI)", &corev1.ObjectReference{ | ||
APIVersion: "v1", | ||
Kind: "namespace", | ||
Name: "kube-system", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use the openshift-kube-apiserver-operator
namespace and set the namespace value or we end up in default and unfindable.
klog.Info("Waiting for certs to be refreshed...") | ||
// FIXME: wait for valid certs | ||
// time.Sleep(5*time.Minute) | ||
time.Sleep(30 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't do. Add a RunOnce
at each level so we can be sure we run and avoid the go func/defer, wait, hope flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't do. Add a
RunOnce
at each level so we can be sure we run and avoid the go func/defer, wait, hope flow.
You'll also need a WaitForPrereqs
func to sync the bits at least once.
} | ||
|
||
timestamp := time.Now().Format(time.RFC3339) | ||
kubeControllerManagerPod.Annotations["force-triggered-by-fix-certs-at"] = timestamp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expected to see the kube-apiserver pod here too. Let's do them all, just to be sure.
237621a
to
aa4a704
Compare
aa4a704
to
faea844
Compare
@tnozicka: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
merged in #461 |
Requires:
/cc @mfojtik @deads2k