
Added check for config changes #33

Merged

Conversation

VariableDeclared (Contributor)

Moved set_status call further down the build_kubeconfig
method and added a check for differences in kubeconfig.

LP: https://bugs.launchpad.net/charm-kubernetes-master/+bug/1822021
VariableDeclared (Contributor, Author) commented Jun 24, 2019

Discovered this PR causes a regression on LP#1833089.

VariableDeclared (Contributor, Author) commented Jun 25, 2019

> Discovered this PR causes a regression on LP#1833089.

Was unable to repeat the issue. Seems all clear. Going to re-deploy again. Just to be sure.

EDIT: Deployed twice and observed skipping.

VariableDeclared changed the title from "Added check for data changes" to "Added check for config changes" on Jun 25, 2019
@@ -1458,6 +1458,13 @@ def build_kubeconfig(server):
kubeconfig_path = os.path.join(os.sep, 'home', 'ubuntu', 'config')
# Create the kubeconfig on this system so users can access the cluster.

with open(kubeclientconfig_path, 'r') as conf:
Member

You meant kubeconfig_path vs kubeclientconfig_path here, right?

Also, I think on initial deployment, /home/ubuntu/config won't exist yet. That'll lead to a FileNotFoundError on this open. Consider a try/pass or check with something like isfile() to work around that.
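
A minimal sketch of that suggestion, assuming the intent is to read the existing file at kubeconfig_path if it is already there (just the try/pass and isfile() options from the comment above, not the charm's final code):

```python
import os

kubeconfig_path = os.path.join(os.sep, 'home', 'ubuntu', 'config')
existing = None

# Option 1: catch the error on first deployment, when the file does not exist yet.
try:
    with open(kubeconfig_path, 'r') as conf:
        existing = conf.read()
except FileNotFoundError:
    pass

# Option 2: guard with an existence check instead of catching the exception.
# if os.path.isfile(kubeconfig_path):
#     with open(kubeconfig_path, 'r') as conf:
#         existing = conf.read()
```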

Changed the mistaken check to check the kubernetes service
configuration file, added a try pass to catch the case
where the configuration file has not yet been created.
kwmonroe (Member) left a comment

@VariableDeclared the good news is that the try block lgtm; thanks for that!

The bad news -- and sorry I didn't catch it earlier -- is that we need to re-think this one. Let's say on initial deploy, we have neither keystone nor a kubeconfig_path file. We'll pass on the FileNotFound exception and write our initial kubeconfig with keystone=False.

Now let's say keystone becomes available. We'll initialize the data_changed state and continue on to write a new kubeconfig with keystone=True. We're still all good.

Now let's say keystone goes away. We'll check data_changed, but it will be False because the kubeconfig hasn't changed yet; the data_changed check happens before the kubeconfig is written. This means we'll return without ever writing the kubeconfig with keystone=False.
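
A compressed sketch of the sequence described above, assuming the change check uses charms.reactive's data_changed() on the on-disk file contents (keystone_available() and the create_kubeconfig() call here are simplified stand-ins, not the charm's real signatures):

```python
import os
from charms.reactive.helpers import data_changed

def build_kubeconfig(server):
    kubeconfig_path = os.path.join(os.sep, 'home', 'ubuntu', 'config')
    ks = keystone_available()  # hypothetical helper for this sketch

    try:
        with open(kubeconfig_path, 'r') as conf:
            # The change check keys off the file already on disk...
            if not data_changed('kubeconfig', conf.read()):
                # ...so when only keystone flips, the file looks unchanged
                # and we return before writing the keystone=False config.
                return
    except FileNotFoundError:
        pass  # first deploy: nothing on disk yet

    create_kubeconfig(kubeconfig_path, server, keystone=ks)
```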

pass

hookenv.status_set('maintenance', 'Writing kubeconfig file.')

if ks:
create_kubeconfig(kubeconfig_path, server, ca_crt_path,
Member

We call create_kubeconfig differently based on ks. If our ks changes after the kubeconfig file is written, we'll return in the not data_changed handler before we get a chance to write the new kubeconfig with keystone changes.

Contributor (Author)

Wow, good catch. OK, will work on that.

Created a configuration matrix including keystone configuration
and kubernetes configuration for that case where keystone
configuration changes.
hyperbolic2346 (Contributor)

I hate to be the 🎉 💩 but I have some concerns that this might not be the best option. I agree it is 100% going to keep things matching our config. My concerns are a couple of things:

  1. In the past, before this bug caused us to write it every hook, a user could ssh into the machine and add some specific content to the config file, and it would stay until they did something such as upgrading, adding or removing a relation, or changing config. This may be a poor assumption on the users' part.
  2. The performance issue. I think it is my games background coming through here where it simply doesn't matter, but it bothers me that we're going to generate disk I/O every hook for this when it seems we should know if things have changed.

I feel like the proper solution is to track down what flag is causing us to hit this every hook. Is this a change in behavior in the last 6 months, or maybe just a change where it is noticeable now? It just seems like we should know if this data has changed, and be able to write it without reading the file back in from disk.

Maybe I'm just nitpicking too much. @kwmonroe will keep me in line.

kwmonroe (Member)

> 1. In the past, before this bug caused us to write it every hook, a user could ssh into the machine and add some specific content to the config file, and it would stay until they did something such as upgrading, adding or removing a relation, or changing config. This may be a poor assumption on the users' part.

That may have been true, but it's certainly not now. For the benefit of those who aren't sure, consider this:

## from juju client, see we wrote config at 17:36
$ juju debug-log -i kubernetes-master/1 --tail | grep kubeconfig
...
unit-kubernetes-master-1: 17:36:54 INFO unit.kubernetes-master/1.juju-log status-set: maintenance: Writing kubeconfig file.
...

## on k8s-master/1:
$ date
Thu Jul 18 22:40:11 UTC 2019

$ tail -2 config
    password: NYOKYEVwZuSXQ7mnYErQzBvwtMMjSONU
    username: admin

$ echo "- name: knobby" >> config

$ tail -2 config
    username: admin
- name: knobby

$ date
Thu Jul 18 22:40:19 UTC 2019

## back on the juju client, see we wrote config again at 17:41
...
unit-kubernetes-master-1: 17:41:56 INFO unit.kubernetes-master/1.juju-log status-set: maintenance: Writing kubeconfig file.
...

## back on k8s-master/1:
$ date
Thu Jul 18 22:43:44 UTC 2019

$ tail -2 config
    password: NYOKYEVwZuSXQ7mnYErQzBvwtMMjSONU
    username: admin

Our - name: knobby has been clobbered.

> 2. The performance issue. I think it is my games background coming through here where it simply doesn't matter, but it bothers me that we're going to generate disk I/O every hook for this when it seems we should know if things have changed.

Agreed, but at least it's a read every hook and not a write. That's something, right?

> I feel like the proper solution is to track down what flag is causing us to hit this every hook.

There isn't a flapping flag that's causing this. It's caused by flags being present. We'll build_kubeconfig every time we have certs and a loadbalancer or certs without an lb.

> Is this a change in behavior in the last 6 months, or maybe just a change where it is noticeable now? It just seems like we should know if this data has changed, and be able to write it without reading the file back in from disk.

The @when decorators guarding the calls to build_kubeconfig have been around for a few years (see blame). What has changed in the past 6ish months is that we now set a status when we write the kubeconfig:

3555652
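
Roughly the shape of those guards, sketched with made-up flag and helper names (illustrative only, not the charm source); the point is that @when handlers re-run on every hook while their flags stay set, whether or not anything changed:

```python
from charms.reactive import when, when_not

@when('certificates.available', 'loadbalancer.available')
def build_kubeconfig_with_lb():
    # Fires on every hook while both flags are set, not only when they change.
    build_kubeconfig(get_lb_address())  # get_lb_address() is a stand-in

@when('certificates.available')
@when_not('loadbalancer.available')
def build_kubeconfig_without_lb():
    build_kubeconfig(get_local_address())  # get_local_address() is a stand-in
```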

So whatcha want to do @hyperbolic2346? Remove the log or live with a read vs write every hook? ;)

I suppose we could check for changes to server, ca_crt_path, client_pass, and keystone, since those are what gets passed to create_kubeconfig. That would save the disk read, but it feels like an opportunity to miss a change in the future if we add new flags/config that should trigger create_kubeconfig. Whaddya think?
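
For reference, that alternative would look something like the following (argument names follow the comment above; the create_kubeconfig call signature is assumed for the sketch, not copied from the charm):

```python
from charms.reactive.helpers import data_changed

# Hash the inputs instead of reading the kubeconfig back from disk.
# Downside noted above: any new input to create_kubeconfig has to be added here too.
if data_changed('kubeconfig-inputs', [server, ca_crt_path, client_pass, ks]):
    create_kubeconfig(kubeconfig_path, server, ca_crt_path,
                      password=client_pass, keystone=ks)
```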

tvansteenburgh (Contributor) commented Jul 19, 2019

FWIW I think we should simply change the status_set to a debug log and move on. We should also not write a log line just to say "nothing changed!" - that's just spam.

1 and 2 above don't bother me. The charm owns the file and is free to clobber user changes. Users can make a copy if they need to. We're likely doing disk IO every hook anyway (unitdata.kv()).

The points are valid, but they're not important enough for us to spend a lot of time here, in light of what else we have in the queue. The biggest risk here is not 1 and 2, but that we try to get clever to avoid unnecessary writes, and end up introducing a bug where we fail to write when we should.

Review changes, removed log call when config has not
changed and changed set_status to log.
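
For context, the "set_status to log" part of that change amounts to something like this (a sketch using charmhelpers' hookenv; the exact message and placement in the merged commit may differ):

```python
from charmhelpers.core import hookenv

# Before: surfaced in `juju status` on every hook.
# hookenv.status_set('maintenance', 'Writing kubeconfig file.')

# After: only a log entry, and only emitted when we actually write the file.
hookenv.log('Writing kubeconfig file.')
```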
hyperbolic2346 (Contributor)

Seeing that the change is just in notifying the user, and not new behavior of calling this function constantly, I am much happier leaving it. I thought calling it each hook was a new thing. I also agree that we should avoid breaking things by trying to get fancy.

@hyperbolic2346 hyperbolic2346 merged commit 1978315 into charmed-kubernetes:master Jul 20, 2019
Cynerva pushed a commit that referenced this pull request Jul 30, 2019
* Added check for data changes

Moved set_status call further down the build_kubeconfig
method and added a check for differences in kubeconfig.

LP: https://bugs.launchpad.net/charm-kubernetes-master/+bug/1822021

* Changed mistaken client config check

Changed the mistaken check to check the kubernetes service
configuration file, added a try pass to catch the case
where the configuration file has not yet been created.

* Added configuration matrix

Created a configuration matrix including keystone configuration
and kubernetes configuration for that case where keystone
configuration changes.

* Review changes

Review changes, removed log call when config has not
changed and changed set_status to log.