
2.9 into develop #13143

Merged
merged 47 commits into from Jul 8, 2021

Conversation

SimonRichardson
Member

Merge 2.9 into develop

6ad00ba (upstream/2.9, origin/2.9, 2.9) Merge pull request #13142 from SimonRichardson/2.8-into-2.9-cache
f4b4ab4 Merge pull request #13140 from hpidcock/pull-secrets
185fac6 Merge pull request #13135 from wallyworld/cass-image-repo-port
2b67b0a Merge pull request #13134 from achilleasa/2.9-backport-equinix-provider-support
43b8197 (achilleasa/2.9) Merge pull request #13129 from ycliuhw/fix/lp-1934180
25e8028 Merge pull request #13128 from tlm/oauth-cred-k8s-lp1929910
7b18431 Merge pull request #13133 from SimonRichardson/pod-ready-timeout
c139c11 Merge pull request #13130 from SimonRichardson/set-config-empty-string

Conflicts:

CONFLICT (content): Merge conflict in cmd/juju/commands/bootstrap_test.go
CONFLICT (content): Merge conflict in cloudconfig/podcfg/podcfg_test.go
CONFLICT (content): Merge conflict in cloudconfig/podcfg/image.go
CONFLICT (content): Merge conflict in cloud/clouds_test.go
CONFLICT (content): Merge conflict in apiserver/facades/client/application/application_unit_test.go
CONFLICT (content): Merge conflict in apiserver/facades/agent/uniter/uniter.go

achilleasa and others added 30 commits June 22, 2021 13:37
This improves visibility in cases such as LP1930899, where an underprovisioned mongodb instance prevented log entries from being persisted, with no additional error output.

As the DB error was returned to the caller (the log sender worker), the worker would keep restarting, preventing models from logging anything.

While this commit does not fix the underlying problem (underprovisioned mongo), it should at least make the cause more obvious to users who are trying to diagnose similar issues by inspecting the logsink.log file contents.
…rsisting-logs-to-db-fails

juju#13097

This improves visibility in cases such as LP1930899, where an underprovisioned mongodb instance prevented log entries from being persisted, with no additional error output.

As the DB error was returned to the caller (the log sender worker), the worker would keep restarting, preventing models from logging anything.

While this commit does not fix the underlying problem (underprovisioned mongo), it should at least make the cause more obvious to users who are trying to diagnose similar issues by inspecting the logsink.log file contents.

## Bug reference

https://bugs.launchpad.net/juju/+bug/1930899

(NOTE: the PR does not fix ^ but helps diagnose the root cause of the issue)
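
A minimal sketch of the idea (hypothetical names, not the actual logsink code): when the DB write fails, record the error in logsink.log before returning it to the sender.

```go
package logsink

import (
	"fmt"
	"log"
)

// logRecord and dbLogger are hypothetical stand-ins for Juju's types.
type logRecord struct{ Message string }

type dbLogger interface {
	Log([]logRecord) error
}

// writeRecords persists records to the DB and, on failure, also writes the
// error to the local logsink.log logger so the cause is visible to anyone
// inspecting the file, instead of only bouncing back to the sender worker.
func writeRecords(db dbLogger, sink *log.Logger, records []logRecord) error {
	if err := db.Log(records); err != nil {
		sink.Printf("unable to persist log records to the database: %v", err)
		return fmt.Errorf("persisting log records: %w", err)
	}
	return nil
}
```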
The juju ssh --proxy command uses the controller as a jumpbox and
invokes 'nc' to forward ssh traffic to the target machine. While this
approach works fine for regular machines, it fails (for substrates such as
ec2) for container machines (lxd/kvm) when they only have a FAN address.

This is generally acceptable, as Juju only requires machines to be
able to connect to the controller and not vice versa. In particular, ec2
does not allow FAN connectivity from the controller model to any other
Juju model (and vice versa).

To work around this limitation, we now first check whether the machine we
are trying to connect to is a container. If that's the case, we identify
the address of the machine that hosts the container and modify the ssh
proxy command to use a 2-level proxying scheme: we ssh to the
controller, then ssh to the machine hosting the container, and then run
'nc' to route the ssh traffic to the intended machine.
…oxy-to-fan-subnets

juju#13108

The juju ssh --proxy command uses the controller as a jumpbox and
invokes 'nc' to forward ssh traffic to the target machine. While this
approach works fine for regular machines, it fails (for substrates such as
ec2) for container machines (lxd/kvm) when they only have a FAN address.

This is generally acceptable, as Juju only requires machines to be
able to connect to the controller and not vice versa. In particular, ec2
does not allow FAN connectivity from the controller model to any other
Juju model (and vice versa).

To work around this limitation, we now first check whether the machine we
are trying to connect to is a container. If that's the case, we identify
the address of the machine that hosts the container and modify the ssh
proxy command to use a 2-level proxying scheme: we ssh to the
controller, then ssh to the machine hosting the container, and then run
'nc' to route the ssh traffic to the intended machine.
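
In rough shape (addresses and flags are illustrative placeholders, not the exact generated command), the proxy command goes from one hop to two:

```sh
# Regular machine: one hop via the controller.
ssh -o ProxyCommand="ssh -q ubuntu@<controller-ip> nc <machine-ip> 22" ubuntu@<machine-ip>

# Container with only a FAN address: hop through the controller to the
# host machine, which runs 'nc' to reach the container.
ssh -o ProxyCommand="ssh -q ubuntu@<controller-ip> ssh -q ubuntu@<host-ip> nc <container-fan-ip> 22" ubuntu@<container-fan-ip>
```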

Caveats: this workaround cannot be applied in a scenario where we 
`juju scp` to copy files between two different **container remotes**
as each remote in this case would need a custom ssh proxy command.

## QA steps

Bootstrap to ec2. Then:

```sh
$ juju deploy mysql --to lxd

# should fail
$ juju ssh mysql/0
ERROR cannot connect to any address: [252.40.132.122:22]

# should work with this patch
$ juju ssh --proxy mysql/0 hostname
juju-12d453-0-lxd-0
Connection to 252.40.132.122 closed.

# try scp
$ echo "BTC" > wallet
$ juju scp --proxy -- -3 wallet mysql/0:
$ juju scp --proxy -- -3 wallet 0:

# Verify the files have been uploaded
$ juju ssh --proxy 0 cat wallet
BTC
Connection to 172.31.40.132 closed.

$ juju ssh --proxy 0/lxd/0 cat wallet
BTC
Connection to 252.40.132.122 closed.
```

## Bug reference
https://bugs.launchpad.net/juju/+bug/1932547
The following changes allow both the charm config as a map and the charm
config as YAML to work as intended. The code previously assumed that
there was always a YAML document and that the charm config map would always
overwrite it.
The problem occurred when there wasn't a charm YAML: the code would blindly
overwrite the YAML's values anyway. This caused empty strings to be converted
to nil, which then caused the --reset flow to be triggered. The --reset
flow essentially says: when you see nil, fall back to the config defaults by
deleting any user config. That's something that we didn't want to
trigger. The change causes us to exit early if there isn't a YAML.

There is a much deeper problem, and one that we will need to understand.
API versioning should essentially not touch the old code OR at the very
least have enough testing to ensure that any new changes keep the old
behaviour. 100% code coverage in these scenarios would be beneficial to
ensure that we didn't trip up here. The problem with attempting to get
to that level is the number of layers you have to test to get that
far. From the API server all the way down to state, it's a large task and
maybe not worth it?
To correctly parse other values that aren't strings, we should parse via
the config using ParseSettingsStrings.
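
A minimal sketch of the shape of the fix (a hypothetical wrapper, not the actual facade code; the import path of the charm package varies by release):

```go
package application

import charm "github.com/juju/charm/v8"

// parseConfigSettings sketches the fix: when no charm YAML was supplied,
// parse the string map directly so "" stays "" instead of being merged
// through a YAML document, converted to nil, and triggering --reset.
func parseConfigSettings(cfg *charm.Config, appName, yamlDoc string, settings map[string]string) (charm.Settings, error) {
	if yamlDoc == "" {
		// Exit early: nothing to merge, and no chance of nil-ing "" values.
		return cfg.ParseSettingsStrings(settings)
	}
	parsed, err := cfg.ParseSettingsYAML([]byte(yamlDoc), appName)
	if err != nil {
		return nil, err
	}
	overrides, err := cfg.ParseSettingsStrings(settings)
	if err != nil {
		return nil, err
	}
	for k, v := range overrides {
		parsed[k] = v
	}
	return parsed, nil
}
```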
…tring

juju#13130

The following changes allow the charm config as a map and the charm
config as a YAML to work as intended. The code previously assumed that
there was always a YAML and that the charm config map would always
overwrite it.

The problem occurs when there isn't a charm YAML: the code would blindly
overwrite values anyway. This caused empty strings to be converted
to `nil`, causing the --reset flow to be triggered. The --reset
flow sets the config defaults when it sees `nil`.

There is a much deeper problem, and one that we will need to understand.
API versioning should essentially not touch old code _or_ at the very
least have enough testing to ensure that any new changes keep the old
behaviour. 100% code coverage in these scenarios would be beneficial to
ensure that we didn't trip up here. The problem with attempting to get
to that level is the number of layers you have to test to get that
far. From the API server all the way down to state, it's a large task and
maybe not worth it?

## QA steps

Save bundle.yaml

```sh
cat >bundle.yaml<<EOF
applications:
  haproxy:
    charm: cs:haproxy
    channel: stable
    options:
      services: ""
EOF
```

```sh
$ juju bootstrap lxd test
$ juju deploy ./bundle.yaml
$ juju config haproxy services

$ juju config haproxy services=""
WARNING the configuration setting "services" already has the value ""
$ juju config haproxy services


```

Notice that the default config shouldn't be shown.

## Bug reference

https://bugs.launchpad.net/juju/+bug/1934151
The following moves the timeout deadline from 1 minute to 10 minutes. If
a pod hasn't become ready in that time, I think it's safe to say it
won't ever!

The change is simple: just use a longer duration.
juju#13133

The following moves the timeout deadline from 1 minute to 10 minutes. If
a pod hasn't become ready in that time, I think it's safe to say it
won't ever!

The change is simple: just use a longer duration.
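
In shape, the change is just a bigger constant on the wait deadline (names here are hypothetical, not the actual Juju identifiers):

```go
package caas

import (
	"context"
	"time"
)

// podReadyTimeout is a hypothetical name for the deadline this PR bumps:
// previously 1 * time.Minute, now 10 * time.Minute.
const podReadyTimeout = 10 * time.Minute

// waitContext returns the context used while waiting for a pod to become
// ready; the wait gives up when the deadline expires.
func waitContext(parent context.Context) (context.Context, context.CancelFunc) {
	return context.WithTimeout(parent, podReadyTimeout)
}
```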

## QA steps

See: https://discourse.charmhub.io/t/error-getting-started-with-juju-microk8s/3855/3

## Bug reference

https://bugs.launchpad.net/juju/+bug/1934494
This change updates the Juju facades to return Kubernetes credentials
that pre-2.9 releases can understand. This has been done through a
facade bump for all users of the CloudSpec facade.
juju#13128

OAuth 2.8 credentials support in 2.9.

This change updates the Juju facades to return Kubernetes credentials that pre-2.9 releases can understand. This has been done through a facade bump for all users of the CloudSpec facade.
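
As a generic illustration of what a facade bump enables (not Juju's actual facade code), the server can pick a credential shape the caller's facade version understands:

```go
package facades

// cloudSpec is an illustrative payload; pre-2.9 clients only understand
// the legacy credential shape.
type cloudSpec struct {
	AuthType   string
	Credential map[string]string
}

// cloudSpecForVersion sketches version-gated responses: older facade
// versions are served the legacy shape, newer ones the current shape.
func cloudSpecForVersion(version int, modern, legacy cloudSpec) cloudSpec {
	if version < 2 { // pre-2.9 client on an old facade version
		return legacy
	}
	return modern
}
```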


## Checklist

 - [x] Requires a [pylibjuju](https://github.com/juju/python-libjuju) change
 - [x] Added [integration tests](https://github.com/juju/juju/tree/develop/tests) for the PR
 - [x] Added or updated [doc.go](https://discourse.jujucharms.com/t/readme-in-packages/451) related to packages changed
 - [x] Comments answer the question of why design decisions were made

## QA steps

1. Deploy a Juju 2.8 controller with a new model and deploy a workload.
2. Upgrade the controller to this PR.
3. Check that the workload agent and support components have no errors and are communicating with the Kubernetes API.
4. Check the controller logs for errors.

## Bug reference

https://bugs.launchpad.net/juju/+bug/1929910
juju#13129

*Delete cluster role bindings before creating new ones, and ensure the resource has been deleted completely before creating.*

Drive-by: refactor all caas code to use client-go pointer helper functions.

## Checklist

 - [ ] ~Requires a [pylibjuju](https://github.com/juju/python-libjuju) change~
 - [ ] ~Added [integration tests](https://github.com/juju/juju/tree/develop/tests) for the PR~
 - [ ] ~Added or updated [doc.go](https://discourse.jujucharms.com/t/readme-in-packages/451) related to packages changed~
 - [x] Comments answer the question of why design decisions were made

## QA steps

```console
$ juju deploy /tmp/charm-builds/mariadb-k8s/ --debug --resource mysql_image=mariadb

$ yml2json /tmp/charm-builds/mariadb-k8s/reactive/k8s_resources.yaml | jq '.kubernetesResources.serviceAccounts'
[
  {
    "name": "rbac-foo",
    "roles": [
      {
        "rules": [
          {
            "apiGroups": [
              ""
            ],
            "verbs": [
              "get",
              "watch",
              "list"
            ],
            "resources": [
              "pods"
            ]
          }
        ],
        "global": true,
        "name": "pod-cluster-role"
      }
    ],
    "automountServiceAccountToken": true
  }
]

$ mkubectl -nt1 get clusterrolebinding.rbac.authorization.k8s.io/rbac-foo-t1-pod-cluster-role -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    controller.juju.is/id: 1bb4c340-1536-4281-815b-ad893b5fcd73
    model.juju.is/id: b3bcd724-b4e0-4bb1-80e2-af346cbba787
  creationTimestamp: "2021-07-02T06:11:17Z"
  labels:
    app.juju.is/created-by: controller
    app.kubernetes.io/managed-by: juju
    app.kubernetes.io/name: mariadb-k8s
    model.juju.is/name: t1
  managedFields:
  - apiVersion: rbac.authorization.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:controller.juju.is/id: {}
          f:model.juju.is/id: {}
        f:labels:
          .: {}
          f:app.kubernetes.io/managed-by: {}
          f:app.kubernetes.io/name: {}
          f:model.juju.is/name: {}
      f:roleRef:
        f:apiGroup: {}
        f:kind: {}
        f:name: {}
      f:subjects: {}
    manager: juju
    operation: Update
    time: "2021-07-02T06:11:17Z"
  name: rbac-foo-t1-pod-cluster-role
  resourceVersion: "166248"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/rbac-foo-t1-pod-cluster-role
  uid: ace7f3f1-6d2f-4cd3-aca1-b9211126d1d0
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: t1-pod-cluster-role
subjects:
- kind: ServiceAccount
  name: rbac-foo
  namespace: t1

# change podspec a bit then build and upgrade charm
$ juju upgrade-charm mariadb-k8s --path /tmp/charm-builds/mariadb-k8s/ --debug --resource mysql_image=mariadb

$ mkubectl -nt1 get clusterrolebinding.rbac.authorization.k8s.io/rbac-foo-t1-pod-cluster-role -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    controller.juju.is/id: 1bb4c340-1536-4281-815b-ad893b5fcd73
    model.juju.is/id: b3bcd724-b4e0-4bb1-80e2-af346cbba787
  creationTimestamp: "2021-07-02T06:12:30Z"
  labels:
    app.juju.is/created-by: controller
    app.kubernetes.io/managed-by: juju
    app.kubernetes.io/name: mariadb-k8s
    model.juju.is/name: t1
  managedFields:
  - apiVersion: rbac.authorization.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:controller.juju.is/id: {}
          f:model.juju.is/id: {}
        f:labels:
          .: {}
          f:app.kubernetes.io/managed-by: {}
          f:app.kubernetes.io/name: {}
          f:model.juju.is/name: {}
      f:roleRef:
        f:apiGroup: {}
        f:kind: {}
        f:name: {}
      f:subjects: {}
    manager: juju
    operation: Update
    time: "2021-07-02T06:12:30Z"
  name: rbac-foo-t1-pod-cluster-role
  resourceVersion: "166401"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/rbac-foo-t1-pod-cluster-role
  uid: 6ccb09f0-f931-408a-a478-f2c7a5ad817c
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: t1-pod-cluster-role
subjects:
- kind: ServiceAccount
  name: rbac-foo
  namespace: t1

```

## Documentation changes

No

## Bug reference

https://bugs.launchpad.net/juju/+bug/1934180
This commit adds the Equinix Metal API entrypoint to the cloud package and
generates the updated Go file.

Signed-off-by: root <jasmin.gacic@gmail.com>
Signed-off-by: Gianluca Arbezzano <ciao@gianarb.it>
This commit contains the struct and functions needed to correctly hook
Equinix Metal API authentication up to Juju.

Signed-off-by: root <jasmin.gacic@gmail.com>
This is an empty stub for the Equinix Metal Environ and instance
implementation. It returns panics and notImplemented errors.

Signed-off-by: Gianluca Arbezzano <ciao@gianarb.it>
This commit contains the device implementation for Equinix Metal devices.
This commit implements a couple of functions needed to bridge Juju
requirements against the Equinix Metal API.
This commit contains the logic to render valid user data for an Equinix
Metal device.
This commit bridges the Equinix Metal API to the Juju environ interface.
This commit hooks the Equinix Metal provider up to the gocheck test suite.

Signed-off-by: Gianluca Arbezzano <ciao@gianarb.it>
This commit contains the init file to register the Equinix Metal
provider with Juju.

Signed-off-by: root <jasmin.gacic@gmail.com>
Signed-off-by: Gianluca Arbezzano <ciao@gianarb.it>
This commit adds support for Juju subnets to the Equinix Metal
provider.

Signed-off-by: root <jasmin.gacic@gmail.com>
gianarb and others added 16 commits July 5, 2021 17:03
Allow the disabling of public IP addresses if the constraint is set.
For now, this is a workaround for the lack of a firewall.
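
If this is the `allocate-public-ip` constraint (an assumption on my part; check `juju help constraints`), usage would look like:

```console
$ juju deploy mysql --constraints "allocate-public-ip=false"
```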
…ovider-support

juju#13134

This PR backports juju#12983 to 2.9 so it can be made available as part of the next point release.

## QA steps

```console
# clone this branch and compile binaries
$ cp cloud/fallback-public-cloud.yaml ~/.local/share/juju/public-clouds.yaml
# This command shows Equinix as a cloud provider now
$ juju clouds --client --all
# Add credentials
$ juju add-credential equinix
# bootstrap the controller
$ juju bootstrap equinix/am test-equinix 
```

At this point it should be Juju business as usual; what I do is enable HA and check the controller status with:

```console
$ juju enable-ha
$ juju status -m controller
```
The cache would fail to notify if you changed from a value and back to
the original value. This is because the underlying hash value was never
written. This has two consequences:

 1. The cache could never emit a change of the original value
 2. Because we never cached the new value, if the new value was the same
 as the old one, we would still emit the event.

The fix is easy: just save the hash after the notify, and we get
better cache coherency.
juju#13135

A small fix for an issue found by QA folks when setting up a custom docker image repo. If the repo URL configured using `caas-image-repo` has a port, the parsing fails.
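
The pitfall, roughly (illustrative code, not Juju's actual parser): splitting the image path on the first `:` treats the registry port as a tag, so only a `:` after the last `/` can start a tag:

```go
package main

import (
	"fmt"
	"strings"
)

// splitNameTag sketches port-safe tag parsing: a ':' before the last '/'
// belongs to the registry port, not the tag.
func splitNameTag(path string) (name, tag string) {
	lastSlash := strings.LastIndex(path, "/")
	if i := strings.LastIndex(path, ":"); i > lastSlash {
		return path[:i], path[i+1:]
	}
	return path, ""
}

func main() {
	name, tag := splitNameTag("docker.io:5000/jujusolutions/jujud-operator:2.9.8")
	fmt.Println(name, tag) // docker.io:5000/jujusolutions/jujud-operator 2.9.8
}
```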

## QA steps

There's a unit test which ensures the expected behaviour. There are no 2.9.8 images yet, but do this:

` juju bootstrap microk8s test --config caas-image-repo=docker.io:5000/jujusolutions`

and it won't find the image, but you can check that the image URL is correct:

```
microk8s.kubectl -n controller-test get pod/controller-0 -o json | jq .spec.containers[0].image
"docker.io:5000/jujusolutions/jujud-operator:2.9.8"
```

Also bootstrap a standard microk8s controller to regression test default behaviour.

## Bug reference

https://bugs.launchpad.net/juju/+bug/1934707
juju#13137

The cache would fail to notify if you changed from a value and back to
the original value (flip flopping). This is because the underlying hash value 
was never written. 

This has two consequences:

 1. The cache could never emit a change of the original value
 2. Because we never cached the new value, if the new value was the same
 as the old one, we would still emit the event.

The fix is easy: just save the hash after the notify, and we get
better cache coherency.
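
A minimal sketch of the fix (hypothetical names, not the actual cache code): record the hash after notifying, so reverting to the original value still compares as a change.

```go
package cache

import (
	"crypto/sha256"
	"encoding/hex"
)

// entity is a stand-in for the cached entity that notifies subscribers
// when its value changes.
type entity struct {
	lastHash string
	notify   func(value string)
}

func (e *entity) set(value string) {
	sum := sha256.Sum256([]byte(value))
	hash := hex.EncodeToString(sum[:])
	if hash == e.lastHash {
		return // genuinely unchanged: emit nothing
	}
	e.notify(value)
	// The fix: record the hash *after* notifying. Previously it was never
	// written, so flipping back to the original value compared equal to
	// the stale hash and no event was emitted.
	e.lastHash = hash
}
```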

## QA steps

If you change the model-config logging-config to a new value and back to the 
old value, it should now correctly go back to the old value. Flip-flopping of values 
in the logging-config exposes this relatively easily.

```sh
$ juju bootstrap lxd test
$ juju model-config logging-config="<root>=WARN"
$ juju model-config logging-config="<root>=DEBUG"
# flip back to the original value
$ juju model-config logging-config="<root>=WARN"
```

View `$ juju debug-log -m controller` to ensure that the flip-flop happens.
juju#13140

Adds image pull secrets for sidecar charms with private images, such as those from Charmhub.

## QA steps

Deploy charm with private repo images.
Check image pull secrets exist and images pull.
Refresh charm with public repo images.
Check image pull secrets are cleaned up.
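
For example (namespace and pod names are placeholders; the actual secret naming may differ):

```console
# After deploying with a private repo image, a pull secret should exist in
# the model namespace and be referenced by the pod:
$ microk8s.kubectl -n <model-name> get secrets
$ microk8s.kubectl -n <model-name> get pod <unit-pod> -o jsonpath='{.spec.imagePullSecrets}'
```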

## Documentation changes

N/A

## Bug reference

https://bugs.launchpad.net/juju/+bug/1934416
juju#13142

Merge forward 2.8 into 2.9

2ecb859 (upstream/2.8, origin/2.8, 2.8) Merge pull request juju#13137 from SimonRichardson/cache-coherence
c973d15 (achilleasa/2.8) Merge pull request juju#13108 from achilleasa/2.8-support-ssh-with-proxy-to-fan-subnets
f505b12 Merge pull request juju#13097 from achilleasa/2.8-logsink-error-if-persisting-logs-to-db-fails

Conflicts:

CONFLICT (content): Merge conflict in cmd/juju/commands/ssh_unix_test.go
CONFLICT (content): Merge conflict in cmd/juju/commands/ssh_machine.go
CONFLICT (content): Merge conflict in cmd/juju/commands/ssh.go
CONFLICT (content): Merge conflict in cmd/juju/commands/scp.go
SetApplicationsConfig has been removed in favour of SetConfigs, so
the tests are no longer required.
2.9 introduced different return types for GetControllerImagePath, and I
missed that in the merge; this fixes that.
@SimonRichardson
Member Author

!!build!!

@manadart manadart left a comment
Member

Looks like we need to update tests for the anticipated Mongo version.

13:02:34 FAIL: podcfg_test.go:75: podcfgSuite.TestOperatorImagesCustomRepo
13:02:34 
13:02:34 podcfg_test.go:90:
13:02:34     c.Assert(podConfig.GetJujuDbOCIImagePath(), gc.Equals, "path/to/my/repo/juju-db:4.0")
13:02:34 ... obtained string = "path/to/my/repo/juju-db:4.4"
13:02:34 ... expected string = "path/to/my/repo/juju-db:4.0"
13:02:34 
13:02:34 
13:02:34 ----------------------------------------------------------------------
13:02:34 FAIL: podcfg_test.go:58: podcfgSuite.TestOperatorImagesDefaultRepo
13:02:34 
13:02:34 podcfg_test.go:72:
13:02:34     c.Assert(podConfig.GetJujuDbOCIImagePath(), gc.Equals, "jujusolutions/juju-db:4.0")
13:02:34 ... obtained string = "jujusolutions/juju-db:4.4"
13:02:34 ... expected string = "jujusolutions/juju-db:4.0"

I have no idea why this wasn't picked up; probably a CI day activity.
For now, fix this so we can land it.
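
Presumably the fix is just to bump the expected juju-db tags in those assertions to the anticipated Mongo version, e.g.:

```go
// podcfg_test.go: expect the 4.4 juju-db image rather than 4.0.
c.Assert(podConfig.GetJujuDbOCIImagePath(), gc.Equals, "jujusolutions/juju-db:4.4")
```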
@SimonRichardson
Member Author

$$merge$$

2 similar comments
@SimonRichardson
Member Author

$$merge$$

@SimonRichardson
Member Author

$$merge$$

@jujubot jujubot merged commit 4ae860b into juju:develop Jul 8, 2021
@SimonRichardson SimonRichardson deleted the 2.9-into-develop-cache branch July 9, 2021 08:09