
[dev] merge 2.9 #13329

Merged
merged 74 commits into juju:develop from dev-merge-2.9 on Sep 13, 2021

Conversation

achilleasa
Contributor

@achilleasa achilleasa commented Sep 13, 2021

This PR forward-ports 2.9 into develop. The following PRs are included in this port:

The following files had merge conflicts that had to be resolved (please double-check the changes in last commit):

  • caas/kubernetes/provider/bootstrap_test.go
  • feature/flags.go
  • scripts/win-installer/setup.iss
  • snap/snapcraft.yaml
  • state/pool.go
  • version/version.go
  • worker/uniter/relation/state_test.go

Juan Tirado and others added 30 commits July 26, 2021 17:01
The following changes ensure that the eth0 device gets a unique name, which is given to us via LXD. Unfortunately, we're currently unsure why you would have interfaces the first time around and then devices another time around, which seems to be the only way to trip this up.

More information from the host machine would be required to find the
underlying bug: the output of `ip a` and a list of all the NICs.

For now this seems like it has the potential to unblock containers being
created.
…shell-2.9

juju#13309

See also juju#13299

/etc/profile.d/juju-introspection.sh contains a few bash arrays, breaking POSIX shell compatibility and, since it is loaded as a profile script, at least some packages' post-installation scripts.

For example, on a model deployed with Juju 2.9.11, installing spamassassin yields a _Syntax error_ when the spamassassin post-installation script attempts to run sa-update as the debian-spamd user, which has /bin/sh as its shell.

The change includes variable name changes (and a comment) to avoid future updates breaking POSIX compatibility.
Feel free to update the comment to whatever you see fit.

## QA steps

* Check it builds properly
* Deploy a model with a spamassassin installation (e.g. the charm cs:mailman3-core) and check it doesn't fail during SA install.

## Bug reference

Fixes https://bugs.launchpad.net/juju/+bug/1942430
Probable iteration of https://bugs.launchpad.net/juju/+bug/1770437
juju#13312

As seen in the linked bug, there can be a panic if the txn watcher worker bounces and closes a "watcherStarted" channel twice. However, that channel and its associated pubsub topic are unnecessary in the first place, so this PR removes them both.

## QA steps

Bootstrap, deploy a couple of charms, and relate them.

## Bug reference

https://bugs.launchpad.net/juju/+bug/1942421
juju#13221

A very common user behaviour is to watch the status of deployments for a while. In the documentation this is done using the `watch` command in conjunction with `juju status`. I have added a `--watch` flag to the `status` command that takes a duration value and re-runs `status` at that interval until interrupted.

## QA steps

To test, simply deploy any application and watch the repeated status calls.

```sh
juju deploy hello-juju
# print status and exit
juju status
# print the status every two seconds
juju status --watch=2s
```
The operation continues until the program is killed.

## Documentation changes

The flag is documented in the command docs.
SimonRichardson and others added 23 commits September 9, 2021 16:04
Until we can correctly pull in the LXD upstream project, we have to use
a workaround to check for the error status. This is an abridged version
of how the LXD code base does it internally[1].

 1. https://github.com/lxc/lxd/blob/master/shared/api/error.go#L32-L58
juju#13320

The following PR reverts the 2.9 changes around LXD. As the LXD project is
using Go 1.16 and we're using 1.14, there are some issues around the exact
packages we're building with. Running `go mod tidy` on the tip of 2.9
returns an error (see below) which indicates something is using the new
`io/fs` packages that we're unable to consume, so it's finding alternative
dependencies to use from the sum file (at a guess). As stated before,
go.mod and the sum file are not lock files; they only describe what was
downloaded and ensure it isn't corrupt.

The fix is simple: revert the LXD changes and work around the LXD status error
by creating an interface to validate against.

```
go mod tidy
github.com/juju/juju/apiserver/logsink imports
 gopkg.in/natefinch/lumberjack.v2 tested by
 gopkg.in/natefinch/lumberjack.v2.test imports
 github.com/BurntSushi/toml imports
 io/fs: package io/fs is not in GOROOT (/usr/local/go/src/io/fs)
github.com/juju/juju/apiserver/logsink imports
 gopkg.in/natefinch/lumberjack.v2 tested by
 gopkg.in/natefinch/lumberjack.v2.test imports
 github.com/BurntSushi/toml tested by
 github.com/BurntSushi/toml.test imports
 testing/fstest: package testing/fstest is not in GOROOT (/usr/local/go/src/testing/fstest)
github.com/juju/juju/apiserver/logsink imports
 gopkg.in/natefinch/lumberjack.v2 tested by
 gopkg.in/natefinch/lumberjack.v2.test imports
 github.com/BurntSushi/toml tested by
 github.com/BurntSushi/toml.test imports
 github.com/BurntSushi/toml/internal/toml-test imports
 embed: package embed is not in GOROOT (/usr/local/go/src/embed)
```

## QA steps


```sh
$ snap install juju --classic
$ /snap/bin/juju bootstrap lxd test
$ juju upgrade-controller --build-agent
$ juju debug-log -m controller
```
Ensure that the single controller settles and has started.

```sh
$ juju ssh -m controller 0
$ juju_engine_report | less
```

## Bug reference

https://bugs.launchpad.net/juju/+bug/1943075
juju#13319

Some time ago we did a significant overhaul of the payload delivery for updating machine address configuration. Included were changes to use members of the `NetworkConfig.Addresses` entries instead of those on the top-level `NetworkConfig`. One of these was the address subnet CIDR.

What we missed at the time was to change `NetworkConfigAPI.fixUpFanSubnets` so that it applied the Fan updates to the `Addresses` entries and not the top-level `CIDR` member.

This meant that Fan addresses were being stored with subnets indicating the /8 overlay instead of the zone-specific /12 segments in AWS, resulting in the observed bug.

Here we fix `fixUpFanSubnets` so that it sets the corrected Fan segment subnets on the incoming addresses.

## QA steps

- `juju bootstrap aws patch-test --debug --no-gui --build-agent`
- `juju add-space space-1 172.31.0.0/20`
- `juju add-machine --constraints "spaces=space-1" --series bionic`
- `juju deploy cs:ubuntu --to lxd:0 --series focal --constraints "spaces=space-1" --bind "space-1"`
- Observe that the container is provisioned.

## Documentation changes

None.

## Bug reference

https://bugs.launchpad.net/juju/+bug/1942950
juju#13321

The secret-create and secret-update hook commands can now set additional secret metadata:
- description
- tags
- status

Secret status defaults to 'active' and can be set to 'pending' via the CLI option `--pending`. It can be used when rotating secrets.

The secret URL format has also changed. The main change is the removal of the version from the URL and the introduction of an 'app' namespace component (more to come). Also, the use of '.' as a path separator has been replaced by '/'.

## QA steps

Bootstrap and deploy the ubuntu charm:

```
$ juju deploy ubuntu
$ juju exec --unit ubuntu/0 "secret-create password --description 'my secret' --tag foo=bar --tag hello=world data=s3cret!"
secret://app/ubuntu/password

$ juju exec --unit ubuntu/0 "secret-get secret://app/ubuntu/password"
s3cret!

$ juju exec --unit ubuntu/0 "secret-update password --rotate 5h --description 'ssshhh'"
secret://app/ubuntu/password?revision=1

$ juju secrets --format yaml
- ID: 1
  URL: secret://app/ubuntu/password
  revision: 1
  path: app/ubuntu/password
  status: active
  rotate-interval: 5h0m0s
  version: 1
  description: ssshhh
  tags:
    foo: bar
    hello: world
  backend: juju
  create-time: 2021-09-10T00:48:49Z
  update-time: 2021-09-10T00:50:26Z

$ juju exec --unit ubuntu/0 "secret-update password --pending data=another"
secret://app/ubuntu/password?revision=2

$ juju secrets --format yaml --show-secrets
- ID: 1
  URL: secret://app/ubuntu/password
  revision: 2
  path: app/ubuntu/password
  status: pending
  rotate-interval: 5h0m0s
  version: 1
  description: ssshhh
  tags:
    foo: bar
    hello: world
  backend: juju
  create-time: 2021-09-10T00:48:49Z
  update-time: 2021-09-10T00:52:42Z
  value:
    data: another
```
The host_name should be generated directly by LXD, instead of by Juju.
We only set the host_name to identify the devices/NICs when looking at
them via `ip a`. As this was just a convenience, we should fall back to
letting LXD do it. This will prevent naming collisions at the kernel
level, where it hasn't gc'd the devices in time.
It is no longer required to pass the machine ID through to generate the
device name.
This test is meant to catch regressions for LP1942328.
…s-config-host-name

juju#13274

The following changes ensure that the eth0 device gets a unique name, which is given to us via LXD. Unfortunately, we're currently unsure why you would have interfaces the first time around and then devices another time around, which seems to be the only way to trip this up.

More information from the host machine would be required to find the
underlying bug: the output of `ip a` and a list of all the NICs.

For now, this seems like it has the potential to unblock containers being
created.

## QA steps

Run this branch and, if you encounter the issue, please give us the
output of `ip a` and a list of all the NICs.

## Bug reference

https://bugs.launchpad.net/juju/+bug/1932180
In order to improve the readability of why a raft worker is restarting, the
following introduces logging of error messages.

The problem isn't helped by the masked ErrStartTimeout waiting for a new
raft channel. As the catacomb is killed by the ErrStartTimeout, we lose
the original error message.

We should revisit if this is the best way to do this!
juju#13325

In order to improve the readability of why a raft worker is restarting, the
following introduces logging of error messages.

The problem isn't helped by the masked ErrStartTimeout waiting for a new
raft channel. As the catacomb is killed by the ErrStartTimeout, we lose
the original error message.

## QA steps

```sh
$ snap install juju --classic
$ /snap/bin/juju bootstrap lxd test
$ go get github.com/lxc/lxd@172613bf35d31b8e788f00fe93069b8d0d19b947
$ juju upgrade-controller --build-agent
$ juju debug-log -m controller
```

You should see the following error message:

```
machine-0: 16:26:49 ERROR juju.worker.raft Failed to setup raft instance, err: failed to get last log at index 52: msgpack decode error [pos 13]: invalid length of bytes for decoding time - expecting 4 or 8 or 12, got 15
```

## Bug reference

https://bugs.launchpad.net/juju/+bug/1943075
Remove unused azure network/storage imports in py test
juju#13328

When running the series-upgrade acceptance test, just ignore whatever return code comes back from rebooting the machine via SSH. That exit code is not what we should be hinging test failure on.

## QA steps

- With the appropriate virtualenv activated: `(cd acceptancetests ; ./assess upgrade_series --juju $YOUR-JUJU --logging-config '<root>=DEBUG')`
- You might see this output in the test logging:
```
ERROR subprocess encountered error code 255
2021-09-13 11:38:22 INFO Ignoring `juju ssh` exit code 1 after triggering reboot
```

## Documentation changes

None.

## Bug reference

N/A
…-test

juju#13324

This PR adds a regression test for ensuring that Juju correctly retains empty `openvswitch` blocks (such as the ones injected by MAAS to composed KVM machines with OVS bridges) when merging netplan configurations.
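For reference, an empty `openvswitch` block of the kind MAAS injects looks roughly like this in a netplan configuration (illustrative fragment; interface names and addresses are made up):

```yaml
network:
  version: 2
  bridges:
    br-ex:
      interfaces: [bond0]
      openvswitch: {}   # empty block MAAS adds for OVS bridges; must survive merging
      addresses: [10.0.0.2/24]
```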

## QA steps

The test is wired to use a dual-spaced machine with an OVS bridge on our internal finfolk MAAS.
To run the QA steps, define a MAAS cloud and point it to finfolk. Then:

```sh
$ cd tests
$ ./main.sh -v -p maas -c $finfolk-cloud-name ovs_maas
```

### Setting up a suitable machine for this test on your local MAAS instance

The following steps will allow you to prepare a machine with the right NIC devices for running the above test. The network setup is based on [this](https://openstackdevops.wordpress.com/2021/05/03/converged-networking-with-charmed-openstack/) blog post about converged networking.

Prior to running the following steps you must:
 - be running MAAS 2.9.2+.
 - define two spaces named **space1** and **space2**.

Using the MAAS UI:
1) Compose a new KVM machine (1 core, 4G RAM)
2) **Abort** the commissioning attempt for the new machine.
3) Connect to the KVM host and edit the machine definition using `virsh`. For example, if your machine is called 'foo' run:
 `virsh edit foo`.
4) Locate the XML block that describes the one and only NIC assigned to the machine. It should look something like:
 ```xml
 <interface type='network'>
   <mac address='52:54:00:24:a7:07'/>
   <source network='maas'/>
   <model type='virtio'/>
   <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
 </interface>
 ```
5) Clone this block two times (changing the **slot** attribute to **0x08** and **0x0a** respectively). Use [this](https://gist.github.com/viz3/6591201) script to generate a MAC address for each new NIC.
6) Go back to the MAAS UI and commission the machine.
7) Edit the machine and attach the `ovs` tag to it.
8) In the networking tab for the machine you should now see three NICs.
 - Select the first two NICs and click the `create bond` button; leave the default values as-is and save the bond.
 - Select the bond and click the `create bridge` button. Under bridge type select `open vswitch`. Then, select the appropriate fabric / subnet values for `space1` and `Auto assign` for the IP configuration option.
 - Select the remaining NIC from the list, edit it and select the appropriate fabric/subnet values for `space2` and `Auto assign` for the IP configuration option.
9) Deploy the machine and ensure that the operation succeeds. If the machine gets stuck in `Rebooting`, ask the MAAS folks for help.
Resolved Conflicts:
  caas/kubernetes/provider/bootstrap_test.go
  feature/flags.go
  scripts/win-installer/setup.iss
  snap/snapcraft.yaml
  state/pool.go
  version/version.go
  worker/uniter/relation/state_test.go
Member

@hmlanigan hmlanigan left a comment


Double-checked the files mentioned - LGTM.

@hmlanigan
Member

$$merge$$

@jujubot jujubot merged commit a0f25a5 into juju:develop Sep 13, 2021
@achilleasa achilleasa deleted the dev-merge-2.9 branch September 14, 2021 14:03