Merge develop into 3.0-dqlite #14466

Merged 392 commits on Aug 15, 2022

manadart commented Aug 15, 2022

Merge from the develop branch to bring in:

Only conflict was the updated pebble dependency.

anvial and others added 30 commits July 22, 2022 19:11
…superusers_access_models

juju#14351

Be sure superusers can access all the models inside a controller.

## Checklist

NA

## QA steps

```sh
juju bootstrap lxd test --verbose
juju add-model modela
juju add-machine
juju ssh 0
```
You are now on machine 0. Exit, create a new user, log out, register the user bob, log in as bob, and try to ssh again.
```sh
juju add-user bob --controller test
juju grant bob superuser --controller test
juju change-user-password admin
juju logout
juju register <put here the provided string>
juju login -u bob
juju ssh 0 
```

## Documentation changes

NA

## Bug reference

https://bugs.launchpad.net/juju/+bug/1982337
Avoid timing window where the ID hasn't been set as the charm hasn't
been downloaded yet. A very small window.
Adjust error message on revision without channel to match juju 3.0
juju#14367

When removing a unit with a charm with an lxd profile from a machine, depending on timing, the lxd profile would not be updated.
The root cause was that the unit went to Dead but the `machineLXDProfileInfo()` call was loading all units and not ignoring the dead ones.

As a drive-by, the logic that determines the assigned machine ID when watching units whose charm has an LXD profile was extracted into a helper method. As a result, we avoid an extra call to `AssignedMachineId()` (tests updated accordingly). For subordinate units, we use the unit's own machine ID if it has one; if it is not yet known, we fall back to the principal unit's machine ID.
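
The two behaviours described above (skipping dead units, and falling back to the principal's machine ID for subordinates) can be sketched roughly as follows. This is only an illustration: `Life`, `Unit`, `machineIDFor` and `unitsUsingProfile` are hypothetical names and simplified types, not the actual Juju state API or the helper added in this PR.

```go
package main

import "fmt"

// Life and Unit are simplified, hypothetical stand-ins for Juju's state types.
type Life string

const (
	Alive Life = "alive"
	Dead  Life = "dead"
)

type Unit struct {
	Name      string
	Life      Life
	MachineID string // empty until the unit has been assigned
	Principal *Unit  // set only for subordinate units
}

// machineIDFor returns the unit's own machine ID when known, otherwise
// the principal unit's machine ID (the subordinate fallback).
func machineIDFor(u *Unit) (string, bool) {
	if u.MachineID != "" {
		return u.MachineID, true
	}
	if u.Principal != nil && u.Principal.MachineID != "" {
		return u.Principal.MachineID, true
	}
	return "", false
}

// unitsUsingProfile lists the units on machineID that should still count
// towards the machine's LXD profiles; dead units are ignored.
func unitsUsingProfile(units []*Unit, machineID string) []string {
	var names []string
	for _, u := range units {
		if u.Life == Dead {
			continue
		}
		if id, ok := machineIDFor(u); ok && id == machineID {
			names = append(names, u.Name)
		}
	}
	return names
}

func main() {
	principal := &Unit{Name: "ubuntu/0", Life: Alive, MachineID: "0"}
	units := []*Unit{
		principal,
		{Name: "lxd-profile-alt/0", Life: Dead, MachineID: "0"},
		{Name: "sub/0", Life: Alive, Principal: principal},
	}
	fmt.Println(unitsUsingProfile(units, "0"))
}
```

In this sketch the dead `lxd-profile-alt/0` unit no longer counts towards the machine's profiles, while the subordinate `sub/0` is attributed to machine 0 through its principal.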

## QA steps

juju bootstrap lxd
juju deploy ubuntu --series focal
juju deploy ./testcharms/charm-repo/quantal/lxd-profile-alt --to 0 --series focal --force

wait for things to settle

juju remove-unit lxd-profile-alt/0
lxc profile show juju-controller-lxd-profile-alt-0

check that used_by is empty

## Bug reference

https://bugs.launchpad.net/juju/+bug/1982599
Co-authored-by: John Arbash Meinel <john@arbash-meinel.com>
juju#14368

Check charm origin ID before attempting to refresh charmhub charms. Prevents a case where refresh called before charm is downloaded.

Account for change to refresh in updated run_deploy_revision_upgrade.

Drive by fix of run_deploy_revision_fail.

## QA steps

```sh
(cd tests ; ./main.sh -v deploy test_deploy_revision )
```
Signed-off-by: Marques Johansson <mjohansson@equinix.com>
We get these weird errors from Charmhub every so often, for example:

ERROR resolving with preferred channel: Post "https://api.charmhub.io/v2/charms/refresh": EOF

These aren't valid HTTP responses, so we can't use the existing
juju/http retry logic as it only works for valid HTTP responses with
certain 40x and 50x status codes. This is an empty TCP response.

So retry 3 times (4 attempts) in this EOF case. We need to read in POST
bodies up-front because the http library will be reading/copying them
to the network more than once if a retry occurs. But they shouldn't be
huge, so this seems reasonable.

I used a test client and net.Listen server to reproduce this exact
case, to ensure the io.EOF test works on the real responses we got.
Here is the code for that:

* Client: https://pastebin.canonical.com/p/4RGNKggrK6/
* Server: https://pastebin.canonical.com/p/jtFfNP3GFF/
it clear that we still retrieve network interfaces for those that we
find.
…etrieval

juju#14317

The linked bug describes a situation where we have the instance-poller worker making many calls to retrieve servers and ports from OpenStack, for instances that the provider does not recognise.

This is occurring because the model is dying, but not being cleaned up, and the only machine(s) remaining have already had their instances removed.

Here we do two things:
- Remove the retry logic for server retrieval from `Instances` in the OpenStack provider. At present, we keep attempting to retrieve instances repeatedly for the attempt duration, until we find all IDs we queried for. We will never find them in this case and always exhaust the retries.
- Avoid calling `NetworkInterfaces` when we have no recognised instances in the poll-group. We will have nothing to update anyway, so it is a wasted call.

This is a first step toward greater efficiency. It will likely be followed by a change to filter `NetworkInterfaces` in the provider call rather than filtering later, and by back-offs of the next poll time when we can't find instances.
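
The shape of the two changes can be sketched as below, assuming a much simplified provider. `fakeProvider` and `pollGroup` are hypothetical names for illustration only; the real instance-poller and OpenStack provider have different signatures.

```go
package main

import "fmt"

// fakeProvider is a hypothetical stand-in for a cloud provider; it counts
// calls so the effect of the short-circuit is visible.
type fakeProvider struct {
	live           map[string]bool
	instanceCalls  int
	interfaceCalls int
}

// Instances makes a single lookup attempt and returns only the IDs it
// recognises. There is deliberately no retry loop for missing IDs.
func (p *fakeProvider) Instances(ids []string) []string {
	p.instanceCalls++
	var found []string
	for _, id := range ids {
		if p.live[id] {
			found = append(found, id)
		}
	}
	return found
}

// NetworkInterfaces returns placeholder interface names per instance.
func (p *fakeProvider) NetworkInterfaces(ids []string) map[string][]string {
	p.interfaceCalls++
	out := make(map[string][]string, len(ids))
	for _, id := range ids {
		out[id] = []string{"eth0"}
	}
	return out
}

// pollGroup skips the NetworkInterfaces call entirely when the provider
// recognises none of the instances in the group.
func pollGroup(p *fakeProvider, ids []string) map[string][]string {
	found := p.Instances(ids)
	if len(found) == 0 {
		return nil
	}
	return p.NetworkInterfaces(found)
}

func main() {
	p := &fakeProvider{live: map[string]bool{"a": true}}
	fmt.Println(pollGroup(p, []string{"gone"}), p.interfaceCalls)
	fmt.Println(pollGroup(p, []string{"a", "gone"}), p.interfaceCalls)
}
```

When every instance in the group has already been deleted, only the single `Instances` lookup is made; interfaces are still retrieved for any instances that are found.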

## QA steps

- Make sure you have access to an OpenStack, and the OpenStack client installed.
- Bootstrap on OpenStack.
- Reconfigure `logging-config` so that `juju.provider.openstack=TRACE`.
- `juju add-machine`.
- After the machine is up, delete the instance using `openstack server delete <name|uuid>`
- Usually a run of the instance-poller will result in logs like:
```
machine-0: 17:14:42 TRACE juju.provider.openstack 1/1 live servers found
machine-0: 17:14:43 DEBUG juju.provider.openstack finding subnets in networks: c65bc2d8-0dcd-4ebe-a072-9f71622c3ca4
```
- Within 15 minutes you should see that there is no subnet retrieval, only:
```
(nil), ProviderId:"77d7435a-c074-431b-b3ee-cc92187e2253", ProviderSpaceId:"", ProviderNetworkId:"c65bc2d8-0dcd-4ebe-a072-9f71622c3ca4", VLANTag:0, AvailabilityZones:[]string{}, SpaceID:"", SpaceName:"", FanInfo:(*network.FanCIDRs)(nil), IsPublic:false, Life:""}
machine-0: 17:14:42 DEBUG juju.provider.openstack error listing servers: failed to get details for serverId: 25737f33-c218-449f-b195-13caf288ad7c
caused by: Resource at http://10.245.161.158:8774/v2.1/servers/25737f33-c218-449f-b195-13caf288ad7c not found
caused by: request (http://10.245.161.158:8774/v2.1/servers/25737f33-c218-449f-b195-13caf288ad7c) returned unexpected status: 404; error info: {"itemNotFound": {"code": 404, "message": "Instance 25737f33-c218-449f-b195-13caf288ad7c could not be found."}}
machine-0: 17:14:42 TRACE juju.provider.openstack 0/1 live servers found
```
- CAUTION: Don't be misled by a subnets line that belongs to another model, such as the controller.

## Documentation changes

None

## Bug reference

https://bugs.launchpad.net/juju/+bug/1981597
This interface will no longer be state-specific given we want to use it
for streaming debug-log from syslog.
…test-deploy-bundles-aws

juju#14366

This patch fixes issues with mysql on 2.9, and revision issues on 3.0.
It also adds new tests for bundles whose YAML specifies a floating or a fixed charm revision.

## Checklist

 ~- [ ] Requires a [pylibjuju](https://github.com/juju/python-libjuju) change~
 ~- [ ] Added [integration tests](https://github.com/juju/juju/tree/develop/tests) for the PR~
 ~- [ ] Added or updated [doc.go](https://discourse.charmhub.io/t/readme-in-packages/451) related to packages changed~
 - [x] Comments answer the question of why design decisions were made

## QA steps

```sh
cd tests
./main.sh -v -p aws deploy
```
state logTailer implementation.

This allows us to decouple LogTailerParams from state.
state package, allowing us to use it from streaming logs from syslog.
juju#14371

This is mechanical refactoring to decouple the `LogTailer` interface and `LogTailerParams` from state. We need to do this in order to implement non-Mongo implementations supporting the `debug-log` command.

They are relocated to `core/logger`, and the `opLog` artefact remains in state by becoming an argument to the `NewLogTailer` constructor.
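
To show why moving the interface out of state helps, here is a minimal sketch of a consumer that depends only on a tailer interface, paired with a non-Mongo implementation. The `LogRecord` and `LogTailer` definitions below are simplified assumptions, not the exact types in `core/logger`, and `sliceTailer` and `drain` are hypothetical.

```go
package main

import "fmt"

// LogRecord and LogTailer are simplified assumptions for illustration;
// the real definitions carry more fields and methods.
type LogRecord struct {
	Entity  string
	Message string
}

type LogTailer interface {
	Logs() <-chan *LogRecord
	Stop() error
}

// sliceTailer is a hypothetical non-Mongo source that replays fixed records.
type sliceTailer struct {
	logs chan *LogRecord
}

func newSliceTailer(records []*LogRecord) *sliceTailer {
	ch := make(chan *LogRecord, len(records))
	for _, r := range records {
		ch <- r
	}
	close(ch)
	return &sliceTailer{logs: ch}
}

func (t *sliceTailer) Logs() <-chan *LogRecord { return t.logs }
func (t *sliceTailer) Stop() error             { return nil }

// drain knows nothing about where records come from; it only needs the
// interface, which is the point of decoupling it from state.
func drain(t LogTailer) []string {
	defer t.Stop()
	var lines []string
	for rec := range t.Logs() {
		lines = append(lines, rec.Entity+": "+rec.Message)
	}
	return lines
}

func main() {
	t := newSliceTailer([]*LogRecord{
		{Entity: "machine-0", Message: "started"},
		{Entity: "unit-ubuntu-0", Message: "installed"},
	})
	for _, line := range drain(t) {
		fmt.Println(line)
	}
}
```

A syslog-backed tailer could satisfy the same interface without importing state, while the Mongo-backed one keeps its oplog dependency behind its constructor.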

## QA steps

Just bootstrap and check that `juju debug-log` works.

## Documentation changes

None.

## Bug reference

N/A
barrettj12 and others added 26 commits August 12, 2022 12:34
juju#14452

Pull requests:
- juju#14424 
- juju#14428 
- juju#14430 
- juju#14431 
- juju#14433 
- juju#14434 
- juju#14437 
- juju#14439 
- juju#14440 
- juju#14442 
- juju#14444
- juju#14445 
- juju#14446 
- juju#14448
- juju#14453 
- Updated `github.com/juju/charm/v9` to v9.0.4.

Conflicts:
- apiserver/common/tools_test.go
- cmd/juju/commands/upgradecontroller_test.go
- cmd/juju/commands/upgrademodel_test.go
- go.mod
- go.sum
- scripts/win-installer/setup.iss
- snap/snapcraft.yaml
- version/version.go

Note: ignoring juju#14427 since this is due to be reverted (see juju#14449).
juju#14457

This PR adds support for running the secret-changed hook. Two new secrets manager API calls are added to allow the uniter to watch for secret changes and to ask for the most recent revision info.
The secrets resolver and local/remote state are updated to manage the known revision info.

## Checklist

- [X] Code style: imports ordered, good names, simple structure, etc
- [X] Comments saying why design decisions were made
- [X] Go unit tests, with comments saying what you're testing
- [ ] ~[Integration tests](https://github.com/juju/juju/tree/develop/tests), with comments saying what you're testing~
- [X] [doc.go](https://discourse.charmhub.io/t/readme-in-packages/451) added or updated in changed packages

## QA steps

bootstrap and deploy the ubuntu charm
create a secret, eg
```
juju exec --unit ubuntu/0 "secret-add foo=bar
secret-get secret:cbqqf59esa1idj8gu59g"
```

deploy a second charm with a secret-changed hook, eg
```
$ cat hooks/secret-changed 
#!/bin/bash
echo "secret-changed"
juju-log "secret-changed uri=$JUJU_SECRET_URI label=$JUJU_SECRET_LABEL"
exit 0
```

get the secret value
```
juju exec --unit consumer/0 "secret-get secret:cbqqf59esa1idj8gu59g --label baz"
foo=bar
```

check the logs - no secret changed hook
update the secret
`juju exec --unit ubuntu/0 "secret-update secret:cbqqf59esa1idj8gu59g foo=bar2"`

check the logs
```
INFO juju.worker.uniter.operation ran "secret-changed" hook (via explicit, bespoke hook script)
INFO unit.consumer/0.juju-log secret-changed uri=secret:cbqqf59esa1idj8gu59g label=baz
```
juju#14461

This PR adds the secrets grant and revoke hook commands, along with the front-end API and apiserver backend.
A follow-up PR will implement the actual backend on top of state.

## Checklist

- [X] Code style: imports ordered, good names, simple structure, etc
- [X] Comments saying why design decisions were made
- [X] Go unit tests, with comments saying what you're testing
- ~[ ] [Integration tests](https://github.com/juju/juju/tree/develop/tests), with comments saying what you're testing~
- [x] [doc.go](https://discourse.charmhub.io/t/readme-in-packages/451) added or updated in changed packages

## QA steps

Just unit tests for this PR
manadart merged commit 92ffd88 into juju:3.0-dqlite on Aug 15, 2022
manadart deleted the develop-into-3.0-dqlite branch on August 15, 2022 17:03
jujubot added a commit that referenced this pull request Feb 10, 2023
#15177

The following brings the 3.0-dqlite feature branch into the develop branch.

### Changes

This brings in the dqlite database to sit alongside the mongo database. Currently, only leases are implemented in Juju using dqlite; more controller base configuration and data will be moved over to dqlite once this branch has landed.

#### Leases/Raft

The whole raft implementation has been removed from Juju completely. This includes the following workers:

 - raft backstop
 - raft clusterer
 - raft log
 - raft transport
 - global clock updater

In addition, the raft API implementation has been removed. Instead, the lease manager now handles the store (dqlite db) directly, improving readability and reducing complexity.

### Jujud 

The `jujud` agent is now built using musl (specifically musl-gcc). This allows `juju` to be built statically, embedding `dqlite` in the same process. There are still some rough edges when building and testing; once this lands, we expect some churn as those issues are polished.

Plain `go test` is expected to keep working as is: a last-minute change lets local tests use sqlite directly. If you need to test against dqlite (Linux only), pass `-tags="dqlite"` to builds, tests and installs. All CI jobs are required to run with the dqlite tag.

Some notes:

 1. `CGO_ENABLED=1` and `CGO_LDFLAGS_ALLOW="(-Wl,-wrap,pthread_create)|(-Wl,-z,now)"` are required if you're using dqlite directly.
 2. You are expected to install musl directly on your system if you want to build, using `make musl-install`. This will require sudo.
 3. For development purposes we download dqlite `.a` files from an s3 bucket to simplify setup. The tar file's sha256 checksum is verified to guard against MITM tampering. You can build these locally to bypass s3 using `make dqlite-build-lxd`, which spins up an lxd container for the build. **Do not attempt** to run `make dqlite-build` locally, unless you know what you're doing.
 4. To access dqlite from a controller, use `make repl`. This opens a pseudo REPL in which you can explore the database: run `.open <db name>` and then use SQL from there.
 5. Cross compilation to other architectures can be done using `GOARCH` and `GOOS` before `make install` or `make build`.

There are probably some things I've forgotten, expect a discourse post soon, which will highlight the development flow.

----

Two conflicts when merging. The resolution was to bring in the secret backends for the manifold tests and the controller config type changed for `DefaultMigrationMinionWaitMax`.

```
CONFLICT (content): Merge conflict in cmd/jujud/agent/machine/manifolds_test.go
CONFLICT (content): Merge conflict in controller/config.go
```

c141b2e (upstream/3.0-dqlite) Merge pull request #15159 from SimonRichardson/system-install-musl-by-default
83656e2 Merge pull request #15156 from SimonRichardson/change-log-ddl
125c19d Fix static-analysis pipeline (#15168)
5abfa24 Merge pull request #15140 from SimonRichardson/allow-testing-on-mac
1dc60f6 (3.0-dqlite) Merge pull request #15153 from SimonRichardson/content-addressable-deps
5a1cd24 Merge pull request #15150 from jack-w-shaw/JUJU-2615_symlink_sudo
4502d63 Merge pull request #15148 from SimonRichardson/better-install-method
88941dd Merge pull request #15134 from SimonRichardson/bootstrap-dqlite-agent-tests
2551ffc Merge pull request #15130 from SimonRichardson/build-jujud-snap
0180a53 (origin/3.0-dqlite, manadart/3.0-dqlite) Merge pull request #15123 from SimonRichardson/fix-manifold-lease-expiry-tests
fdf9cc7 Merge pull request #15115 from SimonRichardson/remove-jujud-main-test-file
bf58843 Merge pull request #15113 from SimonRichardson/remove-api-raftlease-api-client
f9419c0 Merge pull request #15112 from SimonRichardson/fix-initializing-state-twice
334d557 Merge pull request #15108 from SimonRichardson/github-action-go-build
2ee6e1a Merge pull request #15107 from SimonRichardson/cross-building-jujud
5a93305 Merge pull request #15087 from SimonRichardson/ensure-placement-of-file
da95dc0 Merge pull request #15086 from SimonRichardson/more-sudo-changes
7b86376 Merge pull request #15085 from SimonRichardson/sudo-apt-get
c4d4eb6 Merge pull request #15057 from SimonRichardson/dqlite-local-build
0ac79b3 Merge pull request #15061 from manadart/develop-into-3.0-dqlite
adc20f7 Merge pull request #15043 from SimonRichardson/allow-overriding-arch-machine
8c02f22 Merge pull request #15048 from SimonRichardson/static-analysis-fix
4547c06 Merge pull request #15050 from manadart/dqlite-address-option
d51b324 Merge pull request #15049 from manadart/dqlite-bootstrap-options
3801b78 Merge pull request #15047 from manadart/develop-into-3.0-dqlite
22d5247 Merge pull request #15037 from SimonRichardson/standardise-dqlite-build
25640a2 Merge pull request #15036 from SimonRichardson/remove-batch-fsm-controller-config
dfa4cb1 Merge pull request #15041 from manadart/dqlite-fix-mock
caf9481 Merge pull request #15034 from manadart/develop-into-3.0-dqlite
c91985d Merge pull request #15035 from SimonRichardson/remove-typed-lease-error
42d17be Merge pull request #15009 from SimonRichardson/allow-repl-via-juju-ssh
d798238 Merge pull request #15002 from manadart/dqlite-use-lease-store
e4f0d39 Merge pull request #14918 from manadart/3.0-dqlite-lease-store
8315fb7 Merge pull request #14986 from manadart/dqlite-build-from-tags
a73b947 Merge pull request #14927 from manadart/3.0-dqlite-lease-store-interface
1657a1d Merge pull request #14910 from manadart/3.0-dqlite-db-supply
27b23f3 Merge pull request #14909 from manadart/3.0-into-3.0-dqlite
6adff35 Merge pull request #14756 from manadart/develop-into-3.0-dqlite
37d81ff Merge pull request #14717 from manadart/develop-into-3.0-dqlite
fe2edb8 Merge pull request #14671 from manadart/3.0-simplify-dbaccessor
1a09836 Merge pull request #14604 from manadart/3.0-bootstrap-controller-db
5ad011e Merge pull request #14652 from manadart/develop-into-3.0-dqlite
1c3d250 Merge pull request #14591 from manadart/develop-into-3.0-dqlite
229cd3e Merge pull request #14578 from manadart/3.0-dqlite-simplify
9d715ba Merge pull request #14565 from manadart/develop-into-3.0-dqlite
92ffd88 Merge pull request #14466 from manadart/develop-into-3.0-dqlite
57f67ce Merge pull request #14336 from manadart/develop-into-3.0-dqlite
648d354 Merge pull request #14364 from manadart/update-musl
198621d Merge pull request #14241 from manadart/develop-into-3.0-dqlite
0360db6 Merge pull request #14153 from manadart/develop-into-3.0-dqlite
17950b2 Merge pull request #14053 from manadart/develop-into-3.0-dqlite
9452026 Merge pull request #14016 from manadart/develop-into-3.0-dqlite
741baca Merge pull request #13963 from manadart/develop-into-3.0-dqlite
5449603 Merge pull request #13969 from manadart/dqlite-manifolds
7b612a0 Merge pull request #13944 from SimonRichardson/dqlite-develop