log forwarding: Workers now run per-model #7418

Merged 2 commits into juju:develop on Jun 1, 2017

Conversation

Member

babbageclunk commented May 31, 2017

Description of change

At the moment, the most expensive step in destroying a model is clearing out its logs. We want to make this cheaper by having a log collection for each model. But in order to do this we have to make log forwarding happen per-model rather than running for all models at once.

Move the log forwarder worker creation into the model manifolds, and allow the log streaming endpoint to be called for non-controller models. Remove the all-model capability of the log tailer, and always include the model UUID in the log messages it generates.

QA steps

  • Generate a CA cert + key and certs and keys for a server and client.
  • Set up an rsyslog server to receive forwarded messages with the server certificate.
    • Refer to this tutorial for information on this.
    • Be sure to match the common names in the certificates to the names you put into the rsyslog config.
  • If you're specifying the IP address of the server, ensure the server certificate has that IP address as a SubjectAltName - Go's TLS stack enforces this, although OpenSSL doesn't by default.
  • Bootstrap a controller and deploy some applications and relations to the default model.
  • Configure log forwarding by specifying the following model-config for both controller and default model:
    • syslog-host: the IP and port of the rsyslog receiver (10514 if you follow the instructions above).
    • syslog-ca-cert
    • syslog-client-cert
    • syslog-client-key
  • Turn forwarding on for only the controller by setting logforward-enabled to true.
  • Check that log messages from the controller appear in the receiver's logs, and that messages from the default model don't.
  • Turn off logging for the controller model and enable it for the default model - records from the default model should start coming in and the controller ones should stop.
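The QA steps above can be sketched as CLI commands. This is a hypothetical walkthrough, not a tested script: the receiver address, port, and certificate file names are placeholders, and it assumes the `logforward-enabled` model-config key.

```shell
# Point both the controller and default models at the rsyslog receiver
# (10514 if you followed the tutorial). The cert/key file names are placeholders.
for model in controller default; do
    juju model-config -m "$model" \
        syslog-host=10.0.0.5:10514 \
        syslog-ca-cert="$(cat ca-cert.pem)" \
        syslog-client-cert="$(cat client-cert.pem)" \
        syslog-client-key="$(cat client-key.pem)"
done

# Forward only the controller model's logs first...
juju model-config -m controller logforward-enabled=true

# ...then flip it: controller off, default model on. Records from the default
# model should start arriving at the receiver and controller ones should stop.
juju model-config -m controller logforward-enabled=false
juju model-config -m default logforward-enabled=true
```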

TestTailingLogsOnlyForControllerModel is misnamed now, right? We really just need to test that tailing for a model works.

Update log forwarder to work per-model
Move it into the model manifolds. This means we can drop the fake state
dependency - this wasn't required except to prevent the worker from
running in machine agents that weren't the controller, and only the
controller runs model workers.

Pull out the support for forwarding logs from all models. We want to
split the logs collection into one collection per model after this
change, so logging from all models would be harder to do (and we don't
think there's any need for it).

Make the forwarder always send the model ID.
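The per-model stamping described above can be sketched as follows. This is a minimal illustration, not Juju's actual types: `LogRecord`, `Sender`, and `Forwarder` are hypothetical stand-ins showing how a forwarder created per model can always include that model's UUID in the records it sends.

```go
package main

import "fmt"

// LogRecord is a simplified stand-in for a forwarded log record.
type LogRecord struct {
	ModelUUID string
	Message   string
}

// Sender abstracts the syslog client the forwarder writes to.
type Sender interface {
	Send(LogRecord) error
}

// Forwarder handles one model's logs, stamping every record with that
// model's UUID before sending (hypothetical sketch, not Juju's API).
type Forwarder struct {
	modelUUID string
	sender    Sender
}

func NewForwarder(modelUUID string, s Sender) *Forwarder {
	return &Forwarder{modelUUID: modelUUID, sender: s}
}

func (f *Forwarder) Forward(rec LogRecord) error {
	rec.ModelUUID = f.modelUUID // always include the model UUID
	return f.sender.Send(rec)
}

// captureSender records what was sent, for demonstration.
type captureSender struct{ got []LogRecord }

func (c *captureSender) Send(r LogRecord) error { c.got = append(c.got, r); return nil }

func main() {
	c := &captureSender{}
	f := NewForwarder("deadbeef-0bad-400d-8000-4b1d0d06f00d", c)
	f.Forward(LogRecord{Message: "unit started"})
	fmt.Println(c.got[0].ModelUUID) // the record carries the model's UUID
}
```

Because each forwarder is constructed inside one model's manifolds, the UUID never needs to be inferred from the record itself.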

babbageclunk commented May 31, 2017

$$merge$$

Contributor

jujubot commented May 31, 2017

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju


jujubot commented May 31, 2017

Build failed: Tests failed
build url: http://juju-ci.vapour.ws:8080/job/github-merge-juju/11033

Fix potential state leak in machine agent
...and make the syslog feature test pass. The test was failing for a
couple of weird reasons (despite the feature working correctly).

First, the log-forwarder was never being started so the dummy rsyslog
server never got any messages. This was because the model manifolds are
run by the state workers worker, which in turn depended on the state
worker (which opens a state connection). This was failing because the
default controller config used when creating the agent config had the
state/mongo port as 1234. Explicitly setting this to the port reported
by the juju/testing mongo instance resolved this, revealing the next
problem.

The test was now passing (log forwarding was working) but teardown was
failing because there were mongo connections left alive. This turned out
to be because there was a state leak in the machine agent - if there was
an error starting the apiserver, the state created for it would be
leaked. The apiserver wasn't starting because it was trying to bind to a
port that was already in use. It turns out that the dummy provider starts
an apiserver on that same port, and all of the workers that use the API
were talking to that one instead of the one the agent should've created.

I've fixed the leak, and the test passes, but it's a weird situation. I
think the right thing would be for tests against a model-hosting
agent (i.e. a controller agent rather than a normal machine or unit agent)
not to use AgentSuite, but to use something else that doesn't use dummy
provider to bootstrap, since the tests will run an agent that should
contain its own apiserver.

I'll make a separate task for that.
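The leak described above follows a common shape: a resource is opened for a worker, and an error on the worker's startup path returns early without releasing it. A minimal sketch of the fix, with hypothetical `state` and `startAPIServer` stand-ins rather than Juju's actual types:

```go
package main

import "fmt"

// state stands in for a mongo-backed State connection (hypothetical).
type state struct{ closed bool }

func (s *state) Close() { s.closed = true }

// startAPIServer simulates an apiserver start that can fail,
// e.g. because the port is already bound by something else.
func startAPIServer(st *state, fail bool) error {
	if fail {
		return fmt.Errorf("listen: address already in use")
	}
	return nil
}

// runAPIServer guarantees the state is closed again if the apiserver
// fails to start - previously the early return leaked the connection.
func runAPIServer(st *state, fail bool) error {
	if err := startAPIServer(st, fail); err != nil {
		st.Close() // release the state instead of leaking it
		return err
	}
	return nil
}

func main() {
	st := &state{}
	err := runAPIServer(st, true)
	fmt.Println(err != nil, st.closed) // prints: true true
}
```

With the connection closed on the error path, teardown no longer finds live mongo connections after a failed apiserver start.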

babbageclunk commented Jun 1, 2017

$$merge$$


jujubot commented Jun 1, 2017

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

@jujubot jujubot merged commit f1821b1 into juju:develop Jun 1, 2017

1 check failed

github-check-merge-juju Built PR, ran unit tests, and tested LXD deploy. Use !!.*!! to request another build, e.g. !!build!!, !!retry!!

@babbageclunk babbageclunk deleted the babbageclunk:logforward-permodel branch Jun 1, 2017
