Logging in the Clouds
Loggregator is the user application logging subsystem of Cloud Foundry.
Table of Contents
- Emitting Messages from other Cloud Foundry components
- Enabling TLS between Metron and Doppler
- Deploying via BOSH
- Configuring the Firehose
- Consuming Log and Metric Data
- Loggregator as a separate release
- Metrics generated by Loggregator
Loggregator allows users to:
- Tail their application logs.
- Dump a recent set of application logs (where recent is a configurable number of log packets).
- Continually drain their application logs to 3rd party log archive and analysis services.
- (Operators and administrators only) Access the firehose, which includes the combined stream of logs from all apps, plus metrics data from CF components.
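The "recent logs" feature in the list above can be pictured as a fixed-size buffer per application that keeps only the newest packets. The following is a minimal Python sketch under that assumption, not Loggregator's implementation; the `RecentLogs` class and the `max_packets` parameter are hypothetical names.

```python
from collections import deque


class RecentLogs:
    """Keep only the newest `max_packets` log packets for one application."""

    def __init__(self, max_packets):
        # A deque with maxlen discards the oldest entry once the limit is hit.
        self._packets = deque(maxlen=max_packets)

    def append(self, packet):
        self._packets.append(packet)

    def dump(self):
        """Return the retained packets, oldest first."""
        return list(self._packets)


recent = RecentLogs(max_packets=3)
for n in range(5):
    recent.append(f"log line {n}")
print(recent.dump())  # ['log line 2', 'log line 3', 'log line 4']
```

Raising `max_packets` trades memory for a longer history, which is the kind of tuning the "configurable number of log packets" wording suggests.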
First, make sure you're using the new golang based CF CLI. Once that's installed:
    cf logs APP_NAME [--recent]

    $ cf logs private-app
    Connected, tailing...
    Oct 3 15:09:26 private-app App/0 STDERR This message is on stderr at 2013-10-03 22:09:26 +0000 for private-app instance 0
    Oct 3 15:09:26 private-app App/0 STDERR 22.214.171.124, 10.10.2.148 - - [03/Oct/2013 22:09:26] "GET / HTTP/1.1" 200 81 0.0010
    Oct 3 15:09:26 private-app App/0 This message is on stdout at 2013-10-03 22:09:26 +0000 for private-app instance 0
    Oct 3 15:09:26 private-app App/0 STDERR This message is on stderr at 2013-10-03 22:09:26 +0000 for private-app instance 0
    ^C
- Loggregator collects STDOUT & STDERR from applications. This may require configuration on the developer's side.
- Warning: the DEA logging agent expects an application to have open connections on both STDOUT and STDERR. Closing either of these (for example, by redirecting output to /dev/null) is treated by the logging agent as a sign of a misbehaving application, and the agent will disconnect from all sockets for that app.
- A Loggregator outage must not affect the running application.
- Loggregator gathers and stores logs in a best-effort manner. While undesirable, losing the current buffer of application logs is acceptable.
- The 3rd party drain API should mimic Heroku's in order to reduce integration effort for our partners. The Heroku drain API is simply remote syslog over TCP.
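The outage and best-effort constraints above suggest a sender that never blocks the application and simply drops data under pressure. A minimal Python sketch of that idea follows; the `BestEffortBuffer` class and its drop-everything-when-full policy are illustrative assumptions, not a description of Loggregator's code.

```python
class BestEffortBuffer:
    """Bounded buffer that never blocks the producer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.messages = []
        self.dropped = 0  # total messages discarded so far

    def emit(self, message):
        # When the buffer is full, discard its current contents rather than
        # making the application wait for a slow or unavailable consumer.
        if len(self.messages) >= self.capacity:
            self.dropped += len(self.messages)
            self.messages.clear()
        self.messages.append(message)

    def drain(self):
        """Hand all buffered messages to the consumer and empty the buffer."""
        batch, self.messages = self.messages, []
        return batch


buf = BestEffortBuffer(capacity=2)
for line in ["a", "b", "c"]:
    buf.emit(line)
print(buf.drain(), buf.dropped)  # ['c'] 2
```

The producer's `emit` call always returns immediately, so a stalled consumer costs log lines rather than application latency.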
Loggregator is composed of:
- Sources: Logging agents that run on the Cloud Foundry components.
- Metron: Metron agents are co-located with sources. They collect logs and forward them to:
- Doppler: Responsible for gathering logs from the Metron agents, storing them in temporary buffers, and forwarding logs to 3rd party syslog drains.
- Traffic Controller: Handles client requests for logs. Gathers and collates messages from all Doppler servers, and provides external API and message translation (as needed for legacy APIs).
Source agents emit the logging data as protocol-buffers, and the data stays in that format throughout the system.
In a redundant Cloud Foundry setup, Loggregator can be configured to survive zone failures. Log messages from unaffected zones will still reach the end user. On AWS, availability zones can be used as redundancy zones. The following is an example of a multi-zone setup with two zones.
The role of Metron is to take traffic from the various emitter sources (dea, dea-logging-agent, router, etc.) and route that traffic to one or more dopplers. In the current configuration, this traffic is routed to the dopplers in the same AZ and is randomly distributed across them.
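The routing rule described above (same zone, random choice) can be sketched in a few lines of Python. This is only an illustration: the `pick_doppler` function, the zone map, and the addresses are hypothetical placeholders, not Metron's actual code or configuration.

```python
import random

# Hypothetical zone-to-doppler mapping; the addresses are placeholders.
DOPPLERS_BY_ZONE = {
    "z1": ["10.0.1.10:3456", "10.0.1.11:3456"],
    "z2": ["10.0.2.10:3456"],
}


def pick_doppler(zone, dopplers_by_zone, rng=random):
    """Pick one doppler at random from those in the caller's own zone."""
    candidates = dopplers_by_zone.get(zone, [])
    if not candidates:
        raise LookupError(f"no dopplers known for zone {zone!r}")
    return rng.choice(candidates)


target = pick_doppler("z1", DOPPLERS_BY_ZONE)
print(target in DOPPLERS_BY_ZONE["z1"])  # True
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible in tests.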
The role of Traffic Controller is to handle inbound HTTP and WebSocket requests for log data. It does this by proxying the request to all dopplers (regardless of AZ). Since an application can be deployed to multiple AZs, its logs can potentially end up on dopplers in multiple AZs. This is why the traffic controller will attempt to connect to dopplers in each AZ and will collate the data into a single stream for the web socket client.
The traffic controller itself is stateless; an incoming request can be handled by any instance in any AZ.
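The collation step can be pictured as merging one stream per doppler into a single client-facing stream. The sketch below assumes each stream yields `(timestamp, message)` pairs already ordered by timestamp and merges on that key; the text does not say how the traffic controller actually orders messages, so treat the ordering (and the `collate` name) as an assumption.

```python
import heapq


def collate(*doppler_streams):
    """Merge per-doppler streams of (timestamp, message) into one stream.

    Assumes every input stream is already ordered by timestamp.
    """
    return heapq.merge(*doppler_streams, key=lambda entry: entry[0])


# Two dopplers in different AZs holding logs for the same application.
z1 = [(1, "app/0 started"), (4, "app/0 GET /")]
z2 = [(2, "app/1 started"), (3, "app/1 GET /")]

for timestamp, message in collate(z1, z2):
    print(timestamp, message)
```

Because `heapq.merge` is lazy, the merged stream can be consumed incrementally, which matches the streaming nature of a WebSocket connection.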
Traffic controllers also expose a firehose web socket endpoint. Connecting to this endpoint establishes connections to all dopplers, and streams logs and metrics for all applications and CF components.
Emitting Messages from other Cloud Foundry components
Cloud Foundry developers can easily add source clients to new CF components that emit messages to the doppler. Currently, there are libraries for Go and Ruby. For usage information, look at their respective READMEs.
Enabling TLS between Metron and Doppler
The default transport between Metron and Doppler is UDP. We have recently added support for TLS, which offers reliable delivery and built-in support for integrity, encryption, and authentication.
NOTE: TLS support is currently experimental. Enable it at your own discretion. The properties discussed below as well as their behavior might change in the future.
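|Property|Description|
|---|---|
|properties.metron_agent.preferred_protocol|Metron prefers this protocol to communicate with Doppler. Options are|
|properties.metron_agent.buffer_size|Size of the buffer between Metron and Doppler when the|
|properties.metron_agent.tls_client.cert|Signed client certificate used by Metron when communicating with Doppler over TLS|
|properties.metron_agent.tls_client.key|Client key used by Metron when communicating with Doppler over TLS|
|properties.loggregator.tls.ca|Certificate Authority used to sign the certificate|
|properties.doppler.enable_tls_transport|Enable TLS communication with Metron. If enabled,|
|properties.doppler.tls_server.cert|Signed server certificate used by Doppler when communicating with Metron over TLS|
|properties.doppler.tls_server.key|Server key used by Doppler when communicating with Metron over TLS|
|properties.loggregator.tls.ca|Certificate Authority used to sign the certificate|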
An example manifest is given below:
    loggregator:
      tls:
        ca: |
          -----BEGIN CERTIFICATE-----
          LOGGREGATOR CA CERTIFICATE
          -----END CERTIFICATE-----

    metron_agent:
      preferred_protocol: tls
      buffer_size: 1000
      tls_client:
        cert: |
          -----BEGIN CERTIFICATE-----
          METRON AGENT CERTIFICATE
          -----END CERTIFICATE-----
        key: |
          -----BEGIN RSA PRIVATE KEY-----
          METRON AGENT KEY
          -----END RSA PRIVATE KEY-----

    doppler:
      enable_tls_transport: true
      tls_server:
        cert: |
          -----BEGIN CERTIFICATE-----
          DOPPLER CERTIFICATE
          -----END CERTIFICATE-----
        key: |
          -----BEGIN RSA PRIVATE KEY-----
          DOPPLER KEY
          -----END RSA PRIVATE KEY-----
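Before deploying, it can help to confirm that every TLS property named in this section is present in the manifest. The Python sketch below does that for a manifest that has already been loaded into nested dictionaries; the helper names are hypothetical, and treating all seven properties as mandatory is an assumption made for illustration, since TLS support is still experimental.

```python
# Properties named in this section (assumed mandatory for the sketch).
TLS_PROPERTIES = [
    "loggregator.tls.ca",
    "metron_agent.preferred_protocol",
    "metron_agent.tls_client.cert",
    "metron_agent.tls_client.key",
    "doppler.enable_tls_transport",
    "doppler.tls_server.cert",
    "doppler.tls_server.key",
]


def lookup(properties, dotted_key):
    """Walk nested dicts along a dotted key; return None if any part is absent."""
    node = properties
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def missing_tls_properties(properties):
    """List the TLS properties that are absent or empty in the manifest."""
    return [key for key in TLS_PROPERTIES
            if lookup(properties, key) in (None, "")]


partial = {
    "metron_agent": {"preferred_protocol": "tls"},
    "doppler": {"enable_tls_transport": True},
}
for key in missing_tls_properties(partial):
    print("missing:", key)
```

The check only looks for presence; it does not validate that the certificates and keys actually match.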
Generating TLS Certificates
For generating TLS certificates, we recommend certstrap. An operator can follow these steps to generate the required certificates.
Most of these commands can be found in bin/generate-loggregator-certs.
    go get github.com/square/certstrap
    cd $GOPATH/src/github.com/square/certstrap
    ./build
    cd bin
Initialize a new certificate authority.
    $ ./certstrap init --common-name "loggregatorCA"
    Enter passphrase (empty for no passphrase): <hit enter for no password>
    Enter same passphrase again: <hit enter for no password>
    Created out/loggregatorCA.key
    Created out/loggregatorCA.crt
Create and sign a certificate for the Doppler server.
    $ ./certstrap request-cert --common-name "doppler"
    Enter passphrase (empty for no passphrase): <hit enter for no password>
    Enter same passphrase again: <hit enter for no password>
    Created out/doppler.key
    Created out/doppler.csr

    $ ./certstrap sign doppler --CA loggregatorCA
    Created out/doppler.crt from out/doppler.csr signed by out/loggregatorCA.key
Set the following manifest properties:
- properties.doppler.enable_tls_transport should be set to true.
- properties.doppler.tls_server.cert should be set to the certificate in out/doppler.crt.
- properties.doppler.tls_server.key should be set to the key in out/doppler.key.
- properties.loggregator.tls.ca should be set to the certificate of the certificate authority initialized above.
Create and sign a certificate for metron agents.
    $ ./certstrap request-cert --common-name "metron_agent"
    Enter passphrase (empty for no passphrase): <hit enter for no password>
    Enter same passphrase again: <hit enter for no password>
    Created out/metron_agent.key
    Created out/metron_agent.csr

    $ ./certstrap sign metron_agent --CA loggregatorCA
    Created out/metron_agent.crt from out/metron_agent.csr signed by out/loggregatorCA.key
Set the following manifest properties:
- properties.metron_agent.preferred_protocol should be set to tls.
- properties.metron_agent.buffer_size (truncating buffer) is set to 100 by default, but can be increased, e.g. to 100000.
- properties.metron_agent.tls_client.cert should be set to the certificate in out/metron_agent.crt.
- properties.metron_agent.tls_client.key should be set to the private key created together with that certificate.
Custom TLS Certificate Generation
If you already have a CA, or wish to use your own names for clients and servers, please note that the common-names "loggregatorCA" and "metron_agent" are placeholders and can be renamed.
The server certificate must have the common name
Deploying via BOSH
Below are example snippets for deploying the DEA Logging Agent (source), Doppler, and Loggregator Traffic Controller via BOSH.
    jobs:
    - name: dea_next
      templates:
      - name: dea_next
        release: cf
      - name: dea_logging_agent
        release: cf
      - name: metron_agent
        release: cf
      instances: 1
      resource_pool: dea
      networks:
      - name: cf1
        default:
        - dns
        - gateway
      properties:
        dea_next:
          zone: z1
        metron_agent:
          zone: z1
        networks:
          apps: cf1

    - name: doppler_z1 # Add "doppler_zX" jobs if you have runners in zX
      templates:
      - name: doppler
        release: cf
      - name: syslog_drain_binder
        release: cf
      - name: metron_agent
        release: cf
      instances: 1 # Scale out as necessary
      resource_pool: common
      networks:
      - name: cf1
      properties:
        doppler:
          zone: z1
        networks:
          apps: cf1

    - name: loggregator_trafficcontroller_z1
      templates:
      - name: loggregator_trafficcontroller
        release: cf
      - name: metron_agent
        release: cf
      instances: 1 # Scale out as necessary
      resource_pool: common
      networks:
      - name: cf1
      properties:
        traffic_controller:
          zone: z1 # Denoting which one of the redundancy zones this traffic controller is servicing
        metron_agent:
          zone: z1
        networks:
          apps: cf1

    properties:
      loggregator:
        servers:
          z1: # A list of loggregator servers for every redundancy zone
          - 10.10.16.14
        incoming_port: 3456
        outgoing_port: 8080

      loggregator_endpoint: # The end point sources will connect to
        shared_secret: loggregatorEndPointSharedSecret
        host: 10.10.16.16
        port: 3456
Configuring the Firehose
The firehose feature includes the combined stream of logs from all apps, plus metrics data from CF components, and is intended to be used by operators and administrators.
Access to the firehose requires a user with the
The "cf" UAA client needs permission to grant this custom scope to users.
The configuration of the uaa job in Cloud Foundry adds this scope by default. However, if your Cloud Foundry instance overrides the properties.uaa.clients.cf property in a stub, you need to add doppler.firehose to the scope list in the
Configuring at deployment time (via deployment manifest)
In your deployment manifest, add
    properties:
      …
      uaa:
        …
        clients:
          …
          cf:
            scope: …,doppler.firehose
          …
          doppler:
            override: true
            authorities: uaa.resource
            secret: YOUR-DOPPLER-SECRET
(The properties.uaa.clients.doppler.id key should be populated automatically.) These are also set by default in cf-properties.yml.
Adding scope to a running cluster (via
Before continuing, you should be familiar with the
- Ensure that doppler is a UAA client. If uaac client get doppler returns output like

      scope: uaa.none
      client_id: doppler
      resource_ids: none
      authorized_grant_types: authorization_code refresh_token
      authorities: uaa.resource

  then you're set.
- If it does not exist, run uaac client add doppler --scope uaa.none --authorized_grant_types authorization_code,refresh_token --authorities uaa.resource (and set its secret).
- If it exists but with incorrect properties, run uaac client update doppler --scope uaa.none --authorized_grant_types "authorization_code refresh_token" --authorities uaa.resource.
- Grant firehose access to the
- Check the scopes assigned to
uaac client get cf, e.g.
```
scope: cloud_controller.admin cloud_controller.read cloud_controller.write openid password.write scim.read scim.userids scim.write
client_id: cf
resource_ids: none
authorized_grant_types: implicit password refresh_token
access_token_validity: 600
refresh_token_validity: 2592000
authorities: uaa.none
autoapprove: true
```
- Copy the existing scope and add doppler.firehose, then update the client
```
uaac client update cf --scope "cloud_controller.admin cloud_controller.read cloud_controller.write openid password.write scim.read scim.userids scim.write doppler.firehose"
```
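Copying the scope list by hand is easy to get wrong, so the new value for --scope can also be built programmatically. The helper below is a hypothetical convenience, not part of uaac; it assumes the scopes are the space-separated string shown in the uaac client get cf output above.

```python
def with_scope(existing_scopes, new_scope):
    """Return a space-separated scope string that contains new_scope once."""
    scopes = existing_scopes.split()
    if new_scope not in scopes:
        scopes.append(new_scope)
    return " ".join(scopes)


current = "cloud_controller.admin cloud_controller.read openid"
print(with_scope(current, "doppler.firehose"))
# cloud_controller.admin cloud_controller.read openid doppler.firehose
```

Running the helper a second time on its own output leaves the string unchanged, so it is safe to reuse in scripts.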
Consuming log and metric data
The NOAA Client library, written in Golang, can be used by Go applications to consume app log data as well as the log + metrics firehose. If you wish to write your own client application using this library, please refer to the NOAA source and documentation.
Multiple subscribers may connect to the firehose endpoint, each with a unique subscription_id. Each subscriber (in practice, a pool of clients with a common subscription_id) receives the entire stream. For each subscription_id, all data will be distributed evenly among that subscriber's client pool.
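The subscription semantics can be modelled in a few lines of Python: every subscription_id sees every message, while clients sharing a subscription_id split the stream between them. The `Firehose` class below is a hypothetical in-memory model, and round-robin is only one way to satisfy "distributed evenly"; the actual distribution strategy is not specified here.

```python
from collections import defaultdict
from itertools import count


class Firehose:
    """Toy model of firehose fan-out by subscription_id."""

    def __init__(self):
        self._pools = defaultdict(list)  # subscription_id -> client inboxes
        self._turns = defaultdict(count)  # subscription_id -> round-robin counter

    def connect(self, subscription_id):
        """Register a client and return the list that receives its messages."""
        inbox = []
        self._pools[subscription_id].append(inbox)
        return inbox

    def publish(self, message):
        # Each subscription receives the message exactly once, delivered to
        # the next client in that subscription's pool.
        for subscription_id, clients in self._pools.items():
            turn = next(self._turns[subscription_id])
            clients[turn % len(clients)].append(message)


hose = Firehose()
a1 = hose.connect("team-a")
a2 = hose.connect("team-a")
b1 = hose.connect("team-b")
for n in range(4):
    hose.publish(n)
print(a1, a2, b1)  # [0, 2] [1, 3] [0, 1, 2, 3]
```

In this model, adding clients under the same subscription_id spreads the load without duplicating messages, whereas a new subscription_id gets its own full copy of the stream.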
The Cloud Foundry team uses GitHub and accepts contributions via pull request.
Follow these steps to make a contribution to any of our open source repositories:
Set your name and email
    git config --global user.name "Firstname Lastname"
    git config --global user.email "firstname.lastname@example.org"
Fork the repo (from the develop branch to get the latest changes)
Make your changes on a topic branch, commit, push to GitHub, and open a pull request against the
Once your commits are approved by Travis CI and reviewed by the core team, they will be merged.
Go version support
As of version ca517531f4ef646435365996c791c5031b75fc9d, all Loggregator components are deployed to Cloud Foundry with Go 1.4. As of that revision, support for earlier versions of the language is not guaranteed.
OS X prerequisites
Use brew and do
    brew install go --cross-compile-all
    brew install direnv
Make sure you add the proper entry to load direnv into your shell. See brew info direnv for details. To be safe, close the terminal window that you are using to make sure the changes to your shell are applied.
    git clone https://github.com/cloudfoundry/loggregator
    cd loggregator # When you cd into the loggregator dir for the first time direnv will prompt you to trust the config file
    git submodule update --init
Run bin/install-git-hooks before committing for the first time. The pre-commit hook that this installs will ensure that all dependencies are properly listed in the bosh/packages directory. (Of course, you should probably convince yourself that the hooks are safe before installing them.) Without this script, it is possible to commit a version of the repository that will not compile.
Additional go tools
Install go vet and go cover
    go get golang.org/x/tools/cmd/vet
    go get golang.org/x/tools/cmd/cover
go get github.com/vito/gosub
Running specific tests
    export GOPATH=`pwd` # in the root of the project
    go get github.com/onsi/ginkgo/ginkgo
    export PATH=$PATH:$GOPATH/bin
    cd src/loggregator # or any other component
    ginkgo -r
Doppler will dump information about the running goroutines to stdout if sent a
    goroutine 1 [running]:
    runtime/pprof.writeGoroutineStacks(0xc2000bc3f0, 0xc200000008, 0xc200000001, 0xca0000c2001fcfc0)
        /home/travis/.gvm/gos/go1.1.1/src/pkg/runtime/pprof/pprof.go:511 +0x7a
    runtime/pprof.writeGoroutine(0xc2000bc3f0, 0xc200000008, 0x2, 0xca74765c960d5c8f, 0x40bbf7, ...)
        /home/travis/.gvm/gos/go1.1.1/src/pkg/runtime/pprof/pprof.go:500 +0x3a
    ....
Editing Manifest Templates
Currently the Doppler/Metron manifest configuration lives here.
Editing this file will make changes in the manifest templates in cf-release.
When making changes to these templates, you should be working out of the loggregator submodule in cf-release.
After changing this configuration, you will need to run the tests in the root directory of cf-release with bundle exec rspec.
These tests will pull values from lamb-properties in order to populate the fixtures.
Necessary changes should be made in lamb-properties.
Loggregator as a separate release
There are cases when releases outside of Cloud Foundry would like to emit logs and metrics to the Loggregator system. In such cases we have instructions on using Loggregator as a separate release here.
Metrics generated by Loggregator
See this list.