

Welcome to the Sensu Go sandbox!

This tutorial will get you up and running with Sensu.

Report issues or share feedback by opening an issue in this repo.


Set up the sandbox

1. Install Vagrant and VirtualBox.
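Before moving on, it can help to confirm that both tools are actually on your PATH. The helper below is only a sketch: `have_tool` and `check_prereqs` are hypothetical names invented for this example, and it assumes VirtualBox installed its `VBoxManage` command-line tool.

```shell
# Return success if the named command is available on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Report every missing prerequisite; return non-zero if any is missing.
check_prereqs() {
  missing=0
  for tool in "$@"; do
    if ! have_tool "$tool"; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (run on the host, not inside the sandbox):
#   check_prereqs vagrant VBoxManage && echo "ready to run vagrant up"
```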

2. Download the sandbox:

Download from GitHub, then unzip the archive and enter the sensu-go sandbox directory:

unzip sandbox-master.zip && cd sandbox-master/sensu-go/core/

Or clone the repository:

git clone git@github.com:sensu/sandbox.git && cd sandbox/sensu-go/core

3. Start Vagrant:

ENABLE_SENSU_SANDBOX_PORT_FORWARDING=1 vagrant up

This will take around five minutes.

NOTE: This configures VirtualBox to forward two TCP ports (3002 and 4002) from the sandbox VM to localhost, which makes it easier to interact with the sandbox dashboards. The dashboard links provided below reference http://localhost and assume that port forwarding from the VM to the host is active.
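Once the sandbox is up, one way to confirm that port forwarding is active is to probe both ports from the host with curl. This is a minimal sketch: `forwarded_url` and `check_forwarded_ports` are hypothetical helper names, and it assumes curl is installed on the host.

```shell
# Build the localhost URL for a forwarded port.
forwarded_url() {
  printf 'http://localhost:%s\n' "$1"
}

# Probe each forwarded port and print the HTTP status code that came back.
# curl exits non-zero when nothing answers, so "|| true" keeps the loop going.
check_forwarded_ports() {
  for port in "$@"; do
    url=$(forwarded_url "$port")
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url") || true
    echo "$url -> HTTP $code"
  done
}

# Usage (on the host, after vagrant up has finished):
#   check_forwarded_ports 3002 4002
```

A code of 000 generally means curl could not connect at all, which suggests the port is not being forwarded or the service behind it is not running yet.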

4. SSH into the sandbox:

Thanks for waiting! To start using the sandbox:

vagrant ssh

You should now have shell access to the sandbox and should be greeted with this sandbox prompt:

[sensu_go_sandbox]$

NOTE: To exit the sandbox, press CTRL+D. Use vagrant destroy followed by vagrant up to erase and rebuild the sandbox. Use vagrant provision to reset the sandbox's Sensu configuration to the beginning of this lesson.

NOTE: To save you a little time, sandbox provisioning pre-configures sensuctl to use the Sensu Go admin user with the default password, so you won't have to configure sensuctl each time you spin up the sandbox to try out a new feature. Before installing sensuctl outside of the sandbox, please read the first-time setup reference to learn how to configure sensuctl.


Lesson #1: Create a Sensu monitoring event

First off, we'll make sure everything is working correctly by creating a keepalive event with the Sensu agent.

1. Get list of entities

Sensu keeps track of monitored components as entities. Let's use the sensuctl command line tool to make sure Sensu hasn't connected to any entities yet:

sensuctl entity list

We should see no entities in the list.

2. Get list of events

Let's check to make sure that no monitoring events have been created with Sensu:

sensuctl event list

We should see no events listed.

3. Start the Sensu Agent

Let's go ahead and start the Sensu agent to start monitoring the sandbox:

sudo systemctl start sensu-agent

We can use sensuctl to see that the sandbox agent is now being monitored by Sensu:

sensuctl entity list

Sensu agents send keepalive events to help you monitor their status. We can use sensuctl to see the keepalive events generated by the sandbox entity:

sensuctl event list

The sensu-go-sandbox keepalive event has status 0, meaning the agent is successfully able to communicate with the server. If we wait a minute and check the event list again, we will see that the Last Seen timestamp for the keepalive check has updated.
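The numeric status follows the usual Sensu convention: 0 is OK, 1 is warning, 2 is critical, and anything else is treated as unknown. As a small illustration, a hypothetical `status_name` helper (not part of sensuctl) could translate the number into a label:

```shell
# Translate a Sensu event status code into a human-readable label.
status_name() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "WARNING" ;;
    2) echo "CRITICAL" ;;
    *) echo "UNKNOWN" ;;
  esac
}

# Usage:
#   status_name 0
```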

We can also see the event and the entity in the dashboard event view and entities view. When you log in to the Sensu dashboard for the first time, you will need to log in as user: admin with password: P@ssw0rd!.

Lesson #2: Pipe keepalive events into Slack

Now that we know the sandbox is working properly, let's get to the fun stuff: creating a pipeline. In this lesson, we'll create a pipeline to send keepalive alerts to Slack. (If you'd rather not create a Slack account, you can skip ahead to lesson 3.)

We'll use the Sensu Slack handler to create our pipeline. For convenience, the handler command has been installed as part of sandbox provisioning.

1. Get your Slack webhook URL

If you're already an admin of a Slack workspace, visit https://YOUR WORKSPACE NAME HERE.slack.com/services/new/incoming-webhook and follow the steps to add the Incoming WebHooks integration, choose a channel, and save the settings. (If you're not yet a Slack admin, start here to create a new workspace.) After saving, you'll see your webhook URL under Integration Settings.

2. Test the Slack handler manually

Run the following commands to set your Slack channel and Slack webhook URL in the sandbox. Make sure to change the channel string and webhook URL string to match your particular Slack configuration.

KEEPALIVE_SLACK_CHANNEL="#sensu-sandbox"
KEEPALIVE_SLACK_WEBHOOK="https://hooks.slack.com/services/AAA/BBB/CCC"
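A typo in either value is the most likely reason for a missing Slack message, so a quick sanity check can save some time. The sketch below only checks the shape of the two strings; `looks_like_webhook` and `looks_like_channel` are hypothetical helpers, and the assumption that a channel starts with `#` comes from the example value above.

```shell
# Succeed if the value has the shape of a Slack incoming-webhook URL.
looks_like_webhook() {
  case "$1" in
    https://hooks.slack.com/services/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed if the value starts with "#" followed by at least one character.
looks_like_channel() {
  case "$1" in
    "#"?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   looks_like_channel "${KEEPALIVE_SLACK_CHANNEL}" || echo "check the channel name"
#   looks_like_webhook "${KEEPALIVE_SLACK_WEBHOOK}" || echo "check the webhook URL"
```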

Now let's manually generate a Slack alert using the sandbox keepalive events:

sensuctl event info sensu-go-sandbox keepalive --format json | /usr/local/bin/sensu-slack-handler -c "${KEEPALIVE_SLACK_CHANNEL}" -w "${KEEPALIVE_SLACK_WEBHOOK}"

If you have the correct channel and webhook URL configured, you should now see a new message in Slack indicating that sensu-go-sandbox is in an OK state.

Now let's stop the agent service and wait a couple of minutes for the keepalive check to enter the warning state (status = 1).

sudo systemctl stop sensu-agent

Now is a good time to grab a cup of coffee, or browse the Sensu documentation for a couple of minutes. Let's check to make sure the sandbox keepalive is now in a failed state.

sensuctl event list

The keepalive event should report status = 1 after the agent has been stopped for a couple of minutes. Once the sandbox entity is in a failed state, we can manually run the Slack handler again.
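Instead of re-running the event list by hand, we could poll until the status changes. The following is a rough sketch rather than part of the sandbox: `wait_until` is a hypothetical helper, and the usage example assumes that the event JSON exposes the status at `.check.status` and that jq is available in the sandbox.

```shell
# wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
# Re-run COMMAND every INTERVAL seconds until it succeeds.
# Returns 1 once TIMEOUT seconds of waiting have accumulated.
wait_until() {
  timeout=$1
  interval=$2
  shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Usage (assumes the status lives at .check.status in the event JSON):
#   keepalive_is_warning() {
#     sensuctl event info sensu-go-sandbox keepalive --format json \
#       | jq -e '.check.status == 1' >/dev/null
#   }
#   wait_until 300 10 keepalive_is_warning && echo "keepalive is now in warning"
```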

sensuctl event info sensu-go-sandbox keepalive --format json | /usr/local/bin/sensu-slack-handler -c "${KEEPALIVE_SLACK_CHANNEL}" -w "${KEEPALIVE_SLACK_WEBHOOK}"

The resulting Slack message now indicates a warning (status = 1). Now that we've confirmed that the Slack handler is working correctly, let's build a Sensu keepalive pipeline.

3. Edit sensu-slack-handler.json

Open the Slack handler definition provided with the sandbox:

nano sensu-slack-handler.json

And edit it to include the same Slack channel and webhook URL you added in the previous step.

4. Create the handler definition using sensuctl

sensuctl create -f sensu-slack-handler.json

We can confirm that we now have a Sensu Slack handler with sensuctl:

sensuctl handler list

5. Test Slack handler pipeline

Restart the Sensu agent to resume producing keepalive events.

sudo systemctl restart sensu-agent

Once the agent begins to send keepalive events, you should see a new message in Slack indicating that the sandbox entity is in an ok state.

6. Filter keepalive events

Now that we're generating Slack alerts automatically, let's reduce the potential for alert fatigue by ensuring that Sensu sends only warning, critical, and resolution alerts to Slack.

To accomplish this, we'll interactively add the built-in is_incident filter to the keepalive handler pipeline so we'll only receive alerts when the sandbox entity fails to send a keepalive event.

sensuctl handler update keepalive

When prompted for the filters selection, enter is_incident to apply the built-in incidents filter.

? Filters: [? for help] is_incident

We can confirm that the Slack handler now includes the incidents filter using sensuctl:

sensuctl handler info keepalive

Now with the filter in place we should no longer be receiving messages in the Slack channel every time the sandbox entity sends a keepalive event.
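To make the filter's effect concrete, here is a simplified model of the decision it makes. This is our own sketch of the behavior described above (pass warning, critical, and resolution events; drop steady OK events), not Sensu's implementation, and `passes_incident_filter` is a hypothetical name.

```shell
# passes_incident_filter CURRENT_STATUS PREVIOUS_STATUS
# Succeeds when the event should reach the handler.
passes_incident_filter() {
  current=$1
  previous=$2
  # Any non-zero status (warning, critical, unknown) is an incident.
  if [ "$current" -ne 0 ]; then
    return 0
  fi
  # Status 0 right after a non-zero status is a resolution.
  if [ "$previous" -ne 0 ]; then
    return 0
  fi
  # Status 0 following status 0 is a routine keepalive: drop it.
  return 1
}

# Usage:
#   passes_incident_filter 1 0 && echo "alert sent to Slack"
```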

Let's stop the agent and confirm that we receive the expected warning message.

sudo systemctl stop sensu-agent

After a couple of minutes, we should see the warning message informing us that the sandbox entity is no longer sending keepalive events.

Lesson #3: Automate event production with the Sensu agent

So far we've used only the Sensu agent's built-in keepalive feature, but in this lesson, we'll create a check that automatically produces workload-related events. Instead of sending alerts to Slack, we'll store event data with InfluxDB and visualize it with Grafana.

1. Make sure Sensu agent is running

sudo systemctl restart sensu-agent

2. Install Nginx and the Sensu HTTP Plugin

We'll use the Sensu HTTP Plugin to monitor an Nginx server running on the sandbox.

First, install and start Nginx:

sudo yum install -y nginx && sudo systemctl start nginx

And make sure it's working with:

curl -I http://localhost:80

Then install the Sensu HTTP Plugin:

sudo sensu-install -p sensu-plugins-http

We'll be using the metrics-curl.rb plugin. We can test its output using:

/opt/sensu-plugins-ruby/embedded/bin/metrics-curl.rb -u "http://localhost"
$ /opt/sensu-plugins-ruby/embedded/bin/metrics-curl.rb -u "http://localhost"
...
sensu-go-sandbox.curl_timings.http_code 200 1535670975

3. Create a check to monitor Nginx

Use a configuration file to create a service check that runs metrics-curl.rb every 10 seconds on all entities with the entity:sensu-go-sandbox subscription and sends its output to the InfluxDB metrics handler pipeline:

nano curl_timings-check.json

Notice how we are defining a metrics handler and metric format. In Sensu Go metrics are a core element of the data model, and we can build pipelines to handle metrics separately from the alerts! This allows us to customize our monitoring workflows to get better visibility and reduce alert fatigue.
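For orientation, a check definition of this kind might look roughly like the output of the function below. Treat it as an assumption-laden sketch: the file shipped with the sandbox may differ, `print_curl_timings_check` is a hypothetical helper, and the `core/v2` API version, the `default` namespace, and the `graphite_plaintext` format value are our assumptions. The command, interval, subscription, and `influx-db` handler name are taken from this tutorial.

```shell
# Print an example check definition (a sketch, not the sandbox's actual file).
# Assumed values: api_version, namespace, publish, output_metric_format.
print_curl_timings_check() {
  cat <<'EOF'
{
  "type": "CheckConfig",
  "api_version": "core/v2",
  "metadata": {
    "name": "curl_timings",
    "namespace": "default"
  },
  "spec": {
    "command": "/opt/sensu-plugins-ruby/embedded/bin/metrics-curl.rb -u \"http://localhost\"",
    "interval": 10,
    "publish": true,
    "subscriptions": ["entity:sensu-go-sandbox"],
    "output_metric_format": "graphite_plaintext",
    "output_metric_handlers": ["influx-db"]
  }
}
EOF
}

# Usage: compare the sketch with the file provided in the sandbox.
#   print_curl_timings_check | diff - curl_timings-check.json
```

The two `output_metric_*` attributes are the "metrics handler and metric format" mentioned above: one names the format the agent should parse, the other names the handler that receives the extracted metrics.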

Use sensuctl to create a check to monitor Nginx:

sensuctl create -f curl_timings-check.json

We can use sensuctl to confirm that the check has been created:

sensuctl check list

After about 10 seconds, we can see the event produced by the entity:

sensuctl event info sensu-go-sandbox curl_timings --format json |jq .
...
  "metrics": {
    "handlers": [
      "influx-db"
    ],
    "points": [
      {
        "name": "sensu-go-sandbox.curl_timings.time_total",
        "value": 0.005,
        "timestamp": 1543532948,
        "tags": []
      },
      {
        "name": "sensu-go-sandbox.curl_timings.time_namelookup",
        "value": 0.005,
        "timestamp": 1543532948,
        "tags": []
      },
      {
        "name": "sensu-go-sandbox.curl_timings.time_connect",
        "value": 0.005,
        "timestamp": 1543532948,
        "tags": []
      },
      {
        "name": "sensu-go-sandbox.curl_timings.time_pretransfer",
        "value": 0.005,
        "timestamp": 1543532948,
        "tags": []
      },
      {
        "name": "sensu-go-sandbox.curl_timings.time_redirect",
        "value": 0,
        "timestamp": 1543532948,
        "tags": []
      },
      {
        "name": "sensu-go-sandbox.curl_timings.time_starttransfer",
        "value": 0.005,
        "timestamp": 1543532948,
        "tags": []
      },
      {
        "name": "sensu-go-sandbox.curl_timings.http_code",
        "value": 200,
        "timestamp": 1543532948,
        "tags": []
      }
    ]
  }

Because we configured a metric format, the Sensu agent was able to convert the Graphite-formatted metrics provided by the check command into a set of Sensu-formatted metrics. Metric support isn't limited to Graphite: the Sensu agent can extract metrics in multiple line protocol formats, including Nagios performance data. Now let's create the InfluxDB handler to store these metrics and visualize them with Grafana.
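The conversion is easy to picture: each Graphite plaintext line carries a metric name, a value, and a Unix timestamp separated by spaces, and each line becomes one point. The sketch below mimics that split in plain shell; it illustrates the line format only and is not how the agent does it. `graphite_points` is a hypothetical helper.

```shell
# Read Graphite plaintext lines ("name value timestamp") on stdin and
# print the three fields of each line with labels.
graphite_points() {
  while read -r name value timestamp; do
    # Skip blank lines.
    [ -n "$name" ] || continue
    printf 'name=%s value=%s timestamp=%s\n' "$name" "$value" "$timestamp"
  done
}

# Usage:
#   /opt/sensu-plugins-ruby/embedded/bin/metrics-curl.rb -u "http://localhost" | graphite_points
```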

4. Create an InfluxDB pipeline

Since we've already installed InfluxDB as part of the sandbox, all we need to do to create an InfluxDB pipeline is create a Sensu handler. As part of sandbox provisioning, a version of the sensu-influxdb-handler has been installed for convenience.

Take a look at the provided configuration file:

nano influx-handler.json

To create the handler from the config, just use the sensuctl create command:

sensuctl create -f influx-handler.json

We can use sensuctl to confirm that the handler has been created successfully:

sensuctl handler list

5. See the HTTP response code events for Nginx in Grafana.

Log in to Grafana as username: admin password: admin. We should see a graph of real HTTP response codes for Nginx.

Now if we turn Nginx off, we should see the impact in Grafana:

sudo systemctl stop nginx

Start Nginx:

sudo systemctl start nginx

6. Automate disk usage monitoring for the sandbox

Now that we have an entity set up, we can easily add more checks. For example, let's say we want to monitor disk usage on the sandbox.

First, install the plugin:

sudo sensu-install -p sensu-plugins-disk-checks

And test it:

/opt/sensu-plugins-ruby/embedded/bin/metrics-disk-usage.rb
$ /opt/sensu-plugins-ruby/embedded/bin/metrics-disk-usage.rb
sensu-core-sandbox.disk_usage.root.used 2235 1534191189
sensu-core-sandbox.disk_usage.root.avail 39714 1534191189
...

Then create the check using sensuctl, assigning it to the entity:sensu-go-sandbox subscription and the InfluxDB pipeline:

sensuctl create -f disk_usage-check.json

And we should see it working in the dashboard entity view (http://localhost:3002/#/entities) and via sensuctl:

sensuctl event list

Now we should be able to see disk usage metrics for the sandbox in Grafana.

You made it! You're ready for the next level of Sensu-ing. Here are some resources to help continue your journey: