Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON Checks for Heartbeat HTTP Monitors #8667

Merged
merged 3 commits into from Nov 9, 2018
Merged

Conversation

andrewvc
Copy link
Contributor

@andrewvc andrewvc commented Oct 19, 2018

Note: This is rebased on top of #8660 since that bug breaks the integration tests here

Add JSON body check support to Heartbeat

This commit adds a new json check for HTTP responses letting users define an arbitrary condition to match against parsed JSON to determine whether an endpoint is up or down.

The nice thing about structured checks like this is that it makes it easy for users to precisely piggy-back on top of existing JSON endpoints, or write their own where a given key/value could indicate the health of an external system. In a sense, it allows users to write a healthcheck endpoint.

An example can be seen below:

heartbeat.monitors:
- type: http
   urls: ["http://localhost:9200"]
   schedule: '@every 10s'
   check.response.json:
     - description: version number should be 6.4.0
       equals.version.number: "6.4.0" 

Fixes #7936

Status uint16 `config:"status" verify:"min=0, max=699"`
RecvHeaders map[string]string `config:"headers"`
RecvBody []match.Matcher `config:"body"`
RecvJson []*jsonResponseCheck `config:"json"`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

struct field RecvJson should be RecvJSON

@@ -115,3 +129,19 @@ func checkBody(body []match.Matcher) RespCheck {
return errBodyMismatch
}
}

func checkJson(description string, condition conditions.Condition) RespCheck {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func checkJson should be checkJSON

@andrewvc
Copy link
Contributor Author

Since this includes docs, CCing @dedemorton and @karenzone

@andrewvc andrewvc force-pushed the body-json branch 2 times, most recently from bf34e68 to d447a87 Compare October 19, 2018 21:48
@@ -125,3 +128,63 @@ func TestCheckBody(t *testing.T) {
})
}
}

func TestCheckJson(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are a bit redundant given the python integration tests. However, it's a nicer dev/test cycle when using these. I wrote these first.

I'm +1 to just leave them in, they could be useful in the future.

@andrewvc andrewvc changed the title Body json JSON Checks for Heartbeat HTTP Monitors Oct 19, 2018
@@ -459,7 +460,11 @@ contains JSON:
'X-API-Key': '12345-mykey-67890'
check.response:
status: 200
body: '{"status": "ok"}'
json:
- description: check status
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of the description? logging output?

As this is YAML I would more expect someone that wants to put details about it as a yaml comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's interpolated into the error field in the event when the check fails.

This is important because it means the user can associate a human readable error with the downtime message when they get, say, an SMS.

So, instead of getting "http check for elastic.co/foo failed" they can get one that says JSON body did not match condition '%s' for monitor. Received JSON %+v, where %s is the description.

When using multiple conditions this is even more important since it may not be apparent what caused the failure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. So the description is sent with every event or just errors (need to check the code).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin no, it's only sent on error. There are multiple conditions checked, and the first one that fails is the only one that creates an error.

Now that I think about it, that's probably a mistake. We should run all the conditions and list all of the condition failures.

I'll improve the PR by making the error message include a list of error messages. JSON body did not match 2 conditions: "version greater than 1", "name must be valid". Received JSON {...}

@ruflin
Copy link
Member

ruflin commented Oct 23, 2018

@andrewvc Will review this as soon as we get #8660 in and it's rebased on top.

This will probably need a backport to 6.x later on so we get it into 6.6?

@andrewvc
Copy link
Contributor Author

@ruflin thanks for taking a peek. Agree on the backport.

@@ -147,13 +146,30 @@ func (t *SimpleTransport) writeRequest(conn net.Conn, req *http.Request) error {
return err
}

// comboConnReadCloser wraps a ReadCloser that is backed by
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this need a rebase? I remember almost same code went into an other PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll rebase it soon.

}{
{
"positive match",
"{\"foo\": \"bar\"}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see some checks for something like {"a": {"b": "c"}}. How will the config look like if someone wnats to check for a.b=c?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather avoid adding tests for this sort of behavior. Testing the correctness of conditions seems more like the responsibility of the conditions package.

That said, I'm glad to alter this test to access a nested key instead of a root level one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree it's up to the condition package to work correctly. My worry is that perhaps it doesn't as we will have a problem on the key side + yaml.

I definitively want to know if it works or doesn't. If modifying the test is the best way, lets go for it. We should also adjust it in the system test as I think the actual issue we will hit when we read from Yaml not here in the golang tests.

This fixes an issue where the connection would be closed after only a partial body read. The RoundTripper would close the conn before returning the response which included a partially buffered body. This would actually work for short responses, since the backing bufio would do a partial read, but would fail on all but the shortest responses.

Normally connection lifecycle is handled outside the realm of the `RoundTripper`, but for our purposes we don't want to re-use connections. Since it is a requirement that all response bodies be closed, we can piggy-back on top of that to ensure the connection is closed.

Fixes elastic#8588
This commit adds a new `json` check for HTTP responses letting users define an arbitrary condition to match against parsed JSON to determine whether an endpoint is up or down.

The nice thing about structured checks like this is that it makes it easy for users to precisely piggy-back on top of existing JSON endpoints, or write their own where a given key/value could indicate the health of an external system. In a sense, it allows users to write a healthcheck endpoint.

An example can be seen below:

```yaml
heartbeat.monitors:
- type: http
  # List or urls to query
  urls: ["http://localhost:9200"]
  schedule: '@every 10s'
  check.response.json:
    - description: check version
      condition: equals.version.number: "6.4.0"
```
@andrewvc
Copy link
Contributor Author

@ruflin rebased and pushed with nested keys used in both py and go tests.


var errorDescs []string
for _, compiledCheck := range compiledChecks {
//compiledCheck.condition, err = conditions.NewEqualsCondition(map[string]interface{}{"foo.baz": "bar"})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover?

"check_response_json": [{
"description": "foo equals bar",
"condition": {
"equals": {"foo": {"baz": "bar"}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the YAML output look like here? I assume if you use "foo.baz": "bar" things will not work? What if the response actually has this exact key with a dot inside?

Sorry to keep poking on this one. Agree it's a potential condition issue if it breaks but it shows up here very prominent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some investigation and determined that dotted paths and nested maps are equivalent here due to the use of common.MapStr. This means that you can use either in your condition and it will work.

I think we should move further discussion of this to a separate issue since this has more to do with the beats condition evaluation engine than this specific PR.

@ruflin
Copy link
Member

ruflin commented Oct 29, 2018

Code LGTM except the leftover part (at least I think it's left over)

@andrewvc
Copy link
Contributor Author

@ruflin I removed the comment. What does "Code LGTM" mean here? Am I good to merge? Is the conditions thing a blocker?

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR is good to go. My reference was only for the leftover part.

Let's sync up offline quickly so I can explain in more detail the issue around the docs.

andrewvc added a commit to andrewvc/beats that referenced this pull request Nov 1, 2018
On busy CI servers, I suspect some flakiness, as in the case of elastic#8667 (review) is due to the server being temporarily too busy.

Let's try lengthening this time to see if that helps.
@andrewvc
Copy link
Contributor Author

andrewvc commented Nov 8, 2018

jenkins, retest this please

@andrewvc
Copy link
Contributor Author

andrewvc commented Nov 9, 2018

Build failures are unrelated.

@andrewvc andrewvc merged commit 22ba375 into elastic:master Nov 9, 2018
andrewvc added a commit to andrewvc/beats that referenced this pull request Nov 9, 2018
This commit adds a new `json` check for HTTP responses letting users define an arbitrary condition to match against parsed JSON to determine whether an endpoint is up or down.

The nice thing about structured checks like this is that it makes it easy for users to precisely piggy-back on top of existing JSON endpoints, or write their own where a given key/value could indicate the health of an external system. In a sense, it allows users to write a healthcheck endpoint.

An example can be seen below:

```yaml
heartbeat.monitors:
- type: http
  # List or urls to query
  urls: ["http://localhost:9200"]
  schedule: '@every 10s'
  check.response.json:
    - description: check version
      condition: equals.version.number: "6.4.0"
```


(cherry picked from commit 22ba375)
@andrewvc andrewvc added the v6.6.0 label Nov 9, 2018
andrewvc added a commit that referenced this pull request Nov 12, 2018
This commit adds a new `json` check for HTTP responses letting users define an arbitrary condition to match against parsed JSON to determine whether an endpoint is up or down.

The nice thing about structured checks like this is that it makes it easy for users to precisely piggy-back on top of existing JSON endpoints, or write their own where a given key/value could indicate the health of an external system. In a sense, it allows users to write a healthcheck endpoint.

An example can be seen below:

```yaml
heartbeat.monitors:
- type: http
  # List or urls to query
  urls: ["http://localhost:9200"]
  schedule: '@every 10s'
  check.response.json:
    - description: check version
      condition: equals.version.number: "6.4.0"
```


(cherry picked from commit 22ba375)
@andrewvc andrewvc added the Team:obs-ds-hosted-services Label for the Observability Hosted Services team label Nov 27, 2018
@praveenkumar466
Copy link

@here : how do we change the monitor.status to partial when there is failure in the validation body

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants