This repository has been archived by the owner on Nov 5, 2021. It is now read-only.

Add a cloudwatch surfacer #583

Closed
wants to merge 26 commits into from

Contributor

@robpickerill robpickerill commented Mar 31, 2021

This PR adds a CloudWatch surfacer to Cloudprober.

It is deployed on AWS ECS and working with the following config:

probe {
  name: "grafana_homepage"
  type: HTTP
  targets {
    host_names: "redacted"
  }
  http_probe: {
    protocol: HTTPS
  }
  interval_msec: 30000
  timeout_msec: 3000

  validator {
    name: "status_code_2xx"
    http_validator {
        success_status_codes: "200-399"
    }
  }

  latency_unit: "ms"
  latency_distribution {
    explicit_buckets: "10,20,40,80,160,320"
  }
}

probe {
  name: "grafana_homepage_https_redirect"
  type: HTTP
  targets {
    host_names: "redacted"
  }
  http_probe: {
    protocol: HTTP
  }
  interval_msec: 30000
  timeout_msec: 3000

  validator {
    name: "status_code_2xx"
    http_validator {
        success_status_codes: "200-399"
    }
  }
}

probe {
  name: "grafana_status"
  type: HTTP
  targets {
    host_names: "redacted"
  }
  http_probe: {
    protocol: HTTPS
    relative_url: "/api/health"
  }
  interval_msec: 30000
  timeout_msec: 3000

  validator {
    name: "status_code_2xx"
    http_validator {
        success_status_codes: "200-399"
    }
  }
}

probe {
  name: "grafana_metrics_dashboard"
  type: HTTP
  targets {
    host_names: "redacted"
  }
  http_probe: {
    protocol: HTTPS
    relative_url: "/d/Iwkba85Mk/grafana-metrics"
  }
  interval_msec: 30000
  timeout_msec: 3000

  validator {
    name: "status_code_2xx"
    http_validator {
        success_status_codes: "200-399"
    }
  }
}

probe {
  name: "grafana_login"
  type: HTTP
  targets {
    host_names: "redacted"
  }
  http_probe: {
    protocol: HTTPS
    relative_url: "/login"
  }
  interval_msec: 30000
  timeout_msec: 3000

  validator {
    name: "status_code_2xx"
    http_validator {
        success_status_codes: "200-399"
    }
  }
}

surfacer {
  type: CLOUDWATCH

  cloudwatch_surfacer {
    namespace: "/cloudprober/grafana"
  }
}

The following IAM policy is attached to the task role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Condition": {
                "StringEqualsIgnoreCase": {
                    "cloudwatch:namespace": "/cloudprober/grafana"
                }
            },
            "Action": [
                "cloudwatch:PutMetricData"
            ],
            "Resource": [
                "*"
            ],
            "Effect": "Allow",
            "Sid": "PutMetrics"
        }
    ]
}

@google-cla

google-cla bot commented Mar 31, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@robpickerill robpickerill changed the title from "WIP Add initial implementation of a cw surfacer" to "WIP Add initial implementation of a cloudwatch surfacer" on Mar 31, 2021
@robpickerill
Contributor Author

@googlebot I signed it!

@google-cla

google-cla bot commented Mar 31, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@robpickerill
Contributor Author

@googlebot I fixed it.

Rebased the wrong email out of the commits.

@google-cla

google-cla bot commented Mar 31, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@robpickerill
Contributor Author

@googlebot I fixed it.


Should be fixed properly now.

@robpickerill robpickerill changed the title from "WIP Add initial implementation of a cloudwatch surfacer" to "Add a cloudwatch surfacer" on Apr 4, 2021
@manugarg
Contributor

manugarg commented Apr 5, 2021

@robpickerill Thanks a lot for this contribution. I'll try to review it soon -- I am out of office half of this week, so it may get pushed to next week.

Contributor

@manugarg manugarg left a comment

Hello @robpickerill,

Thanks once again for this. Sorry for the delay in reviewing. I've added some comments. Please take a look.

surfacers/cloudwatch/proto/config.proto (outdated, resolved)
surfacers/cloudwatch/proto/config.proto (outdated, resolved)
surfacers/cloudwatch/cloudwatch.go (resolved)
surfacers/cloudwatch/cloudwatch.go (outdated, resolved)
surfacers/cloudwatch/cloudwatch.go (outdated, resolved)
surfacers/cloudwatch/cloudwatch.go (outdated, resolved)
surfacers/cloudwatch/cloudwatch.go (outdated, resolved)
surfacers/cloudwatch/cloudwatch.go (outdated, resolved)
surfacers/cloudwatch/cloudwatch.go (outdated, resolved)
surfacers/cloudwatch/cloudwatch.go (outdated, resolved)
@robpickerill
Contributor Author

Hey Manu

> Thanks once again for this. Sorry for the delay in reviewing. I've added some comments. Please take a look.

No problems at all. I appreciate the feedback here. Let me work through the comments, and let you know when they are complete.

Thanks again.

@robpickerill
Contributor Author

Manu, I've been resolving the comments so I can see which comments are still outstanding - if I've missed anything please feel free to unresolve.

Contributor

@manugarg manugarg left a comment

Thank you for addressing the comments. I've added a few more comments, but you don't need to fix them. I am working on importing/exporting this change following the process described at:
https://github.com/google/cloudprober/blob/master/CONTRIBUTING.md#source-of-truth-sot-and-commit-process

I'll make minor fixes (mentioned in the comments) while doing that.

Comment on lines +177 to +186

	switch lu {
	case time.Second:
		metricDatum.Value = aws.Float64(value * 1000)
	case time.Microsecond:
		metricDatum.Value = aws.Float64(value / 1000)
	case time.Nanosecond:
		metricDatum.Value = aws.Float64((value / 1000) / 1000)
	}
}
Contributor

I think you can do this instead:

	metricDatum.Value = aws.Float64(value * float64(latencyUnit) / float64(time.Millisecond))

surfacers/cloudwatch/proto/config.proto (outdated, resolved)

// publish the metrics to cloudwatch, using the namespace provided from configuration
func (cw *CWSurfacer) publishMetrics(md *cloudwatch.MetricDatum) {
	if len(cw.cwMetricDatumCache) >= 20 {
Contributor

You probably want to use cloudwatchMaxMetricDatums here instead of 20.

Contributor Author

sorry, that's an obvious miss by me - but thanks for catching

Contributor

No worries. :)
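
For readers following the thread, the batching pattern under discussion can be sketched roughly as below. This is an illustration under assumptions rather than the surfacer's code: `batcher`, `add` and `maxDatumsPerCall` are hypothetical names, the value 20 is simply the literal from the diff, and strings stand in for `cloudwatch.MetricDatum` values.

```go
package main

import "fmt"

// maxDatumsPerCall stands in for the cloudwatchMaxMetricDatums constant
// mentioned in the review; the value 20 is copied from the diff.
const maxDatumsPerCall = 20

// batcher buffers items and hands them to flush once the limit is reached.
type batcher struct {
	buf   []string
	flush func(batch []string)
}

// add appends an item and flushes the buffer when it is full.
func (b *batcher) add(item string) {
	b.buf = append(b.buf, item)
	if len(b.buf) >= maxDatumsPerCall {
		b.flush(b.buf)
		b.buf = nil // start a fresh buffer; the flushed slice is left untouched
	}
}

func main() {
	var sizes []int
	b := &batcher{flush: func(batch []string) { sizes = append(sizes, len(batch)) }}
	for i := 0; i < 45; i++ {
		b.add(fmt.Sprintf("metric-%d", i))
	}
	fmt.Println(sizes, len(b.buf)) // [20 20] 5
}
```

Naming the limit once means the flush threshold and any documentation of the limit cannot disagree, which is the point of the review comment.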

	return failure, nil
}

func (s *CWSurfacer) ignoreMetric(name string) bool {
Contributor

Receiver "s" needs to be consistent.

Contributor Author

Again, sorry, that's an obvious miss by me, but thanks again for catching it. It was a copy/paste, as I was trying to stay consistent with the stackdriver implementation. Apologies.

Contributor

No worries.

manugarg pushed a commit that referenced this pull request Apr 22, 2021
This is an import of the original PR at #583.

PiperOrigin-RevId: 369784229
@manugarg
Contributor

manugarg commented Apr 22, 2021

@robpickerill Importing this PR through #587.

Documentation changes are not pushed through the import-export method. They can be committed directly. Can you please send another PR, containing just the documentation changes? You can close this PR afterwards.

@robpickerill
Contributor Author

Thanks for all your assistance here Manu, very much appreciated.

> Can you please send another PR, containing just the documentation changes?

Let me wrap this up today, and then we can close this PR. I'll also file issues to track the regex items we discussed; after that this PR is good to close, and I'll work on datadog.

Thanks again!

@robpickerill
Contributor Author

closing for #587

@manugarg manugarg added this to the v0.11.3 milestone Sep 1, 2021