Update quota monitor blueprint to support project discovery #1924

Merged
merged 14 commits into from
Dec 12, 2023
13 changes: 6 additions & 7 deletions blueprints/cloud-operations/quota-monitoring/README.md
@@ -38,9 +38,10 @@ The region, location of the bundle used to deploy the function, and scheduling f

The `quota_config` variable mirrors the arguments accepted by the Python program, and allows configuring several different aspects of its behaviour:

- `quota_config.discovery_root` organization or folder used to discover all underlying projects to track quotas for, in `organizations/nnnnn` or `folders/nnnnn` format
- `quota_config.exclude` do not generate metrics for quotas matching prefixes listed here
- `quota_config.include` only generate metrics for quotas matching prefixes listed here
- `quota_config.projects` projects to track quotas for, defaults to the project where metrics are stored
- `quota_config.projects` projects to track quotas for, defaults to the project where metrics are stored; if projects are automatically discovered, those in this list are appended to the discovered ones
- `quota_config.regions` regions to track quotas for, defaults to the `global` region for project-level quotas
- `dry_run` do not write actual metrics
- `verbose` increase logging verbosity
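The prefix matching described for `include` and `exclude` can be sketched roughly as follows. This is an illustration only: `quota_selected` is a hypothetical helper, not a function from the blueprint, and it assumes metric names are compared in lowercase.

```python
def quota_selected(metric, include=(), exclude=()):
  """Return True if a quota metric should be emitted (hypothetical helper)."""
  # Assumption: prefixes are matched against the lowercased metric name.
  name = metric.lower()
  # With a non-empty include list, only quotas matching one of its prefixes pass.
  if include and not any(name.startswith(p) for p in include):
    return False
  # A quota matching any exclude prefix is dropped.
  return not any(name.startswith(p) for p in exclude)


exclude = ['a2', 'c2', 'committed', 'nvidia', 'preemptible']
print(quota_selected('CPUS', exclude=exclude))
print(quota_selected('NVIDIA_T4_GPUS', exclude=exclude))
print(quota_selected('CPUS', include=['disks']))
```

In this sketch an empty `include` list means "no restriction", which matches the optional nature of the variable; the real program may order or combine the two checks differently.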
@@ -54,7 +55,6 @@ Clone this repository or [open it in cloud shell](https://ssh.cloud.google.com/c
- `terraform init`
- `terraform apply -var project_id=my-project-id`
<!-- BEGIN TFDOC -->

## Variables

| name | description | type | required | default |
@@ -64,10 +64,9 @@ Clone this repository or [open it in cloud shell](https://ssh.cloud.google.com/c
| [bundle_path](variables.tf#L33) | Path used to write the intermediate Cloud Function code bundle. | <code>string</code> | | <code>&#34;.&#47;bundle.zip&#34;</code> |
| [name](variables.tf#L39) | Arbitrary string used to name created resources. | <code>string</code> | | <code>&#34;quota-monitor&#34;</code> |
| [project_create_config](variables.tf#L45) | Create project instead of using an existing one. | <code title="object&#40;&#123;&#10; billing_account &#61; string&#10; parent &#61; optional&#40;string&#41;&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code>null</code> |
| [quota_config](variables.tf#L59) | Cloud function configuration. | <code title="object&#40;&#123;&#10; exclude &#61; optional&#40;list&#40;string&#41;, &#91;&#10; &#34;a2&#34;, &#34;c2&#34;, &#34;c2d&#34;, &#34;committed&#34;, &#34;g2&#34;, &#34;interconnect&#34;, &#34;m1&#34;, &#34;m2&#34;, &#34;m3&#34;,&#10; &#34;nvidia&#34;, &#34;preemptible&#34;&#10; &#93;&#41;&#10; include &#61; optional&#40;list&#40;string&#41;&#41;&#10; projects &#61; optional&#40;list&#40;string&#41;&#41;&#10; regions &#61; optional&#40;list&#40;string&#41;&#41;&#10; dry_run &#61; optional&#40;bool, false&#41;&#10; verbose &#61; optional&#40;bool, false&#41;&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code>&#123;&#125;</code> |
| [region](variables.tf#L76) | Compute region used in the example. | <code>string</code> | | <code>&#34;europe-west1&#34;</code> |
| [schedule_config](variables.tf#L82) | Schedule timer configuration in crontab format. | <code>string</code> | | <code>&#34;0 &#42; &#42; &#42; &#42;&#34;</code> |

| [quota_config](variables.tf#L59) | Cloud function configuration. | <code title="object&#40;&#123;&#10; exclude &#61; optional&#40;list&#40;string&#41;, &#91;&#10; &#34;a2&#34;, &#34;c2&#34;, &#34;c2d&#34;, &#34;committed&#34;, &#34;g2&#34;, &#34;interconnect&#34;, &#34;m1&#34;, &#34;m2&#34;, &#34;m3&#34;,&#10; &#34;nvidia&#34;, &#34;preemptible&#34;&#10; &#93;&#41;&#10; discovery_root &#61; optional&#40;string, &#34;&#34;&#41;&#10; dry_run &#61; optional&#40;bool, false&#41;&#10; include &#61; optional&#40;list&#40;string&#41;&#41;&#10; projects &#61; optional&#40;list&#40;string&#41;&#41;&#10; regions &#61; optional&#40;list&#40;string&#41;&#41;&#10; verbose &#61; optional&#40;bool, false&#41;&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code>&#123;&#125;</code> |
| [region](variables.tf#L85) | Compute region used in the example. | <code>string</code> | | <code>&#34;europe-west1&#34;</code> |
| [schedule_config](variables.tf#L91) | Schedule timer configuration in crontab format. | <code>string</code> | | <code>&#34;0 &#42; &#42; &#42; &#42;&#34;</code> |
<!-- END TFDOC -->
## Test

@@ -80,5 +79,5 @@ module "test" {
billing_account = "12345-ABCDE-12345"
}
}
# tftest modules=4 resources=14
# tftest modules=4 resources=19
```
58 changes: 56 additions & 2 deletions blueprints/cloud-operations/quota-monitoring/main.tf
@@ -20,6 +20,8 @@ locals {
? [var.project_id]
: var.quota_config.projects
)
discovery_root_type = split("/", coalesce(var.quota_config["discovery_root"], "/"))[0]
discovery_root_id = split("/", coalesce(var.quota_config["discovery_root"], "/"))[1]
}
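The two new locals split the discovery root into a type and an id, falling back to `"/"` when the value is empty (Terraform's `coalesce` skips both null and empty strings). A rough Python equivalent, with `split_discovery_root` as a hypothetical name used only for illustration:

```python
def split_discovery_root(discovery_root):
  """Mimic the two locals: return (root_type, root_id)."""
  # Like coalesce(..., "/"): None or "" falls back to "/",
  # so both parts come out as empty strings.
  parts = (discovery_root or '/').split('/')
  return parts[0], parts[1]


print(split_discovery_root('organizations/1234567890'))
print(split_discovery_root(''))
```

With an empty root the type is `''`, so neither the `organizations` nor the `folders` IAM resources below get a non-zero `count`.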

module "project" {
@@ -29,8 +31,11 @@ module "project" {
parent = try(var.project_create_config.parent, null)
project_create = var.project_create_config != null
services = [
"compute.googleapis.com",
"cloudfunctions.googleapis.com"
"cloudasset.googleapis.com",
"cloudbuild.googleapis.com",
"cloudfunctions.googleapis.com",
"cloudscheduler.googleapis.com",
"compute.googleapis.com"
]
}

@@ -81,6 +86,55 @@ resource "google_cloud_scheduler_job" "default" {
}
}

resource "google_organization_iam_member" "org_asset_viewer" {
count = local.discovery_root_type == "organizations" ? 1 : 0
org_id = local.discovery_root_id
role = "roles/cloudasset.viewer"
member = module.cf.service_account_iam_email
}


# role with the least privilege including compute.projects.get permission
resource "google_organization_iam_member" "org_network_viewer" {
count = local.discovery_root_type == "organizations" ? 1 : 0
org_id = local.discovery_root_id
role = "roles/compute.networkViewer"
member = module.cf.service_account_iam_email
}

resource "google_organization_iam_member" "org_quota_viewer" {
count = local.discovery_root_type == "organizations" ? 1 : 0
org_id = local.discovery_root_id
role = "roles/servicemanagement.quotaViewer"
member = module.cf.service_account_iam_email
}

resource "google_folder_iam_member" "folder_asset_viewer" {
count = local.discovery_root_type == "folders" ? 1 : 0
folder = local.discovery_root_id
role = "roles/cloudasset.viewer"
member = module.cf.service_account_iam_email
}

# role with the least privilege including compute.projects.get permission
resource "google_folder_iam_member" "folder_network_viewer" {
count = local.discovery_root_type == "folders" ? 1 : 0
folder = local.discovery_root_id
role = "roles/compute.networkViewer"
member = module.cf.service_account_iam_email
}

resource "google_folder_iam_member" "folder_quota_viewer" {
count = local.discovery_root_type == "folders" ? 1 : 0
folder = local.discovery_root_id
role = "roles/servicemanagement.quotaViewer"
member = module.cf.service_account_iam_email
}





resource "google_project_iam_member" "metric_writer" {
project = module.project.project_id
role = "roles/monitoring.metricWriter"
50 changes: 40 additions & 10 deletions blueprints/cloud-operations/quota-monitoring/src/main.py
@@ -39,6 +39,9 @@
URL_PROJECT = 'https://compute.googleapis.com/compute/v1/projects/{}'
URL_REGION = 'https://compute.googleapis.com/compute/v1/projects/{}/regions/{}'
URL_TS = 'https://monitoring.googleapis.com/v3/projects/{}/timeSeries'
URL_DISCOVERY = ('https://cloudasset.googleapis.com/v1/{}/assets?'
'assetTypes=cloudresourcemanager.googleapis.com%2FProject&'
'contentType=RESOURCE&pageSize=100&pageToken={}')
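As a quick illustration of how this template is filled in, the snippet below formats it for a first page (empty token) and a follow-up page. The folder id and the token value are made up for the example.

```python
URL_DISCOVERY = ('https://cloudasset.googleapis.com/v1/{}/assets?'
                 'assetTypes=cloudresourcemanager.googleapis.com%2FProject&'
                 'contentType=RESOURCE&pageSize=100&pageToken={}')

# First request: the page token is empty.
first_page = URL_DISCOVERY.format('folders/1234567890', '')
# Later requests: the token returned by the previous page (placeholder here).
next_page = URL_DISCOVERY.format('folders/1234567890', 'TOKEN')

print(first_page)
print(next_page)
```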

_Quota = collections.namedtuple('_Quota',
'project region tstamp metric limit usage')
@@ -80,8 +83,8 @@ def _api_format(self, name, value):
else:
d['valueType'] = 'INT64'
d['points'][0]['value'] = {'int64Value': value}
# remove this label if cardinality gets too high
d['metric']['labels']['quota'] = f'{self.usage}/{self.limit}'
# re-enable the following line if cardinality is not a problem
# d['metric']['labels']['quota'] = f'{self.usage}/{self.limit}'
return d

@property
@@ -92,7 +95,7 @@ def timeseries(self):
ratio = 0
yield self._api_format('ratio', ratio)
yield self._api_format('usage', self.usage)
# yield self._api_format('limit', self.limit)
yield self._api_format('limit', self.limit)


def batched(iterable, n):
@@ -112,6 +115,23 @@ def configure_logging(verbose=True):
warnings.filterwarnings('ignore', r'.*end user credentials.*', UserWarning)


def discover_projects(discovery_root):
  'Discovers projects under a folder or organization.'
  if discovery_root.partition('/')[0] not in ('folders', 'organizations'):
    raise SystemExit(f'Invalid discovery root {discovery_root}.')
  next_page_token = ''
  while True:
    list_assets_results = fetch(
        HTTPRequest(URL_DISCOVERY.format(discovery_root, next_page_token)))
    if 'assets' in list_assets_results:
      for asset in list_assets_results['assets']:
        if (asset['resource']['data']['lifecycleState'] == 'ACTIVE'):
          yield asset['resource']['data']['projectId']
    next_page_token = list_assets_results.get('nextPageToken')
    if not next_page_token:
      break
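The pagination loop in `discover_projects` can be exercised without any API access by swapping the HTTP call for a stub. The sketch below is not the blueprint's code: `discover`, `fake_fetch`, `_asset` and `FAKE_PAGES` are hypothetical names, and the fake pages only imitate the response fields the function reads.

```python
def discover(discovery_root, fetch_page):
  """Yield ids of ACTIVE projects; fetch_page(root, token) is a stand-in for the API call."""
  token = ''
  while True:
    page = fetch_page(discovery_root, token)
    for asset in page.get('assets', []):
      data = asset['resource']['data']
      # Projects pending deletion are skipped, as in the original loop.
      if data['lifecycleState'] == 'ACTIVE':
        yield data['projectId']
    # A missing or empty token means there are no more pages.
    token = page.get('nextPageToken')
    if not token:
      break


def _asset(project_id, state='ACTIVE'):
  """Build a fake asset record with only the fields read above."""
  return {'resource': {'data': {'projectId': project_id,
                                'lifecycleState': state}}}


# Two fake pages keyed by the token that requests them.
FAKE_PAGES = {
    '': {'assets': [_asset('prj-a'), _asset('prj-b', 'DELETE_REQUESTED')],
         'nextPageToken': 'p2'},
    'p2': {'assets': [_asset('prj-c')]},
}


def fake_fetch(root, token):
  return FAKE_PAGES[token]


print(list(discover('folders/1234', fake_fetch)))  # ['prj-a', 'prj-c']
```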


def fetch(request, delete=False):
'Minimal HTTP client interface for API calls.'
logging.debug(f'fetch {"POST" if request.data else "GET"} {request.url}')
@@ -163,9 +183,13 @@ def get_quotas(project, region='global'):

@click.command()
@click.argument('project-id', required=True)
@click.option(
'--discovery-root', '-dr', required=False, help=
'Root node used to dynamically fetch projects, in organizations/nnn or folders/nnn format.'
)
@click.option(
'--project-ids', multiple=True, help=
'Project ids to monitor (multiple). Defaults to monitoring project if not set.'
'Project ids to monitor (multiple). Defaults to monitoring project if not set. Values are appended to those found under discovery-root.'
)
@click.option('--regions', multiple=True,
help='Regions (multiple). Defaults to "global" if not set.')
@@ -175,11 +199,13 @@ def get_quotas(project, region='global'):
help='Exclude quotas starting with keyword (multiple).')
@click.option('--dry-run', is_flag=True, help='Do not write metrics.')
@click.option('--verbose', is_flag=True, help='Verbose output.')
def main_cli(project_id=None, project_ids=None, regions=None, include=None,
exclude=None, dry_run=False, verbose=False):
def main_cli(project_id=None, discovery_root=None, project_ids=None,
regions=None, include=None, exclude=None, dry_run=False,
verbose=False):
'Fetches GCE quotas and writes them as custom metrics to Stackdriver.'
try:
_main(project_id, project_ids, regions, include, exclude, dry_run, verbose)
_main(project_id, discovery_root, project_ids, regions, include, exclude,
dry_run, verbose)
except RuntimeError as e:
logging.exception(f'exception raised: {e.args[0]}')

@@ -193,14 +219,18 @@ def main(event, context):
raise


def _main(monitoring_project, projects=None, regions=None, include=None,
exclude=None, dry_run=False, verbose=False):
def _main(monitoring_project, discovery_root=None, projects=None, regions=None,
include=None, exclude=None, dry_run=False, verbose=False):
"""Module entry point used by cli and cloud function wrappers."""
configure_logging(verbose=verbose)
projects = projects or [monitoring_project]

# default to monitoring scope project if projects parameter is not passed, then merge the list with discovered projects, if any
regions = regions or ['global']
include = set(include or [])
exclude = set(exclude or [])
projects = projects or [monitoring_project]
if (discovery_root):
projects = set(list(projects) + list(discover_projects(discovery_root)))
for k in ('monitoring_project', 'projects', 'regions', 'include', 'exclude'):
logging.debug(f'{k} {locals().get(k)}')
timeseries = []
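The project list handling in `_main` can be summarised with a small standalone sketch. `merge_projects` is a hypothetical helper written for illustration; it takes the already discovered ids as an argument instead of calling the Cloud Asset API.

```python
def merge_projects(monitoring_project, projects=None, discovered=None):
  """Combine explicit and discovered project ids the way _main does."""
  # No explicit list: fall back to the monitoring project.
  projects = projects or [monitoring_project]
  # With discovery enabled, merge both sources; the set drops duplicates.
  if discovered is not None:
    projects = set(list(projects) + list(discovered))
  return projects


print(merge_projects('mon-prj'))
print(sorted(merge_projects('mon-prj', None, ['prj-a', 'prj-b'])))
print(sorted(merge_projects('mon-prj', ['prj-a'], ['prj-a', 'prj-c'])))
```

One consequence visible in the sketch: when no explicit projects are passed, the monitoring project is still tracked alongside the discovered ones, and the result becomes an unordered set once discovery is used.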
19 changes: 14 additions & 5 deletions blueprints/cloud-operations/quota-monitoring/variables.tf
@@ -63,14 +63,23 @@ variable "quota_config" {
"a2", "c2", "c2d", "committed", "g2", "interconnect", "m1", "m2", "m3",
"nvidia", "preemptible"
])
include = optional(list(string))
projects = optional(list(string))
regions = optional(list(string))
dry_run = optional(bool, false)
verbose = optional(bool, false)
discovery_root = optional(string, "")
dry_run = optional(bool, false)
include = optional(list(string))
projects = optional(list(string))
regions = optional(list(string))
verbose = optional(bool, false)
})
nullable = false
default = {}
validation {
condition = (
var.quota_config.discovery_root == "" ||
startswith(var.quota_config.discovery_root, "folders/") ||
startswith(var.quota_config.discovery_root, "organizations/")
)
error_message = "Non-empty discovery root needs to start with folders/ or organizations/."
}
}

variable "region" {