Add Cloudera integration #13244

Merged
merged 73 commits into from
Dec 23, 2022
Changes from 71 commits
5659a84
Cloudera template
yzhan289 Nov 1, 2022
69747e9
Outline plan for integration
yzhan289 Nov 2, 2022
98d608a
Set up init metric and test env
yzhan289 Nov 14, 2022
7527b93
Add common.py
yzhan289 Nov 14, 2022
2146e0f
Outline check and include tests
yzhan289 Nov 15, 2022
a246429
Clean up check
yzhan289 Nov 17, 2022
734aa55
Update cloudera/pyproject.toml
yzhan289 Nov 17, 2022
d4868af
Drop support for Python 2
yzhan289 Nov 21, 2022
0a17f6c
Clean up timeseries metrics
yzhan289 Nov 21, 2022
e25bcd5
Add more metrics
yzhan289 Nov 21, 2022
43efa9a
Add more metrics
yzhan289 Nov 21, 2022
fd00fe8
Implement host service check and fix style
yzhan289 Nov 22, 2022
1bba4b5
Change how cloudera client is created
yzhan289 Nov 23, 2022
87c0051
Add caddy to mock API output
yzhan289 Nov 23, 2022
929a0ad
Add caddy test API outputs
yzhan289 Nov 29, 2022
efa96da
Apply Jose's metric and health check collection style (#13423)
yzhan289 Nov 29, 2022
ed09437
Apply testing changes (#13424)
yzhan289 Nov 30, 2022
dbe5a36
Apply changes from Jose's branch
yzhan289 Nov 30, 2022
1c4c4b6
Add Jose's new host metrics
yzhan289 Nov 30, 2022
b6a54bf
Fix validations
yzhan289 Nov 30, 2022
53a023b
Add license
yzhan289 Nov 30, 2022
47aaa9f
Fix properties
yzhan289 Nov 30, 2022
ad6efb3
Change to use caddy docker image
yzhan289 Nov 30, 2022
4f50e48
Start events implementation
yzhan289 Nov 30, 2022
bf2b032
Unit tests working
jose-manuel-almaza Dec 1, 2022
f581b3e
Map to 80 local port not recommended
jose-manuel-almaza Dec 1, 2022
27f784c
Added url to debug log
jose-manuel-almaza Dec 2, 2022
8048632
Remove events implementation
yzhan289 Dec 1, 2022
d060eeb
Fixed integration tests
jose-manuel-almaza Dec 2, 2022
cdff200
Fix license validation
yzhan289 Dec 2, 2022
8de24cc
Remove unused constants
yzhan289 Dec 2, 2022
4432ccb
Add e2e test
yzhan289 Dec 2, 2022
4ed276d
Use master's change
yzhan289 Dec 2, 2022
bbfe023
Fix e2e and integration test style
yzhan289 Dec 2, 2022
65e44c1
Add temporary gitlab change
yzhan289 Dec 2, 2022
f1de60f
Temporarily add default dep
yzhan289 Dec 2, 2022
01499aa
Wait for Cloudera is up
yzhan289 Dec 2, 2022
21a909c
Refactor Cloudera (#13460)
yzhan289 Dec 7, 2022
8c0c2cf
Update service checks and update metadata.csv
yzhan289 Dec 8, 2022
ae6d76e
Fixed missing coverage
jose-manuel-almaza Dec 9, 2022
e8a922d
Added 'io' to VALID_UNIT_NAMES
jose-manuel-almaza Dec 9, 2022
1f245f0
Added some tag checks to tests
jose-manuel-almaza Dec 9, 2022
214084f
Fix service checks and add comma
yzhan289 Dec 9, 2022
a6d8f82
Add native metrics (#13474)
yzhan289 Dec 9, 2022
dafb968
Clean up
yzhan289 Dec 9, 2022
ec57bd8
Add version collection
yzhan289 Dec 9, 2022
6e5350b
Add back service check for can_connect
yzhan289 Dec 9, 2022
de7ce85
Update README and add pictures
yzhan289 Dec 9, 2022
72e555f
Move images directory
yzhan289 Dec 9, 2022
c10c869
Add custom tag support
yzhan289 Dec 9, 2022
abd1082
Fix style
yzhan289 Dec 9, 2022
67243eb
Add custom tag tests
yzhan289 Dec 12, 2022
f8e4c7e
Fix test for e2e test
yzhan289 Dec 12, 2022
8b6f2eb
Run cloudera manager requests in parallel (#13499)
jose-manuel-almaza Dec 13, 2022
47ab749
Fix config param
yzhan289 Dec 13, 2022
46c81c1
Remove io as metric type
yzhan289 Dec 15, 2022
3afab94
Apply suggestions from code review
yzhan289 Dec 19, 2022
f668c7b
Apply suggestions
yzhan289 Dec 19, 2022
30dfbbf
Apply suggestion
yzhan289 Dec 19, 2022
d7b1047
Fix style
yzhan289 Dec 19, 2022
360aec7
Update cloudera/README.md
yzhan289 Dec 19, 2022
e52c3d6
Update cloudera/README.md
yzhan289 Dec 19, 2022
ca60582
Apply suggestions
yzhan289 Dec 20, 2022
0fe44d2
Fix tests
yzhan289 Dec 20, 2022
639a5ca
Fix style
yzhan289 Dec 20, 2022
f0c3556
Fix formatting
yzhan289 Dec 20, 2022
3ddfa9f
Apply suggestion and move credentials to init_config
yzhan289 Dec 21, 2022
4a2f761
Update Cloudera README for collecting metrics
yzhan289 Dec 21, 2022
e410120
Merge branch 'master' into az/cloudera
yzhan289 Dec 22, 2022
e0b5059
Fix README
yzhan289 Dec 22, 2022
ddee7d8
Fix README
yzhan289 Dec 22, 2022
8bc721e
Merge branch 'master' into az/cloudera
yzhan289 Dec 22, 2022
70750f0
Apply suggestions from code review
yzhan289 Dec 23, 2022
3 changes: 3 additions & 0 deletions .azure-pipelines/templates/test-all-checks.yml
@@ -120,6 +120,9 @@ jobs:
- checkName: cloud_foundry_api
displayName: Cloud Foundry API
os: linux
- checkName: cloudera
displayName: Cloudera
os: linux
- checkName: cockroachdb
displayName: CockroachDB
os: linux
9 changes: 9 additions & 0 deletions .codecov.yml
@@ -114,6 +114,10 @@ coverage:
target: 75
flags:
- cloud_foundry_api
Cloudera:
target: 75
flags:
- cloudera
CockroachDB:
target: 75
flags:
@@ -736,6 +740,11 @@ flags:
paths:
- cloud_foundry_api/datadog_checks/cloud_foundry_api
- cloud_foundry_api/tests
cloudera:
carryforward: true
paths:
- cloudera/datadog_checks/cloudera
- cloudera/tests
cockroachdb:
carryforward: true
paths:
2 changes: 2 additions & 0 deletions .github/workflows/config/labeler.yml
@@ -94,6 +94,8 @@ integration/clickhouse:
- clickhouse/**/*
integration/cloud_foundry_api:
- cloud_foundry_api/**/*
integration/cloudera:
- cloudera/**/*
integration/cockroachdb:
- cockroachdb/**/*
integration/confluent_platform:
1 change: 1 addition & 0 deletions LICENSE-3rdparty.csv
@@ -19,6 +19,7 @@ cachetools,PyPI,MIT,Thomas Kemmer
check-postgres,"https://github.com/bucardo/",BSD-2-Clause,Greg Sabino Mullane
clickhouse-cityhash,PyPI,MIT,Alexander [Amper] Marshalov
clickhouse-driver,PyPI,MIT,Konstantin Lebedev
cm-client,PyPI,Apache-2.0,
contextlib2,PyPI,PSF,Nick Coghlan
cryptography,PyPI,Apache-2.0,The Python Cryptographic Authority and individual contributors | The cryptography developers
cryptography,PyPI,BSD-3-Clause,The Python Cryptographic Authority and individual contributors | The cryptography developers
2 changes: 2 additions & 0 deletions cloudera/CHANGELOG.md
@@ -0,0 +1,2 @@
# CHANGELOG - Cloudera

150 changes: 150 additions & 0 deletions cloudera/README.md
@@ -0,0 +1,150 @@
# Agent Check: Cloudera

## Overview

This integration monitors your [Cloudera Data Platform][1] through the Datadog Agent, allowing you to submit metrics and service checks on the health of your Cloudera Data Hub clusters, hosts, and roles.

## Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the [Autodiscovery Integration Templates][3] for guidance on applying these instructions.

### Installation

The Cloudera check is included in the [Datadog Agent][2] package.
No additional installation is needed on your server.

### Configuration

#### Prepare Cloudera Manager
1. In Cloudera Data Platform, navigate to the Management Console and click on the **User Management** tab.
![User Management][10]

2. Click on **Actions**, then **Create Machine User** to create the machine user that queries the Cloudera Manager through the Datadog Agent.
![Create Machine User][11]

3. If the workload password hasn't been set, click on **Set Workload Password** after the user is created.
![Set Workload Password][12]

<!-- xxx tabs xxx -->
<!-- xxx tab "Host" xxx -->

#### Host
1. Edit the `cloudera.d/conf.yaml` file, in the `conf.d/` folder at the root of your Agent's configuration directory to start collecting your Cloudera cluster and host data. See the [sample cloudera.d/conf.yaml][4] for all available configuration options. Note that the `api_url` should contain the API version at the end.

```yaml
init_config:

    ## @param workload_username - string - required
    ## The Workload username. This value can be found in the `User Management` tab of the Management
    ## Console in the `Workload User Name`.
    #
    workload_username: <WORKLOAD_USERNAME>

    ## @param workload_password - string - required
    ## The Workload password. This value can be found in the `User Management` tab of the Management
    ## Console in the `Workload Password`.
    #
    workload_password: <WORKLOAD_PASSWORD>

## Every instance is scheduled independently of the others.
#
instances:

    ## @param api_url - string - required
    ## The URL endpoint for the Cloudera Manager API. This can be found under the Endpoints tab for
    ## your Data Hub to monitor.
    ##
    ## Note: The version of the Cloudera Manager API needs to be appended at the end of the URL.
    ## For example, using v48 of the API for Data Hub `cluster_1` should result in a URL similar
    ## to the following:
    ## `https://cluster1.cloudera.site/cluster_1/cdp-proxy-api/cm-api/v48`
    #
  - api_url: <API_URL>
```

2. [Restart the Agent][5] to start collecting and sending Cloudera Data Hub cluster data to Datadog.

<!-- xxz tab xxx -->
<!-- xxx tab "Containerized" xxx -->

#### Containerized

For containerized environments, see the [Autodiscovery Integration Templates][3] for guidance on applying the parameters below.

| Parameter | Value |
| -------------------- | ---------------------------------------------------------------------------------------------------------------- |
| `<INTEGRATION_NAME>` | `cloudera` |
| `<INIT_CONFIG>` | `{"workload_username": "<WORKLOAD_USERNAME>", "workload_password": "<WORKLOAD_PASSWORD>"}` |
| `<INSTANCE_CONFIG>` | `{"api_url": "<API_URL>"}` |

<!-- xxz tab xxx -->
<!-- xxz tabs xxx -->

### Validation

[Run the Agent's status subcommand][6] and look for `cloudera` under the Checks section.

## Data Collected

### Metrics

See [metadata.csv][7] for a list of metrics provided by this integration.

### Events

The Cloudera integration does not include any events.

### Service Checks

See [service_checks.json][8] for a list of service checks provided by this integration.

## Troubleshooting

### Collecting metrics of Datadog integrations on Cloudera hosts
To install the Datadog Agent on a Cloudera host, make sure that the security group associated with the host allows SSH access.
Then, you need to use the [root user `cloudbreak`][13] when accessing the host with the SSH key generated during the environment creation:

```
sudo ssh -i "/path/to/key.pem" cloudbreak@<HOST_IP_ADDRESS>
```

The workload username and password can be used to access Cloudera hosts via SSH, although only the `cloudbreak` user can install the Datadog Agent.
Trying to use any user that is not `cloudbreak` may result in the following error:
```
<NON_CLOUDBREAK_USER> is not allowed to run sudo on <CLOUDERA_HOSTNAME>. This incident will be reported.
```

### Config errors when collecting Datadog metrics
If you see something similar to the following in the Agent status when collecting metrics from your Cloudera host:

```
Config Errors
==============
zk
--
open /etc/datadog-agent/conf.d/zk.d/conf.yaml: permission denied
```

you need to change the ownership of the `conf.yaml` to `dd-agent`:

```
[cloudbreak@<CLOUDERA_HOSTNAME> ~]$ sudo chown -R dd-agent:dd-agent /etc/datadog-agent/conf.d/zk.d/conf.yaml
```


Need help? Contact [Datadog support][9].


[1]: https://www.cloudera.com/products/cloudera-data-platform.html
[2]: https://app.datadoghq.com/account/settings#agent
[3]: https://docs.datadoghq.com/agent/kubernetes/integrations/
[4]: https://github.com/DataDog/integrations-core/blob/master/cloudera/datadog_checks/cloudera/data/conf.yaml.example
[5]: https://docs.datadoghq.com/agent/guide/agent-commands/#start-stop-and-restart-the-agent
[6]: https://docs.datadoghq.com/agent/guide/agent-commands/#agent-status-and-information
[7]: https://github.com/DataDog/integrations-core/blob/master/cloudera/metadata.csv
[8]: https://github.com/DataDog/integrations-core/blob/master/cloudera/assets/service_checks.json
[9]: https://docs.datadoghq.com/help/
[10]: https://raw.githubusercontent.com/DataDog/integrations-core/master/cloudera/images/user_management.png
[11]: https://raw.githubusercontent.com/DataDog/integrations-core/master/cloudera/images/create_machine_user.png
[12]: https://raw.githubusercontent.com/DataDog/integrations-core/master/cloudera/images/set_workload_password.png
[13]: https://docs.cloudera.com/data-hub/cloud/access-clusters/topics/mc-accessing-cluster-via-ssh.html
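
The README above stresses that `api_url` must end with the Cloudera Manager API version. A minimal sketch of how that could be sanity-checked before editing `conf.yaml`, assuming the version segment always has the form `v<number>` as in the `v48` example. `has_api_version_suffix` is a hypothetical helper for illustration and is not part of this integration:

```python
import re
from urllib.parse import urlparse

# Assumption: the version is the last path component and looks like "v48".
_VERSION_SEGMENT = re.compile(r"^v\d+$")


def has_api_version_suffix(api_url: str) -> bool:
    """Return True if the last path component of api_url looks like an API version."""
    path = urlparse(api_url).path.rstrip("/")
    last_segment = path.rsplit("/", 1)[-1]
    return bool(_VERSION_SEGMENT.match(last_segment))


# Example URL taken from the README; the second one omits the version segment.
print(has_api_version_suffix("https://cluster1.cloudera.site/cluster_1/cdp-proxy-api/cm-api/v48"))
print(has_api_version_suffix("https://cluster1.cloudera.site/cluster_1/cdp-proxy-api/cm-api"))
```

A check like this only catches a missing suffix. Whether the version is one the server accepts can only be confirmed by calling the API.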
44 changes: 44 additions & 0 deletions cloudera/assets/configuration/spec.yaml
@@ -0,0 +1,44 @@
name: Cloudera
files:
- name: cloudera.yaml
  options:
  - template: init_config
    options:
    - name: workload_username
      description: |
        The Workload username. This value can be found in the `User Management` tab of the Management
        Console in the `Workload User Name`.
      required: true
      value:
        type: string
    - name: workload_password
      description: |
        The Workload password. This value can be found in the `User Management` tab of the Management
        Console in the `Workload Password`.
      required: true
      value:
        type: string
    - template: init_config/default
  - template: instances
    options:
    - name: api_url
      required: true
      description: |
        The URL endpoint for the Cloudera Manager API. This can be found under the Endpoints tab for
        your Data Hub to monitor.

        Note: The version of the Cloudera Manager API needs to be appended at the end of the URL.
        For example, using v48 of the API for Data Hub `cluster_1` should result in a URL similar
        to the following:
        `https://cluster1.cloudera.site/cluster_1/cdp-proxy-api/cm-api/v48`

      value:
        type: string
    - name: max_parallel_requests
      description: |
        The maximum number of requests to Cloudera Manager that are allowed in parallel.
      hidden: true
      value:
        type: integer
        example: 100
    - template: instances/default
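
The hidden `max_parallel_requests` option caps how many requests run against Cloudera Manager at once. The general idea of bounding concurrency can be sketched with a standard-library thread pool. This is an illustration under assumptions, not the check's implementation (in this PR the value is passed as `maxsize` to `RESTClientObject`). `fetch_all` and the lambda fetchers are hypothetical stand-ins:

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_REQUESTS = 100  # mirrors the example value in the spec


def fetch_all(fetchers, max_parallel_requests=MAX_PARALLEL_REQUESTS):
    """Run zero-argument callables concurrently, at most max_parallel_requests at a time."""
    with ThreadPoolExecutor(max_workers=max_parallel_requests) as executor:
        futures = [executor.submit(fetch) for fetch in fetchers]
        # Results are returned in submission order; an exception raised by
        # any fetcher is re-raised here by future.result().
        return [future.result() for future in futures]


# Hypothetical stand-ins for per-cluster requests to Cloudera Manager.
fetchers = [lambda name=name: f"metrics for {name}" for name in ("cluster_1", "cluster_2")]
results = fetch_all(fetchers, max_parallel_requests=2)
print(results)
```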
50 changes: 50 additions & 0 deletions cloudera/assets/service_checks.json
@@ -0,0 +1,50 @@
[
{
"agent_version": "7.43.0",
"integration": "Cloudera",
"check": "cloudera.can_connect",
"statuses": [
"ok",
"critical"
],
"groups": [
"api_url"
],
"name": "Cloudera Manager Can Connect",
"description": "Returns `OK` if the check is able to connect to the Cloudera Manager API and collect metrics, `CRITICAL` otherwise."
},
{
"agent_version": "7.43.0",
"integration": "Cloudera",
"check": "cloudera.cluster.health",
"statuses": [
"ok",
"critical",
"warning",
"unknown"
],
"groups": [
"cloudera_cluster"
],
"name": "Cloudera Cluster Health",
"description": "Returns `OK` if the cluster is in good health or is starting, `WARNING` if the cluster is stopping or the health is concerning, `CRITICAL` if the cluster is down or in bad health, and `UNKNOWN` otherwise."
},
{
"agent_version": "7.43.0",
"integration": "Cloudera",
"check": "cloudera.host.health",
"statuses": [
"ok",
"critical",
"warning",
"unknown"
],
"groups": [
"cloudera_cluster",
"cloudera_rack_id",
"cloudera_hostname"
],
"name": "Cloudera Host Health",
"description": "Returns `OK` if the host is in good health or is starting, `WARNING` if the host is stopping or the health is concerning, `CRITICAL` if the host is down or in bad health, and `UNKNOWN` otherwise."
}
]
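
The health service check descriptions above amount to a lookup from a Cloudera status to a service check status. A minimal sketch of that mapping follows. The status strings (`GOOD_HEALTH`, `STARTING`, and so on) are assumptions chosen to match the wording of the descriptions and should be verified against the Cloudera Manager API version in use; `to_service_check_status` is a hypothetical name:

```python
from enum import IntEnum


class Status(IntEnum):
    # Numeric values follow the Datadog service check convention.
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


# Assumption: illustrative entity-status strings, not confirmed by this PR.
_STATUS_MAP = {
    "GOOD_HEALTH": Status.OK,
    "STARTING": Status.OK,
    "STOPPING": Status.WARNING,
    "CONCERNING_HEALTH": Status.WARNING,
    "DOWN": Status.CRITICAL,
    "BAD_HEALTH": Status.CRITICAL,
}


def to_service_check_status(entity_status: str) -> Status:
    """Map a Cloudera entity status to a service check status, UNKNOWN by default."""
    return _STATUS_MAP.get(entity_status, Status.UNKNOWN)


print(to_service_check_status("STARTING").name)
print(to_service_check_status("SOMETHING_ELSE").name)
```

Defaulting to `UNKNOWN` keeps the mapping total, which matches the "`UNKNOWN` otherwise" clause in both descriptions.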
4 changes: 4 additions & 0 deletions cloudera/datadog_checks/__init__.py
@@ -0,0 +1,4 @@
# (C) Datadog, Inc. 2022-present
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)
__path__ = __import__('pkgutil').extend_path(__path__, __name__) # type: ignore
4 changes: 4 additions & 0 deletions cloudera/datadog_checks/cloudera/__about__.py
@@ -0,0 +1,4 @@
# (C) Datadog, Inc. 2022-present
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)
__version__ = '0.0.1'
7 changes: 7 additions & 0 deletions cloudera/datadog_checks/cloudera/__init__.py
@@ -0,0 +1,7 @@
# (C) Datadog, Inc. 2022-present
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)
from .__about__ import __version__
from .check import ClouderaCheck

__all__ = ['__version__', 'ClouderaCheck']
13 changes: 13 additions & 0 deletions cloudera/datadog_checks/cloudera/api_client.py
@@ -0,0 +1,13 @@
from abc import ABC, abstractmethod


class ApiClient(ABC):
    def __init__(self, check, api_client):
        self._check = check
        self._log = check.log
        self._api_client = api_client

    @abstractmethod
    def collect_data(self):
        """Collect metrics and service checks via the Cloudera API Client"""
        pass
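
To show how the abstract base class above is meant to be extended, here is a small self-contained sketch. `FakeApiClient`, the metric name, and the `SimpleNamespace` stand-in for the Agent check are all hypothetical; the real subclass in this PR is `ApiClientV7`, whose body is not shown in this section:

```python
import logging
from abc import ABC, abstractmethod
from types import SimpleNamespace


class ApiClient(ABC):  # same shape as the class in the diff above
    def __init__(self, check, api_client):
        self._check = check
        self._log = check.log
        self._api_client = api_client

    @abstractmethod
    def collect_data(self):
        """Collect metrics and service checks via the Cloudera API Client"""


class FakeApiClient(ApiClient):  # hypothetical subclass for illustration
    def collect_data(self):
        self._log.debug("collecting from %s", self._api_client)
        return {"cloudera.example.metric": 1}


# Stand-in for the real check object; only the `log` attribute is needed here.
check = SimpleNamespace(log=logging.getLogger("cloudera_sketch"))
client = FakeApiClient(check, api_client="fake-transport")
print(client.collect_data())

try:
    ApiClient(check, api_client=None)  # abstract classes cannot be instantiated
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Keeping `collect_data` abstract lets the factory return a version-specific client while the check only depends on this one method.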
40 changes: 40 additions & 0 deletions cloudera/datadog_checks/cloudera/api_client_factory.py
@@ -0,0 +1,40 @@
import cm_client
from cm_client.rest import RESTClientObject
from packaging.version import parse

from datadog_checks.base import ConfigurationError
from datadog_checks.cloudera.api_client_v7 import ApiClientV7


def make_api_client(check, config, shared_config):
    cm_client.configuration.username = shared_config.workload_username
    cm_client.configuration.password = shared_config.workload_password
    api_client = cm_client.ApiClient(config.api_url)
    api_client.rest_client = RESTClientObject(maxsize=(config.max_parallel_requests))
    check.log.debug('Getting version from cloudera API URL: %s', config.api_url)
    cloudera_manager_resource_api = cm_client.ClouderaManagerResourceApi(api_client)
    try:
        get_version_response = cloudera_manager_resource_api.get_version()
    except Exception:
        # Adjacent string literals avoid embedding the continuation line's indentation in the message.
        check.log.warning(
            "Unable to get the version of Cloudera Manager, please check that the URL is valid and API version "
            "is appended at the end"
        )
        raise
    check.log.debug('get_version_response: %s', get_version_response)
    response_version = get_version_response.version
    if response_version:
        cloudera_version = parse(response_version)
        check.log.debug('Cloudera Manager Version: %s', cloudera_version)
        if cloudera_version.major == 7:
            version_raw = str(cloudera_version)
            version_parts = {
                'major': str(cloudera_version.major),
                'minor': str(cloudera_version.minor),
                'patch': str(cloudera_version.micro),
            }
            check.set_metadata('version', version_raw, scheme='parts', part_map=version_parts)

            return ApiClientV7(check, api_client)

    raise ConfigurationError(f'Cloudera Manager Version is unsupported or unknown: {response_version}')
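
The version gate in `make_api_client` can be illustrated in isolation. The sketch below is a standard-library stand-in for `packaging.version.parse` and only handles plain `major.minor.patch` prefixes; `version_parts` and `is_supported` are hypothetical helpers, not functions from this PR:

```python
import re

# Assumption: the version string starts with three dot-separated integers.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)")


def version_parts(raw_version: str) -> dict:
    """Split a version string into the major/minor/patch parts used for metadata."""
    match = _VERSION_RE.match(raw_version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {raw_version!r}")
    major, minor, patch = match.groups()
    return {"major": major, "minor": minor, "patch": patch}


def is_supported(raw_version: str) -> bool:
    """Mirror the factory's gate: only major version 7 gets a client."""
    return version_parts(raw_version)["major"] == "7"


print(version_parts("7.2.15"))
print(is_supported("6.3.4"))
```

Unlike `packaging.version.parse`, this stand-in does not normalize pre-release or local version segments, so it is only suitable for illustrating the control flow.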