
Missing support for EC2 mac instances (mac2.metal) #798

@albertodebortoli


Describe the bug

While the CloudWatch agent would theoretically work on macOS, it cannot be used on EC2 mac instances of type mac2.metal.
At the time of writing, it seems that macOS support was added, but the documentation still focuses mainly on Linux and Windows, and there is no evidence online that the agent can be set up on EC2 mac instances.

Main blocker: the agent looks for types.db at /usr/share/collectd/ but the directory is not writable without disabling SIP (System Integrity Protection) which cannot be done on EC2 mac instances.

Steps to reproduce

On an EC2 mac instance of type mac2.metal, download the agent:

wget https://s3.<region>.amazonaws.com/amazoncloudwatch-agent-<region>/darwin/arm64/latest/amazon-cloudwatch-agent.pkg

install it

sudo installer -pkg ./amazon-cloudwatch-agent.pkg -target /

configure it using the wizard and save the configuration in the SSM Parameter Store

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

when running the agent

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c ssm:AmazonCloudWatch-darwin

the following error occurs:

======== Error Log ========
2023-07-23T04:57:28Z E! [telegraf] Error running agent: Error loading config file /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml: error parsing socket_listener, open /usr/share/collectd/types.db: no such file or directory

This is because types.db is located at /usr/share/collectd/types.db on Linux, whereas on macOS, when collectd is installed via Homebrew, it is located at /opt/homebrew/opt/collectd/share/collectd/types.db.
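To confirm which of the two locations actually exists on a given instance, a small check along these lines can help. It is only a sketch: the helper name find_typesdb is made up for illustration, and the candidate list contains just the two paths mentioned in this report.

```shell
#!/bin/sh
# Hypothetical helper: print the first existing types.db among the
# candidate paths passed as arguments; return non-zero if none exists.
find_typesdb() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidates taken from this report: the Linux default first,
# then the Homebrew location on Apple silicon.
find_typesdb \
    /usr/share/collectd/types.db \
    /opt/homebrew/opt/collectd/share/collectd/types.db \
    || echo "types.db not found in any candidate location" >&2
```

On the mac2.metal instance described here, only the Homebrew path would be expected to show up.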

One might be tempted to create a symlink (as suggested here):

sudo ln -s /opt/homebrew/opt/collectd/share/collectd/types.db /usr/share/collectd/types.db

but /usr/share/ is not writable on EC2 mac instances:

% sudo mkdir -p /usr/share/collectd 
mkdir: /usr/share/collectd: Operation not permitted

therefore installing collectd in a different way won't help either.

Disabling SIP would make the folder writable but it's not an option on EC2 mac instances.

Finally, modifying the collectd_typesdb value in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml has no effect, as amazon-cloudwatch-agent-ctl regenerates the file and reverts the change when executing this line:

runTranslatorCommand=`"${CMDDIR}/config-translator" --input "${JSON}" --input-dir "${JSON_DIR}" --output "${TOML}" --mode ${mode} --config "${COMMON_CONIG}" --multi-config ${multi_config}`
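One way to observe the revert is to print the collectd_typesdb entry of the generated TOML before and after running amazon-cloudwatch-agent-ctl. The snippet below is a sketch only: show_typesdb_setting is a made-up helper name, and the TOML path is the one from the error log above.

```shell
#!/bin/sh
# Hypothetical helper: print the collectd_typesdb line(s), with line
# numbers, from the TOML file given as the first argument.
show_typesdb_setting() {
    grep -n 'collectd_typesdb' "$1"
}

# Path taken from the error log in this report; override by setting TOML.
TOML="${TOML:-/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml}"

if [ -r "$TOML" ]; then
    show_typesdb_setting "$TOML" || echo "no collectd_typesdb entry in $TOML" >&2
else
    echo "cannot read $TOML (it may require sudo)" >&2
fi
```

Running it once after editing the file by hand and once after fetch-config should show the hand-edited value being replaced.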

Inspecting config-translator, it seems that it doesn't have support for darwin (os flag):

% sudo  /opt/aws/amazon-cloudwatch-agent/bin/config-translator --help
Usage of /opt/aws/amazon-cloudwatch-agent/bin/config-translator:
  -config string
    	Please provide the common-config file
  -input string
    	Please provide the path of input agent json config file
  -input-dir string
    	Please provide the path of input agent json config directory.
  -mode string
    	Please provide the mode, i.e. ec2, onPremise, onPrem, auto (default "ec2")
  -multi-config string
    	valid values: default, append, remove (default "remove")
  -os string
    	Please provide the os preference, valid value: windows/linux.
  -output string
    	Please provide the path of the output CWAgent config file

Modifying /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl to add --os "darwin" on the line invoking config-translator is accepted by config-translator (other strings are not), which suggests that work was done to support it. However, when running the agent again, the same error persists:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c ssm:AmazonCloudWatch-darwin

======== Error Log ========
2023-07-23T14:36:14Z E! [telegraf] Error running agent: Error loading config file /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml: error parsing socket_listener, open /usr/share/collectd/types.db: no such file or directory

meaning that

  1. the tool documentation (--help) has not been updated
  2. config-translator has /usr/share/collectd/types.db hard-coded

In the README.md, the link to the nightly build binary for macOS on arm64 is not present, and the one for amd64 doesn't work (https://amazoncloudwatch-agent.s3.amazonaws.com/nightly-build/latest/darwin_amd64/config-translator). The link to the amd64 package works, but the arm64 one is missing, even though it works if constructed by hand (https://amazoncloudwatch-agent.s3.amazonaws.com/nightly-build/latest/darwin/arm64/amazon-cloudwatch-agent.tar.gz). The nightly build downloaded at the time of writing (23/07/2023) was created on 30/06/2023, and it seems that the corresponding GitHub Actions workflow has been failing silently for some time (https://github.com/aws/amazon-cloudwatch-agent/actions/runs/5633431469).

What did you expect to see?

The CloudWatch agent on macOS running with collectd and working as expected.

What did you see instead?

The following error when running the agent:

======== Error Log ========
2023-07-23T14:36:14Z E! [telegraf] Error running agent: Error loading config file /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml: error parsing socket_listener, open /usr/share/collectd/types.db: no such file or directory

showing the path /usr/share/collectd/types.db typical of Linux.

What version did you use?

I tried both the latest stable and the latest nightly build available at the time of writing:

The instance I've used is of type mac2.metal.

What config did you use?

{
	"agent": {
		"metrics_collection_interval": 60,
		"run_as_user": "root"
	},
	"metrics": {
		"aggregation_dimensions": [
			[
				"InstanceId"
			]
		],
		"append_dimensions": {
			"AutoScalingGroupName": "${aws:AutoScalingGroupName}",
			"ImageId": "${aws:ImageId}",
			"InstanceId": "${aws:InstanceId}",
			"InstanceType": "${aws:InstanceType}"
		},
		"metrics_collected": {
			"collectd": {
				"metrics_aggregation_interval": 60
			},
			"cpu": {
				"measurement": [
					"cpu_usage_idle",
					"cpu_usage_iowait",
					"cpu_usage_user",
					"cpu_usage_system"
				],
				"metrics_collection_interval": 60,
				"resources": [
					"*"
				],
				"totalcpu": false
			},
			"disk": {
				"measurement": [
					"used_percent",
					"inodes_free"
				],
				"metrics_collection_interval": 60,
				"resources": [
					"*"
				]
			},
			"diskio": {
				"measurement": [
					"io_time",
					"write_bytes",
					"read_bytes",
					"writes",
					"reads"
				],
				"metrics_collection_interval": 60,
				"resources": [
					"*"
				]
			},
			"mem": {
				"measurement": [
					"mem_used_percent"
				],
				"metrics_collection_interval": 60
			},
			"netstat": {
				"measurement": [
					"tcp_established",
					"tcp_time_wait"
				],
				"metrics_collection_interval": 60
			},
			"statsd": {
				"metrics_aggregation_interval": 60,
				"metrics_collection_interval": 10,
				"service_address": ":8125"
			},
			"swap": {
				"measurement": [
					"swap_used_percent"
				],
				"metrics_collection_interval": 60
			}
		}
	}
}
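For anyone experimenting further: the agent's JSON configuration reference documents an optional collectd_typesdb list in the collectd section, defaulting to /usr/share/collectd/types.db. Whether the darwin build of config-translator honors it was not verified in this report, so the following is an untested sketch that only writes a trimmed-down JSON fragment pointing at the Homebrew path to a temporary file.

```shell
#!/bin/sh
# Untested assumption: the darwin translator honors collectd_typesdb
# when it is set in the JSON config rather than in the generated TOML.
out="$(mktemp)"

# Minimal fragment; the Homebrew path is the one given in this report.
cat > "$out" <<'EOF'
{
	"metrics": {
		"metrics_collected": {
			"collectd": {
				"metrics_aggregation_interval": 60,
				"collectd_typesdb": [
					"/opt/homebrew/opt/collectd/share/collectd/types.db"
				]
			}
		}
	}
}
EOF

echo "wrote $out"
```

The collectd block could then be merged into the full config above and stored in the same SSM parameter.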

Environment
OS: macOS 13.2.1

Additional context
N/A
