Semantic Conventions for System Metrics

Status: Experimental

This document describes instruments and labels for common system level metrics in OpenTelemetry. Consider the general metric semantic conventions when creating instruments not explicitly defined in the specification.

Metric Instruments

system.cpu. - Processor metrics

Description: System level processor metrics.

| Name | Description | Units | Instrument Type | Value Type | Label Key(s) | Label Values |
|------|-------------|-------|-----------------|------------|--------------|--------------|
| system.cpu.time | | s | SumObserver | Double | state | idle, user, system, interrupt, etc. |
| | | | | | cpu | CPU number [0..n-1] |
| system.cpu.utilization | | 1 | ValueObserver | Double | state | idle, user, system, interrupt, etc. |
| | | | | | cpu | CPU number (0..n) |
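
For concreteness, the sketch below shows how per-CPU, per-state values for system.cpu.time can be collected. It uses the psutil library purely as an example data source (an assumption of this sketch, not part of the convention) and prints the labeled observations instead of registering a real SumObserver.

```python
# Sketch: labeled observations for system.cpu.time via psutil
# (psutil is an assumption of this example, not part of the convention).
import psutil

def observe_cpu_time():
    """Yield (labels, value) pairs: cumulative seconds per CPU and state."""
    for cpu_number, times in enumerate(psutil.cpu_times(percpu=True)):
        # Named-tuple fields (user, system, idle, ...) become the "state"
        # label; the list index becomes the "cpu" label.
        for state, seconds in times._asdict().items():
            yield {"state": state, "cpu": cpu_number}, seconds

for labels, value in observe_cpu_time():
    print("system.cpu.time", labels, value)
```

system.cpu.utilization can then be derived per CPU and state by taking the difference between two successive system.cpu.time readings and dividing by the elapsed time of the collection interval.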

system.memory. - Memory metrics

Description: System level memory metrics. This does not include paging/swap memory.

| Name | Description | Units | Instrument Type | Value Type | Label Key | Label Values |
|------|-------------|-------|-----------------|------------|-----------|--------------|
| system.memory.usage | | By | UpDownSumObserver | Int64 | state | used, free, cached, etc. |
| system.memory.utilization | | 1 | ValueObserver | Double | state | used, free, cached, etc. |
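
A minimal sketch of how these two instruments relate, again using psutil as an assumed data source: usage reports absolute bytes per state, while utilization reports each state's fraction of total memory.

```python
# Sketch: system.memory.usage and system.memory.utilization via psutil
# (an assumption of this example). "cached" is Linux-specific.
import psutil

def observe_memory():
    vm = psutil.virtual_memory()
    states = {
        "used": vm.used,
        "free": vm.free,
        "cached": getattr(vm, "cached", 0),  # absent on some platforms
    }
    for state, value_bytes in states.items():
        # usage: UpDownSumObserver, Int64, unit "By"
        yield "system.memory.usage", {"state": state}, value_bytes
        # utilization: ValueObserver, Double, unit "1"
        yield "system.memory.utilization", {"state": state}, value_bytes / vm.total

for name, labels, value in observe_memory():
    print(name, labels, value)
```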

system.paging. - Paging/swap metrics

Description: System level paging/swap memory metrics.

| Name | Description | Units | Instrument Type | Value Type | Label Key | Label Values |
|------|-------------|-------|-----------------|------------|-----------|--------------|
| system.paging.usage | Unix swap or Windows pagefile usage | By | UpDownSumObserver | Int64 | state | used, free |
| system.paging.utilization | | 1 | ValueObserver | Double | state | used, free |
| system.paging.faults | | {faults} | SumObserver | Int64 | type | major, minor |
| system.paging.operations | | {operations} | SumObserver | Int64 | type | major, minor |
| | | | | | direction | in, out |
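
The sketch below collects the swap-space portion from psutil and the fault and operation counters from the Linux /proc/vmstat interface; both the library and the Linux-only file layout are assumptions of this example, and the mapping of pswpin/pswpout to major paging operations is a simplification.

```python
# Sketch: paging/swap metrics; psutil and the Linux /proc/vmstat layout
# are assumptions of this example.
import psutil

def observe_paging():
    swap = psutil.swap_memory()
    yield "system.paging.usage", {"state": "used"}, swap.used
    yield "system.paging.usage", {"state": "free"}, swap.free
    if swap.total:
        yield "system.paging.utilization", {"state": "used"}, swap.used / swap.total
        yield "system.paging.utilization", {"state": "free"}, swap.free / swap.total

    # Linux-only cumulative counters from /proc/vmstat.
    with open("/proc/vmstat") as f:
        vmstat = dict(line.split() for line in f)
    major = int(vmstat["pgmajfault"])
    yield "system.paging.faults", {"type": "major"}, major
    yield "system.paging.faults", {"type": "minor"}, int(vmstat["pgfault"]) - major
    # pswpin/pswpout count pages swapped in/out; treating them as "major"
    # operations is a simplification of this sketch.
    yield "system.paging.operations", {"type": "major", "direction": "in"}, int(vmstat["pswpin"])
    yield "system.paging.operations", {"type": "major", "direction": "out"}, int(vmstat["pswpout"])

for name, labels, value in observe_paging():
    print(name, labels, value)
```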

system.disk. - Disk controller metrics

Description: System level disk performance metrics.

| Name | Description | Units | Instrument Type | Value Type | Label Key | Label Values |
|------|-------------|-------|-----------------|------------|-----------|--------------|
| system.disk.io | | By | SumObserver | Int64 | device | (identifier) |
| | | | | | direction | read, write |
| system.disk.operations | | {operations} | SumObserver | Int64 | device | (identifier) |
| | | | | | direction | read, write |
| system.disk.io_time¹ | Time disk spent activated | s | SumObserver | Double | device | (identifier) |
| system.disk.operation_time² | Sum of the time each operation took to complete | s | SumObserver | Double | device | (identifier) |
| | | | | | direction | read, write |
| system.disk.merged | | {operations} | SumObserver | Int64 | device | (identifier) |
| | | | | | direction | read, write |

¹ The real elapsed time ("wall clock") used in the I/O path (time from operations running in parallel are not counted). Measured as:

² Because it is the sum of time each request took, parallel-issued requests each contribute to make the count grow. Measured as:

  • Linux: Fields 7 & 11 from procfs-diskstats
  • Windows: "Avg. Disk sec/Read" perf counter multiplied by "Disk Reads/sec" perf counter (similar for Writes)
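
A sketch of how these counters can be derived from psutil.disk_io_counters (an assumption of this example). psutil reports durations in milliseconds, so they are converted to the seconds required by system.disk.io_time and system.disk.operation_time; the merged and busy-time fields are only present on Linux.

```python
# Sketch: per-device disk metrics from psutil (an assumption of this
# example). psutil reports durations in milliseconds; the convention
# uses seconds, hence the / 1000 conversions.
import psutil

def observe_disk():
    for device, c in psutil.disk_io_counters(perdisk=True).items():
        dev = {"device": device}
        yield "system.disk.io", {**dev, "direction": "read"}, c.read_bytes
        yield "system.disk.io", {**dev, "direction": "write"}, c.write_bytes
        yield "system.disk.operations", {**dev, "direction": "read"}, c.read_count
        yield "system.disk.operations", {**dev, "direction": "write"}, c.write_count
        # Sum of per-operation durations (see footnote 2).
        yield "system.disk.operation_time", {**dev, "direction": "read"}, c.read_time / 1000
        yield "system.disk.operation_time", {**dev, "direction": "write"}, c.write_time / 1000
        # Linux-only fields.
        if hasattr(c, "busy_time"):  # wall-clock time the disk was active (footnote 1)
            yield "system.disk.io_time", dev, c.busy_time / 1000
        if hasattr(c, "read_merged_count"):
            yield "system.disk.merged", {**dev, "direction": "read"}, c.read_merged_count
            yield "system.disk.merged", {**dev, "direction": "write"}, c.write_merged_count

for name, labels, value in observe_disk():
    print(name, labels, value)
```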

system.filesystem. - Filesystem metrics

Description: System level filesystem metrics.

| Name | Description | Units | Instrument Type | Value Type | Label Key | Label Values |
|------|-------------|-------|-----------------|------------|-----------|--------------|
| system.filesystem.usage | | By | UpDownSumObserver | Int64 | device | (identifier) |
| | | | | | state | used, free, reserved |
| | | | | | type | ext4, tmpfs, etc. |
| | | | | | mode | rw, ro, etc. |
| | | | | | mountpoint | (path) |
| system.filesystem.utilization | | 1 | ValueObserver | Double | device | (identifier) |
| | | | | | state | used, free, reserved |
| | | | | | type | ext4, tmpfs, etc. |
| | | | | | mode | rw, ro, etc. |
| | | | | | mountpoint | (path) |
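
The sketch below walks the mounted partitions with psutil (an assumption of this example) and emits usage and utilization per device, type, mode, and mountpoint. The reserved state is not exposed by psutil and is omitted here.

```python
# Sketch: filesystem usage/utilization via psutil (an assumption of this
# example). The "reserved" state is not exposed by psutil and is omitted.
import psutil

def observe_filesystem():
    for part in psutil.disk_partitions():
        try:
            usage = psutil.disk_usage(part.mountpoint)
        except OSError:
            continue  # mountpoint not accessible
        if usage.total == 0:
            continue
        labels = {
            "device": part.device,
            "type": part.fstype,
            "mode": "ro" if "ro" in part.opts.split(",") else "rw",
            "mountpoint": part.mountpoint,
        }
        for state, value_bytes in (("used", usage.used), ("free", usage.free)):
            yield "system.filesystem.usage", {**labels, "state": state}, value_bytes
            yield "system.filesystem.utilization", {**labels, "state": state}, value_bytes / usage.total

for name, labels, value in observe_filesystem():
    print(name, labels, value)
```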

system.network. - Network metrics

Description: System level network metrics.

| Name | Description | Units | Instrument Type | Value Type | Label Key | Label Values |
|------|-------------|-------|-----------------|------------|-----------|--------------|
| system.network.dropped¹ | Count of packets that are dropped or discarded even though there was no error | {packets} | SumObserver | Int64 | device | (identifier) |
| | | | | | direction | transmit, receive |
| system.network.packets | | {packets} | SumObserver | Int64 | device | (identifier) |
| | | | | | direction | transmit, receive |
| system.network.errors² | Count of network errors detected | {errors} | SumObserver | Int64 | device | (identifier) |
| | | | | | direction | transmit, receive |
| system.network.io | | By | SumObserver | Int64 | device | (identifier) |
| | | | | | direction | transmit, receive |
| system.network.connections | | {connections} | UpDownSumObserver | Int64 | device | (identifier) |
| | | | | | protocol | tcp, udp, etc. |
| | | | | | state | e.g. for tcp |

¹ Measured as:

² Measured as:
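
A sketch of both the per-interface counters and the connection counts using psutil (an assumption of this example). psutil does not attribute connections to a device, so that label is omitted here, and enumerating connections may require elevated privileges on some platforms.

```python
# Sketch: per-interface counters and connection counts via psutil (an
# assumption of this example). psutil does not attribute connections to
# a device, so that label is omitted; enumerating connections may need
# elevated privileges on some platforms.
import collections
import socket
import psutil

def observe_network():
    for nic, c in psutil.net_io_counters(pernic=True).items():
        dev = {"device": nic}
        yield "system.network.io", {**dev, "direction": "transmit"}, c.bytes_sent
        yield "system.network.io", {**dev, "direction": "receive"}, c.bytes_recv
        yield "system.network.packets", {**dev, "direction": "transmit"}, c.packets_sent
        yield "system.network.packets", {**dev, "direction": "receive"}, c.packets_recv
        yield "system.network.errors", {**dev, "direction": "transmit"}, c.errout
        yield "system.network.errors", {**dev, "direction": "receive"}, c.errin
        yield "system.network.dropped", {**dev, "direction": "transmit"}, c.dropout
        yield "system.network.dropped", {**dev, "direction": "receive"}, c.dropin

    counts = collections.Counter()
    for conn in psutil.net_connections(kind="inet"):
        protocol = "tcp" if conn.type == socket.SOCK_STREAM else "udp"
        counts[(protocol, conn.status.lower())] += 1
    for (protocol, state), n in counts.items():
        yield "system.network.connections", {"protocol": protocol, "state": state}, n

for name, labels, value in observe_network():
    print(name, labels, value)
```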

system.processes. - Aggregate system process metrics

Description: System level aggregate process metrics. For metrics at the individual process level, see process metrics.

| Name | Description | Units | Instrument Type | Value Type | Label Key | Label Values |
|------|-------------|-------|-----------------|------------|-----------|--------------|
| system.processes.count | Total number of processes in each state | {processes} | UpDownSumObserver | Int64 | status | running, sleeping, etc. |
| system.processes.created | Total number of processes created over uptime of the host | {processes} | SumObserver | Int64 | - | - |
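
The sketch below counts processes per status with psutil and reads the cumulative fork count from the Linux /proc/stat "processes" line; both are assumptions of this example.

```python
# Sketch: aggregate process metrics; psutil and the Linux /proc/stat
# "processes" line are assumptions of this example.
import collections
import psutil

def observe_processes():
    # system.processes.count: one observation per status label value.
    counts = collections.Counter(
        p.info["status"] for p in psutil.process_iter(["status"]))
    for status, n in counts.items():
        yield "system.processes.count", {"status": status}, n

    # system.processes.created: Linux reports the total number of forks
    # since boot on the "processes" line of /proc/stat.
    with open("/proc/stat") as f:
        for line in f:
            if line.startswith("processes "):
                yield "system.processes.created", {}, int(line.split()[1])

for name, labels, value in observe_processes():
    print(name, labels, value)
```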

system.{os}. - OS Specific System Metrics

Instrument names for system level metrics that have different and conflicting meaning across multiple OSes should be prefixed with system.{os}. and follow the hierarchies listed above for different entities like CPU, memory, and network.

For example, UNIX load average over a given interval is not well standardized and its value across different UNIX like OSes may vary despite being under similar load:

Without getting into the vagaries of every Unix-like operating system in existence, the load average more or less represents the average number of processes that are in the running (using the CPU) or runnable (waiting for the CPU) states. One notable exception exists: Linux includes processes in uninterruptible sleep states, typically waiting for some I/O activity to complete. This can markedly increase the load average on Linux systems.

(source of quote, Linux source code)

An instrument for load average over 1 minute on Linux could be named system.linux.cpu.load_1m, reusing the cpu name proposed above and having an {os} prefix to split this metric across OSes.
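
A sketch of such an instrument using os.getloadavg(), which is available on Unix-like platforms; the 5- and 15-minute names are analogous extensions assumed here for illustration.

```python
# Sketch: Linux load-average instruments following the system.{os}.
# pattern. os.getloadavg() is Unix-only; the 5m/15m names are assumed
# here by analogy with system.linux.cpu.load_1m.
import os

def observe_load_average():
    load_1m, load_5m, load_15m = os.getloadavg()
    yield "system.linux.cpu.load_1m", {}, load_1m
    yield "system.linux.cpu.load_5m", {}, load_5m
    yield "system.linux.cpu.load_15m", {}, load_15m

for name, labels, value in observe_load_average():
    print(name, labels, value)
```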