Rename Events to Categorized Logs
This is an alternative to open-telemetry#2863

## Problem

"Event" is a confusing term that is understood differently by different people in different contexts. We would like to avoid using it if possible.

## Proposal

The OpenTelemetry Event is defined as a LogRecord that has specific attributes, namely event.name and event.domain. The event.domain here is one of the important elements. It places the event definitions into isolated buckets (domains).
We could use the term "Domainized Logs", but it sounds a bit weird, so I want to suggest renaming the concept of "domain" to "category", without changing the semantics.
The specially shaped Log Records can be called "Categorized Logs" and have attributes "log.category" (previously known as "event.domain") and "log.name" (previously known as "event.name").
We will refrain from using the term "event" as much as possible to avoid confusion.
I am open to other name suggestions for "Categorized Logs". We just want to make sure to avoid the word "events" and avoid inventing completely new terms, so some sort of adjective + "logs" seems to be the best approach.

## What Changes?

- "Event" is renamed to "Categorized LogRecord"
- "event.name" is renamed to "log.name"
- "event.domain" is renamed to "log.category"
- "log.category" is an attribute of LogRecord (previously, "event.domain" was a Scope attribute)
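
The renames listed above can be sketched as a small attribute migration. This is only an illustration: `RENAMED_KEYS` and `migrate_attributes` are hypothetical names, and attributes are modelled as plain dictionaries rather than any real SDK type.

```python
# Hypothetical helper; the key mapping is taken from the "What Changes?" list.
RENAMED_KEYS = {
    "event.name": "log.name",
    "event.domain": "log.category",
}


def migrate_attributes(record_attrs, scope_attrs):
    """Return new (record_attrs, scope_attrs) dicts that use the renamed keys.

    `event.domain` is moved from the scope attributes to the record
    attributes, because `log.category` is a LogRecord attribute.
    """
    new_record = {RENAMED_KEYS.get(k, k): v for k, v in record_attrs.items()}
    new_scope = dict(scope_attrs)
    if "event.domain" in new_scope:
        # setdefault keeps a category that is already present on the record.
        new_record.setdefault("log.category", new_scope.pop("event.domain"))
    return new_record, new_scope
```

For example, a record with `event.name` set to `click` under a scope with `event.domain` set to `browser` comes out with `log.name` and `log.category` on the record and an empty scope attribute set.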

## What Did We Lose?

"event.domain" previously could be recorded as a Scope attribute and be used for efficient batch processing/routing of LogRecords. This is no longer possible, but we can add another such attribute in the future that serves the same purpose (e.g. see the [proposal to add "signal.type"](https://github.com/open-telemetry/opentelemetry-specification/pull/2863/files#r993794504)).
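
The routing trade-off can be illustrated with a rough sketch. Records are modelled as plain attribute dictionaries, and both function names are hypothetical. With a Scope attribute, one lookup routes a whole batch. With a LogRecord attribute, every record has to be inspected.

```python
def route_by_scope(scope_attrs, records, key="event.domain"):
    """Old layout: a single lookup decides the destination of the batch."""
    return {scope_attrs.get(key): list(records)}


def route_by_record(records, key="log.category"):
    """New layout: each record is inspected and grouped individually."""
    routed = {}
    for record in records:
        routed.setdefault(record.get(key), []).append(record)
    return routed
```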
tigrannajaryan committed Oct 13, 2022
1 parent 5accc43 commit 944ba27
Showing 4 changed files with 137 additions and 64 deletions.
100 changes: 86 additions & 14 deletions specification/logs/README.md
@@ -17,6 +17,8 @@ aliases: [/docs/reference/specification/logs/overview]
- [OpenTelemetry Solution](#opentelemetry-solution)
- [Log Correlation](#log-correlation)
- [Events and Logs](#events-and-logs)
* [Categorized LogsRecords](#categorized-logsrecords)
* [FAQ](#faq)
- [Legacy and Modern Log Sources](#legacy-and-modern-log-sources)
* [System Logs](#system-logs)
* [Infrastructure Logs](#infrastructure-logs)
@@ -124,11 +126,6 @@ languages have established standards for using particular logging libraries. For
example in Java world there are several highly popular and widely used logging
libraries, such as Log4j or Logback.

OpenTelemetry defines [events](#events-and-logs) as a type of LogRecord with
specific characteristics. This definition is not ubiquitous across existing
libraries and languages. In some logging libraries, producing events aligned
with the OpenTelemetry event definition is clunky or error-prone.

There are also countless existing prebuilt applications or systems that emit
logs in certain formats. Operators of such applications have no or limited
control on how the logs are emitted. OpenTelemetry needs to support these logs.
@@ -148,6 +145,12 @@ Given the above state of the logging space we took the following approach:
OpenTelemetry log data model. OpenTelemetry Collector can read such logs and
translate them to OpenTelemetry log data model.

- OpenTelemetry defines [Categorized Logs](#events-and-logs) as a type of LogRecord with
specific characteristics:
- They have a LogRecord attribute `log.name` (and possibly other LogRecord attributes).
- They have an InstrumentationScope with a non-empty `Name` and with an
InstrumentationScope attribute `log.category` (and possibly other InstrumentationScope attributes).

- OpenTelemetry defines an API
for [emitting LogRecords](./api.md#emit-logrecord). Application developers are
NOT encouraged to call this API directly. It is provided for library authors
@@ -157,7 +160,7 @@ Given the above state of the logging space we took the following approach:
features than what is defined in OpenTelemetry. It is NOT a goal of
OpenTelemetry to ship a feature-rich logging library.

- OpenTelemetry defines an API for [emitting Events](./api.md#emit-event).
- OpenTelemetry defines an API for [emitting Categorized Logs](./api.md#emit-categorized-logrecord).
Application developers are encouraged to call this API directly.

- OpenTelemetry defines an [SDK](./sdk.md) implementation of the [API](./api.md),
@@ -208,15 +211,84 @@ Wikipedia’s [definition of log file](https://en.wikipedia.org/wiki/Log_file):
>In computing, a log file is a file that records either events that occur in an
>operating system or other software runs.
From OpenTelemetry's perspective LogRecords and Events are both represented
using the same [data model](./data-model.md).
From OpenTelemetry's perspective, logs and events are not conceptually different. Both
are represented using the same [LogRecord data model](./data-model.md).

### Categorized LogsRecords

OpenTelemetry defines **Categorized LogRecords** as LogRecords that are shaped
in a special way:

- They have a LogRecord attribute `log.name` (and possibly other LogRecord attributes).
- They have an InstrumentationScope with a non-empty `Name` and with an
InstrumentationScope attribute `log.category` (and possibly other InstrumentationScope attributes).

Within a particular `log.category`, the `log.name` uniquely defines a particular class
or type of Categorized LogRecords. Categorized LogRecords with the same `log.category` /
`log.name` follow the same schema which assists in analysis in observability platforms.
See also OpenTelemetry Log [semantic conventions](./semantic_conventions/categorizedlogs.md).
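
The definition above can be read as a simple predicate. The sketch below follows the wording of this section literally, with `log.name` on the record and `log.category` on the scope. The `Scope` and `LogRecord` classes are hypothetical stand-ins, not SDK types. Other parts of this change describe `log.category` as a LogRecord attribute. Under that reading the last check would look at the record's attributes instead.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """Hypothetical stand-in for an InstrumentationScope."""
    name: str = ""
    attributes: dict = field(default_factory=dict)


@dataclass
class LogRecord:
    """Hypothetical stand-in for a LogRecord and the scope it belongs to."""
    scope: Scope
    attributes: dict = field(default_factory=dict)


def is_categorized(record):
    """True if the record has the shape described in this section."""
    return (
        "log.name" in record.attributes
        and bool(record.scope.name)
        and "log.category" in record.scope.attributes
    )
```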

### FAQ

**What is a Categorized LogRecord?**

It is a specially shaped LogRecord. See [Categorized LogRecords](#categorized-logsrecords).

**How are events and logs different?**

They are not. The words "events" and "logs" are synonyms. We prefer the word "logs"
when referring to generic log and event data.

**Who produces Categorized LogRecords?**

Categorized LogRecords are produced using the OpenTelemetry Categorized Logs API or
by the OpenTelemetry Collector.

**Why do Categorized LogRecords exist as a concept?**

Categorized LogRecords are a class of logs designed within the OpenTelemetry community
or in compliance with OpenTelemetry recommendations. Categorized LogRecords have a
particular shape of data that OpenTelemetry believes is beneficial for designers of
structured logs and events to adopt.

**What are the reasons Categorized LogRecords have a `log.category` attribute?**

The `log.category` Scope attribute isolates groups (categorizes) of logs or events
designed by different people. Any decisions about the choice of attribute names and other
decisions about the shape of the LogRecord made by designers of logs in a particular
domain have no impact on the design of logs in another domain.
In other words, the `log.category` attribute allows different groups of people to
independently make choices about log representation in their domain of expertise
without worrying that their choices will impact people who design logs
in some other domain of expertise.

**I have a non-OpenTelemetry data source that produces logs/events (e.g. Windows Events).
Should I make sure they are shaped like Categorized LogRecords when used with OpenTelemetry
software (e.g. inside OpenTelemetry Collector)?**

Not necessarily. Only do so if the semantics of the non-OpenTelemetry data source
match the definition of Categorized LogRecords.

**I have a non-OpenTelemetry data source that produces events that have a `name` and
`category`. The semantics of the `name` and `category` in this data source are exactly the
same as `log.name` and `log.category` in OpenTelemetry. What should I do when I bring
these events to OpenTelemetry?**

If there is an exact match in the semantics then it is reasonable to map them to
OpenTelemetry's concepts. So, when the events from the external data source are converted
to OpenTelemetry LogRecords (for example in OpenTelemetry Collector) it is reasonable
to shape them like Categorized Logs. In the given example it is reasonable to map
the `name` field from the data source to `log.name` and the `category` field to
`log.category`.
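
A minimal sketch of that mapping, assuming the external event arrives as a flat dictionary with `name` and `category` keys. The function name is hypothetical, and a real conversion (for example in a Collector pipeline) would also have to handle timestamps, body and severity.

```python
def to_categorized_attributes(external_event):
    """Map an external event's `name`/`category` fields to OpenTelemetry keys."""
    # Carry over every other field unchanged as a LogRecord attribute.
    attrs = {
        key: value
        for key, value in external_event.items()
        if key not in ("name", "category")
    }
    attrs["log.name"] = external_event["name"]
    attrs["log.category"] = external_event["category"]
    return attrs
```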

**I am designing a new library/application/system and want to produce structured logs/events
using OpenTelemetry. Should my events be shaped like Categorized LogRecords?**

However, OpenTelemetry does recognize a subtle semantic difference between
LogRecords and Events: Events are LogRecords which have a `name` and `domain`.
Within a particular `domain`, the `name` uniquely defines a particular class or
type of event. Events with the same `domain` / `name` follow the same schema
which assists in analysis in observability platforms. Events are described in
more detail in the [semantic conventions](./semantic_conventions/events.md).
Yes. For new designs we recommend shaping your data like Categorized LogRecords.
Make sure to choose a good descriptive value for `log.category`. If the category is common
enough, consider adding it as a well-known value in OpenTelemetry's [semantic conventions](
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/semantic_conventions/events.md)
for the `log.category` attribute.

## Legacy and Modern Log Sources

47 changes: 24 additions & 23 deletions specification/logs/api.md
@@ -1,4 +1,4 @@
# Events and Logs API Interface
# Logs API Interface

**Status**: [Experimental](../document-status.md)

@@ -14,7 +14,7 @@
+ [Get a Logger](#get-a-logger)
- [Logger](#logger)
* [Logger operations](#logger-operations)
+ [Emit Event](#emit-event)
+ [Emit Categorized LogRecord](#emit-categorized-logrecord)
+ [Emit LogRecord](#emit-logrecord)
- [LogRecord](#logrecord)
- [Usage](#usage)
@@ -26,19 +26,19 @@

</details>

The Events and Logs API consist of these main classes:
The Logs API consists of these main classes:

* LoggerProvider is the entry point of the API. It provides access to Loggers.
* Logger is the class responsible for
creating [Events](./semantic_conventions/events.md)
and [Logs](./data-model.md#log-and-event-record-definition) as LogRecords.
creating [arbitrary LogRecords](#emit-logrecord) or
[Categorized Logs](#emit-categorized-logrecord).

LoggerProvider/Logger are analogous to TracerProvider/Tracer.

```mermaid
graph TD
A[LoggerProvider] -->|Get| B(Logger)
B --> C(Event)
B --> C(Categorized Log)
B --> D(Log)
```

@@ -91,10 +91,10 @@ produced by this library.
the scope has a version (e.g. a library version). Example value: 1.0.0.
- `schema_url` (optional): Specifies the Schema URL that should be recorded in
the emitted telemetry.
- `event_domain` (optional): Specifies the domain for the Events emitted, which
should be added as `event.domain` attribute of the instrumentation scope.
- `log_category` (optional): Specifies the category for the logs emitted, which
should be added as `log.category` attribute of the LogRecords.
- `include_trace_context` (optional): Specifies whether the Trace Context should
automatically be passed on to the Events and Logs emitted by the Logger. This
automatically be passed on to the Logs emitted by the Logger. This
SHOULD be true by default.
- `attributes` (optional): Specifies the instrumentation scope attributes to
associate with emitted telemetry.
@@ -110,7 +110,7 @@ identifying fields are equal. The term *distinct* applied to Loggers describes
instances where at least one identifying field has a different value.

Implementations MUST NOT require users to repeatedly obtain a Logger again with
the same name+version+schema_url+event_domain+include_trace_context+attributes
the same name+version+schema_url+log_category+include_trace_context+attributes
to pick up configuration changes. This can be achieved either by allowing to
work with an outdated configuration or by ensuring that new configuration
applies also to previously returned Loggers.
@@ -119,7 +119,7 @@ Note: This could, for example, be implemented by storing any mutable
configuration in the `LoggerProvider` and having `Logger` implementation objects
have a reference to the `LoggerProvider` from which they were obtained.
If configuration must be stored per-Logger (such as disabling a certain `Logger`),
the `Logger` could, for example, do a look-up with its name+version+schema_url+event_domain+include_trace_context+attributes
the `Logger` could, for example, do a look-up with its name+version+schema_url+log_category+include_trace_context+attributes
in a map in the `LoggerProvider`, or the `LoggerProvider` could maintain a registry
of all returned `Logger`s and actively update their configuration if it changes.
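
The note above can be sketched roughly as follows. Everything here is hypothetical (class shapes, `set_enabled`, `is_enabled`) and is not taken from any OpenTelemetry SDK. It only shows a `Logger` that keeps a reference to its `LoggerProvider` and looks up its configuration by the identifying fields.

```python
class Logger:
    """Hypothetical Logger that keeps no configuration of its own."""

    def __init__(self, provider, identity):
        self._provider = provider
        self.identity = identity

    @property
    def enabled(self):
        # Looked up on every access, so a change made on the provider is
        # visible to Loggers that were returned earlier.
        return self._provider.is_enabled(self.identity)


class LoggerProvider:
    """Hypothetical provider that owns all mutable configuration."""

    def __init__(self):
        self._loggers = {}      # identity -> Logger
        self._disabled = set()  # identities that were switched off

    def get_logger(self, name, version=None, schema_url=None,
                   log_category=None, include_trace_context=True,
                   attributes=None):
        # Attribute values must be hashable for this simple key to work.
        identity = (name, version, schema_url, log_category,
                    include_trace_context,
                    frozenset((attributes or {}).items()))
        if identity not in self._loggers:
            self._loggers[identity] = Logger(self, identity)
        return self._loggers[identity]

    def set_enabled(self, identity, enabled):
        if enabled:
            self._disabled.discard(identity)
        else:
            self._disabled.add(identity)

    def is_enabled(self, identity):
        return identity not in self._disabled
```

Because the `Logger` reads its state through the provider, a caller that obtained it before `set_enabled` was called sees the new configuration without fetching the `Logger` again.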

@@ -129,7 +129,7 @@ the emitted data format is capable of representing such association.

## Logger

The `Logger` is responsible for emitting Events and Logs.
The `Logger` is responsible for emitting Logs.

Note that `Logger`s should not be responsible for configuration. This should be
the responsibility of the `LoggerProvider` instead.
@@ -138,22 +138,22 @@ the responsibility of the `LoggerProvider` instead.

The Logger MUST provide functions to:

#### Emit Event
#### Emit Categorized LogRecord

Emit a `LogRecord` representing an Event to the processing pipeline.
Emit a `LogRecord` representing a Categorized LogRecord to the processing pipeline.

This function MAY be named `logEvent`.
This function MAY be named `logCategorized`.

**Parameters:**

* `name` - the Event name. This argument MUST be recorded as a `LogRecord`
attribute with the key `event.name`. Care MUST be taken by the implementation
to not override or delete this attribute while the Event is emitted to
* `name` - the log name. This argument MUST be recorded as a `LogRecord`
attribute with the key `log.name`. Care MUST be taken by the implementation
to not override or delete this attribute while the log is emitted to
preserve its identity.
* `logRecord` - the [LogRecord](#logrecord) representing the Event.
* `logRecord` - the [LogRecord](#logrecord) representing the log.

Events require the `event.domain` attribute. The API MUST not allow creating an
Event if the Logger instance doesn't have `event.domain` scope attribute.
Categorized Logs require the `log.category` attribute. The API MUST NOT allow creating a
Categorized Log if the Logger instance doesn't have `log.category` attribute.
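
One way to read these requirements is sketched below. `SketchLogger` and its `emitted` list are hypothetical, and raising `ValueError` is an assumption made for illustration: the text only says that creation must not be allowed, not how the refusal is reported.

```python
class SketchLogger:
    """Hypothetical logger illustrating the `logCategorized` rules."""

    def __init__(self, log_category=None):
        self.log_category = log_category
        self.emitted = []  # stand-in for the processing pipeline

    def log_categorized(self, name, attributes=None):
        if not self.log_category:
            # Assumed behaviour: refuse to create the record at all.
            raise ValueError("Logger has no log.category")
        record = dict(attributes or {})
        # Written last so caller-supplied attributes cannot override them.
        record["log.name"] = name
        record["log.category"] = self.log_category
        self.emitted.append(record)
        return record
```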

#### Emit LogRecord

@@ -171,8 +171,9 @@ by end users or other instrumentation.

## LogRecord

The API emits [Events](#emit-event) and [LogRecords](#emit-logrecord) using
the `LogRecord` [data model](data-model.md).
The API emits [arbitrary LogRecords](#emit-logrecord) or
[Categorized LogRecords](#emit-categorized-logrecord) using the `LogRecord`
[data model](data-model.md).

A function receiving this as an argument MUST be able to set the following
fields:
27 changes: 27 additions & 0 deletions specification/logs/semantic_conventions/categorizedlogs.md
@@ -0,0 +1,27 @@
# Semantic Convention for Categorized Logs

**Status**: [Experimental](../../document-status.md)

This document describes the attributes of Categorized Logs that are represented
by `LogRecord`s. All Categorized Logs have a name and a category. The category
is a namespace for names and is used as a mechanism to avoid conflicts of
names.

<!-- semconv event -->
| Attribute | Type | Description | Examples | Requirement Level |
|-----------------------|---|------------------------------------------------------------------------------------------------------------------------------|---|---|
| `log.name` | string | The name identifies the log type. | `click`; `exception` | Required |
| `log.category` | string | The category identifies the context in which the log is defined. A log name is unique only within a category. [1] | `browser` | Required |

**[1]:** A `log.name` is supposed to be unique only in the context of a
`log.category`, so this allows for two logs in different categories to
have the same `log.name`, yet be unrelated logs.

`log.category` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|--------------------------|
| `browser` | Events from browser apps |
| `device` | Events from mobile apps |
| `k8s` | Events from Kubernetes |
<!-- endsemconv -->
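
Since a name is unique only within its category, code that groups or looks up log types would presumably key on the pair rather than on the name alone. A small sketch, with hypothetical helper names and the well-known values copied from the table above:

```python
# Well-known values copied from the table above.
WELL_KNOWN_CATEGORIES = {"browser", "device", "k8s"}


def log_type(attributes):
    """Return the (category, name) pair that identifies a log type."""
    return (attributes["log.category"], attributes["log.name"])


def is_well_known_category(attributes):
    """True if the record uses one of the well-known category values."""
    return attributes["log.category"] in WELL_KNOWN_CATEGORIES
```

Two records named `exception` in the `browser` and `device` categories therefore yield different keys and are treated as unrelated log types.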
27 changes: 0 additions & 27 deletions specification/logs/semantic_conventions/events.md

This file was deleted.
