Incident bucket #107

Merged
merged 10 commits into from
Mar 9, 2023
Changes from 8 commits
8 changes: 6 additions & 2 deletions README.md
@@ -70,14 +70,18 @@ Definition of specific events that are fundamental to pipeline execution and orc

Handling Events relating to changes in version management of Source Code and related assets

### [Continuous Integration Events](./continuous-integration-pipeline-events.md)
### [Continuous Integration Events](./continuous-integration.md)

Handling Events associated with Continuous Integration activities, typically involving build and test

### [Continuous Deployment Events](./continuous-deployment-pipeline-events.md)
### [Continuous Deployment Events](./continuous-deployment.md)

Handling Events associated with Continuous Deployment activities

### [Continuous Operations Events](./continuous-operations.md)

Handling Events associated with the health of the services deployed and running in a specific environment

### [CloudEvents Binding and Transport](./cloudevents-binding.md)

Defining how CDEvents are mapped to CloudEvents for transportation and delivery
File renamed without changes.
File renamed without changes.
90 changes: 90 additions & 0 deletions continuous-operations.md
@@ -0,0 +1,90 @@
<!--
---
linkTitle: "Continuous Operations Events"
weight: 70
description: >
  Continuous Operations Events
---
-->
# Continuous Operations Events

Continuous Operations events relate to the operation of services deployed in target environments, and to the tracking of incidents and their resolution. Incidents and their resolution can be detected by a number of different actors, such as an end user, a quality gate, a monitoring system, an SRE working through a ticketing system, or even the service itself.

## Subjects

This specification defines one subject in this stage, the [`incident`](#incident). To quote the definition of the term from the NIST glossary, an [incident][] is:

> An occurrence that actually or potentially jeopardizes the confidentiality, integrity, or availability of an information system or the information the system processes, stores, or transmits or that constitutes a violation or imminent threat of violation of security policies, security procedures, or acceptable use policies.

| Subject | Description | Predicates |
|---------|-------------|------------|
| [`incident`](#incident) | A problem in a production environment | [`detected`](#incident-detected), [`reported`](#incident-reported), [`resolved`](#incident-resolved)|

### `incident`

An `incident` represents a problem in a production environment.

| Field | Type | Description | Examples |
|-------|------|-------------|----------|
| id | `String` | Uniquely identifies the subject within the source. | `04896C75-F34D-40FF-A584-3F2B71CB9D47`, `issue123`, `risk-CVE123` |
| source | `URI-Reference` | [source](../spec.md#source) from the context | `region1/production`, `monitoring-system/metricA`|
| description | `String` | Short, free style description of the incident | "Response time above 10ms", "New CVE-123 detected" |
| environment | `Object` ([`environment`](./continuous-deployment.md#environment)) | Reference to the environment | `{"id": "production"}`, `{"id": "staging"}`, `{"id": "prod123", "source": "iaas-region-1"}` |
| service | `Object` ([`service`](./continuous-deployment.md#service)) | Reference to the service | `{"id": "service123"}`, `{"id": "service123", "source": "region1/k8s/namespace"}` |
| artifactId | `Purl` | Identifier of the artifact deployed with this service | `0b31b1c02ff458ad9b7b81cbdf8f028bd54699fa151f221d1e8de6817db93427`, `927aa808433d17e315a258b98e2f1a55f8258e0cb782ccb76280646d0dbe17b5`, `six-1.14.0-py2.py3-none-any.whl` |

## Events

### `incident detected`

This event represents an incident that has been detected by a system or human.

- Event Type: __`dev.cdevents.incident.detected.0.1.0-draft`__
- Predicate: detected
- Subject: [`incident`](#incident)

| Field | Type | Description | Examples | Mandatory ✅ |
|-------|------|-------------|----------|----------------------------|
| id | `String` | Uniquely identifies the subject within the source. | `04896C75-F34D-40FF-A584-3F2B71CB9D47`, `issue123`, `risk-CVE123` | ✅ |
| source | `URI-Reference` | [source](../spec.md#source) from the context | `region1/production`, `monitoring-system/metricA`| |
| description | `String` | Short, free style description of the incident | "Response time above 10ms", "New CVE-123 detected" | |
| environment | `Object` ([`environment`](./continuous-deployment.md#environment)) | Reference to the environment | `{"id": "production"}`, `{"id": "staging"}`, `{"id": "prod123", "source": "iaas-region-1"}` | ✅ |
| service | `Object` ([`service`](./continuous-deployment.md#service)) | Reference to the service | `{"id": "service123"}`, `{"id": "service123", "source": "region1/k8s/namespace"}` | |
| artifactId | `Purl` | Identifier of the artifact deployed with this service | `0b31b1c02ff458ad9b7b81cbdf8f028bd54699fa151f221d1e8de6817db93427`, `927aa808433d17e315a258b98e2f1a55f8258e0cb782ccb76280646d0dbe17b5`, `six-1.14.0-py2.py3-none-any.whl` | |
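
As a hedged illustration, the following Python sketch assembles an `incident detected` event with the shape used in `examples/incidentdetected.json`. The helper name `build_incident_detected` and its parameters are hypothetical and not part of this specification; only the mandatory `id` and `environment` fields and an optional `description` are filled in.

```python
import json
import uuid
from datetime import datetime, timezone

EVENT_TYPE = "dev.cdevents.incident.detected.0.1.0-draft"


def build_incident_detected(incident_id, environment_id, source, description=None):
    """Return a dict shaped like examples/incidentdetected.json (hypothetical helper)."""
    content = {"environment": {"id": environment_id}}  # environment is mandatory
    if description is not None:
        content["description"] = description  # description is optional
    return {
        "context": {
            "version": "0.2.0-draft",
            "id": str(uuid.uuid4()),  # assumption: any unique event id will do
            "source": source,
            "type": EVENT_TYPE,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
        "subject": {
            "id": incident_id,
            "source": source,
            "type": "incident",
            "content": content,
        },
    }


event = build_incident_detected(
    "incident-123",
    "prod1",
    "/monitoring/prod1",
    description="Response time above threshold of 100ms",
)
print(json.dumps(event, indent=2))
```

The optional `service` and `artifactId` fields are left out to keep the sketch short; they would be added to `content` in the same way as `description`.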

### `incident reported`

This event represents an incident that has been reported through a ticketing system. Compared to the `detected` predicate, it introduces a ticket URI.

- Event Type: __`dev.cdevents.incident.reported.0.1.0-draft`__
- Predicate: reported
- Subject: [`incident`](#incident)

| Field | Type | Description | Examples | Mandatory ✅ |
|-------|------|-------------|----------|----------------------------|
| id | `String` | Uniquely identifies the subject within the source. | `04896C75-F34D-40FF-A584-3F2B71CB9D47`, `issue123`, `risk-CVE123` | ✅ |
| source | `URI-Reference` | [source](../spec.md#source) from the context | `region1/production`, `monitoring-system/metricA`| |
| description | `String` | Short, free style description of the incident | "Response time above 10ms", "New CVE-123 detected" | |
| environment | `Object` ([`environment`](./continuous-deployment.md#environment)) | Reference to the environment | `{"id": "production"}`, `{"id": "staging"}`, `{"id": "prod123", "source": "iaas-region-1"}` | ✅ |
| ticketURI | `URI` | URI of the ticket | `https://example.issues.com/ticket123` | ✅ |
| service | `Object` ([`service`](./continuous-deployment.md#service)) | Reference to the service | `{"id": "service123"}`, `{"id": "service123", "source": "region1/k8s/namespace"}` | |
| artifactId | `Purl` | Identifier of the artifact deployed with this service | `0b31b1c02ff458ad9b7b81cbdf8f028bd54699fa151f221d1e8de6817db93427`, `927aa808433d17e315a258b98e2f1a55f8258e0cb782ccb76280646d0dbe17b5`, `six-1.14.0-py2.py3-none-any.whl` | |
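
The difference between the `detected` and `reported` predicates can be sketched as a small mandatory-field check. This is an illustrative Python sketch, not a normative validator: the names `MANDATORY_CONTENT` and `missing_mandatory_fields` are hypothetical, and the sketch assumes the layout of the JSON examples in this change, where `id` sits on the subject and the other fields sit under `content`.

```python
# Mandatory fields per predicate, transcribed from the two tables above.
# "id" sits on the subject itself; the others sit inside subject["content"].
MANDATORY_CONTENT = {
    "detected": ["environment"],
    "reported": ["environment", "ticketURI"],
}


def missing_mandatory_fields(predicate, subject):
    """Return the names of mandatory fields absent from an incident subject."""
    missing = []
    if not subject.get("id"):
        missing.append("id")
    content = subject.get("content", {})
    for name in MANDATORY_CONTENT[predicate]:
        if name not in content:
            missing.append(name)
    return missing


subject = {
    "id": "incident-123",
    "type": "incident",
    "content": {"environment": {"id": "prod1"}},
}
print(missing_mandatory_fields("detected", subject))  # []
print(missing_mandatory_fields("reported", subject))  # ['ticketURI']
```

The same subject is complete for `detected` but incomplete for `reported` until a `ticketURI` is supplied.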

### `incident resolved`

This event represents an incident that has been resolved, meaning that the problem identified by the incident has been solved or recalled.

- Event Type: __`dev.cdevents.incident.resolved.0.1.0-draft`__
- Predicate: resolved
- Subject: [`incident`](#incident)

| Field | Type | Description | Examples | Mandatory ✅ |
|-------|------|-------------|----------|----------------------------|
| id | `String` | Uniquely identifies the subject within the source. | `04896C75-F34D-40FF-A584-3F2B71CB9D47`, `issue123`, `risk-CVE123` | ✅ |
| source | `URI-Reference` | [source](../spec.md#source) from the context | `region1/production`, `monitoring-system/metricA`| |
| description | `String` | Short, free style description of the incident resolution | "Response time restored below 10ms", "CVE-123 acknowledged as non-exploitable" | |
| environment | `Object` ([`environment`](./continuous-deployment.md#environment)) | Reference to the environment | `{"id": "production"}`, `{"id": "staging"}`, `{"id": "prod123", "source": "iaas-region-1"}` | ✅ |
| service | `Object` ([`service`](./continuous-deployment.md#service)) | Reference to the service | `{"id": "service123"}`, `{"id": "service123", "source": "region1/k8s/namespace"}` | |
| artifactId | `Purl` | Identifier of the artifact deployed with this service | `0b31b1c02ff458ad9b7b81cbdf8f028bd54699fa151f221d1e8de6817db93427`, `927aa808433d17e315a258b98e2f1a55f8258e0cb782ccb76280646d0dbe17b5`, `six-1.14.0-py2.py3-none-any.whl` | |
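
One possible use of these events, sketched here under stated assumptions, is to measure how long an incident stayed open by pairing a `resolved` event with the earlier `detected` or `reported` event for the same subject. The function names are hypothetical. Because `id` is unique only within the `source`, the sketch keys incidents on the pair of subject `source` and `id`; a consumer might instead fall back to the context `source` when the optional subject `source` is absent.

```python
from datetime import datetime


def parse_timestamp(value):
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def time_to_resolve(events):
    """Map (subject source, subject id) to seconds between opening and resolution."""
    opened = {}
    durations = {}
    ordered = sorted(events, key=lambda e: parse_timestamp(e["context"]["timestamp"]))
    for event in ordered:
        # Type looks like dev.cdevents.incident.<predicate>.<version>
        predicate = event["context"]["type"].split(".")[3]
        key = (event["subject"].get("source"), event["subject"]["id"])
        when = parse_timestamp(event["context"]["timestamp"])
        if predicate in ("detected", "reported"):
            opened.setdefault(key, when)  # keep the earliest opening event
        elif predicate == "resolved" and key in opened:
            durations[key] = (when - opened.pop(key)).total_seconds()
    return durations


events = [
    {
        "context": {
            "type": "dev.cdevents.incident.detected.0.1.0-draft",
            "timestamp": "2022-11-11T13:52:20.079Z",
        },
        "subject": {"id": "incident-123", "source": "/monitoring/prod1"},
    },
    {
        "context": {
            "type": "dev.cdevents.incident.resolved.0.1.0-draft",
            "timestamp": "2022-11-11T14:52:20.079Z",
        },
        "subject": {"id": "incident-123", "source": "/monitoring/prod1"},
    },
]
print(time_to_resolve(events))  # {('/monitoring/prod1', 'incident-123'): 3600.0}
```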

[incident]: https://csrc.nist.gov/glossary/term/incident
32 changes: 32 additions & 0 deletions examples/incidentdetected.json
@@ -0,0 +1,32 @@
{
  "context": {
    "version": "0.2.0-draft",
    "id": "F4BD2B55-B6F6-4F44-AF72-BD2D0E7A8708",
    "source": "/monitoring/prod1",
    "type": "dev.cdevents.incident.detected.0.1.0-draft",
    "timestamp": "2022-11-11T13:52:20.079Z"
  },
  "subject": {
    "id": "incident-123",
    "source": "/monitoring/prod1",
    "type": "incident",
    "content": {
      "description": "Response time above threshold of 100ms",
      "environment": {
        "id": "prod1",
        "source": "/iaas/geo1"
      },
      "service": {
        "id": "myApp",
        "source": "/clusterA/namespaceB"
      },
      "artifactId": "pkg:oci/myapp@sha256%3A0b31b1c02ff458ad9b7b81cbdf8f028bd54699fa151f221d1e8de6817db93427"
    }
  },
  "customData": {
    "metric": "responseTime",
    "threshold": "100ms",
    "value": "200ms"
  },
  "customDataContentType": "application/json"
}
33 changes: 33 additions & 0 deletions examples/incidentreported.json
@@ -0,0 +1,33 @@
{
  "context": {
    "version": "0.2.0-draft",
    "id": "F4BD2B55-B6F6-4F44-AF72-BD2D0E7A8708",
    "source": "/monitoring/prod1",
    "type": "dev.cdevents.incident.reported.0.1.0-draft",
    "timestamp": "2022-11-11T13:52:20.079Z"
  },
  "subject": {
    "id": "incident-123",
    "source": "/monitoring/prod1",
    "type": "incident",
    "content": {
      "description": "Response time above threshold of 100ms",
      "environment": {
        "id": "prod1",
        "source": "/iaas/geo1"
      },
      "service": {
        "id": "myApp",
        "source": "/clusterA/namespaceB"
      },
      "artifactId": "pkg:oci/myapp@sha256%3A0b31b1c02ff458ad9b7b81cbdf8f028bd54699fa151f221d1e8de6817db93427",
      "ticketURI": "https://my-issues.example/incidents/ticket-345"
    }
  },
  "customData": {
    "severity": "medium",
    "priority": "critical",
    "reportedBy": "userId"
  },
  "customDataContentType": "application/json"
}
32 changes: 32 additions & 0 deletions examples/incidentresolved.json
@@ -0,0 +1,32 @@
{
  "context": {
    "version": "0.2.0-draft",
    "id": "F4BD2B55-B6F6-4F44-AF72-BD2D0E7A8708",
    "source": "/monitoring/prod1",
    "type": "dev.cdevents.incident.resolved.0.1.0-draft",
    "timestamp": "2022-11-11T13:52:20.079Z"
  },
  "subject": {
    "id": "incident-123",
    "source": "/monitoring/prod1",
    "type": "incident",
    "content": {
      "description": "Response time restored below 100ms",
      "environment": {
        "id": "prod1",
        "source": "/iaas/geo1"
      },
      "service": {
        "id": "myApp",
        "source": "/clusterA/namespaceB"
      },
      "artifactId": "pkg:oci/myapp@sha256%3A0b31b1c02ff458ad9b7b81cbdf8f028bd54699fa151f221d1e8de6817db93439"
    }
  },
  "customData": {
    "metric": "responseTime",
    "threshold": "100ms",
    "value": "70ms"
  },
  "customDataContentType": "application/json"
}
139 changes: 139 additions & 0 deletions schemas/incidentdetected.json
@@ -0,0 +1,139 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://cdevents.dev/0.2.0-draft/schema/incident-detected-event",
  "properties": {
    "context": {
      "properties": {
        "version": {
          "type": "string",
          "minLength": 1
        },
        "id": {
          "type": "string",
          "minLength": 1
        },
        "source": {
          "type": "string",
          "minLength": 1,
          "format": "uri-reference"

Contributor:
Same comment for all: shouldn't this be `uri` instead of `uri-reference`? My main concern is interoperability: the reader must then have intimate knowledge of the sender, e.g. which cluster it is deployed in.

Contributor Author:
This is how it is defined in the spec (https://github.com/cdevents/spec/blob/v0.1.1/spec.md#source-context), so it is not something defined in this PR.

The initial reason is that source is a URI-reference in CloudEvents. I don't think we can really enforce a URI here, since not all event sources will have a URI associated with them, but we could do a better job of describing the format of URI-references. I'm tracking this in #29.

Contributor:
The main reason for my objection to `uri-reference` is that it allows relative URLs. Basically, a system can contain two distinct sources with the same source value, because they differ only in the host, which is not part of the source field.

Regarding this PR: these are the only events that have `uri-reference` in the schemas. I would suggest removing it here and opening a separate PR to add it to all the events. The main motivation is to keep the protocol consistent: why do we have one event with this but not all?

Contributor:
I'd say this PR should align with how the existing schemas look, and aligning the usage of uri/uri-reference should be done in #114.

Contributor Author:
I updated this PR. Alternatively, we could merge #116 first and bring back the URI-reference format here.

        },
        "type": {
          "type": "string",
          "enum": [
            "dev.cdevents.incident.detected.0.1.0-draft"
          ],
          "default": "dev.cdevents.incident.detected.0.1.0-draft"
        },
        "timestamp": {
          "type": "string",
          "format": "date-time"
        }
      },
      "additionalProperties": false,
      "type": "object",
      "required": [
        "version",
        "id",
        "source",
        "type",
        "timestamp"
      ]
    },
    "subject": {
      "properties": {
        "id": {
          "type": "string",
          "minLength": 1
        },
        "source": {
          "type": "string",
          "format": "uri-reference"
        },
        "type": {
          "type": "string",
          "enum": [
            "incident"
          ],
          "default": "incident"
        },
        "content": {
          "properties": {
            "description": {
              "type": "string"
            },
            "environment": {
              "properties": {
                "id": {
                  "type": "string",
                  "minLength": 1
                },
                "source": {
                  "type": "string",
                  "format": "uri-reference"
                }
              },
              "additionalProperties": false,
              "type": "object",
              "required": [
                "id"
              ]
            },
            "service": {
              "properties": {
                "id": {
                  "type": "string",
                  "minLength": 1
                },
                "source": {
                  "type": "string",
                  "format": "uri-reference"
                }
              },
              "additionalProperties": false,
              "type": "object",
              "required": [
                "id"
              ]
            },
            "artifactId": {
              "type": "string",
              "minLength": 1
            }
          },
          "additionalProperties": false,
          "type": "object",
          "required": [
            "environment"
          ]
        }
      },
      "additionalProperties": false,
      "type": "object",
      "required": [
        "id",
        "type",
        "content"
      ]
    },
    "customData": {
      "oneOf": [
        {
          "type": "object"
        },
        {
          "type": "string",
          "contentEncoding": "base64"
        }
      ]
    },
    "customDataContentType": {
      "type": "string"
    }
  },
  "additionalProperties": false,
  "type": "object",
  "required": [
    "context",
    "subject"
  ]
}
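
The `customData` rule in the schema above (`oneOf` an object or a base64-encoded string) can be illustrated with a small standard-library check. This is a sketch only and the function name `custom_data_is_valid` is hypothetical. It also assumes a stricter reading than some JSON Schema validators apply: `contentEncoding` may be treated as an annotation rather than enforced, so the decoding step here is an extra check added by this sketch.

```python
import base64
import binascii


def custom_data_is_valid(event):
    """Check customData against the two branches of the schema's oneOf."""
    if "customData" not in event:
        return True  # customData is not in the top-level "required" list
    data = event["customData"]
    if isinstance(data, dict):
        return True  # branch 1: any JSON object
    if isinstance(data, str):
        try:
            base64.b64decode(data, validate=True)  # branch 2: base64 string
        except (binascii.Error, ValueError):
            return False
        return True
    return False  # numbers, lists, booleans and null match neither branch


print(custom_data_is_valid({"customData": {"metric": "responseTime"}}))  # True
print(custom_data_is_valid({"customData": "aGVsbG8="}))                  # True
print(custom_data_is_valid({"customData": 42}))                          # False
```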