[EV-4885] Add example to flow log page. Improve aggr desc.
dimitri-nicolo committed May 28, 2024
1 parent 9c13bd4 commit 46196b6
Showing 8 changed files with 284 additions and 16 deletions.
6 changes: 3 additions & 3 deletions calico-cloud/visibility/elastic/flow/aggregation.mdx
@@ -33,8 +33,8 @@ The following table summarizes the aggregation levels by flow log traffic:
|-----------|-------------------------------------|-------------------------------------------------------------------|
| 0 | | No aggregation |
| 1 | AnyProcessInSameSourcePod | Identity fields below source pod level are masked out. It means that if multiple processes or containers, within the same source pod, perform the same operation, the events are aggregated. |
-| 2 | AnyProcessInSameSourcePodPrefix | Identity fields below source pod-prefix level are masked out. It means that if multiple processes or containers, within pods with the same prefix, perform the same operation, the events are aggregated. |
-| 3 | AnyProcessInSamePodPrefix | Identity fields below source and destination pod-prefix level are masked out. It means that if multiple processes or containers, within pods with the same prefix, perform the same operation towards pods with the same prefix, the events are aggregated. |
+| 2 | AnyProcessInSameSourcePodPrefix | In addition to the above, source pod names are aggregated based on their shared prefixes. This means that flows to the same destination from pods within the same Deployment or ReplicaSet are aggregated together. |
+| 3 | AnyProcessInSamePodPrefix | This level of aggregation builds on the previous two levels and also groups destination pod names based on their shared prefixes. |

### Understanding aggregation level differences

@@ -45,7 +45,7 @@ type minimizes the flow logs generated for traffic coming from different contain
and port. The two flows originating from `client-a` without aggregation are combined into one.

In Kubernetes, ReplicaSets and StatefulSets can automatically create names for pods. For example, the pods `nginx-1` and `nginx-2` are created by the
-ReplicaSet nginx. The ReplicaSet name is considered a pod-prefix and is used to aggregate flow log entries (indicated with an asterisk * at the end
+ReplicaSet `nginx`. The ReplicaSet name is considered a pod-prefix and is used to aggregate flow log entries (indicated with an asterisk * at the end
of the name). Flow logs originating from pods with the same prefix are aggregated as long as the traffic uses the same protocol and is destined for
the same IP and destination port. The three flow logs without aggregation originating from `client-a` and `client-b` are combined into a
single flow log. This aggregation level is called `AnyProcessInSameSourcePodPrefix`.
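
The effect of each level can be sketched in a few lines of Python. This is a minimal illustration, not Calico's implementation: the `MASKED_FIELDS` mapping is an assumption inferred from the table above, and the sample pods, IPs, and ports are hypothetical values chosen to mirror the `client-a`/`client-b` description. Field names are borrowed from the flow log format.

```python
from collections import Counter

# Fields that identify a flow in this sketch (names borrowed from the flow log format).
KEY_FIELDS = (
    "source_name", "source_name_aggr", "source_ip", "source_port",
    "dest_name", "dest_name_aggr", "dest_ip", "dest_port", "proto",
)

# Assumption: which fields each level masks out, inferred from the table above.
MASKED_FIELDS = {
    0: set(),
    1: {"source_port"},
    2: {"source_port", "source_ip", "source_name"},
    3: {"source_port", "source_ip", "source_name", "dest_ip", "dest_name"},
}


def aggregation_key(flow, level):
    """Return the grouping key for a flow, with masked fields replaced by None."""
    masked = MASKED_FIELDS[level]
    return tuple(None if field in masked else flow[field] for field in KEY_FIELDS)


def aggregate(flows, level):
    """Count how many raw flows collapse into each aggregated flow log."""
    return Counter(aggregation_key(flow, level) for flow in flows)


def make_flow(src, src_ip, src_port, dst, dst_ip):
    """Build a hypothetical TCP flow to port 80 between two sample pods."""
    return {
        "source_name": src, "source_name_aggr": "client-*",
        "source_ip": src_ip, "source_port": src_port,
        "dest_name": dst, "dest_name_aggr": "nginx-*",
        "dest_ip": dst_ip, "dest_port": 80, "proto": "tcp",
    }


# Hypothetical sample traffic: two flows from client-a, two from client-b.
FLOWS = [
    make_flow("client-a", "10.0.0.1", 45760, "nginx-1", "10.0.1.1"),
    make_flow("client-a", "10.0.0.1", 45761, "nginx-1", "10.0.1.1"),
    make_flow("client-b", "10.0.0.2", 45762, "nginx-1", "10.0.1.1"),
    make_flow("client-b", "10.0.0.2", 45763, "nginx-2", "10.0.1.2"),
]

for level in range(4):
    print(f"level {level}: {len(aggregate(FLOWS, level))} flow log(s)")
```

With this sample data, the four raw flows collapse into three, two, and one flow logs at levels 1, 2, and 3 respectively.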
67 changes: 67 additions & 0 deletions calico-cloud/visibility/elastic/flow/datatypes.mdx
@@ -101,3 +101,70 @@ Where,
action for the tier. In this case, the `<policy name>` is selected arbitrarily from the set of policies within
the tier that apply to the endpoint.
* `-2` means "unknown". The rule index was not recorded.

### Flow log example, with **no aggregation**

When the aggregation level is set to **no aggregation**, the flow log looks like the following:

```
{
  "start_time": 1597166083,
  "end_time": 1597166383,
  "source_ip": "192.168.47.9",
  "source_name": "access-6b687c8dcb-zn5s2",
  "source_name_aggr": "access-6b687c8dcb-*",
  "source_namespace": "policy-demo",
  "source_port": 42106,
  "source_type": "wep",
  "source_labels": {
    "labels": [
      "pod-template-hash=6b687c8dcb",
      "app=access"
    ]
  },
  "dest_ip": "192.168.138.79",
  "dest_name": "nginx-86c57db685-h6792",
  "dest_name_aggr": "nginx-86c57db685-*",
  "dest_namespace": "policy-demo",
  "dest_port": 80,
  "dest_type": "wep",
  "dest_labels": {
    "labels": [
      "pod-template-hash=86c57db685",
      "app=nginx"
    ]
  },
  "proto": "tcp",
  "action": "allow",
  "reporter": "dst",
  "policies": {
    "all_policies": [
      "0|default|policy-demo/default.access-nginx|allow"
    ]
  },
  "bytes_in": 388,
  "bytes_out": 1113,
  "num_flows": 1,
  "num_flows_started": 1,
  "num_flows_completed": 1,
  "packets_in": 6,
  "packets_out": 5,
  "http_requests_allowed_in": 0,
  "http_requests_denied_in": 0,
  "original_source_ips": null,
  "num_original_source_ips": 0,
  "host": "bz-n8kf-kadm-node-1",
  "@timestamp": 1597166383000
}
```

* The aggregation interval is the difference between `start_time` (`1597166083`) and `end_time` (`1597166383`), which is 300 seconds in this example.
* Workload endpoints with the prefix `access-6b687c8dcb-*` in the `policy-demo` namespace connected to workload endpoints (pods) with the prefix `nginx-86c57db685-*`, which expose a service on port 80.
* The source workload endpoints have the labels `app=access` and `pod-template-hash=6b687c8dcb`, and the destination workload endpoints have the labels `app=nginx` and `pod-template-hash=86c57db685`.
* This log records an incoming connection reported by the destination node (`"reporter": "dst"`), and a policy allowed the connection (`"action": "allow"`).
* There are 6 incoming packets and 5 outgoing packets.
* The `num_flows` field indicates how many flows were aggregated together within the reported interval. As the aggregation level increases, you can expect more flows to be grouped together, depending on your data.

At higher aggregation levels, some fields, such as `source_port` or `source_ip`, may display `null` values. This signifies that flows
from the same source that perform the same operation have been aggregated into a single flow log. For more information on
aggregation levels, see [configure flow log aggregation](./aggregation.mdx).
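
The observations above can be checked programmatically. The following is a minimal sketch, assuming the flow log is available as a JSON string. The `parse_labels` and `summarize` helpers and the trimmed `RAW_LOG` excerpt are illustrative and are not part of any Calico tooling.

```python
import json

# Trimmed excerpt of the example flow log above (only the fields used here).
RAW_LOG = """
{
  "start_time": 1597166083,
  "end_time": 1597166383,
  "source_name_aggr": "access-6b687c8dcb-*",
  "source_labels": {"labels": ["pod-template-hash=6b687c8dcb", "app=access"]},
  "dest_name_aggr": "nginx-86c57db685-*",
  "dest_labels": {"labels": ["pod-template-hash=86c57db685", "app=nginx"]},
  "dest_port": 80,
  "action": "allow",
  "reporter": "dst",
  "num_flows": 1,
  "packets_in": 6,
  "packets_out": 5
}
"""


def parse_labels(label_list):
    """Turn ["key=value", ...] into a dict, splitting on the first "=" only."""
    return dict(label.split("=", 1) for label in label_list)


def summarize(raw):
    """Extract the values discussed in the bullet points (hypothetical helper)."""
    log = json.loads(raw)
    return {
        "interval_seconds": log["end_time"] - log["start_time"],
        "source_labels": parse_labels(log["source_labels"]["labels"]),
        "dest_labels": parse_labels(log["dest_labels"]["labels"]),
        "total_packets": log["packets_in"] + log["packets_out"],
        "reported_by_destination": log["reporter"] == "dst",
        "num_flows": log["num_flows"],
    }


summary = summarize(RAW_LOG)
print(summary["interval_seconds"])      # 300
print(summary["source_labels"]["app"])  # access
print(summary["dest_labels"]["app"])    # nginx
```

Splitting each label on the first `=` only keeps values that themselves contain `=` intact.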
@@ -32,9 +32,9 @@ The following table summarizes the aggregation levels by flow log traffic:
| **Level** | **Name** | **Description** |
|-----------|-------------------------------------|-------------------------------------------------------------------|
| 0 | | No aggregation |
-| 1 | AnyProcessInSameSourcePod | Identity fields below source pod level are masked out. It means that if multiple processes or containers, within the same source pod, perform the same operation, the events are aggregated. |
-| 2 | AnyProcessInSameSourcePodPrefix | Identity fields below source pod-prefix level are masked out. It means that if multiple processes or containers, within pods with the same prefix, perform the same operation, the events are aggregated. |
-| 3 | AnyProcessInSamePodPrefix | Identity fields below source and destination pod-prefix level are masked out. It means that if multiple processes or containers, within pods with the same prefix, perform the same operation towards pods with the same prefix, the events are aggregated. |
+| 1 | AnyProcessInSameSourcePod | Identity fields below source pod level are masked out. It means that if multiple processes or containers, within the same source pod, perform the same operation, the events are aggregated. |
+| 2 | AnyProcessInSameSourcePodPrefix | In addition to the above, source pod names are aggregated based on their shared prefixes. This means that flows to the same destination from pods within the same Deployment or ReplicaSet are aggregated together. |
+| 3 | AnyProcessInSamePodPrefix | This level of aggregation builds on the previous two levels and also groups destination pod names based on their shared prefixes. |

### Understanding aggregation level differences

@@ -45,7 +45,7 @@ type minimizes the flow logs generated for traffic coming from different contain
and port. The two flows originating from `client-a` without aggregation are combined into one.

In Kubernetes, ReplicaSets and StatefulSets can automatically create names for pods. For example, the pods `nginx-1` and `nginx-2` are created by the
-ReplicaSet nginx. The ReplicaSet name is considered a pod-prefix and is used to aggregate flow log entries (indicated with an asterisk * at the end
+ReplicaSet `nginx`. The ReplicaSet name is considered a pod-prefix and is used to aggregate flow log entries (indicated with an asterisk * at the end
of the name). Flow logs originating from pods with the same prefix are aggregated as long as the traffic uses the same protocol and is destined for
the same IP and destination port. The three flow logs without aggregation originating from `client-a` and `client-b` are combined into a
single flow log. This aggregation level is called `AnyProcessInSameSourcePodPrefix`.
@@ -101,3 +101,70 @@ Where,
action for the tier. In this case, the `<policy name>` is selected arbitrarily from the set of policies within
the tier that apply to the endpoint.
* `-2` means "unknown". The rule index was not recorded.

### Flow log example, with **no aggregation**

When the aggregation level is set to **no aggregation**, the flow log looks like the following:

```
{
  "start_time": 1597166083,
  "end_time": 1597166383,
  "source_ip": "192.168.47.9",
  "source_name": "access-6b687c8dcb-zn5s2",
  "source_name_aggr": "access-6b687c8dcb-*",
  "source_namespace": "policy-demo",
  "source_port": 42106,
  "source_type": "wep",
  "source_labels": {
    "labels": [
      "pod-template-hash=6b687c8dcb",
      "app=access"
    ]
  },
  "dest_ip": "192.168.138.79",
  "dest_name": "nginx-86c57db685-h6792",
  "dest_name_aggr": "nginx-86c57db685-*",
  "dest_namespace": "policy-demo",
  "dest_port": 80,
  "dest_type": "wep",
  "dest_labels": {
    "labels": [
      "pod-template-hash=86c57db685",
      "app=nginx"
    ]
  },
  "proto": "tcp",
  "action": "allow",
  "reporter": "dst",
  "policies": {
    "all_policies": [
      "0|default|policy-demo/default.access-nginx|allow"
    ]
  },
  "bytes_in": 388,
  "bytes_out": 1113,
  "num_flows": 1,
  "num_flows_started": 1,
  "num_flows_completed": 1,
  "packets_in": 6,
  "packets_out": 5,
  "http_requests_allowed_in": 0,
  "http_requests_denied_in": 0,
  "original_source_ips": null,
  "num_original_source_ips": 0,
  "host": "bz-n8kf-kadm-node-1",
  "@timestamp": 1597166383000
}
```

* The aggregation interval is the difference between `start_time` (`1597166083`) and `end_time` (`1597166383`), which is 300 seconds in this example.
* Workload endpoints with the prefix `access-6b687c8dcb-*` in the `policy-demo` namespace connected to workload endpoints (pods) with the prefix `nginx-86c57db685-*`, which expose a service on port 80.
* The source workload endpoints have the labels `app=access` and `pod-template-hash=6b687c8dcb`, and the destination workload endpoints have the labels `app=nginx` and `pod-template-hash=86c57db685`.
* This log records an incoming connection reported by the destination node (`"reporter": "dst"`), and a policy allowed the connection (`"action": "allow"`).
* There are 6 incoming packets and 5 outgoing packets.
* The `num_flows` field indicates how many flows were aggregated together within the reported interval. As the aggregation level increases, you can expect more flows to be grouped together, depending on your data.

At higher aggregation levels, some fields, such as `source_port` or `source_ip`, may display `null` values. This signifies that flows
from the same source that perform the same operation have been aggregated into a single flow log. For more information on
aggregation levels, see [configure flow log aggregation](./aggregation.mdx).
10 changes: 5 additions & 5 deletions calico-enterprise/visibility/elastic/flow/aggregation.mdx
@@ -32,9 +32,9 @@ The following table summarizes the aggregation levels by flow log traffic:
| **Level** | **Name** | **Description** |
|-----------|-------------------------------------|-------------------------------------------------------------------|
| 0 | | No aggregation |
-| 1 | AnyProcessInSameSourcePod | Identity fields below source pod level are masked out. It means that if multiple processes or containers, within the same source pod, perform the same operation, the events are aggregated. |
-| 2 | AnyProcessInSameSourcePodPrefix | Identity fields below source pod-prefix level are masked out. It means that if multiple processes or containers, within pods with the same prefix, perform the same operation, the events are aggregated. |
-| 3 | AnyProcessInSamePodPrefix | Identity fields below source and destination pod-prefix level are masked out. It means that if multiple processes or containers, within pods with the same prefix, perform the same operation towards pods with the same prefix, the events are aggregated. |
+| 1 | AnyProcessInSameSourcePod | Identity fields below source pod level are masked out. It means that if multiple processes or containers, within the same source pod, perform the same operation, the events are aggregated. |
+| 2 | AnyProcessInSameSourcePodPrefix | In addition to the above, source pod names are aggregated based on their shared prefixes. This means that flows to the same destination from pods within the same Deployment or ReplicaSet are aggregated together. |
+| 3 | AnyProcessInSamePodPrefix | This level of aggregation builds on the previous two levels and also groups destination pod names based on their shared prefixes. |

### Understanding aggregation level differences

@@ -45,14 +45,14 @@ type minimizes the flow logs generated for traffic coming from different contain
and port. The two flows originating from `client-a` without aggregation are combined into one.

In Kubernetes, ReplicaSets and StatefulSets can automatically create names for pods. For example, the pods `nginx-1` and `nginx-2` are created by the
-ReplicaSet nginx. The ReplicaSet name is considered a pod-prefix and is used to aggregate flow log entries (indicated with an asterisk * at the end
+ReplicaSet `nginx`. The ReplicaSet name is considered a pod-prefix and is used to aggregate flow log entries (indicated with an asterisk * at the end
of the name). Flow logs originating from pods with the same prefix are aggregated as long as the traffic uses the same protocol and is destined for
the same IP and destination port. The three flow logs without aggregation originating from `client-a` and `client-b` are combined into a
single flow log. This aggregation level is called `AnyProcessInSameSourcePodPrefix`.

Finally, with `AnyProcessInSamePodPrefix`, source and destination pods that are part of the same ReplicaSets are combined. With level 3, the flow logs
are aggregated by the destination port and protocol, as long as they originate from pods with the same pod-prefix and are destined for pods of the same
-pod-prefix. All logs previously distinct, are aggregated with into a single flow log (see the last row).
+pod-prefix. All previously distinct logs are aggregated into a single flow log (see the last row).

| | | **Src Traffic** | | | **Dst Traffic** | | | **Packet counts** | |
|--------------------------|-----------|----------|---------|----------|----------|---------|----------|------------|-------------|
67 changes: 67 additions & 0 deletions calico-enterprise/visibility/elastic/flow/datatypes.mdx
@@ -101,3 +101,70 @@ Where,
action for the tier. In this case, the `<policy name>` is selected arbitrarily from the set of policies within
the tier that apply to the endpoint.
* `-2` means "unknown". The rule index was not recorded.

### Flow log example, with **no aggregation**

When the aggregation level is set to **no aggregation**, the flow log looks like the following:

```
{
  "start_time": 1597166083,
  "end_time": 1597166383,
  "source_ip": "192.168.47.9",
  "source_name": "access-6b687c8dcb-zn5s2",
  "source_name_aggr": "access-6b687c8dcb-*",
  "source_namespace": "policy-demo",
  "source_port": 42106,
  "source_type": "wep",
  "source_labels": {
    "labels": [
      "pod-template-hash=6b687c8dcb",
      "app=access"
    ]
  },
  "dest_ip": "192.168.138.79",
  "dest_name": "nginx-86c57db685-h6792",
  "dest_name_aggr": "nginx-86c57db685-*",
  "dest_namespace": "policy-demo",
  "dest_port": 80,
  "dest_type": "wep",
  "dest_labels": {
    "labels": [
      "pod-template-hash=86c57db685",
      "app=nginx"
    ]
  },
  "proto": "tcp",
  "action": "allow",
  "reporter": "dst",
  "policies": {
    "all_policies": [
      "0|default|policy-demo/default.access-nginx|allow"
    ]
  },
  "bytes_in": 388,
  "bytes_out": 1113,
  "num_flows": 1,
  "num_flows_started": 1,
  "num_flows_completed": 1,
  "packets_in": 6,
  "packets_out": 5,
  "http_requests_allowed_in": 0,
  "http_requests_denied_in": 0,
  "original_source_ips": null,
  "num_original_source_ips": 0,
  "host": "bz-n8kf-kadm-node-1",
  "@timestamp": 1597166383000
}
```

* The aggregation interval is the difference between `start_time` (`1597166083`) and `end_time` (`1597166383`), which is 300 seconds in this example.
* Workload endpoints with the prefix `access-6b687c8dcb-*` in the `policy-demo` namespace connected to workload endpoints (pods) with the prefix `nginx-86c57db685-*`, which expose a service on port 80.
* The source workload endpoints have the labels `app=access` and `pod-template-hash=6b687c8dcb`, and the destination workload endpoints have the labels `app=nginx` and `pod-template-hash=86c57db685`.
* This log records an incoming connection reported by the destination node (`"reporter": "dst"`), and a policy allowed the connection (`"action": "allow"`).
* There are 6 incoming packets and 5 outgoing packets.
* The `num_flows` field indicates how many flows were aggregated together within the reported interval. As the aggregation level increases, you can expect more flows to be grouped together, depending on your data.

At higher aggregation levels, some fields, such as `source_port` or `source_ip`, may display `null` values. This signifies that flows
from the same source that perform the same operation have been aggregated into a single flow log. For more information on
aggregation levels, see [configure flow log aggregation](./aggregation.mdx).