Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EV-4885] Improve flow log aggregation level doc #1488

Merged
merged 4 commits into from
Jun 7, 2024

Conversation

dimitri-nicolo
Copy link
Contributor

Product Version(s):

Enterprise 3.19.2, Cloud and version 3.19.1

Issue:

Link to docs preview:

SME review:

  • An SME has approved this change.

DOCS review:

  • A member of the docs team has approved this change.

Additional information:

Merge checklist:

  • Deploy preview inspected wherever changes were made
  • Build completed successfully
  • Test have passed

@dimitri-nicolo dimitri-nicolo requested a review from a team as a code owner May 17, 2024 22:22
Copy link

netlify bot commented May 17, 2024

Deploy Preview for calico-docs-preview-next ready!

Name Link
🔨 Latest commit 45de23a
🔍 Latest deploy log https://app.netlify.com/sites/calico-docs-preview-next/deploys/6661f08e64ec9b000806806b
😎 Deploy Preview https://deploy-preview-1488--calico-docs-preview-next.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
Lighthouse
Lighthouse
1 paths audited
Performance: 29 (🔴 down 26 from production)
Accessibility: 90 (no change from production)
Best Practices: 83 (no change from production)
SEO: 86 (no change from production)
PWA: -
View the detailed breakdown and full score reports

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link

netlify bot commented May 17, 2024

Deploy Preview succeeded!

Built without sensitive environment variables

Name Link
🔨 Latest commit 45de23a
🔍 Latest deploy log https://app.netlify.com/sites/tigera/deploys/6661f08dd36bb20008147688
😎 Deploy Preview https://deploy-preview-1488--tigera.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
Lighthouse
Lighthouse
1 paths audited
Performance: 32 (no change from production)
Accessibility: 90 (no change from production)
Best Practices: 75 (no change from production)
SEO: 93 (no change from production)
PWA: -
View the detailed breakdown and full score reports

To edit notification comments on pull requests, go to your Netlify site configuration.

|-----------|-------------------------------------|-------------------------------------------------------------------|
| 0 | | No aggregation |
| 1 | AnyProcessInSameSourcePod | Identity fields below source pod level are masked out. It means that if multiple processes or containers, within the same source pod, perform the same operation, the events are aggregated. |
| 2 | AnyProcessInSameSourcePodPrefix | Identity fields below source pod-prefix level are masked out. It means that if multiple processes or containers, within pods with the same prefix, perform the same operation, the events are aggregated. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| 2 | AnyProcessInSameSourcePodPrefix | Identity fields below source pod-prefix level are masked out. It means that if multiple processes or containers, within pods with the same prefix, perform the same operation, the events are aggregated. |
| 2 | AnyProcessInSameSourcePodPrefix | In addition to the above, source pod names are aggregated based on their shared prefixes. This means that flows, to the same destination, from pods within the same Deployment/ReplicaSet are aggregated together. |

"pod-prefix" is an implementation detail; I think it;d help to link back to user concepts such as ReplicaSets/Deployments

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, I've added it and adapted the level 3 description accordingly.

Copy link
Member

@fasaxc fasaxc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks a lot clearer overall. Do we document the detailed flow log JSON somewhere else? Don't want to lose that completely, but I agree that it's confusing in this document so +1 for removing it here.

@dimitri-nicolo
Copy link
Contributor Author

Looks a lot clearer overall. Do we document the detailed flow log JSON somewhere else? Don't want to lose that completely, but I agree that it's confusing in this document so +1 for removing it here.

What about adding an example flow log section to the end of the flow logs page? I've added a no aggregation flow log, with a brief on some of the fields, along with a link to the aggregation page for more info on other levels. Check out the Calico Enterprise logs > Flow Logs > Flow Log data types page at the end.

Copy link
Member

@rene-dekker rene-dekker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, much improved compared with the old version.

Copy link
Member

@nelljerram nelljerram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement. I've left a few more detailed comments, but happy to leave them to your judgement.

| 0 | | No aggregation |
| 1 | AnyProcessInSameSourcePod | Identity fields below source pod level are masked out. It means that if multiple processes or containers, within the same source pod, perform the same operation, the events are aggregated. |
| 2 | AnyProcessInSameSourcePodPrefix | In addition to the above, source pod names are aggregated based on their shared prefixes. This means that flows, to the same destination, from pods within the same Deployment/ReplicaSet are aggregated together. |
| 3 | AnyProcessInSamePodPrefix | This level of aggregation builds on the previous two levels and also groups destination pod names based on their shared prefixes. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like maybe going too far in copying the RS doc. But I see that it does broadly work, and in particular that AnyProcessInSameSourcePod does correspond to source port masking. That said:

  • Would something like AnyConnectionFromSameSourcePod be better in this context than AnyProcessInSameSourcePod, to put the focus more on network connections than on processes? (Although it is correct that connections have to made by processes...) Then Level 3 could be AnyConnectionBetweenSamePodPrefixes. (Although also need to incorporate Shaun's point here about not saying "pod prefix" if possible.)

  • Similarly, perhaps "make the same connection to a destination" instead of "perform the same operation"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I see you've addressed the point about "pod prefix" below.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm finding it hard replacing pod prefix without creating a very long name. I feel the items have to be read along with their description.

I could replace:
AnyConnectionFromSamePodPrefix with AnyConnectionFromSamePodController
AnyConnectionBetweenSamePodPrefixes with AnyConnectionBetweenSamePodControllers

but I feel we're just replacing the word, without applying the intended change. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, the words you already added to explain pod prefix suffice. (I.e. the "In Kubernetes ..." paragraph.) Given that you've done that, I would say it's OK to keep using PodPrefix here.

type minimizes the flow logs generated for traffic coming from different containers within the same pod and going to the same destination endpoint
and port. The two flows originating from `client-a` without aggregation are combined into one.

In Kubernetes, ReplicaSets and StatefulSets can automatically create names for pods. For example, the pods `nginx-1` and `nginx-2` are created by the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth mentioning Deployments and DaemonSets here too. In particular I would guess that Deployment is the most common usage pattern.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(If you need an umbrella term for all of those, I believe it is "pod controller".)

```

The log shows an incoming connection reported by the "Destination" node, allowed by a policy on port 80. The flows in the log are grouped using a
5-minute aggregation interval, calculated as **`end_time`** - **`start_time`**. During this interval, one flow (**`"num_flow": 1`**) was recorded. At
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"calculated" feels wrong here. I think better to say that start_time and end_time describe the aggregation period.

Copy link
Member

@fasaxc fasaxc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a nice improvement, thanks!

@ctauchen
Copy link
Collaborator

ctauchen commented Jun 4, 2024

@dimitri-nicolo I've been keeping an eye on this and waiting for the technical reviews. Let me know when you're ready for me to review.

@dimitri-nicolo
Copy link
Contributor Author

@ctauchen This is now ready for your review, thanks!

Copy link
Collaborator

@ctauchen ctauchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dimitri-nicolo Thanks, a few formatting and typo-level fixes for you. Othwerwise LGTM.

@@ -101,3 +101,66 @@ Where,
action for the tier. In this case, the `<policy name>` is selected arbitrarily from the set of policies within
the tier that apply to the endpoint.
* `-2` means "unknown". The rule index was not recorded.

### Flow log example, with **no aggregation**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### Flow log example, with **no aggregation**
### Flow log example, with `no aggregation`


### Flow log example, with **no aggregation**

A flow log with aggregation level 0, **`no aggregation`**, might look like:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
A flow log with aggregation level 0, **`no aggregation`**, might look like:
A flow log with aggregation level 0, `no aggregation`, might look like:

No need for bold when we're already using a code font. Should remove throughout.

}
```

The log shows an incoming connection reported by the "Destination" node, allowed by a policy on port 80. The **`start_time`** and **`end_time`**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The log shows an incoming connection reported by the "Destination" node, allowed by a policy on port 80. The **`start_time`** and **`end_time`**
The log shows an incoming connection reported by the destination node, allowed by a policy on port 80. The **`start_time`** and **`end_time`**

The log shows an incoming connection reported by the "Destination" node, allowed by a policy on port 80. The **`start_time`** and **`end_time`**
describe the aggregation period (5 min.) During this interval, one flow (**`"num_flow": 1`**) was recorded. At higher aggregation levels, flows from
endpoints performing the same operation and originating from the same Deployment/ReplicaSet are grouped into a single log. In this example, the
common source endpoints are prefixed with **`access-6b687c8dcb-`**. Parameters like **`source_ip`** may be dropped and set to **`null`** depending on
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
common source endpoints are prefixed with **`access-6b687c8dcb-`**. Parameters like **`source_ip`** may be dropped and set to **`null`** depending on
common source endpoints that are prefixed with **`access-6b687c8dcb-`**. Parameters like **`source_ip`** may be dropped and set to **`null`**, depending on


Finally, with `AnyConnectionBetweenSamePodPrefixes` we combine source and destination pods that are part of the same pod controller. With level 3, the flow logs
are aggregated by the destination port and protocol, as long as they originate from pods with the same pod-prefix and destined for pods of the same
pod-prefix. All logs previously distinct, are aggregated into a single flow log (see the last row).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pod-prefix. All logs previously distinct, are aggregated into a single flow log (see the last row).
pod-prefix. Previously distinct logs are aggregated into a single flow log (see the last row).

```

### Change default aggregation level

Before [changing the default aggregation level](../../../reference/resources/felixconfig.mdx#aggregationkind), note the following:

- Although any change in aggregation level affects flow log volume, lowering the aggregation number (especially to `0` for no aggregation) will cause significant impacts to log storage. If you allow more flow logs, ensure that you provision more log storage.
- Verify that the parameters that you want to see in your aggregation level, are not already [filtered](filtering.mdx)
- Verify that the parameters that you want to see in your aggregation level, are not already [filtered](filtering.mdx).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Verify that the parameters that you want to see in your aggregation level, are not already [filtered](filtering.mdx).
- Verify that the parameters that you want to see in your aggregation level are not already [filtered](filtering.mdx).

@dimitri-nicolo
Copy link
Contributor Author

@ctauchen thanks for the comments, I've address the changes in the latest commit

@ctauchen ctauchen merged commit 71ad753 into tigera:main Jun 7, 2024
9 of 10 checks passed
@dimitri-nicolo dimitri-nicolo deleted the dimitri-EV-4885 branch June 7, 2024 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants