CFP-2023-05-17: Hubble Flow Logs #25508
**Motivation**

Many users of Kubernetes are interested in collecting information about traffic patterns from certain workloads for security and audit purposes. They would like to use their existing observability stack (for example Loki+Grafana) instead of introducing and maintaining a new tool such as Hubble Timescape.

**Proposal**

Add a new CRD that will be watched by Cilium.

**Definition**

```yaml
# Not a real CRD
kind: flowlogging.cilium.io
spec:
  properties:
    fieldmask:
      description: "List of field names that will be kept in the log output"
      type: array
      items:
        description: "Field name that will be kept in the output. Subfields are separated by dots"
        type: string
    filter:
      description: "Filter"
      type: object
      properties:
        allowlist:
          description: "Allowlist filter generated by the CLI. Empty matches all. A flow passes the filter if any of the entries match. Only flows matching the allowlist, but not the denylist, will be logged"
          type: array
          items:
            description: "Allowlist entry"
            type: string
        denylist:
          description: "Denylist filter generated by the CLI. Empty matches none. A flow passes the filter if none of the entries match. Only flows matching the allowlist, but not the denylist, will be logged"
          type: array
          items:
            description: "Denylist entry"
            type: string
    end:
      description: "End denotes the timestamp when logging will stop"
      type: string
```

**Examples**

Collecting flow logs for a specific user project:

```yaml
spec:
  fieldmask:
    - "time"
    - "source.pod_name"
    - "destination.pod_name"
    - "verdict"
  filter:
    allowlist:
      - '{"source_pod":["user-project/"]}'
      - '{"destination_pod":["user-project/"]}'
    denylist:
      - '{"source_label":["k8s:k8s-app=kube-dns"]}'
      - '{"destination_label":["k8s:k8s-app=kube-dns"]}'
  end: "2023-05-10T15:04:05-07:00"
```

Collecting flow logs for an entire organization:

```yaml
spec:
  fieldmask:
  filter:
  end: "2023-05-10T15:04:05-07:00"
```

Stopping flow log collection:

```yaml
spec:
  fieldmask:
  filter:
  end: "1970-01-01T00:00:00"
```
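The allowlist/denylist entries in the examples are JSON-encoded filter objects produced by the Hubble CLI. As a rough illustration of how one such entry could be evaluated, here is a minimal sketch; the flat `map[string]string` flow representation, the `matchEntry` helper, and prefix matching (as implied by the `user-project/` examples) are all assumptions of this sketch, not the real Hubble filter implementation:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// matchEntry reports whether a single allowlist/denylist entry (a JSON
// object mapping filter keys to lists of value prefixes) matches the
// given flow attributes. Real flows are protobuf messages; a flat map
// keeps this sketch self-contained.
func matchEntry(entry string, flow map[string]string) (bool, error) {
	var f map[string][]string
	if err := json.Unmarshal([]byte(entry), &f); err != nil {
		return false, err
	}
	for key, prefixes := range f {
		val, ok := flow[key]
		if !ok {
			return false, nil
		}
		anyPrefix := false
		for _, p := range prefixes {
			if strings.HasPrefix(val, p) {
				anyPrefix = true
				break
			}
		}
		if !anyPrefix {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	entry := `{"source_pod":["user-project/"]}`
	matched, err := matchEntry(entry, map[string]string{"source_pod": "user-project/frontend"})
	if err != nil {
		panic(err)
	}
	fmt.Println(matched) // the pod is in the user-project namespace, so this matches
}
```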
Thinking out loud: I wonder if this would already be possible with something like https://github.com/cilium/hubble-otel. You could send all of the flows through OTel and do all filtering/storage that way.
Thanks for pointing it out to me. This may indeed cover my use case. However, as it is a separate component, I wonder if it doesn't waste too many resources by requiring every flow to go through gRPC. I need to investigate it further. I hope to have results next week.
It totally depends on your use case. OTel Collectors can run as a DaemonSet, so you could potentially send it all via a Unix domain socket and then filter using OTel processors.
Unfortunately there are too many issues with the OTel approach:
That's too bad. Given that the CFP here would require engineering work either way, I think the effort would be better directed at fixing up the OTel approach rather than adding this directly into Hubble. That would immediately fix issues 1 and 2 outlined above. Also, if there were a robust OTel approach available, it would allow consumers of this data to plug it into their existing data pipelines very easily, rather than needing to spin up new tooling. As for resource consumption: the logic to handle the flows needs to happen somewhere, whether it's in the OTel pod or elsewhere, so I think optimizing it within the pipeline makes sense either way.
We already have very similar functionality, so I would say that over 60% of the work for this is already done.
It is still surprising to me, as other components (Hubble CLI/Relay) use ~20 MB. It seems to be mainly caused by OTel boilerplate, as I see a similar collector using 110 MB while doing pretty much nothing.
I see that it is possible to run it. I started working on getting it working again. Let's see if I need to become a maintainer to change it quickly to suit my needs 😄
As discussed on Slack, https://github.com/cilium/hubble-otel is unfortunately abandoned and there is no willing maintainer to take care of it.

About the proposal: instead of having JSON embedded in YAML, I think it is better to just use flow.Filter directly. The Go struct will look like this:

```go
type FlowLogging struct {
	// List of field names that will be kept in the log output
	FieldMask []string
	// Filters specifies flows to include in the logging output. A flow passes the
	// filter if any of the entries match. Only flows matching the filter, but
	// not matching exclude-filter, will be logged. The check is disabled when
	// empty.
	Filters        []flow.Filter
	ExcludeFilters []flow.Filter
	// Expires denotes the timestamp when logging will stop.
	Expires metav1.Time
}
```

And the first example:

```yaml
spec:
  fieldMask:
    - "time"
    - "source.pod_name"
    - "destination.pod_name"
    - "verdict"
  filters:
    - source_pod:
        - "user-project/"
    - destination_pod:
        - "user-project/"
  excludeFilters:
    - source_label:
        - "k8s:k8s-app=kube-dns"
    - destination_label:
        - "k8s:k8s-app=kube-dns"
  expires: "2023-05-10T15:04:05-07:00"
```

The latest definition is part of #26646.
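To illustrate the include/exclude semantics described in the struct comment (a flow is logged if any `Filters` entry matches and no `ExcludeFilters` entry matches, with an empty include list matching all), here is a minimal self-contained sketch. The `Flow`, `Filter`, and `ShouldLog` types and their prefix-matching behavior are simplified stand-ins invented for this sketch, not the real `flow.Filter` proto:

```go
package main

import (
	"fmt"
	"strings"
)

// Flow is a minimal stand-in for a Hubble flow; only the fields used by
// the sketch are present.
type Flow struct {
	SourcePod      string
	DestinationPod string
}

// Filter is a simplified stand-in for flow.Filter: a flow matches if
// every non-empty field has at least one matching prefix.
type Filter struct {
	SourcePod      []string
	DestinationPod []string
}

func hasAnyPrefix(s string, prefixes []string) bool {
	for _, p := range prefixes {
		if strings.HasPrefix(s, p) {
			return true
		}
	}
	return false
}

func (f Filter) Match(fl Flow) bool {
	if len(f.SourcePod) > 0 && !hasAnyPrefix(fl.SourcePod, f.SourcePod) {
		return false
	}
	if len(f.DestinationPod) > 0 && !hasAnyPrefix(fl.DestinationPod, f.DestinationPod) {
		return false
	}
	return true
}

// ShouldLog implements the documented semantics: any include filter
// matches (empty list matches all) and no exclude filter matches.
func ShouldLog(filters, exclude []Filter, fl Flow) bool {
	included := len(filters) == 0
	for _, f := range filters {
		if f.Match(fl) {
			included = true
			break
		}
	}
	if !included {
		return false
	}
	for _, f := range exclude {
		if f.Match(fl) {
			return false
		}
	}
	return true
}

func main() {
	filters := []Filter{
		{SourcePod: []string{"user-project/"}},
		{DestinationPod: []string{"user-project/"}},
	}
	exclude := []Filter{{SourcePod: []string{"kube-system/"}}}
	fl := Flow{SourcePod: "user-project/api", DestinationPod: "default/db"}
	fmt.Println(ShouldLog(filters, exclude, fl)) // source matches one include filter, no exclude matches
}
```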
Thanks for opening this CFP @AwesomePatrol; I agree that export configuration through a CRD would be very useful. First, note that although currently undocumented, there is already some facility to export Hubble flows to a file (see here and here). While it doesn't implement filtering, field masking, or a deadline (as the CFP suggests), it implements file rotation and compression. About the CFP: is the goal to have a single well-defined target output file and expect exactly one CR to drive the export? If yes, what happens when I submit two?
Are ConfigMap changes applied without
An idea was to add a webhook limiting CRs to one per cluster (with a well-known name). Otherwise, we would probably need another CR holding a list of FlowLogging CRs to apply, which could fail to reconcile when there are conflicts, etc. I think that is too complicated, as more complex filtering can be applied to logs that were already collected. The one-CR-per-cluster approach also solves the problem of write amplification (when multiple requests log the same data to different files).
I think we discussed this previously. Combining filters is quite complicated. For example, combining these filters would result in the
Another solution to the above would be creating some intermediate CR, FlowLogRequest, which would define filters, field mask, etc., handled by a controller.
I experimented with the configuration update. I created a small proof of concept that reloads the Hubble subsystem when the ConfigMap changes in AwesomePatrol@b6c4d25. It works, but the main pain point is that it can take a long time for changes to take effect (up to 2m for ConfigMap propagation; the Hubble restart itself is quick). However, I believe that
I think that the output file path could be inferred from the CR name, as that neatly prevents two objects from specifying the same output. In the case that an object is recreated, it just overwrites the previous file. Alternatively, it could use a UUID in its path.
There is no need to combine filters when multiple CRs are present. They can work in independent threads outputting to different files.
If we make this CRD immutable, we can have both (maybe
Agree that a CRD approach is more flexible than the ConfigMap one, even if ConfigMap reload worked flawlessly.
Sounds reasonable to me.
This sounds great, but please keep in mind that serialization of Protobuf structs to JSON is currently resource-hungry. When multiple CRs select a given flow to be exported to their respective files, the implementation should encode it once.
👍
I will start the initial implementation then.
This could work with filters, but not with the field mask. The initial idea is to skip the masked-out parts when writing the serialized object, but this will likely be complex. We can consider adding support for lightweight output formats (LTSV or logfmt), which could potentially make field masking even cheaper (by implementing it as part of the serialization process).
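A sketch of field masking applied before serialization, using a generically decoded map and dot-separated paths as in the CFP's `fieldmask` description; the `applyMask` helper is an assumption of this sketch, and the real implementation would work on the protobuf flow instead:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// applyMask keeps only the dot-separated paths listed in mask. The flow
// is a generic decoded JSON map rather than the real protobuf type.
func applyMask(flow map[string]any, mask []string) map[string]any {
	out := map[string]any{}
	for _, path := range mask {
		copyPath(flow, out, strings.Split(path, "."))
	}
	return out
}

// copyPath copies the value at parts (one path segment per element)
// from src into dst, creating intermediate maps as needed.
func copyPath(src, dst map[string]any, parts []string) {
	v, ok := src[parts[0]]
	if !ok {
		return
	}
	if len(parts) == 1 {
		dst[parts[0]] = v
		return
	}
	sub, ok := v.(map[string]any)
	if !ok {
		return // path descends into a non-object; nothing to copy
	}
	d, ok := dst[parts[0]].(map[string]any)
	if !ok {
		d = map[string]any{}
		dst[parts[0]] = d
	}
	copyPath(sub, d, parts[1:])
}

func main() {
	flow := map[string]any{
		"time":    "2023-05-10T15:04:05Z",
		"verdict": "FORWARDED",
		"source":  map[string]any{"pod_name": "frontend", "namespace": "user-project"},
	}
	masked := applyMask(flow, []string{"time", "source.pod_name", "verdict"})
	b, _ := json.Marshal(masked)
	fmt.Println(string(b)) // only the masked-in fields survive
}
```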
@AwesomePatrol how is this work progressing?
I was offline for a week (not sure how to nicely communicate that on GitHub). I am expanding the Hubble exporter to support field masks and filters in #26379. Hopefully I will get everything review-ready this week.
@AwesomePatrol what is the purpose of the
It was renamed.
The use case is that you want to capture flows over a period of time across multiple clusters. Synchronization between clusters isn't instant, so it is best to set an expiration, as it can be exactly the same on all of them. The same goes for nodes in a single cluster. For example: it is 15:00, I want to capture some logs and start debugging in 30 minutes, so I will set
Moreover, even with filters and a field mask, having it running indefinitely can be costly, so I think it is better to have it expire at some point instead of relying on the user to delete the CR and hoping that all nodes reconcile it quickly enough. For having it running all the time, I think that implementing some simple storage solution could be better (maybe dumping into an sqlite db on each node and querying it with regular
Plugin will be used to generate yaml annotation in structs generated from proto to be able to unmarshall FlowFilter from YAML for cilium#25508. Signed-off-by: Marek Chodor <mchodor@google.com>
Yaml annotations are needed to unmarshall FlowFilter structs for flow logging feature cilium#25508. Signed-off-by: Marek Chodor <mchodor@google.com>
Cilium Feature Proposal
Is your feature request related to a problem?
Without Hubble Timescape, it is impossible to get information about past Hubble flows (beyond the history stored in the ring buffer).
Describe the feature you'd like
I want to have an option to output information on flows for a period of time (possibly indefinitely).
(Optional) Describe your proposed solution
I can't link to the doc, so I will copy the content in the first comment.