Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set pod labels to be optional fields in the network flow records #2739

Merged
merged 1 commit into from Oct 18, 2021

Conversation

yanjunz97
Copy link
Contributor

@yanjunz97 yanjunz97 commented Sep 9, 2021

The network flow record contains two fields related to pod labels, sourcePodLabel and destinationPodLabel. Sometimes long pod labels may cause problems. This PR sets the pod labels to be optional fields by adding a parameter includePodLabels in the flow aggregator configmap. By default, the pod labels will not be included in the flow records. If the user would like to include them, they can modify the flow aggregator configmap.

Signed-off-by: Yanjun Zhou zhouya@vmware.com

@yanjunz97 yanjunz97 marked this pull request as draft September 9, 2021 20:47
@codecov-commenter
Copy link

codecov-commenter commented Sep 9, 2021

Codecov Report

Merging #2739 (b9d95d8) into main (0946ca6) will decrease coverage by 19.42%.
The diff coverage is 72.72%.

❗ Current head b9d95d8 differs from pull request most recent head 7f1ec92. Consider uploading reports for the commit 7f1ec92 to get more accurate results
Impacted file tree graph

@@             Coverage Diff             @@
##             main    #2739       +/-   ##
===========================================
- Coverage   60.31%   40.88%   -19.43%     
===========================================
  Files         284      158      -126     
  Lines       23470    19542     -3928     
===========================================
- Hits        14155     7990     -6165     
- Misses       7787    10802     +3015     
+ Partials     1528      750      -778     
Flag Coverage Δ
kind-e2e-tests ?
unit-tests 40.88% <72.72%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/flowaggregator/flowaggregator.go 21.20% <72.72%> (-48.09%) ⬇️
pkg/ovs/openflow/default.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/util/runtime/runtime.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/agent/cniserver/pod_configuration_linux.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/ovs/openflow/logs.go 9.52% <0.00%> (-90.48%) ⬇️
pkg/apis/controlplane/register.go 0.00% <0.00%> (-90.00%) ⬇️
pkg/agent/nodeportlocal/k8s/annotations.go 0.00% <0.00%> (-83.88%) ⬇️
pkg/agent/agent_linux.go 0.00% <0.00%> (-80.00%) ⬇️
pkg/agent/client.go 0.00% <0.00%> (-77.42%) ⬇️
pkg/ovs/ovsconfig/ovs_client_linux.go 0.00% <0.00%> (-76.93%) ⬇️
... and 231 more

@yanjunz97 yanjunz97 marked this pull request as ready for review September 9, 2021 21:19
Copy link
Contributor

@dreamtalen dreamtalen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Yanjun working on this.

fa.fillPodLabels(key, record.Record)
fa.aggregationProcess.SetExternalFieldsFilled(record)
}
} else {
fa.aggregationProcess.SetExternalFieldsFilled(record)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should be inside the if statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! It seems that in the if statement, we have really ensured the external fields to be filled. I added line 444 for the case that pod labels are not included. I tried to mark the external fields as filled to keep consistence with the case pod labels are provided. Should I just remove the line? Correct me if my understanding is wrong!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, missed you have this line inside the if statement, you could remove this else statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -679,6 +680,7 @@ func (data *TestData) mutateFlowAggregatorConfigMap(ipfixCollector string, faClu
flowAggregatorConf = strings.Replace(flowAggregatorConf, "#externalFlowCollectorAddr: \"\"", fmt.Sprintf("externalFlowCollectorAddr: \"%s\"", ipfixCollector), 1)
flowAggregatorConf = strings.Replace(flowAggregatorConf, "#activeFlowRecordTimeout: 60s", fmt.Sprintf("activeFlowRecordTimeout: %v", aggregatorActiveFlowRecordTimeout), 1)
flowAggregatorConf = strings.Replace(flowAggregatorConf, "#inactiveFlowRecordTimeout: 90s", fmt.Sprintf("inactiveFlowRecordTimeout: %v", aggregatorInactiveFlowRecordTimeout), 1)
flowAggregatorConf = strings.Replace(flowAggregatorConf, "#includePodLabels: false", fmt.Sprintf("includePodLabels: %v", includePodLabels), 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For e2e test, currently we are changing the config to include pod labels during e2e test and didn't add new test for not including pod labels situation to avoid making test longer, we would like to hear more options from @srikartati, do you think we need to add the test for the not including pod labels situation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is fine but I see that the "false" case was skipped for unit tests. Maybe it can be added there to get coverage for it.
@zyiou plans to add flow aggregator restart test in e2e tests. This scenario of not including Pod labels could be covered then. Please take note of that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -57,6 +57,8 @@ type Options struct {
flowAggregatorAddress string
// Format for record sent to the configured flow collector
format string
// Whether sourcePodLabels and destinationPodLabels will be included
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"includePodLabels indicates whether source and destination Pod labels are included or not"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -679,6 +680,7 @@ func (data *TestData) mutateFlowAggregatorConfigMap(ipfixCollector string, faClu
flowAggregatorConf = strings.Replace(flowAggregatorConf, "#externalFlowCollectorAddr: \"\"", fmt.Sprintf("externalFlowCollectorAddr: \"%s\"", ipfixCollector), 1)
flowAggregatorConf = strings.Replace(flowAggregatorConf, "#activeFlowRecordTimeout: 60s", fmt.Sprintf("activeFlowRecordTimeout: %v", aggregatorActiveFlowRecordTimeout), 1)
flowAggregatorConf = strings.Replace(flowAggregatorConf, "#inactiveFlowRecordTimeout: 90s", fmt.Sprintf("inactiveFlowRecordTimeout: %v", aggregatorInactiveFlowRecordTimeout), 1)
flowAggregatorConf = strings.Replace(flowAggregatorConf, "#includePodLabels: false", fmt.Sprintf("includePodLabels: %v", includePodLabels), 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is fine but I see that the "false" case was skipped for unit tests. Maybe it can be added there to get coverage for it.
@zyiou plans to add flow aggregator restart test in e2e tests. This scenario of not including Pod labels could be covered then. Please take note of that.

@dreamtalen
Copy link
Contributor

Thanks @srikartati, Yanjun please make corresponding changes. Also, please modify the flow visibility doc since we have this new config.

@yanjunz97
Copy link
Contributor Author

Thanks @dreamtalen and @srikartati . Starting to work on them.

@yanjunz97
Copy link
Contributor Author

/test-e2e

k8sClient: nil,
observationDomainID: testObservationDomainID,
}

for _, isIPv6 := range []bool{false, true} {
faWithoutPodLabels := &flowAggregator{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

observationDomainID: testObservationDomainID,
podInformer: informerFactory.Core().V1().Pods(),
}

faWithoutPodLabels := &flowAggregator{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have a for loop here with true or false includePodLabels when creating flowAggregator object instead of copying these lines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Done.

@yanjunz97
Copy link
Contributor Author

/test-all

Copy link
Member

@srikartati srikartati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
In general, I feel changing this option runtime is more useful to the user rather than enforcing a restart of the Flow Aggregator. We should support this through antctl in the future.

Any thoughts on how this can be extended to other K8s metadata fields if required?

@srikartati
Copy link
Member

/test-ipv6-e2e /test-ipv6-only-e2e

@srikartati srikartati added the area/flow-visibility Issues or PRs related to flow visibility support in Antrea label Sep 14, 2021
@yanjunz97
Copy link
Contributor Author

/test-all
/test-ipv6-e2e
/test-ipv6-only-e2e

@yanjunz97
Copy link
Contributor Author

LGTM.
In general, I feel changing this option runtime is more useful to the user rather than enforcing a restart of the Flow Aggregator. We should support this through antctl in the future.

Any thoughts on how this can be extended to other K8s metadata fields if required?

I have not dug deep into the implementation of antctl. But it seems for the metadata fields in flow-aggregator configmap, all of them are parsed and saved in a variable of type flowAggregator, which is modifiable in runtime.

@yanjunz97
Copy link
Contributor Author

/test-windows-networkpolicy
/test-networkpolicy
/test-all-features-conformance

@yanjunz97
Copy link
Contributor Author

/test-windows-conformance
/test-windows-e2e
/test-windows-networkpolicy

srikartati
srikartati previously approved these changes Sep 15, 2021
Copy link
Member

@srikartati srikartati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a concern that this will make integration with other tools harder, as these tools (e.g. NetworkPolicy recommendation) may depend on these labels to function properly. @srikartati what's the plan in this case? It seems that we do not need a better solution for Pods with "long labels".

@@ -34,3 +34,6 @@
# Provide format for records sent to the configured flow collector.
# Supported formats are IPFIX and JSON.
#recordFormat: "IPFIX"

# Provide whether source and destination Pod labels will be included in the flow record.
#includePodLabels: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather define a more generic configuration parameter that can be extended later:

recordContents:
    podLabels: false

or recordInformation if it is preferred to recordContents.

This is the direction we are taking for other Antrea components as well. @srikartati what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It is a good idea to make this extensible to multiple fields. However, this would only help with a small subset of fields that are added at the Flow Aggregator. We cannot truly disable the fields, especially the rest of Kubernetes metadata, which are added at the Flow Exporter--for this, we may need config map parameter in Antrea agent.
The rest of the fields that are added at the Aggregator are source and destination statistics, which I feel is not so significant to have this toggle option.

Maybe we could continue to have this option at the Aggregator and add this map-based data type in the Antrea Agent config map to disable at the Flow Exporter in a different PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say symmetry between the 2 configs is a good thing to have. There is also no telling whether we will include more information in the FlowAggregator in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a valid point that we may add new fields like pod labels in the future to the Flow Aggregator. Changing to map-based generic config parameter sounds good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yanjunz97 could you address this ^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Start to work on it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@srikartati
Copy link
Member

I have a concern that this will make integration with other tools harder, as these tools (e.g. NetworkPolicy recommendation) may depend on these labels to function properly. @srikartati what's the plan in this case? It seems that we do not need a better solution for Pods with "long labels".

Yes, Network Policy Recommendation engine would need this. However, we have two options to deploy the reco engine--either inside or outside the cluster. If it is inside, netpol reco engine could fetch the labels from API server directly rather than getting from the flow record. Do you agree @dreamtalen?

And the other visibility application that would need labels is Monitoring UI, which can be customizable by the user.

@dreamtalen
Copy link
Contributor

I have a concern that this will make integration with other tools harder, as these tools (e.g. NetworkPolicy recommendation) may depend on these labels to function properly. @srikartati what's the plan in this case? It seems that we do not need a better solution for Pods with "long labels".

Yes, Network Policy Recommendation engine would need this. However, we have two options to deploy the reco engine--either inside or outside the cluster. If it is inside, netpol reco engine could fetch the labels from API server directly rather than getting from the flow record. Do you agree @dreamtalen?

And the other visibility application that would need labels is Monitoring UI, which can be customizable by the user.

Yes, I agree. I have another option in my head on long term architecture design is that we could fill these pod labels info when doing the flow processing before store flow records in the database, so that np recommendation engine, UI, and performance analysis module could have these info when fetching flow records from db.

@antoninbas
Copy link
Contributor

Yes, Network Policy Recommendation engine would need this. However, we have two options to deploy the reco engine--either inside or outside the cluster. If it is inside, netpol reco engine could fetch the labels from API server directly rather than getting from the flow record.

But since we do want to support out-of-cluster, what is the plan then? How will the label information be exported from the cluster?

@srikartati
Copy link
Member

Yes, Network Policy Recommendation engine would need this. However, we have two options to deploy the reco engine--either inside or outside the cluster. If it is inside, netpol reco engine could fetch the labels from API server directly rather than getting from the flow record.

But since we do want to support out-of-cluster, what is the plan then? How will the label information be exported from the cluster?

Label information will be present along with fields in the flow record. It will either be added by the Flow Aggregator or a visibility application. The design for exporting the flow record out of the cluster is still WIP. Currently, we are focusing on supporting single cluster.

@@ -169,12 +169,15 @@ data:
# Provide format for records sent to the configured flow collector.
# Supported formats are IPFIX and JSON.
#recordFormat: "IPFIX"

# Provide whether source and destination Pod labels will be included in the flow record.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Provide whether source and destination Pod labels will be included in the flow record.
# Determine whether source and destination Pod labels will be included in the flow records.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -34,3 +34,6 @@
# Provide format for records sent to the configured flow collector.
# Supported formats are IPFIX and JSON.
#recordFormat: "IPFIX"

# Provide whether source and destination Pod labels will be included in the flow record.
#includePodLabels: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yanjunz97 could you address this ^

for _, ie := range antreaLabelsElementList {
element, err := fa.registry.GetInfoElement(ie, ipfixregistry.AntreaEnterpriseID)
if err != nil {
return 0, fmt.Errorf("%s not present. returned error: %v", ie, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return 0, fmt.Errorf("%s not present. returned error: %v", ie, err)
return 0, fmt.Errorf("error when getting InformationElement %s from registry: %v", ie, err)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 206 to 221
{
false,
true,
},
{
true,
true,
},
{
false,
false,
},
{
true,
false,
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe use a less verbose syntax here

{ false, true },
{ true, true },
...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -679,6 +680,7 @@ func (data *TestData) mutateFlowAggregatorConfigMap(ipfixCollector string, faClu
flowAggregatorConf = strings.Replace(flowAggregatorConf, "#externalFlowCollectorAddr: \"\"", fmt.Sprintf("externalFlowCollectorAddr: \"%s\"", ipfixCollector), 1)
flowAggregatorConf = strings.Replace(flowAggregatorConf, "#activeFlowRecordTimeout: 60s", fmt.Sprintf("activeFlowRecordTimeout: %v", aggregatorActiveFlowRecordTimeout), 1)
flowAggregatorConf = strings.Replace(flowAggregatorConf, "#inactiveFlowRecordTimeout: 90s", fmt.Sprintf("inactiveFlowRecordTimeout: %v", aggregatorInactiveFlowRecordTimeout), 1)
flowAggregatorConf = strings.Replace(flowAggregatorConf, "#includePodLabels: false", fmt.Sprintf("includePodLabels: %v", includePodLabels), 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure of the usefulness of defining an includePodLabels constant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to keep the consistence with other variables. I'm ok to use true here if you think it is better.

@antoninbas
Copy link
Contributor

Sorry, I closed this by mistake...

@yanjunz97
Copy link
Contributor Author

/test-all
/test-ipv6-e2e
/test-ipv6-only-e2e

@@ -51,4 +51,12 @@ type FlowAggregatorConfig struct {
// Provide format for records sent to the configured flow collector. Supported formats are IPFIX and JSON.
// Defaults to "IPFIX"
RecordFormat string `yaml:"recordFormat,omitempty"`
// RecordContents contains flow records related configuration options.
RecordContents RecordContentsConfig `yaml:"recordContents,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use simple type definition like featureGates in Antrea Agent? We can avoid a separate struct.
https://github.com/antrea-io/antrea/blob/main/cmd/antrea-agent/config.go#L23

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Srikar. Sounds good as we may only need boolean value in the RecordContents. Updated.

@yanjunz97
Copy link
Contributor Author

/test-all
/test-ipv6-e2e
/test-ipv6-only-e2e

@@ -34,3 +34,8 @@
# Provide format for records sent to the configured flow collector.
# Supported formats are IPFIX and JSON.
#recordFormat: "IPFIX"

# recordContents contains flow records related configuration options.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"recordContents enables configuring some fields in the flow records, A field can be either included (true) or excluded (false) by providing a bool value."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -51,4 +51,6 @@ type FlowAggregatorConfig struct {
// Provide format for records sent to the configured flow collector. Supported formats are IPFIX and JSON.
// Defaults to "IPFIX"
RecordFormat string `yaml:"recordFormat,omitempty"`
// RecordContents contains flow records related configuration options.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

srikartati
srikartati previously approved these changes Sep 29, 2021
Copy link
Member

@srikartati srikartati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

antoninbas
antoninbas previously approved these changes Sep 29, 2021
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for making the changes

@antoninbas
Copy link
Contributor

/test-all

@yanjunz97
Copy link
Contributor Author

/test-conformance

@antoninbas
Copy link
Contributor

@yanjunz97 can you resolve the conflicts?

@yanjunz97 yanjunz97 dismissed stale reviews from antoninbas and srikartati via 0302334 October 1, 2021 21:34
@yanjunz97 yanjunz97 force-pushed the optional-pod-labels branch 2 times, most recently from 0302334 to 4e89fd6 Compare October 1, 2021 21:37
@yanjunz97
Copy link
Contributor Author

/test-all
/test-ipv6-e2e
/test-ipv6-only-e2e

flowAggregatorAddress: "",
observationDomainID: testObservationDomainID,
podInformer: informerFactory.Core().V1().Pods(),
fas := make(map[bool]*flowAggregator)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we using a map for this?

newFlowAggregator := func(bool includePodLabels) * flowAggregator {
    return &flowAggregator{...}
}
flowAggregatorWithPodLabels := newFlowAggregator(true)
flowAggregatorWithoutPodLabels := newFlowAggregator(false)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Updated

Comment on lines 438 to 439
if fa.includePodLabels {
if !fa.aggregationProcess.AreExternalFieldsFilled(*record) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe unify the ifs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: Yanjun Zhou <zhouya@vmware.com>
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yanjunz97
Copy link
Contributor Author

/test-all
/test-ipv6-e2e
/test-ipv6-only-e2e

@yanjunz97 yanjunz97 added this to the Antrea v1.4 release milestone Oct 11, 2021
@yanjunz97
Copy link
Contributor Author

/test-all-features-conformance
/test-hw-offload

@antoninbas antoninbas merged commit ee5d3c4 into antrea-io:main Oct 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/flow-visibility Issues or PRs related to flow visibility support in Antrea
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants