NetworkChaos: Add support for ports in external targets #2932

Merged

Conversation

miedzinski
Contributor

Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>

What problem does this PR solve?

This PR adds support for declaring ports in external targets in NetworkChaos experiments.

What's changed and how it works?

Chaos-daemon now creates 3 IP sets in pods: hash:net (same as before), hash:net,port (allowing for testing ip and port pairs) and list:set that combines them. Iptables rules now reference the list:set IP set.
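As a rough illustration of the three-set layout described above, the following sketch prints the `ipset` CLI commands that would create a `hash:net` set, a `hash:net,port` set, and a `list:set` that has both as members. The helper `ipsetCommands`, the `_net`/`_netport`/`_set` name suffixes, and the sample entries are assumptions made for this example; the chaos-daemon code in this PR may build and apply its sets differently.

```go
package main

import "fmt"

// ipsetCommands is a hypothetical helper (not part of Chaos Mesh). It returns
// the ipset CLI invocations for one target: two hash sets plus a list:set
// that references both of them.
func ipsetCommands(prefix string, nets, netPorts []string) []string {
	netSet := prefix + "_net"         // plain networks, as before the PR
	netPortSet := prefix + "_netport" // network,port pairs
	listSet := prefix + "_set"        // combines the two sets above

	cmds := []string{
		fmt.Sprintf("ipset create %s hash:net", netSet),
		fmt.Sprintf("ipset create %s hash:net,port", netPortSet),
		fmt.Sprintf("ipset create %s list:set", listSet),
	}
	for _, n := range nets {
		cmds = append(cmds, fmt.Sprintf("ipset add %s %s", netSet, n))
	}
	for _, np := range netPorts {
		// hash:net,port entries use the "address,port" form.
		cmds = append(cmds, fmt.Sprintf("ipset add %s %s", netPortSet, np))
	}
	// The list:set holds set names, so one iptables rule can cover both sets.
	cmds = append(cmds,
		fmt.Sprintf("ipset add %s %s", listSet, netSet),
		fmt.Sprintf("ipset add %s %s", listSet, netPortSet),
	)
	return cmds
}

func main() {
	for _, c := range ipsetCommands("chaos1_tgt", []string{"1.2.3.4"}, []string{"1.1.1.1,53"}) {
		fmt.Println(c)
	}
}
```

Keeping port-less targets in their own `hash:net` set means existing experiments keep matching on address alone, while the `list:set` gives iptables a single name to reference.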

There are changes to PodNetworkChaos, but NetworkChaos remains unchanged and there's no need to update dashboard.

Related changes

  • Need to update chaos-mesh/website
  • Need to update Dashboard UI
  • Need to cherry-pick to release branches
    • release-2.1
    • release-2.0

Checklist

Tests

  • Unit test
  • E2E test
  • No code
  • Manual test (add steps below)

Side effects

  • Breaking backward compatibility

Release note

Add support for declaring ports in external targets in NetworkChaos experiments.

Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>
@ti-chi-bot
Member

ti-chi-bot commented Feb 23, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • STRRL
  • cwen0

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot
Member

Welcome @miedzinski!

It looks like this is your first PR to chaos-mesh/chaos-mesh 🎉.

I'm the bot to help you request reviewers, add labels and more, See available commands.

We want to make sure your contribution gets all the attention it needs!



Thank you, and welcome to chaos-mesh/chaos-mesh. 😃

@codecov

codecov bot commented Feb 25, 2022

Codecov Report

Merging #2932 (48ed931) into master (96f979a) will increase coverage by 0.27%.
The diff coverage is 48.30%.


@@            Coverage Diff             @@
##           master    #2932      +/-   ##
==========================================
+ Coverage   37.92%   38.19%   +0.27%     
==========================================
  Files         105      105              
  Lines        9108     9182      +74     
==========================================
+ Hits         3454     3507      +53     
- Misses       5344     5366      +22     
+ Partials      310      309       -1     
Impacted Files                                Coverage Δ
controllers/podnetworkchaos/controller.go     37.38% <0.00%> (-1.65%) ⬇️
controllers/podnetworkchaos/ipset/ipset.go    2.59% <0.00%> (-2.17%) ⬇️
pkg/chaosdaemon/iptables_server.go            66.11% <50.00%> (ø)
pkg/chaosdaemon/ipset_server.go               90.15% <88.57%> (+3.48%) ⬆️
controllers/podnetworkchaos/netutils/cidr.go  73.07% <89.28%> (+73.07%) ⬆️
pkg/workflow/controllers/utils.go             85.71% <0.00%> (-1.59%) ⬇️
pkg/selector/generic/mode.go                  28.20% <0.00%> (+2.56%) ⬆️
... and 1 more


Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 02c0e1d...48ed931. Read the comment docs.

Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>
@STRRL
Member

STRRL commented Mar 1, 2022

Hi @miedzinski , thanks for your contribution! That's an awesome feature!

However, this PR is very large, and it would still break the Chaos Mesh API (via PodNetworkChaos), so I am afraid it is unlikely to be merged soon.

Currently, the design of NetworkChaos does not take ports into account; it operates at the Network Layer (IP layer). If we want to filter network traffic by port (which is actually not difficult with iptables and the current code base), we should move the design to at least the Transport Layer and consider protocols such as TCP and UDP. That would mean large changes to the design of NetworkChaos, so I think we had better draft an RFC in rfcs before doing that.

What do you think, @miedzinski?

@miedzinski
Contributor Author

@STRRL, thanks for your response.

You're right there is a problem with backward compatibility of PodNetworkChaos, but I think it's something we can solve:

  • There is no need to name all IP sets in PodNetworkChaos. hash:net and hash:net,port names can be derived in daemon from list:set name.
  • IP set contents can remain strings and be parsed later.

This way all CRDs are unchanged. There is also a change to protobufs - if you think these shouldn't be changed too, then I can move parsing and resolving external targets to chaos daemon. That would additionally solve a TODO comment.

Regarding the RFC process, this is something I'd rather avoid if possible. Filtering traffic by ports is crucial for the adoption of Chaos Mesh in my organization, but we definitely don't intend to make huge changes to the design of experiments at this point. I also don't think L3 vs. L4 is an issue, because NetworkChaos already resolves hostnames, and that belongs to the Application Layer.

Please let me know if you're okay with changes I suggested.

@STRRL
Member

STRRL commented Mar 2, 2022

You're right there is a problem with backward compatibility of PodNetworkChaos, but I think it's something we can solve:

  • There is no need to name all IP sets in PodNetworkChaos. hash:net and hash:net,port names can be derived in daemon from list:set name.
  • IP set contents can remain strings and be parsed later.

This way all CRDs are unchanged.

For now we only care about backward compatibility: appending new fields with default values is acceptable, but renaming or deleting fields is NOT preferred.

I have another idea based on your suggestion. We could extend v1alpha1.RawIPSet and pb.IPSet to support more types of ipset, instead of using a single v1alpha1.RawIPSet to express the combination of hash:ip, hash:ip,port and list:set rules.

const (
	IPSetTypeHashIP     IPSetType = "hash:ip"
	IPSetTypeHashIPPort IPSetType = "hash:ip,port"
	IPSetTypeListSet    IPSetType = "list:set"
	// other IPSet types that we need
)

// RawIPSet represents an ipset on a specific pod
type RawIPSet struct {
	// The name of the ipset
	Name string `json:"name"`

	// The type of the ipset; it decides which of the content fields below is used
	IPSetType IPSetType `json:"ipsetType"`

	// The contents of ipset.
	// Only available when IPSetType is IPSetTypeHashIP
	Cidrs []string `json:"cidrs"`

	// The contents of ipset.
	// Only available when IPSetType is IPSetTypeHashIPPort
	CidrAndPorts []CidrAndPort `json:"cidrAndPorts"`

	// The contents of ipset.
	// Only available when IPSetType is IPSetTypeListSet
	SetNames []string `json:"setNames"`

	// The name and namespace of the source network chaos
	RawRuleSource `json:",inline"`
}

We would append three ordered RawIPSet entries, rather than one, to PodNetworkChaos.

before:

[{
  "setName": "chaos1_set_tgt",
  "netPortName": "chaos1_netPort_tgt",
  "netName": "chaos1_net_tgt",
  "cidrs": [
    {
      "cidr": "1.1.1.1",
      "port": 53
    },
    {
      "cidr": "1.2.3.4",
      "port": 0
    }
  ]
}]

after:

[
  {
    "name": "chaos1_ipport_tgt",
    "ipsetType": "hash:ip,port",
    "cidrAndPorts": [
      {
        "cidr": "1.1.1.1",
        "port": 53
      }
    ]
  },
  {
    "name": "chaos1_ip_tgt",
    "ipsetType": "hash:ip",
    "cidrs": [
      "1.2.3.4"
    ]
  },
  {
    "name": "chaos1_set_tgt",
    "ipsetType": "list:set",
    "setNames": [
      "chaos1_ipport_tgt",
      "chaos1_ip_tgt"
    ]
  }
]
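To make the proposed layout concrete, here is a minimal sketch that splits a list of targets into the three ordered sets from the "after" example. The types are simplified local copies of the structs proposed above, and `buildIPSets`, the lowercase constant names, and the rule "port 0 means no port" are assumptions for this example only, not the code that was eventually merged.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified local copies of the proposed types (RawRuleSource omitted).
type IPSetType string

const (
	hashIP     IPSetType = "hash:ip"
	hashIPPort IPSetType = "hash:ip,port"
	listSet    IPSetType = "list:set"
)

type CidrAndPort struct {
	Cidr string `json:"cidr"`
	Port uint16 `json:"port"`
}

type RawIPSet struct {
	Name         string        `json:"name"`
	IPSetType    IPSetType     `json:"ipsetType"`
	Cidrs        []string      `json:"cidrs,omitempty"`
	CidrAndPorts []CidrAndPort `json:"cidrAndPorts,omitempty"`
	SetNames     []string      `json:"setNames,omitempty"`
}

// buildIPSets is a hypothetical helper. Targets with Port == 0 are treated
// as "no port given" and go to the plain hash:ip set.
func buildIPSets(prefix string, targets []CidrAndPort) []RawIPSet {
	ipPort := RawIPSet{Name: prefix + "_ipport_tgt", IPSetType: hashIPPort}
	ip := RawIPSet{Name: prefix + "_ip_tgt", IPSetType: hashIP}
	for _, t := range targets {
		if t.Port == 0 {
			ip.Cidrs = append(ip.Cidrs, t.Cidr)
		} else {
			ipPort.CidrAndPorts = append(ipPort.CidrAndPorts, t)
		}
	}
	set := RawIPSet{
		Name:      prefix + "_set_tgt",
		IPSetType: listSet,
		SetNames:  []string{ipPort.Name, ip.Name},
	}
	// The list:set comes last because it refers to the other two by name.
	return []RawIPSet{ipPort, ip, set}
}

func main() {
	sets := buildIPSets("chaos1", []CidrAndPort{
		{Cidr: "1.1.1.1", Port: 53},
		{Cidr: "1.2.3.4", Port: 0},
	})
	out, err := json.MarshalIndent(sets, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The ordering matters in this sketch because a `list:set` can only reference member sets that already exist when it is populated.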

What do you think about it? @miedzinski

@miedzinski
Contributor Author

That's a good solution too. I'll send a patch in the next few days.

Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>
Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>
Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>
@miedzinski
Contributor Author

/cc @STRRL

@ti-chi-bot ti-chi-bot requested a review from STRRL March 4, 2022 13:33
@miedzinski
Contributor Author

There is still one field removed from the protobuf definition, but I believe this is not an issue, is it? If it is, I can remove the oneof and apply a layout similar to PodNetworkChaos.

@STRRL
Member

STRRL commented Mar 7, 2022

There is still one field removed from the protobuf definition, but I believe this is not an issue, is it? If it is, I can remove the oneof and apply a layout similar to PodNetworkChaos.

It's not an issue, but I would still prefer to keep the same layout as PodNetworkChaos. Please update it. ❤️

Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>
Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>
Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>
@miedzinski
Contributor Author

@STRRL done! Now we're only adding new fields.

Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>
- dstIpset := ipset.BuildIPSet(targetPods, externalCidrs, networkchaos, string(tcType[0:2])+ipSetPostFix, m.Source)
- impl.Log.Info("apply traffic control with filter", "sources", m.Source, "ipset", dstIpset)
+ dstSetIPSet, dstOtherIPSets := ipset.BuildIPSets(targetPods, externalCidrs, networkchaos, string(tcType[0:2])+ipSetPostFix, m.Source)
+ impl.Log.Info("apply traffic control with filter", "sources", m.Source, "ipset", dstSetIPSet)
Member

We should print the whole ipset info in this log.

netPortName := GenerateIPSetName(networkchaos, "netport_"+namePostFix)

cidrs := []string{}
cidrandPorts := []v1alpha1.CidrAndPort{}
Member

s/cidrandPorts/cidrAndPorts/g

Comment on lines +93 to +101
// CidrAndPort represents CIDR and port pair
type CidrAndPort struct {
	Cidr string `json:"cidr"`

	// +kubebuilder:validation:Minimum=1
	// +kubebuilder:validation:Maximum=65535
	Port uint16 `json:"port"`
}

Member

We should also add a field for the protocol, such as tcp, udp or icmp. When an ipset rule is specified without a protocol, it would apply to tcp only.

Member

Example usage: 192.168.1.1,udp:53

Contributor Author

This is something I'd rather avoid right now for a couple of reasons:

  • This PR is already huge. Filtering by protocols needs more changes to IP sets and/or iptables rules.
  • Adding support for filtering by protocols is outside the scope of this PR.
  • Defaulting to TCP would break backward compatibility. As of now Chaos Mesh is protocol agnostic.

Member

Got that!
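For readers following the CidrAndPort discussion, the sketch below shows one way a protocol-agnostic external target string could be split into an address and an optional port. It assumes the `cidr:port` form that appears later in this review (for example `10.96.0.0/16:8080`), handles IPv4 only, and uses a hypothetical `parseTarget` helper; it is not the parsing code from this PR.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// CidrAndPort is a simplified local copy of the reviewed struct.
// Port 0 means "no port given" in this sketch.
type CidrAndPort struct {
	Cidr string
	Port uint16
}

// validAddr accepts either a bare IP address or a CIDR block.
func validAddr(s string) bool {
	if net.ParseIP(s) != nil {
		return true
	}
	_, _, err := net.ParseCIDR(s)
	return err == nil
}

// parseTarget is a hypothetical helper for IPv4 targets such as
// "1.2.3.4", "10.96.0.0/16" or "10.96.0.0/16:8080".
func parseTarget(s string) (CidrAndPort, error) {
	i := strings.LastIndexByte(s, ':')
	if i < 0 {
		if !validAddr(s) {
			return CidrAndPort{}, fmt.Errorf("invalid address %q", s)
		}
		return CidrAndPort{Cidr: s}, nil
	}
	addr, portStr := s[:i], s[i+1:]
	if !validAddr(addr) {
		return CidrAndPort{}, fmt.Errorf("invalid address %q", addr)
	}
	// bitSize 16 rejects values above 65535; 0 is rejected explicitly,
	// mirroring the 1..65535 kubebuilder validation range.
	port, err := strconv.ParseUint(portStr, 10, 16)
	if err != nil || port == 0 {
		return CidrAndPort{}, fmt.Errorf("invalid port %q", portStr)
	}
	return CidrAndPort{Cidr: addr, Port: uint16(port)}, nil
}

func main() {
	for _, s := range []string{"10.96.0.0/16:8080", "1.2.3.4", "1.2.3.4:0"} {
		t, err := parseTarget(s)
		fmt.Println(s, t, err)
	}
}
```

No protocol is parsed here, which matches the author's point that the feature stays protocol agnostic.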

@@ -94,10 +94,10 @@ func (iptables *iptablesClient) setIptablesChain(chain *pb.Chain) error {
 	var matchPart string
 	var interfaceMatcher string
 	if chain.Direction == pb.Chain_INPUT {
-		matchPart = "src"
+		matchPart = "src,dst"
Member

Why do we need this modification? If I'm not mistaken, it means this chain applies iff both the source and the destination are in the ipset, which does not seem to be what we want.

Contributor Author

This means a packet must match a pair of source IP address and destination port: each comma-separated flag applies to one dimension of the set. Please refer to man iptables-extensions (section set, --match-set parameter) and man ipset (section introduction, set dimensions).

Member

I think we need more documentation to describe it, but it still makes sense:

  • with action: delay/loss/... (which requires tc) and direction: from, externalTargets would not work
  • with action: partition, direction: from, externalTargets: [10.96.0.0/16:8080], it would drop any packet from 10.96.0.0/16 to <pod-ip>:8080.
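The dimension flags discussed in this thread can be sketched as a small helper that builds the `-m set --match-set` arguments for a rule. The `src,dst` value for the INPUT direction comes from the diff above; the `dst,dst` value for OUTPUT (destination address, destination port) is an assumption of this example, since the excerpt does not show that branch. `Direction` and `matchSetArgs` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Direction is a hypothetical stand-in for the chain direction enum.
type Direction int

const (
	Input Direction = iota
	Output
)

// matchSetArgs returns the iptables match arguments for a set.
// Each comma-separated flag maps to one dimension of a set entry:
// the first to the address, the second to the port.
func matchSetArgs(dir Direction, setName string) []string {
	// Assumed for this sketch: outgoing packets are matched on
	// destination address and destination port.
	flags := "dst,dst"
	if dir == Input {
		// From the diff: incoming packets are matched on source
		// address and destination port.
		flags = "src,dst"
	}
	return []string{"-m", "set", "--match-set", setName, flags}
}

func main() {
	in := matchSetArgs(Input, "chaos1_set_tgt")
	fmt.Println("iptables -A INPUT", strings.Join(in, " "), "-j DROP")
}
```

With a target of 10.96.0.0/16:8080 and the INPUT flags, a rule like the one printed by `main` would drop packets coming from 10.96.0.0/16 to port 8080 of the pod, which is the partition behaviour described in the second bullet.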

@STRRL
Member

STRRL commented Mar 17, 2022

Hi @miedzinski , could you please append an entry in section Unreleased / Added of CHANGELOG.MD?

Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>
@miedzinski
Contributor Author

Hi @miedzinski , could you please append an entry in section Unreleased / Added of CHANGELOG.MD?

@STRRL Done!

Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>
@miedzinski miedzinski force-pushed the network-external-targets-port branch from db07013 to 4a7e373 Compare March 17, 2022 11:04
@STRRL
Member

STRRL commented Mar 18, 2022

/run-e2e-tests

Signed-off-by: Dominik Miedziński <dominik.miedzinski@allegro.pl>
Member

@STRRL STRRL left a comment

LGTM!

Thanks for your contribution! 🤩

@cwen0
Member

cwen0 commented Mar 29, 2022

/cc @YangKeao

Member

@cwen0 cwen0 left a comment

LGTM

@cwen0
Member

cwen0 commented Apr 1, 2022

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: dd92f71

@ti-chi-bot
Member

@miedzinski: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@STRRL
Member

STRRL commented Apr 1, 2022

/run-e2e-tests

5 participants