Native nftables dataplane #8780

Open
wants to merge 56 commits into
base: master
99b8610
Initial nftables prototype implementation
caseydavenport Mar 18, 2024
e95cb6a
Add NFTables to FelixConfiguration, fix up UTs
caseydavenport May 2, 2024
b4adaae
Slightly better node Dockerfile implementation
caseydavenport May 20, 2024
11e1c8f
Properly clean up chains that aren't ours
caseydavenport May 21, 2024
d7b99ad
Handle overlapping ranges in nftables sets
caseydavenport May 21, 2024
8e964b8
Update match builder to use meta l4proto
caseydavenport May 21, 2024
99f311d
Start getting FV tests to run
caseydavenport May 21, 2024
64b9d31
Add triemasker prototype
caseydavenport Jun 5, 2024
3f54cc1
Fix overlapping network sets in nftables mode
caseydavenport Jun 5, 2024
230bc62
Fix node.get
caseydavenport Jun 7, 2024
1c7375d
Fix deduplicator in dual stack
caseydavenport Jun 7, 2024
9e4b9e5
Add UT for member deduplicator
caseydavenport Jun 7, 2024
1e3eb8d
Fix health tests for nft
caseydavenport Jun 7, 2024
f4aa164
HEP test fixed
caseydavenport Jun 7, 2024
fbce58c
Enable XDP :|
caseydavenport Jun 7, 2024
d565d72
Fix packet check
caseydavenport Jun 7, 2024
2675bd1
Remove remaining bookmarks
caseydavenport Jun 7, 2024
27693e4
Fix IP set deletion in deduplicator
caseydavenport Jun 10, 2024
9882fca
Use correct funcs for network sets FVs
caseydavenport Jun 11, 2024
a6b4595
Fix ipset command in nft mode
caseydavenport Jun 12, 2024
5082e87
Fix test
caseydavenport Jun 12, 2024
6fb158d
Fix ipv6 vxlan check
caseydavenport Jun 12, 2024
e42d07e
REVERT: Target flapping test in CI
caseydavenport Jun 12, 2024
657a91a
Fix VXLAN IPv6 set name
caseydavenport Jun 13, 2024
4857a66
Add log for missing IP set
caseydavenport Jun 13, 2024
5f9ba6f
Delete multiple IP sets at once
caseydavenport Jun 14, 2024
850deb1
Code review pt. 1
caseydavenport Jun 18, 2024
716d6a8
Merge remote-tracking branch 'origin/master' into casey-nft-proto
caseydavenport Jun 20, 2024
b230f1b
Further code review addressing
caseydavenport Jun 20, 2024
41cea00
Finish up postWriteInterval changes
caseydavenport Jun 21, 2024
2ea25a8
Rip out postWriteInterval logic, not needed in nftables
caseydavenport Jun 21, 2024
9cf774c
Use NumOperations()
caseydavenport Jun 21, 2024
49f63fc
Merge remote-tracking branch 'origin/master' into casey-nft-proto
caseydavenport Jun 21, 2024
33724ea
Fix build
caseydavenport Jun 21, 2024
ca785bc
Fix FV tests using wrong enablement var
caseydavenport Jun 21, 2024
7c88a4a
Fix arm64 build
caseydavenport Jun 22, 2024
8e3e474
Update generated files
caseydavenport Jun 22, 2024
634ad56
Add FV for nftables + bpf
caseydavenport Jun 24, 2024
df7c772
Fix IPSet naming bug
caseydavenport Jun 25, 2024
7972ae3
Update timeout for nftables mode
caseydavenport Jun 25, 2024
58451b3
Merge remote-tracking branch 'origin/master' into casey-nft-proto
caseydavenport Jun 26, 2024
cc6df21
Fix up XDP FV
caseydavenport Jun 26, 2024
4f00325
Merge remote-tracking branch 'origin/master' into casey-nft-proto
caseydavenport Jun 26, 2024
5cd8c7a
Fix build error
caseydavenport Jun 26, 2024
c3bfb8c
Switch off of knftables fork
caseydavenport Jun 27, 2024
be69b1d
Fix ipset resync performance
caseydavenport Jun 27, 2024
e38b4a3
Merge remote-tracking branch 'origin/master' into casey-nft-proto
caseydavenport Jun 28, 2024
5817a94
Disable resync for nftables set FV
caseydavenport Jun 28, 2024
33ebc1a
Disable ipset resync during FV
caseydavenport Jul 1, 2024
a8a0b4e
Fix that some protocol aliases don't exist on all system
caseydavenport Jul 1, 2024
8308a7e
use longer timeout for now
caseydavenport Jul 1, 2024
4b29f1c
timout -> timeout
caseydavenport Jul 2, 2024
7d37e97
Merge remote-tracking branch 'origin/master' into casey-nft-proto
caseydavenport Jul 2, 2024
b9f9f62
Merge remote-tracking branch 'origin/master' into casey-nft-proto
caseydavenport Jul 3, 2024
407fe9a
Fix reject with for nftables
caseydavenport Jul 3, 2024
dd920cc
Make fix
caseydavenport Jul 3, 2024
11 changes: 9 additions & 2 deletions api/pkg/apis/projectcalico/v3/felixconfig.go
@@ -52,6 +52,13 @@ const (
IptablesBackendAuto = "Auto"
)

type NFTablesMode string

const (
NFTablesModeEnabled = "Enabled"
NFTablesModeDisabled = "Disabled"
)

// +kubebuilder:validation:Enum=DoNothing;Enable;Disable
type AWSSrcDstCheckOption string

@@ -439,8 +446,8 @@ type FelixConfigurationSpec struct {
// iptables. [Default: false]
GenericXDPEnabled *bool `json:"genericXDPEnabled,omitempty" confignamev1:"GenericXDPEnabled"`

// NFTablesEnabled enables nftables in Felix. When false, iptables is used. [Default: false]
NFTablesEnabled *bool `json:"nftablesEnabled,omitempty"`
// NFTablesMode configures nftables support in Felix. [Default: Disabled]
NFTablesMode *NFTablesMode `json:"nftablesMode,omitempty"`

// BPFEnabled, if enabled Felix will use the BPF dataplane. [Default: false]
BPFEnabled *bool `json:"bpfEnabled,omitempty" validate:"omitempty"`
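
A minimal sketch of how a caller might interpret the new optional `NFTablesMode` field, assuming a nil pointer means the documented default of `Disabled`. The helper `nftablesEnabled` is hypothetical and not part of this PR. The constants are typed here for illustration, whereas the diff above declares them as untyped strings.

```go
package main

import "fmt"

// NFTablesMode mirrors the string-backed type added in the diff.
type NFTablesMode string

// Typed constants are an assumption of this sketch; the diff declares
// them without an explicit type.
const (
	NFTablesModeEnabled  NFTablesMode = "Enabled"
	NFTablesModeDisabled NFTablesMode = "Disabled"
)

// nftablesEnabled is a hypothetical helper. A nil pointer is treated as
// the documented default, Disabled.
func nftablesEnabled(mode *NFTablesMode) bool {
	return mode != nil && *mode == NFTablesModeEnabled
}

func main() {
	enabled := NFTablesModeEnabled
	fmt.Println(nftablesEnabled(nil))      // field left unset
	fmt.Println(nftablesEnabled(&enabled)) // field explicitly set to Enabled
}
```

Typing the constants lets the compiler reject comparisons against unrelated strings, which is one reason to prefer them over the bare `"Enabled"` literal.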
6 changes: 3 additions & 3 deletions api/pkg/apis/projectcalico/v3/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 3 additions & 3 deletions api/pkg/openapi/openapi_generated.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion calicoctl/calicoctl/commands/crds/crds.go

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion felix/calc/calc_graph.go
@@ -287,7 +287,7 @@ func NewCalculationGraph(callbacks PipelineCallbacks, conf *config.Config, liveC
// |
// <dataplane>
//
ipsetMemberIndex := labelindex.NewSelectorAndNamedPortIndex()
ipsetMemberIndex := labelindex.NewSelectorAndNamedPortIndex(conf.NFTablesMode == "Enabled")
ipsetMemberIndex.OnAlive = liveCallback
// Wire up the inputs to the IP set member index.
ipsetMemberIndex.RegisterWith(allUpdDispatcher)
2 changes: 1 addition & 1 deletion felix/config/config_params.go
@@ -175,7 +175,7 @@ type Config struct {
WireguardPersistentKeepAlive time.Duration `config:"seconds;0"`

// nftables configuration.
NFTablesEnabled bool `config:"bool;false"`
NFTablesMode string `config:"oneof(Enabled,Disabled);Disabled"`

// BPF configuration.
BPFEnabled bool `config:"bool;false"`
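
The `oneof(Enabled,Disabled);Disabled` tag above suggests a value restricted to a fixed set, with a fallback default. The sketch below illustrates that idea only. `parseOneOf` is a hypothetical function, not Felix's actual config parser, and exact-match comparison is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// parseOneOf is a hypothetical validator: an empty value yields the
// default, a listed value is accepted, anything else is an error.
func parseOneOf(raw string, allowed []string, def string) (string, error) {
	if raw == "" {
		return def, nil
	}
	for _, a := range allowed {
		if raw == a {
			return a, nil
		}
	}
	return "", fmt.Errorf("invalid value %q: must be one of %s",
		raw, strings.Join(allowed, ", "))
}

func main() {
	allowed := []string{"Enabled", "Disabled"}
	for _, raw := range []string{"", "Enabled", "On"} {
		v, err := parseOneOf(raw, allowed, "Disabled")
		fmt.Printf("raw=%q value=%q err=%v\n", raw, v, err)
	}
}
```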
2 changes: 1 addition & 1 deletion felix/dataplane/driver.go
@@ -207,7 +207,7 @@ func StartDataplaneDriver(configParams *config.Config,
NetlinkTimeout: configParams.NetlinkTimeoutSecs,
},
RulesConfig: rules.Config{
NFTables: configParams.NFTablesEnabled,
NFTables: configParams.NFTablesMode == "Enabled",
WorkloadIfacePrefixes: configParams.InterfacePrefixes(),

IPSetConfigV4: ipsets.NewIPVersionConfig(
8 changes: 4 additions & 4 deletions felix/dataplane/linux/endpoint_mgr.go
@@ -136,7 +136,7 @@ type endpointManager struct {
osStat func(path string) (os.FileInfo, error)
epMarkMapper rules.EndpointMarkMapper
newMatch func() generictables.MatchCriteria
actions generictables.ActionSet
actions generictables.ActionFactory

// Pending updates, cleared in CompleteDeferredWork as the data is copied to the activeXYZ
// fields.
@@ -277,10 +277,10 @@ func newEndpointManagerWithShims(
wlIfacesRegexp := regexp.MustCompile(wlIfacesPattern)

newMatchFn := iptables.Match
actions := iptables.ActionSet()
actions := iptables.Actions()
if nft {
newMatchFn = nftables.Match
actions = nftables.ActionSet()
actions = nftables.Actions()
}

return &endpointManager{
@@ -911,7 +911,7 @@ func (m *endpointManager) updateRPFSkipChain() {
for _, addr := range addresses {
chain.Rules = append(chain.Rules, generictables.Rule{
Match: m.newMatch().InInterface(interfaceName).SourceNet(addr),
Action: m.actions.AllowAction(),
Action: m.actions.Allow(),
})
}
}
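
The selection pattern used in `newEndpointManagerWithShims` (pick an action factory once, then build rules against the interface) can be sketched in isolation as follows. The two factory types, the `fragment` type, and the rendered strings are simplified stand-ins, not the real `iptables.Actions()` or `nftables.Actions()` implementations. The chain name in `main` is only an example value.

```go
package main

import "fmt"

// Action is reduced to String() for this sketch; the real interface
// also has ToFragment.
type Action interface{ String() string }

// ActionFactory is a two-method subset of the interface in the diff.
type ActionFactory interface {
	Allow() Action
	Jump(target string) Action
}

// fragment is a hypothetical Action that just carries rendered text.
type fragment string

func (f fragment) String() string { return string(f) }

// Stand-in for the iptables-backed factory.
type iptablesActions struct{}

func (iptablesActions) Allow() Action             { return fragment("--jump ACCEPT") }
func (iptablesActions) Jump(target string) Action { return fragment("--jump " + target) }

// Stand-in for the nftables-backed factory.
type nftablesActions struct{}

func (nftablesActions) Allow() Action             { return fragment("accept") }
func (nftablesActions) Jump(target string) Action { return fragment("jump " + target) }

// selectActions follows the shape of the diff: default to iptables and
// switch to nftables when the flag is set.
func selectActions(nft bool) ActionFactory {
	if nft {
		return nftablesActions{}
	}
	return iptablesActions{}
}

func main() {
	for _, nft := range []bool{false, true} {
		a := selectActions(nft)
		fmt.Println(a.Allow(), "|", a.Jump("cali-FORWARD"))
	}
}
```

Because callers hold only the interface, rule-building code such as `updateRPFSkipChain` does not need to know which backend is active.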
50 changes: 25 additions & 25 deletions felix/dataplane/linux/int_dataplane.go
@@ -357,7 +357,7 @@ type InternalDataplane struct {
linkUpdateBatchSize int
addrsUpdateBatchSize int

actionSet generictables.ActionSet
actionSet generictables.ActionFactory
newMatch func() generictables.MatchCriteria
}

@@ -401,10 +401,10 @@ func NewIntDataplaneDriver(config Config) *InternalDataplane {
)

// Determine the action set and new match function based on the underlying generictables implementation.
actionSet := iptables.ActionSet()
actionSet := iptables.Actions()
newMatchFn := iptables.Match
if config.RulesConfig.NFTables {
actionSet = nftables.ActionSet()
actionSet = nftables.Actions()
newMatchFn = nftables.Match
}

@@ -1509,7 +1509,7 @@ func (d *InternalDataplane) bpfMarkPreestablishedFlowsRules() []generictables.Ru
return []generictables.Rule{{
Match: d.newMatch().ConntrackState("ESTABLISHED,RELATED"),
Comment: []string{"Mark pre-established flows."},
Action: d.actionSet.SetMaskedMarkAction(
Action: d.actionSet.SetMaskedMark(
tcdefs.MarkLinuxConntrackEstablished,
tcdefs.MarkLinuxConntrackEstablishedMask,
),
@@ -1531,7 +1531,7 @@ func (d *InternalDataplane) setUpIptablesBPF() {
// by the program at both ingress and egress.
Comment: []string{"Pre-approved by BPF programs."},
Match: d.newMatch().MarkMatchesWithMask(tcdefs.MarkSeenBypass, tcdefs.MarkSeenBypassMask),
Action: d.actionSet.AllowAction(),
Action: d.actionSet.Allow(),
},
}

@@ -1545,7 +1545,7 @@ func (d *InternalDataplane) setUpIptablesBPF() {
MarkMatchesWithMask(tcdefs.MarkSeenFallThrough, tcdefs.MarkSeenFallThroughMask).
ConntrackState("ESTABLISHED,RELATED"),
Comment: []string{"Accept packets from flows that pre-date BPF."},
Action: d.actionSet.AllowAction(),
Action: d.actionSet.Allow(),
},
generictables.Rule{
Match: d.newMatch().MarkMatchesWithMask(tcdefs.MarkSeenFallThrough, tcdefs.MarkSeenFallThroughMask),
@@ -1572,7 +1572,7 @@ func (d *InternalDataplane) setUpIptablesBPF() {
// RETURN would be a no-op since there's nothing to RETURN from.
inputRules = append(inputRules, generictables.Rule{
Match: d.newMatch().InInterface(prefix+wildcard).MarkMatchesWithMask(tcdefs.MarkSeen, tcdefs.MarkSeenMask),
Action: d.actionSet.AllowAction(),
Action: d.actionSet.Allow(),
})
}

@@ -1588,7 +1588,7 @@ func (d *InternalDataplane) setUpIptablesBPF() {
// SEEN traffic, so it was policed and accepted at a HEP. If the default INPUT
// chain policy was DROP, it would get dropped now, therefore an explicit accept
// is needed.
inputRules = append(inputRules, d.ruleRenderer.FilterInputChainAllowWG(t.IPVersion(), rulesConfig, d.actionSet.AllowAction())...)
inputRules = append(inputRules, d.ruleRenderer.FilterInputChainAllowWG(t.IPVersion(), rulesConfig, d.actionSet.Allow())...)
}

if t.IPVersion() == 6 {
@@ -1618,7 +1618,7 @@ func (d *InternalDataplane) setUpIptablesBPF() {
fwdRules = append(fwdRules,
generictables.Rule{
Match: d.newMatch().OutInterface(prefix + wildcard),
Action: d.actionSet.JumpAction(rules.ChainToWorkloadDispatch),
Action: d.actionSet.Jump(rules.ChainToWorkloadDispatch),
Comment: []string{"To workload, check workload is known."},
},
)
@@ -1632,15 +1632,15 @@ func (d *InternalDataplane) setUpIptablesBPF() {
fwdRules = append(fwdRules,
generictables.Rule{
Match: d.newMatch().InInterface(prefix + wildcard),
Action: d.actionSet.AllowAction(),
Action: d.actionSet.Allow(),
Comment: []string{"To workload, mark has already been verified."},
},
)
}
fwdRules = append(fwdRules,
generictables.Rule{
Match: d.newMatch().InInterface(bpfOutDev),
Action: d.actionSet.AllowAction(),
Action: d.actionSet.Allow(),
Comment: []string{"From ", bpfOutDev, " device, mark verified, accept."},
},
)
@@ -1655,7 +1655,7 @@ func (d *InternalDataplane) setUpIptablesBPF() {
t.UpdateChains(d.ruleRenderer.StaticNATPostroutingChains(t.IPVersion()))
t.InsertOrAppendRules("POSTROUTING", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainNATPostrouting),
Action: d.actionSet.Jump(rules.ChainNATPostrouting),
}})
}

@@ -1665,11 +1665,11 @@ func (d *InternalDataplane) setUpIptablesBPF() {
))
t.InsertOrAppendRules("PREROUTING", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainRawPrerouting),
Action: d.actionSet.Jump(rules.ChainRawPrerouting),
}})
t.InsertOrAppendRules("OUTPUT", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainRawOutput),
Action: d.actionSet.Jump(rules.ChainRawOutput),
}})
}

@@ -1682,7 +1682,7 @@ func (d *InternalDataplane) setUpIptablesBPF() {
tcdefs.MarkSeenMask|mark,
),
Comment: []string{"Mark connections with ExtToServiceConnmark"},
Action: d.actionSet.SetConnmarkAction(mark, mark),
Action: d.actionSet.SetConnmark(mark, mark),
}})
}
}
@@ -1730,27 +1730,27 @@ func (d *InternalDataplane) setUpIptablesNormal() {
t.UpdateChains(rawChains)
t.InsertOrAppendRules("PREROUTING", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainRawPrerouting),
Action: d.actionSet.Jump(rules.ChainRawPrerouting),
}})
t.InsertOrAppendRules("OUTPUT", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainRawOutput),
Action: d.actionSet.Jump(rules.ChainRawOutput),
}})
}
for _, t := range d.filterTables {
filterChains := d.ruleRenderer.StaticFilterTableChains(t.IPVersion())
t.UpdateChains(filterChains)
t.InsertOrAppendRules("FORWARD", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainFilterForward),
Action: d.actionSet.Jump(rules.ChainFilterForward),
}})
t.InsertOrAppendRules("INPUT", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainFilterInput),
Action: d.actionSet.Jump(rules.ChainFilterInput),
}})
t.InsertOrAppendRules("OUTPUT", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainFilterOutput),
Action: d.actionSet.Jump(rules.ChainFilterOutput),
}})

// Include rules which should be appended to the filter table forward chain.
@@ -1760,26 +1760,26 @@ func (d *InternalDataplane) setUpIptablesNormal() {
t.UpdateChains(d.ruleRenderer.StaticNATTableChains(t.IPVersion()))
t.InsertOrAppendRules("PREROUTING", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainNATPrerouting),
Action: d.actionSet.Jump(rules.ChainNATPrerouting),
}})
t.InsertOrAppendRules("POSTROUTING", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainNATPostrouting),
Action: d.actionSet.Jump(rules.ChainNATPostrouting),
}})
t.InsertOrAppendRules("OUTPUT", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainNATOutput),
Action: d.actionSet.Jump(rules.ChainNATOutput),
}})
}
for _, t := range d.mangleTables {
t.UpdateChains(d.ruleRenderer.StaticMangleTableChains(t.IPVersion()))
t.InsertOrAppendRules("PREROUTING", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainManglePrerouting),
Action: d.actionSet.Jump(rules.ChainManglePrerouting),
}})
t.InsertOrAppendRules("POSTROUTING", []generictables.Rule{{
Match: d.newMatch(),
Action: d.actionSet.JumpAction(rules.ChainManglePostrouting),
Action: d.actionSet.Jump(rules.ChainManglePostrouting),
}})
}
if d.xdpState != nil {
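
Several rules in this file pair `SetMaskedMark(mark, mask)` with `MarkMatchesWithMask(mark, mask)`. The bit arithmetic behind that pairing can be sketched as below, assuming semantics equivalent to iptables' `MARK --set-mark value/mask` (clear the masked bits, then OR in the value) and `-m mark --mark value/mask` (compare only the masked bits). The constant values are hypothetical and are not the real `tcdefs` definitions.

```go
package main

import "fmt"

// Hypothetical bit layout for illustration; not the real tcdefs values.
const (
	markEstablished     uint32 = 0x08000000
	markEstablishedMask uint32 = 0x08000000
)

// setMaskedMark clears the bits covered by mask, then ORs in mark.
// Bits outside the mask are left untouched.
func setMaskedMark(current, mark, mask uint32) uint32 {
	return (current &^ mask) | mark
}

// markMatchesWithMask reports whether the masked bits of current equal mark.
func markMatchesWithMask(current, mark, mask uint32) bool {
	return current&mask == mark
}

func main() {
	pkt := uint32(0x000000ff) // unrelated low bits already set on the packet
	pkt = setMaskedMark(pkt, markEstablished, markEstablishedMask)
	fmt.Printf("mark=%#x matches=%v\n",
		pkt, markMatchesWithMask(pkt, markEstablished, markEstablishedMask))
}
```

Using a mask lets different subsystems share the single 32-bit packet mark without overwriting each other's bits.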
2 changes: 1 addition & 1 deletion felix/fv/infrastructure/felix.go
@@ -228,7 +228,7 @@ func RunFelix(infra DatastoreInfra, id int, options TopologyOptions) *Felix {
c.Exec("iptables", "-F", "-t", table)
}

// nftables mode requires that ipatbles be configured to allow by default. Otherwise, a default
// nftables mode requires that iptables be configured to allow by default. Otherwise, a default
// drop action will override any accept verdict made by nftables.
c.Exec("iptables",
"-w", "10", // Retry this for 10 seconds, e.g. if something else is holding the lock
35 changes: 20 additions & 15 deletions felix/generictables/actions.go
@@ -16,24 +16,29 @@ package generictables

import "github.com/projectcalico/calico/felix/environment"

type ActionSet interface {
AllowAction() Action
DropAction() Action
GoToAction(target string) Action
ReturnAction() Action
SetMarkAction(mark uint32) Action
SetMaskedMarkAction(mark, mask uint32) Action
ClearMarkAction(mark uint32) Action
JumpAction(target string) Action
NoTrackAction() Action
LogAction(prefix string) Action
SNATAction(ip string) Action
DNATAction(ip string, port uint16) Action
MasqAction(toPorts string) Action
SetConnmarkAction(mark, mask uint32) Action
type ActionFactory interface {
Allow() Action
Drop() Action
GoTo(target string) Action
Return() Action
SetMark(mark uint32) Action
SetMaskedMark(mark, mask uint32) Action
ClearMark(mark uint32) Action
Jump(target string) Action
NoTrack() Action
Log(prefix string) Action
SNAT(ip string) Action
DNAT(ip string, port uint16) Action
Masq(toPorts string) Action
SetConnmark(mark, mask uint32) Action
}

type Action interface {
ToFragment(features *environment.Features) string
String() string
}

// ReturnActionMarker is a marker interface for actions that return from a chain.
type ReturnActionMarker interface {
IsReturnAction()
}
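
A marker interface such as `ReturnActionMarker` is typically detected with a type assertion. The sketch below shows that technique with simplified types: `Action` is reduced to `String()` only, and `returnAction`, `allowAction`, and `isReturn` are hypothetical names, not code from this PR.

```go
package main

import "fmt"

// Action is reduced to String() for this sketch; the real interface
// also has ToFragment.
type Action interface{ String() string }

// ReturnActionMarker matches the marker interface added in the diff.
type ReturnActionMarker interface {
	IsReturnAction()
}

// returnAction is a hypothetical action that opts in to the marker.
type returnAction struct{}

func (returnAction) String() string  { return "Return" }
func (returnAction) IsReturnAction() {}

// allowAction is a hypothetical action that does not implement the marker.
type allowAction struct{}

func (allowAction) String() string { return "Allow" }

// isReturn uses a type assertion to check for the marker without
// depending on any concrete backend type.
func isReturn(a Action) bool {
	_, ok := a.(ReturnActionMarker)
	return ok
}

func main() {
	for _, a := range []Action{returnAction{}, allowAction{}} {
		fmt.Printf("%s: isReturn=%v\n", a, isReturn(a))
	}
}
```

This keeps backend-agnostic code from importing either the iptables or the nftables package just to recognise a return action.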
11 changes: 0 additions & 11 deletions felix/generictables/match_builder.go
@@ -71,17 +71,6 @@ type MatchCriteria interface {
NotICMPV6Type(t uint8) MatchCriteria
ICMPV6TypeAndCode(t, c uint8) MatchCriteria
NotICMPV6TypeAndCode(t, c uint8) MatchCriteria
VXLANVNI(vni uint32) MatchCriteria
}

// NFTMatchCriteria extends the generictables.MatchCriteria interface with nftables-specific methods.
type NFTMatchCriteria interface {
MatchCriteria

IPVersion(version uint8) MatchCriteria

ConntrackStatus(statusNames string) MatchCriteria
NotConntrackStatus(statusNames string) MatchCriteria
}

type AddrType string