Native nftables dataplane #8780

Open
wants to merge 37 commits into
base: master

Conversation

caseydavenport
Member

@caseydavenport caseydavenport commented May 2, 2024

Description

This is still a WIP, but putting it up in order to get initial feedback
and to get some CI runs started.

Todos

  • UTs for new nftables code
  • FVs for nftables
  • Existing tests passing
  • Add FelixConfiguration option to enable nftables
  • Release note

Release Note

TBD

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

@caseydavenport caseydavenport requested a review from a team as a code owner May 2, 2024 17:54
@marvin-tigera marvin-tigera added this to the Calico v3.29.0 milestone May 2, 2024
@marvin-tigera marvin-tigera added release-note-required Change has user-facing impact (no matter how small) docs-pr-required Change is not yet documented labels May 2, 2024
rules = append(rules, Rule{
Action: JumpAction{Target: failsafeChain},
rules = append(rules, generictables.Rule{
Match: r.NewMatch(),
Member Author

One of the unfortunate consequences of using an interface type for the match is that leaving the Match field nil is no longer viable: calling a bound method on a nil interface panics with a nil pointer dereference, whereas previously the method would have been called on a nil slice, which is safe.
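To illustrate the difference described above, here is a minimal, self-contained sketch. The names `sliceMatch`, `Match`, and `renderSafely` are made up for this example and are not the types in the PR; the point is only the Go behaviour of method calls on a nil slice versus a nil interface.

```go
package main

import (
	"fmt"
	"strings"
)

// sliceMatch is a hypothetical stand-in for a slice-backed match type.
type sliceMatch []string

// Render works on a nil sliceMatch: joining a nil slice yields "".
func (m sliceMatch) Render() string {
	return strings.Join(m, " ")
}

// Match is a hypothetical stand-in for an interface-based match type.
type Match interface {
	Render() string
}

// renderSafely calls Render and reports whether the call panicked.
func renderSafely(m Match) (rendered string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return m.Render(), false
}

func main() {
	var s sliceMatch // nil slice: the method call is fine
	fmt.Printf("nil slice renders %q\n", s.Render())

	var m Match // nil interface: the method call panics
	_, panicked := renderSafely(m)
	fmt.Println("nil interface panicked:", panicked)
}
```

Note that a nil `sliceMatch` stored in a `Match` variable is a non-nil interface value, so the method call still succeeds in that case; only a completely unset interface field panics.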

Member

Yeah, I was trying to think of a way around that. Probably not worth it, but you could maybe do it with generics:

type Rule[M Match, A Action] struct {
  Match  M
  Action A
}

Then the rule renderer could have [M Match, A Action] too and pass them through. Would create a different kind of boilerplate everywhere! Advantage would be that the compiler would enforce getting it right.
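A slightly fuller sketch of the generics idea, under stated assumptions: `Match` and `Action` here are simplified one-method interfaces, and `nftMatch`, `JumpAction.Fragment`, and the chain name are invented for illustration rather than taken from the PR. It shows that when the type parameters are instantiated with value types, a rule with no explicit match still has a usable zero value.

```go
package main

import (
	"fmt"
	"strings"
)

// Match and Action are simplified stand-ins for the real interfaces.
type Match interface{ Render() string }
type Action interface{ Fragment() string }

// Rule is parameterised on concrete match and action types.
type Rule[M Match, A Action] struct {
	Match  M
	Action A
}

// String renders the rule, omitting the match part when it is empty.
func (r Rule[M, A]) String() string {
	m := r.Match.Render()
	if m == "" {
		return r.Action.Fragment()
	}
	return m + " " + r.Action.Fragment()
}

// nftMatch is a hypothetical value-type match; its zero value (nil) is usable.
type nftMatch []string

func (m nftMatch) Render() string { return strings.Join(m, " ") }

// JumpAction is a hypothetical action that jumps to another chain.
type JumpAction struct{ Target string }

func (a JumpAction) Fragment() string { return "jump " + a.Target }

func main() {
	// Match is left unset; its zero value is a nil nftMatch, not a nil interface.
	r := Rule[nftMatch, JumpAction]{Action: JumpAction{Target: "example-failsafe"}}
	fmt.Println(r.String())

	r.Match = nftMatch{"ct state established"}
	fmt.Println(r.String())
}
```

The compiler only helps if `M` is instantiated with a value type; instantiating it with an interface or pointer type would bring the nil problem back, which is part of the extra boilerplate cost mentioned above.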

Contributor

@tomastigera tomastigera left a comment

Would it make sense to rename generictables (sounds pretty generic, but it is not so much ;-) ) to say linuxtables or nettables and place both iptables and nftables below that? Now we have 3 top level tables packages, which are for linux only and one is just interfaces.

UpdateChains([]*iptables.Chain)
RemoveChains([]*iptables.Chain)
// Table is a shim interface for generictables.Table.
type Table interface {
Contributor

nit: What if we move this type now to generictables as say RuleChains and embed it in Table?

@caseydavenport
Member Author

Would it make sense to rename generictables (sounds pretty generic, but it is not so much ;-) ) to say linuxtables or nettables and

@tomastigera yep, I think those are both better!

@caseydavenport caseydavenport force-pushed the casey-nft-proto branch 2 times, most recently from 6d4c097 to e7f8874 Compare May 20, 2024 21:31
@caseydavenport caseydavenport force-pushed the casey-nft-proto branch 3 times, most recently from 88a480e to 02e3b17 Compare June 6, 2024 22:59
@caseydavenport caseydavenport force-pushed the casey-nft-proto branch 3 times, most recently from 8244f4c to ca85987 Compare June 12, 2024 23:10
@caseydavenport caseydavenport changed the title Initial nftables prototype implementation Native nftables dataplane Jun 13, 2024
Member

@fasaxc fasaxc left a comment

Overall looks like great work! I don't think I spotted any major bugs, just a few nits and minor things along with a few places where it might be worth sharing some code between the dataplanes.

The GitHub UI is really struggling with the size of the PR, so my preference would be for you to make the smaller markups in new commits (so I can review them as a diff) and then to bank this before doing any further work to share code etc.

A few spec things to tie down:

  • Config model; should we make a Dataplane enum and deprecate the old XXXEnabled flags?
  • Interleaving rules, I know some customers like to add their own rules in between our "insert" and "append" layers, should we define a way to do that (e.g. can it be done with multiple chains at different priorities or something like that)
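For the first question, a purely hypothetical sketch of what a dataplane enum with backwards-compatible handling of legacy boolean flags could look like. None of these names (`Dataplane`, `Config`, `BPFEnabled`, `NFTablesEnabled`, `EffectiveDataplane`) are taken from the PR or from the FelixConfiguration API, and treating the dataplanes as mutually exclusive is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// Dataplane is a hypothetical enum replacing per-dataplane boolean flags.
type Dataplane string

const (
	DataplaneIptables Dataplane = "Iptables"
	DataplaneNftables Dataplane = "Nftables"
	DataplaneBPF      Dataplane = "BPF"
)

// Config is a hypothetical configuration struct for this sketch.
type Config struct {
	Dataplane       Dataplane // new field; empty means "not set"
	BPFEnabled      bool      // stand-in for a deprecated legacy flag
	NFTablesEnabled bool      // stand-in for a deprecated legacy flag
}

// EffectiveDataplane prefers the enum and falls back to the legacy flags.
func (c Config) EffectiveDataplane() (Dataplane, error) {
	switch c.Dataplane {
	case DataplaneIptables, DataplaneNftables, DataplaneBPF:
		return c.Dataplane, nil
	case "":
		// Not set: consult the legacy flags below.
	default:
		return "", fmt.Errorf("unknown dataplane %q", c.Dataplane)
	}
	switch {
	case c.BPFEnabled && c.NFTablesEnabled:
		return "", errors.New("conflicting legacy flags: set Dataplane instead")
	case c.BPFEnabled:
		return DataplaneBPF, nil
	case c.NFTablesEnabled:
		return DataplaneNftables, nil
	default:
		return DataplaneIptables, nil
	}
}

func main() {
	dp, err := Config{NFTablesEnabled: true}.EffectiveDataplane()
	fmt.Println(dp, err)
}
```

Letting the enum win over the legacy flags keeps existing configurations working while giving a single place to reject invalid combinations.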

type NFTMatchCriteria interface {
MatchCriteria

ConntrackStatus(statusNames string) MatchCriteria
Member

There's ConntrackState() above too? Could they be unified?

Member Author

In iptables mode, I believe these resolve down to the same match, but in nftables mode there are two separate matches - ct state and ct status for checking different types of conntrack information. ct state matches the state of the connection in conntrack (e.g., established, invalid, etc.) whereas ct status matches a number of other metadata bits related to the connection - like whether NAT is performed, or flow table offload is being used for this connection.

https://wiki.nftables.org/wiki-nftables/index.php/Matching_connection_tracking_stateful_metainformation#Matching_conntrack_metadata
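As a small illustration of why the two methods stay separate in nftables mode, the sketch below renders them to distinct `ct state` and `ct status` expressions. The `matchBuilder` type is invented for this example and is not the PR's match builder; only the rendered nftables expressions follow the syntax described on the wiki page linked above.

```go
package main

import (
	"fmt"
	"strings"
)

// matchBuilder is a hypothetical, immutable builder of nftables match clauses.
type matchBuilder struct{ clauses []string }

// with returns a copy of the builder with one more clause appended.
func (m matchBuilder) with(clause string) matchBuilder {
	next := make([]string, 0, len(m.clauses)+1)
	next = append(next, m.clauses...)
	return matchBuilder{clauses: append(next, clause)}
}

// ConntrackState matches the connection state, e.g. "established,related".
func (m matchBuilder) ConntrackState(names string) matchBuilder {
	return m.with("ct state " + names)
}

// ConntrackStatus matches conntrack status bits, e.g. "dnat".
func (m matchBuilder) ConntrackStatus(names string) matchBuilder {
	return m.with("ct status " + names)
}

// Render joins the clauses into a single rule fragment.
func (m matchBuilder) Render() string { return strings.Join(m.clauses, " ") }

func main() {
	m := matchBuilder{}.ConntrackState("established,related").ConntrackStatus("dnat")
	fmt.Println(m.Render())
}
```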

api/pkg/apis/projectcalico/v3/felixconfig.go (outdated, resolved)
node/clean-up-filesystem.sh (outdated, resolved)
var ipSetsV4 common.IPSetsDataplane

if config.RulesConfig.NFTables {
// Create the underlying table.
Member

If you did want to trim this down a bit, you could make a new Tables interface:

if nftables {
  tables = nftables.Tables(...)
}
...
ipSetsV4 = tables.IPSets()
...
mangleTableV4 = tables.Mangle()
filterTableV4 = tables.Filter()
...
<get rid of allTables field>
func apply(...) {
  ...
  for _, t := range dp.tables.TablesToApply() {
    t.apply()
  }
}
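The outline above could be fleshed out roughly as follows. This is a sketch under assumptions: the `Tables` method names come from the comment, but `Table`, `IPSetsDataplane`, `tableSet`, and the fake implementations are simplified inventions for illustration and do not reflect the real Felix types.

```go
package main

import "fmt"

// Table is a simplified stand-in for a programmable table.
type Table interface {
	Name() string
	Apply()
}

// IPSetsDataplane is a simplified stand-in for the IP sets backend.
type IPSetsDataplane interface{ Backend() string }

// Tables groups everything the dataplane driver needs from one backend.
type Tables interface {
	IPSets() IPSetsDataplane
	Mangle() Table
	Filter() Table
	TablesToApply() []Table
}

type fakeTable struct {
	name    string
	applies int
}

func (t *fakeTable) Name() string { return t.name }
func (t *fakeTable) Apply()       { t.applies++ }

type fakeIPSets struct{ backend string }

func (s fakeIPSets) Backend() string { return s.backend }

// tableSet is a toy Tables implementation shared by both fake backends.
type tableSet struct {
	backend        string
	mangle, filter *fakeTable
}

func newTables(backend string) *tableSet {
	return &tableSet{
		backend: backend,
		mangle:  &fakeTable{name: backend + "-mangle"},
		filter:  &fakeTable{name: backend + "-filter"},
	}
}

func (ts *tableSet) IPSets() IPSetsDataplane { return fakeIPSets{backend: ts.backend} }
func (ts *tableSet) Mangle() Table           { return ts.mangle }
func (ts *tableSet) Filter() Table           { return ts.filter }
func (ts *tableSet) TablesToApply() []Table  { return []Table{ts.mangle, ts.filter} }

// applyAll replaces a hand-maintained allTables slice with one loop.
func applyAll(ts Tables) int {
	n := 0
	for _, t := range ts.TablesToApply() {
		t.Apply()
		n++
	}
	return n
}

func main() {
	useNFTables := true
	backend := "iptables"
	if useNFTables {
		backend = "nftables"
	}
	var tables Tables = newTables(backend)
	fmt.Println(tables.IPSets().Backend(), tables.Mangle().Name(), applyAll(tables))
}
```

The backend choice is then made once, and the rest of the driver only talks to the `Tables` interface.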

felix/dataplane/linux/int_dataplane.go (outdated, resolved)
felix/rules/endpoints.go (outdated, resolved)
felix/rules/policy.go (outdated, resolved)
felix/fv/bpf_test.go (outdated, resolved)
felix/nftables/table_test.go (outdated, resolved)
@danwinship danwinship left a comment

(quick skim of just the nftables-specific bits...)

felix/generictables/nft_renderer.go (outdated, resolved)
felix/generictables/nft_renderer.go (outdated, resolved)
felix/generictables/nft_renderer.go (outdated, resolved)
felix/nftables/ipsets.go (resolved)
Comment on lines +414 to +415
// TODO: Ideally we'd extract this information from the data plane itself, but it's not exposed
// via knftables at the moment.

hm, what does that mean?

Member Author

The in-memory metadata we keep for each set includes some things like the max size of the set. This function is attempting to synchronize our in-memory view of the set with what actually is programmed, but it's just copying the "desired" Size / Range metadata instead of reading it from the programmed set. I couldn't easily find a way to get this information from knft (can just get set names, and elements).

I think it's OK and actually not especially relevant, considering right now the nftables code doesn't actually use any of these pieces of metadata when programming the set.
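Roughly, the behaviour described above could be pictured like this. It is a simplified sketch, not the code in felix/nftables/ipsets.go: `setMetadata`, `setView`, and `resync` are invented names, and the programmed state is modelled as a plain map of set names to elements, since those are the only pieces said to be readable back.

```go
package main

import "fmt"

// setMetadata is a hypothetical holder for the per-set metadata.
type setMetadata struct {
	MaxSize            int
	RangeMin, RangeMax int
}

// setView is the in-memory view of one programmed set.
type setView struct {
	Meta    setMetadata
	Members map[string]struct{}
}

// resync rebuilds the in-memory view from what is programmed. Names and
// members come from the dataplane; metadata is copied from the desired
// state because it cannot be read back in this model.
func resync(desired map[string]setMetadata, programmed map[string][]string) map[string]setView {
	out := make(map[string]setView, len(programmed))
	for name, elems := range programmed {
		members := make(map[string]struct{}, len(elems))
		for _, e := range elems {
			members[e] = struct{}{}
		}
		// Sets with no desired entry get zero-valued metadata.
		out[name] = setView{Meta: desired[name], Members: members}
	}
	return out
}

func main() {
	desired := map[string]setMetadata{"example-set": {MaxSize: 1024}}
	programmed := map[string][]string{"example-set": {"10.0.0.1", "10.0.0.2"}}
	view := resync(desired, programmed)
	fmt.Println(len(view["example-set"].Members), view["example-set"].Meta.MaxSize)
}
```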

felix/nftables/match_builder.go (outdated, resolved)
felix/nftables/table.go (outdated, resolved)
felix/nftables/table.go (outdated, resolved)
felix/nftables/table.go (resolved)
felix/nftables/table.go (outdated, resolved)
felix/ip/trie.go (outdated, resolved)
felix/ip/trie.go (outdated, resolved)
@caseydavenport
Member Author

Config model; should we make a Dataplane enum and deprecate the old XXXEnabled flags?
Interleaving rules, I know some customers like to add their own rules in between our "insert" and "append" layers, should we define a way to do that (e.g. can it be done with multiple chains at different priorities or something like that)

I think I'm in favor of both of these, but they should probably each be separate PRs

Labels
docs-pr-required Change is not yet documented release-note-required Change has user-facing impact (no matter how small)
5 participants