Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpf: create and open maps using cilium/ebpf #22693

Merged
merged 11 commits into from May 25, 2023
Merged

Conversation

ti-mo
Copy link
Contributor

@ti-mo ti-mo commented Dec 12, 2022

This PR puts cilium/ebpf.Map at the core of Cilium's bpf.Map type, see #22513 for more context. Best reviewed per commit given its size. If you've been tagged for review and don't feel comfortable, please let me know so we can reassign. Don't hesitate to ping me offline for discussion or more context.

Part of: #22513.

Perform map creation and opening using cilium/ebpf API

@ti-mo
Copy link
Contributor Author

ti-mo commented Dec 12, 2022

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 13, 2022
@tommyp1ckles tommyp1ckles self-requested a review December 13, 2022 18:09
@github-actions
Copy link

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Feb 12, 2023
@github-actions
Copy link

This pull request has not seen any activity since it was marked stale.
Closing.

@github-actions github-actions bot closed this Feb 26, 2023
@ti-mo ti-mo added release-note/misc This PR makes changes that have no direct user impact. pinned These issues are not marked stale by our issue bot. and removed stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. labels Feb 27, 2023
@ti-mo ti-mo reopened this Feb 27, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Feb 27, 2023
@tommyp1ckles
Copy link
Contributor

@ti-mo Sorry never got around to reviewing this, are you planning to reopen?

@ti-mo
Copy link
Contributor Author

ti-mo commented Mar 9, 2023

@ti-mo Sorry never got around to reviewing this, are you planning to reopen?

No worries, it's a draft. :) Will pick this up again in a few weeks.

@ti-mo ti-mo force-pushed the tb/ebpf-maps branch 3 times, most recently from 75f2f0e to dfe6a86 Compare April 25, 2023 07:36
@ti-mo
Copy link
Contributor Author

ti-mo commented Apr 25, 2023

/test

@ti-mo
Copy link
Contributor Author

ti-mo commented Apr 25, 2023

/test

@ti-mo
Copy link
Contributor Author

ti-mo commented Apr 25, 2023

@YutaroHayakawa I've had to make some changes to the way per-cluster CT maps are loaded from a pin, PTAL.

I think pinning inner maps there is not strictly necessary and creates quite some accidental complexity around managing pins. Also, 4 * 256 syscalls to attempt opening all possible pins on every single GC cycle. Map-in-map reads need RCU, but surely a few BPF_MAP_GET_NEXT_KEY calls on the outer map aren't that problematic? At least the latter would just block instead of hammering BPF_OBJ_GETs, leading to lower CPU usage?

@ti-mo
Copy link
Contributor Author

ti-mo commented Apr 25, 2023

/test

@ti-mo
Copy link
Contributor Author

ti-mo commented Apr 28, 2023

/test

@ti-mo ti-mo marked this pull request as ready for review April 28, 2023 10:05
@ti-mo ti-mo requested review from a team as code owners April 28, 2023 10:05
@ti-mo
Copy link
Contributor Author

ti-mo commented May 23, 2023

/test-1.26-net-next

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathServicesTest Checks N/S loadbalancing With host policy Tests NodePort

Failure Output

FAIL: Can not connect to service "tftp://[fd04::12]:30774/hello" from outside cluster (1/10)

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/79/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

Then please upload the Jenkins artifacts to that issue.

@ti-mo
Copy link
Contributor Author

ti-mo commented May 23, 2023

net-next is hitting #25524

@joestringer
Copy link
Member

Broadly I think the direction that this PR is pushing the codebase is the right path, reducing Cilium-specific syscall related code and relying more on the cilium/ebpf library 🎉

From a quick glance, the OpenOrCreate() replacement seems to retain the semantics we want, and I welcome the improvement on OpenParallel() to more accurately reflect the reality of what it does.

@ti-mo
Copy link
Contributor Author

ti-mo commented May 24, 2023

/test

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathServicesTest Checks N/S loadbalancing With host policy Tests NodePort

Failure Output

FAIL: Can not connect to service "tftp://[fd04::11]:31363/hello" from outside cluster (1/10)

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/117/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

Then please upload the Jenkins artifacts to that issue.

@ti-mo ti-mo added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 24, 2023
@ti-mo
Copy link
Contributor Author

ti-mo commented May 24, 2023

Marked ready-to-merge as Travis' arm64 job is stalling on go vet (maybe due to many files touched, missing cache?) and net-next is currently kernel panicking. Nothing changed here that would cause a failure on net-next only. Not much point in re-running, this PR would otherwise remain blocked and in perpetual rebase hell.

@ti-mo ti-mo removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 25, 2023
When running `V=0 make ... --quiet`, the `go test ./...` executions will never
show up, and all output is redirected to /dev/null. This makes it hard to see
what's going on.

On Travis, this can cause timeouts due to the job emitting no output.

Signed-off-by: Timo Beckers <timo@isovalent.com>
ti-mo added 10 commits May 25, 2023 09:55
We're having issues with Travis jobs timing out due to no output being
generated for over 10 minutes.

Integration tests run with V=0 and --quiet barely generate any output,
since all 'go test' output is hidden, and Makefile steps are also hidden.

Signed-off-by: Timo Beckers <timo@isovalent.com>
MapFdFromID was unused in the codebase, remove it. Also remove
unused command args to bpf(). The remainder acts as a list of behaviour
to port to use cilium/ebpf.

Signed-off-by: Timo Beckers <timo@isovalent.com>
Errors returned from OpenOrCreateMap don't necessarily contain any identifying
context about the map being created. Wrap all errors encountered during map
creation with the map/subsystem name.

Signed-off-by: Timo Beckers <timo@isovalent.com>
This commit removes the bool return value from OpenOrCreate() that indicates
whether or not a Map was recreated in the process.

It was only checked in tests and generally didn't provide much value or
actionable feedback, only inflating API surface.

Signed-off-by: Timo Beckers <timo@isovalent.com>
This is the first in a series of patches that aims to remove all uses of
unix.Syscall in the cilium/cilium codebase in favour of the cilium/ebpf APIs.

CreateMap() was unexported and renamed to createMap(). It now calls directly into
ebpf.NewMapWithOptions and takes care of passing MapOptions when specified (for
passing a PinPath).

OpenOrCreateMap() now takes an ebpf.MapSpec and a pinDir and had most of its logic
removed as compatibility checks are already taken care of by ebpf-go.

NewMap() no longer takes innerID. A new function NewMapWithInnerSpec() was introduced
to provide an ebpf.MapSpec to use for the inner map definition during map creation.

Signed-off-by: Timo Beckers <timo@isovalent.com>
The innerID argument was only used by 2 maps that needed inner maps. Those were
migrated to NewMapWithInnerSpec() in a prior commit, when the argument was
already blanked out. This commit removes it.

Signed-off-by: Timo Beckers <timo@isovalent.com>
OpenMap() now calls ebpf.LoadPinnedMap and must be given an asbolute path.
bpf.ObjGet() was removed as it is now unused.

Signed-off-by: Timo Beckers <timo@isovalent.com>
- Remove the bpf.Map.fd field and all accesses to it
- Rename bpf.Map.GetFd() to FD() and perform all map FD accesses through it

Signed-off-by: Timo Beckers <timo@isovalent.com>
A map is considered in 'parallel' mode when it has been unpinned, recreated and
re-pinned, but the old map is still referred to by living endpoint programs until
those eventually get phased out by replacing their progs.

Rename OpenParallel() to Recreate() as it's a confusing term from a data/bpf
standpoint, as there are no double-writes going on.

The Map.NonPersistent flag is a deferred way of saying "always remove the map's
pin when creating". Remove this bit of state, instead explicitly calling
Recreate() for opening the Map.

Signed-off-by: Timo Beckers <timo@isovalent.com>
…t map

By changing the underlying map abstraction to cilium/ebpf, opening a map from a pin
no longer returns an error chain containing a os.PathError, only a unix.ENOENT.
Return errors from getClusterMap() instead of a vague 'nil, nil'.

Check for ENOENT in getAllClusterMaps() instead of checking the first return value.
Stop checking the first return value in multiple places.

Also removed redundant mapType argument from PerClusterCTMap.newInnerMap().

Update tests to perform error checking first before accessing any other return value.

Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo ti-mo requested a review from a team as a code owner May 25, 2023 07:58
@ti-mo ti-mo requested a review from rgo3 May 25, 2023 07:58
@ti-mo
Copy link
Contributor Author

ti-mo commented May 25, 2023

/test

@ti-mo ti-mo linked an issue May 25, 2023 that may be closed by this pull request
Copy link
Contributor

@squeed squeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved the tophat-relevant files.

@ti-mo ti-mo merged commit 4d9c72c into cilium:main May 25, 2023
58 checks passed
@ti-mo ti-mo deleted the tb/ebpf-maps branch May 25, 2023 13:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pinned These issues are not marked stale by our issue bot. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI: Travis: Times out from no output received in the last 10m0s
8 participants