Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modularize eventsmap and monitor.Agent #25197

Merged
merged 2 commits into from May 31, 2023

Conversation

bimmlerd
Copy link
Member

@bimmlerd bimmlerd commented Apr 28, 2023

Modularizes both the eventsmap and the monitor agent in the simplest way possible.

Would be happy to get some input on:

Ref: #24192

@bimmlerd bimmlerd added kind/enhancement This would improve or streamline existing functionality. release-note/misc This PR makes changes that have no direct user impact. sig/agent Cilium agent related. area/modularization labels Apr 28, 2023
@bimmlerd bimmlerd changed the title Modularize eventsmap and monito.Agent Modularize eventsmap and monitor.Agent Apr 28, 2023
@bimmlerd bimmlerd force-pushed the pr/bimmlerd/monitor-agent-cell branch 6 times, most recently from 4cca0b3 to ae82183 Compare May 9, 2023 15:31
@bimmlerd
Copy link
Member Author

bimmlerd commented May 10, 2023

Huh, the integration test failure is an ipcache/identity cache panic, seemingly unrelated to my PR:

 panic: send on closed channel

goroutine 778 [running]:
github.com/cilium/cilium/pkg/identity/cache.(*localIdentityCache).lookupOrCreate(0xc001d3bb40, 0xc001a2e360, 0x0?)
	/home/runner/work/cilium/cilium/pkg/identity/cache/local.go:109 +0x55d
github.com/cilium/cilium/pkg/identity/cache.(*CachingIdentityAllocator).AllocateIdentity(0xc002828300, {0x3b9b180, 0xc000776280}, 0xc001a2e360, 0x0?, 0xc0?)
	/home/runner/work/cilium/cilium/pkg/identity/cache/allocator.go:349 +0x76c
github.com/cilium/cilium/pkg/ipcache.(*IPCache).resolveIdentity(0xc00010fd00, {0x3b9b180, 0xc000776280}, {{{0x0?, 0x0?}, 0xc000010048?}, 0x0?}, 0x0?, 0x2fd8340?)
	/home/runner/work/cilium/cilium/pkg/ipcache/metadata.go:436 +0xcdd
github.com/cilium/cilium/pkg/ipcache.(*IPCache).InjectLabels(0xc00010fd00, {0x3b9b180, 0xc000776280}, {0xc002958000, 0x3, 0x3})
	/home/runner/work/cilium/cilium/pkg/ipcache/metadata.go:197 +0x68f
github.com/cilium/cilium/pkg/ipcache.(*IPCache).TriggerLabelInjection.func1({0x3b9b180, 0xc000776280})
	/home/runner/work/cilium/cilium/pkg/ipcache/metadata.go:547 +0x54
github.com/cilium/cilium/pkg/controller.(*Controller).runController(0xc0012e0240)
	/home/runner/work/cilium/cilium/pkg/controller/controller.go:216 +0x217
created by github.com/cilium/cilium/pkg/controller.(*Manager).updateController
	/home/runner/work/cilium/cilium/pkg/controller/manager.go:111 +0xb51
FAIL	github.com/cilium/cilium/daemon/cmd	12.826s

Adressed in #25353, rerunning.

@bimmlerd
Copy link
Member Author

/test

@bimmlerd bimmlerd force-pushed the pr/bimmlerd/monitor-agent-cell branch from ae82183 to ae9a453 Compare May 12, 2023 08:59
@bimmlerd bimmlerd marked this pull request as ready for review May 12, 2023 10:56
@bimmlerd bimmlerd requested review from a team as code owners May 12, 2023 10:56
@tklauser
Copy link
Member

That link unfortunately doesn't show commit contents anymore. Presumably because you've updated/rebased the commits on the branch?

daemon/cmd/daemon_main.go Outdated Show resolved Hide resolved
daemon/cmd/daemon_test.go Outdated Show resolved Hide resolved
pkg/datapath/cells.go Outdated Show resolved Hide resolved
pkg/monitor/agent/cell.go Outdated Show resolved Hide resolved
pkg/monitor/agent/cell.go Show resolved Hide resolved
@bimmlerd bimmlerd force-pushed the pr/bimmlerd/monitor-agent-cell branch from ae9a453 to afa6252 Compare May 16, 2023 08:32
Copy link
Member

@jrajahalme jrajahalme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits only, take them as coming from from a complete novice in hive/cell :-)

flags.Int("monitor-queue-size", 0, "Size of the event queue when reading monitor events")
}

func new(lc hive.Lifecycle, log logrus.FieldLogger, cfg AgentConfig, _ eventsmap.Map) Agent {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer naming this to newMonitorAgent for the sake of "greppability".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's not exported the naming doesn't matter much. For exported constructors I'd still recommend just using New as per usual Go guidelines. You can always grep for foo.New.

@@ -20,6 +20,8 @@ import (
"github.com/cilium/cilium/pkg/maps/authmap"
fakeauthmap "github.com/cilium/cilium/pkg/maps/authmap/fake"
"github.com/cilium/cilium/pkg/maps/egressmap"
"github.com/cilium/cilium/pkg/monitor/agent"
fakeagent "github.com/cilium/cilium/pkg/monitor/agent/fake"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call this fakemonitoragent to disambiguate with any other (future) "agents"

eventsmap.Cell,

// The monitor agent, which multicasts cilium and agent events to its subscribers.
monitorAgent.Cell,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe monitorAgent would be better placed in daemon/cmd/cells.go?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thinking was that anything which interacts with bpf Maps directly is datapath; but I agree that this is sort of on the fence. Let's leave it here for now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue it belongs to pkg/datapath since it provides a datapath-specific service.

@bimmlerd
Copy link
Member Author

Addressed most comments, as well as made the dependency on the eventsmap optional as suggested by Jussi out of band - this way, no config overrides are necessary.

I haven't removed --enable-monitor as it was present before -- but I guess we could talk about deprecating it; disabling doesn't make that much sense as it breaks hubble, and if it's not used, it also doesn't consume CPU cycles. Questions for another PR, though.

@bimmlerd bimmlerd requested a review from joamaki May 23, 2023 14:57
@bimmlerd
Copy link
Member Author

CI triage:

@bimmlerd
Copy link
Member Author

/ci-e2e

@bimmlerd
Copy link
Member Author

/test-runtime

@bimmlerd bimmlerd force-pushed the pr/bimmlerd/monitor-agent-cell branch from b2ff005 to dab15a4 Compare May 25, 2023 13:35
@bimmlerd
Copy link
Member Author

rebased to address conflicts

@bimmlerd
Copy link
Member Author

/test

@bimmlerd bimmlerd force-pushed the pr/bimmlerd/monitor-agent-cell branch from dab15a4 to a92188d Compare May 26, 2023 08:01
@bimmlerd
Copy link
Member Author

Rebased to get #25681

@bimmlerd
Copy link
Member Author

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 26, 2023
@squeed
Copy link
Contributor

squeed commented May 26, 2023

needs rebase :-(

@squeed squeed added dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. and removed ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels May 26, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 26, 2023
@squeed squeed added dont-merge/bad-bot To prevent MLH from marking ready-to-merge. and removed ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels May 26, 2023
The cilium events map is used by the monitor.Agent. In preparation of
modularization of the agent, move the initialization of the eventsmap
into the hive.

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
The monitor agent consumes the eventsmap to multicast its events, as
well as other agent events, to subscribers. Pull it's initialization
logic into a cell, to reduce the amount of logic needed in the daemon
startup sequence. If there's no eventsmap in the hive graph, the agent
functions solely for passing along agent events.

Dropping RunMonitor flag:
Except for testing, there isn't really a use case for not attaching to
the eventsmap (The attachment alone doesn't cause performance overhead
yet, only once there is a consumer/listener we start reading.) Putting
RunMonitor into AgentConfig would mean exposing it as a flag, which
is not desirable. Instead, make the eventsmap dependency optional so
that non-privileged tests/benchmarks can still use the monitor agent for
its userspace capabilities.

Proxy Logger changes:
To improve on the isolation of components, make NewAgent private. The
only consumer other than the cell is the proxy logger benchmark, which
can create a hive instead. Unfortunately, this makes the benchmarks
require privileges.

Dropping Context() from the API:
There isn't a good reason to export the context, and it doesn't
belong in the monitor's public API. Rework the coordination between the
monitor API server and the agent slightly to make the Context() method
unnecessary.

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
@bimmlerd bimmlerd force-pushed the pr/bimmlerd/monitor-agent-cell branch from a92188d to f594833 Compare May 26, 2023 12:40
@bimmlerd
Copy link
Member Author

/test

@bimmlerd bimmlerd removed dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. dont-merge/bad-bot To prevent MLH from marking ready-to-merge. labels May 26, 2023
@bimmlerd
Copy link
Member Author

/ci-eks

@bimmlerd
Copy link
Member Author

CI triage:

  • CI EKS failed with connectivity test failed: failed to fetch cilium runtime config: error dialing backend: read tcp 192.168.171.74:55806->192.168.107.35:10250: read: connection reset by peer which sounds like a network blip.

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 30, 2023
@julianwiedmann julianwiedmann merged commit d496777 into cilium:main May 31, 2023
58 checks passed
@bimmlerd bimmlerd deleted the pr/bimmlerd/monitor-agent-cell branch May 31, 2023 06:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/modularization kind/enhancement This would improve or streamline existing functionality. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact. sig/agent Cilium agent related.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants