Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/maps/egressmap: refactor into a cell #24865

Merged
merged 2 commits into from May 4, 2023
Merged

Conversation

lmb
Copy link
Contributor

@lmb lmb commented Apr 13, 2023

pkg/maps/egressmap: refactor policy map into a cell

Rework the egressmap from a global variable into a Cell. The map still has
singleton behaviour due to map pinning, but at least for unit tests we can
opt out of pinning and isolate the map from global changes.

Stop writing out EGRESS_POLICY_MAP_SIZE into node_config.h since the map
definition in C is never actually used. The map size is configured via a
command line parameter instead. Hardcode the previously used default value
to avoid errors when running BPF unit tests. This is simpler than wiring the
new PolicyConfig into HeaderfileWriter.

Reuse the MapOut strategy from configmap and authmap to ensure that the
egressmap is initialized by the agent before the loader is invoked. 
Otherwise map creation might use the (incorrect) MaxElems from the compiled
C code.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>

hivetest: add Lifecycle to make testing easier

hive requires splitting up object lifecycle into New, Start, Stop. It does
this by injecting a hive.Lifecycle into constructors. This in turn means
that we need a lifecycle to create objects during testing.

Implement a minimal hive.Lifecycle for use in testing, which doesn't 
distinguish between New and Start phases and instead immediately executes
any start hooks. Stop hooks are invoked at test cleanup time.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Refactor egressgateway specific maps into a cell

Updates: #23782

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Apr 13, 2023
@lmb
Copy link
Contributor Author

lmb commented Apr 13, 2023

/test-runtime

@lmb
Copy link
Contributor Author

lmb commented Apr 13, 2023

/test-runtime

@lmb
Copy link
Contributor Author

lmb commented Apr 13, 2023

/test-runtime

@lmb lmb added the release-note/misc This PR makes changes that have no direct user impact. label Apr 17, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Apr 17, 2023
@lmb
Copy link
Contributor Author

lmb commented Apr 17, 2023

/test

@lmb lmb marked this pull request as ready for review April 17, 2023 09:03
@lmb lmb requested review from a team as code owners April 17, 2023 09:03
@lmb lmb requested review from thorn3r, asauber, aspsk and ti-mo April 17, 2023 09:03
pkg/egressgateway/policy.go Outdated Show resolved Hide resolved
@@ -213,7 +213,7 @@ struct {
__type(key, struct egress_gw_policy_key);
__type(value, struct egress_gw_policy_entry);
__uint(pinning, LIBBPF_PIN_BY_NAME);
__uint(max_entries, EGRESS_POLICY_MAP_SIZE);
__uint(max_entries, 16384);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to keep the BPF unit tests happy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, value must be hardcoded and constant is not allowed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the define for EGRESS_POLICY_MAP_SIZE so that I can get rid of the entry in node_config.h which is written out at runtime. It's difficult to plumb the size requested via config / command line into node_config.h without relying on option.Config global which I wanted to avoid.

So: we could keep the define but move it to maps.h. That runs the risk of some C code relying on that define with the assumption that it reflects the configured value. So I opted to remove the define instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@lmb
Copy link
Contributor Author

lmb commented Apr 19, 2023

The current state of the PR is not good to merge, since it doesn't enforce a dependency between the loader component and the map.

@lmb lmb added the dont-merge/blocked Another PR must be merged before this one. label Apr 19, 2023
@lmb lmb force-pushed the egressmap-cell branch 2 times, most recently from c3ab40e to 180db68 Compare April 25, 2023 13:26
@lmb
Copy link
Contributor Author

lmb commented Apr 25, 2023

/test-runtime

@lmb lmb requested a review from a team as a code owner April 26, 2023 09:18
pkg/maps/egressmap/policy.go Outdated Show resolved Hide resolved
return err
}
func createPolicyMapFromDaemonConfig(daemonConfig *option.DaemonConfig, lc hive.Lifecycle, cfg PolicyConfig) bpf.MapOut[PolicyMap] {
if !daemonConfig.EnableIPv4EgressGateway {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can get away without this? Ties into the whole "optional hive feature".

@lmb lmb force-pushed the egressmap-cell branch 3 times, most recently from c74889d to 7174e0e Compare April 27, 2023 12:45
@lmb
Copy link
Contributor Author

lmb commented Apr 27, 2023

/test

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathServicesTest Checks N/S loadbalancing With host policy Tests NodePort

Failure Output

FAIL: Can not connect to service "http://192.168.56.12:30861" from outside cluster (1/10)

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/1953/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

Then please upload the Jenkins artifacts to that issue.

Copy link
Member

@jibi jibi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks! Just one nit and a question

@@ -101,6 +101,7 @@ type datapathParams struct {
LC hive.Lifecycle
WgAgent *wg.Agent

// Force map initialisation before loader
// Force map initialisation before loader. You should not use these otherwise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra dot at the end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I don't understand, sorry.

pkg/egressgateway/manager_test.go Outdated Show resolved Hide resolved
Copy link
Member

@mhofstetter mhofstetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks for introducing the generic MapOut type

@lmb
Copy link
Contributor Author

lmb commented May 2, 2023

/test

@lmb
Copy link
Contributor Author

lmb commented May 2, 2023

Travis error is probably a stuck test #23509

Copy link
Contributor

@joamaki joamaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hivetest package looks good to me

@lmb
Copy link
Contributor Author

lmb commented May 2, 2023

runtime test is #25178

@lmb lmb added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 2, 2023
@joestringer joestringer removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 2, 2023
@joestringer
Copy link
Member

Please make the tests green and get all codeowner groups' review.

@joestringer joestringer added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 2, 2023
lmb added 2 commits May 3, 2023 14:44
hive requires splitting up object lifecycle into New, Start, Stop.
It does this by injecting a hive.Lifecycle into constructors. This
in turn means that we need a lifecycle to create objects during testing.

Implement a minimal hive.Lifecycle for use in testing, which doesn't
distinguish between New and Start phases and instead immediately
executes any start hooks. Stop hooks are invoked at test cleanup time.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Rework the egressmap from a global variable into a Cell. The map still
has singleton behaviour due to map pinning, but at least for unit tests
we can opt out of pinning and isolate the map from global changes.

Stop writing out EGRESS_POLICY_MAP_SIZE into node_config.h since
the map definition in C is never actually used. The map size is
configured via a command line parameter instead. Hardcode the previously
used default value to avoid errors when running BPF unit tests.
This is simpler than wiring the new PolicyConfig into HeaderfileWriter.

Reuse the MapOut strategy from configmap and authmap to ensure that
the egressmap is initialized by the agent before the loader is invoked.
Otherwise map creation might use the (incorrect) MaxElems from the
compiled C code.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
@lmb
Copy link
Contributor Author

lmb commented May 3, 2023

@viktor-kurchenko could you take another look and approve?

@lmb
Copy link
Contributor Author

lmb commented May 4, 2023

/test

Copy link
Contributor

@thorn3r thorn3r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, this was a nice lil intro to hive for me 😎 nice work

@maintainer-s-little-helper maintainer-s-little-helper bot removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 4, 2023
@lmb lmb added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 4, 2023
@lmb
Copy link
Contributor Author

lmb commented May 4, 2023

I added ready-to-merge accidentally but am now getting 500 when trying to remove it.

@maintainer-s-little-helper maintainer-s-little-helper bot removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 4, 2023
@lmb lmb added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 4, 2023
@joestringer joestringer merged commit 445ae6c into cilium:main May 4, 2023
56 checks passed
@lmb lmb deleted the egressmap-cell branch May 5, 2023 08:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants