Compatibility with kernels configured without IPV6_MULTIPLE_TABLES #2050

Open
xen0n opened this issue Jun 23, 2019 · 5 comments

Comments

@xen0n
commented Jun 23, 2019

The route syncing logic does not filter on the actual route type; it implicitly assumes it is operating on the main routing table only. However, on kernels built without CONFIG_IPV6_MULTIPLE_TABLES, the IPv6 local routes are returned in the main table too. Felix promptly purges these local routes, and connectivity is lost because NDP packets can no longer be handled.

Expected Behavior

Normal IPv6 connectivity on kernels built with or without CONFIG_IPV6_MULTIPLE_TABLES.

Current Behavior

No IPv6 connectivity on kernels built without CONFIG_IPV6_MULTIPLE_TABLES. The routes are periodically purged by Felix so even manually adding them back doesn't work:

2019-06-22 12:57:13.826 [INFO][70] int_dataplane.go 907: Applying dataplane updates
2019-06-22 12:57:13.826 [INFO][70] route_table.go 222: Queueing a resync of routing table. ipVersion=0x4
2019-06-22 12:57:13.827 [INFO][70] route_table.go 222: Queueing a resync of routing table. ipVersion=0x6
2019-06-22 12:57:13.828 [INFO][70] route_table.go 528: Syncing routes: removing old route. dest=fe80::ecee:eeff:feee:eeee/128 ifaceName="cali0109c211321" ipVersion=0x6
2019-06-22 12:57:13.828 [INFO][70] int_dataplane.go 921: Finished applying updates to dataplane. msecToApply=1.940604
2019-06-22 12:57:13.828 [INFO][70] conntrack.go 78: Removing conntrack flows ip=fe80::ecee:eeff:feee:eeee

Possible Solution

Skip uninteresting routes when syncing route tables; or perhaps explicitly don't support such configurations if the compatibility cost turns out to be too high. This should affect most people using self-built kernels, though, because the option is not enabled in x86 defconfig.
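
The first option could look roughly like the sketch below. This is not Felix's actual code: the `route` type and the `routesToSync` helper are hypothetical, simplified stand-ins for whatever the netlink library returns, and the `fd00::1/128` entry is an invented workload route. The numeric constants are the kernel's values from `<linux/rtnetlink.h>`. The idea is simply that the sync only considers unicast routes in the main table, so a local route that shows up there is left alone.

```go
package main

import "fmt"

// Kernel constants from <linux/rtnetlink.h>.
const (
	rtTableMain = 254 // RT_TABLE_MAIN
	rtnUnicast  = 1   // RTN_UNICAST
	rtnLocal    = 2   // RTN_LOCAL
)

// route is a hypothetical, simplified stand-in for a netlink route entry.
type route struct {
	Dest  string
	Table int
	Type  int
}

// routesToSync returns only the routes the sync logic should manage:
// unicast routes in the main table. Everything else (for example local
// routes that land in the main table without CONFIG_IPV6_MULTIPLE_TABLES)
// is skipped, so it is never considered "old" and never removed.
func routesToSync(all []route) []route {
	var keep []route
	for _, r := range all {
		if r.Table != rtTableMain || r.Type != rtnUnicast {
			continue
		}
		keep = append(keep, r)
	}
	return keep
}

func main() {
	routes := []route{
		// Invented workload route, for illustration only.
		{Dest: "fd00::1/128", Table: rtTableMain, Type: rtnUnicast},
		// Destination taken from the log above; assumed to be a local route.
		{Dest: "fe80::ecee:eeff:feee:eeee/128", Table: rtTableMain, Type: rtnLocal},
	}
	for _, r := range routesToSync(routes) {
		fmt.Println("managed by sync:", r.Dest)
	}
}
```

With a filter like this in place, the link-local entry from the log would no longer be a candidate for removal, regardless of which table the kernel reports it in.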

Steps to Reproduce (for bugs)

  1. Boot a kernel built without CONFIG_IPV6_MULTIPLE_TABLES
  2. Fire up Calico for IPv6 operation
  3. Run ip -ts monitor to watch the kernel routing table changes
  4. Start a container and watch the route disappear

Context

I'm using a self-built kernel for my cluster nodes.

Your Environment

  • Felix version (if using Felix binary directly calico-felix --version): v3.9.0-0.dev-21-ge001ad4f
  • And/Or, calico/node container version (if running Felix in a container): 3.7.3
  • Which orchestrator are you using (e.g. Kubernetes, OpenStack, Docker, Mesos): Kubernetes
  • Orchestrator version (if used): 1.11
  • Etcd version (if using etcd datastore driver): 3.3.8
  • Operating System and version: Linux 5.2.0-rc3
  • Link to your project (optional):
@caseydavenport

Member

commented Jun 25, 2019

Is there a reason you need multiple routing tables disabled?

@xen0n

Author

commented Jun 26, 2019

@caseydavenport No, the base configuration was written a few years ago, and back then this particular option just didn't matter. I've rebuilt the kernels for the next maintenance window, though.

Personally I prefer fixing the incompatibility, but choosing to ignore this particular case and explicitly documenting the required kernel config is OK too, as major distribution kernels are not affected.

@caseydavenport

Member

commented Jun 26, 2019

> perhaps explicitly don't support such configurations if the compatibility cost turns out to be too high

I'd be inclined to take this route. In the general case, I don't think we can reasonably document or support every build-time kernel option.

We do have a list of distributions which are tested with Calico here: https://docs.projectcalico.org/v3.7/getting-started/kubernetes/requirements

@xen0n

Author

commented Jun 27, 2019

Yeah, we could just note this one option in the Kernel dependencies section of the page you just linked. People using self-built kernels are naturally on their own to sort out incompatibilities (and consequently reboot their nodes); just mentioning the point should be friendly enough.

@xen0n

Author

commented Jun 27, 2019

Also, I've forked and patched Felix to temporarily work around this; if a fix is deemed upstream-able I can submit it. Otherwise I'll just update the docs instead.
