Skip to content
This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

route calculation takes O(n^2) time #1761

Closed
rade opened this issue Dec 8, 2015 · 1 comment
Closed

route calculation takes O(n^2) time #1761

rade opened this issue Dec 8, 2015 · 1 comment

Comments

@rade
Copy link
Member

rade commented Dec 8, 2015

when running something like WEAVE_NO_FASTDP=1 NUM_WEAVES=40 bin/multiweave launch --conn-limit 100 and taking a profile snapshot, 2/3rd of the time is spent in route calculation.

routes_profile

Looking at the code, I am pretty sure it is O(n_peers^2). Hopefully there's a better way.

@rade
Copy link
Member Author

rade commented Dec 9, 2015

One way we could improve matters here is by calculating individual broadcast routes lazily, i.e. only when requested.

Can do the same for unicast routes, though for the whole rather than individual routes since a single traversal of the topology gives us all unicast routes.

rade added a commit that referenced this issue Dec 10, 2015
Instead of calculating all routes in one go, we calculate them when
needed.

Except for unicast routes, and broadcast routes from ourself. We still
calculate those eagerly because they are needed to route locally
captured/fdp-missed packets, and performing route calculcations in
that critical path increases the chances of dropping packets due to
not keeping up.

Well, technically only the established & symmetric routes are needed
for the data plane, but we might as well calculate the equivalent
routes for the control plane - it keeps the code simple and uniform.

Fixes #1761.
rade added a commit that referenced this issue Dec 10, 2015
Instead of calculating all routes in one go, we calculate them when
needed.

Except for unicast routes, and broadcast routes from ourself. We still
calculate those eagerly because they are needed to route locally
captured/fdp-missed packets, and performing route calculcations in
that critical path increases the chances of dropping packets due to
not keeping up.

Well, technically only the established & symmetric routes are needed
for the data plane, but we might as well calculate the equivalent
routes for the control plane - it keeps the code simple and uniform.

Calculating all routes is O(n_peers^2), so this change represents a
huge saving in situations where not all of them are needed, which is
quite common, especially when the topology is evolving rapidly.

One downside of this new approach is that we can no longer suppress
OnChange callbacks when the data plane routes remain unchanged. This
means we will be purging the FDP flows more frequently than before,
e.g. when only the control plane topology changed, e.g. new
connections appeared that haven't been marked as 'established'
yet. It's a small price to pay, fixable if we really care, and in any
case making FDP flow invalidation more fine-grained would be a better
approach.

Fixes #1761.
@bboreham bboreham added this to the 1.5.0 milestone Jan 12, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants