@conectado
Contributor

@conectado conectado commented Sep 17, 2024

There were 3 problems on main: one in the tests, the actual bug in connlib, and a further problem that surfaced once those two were fixed.

Test problem

The test kept the routes in a BTreeSet: when a route was added, its address was inserted into the set, and when a route was removed, its address was removed from the set.

The problem is that if two resources with overlapping routes were added in a row and then only one of them was removed, the test would believe the route no longer existed.
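To illustrate the failure mode, here is a minimal sketch using the simplest kind of overlap, two resources with the same address. `RoutesBySet` and its methods are hypothetical names for illustration, not the actual test code, and CIDRs are plain strings here.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the old test bookkeeping:
/// routes are tracked only by address, not by resource.
#[derive(Default)]
struct RoutesBySet {
    routes: BTreeSet<String>,
}

impl RoutesBySet {
    fn add_resource(&mut self, cidr: &str) {
        // Inserting the same address twice leaves a single entry.
        self.routes.insert(cidr.to_owned());
    }

    fn remove_resource(&mut self, cidr: &str) {
        // Removing by address drops the entry even if another
        // resource still needs it.
        self.routes.remove(cidr);
    }

    fn has_route(&self, cidr: &str) -> bool {
        self.routes.contains(cidr)
    }
}

/// Adds two resources with the same address, removes one of them,
/// and reports whether the route is still tracked.
fn route_survives_single_removal() -> bool {
    let mut state = RoutesBySet::default();
    state.add_resource("10.0.0.0/24"); // resource A
    state.add_resource("10.0.0.0/24"); // resource B, same address
    state.remove_resource("10.0.0.0/24"); // only A is removed
    state.has_route("10.0.0.0/24")
}
```

In this sketch `route_survives_single_removal()` returns `false`, even though resource B should still keep the route alive.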

Test solution

Keep the routes in a BTreeMap that maps the resource ID to its IP, and calculate the routes from that map combined with the default routes. That way, we just remove the ID and the routes stay in the expected state.
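A rough sketch of that idea, assuming a numeric `ResourceId` and string CIDRs (both hypothetical simplifications; the real test types differ):

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical ID type for illustration only.
type ResourceId = u64;

/// Derives the expected routes from the per-resource map plus the
/// default routes. Nothing is stored per address, so removing one
/// resource cannot drop a route that another resource still needs.
fn routes(
    resources: &BTreeMap<ResourceId, String>,
    default_routes: &[&str],
) -> BTreeSet<String> {
    resources
        .values()
        .cloned()
        .chain(default_routes.iter().map(|r| (*r).to_owned()))
        .collect()
}
```

With resources 1 and 2 both mapped to `10.0.0.0/24`, removing ID 1 from the map and calling `routes` again still yields `10.0.0.0/24`, because ID 2 is still present.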

Real bug

Fixing this revealed a similar bug in connlib, since we kept things in a similar structure: active_cidr_resources, which uses IpNetworkTable.

To fix this, I recalculate the whole table each time we add or remove a resource.

Note that this doesn't properly fix overlapping routes; it is just enough to fix the test. To fix them properly we need #4789.
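The recalculation can be sketched roughly as follows. This is not the connlib code: a plain `BTreeMap` keyed by a CIDR string stands in for IpNetworkTable, and `ResourceId` and `rebuild_table` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Hypothetical ID type for illustration only.
type ResourceId = u64;

/// Rebuilds the CIDR -> resource table from the full set of resources
/// instead of patching it incrementally on every add or remove.
fn rebuild_table(
    resources: &BTreeMap<ResourceId, String>,
) -> BTreeMap<String, ResourceId> {
    let mut table = BTreeMap::new();
    for (id, cidr) in resources {
        // In this sketch a later ID with the same CIDR overwrites an
        // earlier one; the overlap itself is not resolved here.
        table.insert(cidr.clone(), *id);
    }
    table
}
```

Because the table is derived from the resource list every time, removing one of two resources that share a CIDR leaves the entry pointing at the remaining resource instead of deleting it.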

Furthermore, fixing these issues revealed an additional problem: whenever we add an overlapping CIDR resource, the old resource might be overridden, causing the connection to be lost. On top of that, which resource won was not explicit (it is deterministic, really, but not explicit), causing the tests to fail.

To fix this, we always sort resources by ID (an arbitrary order, chosen to stay consistent with the proptests), and we don't replace the routing for resources that already have a connection.

Sadly, to model this in the test I had to copy almost exactly how we calculate resources in connlib.
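One way to picture that rule is the sketch below. It rests on several assumptions that are not taken from connlib: CIDRs are strings, a `BTreeMap` stands in for the routing table, the lowest ID wins among unconnected resources, and `recalculate`, `previous` and `connected` are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical ID type for illustration only.
type ResourceId = u64;

/// Recalculates the CIDR -> resource table.
///
/// `resources` is a BTreeMap, so iterating it visits IDs in sorted order.
/// `previous` is the table before this recalculation.
/// `connected` holds the IDs of resources that already have a connection.
fn recalculate(
    resources: &BTreeMap<ResourceId, String>,
    previous: &BTreeMap<String, ResourceId>,
    connected: &BTreeSet<ResourceId>,
) -> BTreeMap<String, ResourceId> {
    let mut table = BTreeMap::new();

    // 1. Keep the routing of resources that already have a connection,
    //    as long as the resource still exists with the same CIDR.
    for (cidr, id) in previous {
        if connected.contains(id) && resources.get(id) == Some(cidr) {
            table.insert(cidr.clone(), *id);
        }
    }

    // 2. Fill the remaining CIDRs in ID order. `or_insert` never replaces
    //    an entry, so connected resources keep their route and, among the
    //    rest, the lowest ID wins (an assumption of this sketch).
    for (id, cidr) in resources {
        table.entry(cidr.clone()).or_insert(*id);
    }

    table
}
```

In this sketch, if resource 2 owns `10.0.0.0/24` and is connected, adding resource 1 with the same CIDR leaves the route with resource 2. If resource 2 is not connected, the route moves to resource 1, and it does so the same way on every run.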

Fixes #6721

Member

@thomaseizinger thomaseizinger left a comment

Nice work!

@conectado conectado added this pull request to the merge queue Sep 17, 2024
@conectado conectado removed this pull request from the merge queue due to a manual request Sep 17, 2024
@conectado
Contributor Author

@thomaseizinger another failing proptest test case for this PR revealed another bug, which is now explained in the description. Please have another look :)

conectado and others added 4 commits September 18, 2024 14:26
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Signed-off-by: Gabi <gabrielalejandro7@gmail.com>
@conectado conectado added this pull request to the merge queue Sep 18, 2024
Merged via the queue into main with commit 93e923e Sep 18, 2024
@conectado conectado deleted the fix/flaky-proptest branch September 18, 2024 21:52
Development

Successfully merging this pull request may close these issues.

Flaky proptest

3 participants