Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix VM network clone IP calculation and consolidate cleanup #6461

Merged
merged 3 commits into from
Apr 30, 2024

Conversation

bduffany
Copy link
Member

@bduffany bduffany commented Apr 29, 2024

Instead of using vmIdx for the clone IP, we want to use the netIdx which we have locked. The netIdx is different from the vmIdx in the case where the vmIdx is already in use.

This also restructures the cleanup work that needs to be done after calling SetupVethPair. Because the clone IP depends on the netIdx and not the vmIdx, some of the current cleanup work that is done in firecracker.go would now need to know about the netIdx. But this feels a bit like implementation details leaking out. So, in order to avoid exposing the netIdx and having to plumb it around, this PR makes it so that the cleanup func already returned by SetupVethPair now also does this cleanup work. This makes the networking API a bit simpler since more of the cleanup work is done automatically.

Related issues: N/A

@bduffany bduffany force-pushed the fix-vm-networking branch 3 times, most recently from eb98f8d to fb57b6d Compare April 29, 2024 18:10
@bduffany bduffany changed the title Fix VM network clone IP calculation and improve cleanup Fix VM network clone IP calculation and consolidate cleanup Apr 29, 2024
server/util/networking/networking.go Outdated Show resolved Hide resolved
// this func that needs to be explicitly cleaned up, we append a cleanup
// task to this list. Cleanup work is done in the reverse order in which
// it's added to this list (i.e., it's a stack).
var cleanupStack []func(ctx context.Context) error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this logic a bit complex to follow. For example it takes some mental work to figure out the order the cleanup tasks are executing (i.e. "if I want to cleanup the route before I unlock the net idx, when should I add it to the slice?") I would personally find it a bit more readable to use booleans

Ex.
if shouldCleanupRoute {}
if shouldCleanupRule {}

Copy link
Member Author

@bduffany bduffany Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it this way because it keeps the cleanup logic close to the creation logic, which gives me more peace of mind, similar to how I expect to see a defer Unlock() after a Lock(). Also, similar to defer, it provides a natural "reverse-dependency-ordering" execution order - if A depends on B then that means A will get created before B, which means A's cleanup func will also be registered before B's, which means it'll execute after B's, which is what we want.

Overall I feel like this makes the code more robust, but let me know if you feel strongly about the readability aspect. I feel like with booleans, it's another potential source of bugs, because we have to explicitly keep track of the dependencies between the resources whereas the stack naturally does the cleanup in reverse dependency order.

// If we return an error from this func then we need to clean up any
// resources that were created before returning.
defer func() {
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When err is shadowed - i.e. hostEndpointNet, err := hostNetAllocator.Get(vmIdx) - does this still work correctly?

Copy link
Member Author

@bduffany bduffany Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - in that example, err is not being declared in a new scope (shadowed), it's being re-assigned in the top-level function body scope. The declaration happens in the named return value.

I was on the fence about whether to do this vs using a wrapper func but I think ultimately the wrapper func is more error prone because we have to remember to return cleanup, err instead of return nil, err everywhere in this func.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay. I wasn't sure if a new err variable was being created, but it seems like it won't, and only the value will get reassigned (https://go.dev/ref/spec#Short_variable_declarations)

// If we return an error from this func then we need to clean up any
// resources that were created before returning.
defer func() {
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay. I wasn't sure if a new err variable was being created, but it seems like it won't, and only the value will get reassigned (https://go.dev/ref/spec#Short_variable_declarations)

@bduffany bduffany merged commit 615c78b into master Apr 30, 2024
19 checks passed
@bduffany bduffany deleted the fix-vm-networking branch April 30, 2024 16:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants