Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify/refactor the current program model #141

Merged
merged 12 commits into from
Feb 1, 2023

Conversation

csujedihy
Copy link
Contributor

@csujedihy csujedihy commented Jan 30, 2023

  1. The new model always allows program sharing. Thus, meta program and sharing related flags and structs have been deleted.
  2. Now, when a program is plumbed, a program "binding" is inserted to the RX queue. More programs just mean more bindings on a RX queue.
  3. All rules are still placed on a contiguous memory to remain the performance of the data path. When add/removing a program, a new "compiled" program is generated on control path from all bindings on the RX queue.

This refactoring IMO would make binding to all queues easier to implement. We would just need to extend program object to have an array of program bindings.

@csujedihy csujedihy requested a review from a team as a code owner January 30, 2023 02:53
src/xdp/rx.c Outdated Show resolved Hide resolved
@csujedihy
Copy link
Contributor Author

csujedihy commented Jan 30, 2023

This optimization is broken by this PR. The code might choose the fast path initially, and then subsequently added programs will be incorrectly ignored.

https://github.com/microsoft/xdp-for-windows/pull/141/files#diff-ffbb7352b99db671bfaad2a1bf9999d75fa6f92c1e32e7b11086fc72dc87cdceL794

    //
    // Implement a fast path for a single XSK receiving all frames.
    //
    if (XdpProgramCanXskBypass(RxQueue->Program)) {
        RxQueue->Dispatch = XdpRxExclusiveXskDispatch;
    } else {
        RxQueue->Dispatch = XdpRxDispatch;
    }

src/xdp/program.c Outdated Show resolved Hide resolved
src/xdp/program.c Outdated Show resolved Hide resolved
src/xdp/rx.c Outdated Show resolved Hide resolved
XDP_PROGRAM *
XdpRxQueueGetProgram(
LIST_ENTRY *
XdpRxQueueGetProgramBindingList(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this abstraction is going to be fragile once we integrate with eBPF, because we'll want to invoke either eBPF or our legacy built-in programs, but not both. It would be best if the RX queue exposes an opaque XDP_PROGRAM * pointer getter/setter and then invokes the XDP_PROGRAM on the data path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I am not quite following here. Are you saying this new concept of "program binding" would not exist with ebpf? Or this abstraction allows mixing ebpf and non-ebpf programs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, a notion of "program binding" definitely exists with eBPF, so the issue is the mixing of different types of programs, and to a lesser extent the leakage of the chain of non-eBPF programs into the RX queue. One option here is to create a top-level program binding (it could contain either an eBPF program or a list of non-eBPF programs) and then have the RX queue deal only with that.

For the non-eBPF programs, then the program.c module could peek into the list stored in that top-level binding object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. That would be cleaner. Do you want me to work on that in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(tests are all green now except a flaky functional test case)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to keep a legacy mechanism once ebpf integration is supported?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may be in a partially-integrated state for a while, where some features of our built-in program implementation are yet to be lit up through eBPF. Once everyone depending on those features can switch to eBPF, we should be able to migrate away from built-in programs.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like you're suggesting two separate steps, i.e., first add ebpf integration, then remove built-in programs? The word "replace" in the title of issue #7 implied to me that there is no "partially integrated state for a while", it's all at the same time. I'm asking whether you actually need a partially integrated state for a while instead of just doing it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dthaler MsQuic is dependent on the built-in program support. And we are going forward with deployments leveraging that. Until we have a complete replacement for those deployments, we cannot remove the built-in program support.

@mtfriesen
Copy link
Contributor

mtfriesen commented Jan 30, 2023

This optimization is broken by this PR. The code might choose the fast path initially, and then subsequently added programs will be incorrectly ignored.

//
// Implement a fast path for a single XSK receiving all frames.
//
if (XdpProgramCanXskBypass(RxQueue->Program)) {
    RxQueue->Dispatch = XdpRxExclusiveXskDispatch;
} else {
    RxQueue->Dispatch = XdpRxDispatch;
}

This micro-optimization actually improved perf a lot for "give me everything" workloads, but IDK if it's even possible to express this in eBPF. Theoretically we can also swap dispatch tables on the fly (i.e. after any program changes) if you want to experiment with that. You'd just swap the dispatch table within a callback from XdpRxQueueSync in theory.

@csujedihy
Copy link
Contributor Author

This optimization is broken by this PR. The code might choose the fast path initially, and then subsequently added programs will be incorrectly ignored.

//
// Implement a fast path for a single XSK receiving all frames.
//
if (XdpProgramCanXskBypass(RxQueue->Program)) {
    RxQueue->Dispatch = XdpRxExclusiveXskDispatch;
} else {
    RxQueue->Dispatch = XdpRxDispatch;
}

This micro-optimization actually improved perf a lot for "give me everything" workloads, but IDK if it's even possible to express this in eBPF. Theoretically we can also swap dispatch tables on the fly (i.e. after any program changes) if you want to experiment with that. You'd just swap the dispatch table within a callback from XdpRxQueueSync in theory.

Made an attempt to fix it by swapping dispatch tables on the fly.

src/xdp/program.c Outdated Show resolved Hide resolved
src/xdp/program.c Outdated Show resolved Hide resolved
mtfriesen
mtfriesen previously approved these changes Jan 31, 2023
@csujedihy
Copy link
Contributor Author

Hit this in local spinxsk run with FNDIS. It seems like the newly added rule and/or this PR make the fuzz'd socket setup tend to fail more often?

PS C:\Users\Administrator\Desktop\xdp> .\tools\spinxsk.ps1 -XdpmpPollProvider FNDIS -QueueCount 2 -Minutes 100
q[0]: socket setup success rate: (172 / 527) 32%
(successPct >= successThresholdPercent) failed line 1934
Press Ctrl+C to cancel the stop operation.
100%  [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
The trace was successfully saved.
Press Ctrl+C to cancel the stop operation.
100%  [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
The trace was successfully saved.
SpinXsk failed with 1
At C:\Users\Administrator\Desktop\xdp\tools\spinxsk.ps1:146 char:13
+             throw "SpinXsk failed with $LastExitCode"
+             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OperationStopped: (SpinXsk failed with 1:String) [], RuntimeException
    + FullyQualifiedErrorId : SpinXsk failed with 1

@mtfriesen
Copy link
Contributor

Hit this in local spinxsk run with FNDIS. It seems like the newly added rule and/or this PR make the fuzz'd socket setup tend to fail more often?

PS C:\Users\Administrator\Desktop\xdp> .\tools\spinxsk.ps1 -XdpmpPollProvider FNDIS -QueueCount 2 -Minutes 100
q[0]: socket setup success rate: (172 / 527) 32%
(successPct >= successThresholdPercent) failed line 1934
Press Ctrl+C to cancel the stop operation.
100%  [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
The trace was successfully saved.
Press Ctrl+C to cancel the stop operation.
100%  [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
The trace was successfully saved.
SpinXsk failed with 1
At C:\Users\Administrator\Desktop\xdp\tools\spinxsk.ps1:146 char:13
+             throw "SpinXsk failed with $LastExitCode"
+             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OperationStopped: (SpinXsk failed with 1:String) [], RuntimeException
    + FullyQualifiedErrorId : SpinXsk failed with 1

Hmm, is this with the recommended driver verifier settings, including randomized allocation failures? If so, it is expected that the success rate is lower than the default value used in the script.

@csujedihy
Copy link
Contributor Author

csujedihy commented Feb 1, 2023

I believe so. I ran these commands:
.\tools\prepare-machine.ps1 -ForSpinxskTest
.\tools\spinxsk.ps1 -XdpmpPollProvider FNDIS -QueueCount 2 -Minutes 100

Ok, you're good, then. I usually run with a minimum success rate of 1% with driver verifier allocation failures enabled, which I think is similar to our CI/CD.

src/xdp/rx.c Outdated Show resolved Hide resolved
@csujedihy csujedihy merged commit 5742c4b into main Feb 1, 2023
@csujedihy csujedihy deleted the huanyi/simplify-program-model branch February 1, 2023 21:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants