Simplify/refactor the current program model #141
This optimization is broken by this PR. The code might choose the fast path initially, and then subsequently added programs will be incorrectly ignored.
XDP_PROGRAM *
XdpRxQueueGetProgram(
LIST_ENTRY *
XdpRxQueueGetProgramBindingList(
IMO this abstraction is going to be fragile once we integrate with eBPF, because we'll want to invoke either eBPF or our legacy built-in programs, but not both. It would be best if the RX queue exposes an opaque XDP_PROGRAM * pointer getter/setter and then invokes the XDP_PROGRAM on the data path.
Sorry, I am not quite following here. Are you saying this new concept of "program binding" would not exist with ebpf? Or this abstraction allows mixing ebpf and non-ebpf programs?
Hmm, a notion of "program binding" definitely exists with eBPF, so the issue is the mixing of different types of programs, and to a lesser extent the leakage of the chain of non-eBPF programs into the RX queue. One option here is to create a top-level program binding (it could contain either an eBPF program or a list of non-eBPF programs) and then have the RX queue deal only with that.
For the non-eBPF programs, the program.c module could then peek into the list stored in that top-level binding object.
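A minimal sketch of that top-level binding idea, in portable C rather than the actual driver code. Every name here (`TOP_LEVEL_BINDING`, `RULE_PROGRAM`, `RxQueueInspect`, the toy first-byte rule, `SampleEbpfDropAll`) is hypothetical and only illustrates the shape: the RX queue holds one opaque binding pointer, and only the program module knows whether it wraps an eBPF entry point or a list of built-in programs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the real XDP for Windows definitions. */
typedef enum { RX_ACTION_PASS, RX_ACTION_DROP } RX_ACTION;

/* Toy built-in program: matches frames whose first byte equals MatchValue. */
typedef struct RULE_PROGRAM {
    struct RULE_PROGRAM *Next;
    int MatchValue;
    RX_ACTION Action;
} RULE_PROGRAM;

/* Stand-in for an eBPF program entry point. */
typedef RX_ACTION EBPF_ENTRY(const unsigned char *Frame, size_t Length);

typedef enum { BINDING_EBPF, BINDING_BUILTIN } BINDING_TYPE;

/* Top-level binding: exactly one kind of program, never a mix. */
typedef struct TOP_LEVEL_BINDING {
    BINDING_TYPE Type;
    union {
        EBPF_ENTRY *Ebpf;          /* valid when Type == BINDING_EBPF */
        RULE_PROGRAM *BuiltinList; /* valid when Type == BINDING_BUILTIN */
    } u;
} TOP_LEVEL_BINDING;

/* The RX queue sees only the top-level binding, never the built-in chain. */
typedef struct RX_QUEUE {
    const TOP_LEVEL_BINDING *Binding;
} RX_QUEUE;

/* Lives in the program module: the only code that looks inside the binding. */
RX_ACTION BindingInvoke(
    const TOP_LEVEL_BINDING *Binding, const unsigned char *Frame, size_t Length)
{
    assert(Binding != NULL);
    if (Binding->Type == BINDING_EBPF) {
        return Binding->u.Ebpf(Frame, Length);
    }
    for (const RULE_PROGRAM *p = Binding->u.BuiltinList; p != NULL; p = p->Next) {
        if (Length > 0 && Frame[0] == p->MatchValue) {
            return p->Action;
        }
    }
    return RX_ACTION_PASS; /* no built-in program matched */
}

/* Data path entry: pass the frame through if nothing is bound. */
RX_ACTION RxQueueInspect(
    const RX_QUEUE *Queue, const unsigned char *Frame, size_t Length)
{
    if (Queue->Binding == NULL) {
        return RX_ACTION_PASS;
    }
    return BindingInvoke(Queue->Binding, Frame, Length);
}

/* Example stand-in eBPF program that drops every frame. */
RX_ACTION SampleEbpfDropAll(const unsigned char *Frame, size_t Length)
{
    (void)Frame;
    (void)Length;
    return RX_ACTION_DROP;
}
```

With this shape, switching a queue from built-in programs to eBPF is a single pointer update on the queue, and the data path never needs to know which kind it is invoking.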
I agree. That would be cleaner. Do you want me to work on that in this PR?
(tests are all green now except a flaky functional test case)
Why do we need to keep a legacy mechanism once ebpf integration is supported?
We may be in a partially-integrated state for a while, where some features of our built-in program implementation are yet to be lit up through eBPF. Once everyone depending on those features can switch to eBPF, we should be able to migrate away from built-in programs.
Sounds like you're suggesting two separate steps, i.e., first add ebpf integration, then remove built-in programs? The word "replace" in the title of issue #7 implied to me that there is no "partially integrated state for a while", it's all at the same time. I'm asking whether you actually need a partially integrated state for a while instead of just doing it.
@dthaler MsQuic is dependent on the built-in program support. And we are going forward with deployments leveraging that. Until we have a complete replacement for those deployments, we cannot remove the built-in program support.
This micro-optimization actually improved perf a lot for "give me everything" workloads, but IDK if it's even possible to express this in eBPF. Theoretically we can also swap dispatch tables on the fly (i.e. after any program changes) if you want to experiment with that. You'd just swap the dispatch table within a callback from
Made an attempt to fix it by swapping dispatch tables on the fly.
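For readers following along, here is a rough, single-threaded sketch of what re-selecting the dispatch table after every program change could look like. It is not the code in this PR: all names (`DISPATCH_TABLE`, `RxQueueOnProgramsChanged`, `InspectFast`, and so on) are hypothetical, the fast-path condition (a single match-all program) and the newest-first ordering are assumptions, and the synchronization with the data path that a real driver needs is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
typedef enum { RX_ACTION_PASS, RX_ACTION_DROP, RX_ACTION_REDIRECT } RX_ACTION;

typedef struct PROGRAM {
    struct PROGRAM *Next;
    bool MatchAll;           /* "give me everything" program */
    unsigned char MatchByte; /* toy rule used when MatchAll is false */
    RX_ACTION Action;
} PROGRAM;

struct RX_QUEUE;

typedef struct DISPATCH_TABLE {
    RX_ACTION (*Inspect)(const struct RX_QUEUE *Queue, unsigned char FirstByte);
} DISPATCH_TABLE;

typedef struct RX_QUEUE {
    const DISPATCH_TABLE *Dispatch;
    PROGRAM *Programs; /* assumed ordering: newest program is evaluated first */
} RX_QUEUE;

/* Fast path: valid only while the sole attached program matches everything. */
RX_ACTION InspectFast(const RX_QUEUE *Queue, unsigned char FirstByte)
{
    (void)FirstByte;
    return Queue->Programs->Action;
}

/* Generic path: walk every attached program. */
RX_ACTION InspectGeneric(const RX_QUEUE *Queue, unsigned char FirstByte)
{
    for (const PROGRAM *p = Queue->Programs; p != NULL; p = p->Next) {
        if (p->MatchAll || p->MatchByte == FirstByte) {
            return p->Action;
        }
    }
    return RX_ACTION_PASS;
}

const DISPATCH_TABLE FastDispatch = { InspectFast };
const DISPATCH_TABLE GenericDispatch = { InspectGeneric };

/* Re-evaluate the fast-path condition after every program list change. */
void RxQueueOnProgramsChanged(RX_QUEUE *Queue)
{
    const PROGRAM *head = Queue->Programs;
    bool fast = head != NULL && head->Next == NULL && head->MatchAll;
    Queue->Dispatch = fast ? &FastDispatch : &GenericDispatch;
}

void RxQueueInit(RX_QUEUE *Queue)
{
    Queue->Programs = NULL;
    RxQueueOnProgramsChanged(Queue);
}

/* Attach a program at the head of the list, then re-select the table. */
void RxQueueAddProgram(RX_QUEUE *Queue, PROGRAM *Program)
{
    Program->Next = Queue->Programs;
    Queue->Programs = Program;
    RxQueueOnProgramsChanged(Queue);
}
```

The point of the sketch is only that the fast path is never chosen once and kept: a later `RxQueueAddProgram` call moves the queue back to the generic table, so the newly added program is not ignored.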
Hit this in local spinxsk run with FNDIS. It seems like the newly added rule and/or this PR make the fuzz'd socket setup tend to fail more often?
Hmm, is this with the recommended driver verifier settings, including randomized allocation failures? If so, it is expected that the success rate is lower than the default value used in the script.
Ok, you're good, then. I usually run with a minimum success rate of 1% with driver verifier allocation failures enabled, which I think is similar to our CI/CD.
This refactoring IMO would make binding to all queues easier to implement. We would just need to extend program object to have an array of program bindings.
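As a rough illustration of that extension, assuming hypothetical `PROGRAM_OBJECT` and `PROGRAM_BINDING` types and plain `calloc`/`free` in place of kernel allocators, the program object could own one binding per RX queue:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical types; the real program and binding objects carry more state. */
typedef struct PROGRAM_OBJECT PROGRAM_OBJECT;

typedef struct PROGRAM_BINDING {
    PROGRAM_OBJECT *Owner; /* back-pointer to the owning program */
    unsigned QueueId;      /* RX queue this binding attaches to */
} PROGRAM_BINDING;

struct PROGRAM_OBJECT {
    PROGRAM_BINDING *Bindings; /* one entry per bound queue */
    size_t BindingCount;
};

/* Bind the program to queues 0..QueueCount-1, one binding per queue. */
bool ProgramBindAllQueues(PROGRAM_OBJECT *Program, unsigned QueueCount)
{
    assert(Program->Bindings == NULL); /* sketch assumes not yet bound */
    if (QueueCount == 0) {
        return false;
    }
    PROGRAM_BINDING *bindings = calloc(QueueCount, sizeof(*bindings));
    if (bindings == NULL) {
        return false; /* allocation failure: nothing is bound */
    }
    for (unsigned i = 0; i < QueueCount; i++) {
        bindings[i].Owner = Program;
        bindings[i].QueueId = i;
    }
    Program->Bindings = bindings;
    Program->BindingCount = QueueCount;
    return true;
}

/* Release every binding owned by the program. */
void ProgramUnbindAll(PROGRAM_OBJECT *Program)
{
    free(Program->Bindings);
    Program->Bindings = NULL;
    Program->BindingCount = 0;
}
```

Allocating all bindings in one block keeps the failure case simple: either every queue gets a binding or none does.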