-
Notifications
You must be signed in to change notification settings - Fork 554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Connection Pool API #4832
base: main
Are you sure you want to change the base?
Conversation
if (RssConfig == NULL) { | ||
Status = QUIC_STATUS_INVALID_PARAMETER; | ||
goto Error; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is an internal function, we can probably assert this instead of checking it. SAL indicates this is a required output too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well it's part of the CxPlat interface, so if we make that library separate, it should validate this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we ever actually do that, then there will be plenty to clean up. For now, simpler is better IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm going to leave this here. Why would we want less validation? It's on the control path, so this isn't performance sensitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it's not more or less validation, it is hard failure vs soft failure :)
This check means the caller might do something wrong (not provide the output buffer) and it will be harder for us to realize because we migth recover. A crash would point at the error directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer a DBG assert instead of a check for all internal functions, independent of if it might be exposed later.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4832 +/- ##
==========================================
- Coverage 87.08% 86.28% -0.80%
==========================================
Files 56 57 +1
Lines 17685 17894 +209
==========================================
+ Hits 15401 15440 +39
- Misses 2284 2454 +170 ☔ View full report in Codecov by Sentry. |
54be6ca
to
47753f0
Compare
d22ce49
to
acd61c6
Compare
| Flag | Effect | | ||
|------|--------| | ||
| QUIC_CONNECTION_POOL_FLAG_NONE | Nothing | | ||
| QUIC_CONNECTION_POOL_FLAG_CLOSE_CONNECTIONS_ON_FAILURE | Tells the API to clean up any created *and started* connections in the pool if an error occurrs. **Note:** The application must be able to handle having connections suddenly closed with this flag. Without this flag, the application is expected to clean up remaining connections. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"and started" -> does it means the flag causes all created connections to be closed on failure, or only started connections?
I assume the former, that is the reasonable thing to do - but that sentence can be confusing. Maybe the "started" part should be in the "Note": "The connection might have been started when it get close because of this flag, ...."
void | ||
) | ||
{ | ||
const uint16_t CurrentProc = (uint16_t)CxPlatProcCurrentNumber(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the cast needed? The function accepts a uint32_t
@@ -73,7 +74,8 @@ QuicConnAlloc( | |||
// partition can be updated accordingly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment should be updated.
(the "just use the current processor" part is more nuanced now).
Ugh, github is annoying, it doesn't let me comment on a line out of its diff :(
QuicTestConnectionPoolCreate( | ||
Params->ConnPoolCreateParams.Family, | ||
Params->ConnPoolCreateParams.NumberOfConnections, | ||
Params->ConnPoolCreateParams.XdpSupported == TRUE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the == TRUE
? If this is already a "nice" boolean, there is no need for it. If it isn't yet a "nice" boolean, this could give the wrong result (and !!
or != FALSE
would be better).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had a pattern (that seems to have been abandoned) where boolean values from user mode were converted for use in kernel mode with != FALSE
(which is what I should have done).
These are "nice" booleans, so I should be able to use them directly. Let's try that
DEFINE_ENUM_FLAG_OPERATORS(QUIC_CONNECTION_POOL_FLAGS); | ||
|
||
typedef struct QUIC_CONNECTION_POOL_CONFIG { | ||
_In_ HQUIC Registration; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_In_
SAL annotation don't mean anything on a struct, and are not used in MsQuic currently (or at least in this file). I don't think we should introduce that pattern.
_Field_*
annotations should be used if there is something to specify here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. I'll convert the ones that can be to _Field_*
and remove the rest.
if (RssConfig == NULL) { | ||
Status = QUIC_STATUS_INVALID_PARAMETER; | ||
goto Error; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it's not more or less validation, it is hard failure vs soft failure :)
This check means the caller might do something wrong (not provide the output buffer) and it will be harder for us to realize because we migth recover. A crash would point at the error directly.
QUIC_STATUS Status = QUIC_STATUS_SUCCESS; | ||
uint32_t CreatedConnections = 0; | ||
|
||
if (Config == NULL || ConnectionPool == NULL || Config->Registration == NULL || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: That if test is very hard to read.
It could easily be split in several tests and potentially in a "validateConfig" function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll break it up to make it easier to read, but this is an area where I disagree with making a helper function.
Status); | ||
goto Error; | ||
} else if (RssConfig->RssSecretKeyLength < CXPLAT_TOEPLITZ_KEY_SIZE_MIN) { | ||
Status = QUIC_STATUS_INVALID_STATE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this one also be INTERNAL_ERROR
?
Is there a scenario that can lead here, or is this check to be cautious but essentially an assert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be an ASSERT because I don't think RSS can be configured with a smaller key size (on windows at least. No idea about Linux).
But I picked INVALID_STATE
because the RSS configuration would be in an invalid state. Potentially difficult for the caller to understand that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see...
INVALID_STATE
is generally used when the caller did something at the wrong time or in the wrong order.
Do you know if the user could do something that affect the rss key length? If yes, INVALID_STATE can make sense.
But if it means the OS is in a wrong config (or a config ew don't expect), I think I'd keep INTERNAL_ERROR (or some other more precise error code)
|
||
for (; RetryCount < MaxCreationRetries; RetryCount++) { | ||
|
||
uint32_t NewPort = QuicAddrGetPort(&LocalAddress) + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, everything inside the body of this loop should be extracted in a "TryCreateConnection" function, or something similar.
And everything in the parent loop should also be extracted in a "CreateConnection" function or some similar name.
Both are action that are pretty independent from all the pre-processing you need to do first, it will help a lot being able to abstract them.
$RssSeedPath = (Join-Path $SetupPath tcprssseed.exe) | ||
if (!(Test-Path $RssSeedPath)) { Write-Error "Missing file: $RssSeedPath" } | ||
Invoke-Expression "$RssSeedPath set aa55aa55aa55aa55aa55aa55aa55aa55aa55aa55aa55aa55aa55aa55aa55aa55aa55aa55aa55aa55" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you put a comment to say why we need this?
HANDLE InterfaceHandle = NULL; | ||
HRESULT Result = XdpInterfaceOpen(InterfaceIndex, &InterfaceHandle); | ||
if (FAILED(Result)) { | ||
QuicTraceLogError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, we should be using the following format for error traces (not logs) so they are collected in the low volume scenario:
QuicTraceEvent(
LibraryErrorStatus,
"[ lib] ERROR, %u, %s.",
WsaError,
"WSAStartup");
#define IOCTL_QUIC_RUN_VALIDATE_CONNECTION_POOL_CREATE \ | ||
QUIC_CTL_CODE(132, METHOD_BUFFERED, FILE_WRITE_DATA) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#define IOCTL_QUIC_RUN_VALIDATE_CONNECTION_POOL_CREATE \ | |
QUIC_CTL_CODE(132, METHOD_BUFFERED, FILE_WRITE_DATA) | |
#define IOCTL_QUIC_RUN_VALIDATE_CONNECTION_POOL_CREATE \ | |
QUIC_CTL_CODE(132, METHOD_BUFFERED, FILE_WRITE_DATA) |
if (This->StartedConnectionCount > This->MaxConnections) { | ||
return false; | ||
} | ||
This->Connections[This->StartedConnectionCount++] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if/when this is called in parallel on two threads/cpus?
_In_ size_t ServerNameLength | ||
) | ||
{ | ||
char* ServerNameCopy = CXPLAT_ALLOC_NONPAGED(ServerNameLength + 1, QUIC_POOL_SERVERNAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: offsets in this function are all off.
Definitions for the MsQuic Connection Pool API, which allows clients to | ||
create a pool of connections that are spread across RSS cores. | ||
|
||
--*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code coverage is still claiming a lot of the lines in this file are uncovered...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is because code coverage doesn't run with xdp. Probably should change that...
(Config->CibirIds != NULL && Config->CibirIdLength == 0) || | ||
(Config->CibirIds == NULL && Config->CibirIdLength != 0)) { | ||
Status = QUIC_STATUS_INVALID_PARAMETER; | ||
QuicTraceLogError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember, QuicTraceLog*
functions are effectively verbose level 'printf' style traces. If you want to be able to efficiently capture an error trace, you will need to use QuicTraceEvent
. Perhaps this is fine for invalid parameter calls (I don't think we even have these at all for api.c), but please keep in mind for all your error logging.
return Status; | ||
} | ||
|
||
QuicTraceEvent( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, in general, for other functions in api.c we are logging api enter/exit even for INVALID_PARAMETER cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but I can't dereference an untrusted pointer without validating it isn't NULL. So either I log entry after the validation, or I log entry without any parameter, which seems to be the requirement for this scenario.
|
||
Status = CxPlatDataPathRssConfigGet(InterfaceIndex, &RssConfig); | ||
if (QUIC_FAILED(Status)) { | ||
QuicTraceLogError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, you don't need to log an error if the function itself already logs an error.
Description
Provide an API for pooling connections across multiple RSS cores
Testing
Manual testing
Documentation
Documentation will be updated with the new API