You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The commits from PR #4422 not only block requests on port 80, but also some requests on port 443. After the attempt at debugging @benjamreis and I did, it looks like this happens for every subsequent request within the same TLS connection.
How to reproduce (1)
From a web browser, load a web page that fetches various resources from the host such as JS or images. This is what XCP-ng's web page does (sources):
Traces we added in the code showed most requests for resources were rejected with a 403 error.
This is what the page should look like:
How to reproduce (2)
You can also reproduce what is probably another manifestation of the same issue on Citrix Hypervisor 8.2 with XS82E031 applied: just load the web page from the host with Firefox, and refresh, again and again. It should reject one of two requests: the first request succeeds (new TLS connection logged in secure.log). The next one fails (HTTP 403. secure.log shows the second request causes the TLS connection to be closed). Then the next one succeeds (new TLS connection opened in secure.log), the one after that fails, and so on.
One of us reproduced it in Chrome too, but others didn't unless they'd mash F5 fast enough, so that two queries overlap, maybe. I think that Firefox leaves the connection open for subsequent requests more eagerly than chrome does.
What we found out so far
The main test to check for a valid TLS connection in XAPI is
We have verified that Context._client_of_rq req was usually not None for the first request of a series, but was None for any other request made under the same TLS connection. Thus causing a rejection and the connection to be closed. Then, after the connection has been closed, the next one succeeds because we start again with a fresh connection and Context._client_of_rq req is not None.
We assume that either Context._client_of_rq req is not an appropriate test to check that the request came from a valid TLS connection, or stunnel does not set the appropriate headers (STUNNEL_PROXY?) for any request within a TLS connection that is not the first of the series.
Addendum
I also tested what happens if I set website-https-only to false in xapi.conf. Then the page always loads correctly, as before, but the debug traces I had added showed that only the first request of a series would pass the Context._client_of_rq req != None test. When loading the same file successively as described in "How to reproduce (2)", I would get such behaviour visible in secure.log + xensource.log (output to the latter being my added traces):
As I tried to illustrate with forged logs (mine are less readable), the first request after establishing the TLS connection passes the Context._client_of_rq req != None test but not any of the subsequent ones... Until the connection finally closes and a new one is established, starting a new similar series.
The text was updated successfully, but these errors were encountered:
Hi @stormi, thanks for the report and comprehensive analysis. I think that you are quite right, and I can see why this is happening:
When making a connection to port 443, then stunnel receives it and handles the TLS part, and then passes the connection to xapi. We have configured stunnel to send a special header of TCP parameters (the PROXY header) right at the beginning of the connection to xapi, before sending the decrypted data from the external client. This header contains the client's IP address, which we want to log. We also use the presence of this header to determine whether the connection came from through stunnel, and is therefore a TLS connection.
Now, this header isn't part of the HTTP request. Xapi receives it first, and then receives the HTTP request. If the connection remains open and another request comes in on the same connection, then stunnel won't send the PROXY header again, as stunnel doesn't work at the HTTP layer but just below – after establishing the connection, it just decrypts and forwards whatever bits come next. The HTTP server in xapi, however, looks for the PROXY header and HTTP header in one step. So the first request on the connection gets annotated with this PROXY data (which end up in Context._client_of_rq), and any following requests on the same connection don't. It doesn't remember that it had already seen the PROXY header.
It should be quite easy to fix this in http_svr.ml in xen-api-libs-transitional. I'll take a look tomorrow.
The commits from PR #4422 not only block requests on port 80, but also some requests on port 443. After the attempt at debugging @benjamreis and I did, it looks like this happens for every subsequent request within the same TLS connection.
How to reproduce (1)
From a web browser, load a web page that fetches various resources from the host such as JS or images. This is what XCP-ng's web page does (sources):
Traces we added in the code showed most requests for resources were rejected with a 403 error.
This is what the page should look like:
How to reproduce (2)
You can also reproduce what is probably another manifestation of the same issue on Citrix Hypervisor 8.2 with XS82E031 applied: just load the web page from the host with Firefox, and refresh, again and again. It should reject one of two requests: the first request succeeds (new TLS connection logged in secure.log). The next one fails (HTTP 403. secure.log shows the second request causes the TLS connection to be closed). Then the next one succeeds (new TLS connection opened in secure.log), the one after that fails, and so on.
One of us reproduced it in Chrome too, but others didn't unless they'd mash F5 fast enough, so that two queries overlap, maybe. I think that Firefox leaves the connection open for subsequent requests more eagerly than chrome does.
What we found out so far
The main test to check for a valid TLS connection in XAPI is
What plays a significant role here is
Context._client_of_rq req = None
, which only lets requests that passContext._client_of_rq req != None
through.And the definition of
Context._client_of_rq
is:We have verified that
Context._client_of_rq req
was usually not None for the first request of a series, but was None for any other request made under the same TLS connection. Thus causing a rejection and the connection to be closed. Then, after the connection has been closed, the next one succeeds because we start again with a fresh connection andContext._client_of_rq req
is not None.We assume that either
Context._client_of_rq req
is not an appropriate test to check that the request came from a valid TLS connection, orstunnel
does not set the appropriate headers (STUNNEL_PROXY
?) for any request within a TLS connection that is not the first of the series.Addendum
I also tested what happens if I set
website-https-only
tofalse
inxapi.conf
. Then the page always loads correctly, as before, but the debug traces I had added showed that only the first request of a series would pass theContext._client_of_rq req != None
test. When loading the same file successively as described in "How to reproduce (2)", I would get such behaviour visible in secure.log + xensource.log (output to the latter being my added traces):As I tried to illustrate with forged logs (mine are less readable), the first request after establishing the TLS connection passes the
Context._client_of_rq req != None
test but not any of the subsequent ones... Until the connection finally closes and a new one is established, starting a new similar series.The text was updated successfully, but these errors were encountered: