Upgrade unipi to paf-le-chien #4

Closed
dinosaure wants to merge 4 commits


dinosaure
Contributor

This is a draft: it compiles, but the TLS path seems buggy and it's hard to understand why. I will investigate this further soon. It's a proposal, so feel free to share your opinion on it and improve this unikernel 👍.

@dinosaure
Contributor Author

So an implementation of unipi with paf is running here: https://unipi.egar.im/ (with a debug Let's Encrypt certificate and small tweaks to paf, available here: dinosaure/paf-le-chien#28). WDYT?

@hannesm
Collaborator

hannesm commented Jun 24, 2021

I don't know. What is the goal? This PR adds quite some boilerplate to the codebase :/

Will this remove the cohttp and conduit dependency entirely? if not, can we get to this point (easily?) -- maybe if let's encrypt specifies their own module type for the HTTP client used (to avoid the cohttp dependency)?

@dinosaure
Contributor Author

I don't know. What is the goal? This PR adds quite some boilerplate to the codebase :/

Indeed, I think we can do something better here. The main problem is the mimic ritual needed to let the Let's Encrypt sub-module safely communicate with Let's Encrypt and complete the challenge. That part is provided by paf via paf.le (see this functor: https://github.com/dinosaure/paf-le-chien/blob/master/lib/lE.mli).

But I'm not sure that the sub-module Letsencrypt and paf.le are equivalent. I need to check.

Will this remove the cohttp and conduit dependency entirely?

The only remaining piece is the Cohttp.Client.S signature; conduit itself is removed entirely.

maybe if let's encrypt specifies their own module type for the HTTP client used (to avoid the cohttp dependency)?

That could indeed be a nicer solution! Again, we already talked about mirage-http and its ability to provide such an interface without any dependency (on cohttp or http/af). If you think that is the best way, I can dig into it 👍.

On the other hand, paf could handle ALPN and correctly dispatch HTTP/1.1 and HTTP/2 requests (this is not currently the case, but it should be easy to do), which could be interesting for us.

@dinosaure
Contributor Author

PS: I can try to run some benchmarks comparing this version with the cohttp version, to maybe highlight an improvement 👍

@dinosaure
Contributor Author

So I did a large stress-test between cohttp and http/af and it seems that http/af can handle ~ 15 000 requests per sec when cohttp handles only ~ 2500 requests per sec for the same file. The magnitude is:

  • for http/af, we need 0.007 sec to respond to the client
  • for cohttp, we need 0.027 sec to respond to the client

Both tests were run over TLS. Here is the raw output of the benchmark (http/af):

dinosaure@turbine:~$ hey -n 1000000 -c 24 https://unipi.egar.im/

Summary:
  Total:	73.8878 secs
  Slowest:	0.2575 secs
  Fastest:	0.0010 secs
  Average:	0.0018 secs
  Requests/sec:	13533.8195
  
  Total data:	22999632 bytes
  Size/request:	23 bytes

Response time histogram:
  0.001 [1]	|
  0.027 [999926]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.052 [47]	|
  0.078 [9]	|
  0.104 [0]	|
  0.129 [0]	|
  0.155 [0]	|
  0.181 [0]	|
  0.206 [0]	|
  0.232 [0]	|
  0.258 [1]	|


Latency distribution:
  10% in 0.0013 secs
  25% in 0.0014 secs
  50% in 0.0015 secs
  75% in 0.0017 secs
  90% in 0.0021 secs
  95% in 0.0024 secs
  99% in 0.0103 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0000 secs, 0.0010 secs, 0.2575 secs
  DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0183 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0021 secs
  resp wait:	0.0017 secs, 0.0009 secs, 0.2575 secs
  resp read:	0.0000 secs, 0.0000 secs, 0.0025 secs

Status code distribution:
  [200]	999984 responses

And cohttp:

dinosaure@turbine:~$ hey -n 1000000 -c 24 https://unipi.egar.im/

Summary:
  Total:	427.5950 secs
  Slowest:	0.2574 secs
  Fastest:	0.0011 secs
  Average:	0.0102 secs
  Requests/sec:	2338.6243
  
  Total data:	22999632 bytes
  Size/request:	23 bytes

Response time histogram:
  0.001 [1]	|
  0.027 [828281]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.052 [170993]	|■■■■■■■■
  0.078 [681]	|
  0.104 [4]	|
  0.129 [0]	|
  0.155 [0]	|
  0.180 [0]	|
  0.206 [0]	|
  0.232 [0]	|
  0.257 [24]	|


Latency distribution:
  10% in 0.0019 secs
  25% in 0.0024 secs
  50% in 0.0028 secs
  75% in 0.0040 secs
  90% in 0.0451 secs
  95% in 0.0465 secs
  99% in 0.0495 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0000 secs, 0.0011 secs, 0.2574 secs
  DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0019 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0011 secs
  resp wait:	0.0020 secs, 0.0010 secs, 0.0456 secs
  resp read:	0.0081 secs, 0.0000 secs, 0.0680 secs

Status code distribution:
  [200]	999984 responses

The hey tool should be readily available. The client is in a different geographic location from the server. It's a small benchmark and not really reproducible, but it shows nice results for http/af.
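
The Requests/sec figures in the two hey summaries are the number of completed responses divided by the total wall-clock time. As a quick cross-check, here is a small OCaml sketch that recomputes them and their ratio. Only the numbers come from the output above; the record type and function names are illustrative.

```ocaml
(* Figures copied from the two `hey` summaries above. The type and
   function names are illustrative, not part of any library. *)
type run = { name : string; responses : int; total_secs : float }

let httpaf = { name = "http/af"; responses = 999_984; total_secs = 73.8878 }
let cohttp = { name = "cohttp"; responses = 999_984; total_secs = 427.5950 }

(* Throughput as completed responses per second of wall-clock time. *)
let requests_per_sec r = float_of_int r.responses /. r.total_secs

(* How many times more requests per second http/af served in this run. *)
let speedup = requests_per_sec httpaf /. requests_per_sec cohttp

let () =
  List.iter
    (fun r -> Printf.printf "%-8s %.1f req/s\n" r.name (requests_per_sec r))
    [ httpaf; cohttp ];
  Printf.printf "ratio    %.2fx\n" speedup
```

With these inputs the ratio comes out a little under 6x, that is roughly 13,500 versus 2,300 requests per second for this particular run.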

@hannesm
Collaborator

hannesm commented Jun 30, 2021

Thanks @dinosaure, the benchmark looks convincing.

About let's encrypt: I don't know what the state of mirage-http is. I'd be fine to have a HTTP Client module type in letsencrypt directly.

@talex5

talex5 commented Jun 30, 2021

So I did a large stress-test between cohttp and http/af and it seems that http/af can handle ~ 15 000 requests per sec when cohttp handles only ~ 2500 requests per sec for the same file.

That's a surprisingly large difference. There are some http benchmarks at https://github.com/ocaml-multicore/retro-httpaf-bench and there we see httpaf being "only" about twice as fast as cohttp:

[Graph: mc-dev]

That graph is from wrk2 with 1000 open connections repeatedly requesting a single static page (which fits in one packet). All servers were configured to use a single core for this test.

@dinosaure
Contributor Author

That's a surprisingly large difference. There are some http benchmarks at https://github.com/ocaml-multicore/retro-httpaf-bench and there we see httpaf being "only" about twice as fast as cohttp:

Yeah, I think it's not a fair benchmark for many reasons:

  1. I don't have any control over the flow between the server and the client (two different servers on the Internet)
  2. TLS is used in my context, and I'm not sure about its impact on performance (or how cohttp/conduit and http/af/paf differ on that point in detail)
  3. mirage-tcpip is used here
  4. and the unikernel is virtualized with KVM - but http/af and cohttp are deployed in the same context

That is why I said the "benchmark" is not reproducible; it mainly shows (with unipi) a "real" use of what we want (a simple MirageOS website synchronized with a Git repository). Maybe the context suits http/af better, but the huge difference between cohttp and http/af gives me some reason to switch to http/af in the end 🙂.

@dinosaure
Contributor Author

So https://unipi.egar.im/ has been up for a long time now, and I believe I'm ready to cut a release of paf (which makes it easier to get a Let's Encrypt certificate). This PR will then be ready to merge!

@hannesm
Collaborator

hannesm commented Jul 8, 2021

Thanks again @dinosaure -- there's still the outstanding question of the dependency cone with respect to letsencrypt (which currently has a hard dependency on cohttp, and paf has a dependency on letsencrypt). I'd appreciate it if we could settle on the following:

  • have a HTTP_client.S in letsencrypt (that is a module type which is trivially implemented by cohttp -- thus there's no need for any consumer change if you use a cohttp client)
  • have letsencrypt.httpaf implement that module type
  • remove (a) letsencrypt dependency from paf (or does it serve a good purpose there?) (b) have a unipi-without-cohttp [and then we can follow with other unikernels, removing the whole cohttp & conduit dependencies]

WDYT?

@dinosaure
Contributor Author

Yes, give me some time to shape all of that. I still need to decide whether to revive mirage-http or just drop letsencrypt's hard dependency on cohttp 👍.

@hannesm
Collaborator

hannesm commented Jul 21, 2021

So, now that letsencrypt 0.3.0 is in opam-repository, we could proceed with (a) adapting paf and (b) removing cohttp from unipi. This would clean up the dependency cone drastically :) (and reduce the binary size).

@dinosaure
Contributor Author

I updated the PR with the latest changes to:

  • ocaml-git (we use git-mirage.3.5.0 now)
  • letsencrypt (we use letsencrypt.0.3.0)
  • and paf & paf-le (we use paf.0.0.5 & paf-le.0.0.5)

However, we still need to pin irmin to the right version. It seems that irmin will break its API, and the updates here do not yet reflect what is going on in irmin~master (the API will change again). At least unipi compiles (I believe).

@dinosaure
Contributor Author

Just waiting for the release of git.3.6.0 and this PR will be ready, but as far as I can tell, we can start reviewing it!

@dinosaure dinosaure marked this pull request as ready for review October 20, 2021 16:00
@hannesm
Collaborator

hannesm commented Oct 26, 2021

Thanks for your PR. I pushed a cleanup commit on top. I'll raise some questions via the code comment / review system.

Logs.info (fun f -> f "requested %s" path);
match Astring.String.cuts ~sep:"/" ~empty:false path with
| [ h ] when String.equal hook_url h ->
begin
hookf () >>= function
| Ok data -> Http.respond ~status:`OK ~body:(`String data) ()
Lwt.async @@ fun () -> hookf () >>= function

Collaborator

I'm not really comfortable with this Lwt.async -- what is the resource / connection / task story for Httpaf here?

Contributor Author

By its type, http/af requires the request_handler (here, dispatch) to return unit rather than unit Lwt.t. httpaf then keeps an internal queue of callbacks (unit -> unit) and executes them; through the given reqd (via respond_with_*), these callbacks can ask to read from the socket, write to it, or report an error on it.

Concurrently, paf processes these tasks with mirage-tcpip on the other side, via a server connection. So resources (such as buffers, internal state, etc.) are shared between the server connection and the given reqd. As long as the reqd exists, the internal resources exist.

In this case, as in the other branch, we need resources that are only available via Lwt. If you are worried about what can happen inside the Lwt.async (such as an exception), maybe we can/should add an Lwt.catch inside it and then call Reqd.report_exn, to be sure that the resources are freed whatever happens.

let headers = Httpaf.Headers.of_list
[ "content-length", string_of_int (String.length data) ] in
let resp = Httpaf.Response.create ~headers `OK in
Httpaf.Reqd.respond_with_string reqd resp data ;

Collaborator

This one does actually send out data, does it not? but it does not seem to be in the Lwt monad -- what is the story here?

Contributor Author

As I said above, respond_with_string just fills an internal buffer in the reqd, which is shared with a server connection owned by paf (for instance). Concurrently, paf has launched a unit Lwt.t which takes care of the Read/Write operations.

Then, httpaf has an internal queue of callbacks which it executes one by one: it emits syscall actions on one side (via the server connection) and consumes what the user wants on the other side (via the given reqd).

A question may remain about data races (the internal queue is mutated), but the computation model of lwt and the global GC lock ensure that everything is safe (I believe).
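
The split described above, where the handler only fills a buffer and a separate loop performs the writes, can be pictured with a toy model. This is a deliberately simplified sketch using only the OCaml standard library: the `conn` type, `enqueue_response` and `flush` are hypothetical names and do not reflect the actual internals of httpaf or paf.

```ocaml
(* Toy model only: [conn], [enqueue_response] and [flush] are hypothetical
   names, not httpaf or paf APIs. *)
type conn = { pending : string Queue.t }

let create () = { pending = Queue.create () }

(* What the handler does: record the response and return [unit] at once.
   No I/O happens here, which is why no Lwt type shows up. *)
let enqueue_response conn body = Queue.push body conn.pending

(* What the connection loop does later: drain the queue through [write],
   the only place where I/O would take place. Returns the number of
   responses written. *)
let flush conn ~write =
  let written = ref 0 in
  while not (Queue.is_empty conn.pending) do
    write (Queue.pop conn.pending);
    incr written
  done;
  !written

let () =
  let conn = create () in
  enqueue_response conn "hello";
  enqueue_response conn "world";
  let n = flush conn ~write:print_endline in
  Printf.printf "%d responses flushed\n" n
```

In this model nothing reaches the output until `flush` runs, which mirrors the point made above: the responding call itself does not need to live in the Lwt monad because the actual writes happen elsewhere.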

Http.respond ~status:`Internal_server_error ~body:(`String msg) ()
let headers = Httpaf.Headers.of_list
[ "content-length", string_of_int (String.length msg) ] in
let resp = Httpaf.Response.create ~headers `Internal_server_error in

Collaborator

the above three lines can be factored out together with 109 - 111.

Httpaf.Reqd.respond_with_string reqd resp data ;
Lwt.return_unit

let redirect port _ reqd =

Collaborator

could the second argument now be removed? what is provided here?

Contributor Author

The second argument is the peer, i.e. the client's Ipaddr.t * int. You cannot remove it because it is passed in by paf.

let dispatch store hookf hook_url request _body =
let p = Uri.path (Cohttp.Request.uri request) in
let path = if String.equal p "/" then "index.html" else p in
let dispatch store hookf hook_url _conn reqd =

Collaborator

should _conn be removed?

Contributor Author

_conn is the Ipaddr.t * int too. You cannot remove it (because the caller passes it to the callback).

let port = if port = 443 then None else Some port in
let new_uri = Uri.with_port new_uri port in
let path = request.Httpaf.Request.target in
let new_uri = Uri.make ~scheme:"https" ?host:(Key_gen.hostname ()) ?port ~path () in

Collaborator

is this the only remaining use of uri? can we drop that dependency? (I'm fine working out the string stuff to get the "new url" ;)

Contributor Author

I don't have a strong opinion about that, feel free to remove it 👍.

; LE.account_seed = Key_gen.account_seed ()
; LE.account_key_type = `ED25519
; LE.account_key_bits = Some 4096
; LE.hostname = Key_gen.hostname () |> Option.get |> Domain_name.of_string_exn |> Domain_name.host_exn }

Collaborator

the previous defaults for account key type and certificate key type should be used here -- also there should be command line arguments for the key types (and bit sizes)

@hannesm
Collaborator

hannesm commented Nov 4, 2021

with this branch, @hb9cwp reported successful builds (on Unix and hvt) on OpenBSD. But the hvt unikernel does not serve web pages -- maybe related to the httpaf/mimic/paf semantics of Lwt.async that were discussed above? @dinosaure did you do functional tests with a hvt unikernel (that it actually serves web pages)?

@hb9cwp

hb9cwp commented Nov 7, 2021

@hannesm @dinosaure Good news: with this branch, I now have both the unix and hvt targets working on both OpenBSD 6.9 and 7.0, after opam update & upgrade to latest, using mirage v3.10.6 and ocaml 4.10.2 :-)
Further, on OpenBSD 7.0, I had to raise the stack size from 4 to 32 MB using ulimit -s 32768, otherwise make depend/build fails.
Finally, in my start script for the hvt unikernel, I corrected the netmask from /32 to /24 on the unikernel's tap interface which is bridged at layer-2 to the physical NICs re or em of my OpenBSD hosts.
Tomorrow, I will try to apply the lessons learned to dns-primary-git.
Thank you for all the spontaneous support that got me so far!

@hannesm
Collaborator

hannesm commented Nov 7, 2021

@hb9cwp great! This means your unix and hvt unikernels deliver data via HTTP(S) to clients that send requests (in contrast to your earlier comment that there's no content being delivered)?

@hb9cwp

hb9cwp commented Nov 8, 2021

@hannesm Yes, exactly. So far, I have only tested serving simple .html pages and .png images over HTTP; HTTPS clients are still to come. Also, Unipi's hook works and triggers a refresh from GitHub using HTTPS; SSH is still to come.

P.S. Also, Unipi unikernels answer requests on both their IPv4 and IPv6 addresses of their tap interfaces in dual-stack OpenBSD hosts, and uses IPv6 for DNS resolution as well as HTTPS to fetch the selected branch from Github, if available.

@hannesm
Collaborator

hannesm commented Nov 8, 2021

@hb9cwp great, thanks for your confirmation.

Unipi unikernels answer requests on both their IPv4 and IPv6 addresses of their tap interfaces in dual-stack OpenBSD hosts,

indeed :)

and uses IPv6 for DNS resolution

yes :) also using DNS-over-TLS by default (to anycast.uncensoreddns.org)

as well as HTTPS to fetch the selected branch from Github, if available.

sadly not yet AFAICT (lack of using happy-eyeballs in the git client code)

@hannesm
Collaborator

hannesm commented Nov 10, 2021

merged manually into main, thanks for the PR!

@hannesm
Collaborator

hannesm commented Apr 28, 2022

It has been some months since this was merged, but there are some regressions:

  • If tls/https was configured, previously a redirect from http to https (port 80 to 443) was in place - this PR removed that one (re-added in commit f2825c3)
  • The redirect function was wrong: previously it used "let uri = Cohttp.Request.uri request", which is a full uri. Now it uses "let path = request.Httpaf.Request.target" -- which is only the path part of the uri. The result is that the redirect of http://10.0.42.2/foo.html sets the location to https:/foo.html -- i.e. the host is missing. (fixed in commits 6ab8f1f and 2851323)
  • The KV lookup used let p = Uri.path (Cohttp.Request.uri request) in, which is the path, and only the path (i.e. no query parameters). Now let path = request.Httpaf.Request.target in is used, which is the path including query parameters -- so https://10.0.42.2/foo.html?v=23 leads to a not found instead of delivering foo.html. (fixed in 91d0260)
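
The last two regressions both come from treating the request target as if it were a bare path or a full URI. A minimal sketch of the distinction, using only the OCaml standard library and assuming the target is in origin form (path plus optional query), could look like the following. The helper names are hypothetical and this is not the code from the commits referenced above.

```ocaml
(* Hypothetical helpers, standard library only; not the code from the
   commits referenced above. *)

(* Drop the query string from a request target such as "/foo.html?v=23". *)
let path_of_target target =
  match String.index_opt target '?' with
  | Some i -> String.sub target 0 i
  | None -> target

(* Key used for the KV lookup: "/" maps to "index.html", as in the
   dispatch function discussed earlier. *)
let lookup_key target =
  let p = path_of_target target in
  if String.equal p "/" then "index.html" else p

(* Location header for the http -> https redirect. The host must be
   supplied explicitly because the request target carries only the path
   (and query). The port is omitted when it is the default 443. *)
let redirect_location ~host ~port target =
  let authority =
    if port = 443 then host else Printf.sprintf "%s:%d" host port
  in
  Printf.sprintf "https://%s%s" authority target

let () =
  print_endline (lookup_key "/foo.html?v=23");
  print_endline (redirect_location ~host:"10.0.42.2" ~port:443 "/foo.html")
```

The query string is kept in the redirect but stripped for the lookup, so a request for /foo.html?v=23 resolves to the same stored file as /foo.html.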
