
Replace epee http client #1200

Closed · wants to merge 15 commits

Conversation

jagerman (Member)

Epee's http client sucks. This replaces it with a curl-based one (cpr) that works much better (and that we don't have to maintain).

This PR is on top of #1185 (uWebSockets), but we can just drop that in favour of this. There are a couple of things left to do here:

  • adding a static curl build in the StaticLibs script

In another PR:

  • convert the wallet rpc server to uWebSockets as well (and then delete all the epee http server code).

(copy-and-pasting from the big curl-adding commit here):

In short: epee's http client is garbage, standard violating, and unreliable.

This completely removes the epee http client support and replaces it with cpr, a curl-based C++ wrapper. rpc/http_client.h wraps cpr for RPC requests specifically, but it is also usable directly.

This replacement has a number of advantages:

  • requests are considerably more reliable. The epee http client code assumes that a connection will be kept alive forever, and returns a failure if a connection is ever closed. This results in some very annoying things: for example, preparing a transaction and then waiting a long time before confirming it will usually result in an error communicating with the daemon (with no way to retry submitting it). This is just terrible behaviour: the right thing to do on a connection failure is to resubmit the request.

  • epee's http client is broken in lots of other ways: for example, it tries throwing SSL at the port to see if it is HTTPS, but this is protocol violating and just breaks (with a several second timeout) on anything that isn't epee http server (for example, when lokid is behind a proxying server).

  • even when it isn't doing the above, the client breaks in other ways: for example, there is a comment (replaced in this PR) in the Trezor PR code that forces a connection close after every request because epee's http client doesn't do proper keep-alive request handling.

  • it seems noticeably faster to me in practical use in this PR; both simple requests (for example, when running lokid status) and wallet<->daemon connections are faster, probably because of crappy code in epee. (I think this is also related to the throw-ssl-at-it junk above: the epee client always generates an ssl certificate during static initialization because it might need one at some point).

  • significantly reduces the amount of code we have to maintain.

  • removes all the epee ssl option code: curl can handle all of that just fine.

  • removes the epee socks proxy code; curl can handle that just fine. (And can do more: it also supports using HTTP/HTTPS proxies).

  • When a cli wallet connection fails we now show why it failed (via an error message from curl), which could have all sorts of reasons like hostname resolution failure, bad ssl certificate, etc. Previously you just got a useless generic error that tells you nothing.
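The retry-on-failure behaviour described in the first bullet can be sketched generically (all names here are hypothetical; the PR's actual wrapper lives in rpc/http_client.h):

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical sketch of "resubmit the request on connection failure":
// retry the request a bounded number of times instead of failing as soon
// as a kept-alive connection turns out to have been closed.
template <typename Request>
auto with_retries(Request&& req, int attempts = 3) {
    for (int i = 1;; i++) {
        try {
            return req();
        } catch (const std::runtime_error&) {  // stand-in for a connection error
            if (i >= attempts) throw;          // out of attempts: propagate
        }
    }
}
```

The point is simply that a transiently closed connection becomes invisible to the caller rather than a hard error.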

@Doy-lee Doy-lee (Collaborator) left a comment

Tried running some tests on this but I'm getting some errors, on a local walled-off testnet:

[wallet T6TwRb]: sweep_all unimportant T6TwRb2puWgiU1Wv5WGh6F96m4vFrjzxxUsikJYNBNJ9Zhp99iNkRwqjbzw3VHrZPJfmY4ppmypva2p4F93ci1sV1CCH4h7GS
Wallet password:
Error: failed to get random outputs to mix: Failed to get random outs

The daemon is failing in get_outs(...); I logged the exception:

2020-07-29 03:48:31.466	I !!!!!!!!!!!!!!!!!! Error Attempting to get output pubkey by global index (amount 0, index 1398, count 524), but key does not exist (current height 262)

#include <ringct/rctTypes.h>
#include <boost/serialization/version.hpp>

namespace wallet {
Doy-lee (Collaborator):

I think the separation here was a bit aggressive in spreading everything out into individual files; some of them end up re-including each other anyway.

Perhaps merge the ones which already re-include each other, or have a single header for the tx data structures.

jagerman (Member, Author):

Spreading things into individual files where 1 file = 1 thing is entirely the point. What is the advantage in blobbing them together in the source code? There's no advantage at all to the compiler, and from the human perspective it seems cleaner to me to have 10 files that each does one logical thing (or one set of closely related things) rather than one file that does 10 not-particularly-related things. Obviously there are limitations to that, but none of these are minuscule, trivial headers, and other than the fact that they are all used in the wallet, each one is different.

some of them end up re-including each other anyway.

Doesn't seem relevant. Whether two things belong in the same header depends on what they do (i.e. separation of concerns), not how they get used. For example: tx_sets.h's concern is being a container. It includes pending_tx.h because some of the containers contain pending_tx, but it has nothing to do with the actual inner structure of pending_tx -- because that's pending_tx.h's concern. Even though pending_tx.h is not included anywhere that doesn't also include tx_sets.h, that's not a justification for combining those concerns (and similarly for everything else that gets included here). I'm fine with "A and B are tightly coupled therefore logically belong in the same header"; I'm not fine with "currently A and B always get included at the same time therefore belong in the same header."

Doy-lee (Collaborator):

I'm fine with organizing it either way; it boils down to personal preference for readability. I see the two ways of organizing as the same thing from different views: grouping via the file system versus grouping textually. What leads me to prefer grouping within the file itself is that we use text editors, which optimise for one more than the other, especially considering how little code we're putting across files. That said, there are things that are clearer when the boundary between them is files, not text.

It's true the re-including part was not as circular as I made it out to be.

I do note we end up in one of the situations where you only need to include one of the header files instead of the others; that's nice and has benefits, no down-playing that.

@jagerman (Member, Author)

Error: failed to get random outputs to mix: Failed to get random outs

I've definitely encountered that error before (long before this PR) when spinning up a fake testnet; I don't think it has anything to do with this PR.

@Doy-lee (Collaborator)

Doy-lee commented Jul 29, 2020

Error: failed to get random outputs to mix: Failed to get random outs

I've definitely encountered that error before (long before this PR) when spinning up a fake testnet; I don't think it has anything to do with this PR.

I think this might not be the case. I thought I tested this on dev vs this PR and only encountered it on this PR. I'll double check

@Doy-lee (Collaborator)

Doy-lee commented Jul 30, 2020

Error: failed to get random outputs to mix: Failed to get random outs

I've definitely encountered that error before (long before this PR) when spinning up a fake testnet; I don't think it has anything to do with this PR.

I think this might not be the case. I thought I tested this on dev vs this PR and only encountered it on this PR. I'll double check

Can't repro on dev, just through this branch.

- Rename hash_xor to `operator ^=`, so you can just write
    `hasha ^= hashb;`
- Replace static variables with C++17 inline constexprs.  This should
result in slightly smaller binaries because the linker can throw away
the duplicates, whereas right now each compilation unit gets its own
private copy.
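A minimal sketch of both changes, with a hypothetical 32-byte hash type standing in for the real one (the actual type and constant names in the codebase differ):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 32-byte hash type standing in for the real crypto hash.
struct hash { std::array<std::uint8_t, 32> data{}; };

// Renamed from a hash_xor(a, b)-style free function: XOR-assign each byte,
// so callers can simply write `hasha ^= hashb;`.
inline hash& operator^=(hash& a, const hash& b) {
    for (std::size_t i = 0; i < a.data.size(); i++)
        a.data[i] ^= b.data[i];
    return a;
}

// C++17 inline constexpr: one shared constant that the linker deduplicates
// across translation units, instead of each compilation unit carrying its
// own private static copy.
inline constexpr std::uint64_t example_constant = 42;
```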
With the current approach -fPIC wasn't getting used when building
external libraries.  I tried moving add_subdirectory(external) down
below where we set flags, but that results in a slew of warnings because
it *also* turns on a bunch of warnings that aren't safe in the various
external code.
Because:

    boost::join(v | boost::adaptors::transformed([](uint64_t out){return std::to_string(out);}), " ")

is ugly as sin, while:

    tools::join(" ", v)

is nice and simple.

Also removes a few unnecessary boost iterator adaptor includes and uses.
This replaces the NIH epee http server, which does not work all that
well, with an external C++ library called uWebSockets.  Fundamentally this
gives the following advantages:

- Much less code to maintain
- Just one thread for handling HTTP connections versus epee's pool of
threads
- Uses existing LokiMQ job server and existing thread pool for handling
the actual tasks; they are processed/scheduled in the same "rpc" or
"admin" queues as lokimq rpc calls.  One notable benefit is that "admin"
rpc commands get their own queue (and thus cannot be delayed by long rpc
commands).  Currently the lokimq threads and the http rpc thread pool
and the p2p thread pool and the job queue thread pool and the dns lookup
thread pool and... are *all* different thread pools; this is a step
towards consolidating them.
- Very little mutex contention (which has been a major problem with epee
RPC in the past): there is one mutex (inside uWebSockets) for putting
responses back into the thread managing the connection; everything
internally gets handled through (lock-free) lokimq inproc sockets.
- Faster RPC performance on average, and much better worst case
performance.  Epee's http interface seems to have some race condition
that occasionally stalls a request (even a very simple one) for a dozen
or more seconds for no good reason.
- Long polling gets redone here to no longer need threads; instead we
just store the request and respond from the thread pool when there is
new data, or else from a timer (that runs once per second) that times
out long polls.

---

The basic idea of how this works from a high level:

We launch a single thread to handle HTTP RPC requests and response data.
This uWebSockets thread is essentially running an event loop: it never
actually handles any logic; it only serves to shuttle data that arrives
in a request to some other thread, and then, at some later point, to
send some reply back to that waiting connection.  Everything is
asynchronous and non-blocking here: the basic uWebSockets event loop
just operates as things arrive, passes it off immediately, and goes back
to waiting for the next thing to arrive.

The basic flow is like this:

    0. uWS thread -- listens on localhost:22023
    1. uWS thread -- incoming request on localhost:22023
    2. uWS thread -- fires callback, which injects the task into the LokiMQ job queue
    3. LMQ main loop -- schedules it as an RPC job
    4. LMQ rpc thread -- Some LokiMQ thread runs it, gets the result
    5. LMQ rpc thread -- Result gets queued up for the uWS thread
    6. uWS thread -- takes the request and starts sending it
       (asynchronously) back to the requestor.
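The hand-off in steps 2-6 can be sketched generically, with a plain mutex-guarded queue standing in for LokiMQ and a callback standing in for the uWebSockets deferred response write (all names here are hypothetical, not the PR's actual classes):

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// Minimal stand-in for the LokiMQ job queue: the event-loop thread pushes
// closures, a worker thread pops and runs them.
class job_queue {
    std::queue<std::function<void()>> jobs_;
    std::mutex m_;
    std::condition_variable cv_;
    bool done_ = false;
public:
    void push(std::function<void()> j) {
        { std::lock_guard lk{m_}; jobs_.push(std::move(j)); }
        cv_.notify_one();
    }
    void stop() {
        { std::lock_guard lk{m_}; done_ = true; }
        cv_.notify_all();
    }
    void run_worker() {
        std::unique_lock lk{m_};
        while (true) {
            cv_.wait(lk, [&]{ return done_ || !jobs_.empty(); });
            if (jobs_.empty()) return;  // done and drained
            auto j = std::move(jobs_.front());
            jobs_.pop();
            lk.unlock();
            j();  // run the RPC job outside the lock
            lk.lock();
        }
    }
};

// The "uWS thread" role: queue the request and return immediately; `reply`
// stands in for uWebSockets' asynchronous response delivery.
inline void handle_request(job_queue& q, std::string body,
                           std::function<void(std::string)> reply) {
    q.push([body = std::move(body), reply = std::move(reply)] {
        reply("response to: " + body);  // the real RPC handler would run here
    });
}
```

The key property, as in the PR, is that the connection-handling thread never runs request logic itself; it only shuttles data in and out.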

In more detail:

uWebSockets has registered handlers for non-jsonrpc requests (legacy
JSON or binary).  If the port is restricted then admin commands get
mapped to an "Access denied" response handler; otherwise public commands
(and admin commands on an unrestricted port) go to the rpc command
handler.

POST requests to /json_rpc have their own handler; this is a little
different than the above because it has to parse the request before it
can determine whether it is allowed or not, but once this is done it
continues roughly the same as legacy/binary requests.

uWebSockets then listens on the given IP/port for new incoming requests,
and starts listening for requests in a thread (we own this thread).
When a request arrives, it fires the event handler for that request.
(This may happen multiple times, if the client is sending a bunch of
data in a POST request).  Once we have the full request, we then queue
the job in LokiMQ, putting it in the "rpc" or "admin" command
categories.  (The one practical difference here is that "admin" is
configured to be allowed to start up its own thread if all other threads
are busy, while "rpc" commands are prioritized along with everything
else.)  LokiMQ then schedules this, along with native LokiMQ "rpc." or
"admin." requests.

When a LMQ worker thread becomes available, the RPC command gets called
in it and runs.  Whatever output it produces (or error message, if it
throws) then gets wrapped up in jsonrpc boilerplate (if necessary), and
delivered to the uWebSockets thread to be sent in reply to that request.

uWebSockets picks up the data and sends whatever it can without
blocking, then buffers whatever it couldn't send to be sent again in a
later event loop iteration once the requestor can accept more data.
(This part is outside lokid; we only have to give uWS the data and let
it worry about delivery).

---

PR specifics:

Things removed from this PR:

1. ssl settings; with this PR the HTTP RPC interface is plain-text.  The
previous default generated a self-signed certificate for the server on
startup and then the client accepted any certificate.  This is actually
*worse* than unencrypted because it is entirely MITM-readable and yet
might make people think that their RPC communication is encrypted, and
setting up actual certificates is difficult enough that I think most
people don't bother.

uWebSockets *does* support HTTPS, and we could glue the existing options
into it, but I'm not convinced it's worthwhile: it works much better to
put HTTPS in a front-end proxy holding the certificate that proxies
requests to the backend (which can then listen in restricted mode on
some localhost port).  One reason this is better is that it is much
easier to reload and/or restart such a front-end server, while
certificate updates with lokid require a full restart.  Another reason
is that you get an error page instead of a timeout if something is wrong
with the backend.  Finally we also save having to generate a temporary
certificate on *every* lokid invocation.

2. HTTP Digest authentication.  Digest authentication is obsolete (and
was already obsolete when it got added to Monero).  HTTP-Digest was
originally an attempt to provide a password authentication mechanism
that does not leak the password in transit, but still required that the
server know the password.  It only has marginal value against replay
attacks, and is made entirely obsolete by sending traffic over HTTPS
instead.  No client out there supports Digest but *not* Basic auth, and
so given the limited usefulness it seems pointless to support more than
Basic auth for HTTP RPC login.

What's worse is that epee's HTTP Digest authentication is a terrible
implementation: it uses boost::spirit -- a recursive descent parser
meant for building complex language grammars -- just to parse a single
HTTP header for Digest auth.  This is a big load of crap that should
never have been accepted upstream, and that we should get rid of (even
if we wanted to support Digest auth it takes less than 100 lines of code
to do it when *not* using a recursive descent parser).
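For illustration, here is what a simple scanner over a Digest header's key=value pairs might look like (a hypothetical helper, not the PR's code; it assumes the "Digest " scheme prefix has already been stripped and does no validation):

```cpp
#include <map>
#include <string>
#include <string_view>

// Split `key="quoted value"` / `key=token` pairs, separated by commas,
// into a map.  No grammar framework needed: just scan left to right.
inline std::map<std::string, std::string>
parse_digest(std::string_view header) {
    std::map<std::string, std::string> fields;
    while (!header.empty()) {
        auto eq = header.find('=');
        if (eq == std::string_view::npos) break;
        std::string key{header.substr(0, eq)};
        header.remove_prefix(eq + 1);
        std::string val;
        if (!header.empty() && header.front() == '"') {
            header.remove_prefix(1);  // opening quote
            auto end = header.find('"');
            val = std::string{header.substr(0, end)};
            header.remove_prefix(
                end == std::string_view::npos ? header.size() : end + 1);
        } else {
            auto end = header.find(',');
            val = std::string{header.substr(0, end)};
            header.remove_prefix(
                end == std::string_view::npos ? header.size() : end);
        }
        fields[std::move(key)] = std::move(val);
        // skip the separating ", "
        while (!header.empty() && (header.front() == ',' || header.front() == ' '))
            header.remove_prefix(1);
    }
    return fields;
}
```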

Still to do (in another PR):

- Make the same server side change to the wallet_rpc_server.  This is
lower priority because the wallet_rpc server isn't generally a
performance bottleneck (since it usually has just one client).
- Delete the (to-be-)unused epee code for serving http, https, and digest.
- Add libuv submodule (only needed and built on Windows)

- Temporarily switch uWebSockets to my github repo: I submitted some
mingw compilation fixes upstream; as soon as a new upstream release
comes out we will switch this back to the upstream repo.

- Switch BOOST_SCOPE_EXIT to LOKI_DEFER

- Don't compile `closefrom()` on Windows (it isn't used, and generates
an unused function warning).
These includes aren't needed, and wallet2.h is an extremely heavy header
to include.
This moves the wallet structs that are needed elsewhere (for instance,
in the device drivers) from `tools::wallet2::whatever` to
`wallet::whatever`, allowing them to be included individually via
wallet/whatever.h without needing to include the entire wallet2.h.

What was particularly problematic here is that you couldn't even forward
declare them because they were nested inside the wallet2 class, but
instead had to include the full wallet2.h.
Doing a DNS resolution here seems like overkill, and even if you get a
resolution to a non-public IP that doesn't mean it's local/trusted (for
example, you could resolve a .loki address, get 10.0.0.3, and then
specify that).  So just hard-code common ways of specifying localhost
instead, which is really the main point of this.
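A sketch of the hard-coded check (hypothetical name; the exact set of accepted spellings in the PR may differ):

```cpp
#include <string_view>

// Treat a daemon as local only when the host is written as a common
// localhost form.  Deliberately no DNS resolution: a name resolving to a
// non-public IP (e.g. a .loki address resolving to 10.0.0.3) does not
// make it local or trusted.
inline bool is_local_host(std::string_view host) {
    return host == "localhost" || host == "127.0.0.1"
        || host == "::1" || host == "[::1]";
}
```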
The code here was convoluted; simplified it and avoided needing
boost::format.
Red is used for errors, and these are not errors but rather just
recommendations.
In short: epee's http client is garbage, standard violating, and
unreliable.

This completely removes the epee http client support and replaces it
with cpr, a curl-based C++ wrapper.  rpc/http_client.h wraps cpr for RPC
requests specifically, but it is also usable directly.

This replacement has a number of advantages:

- requests are considerably more reliable.  The epee http client code
  assumes that a connection will be kept alive forever, and returns a
  failure if a connection is ever closed.  This results in some very
  annoying things: for example, preparing a transaction and then waiting
  a long time before confirming it will usually result in an error
  communicating with the daemon.  This is just terrible behaviour: the
  right thing to do on a connection failure is to resubmit the request.

- epee's http client is broken in lots of other ways: for example, it
  tries throwing SSL at the port to see if it is HTTPS, but this is
  protocol violating and just breaks (with a several second timeout) on
  anything that *isn't* epee http server (for example, when lokid is
  behind a proxying server).

- even when it isn't doing the above, the client breaks in other ways:
  for example, there is a comment (replaced in this PR) in the Trezor PR
  code that forces a connection close after every request because epee's
  http client doesn't do proper keep-alive request handling.

- it seems noticeably faster to me in practical use in this PR; both
  simple requests (for example, when running `lokid status`) and
  wallet<->daemon connections are faster, probably because of crappy
  code in epee.  (I think this is also related to the throw-ssl-at-it
  junk above: the epee client always generates an ssl certificate during
  static initialization because it might need one at some point).

- significantly reduces the amount of code we have to maintain.

- removes all the epee ssl option code: curl can handle all of that just
  fine.

- removes the epee socks proxy code; curl can handle that just fine.
  (And can do more: it also supports using HTTP/HTTPS proxies).

- When a cli wallet connection fails we now show why it failed (which
  now is an error message from curl), which could have all sorts of
  reasons like hostname resolution failure, bad ssl certificate, etc.
  Previously you just got a useless generic error that tells you
  nothing.

Other related changes in this PR:

- Drops the check-for-update and download-update code.  To the best of
my knowledge these have never been supported in loki-core and so it
didn't seem worth the trouble to convert them to use cpr for the
requests.

- Cleaned up node_rpc_proxy return values: there was an inconsistent mix
  of ways to return errors and how the returned strings were handled.
  Instead this cleans it up to return a pair<bool, val>, which (with
  C++17) can be transparently captured as:

    auto [success, val] = node.whatever(req);

  This drops the failure message string, but it was almost always set to
  something fairly useless (if we want to resurrect it we could easily
  change the first element to be a custom type with a bool operator for
  success, and a `.error` attribute containing some error string, but
  for the most part the current code wasn't doing much useful with the
  failure string).

- changed local detection (for automatic trusted daemon determination)
  to just look for localhost, and to not try to resolve anything.
  Trusting non-public IPs does not work well (e.g. with lokinet where
  all .loki addresses resolve to a local IP).

- ssl fingerprint option is removed; this isn't supported by curl
  (because it is essentially just duplicating what a custom cainfo
  bundle does)

- --daemon-ssl-allow-chained is removed; it wasn't a useful option (if
  you don't want chaining, don't specify a cainfo chain).

- --daemon-address is now a URL instead of just host:port.  (If you omit
  the protocol, http:// is prepended).

- --daemon-host and --daemon-port are now deprecated and produce a
  warning (in simplewallet) if used; the replacement is to use
  --daemon-address.

- --daemon-ssl is deprecated; specify --daemon-address=https://whatever
  instead.

- the above three are now hidden from --help

- reordered the wallet connection options to make more logical sense.
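The richer return type suggested above for node_rpc_proxy (a bool conversion for success plus an `.error` attribute) could look roughly like this. This is a sketch of the suggestion, not the PR's actual code; all names are hypothetical:

```cpp
#include <string>
#include <utility>

// Drop-in for the `bool` first element: still works with
//   auto [success, val] = node.whatever(req);
// but carries an error string for failure cases.
struct rpc_status {
    bool success = false;
    std::string error;
    explicit operator bool() const { return success; }
};

// Hypothetical helpers showing how a node_rpc_proxy method might build
// its return value under this convention.
template <typename T>
std::pair<rpc_status, T> make_result(T val) {
    return {rpc_status{true, ""}, std::move(val)};
}

template <typename T>
std::pair<rpc_status, T> make_error(std::string err) {
    return {rpc_status{false, std::move(err)}, T{}};
}
```

Call sites that only care about success keep the same shape as with `pair<bool, val>`, while those that want a message can read `status.error`.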
This does the opposite of tools::hex_to_type: it converts the data
inside a (simple) type to hex.
- updates to use rpc::http_client instead of epee
- simplifies a bunch of json_rpc calls
- removes update checking
- fixes various other random compilation issues
Upstream master now has the patches we need merged.
@jagerman (Member, Author)

Merged (via #1208)

@jagerman jagerman closed this Aug 17, 2020