[WIP] rest: Stream entire utxo set #7759

Closed
wants to merge 2 commits into from

Conversation

@laanwj
Member

laanwj commented Mar 28, 2016

This is by no means ready, but anyhow this builds on #7756 and

  • Adds a streaming API to the HTTP server. This allows streaming data to the client chunk by chunk, which is useful when the entire data is not available at once, or is so large that it wouldn't fit (efficiently) in memory.
  • Allows downloading the entire UTXO set through /rest/utxoset. This is a raw dump of all outputs, the state normally hashed by gettxoutsetinfo. The dump is performed in the background by making use of leveldb snapshotting, so cs_main does not stay locked.
    • This can be useful for analysis purposes if you don't want to mess with Bitcoin Core's database.
    • Filename (via content-disposition) is utxoset-<height>-<bestblockhash>.dat. Custom X-Best-Block and X-Block-Height headers are also added.

It matches:

$ src/bitcoin-cli -datadir=/store/tmp/testbtc gettxoutsetinfo
{
...
  "hash_serialized": "5017f82bbb82a8199ae0fbaa9e5881a0c82575db89e6edd5b39414b35299363b",
...
}
$ wget --content-disposition http://127.0.0.1:8332/rest/utxoset 
2016-03-28 22:58:32 (44.3 MB/s) - ‘utxoset-404681-0000000000000000034854f5a3ab27cfbc220a42c75061dd13d2067cda71191d.dat’ saved [1291578967]
$ ~/bin/dsha256 utxoset-404681-0000000000000000034854f5a3ab27cfbc220a42c75061dd13d2067cda71191d.dat
5017f82bbb82a8199ae0fbaa9e5881a0c82575db89e6edd5b39414b35299363b utxoset
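
For scripts that consume the dump, the file name from the session above can be taken apart again. The following is a minimal C++ sketch, assuming only the `utxoset-<height>-<bestblockhash>.dat` pattern described in this PR. `DumpName`, `MakeDumpFileName` and `ParseDumpFileName` are hypothetical names chosen for illustration and are not part of the patch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper types and names; only the
// utxoset-<height>-<bestblockhash>.dat pattern comes from the PR description.
struct DumpName {
    int height;
    std::string best_block_hash; // 64 lowercase hex characters
};

// Build the file name for a given height and best block hash.
std::string MakeDumpFileName(int height, const std::string& best_block_hash)
{
    return "utxoset-" + std::to_string(height) + "-" + best_block_hash + ".dat";
}

// Parse a file name back into its parts. Returns false if the name does not
// follow the expected pattern; `out` is only written on success.
bool ParseDumpFileName(const std::string& name, DumpName& out)
{
    const std::string prefix = "utxoset-";
    const std::string suffix = ".dat";
    if (name.size() < prefix.size() + suffix.size()) return false;
    if (name.compare(0, prefix.size(), prefix) != 0) return false;
    if (name.compare(name.size() - suffix.size(), suffix.size(), suffix) != 0) return false;

    const std::string middle =
        name.substr(prefix.size(), name.size() - prefix.size() - suffix.size());
    const std::size_t dash = middle.find('-');
    if (dash == std::string::npos || dash == 0) return false;

    const std::string height_str = middle.substr(0, dash);
    const std::string hash = middle.substr(dash + 1);
    if (height_str.size() > 9) return false; // keeps std::stoi from overflowing
    if (height_str.find_first_not_of("0123456789") != std::string::npos) return false;
    if (hash.size() != 64) return false;
    if (hash.find_first_not_of("0123456789abcdef") != std::string::npos) return false;

    out = DumpName{std::stoi(height_str), hash};
    return true;
}
```

Calling `ParseDumpFileName` on the name saved by wget above would yield height 404681 and the best block hash, which can then be compared with the `X-Block-Height` and `X-Best-Block` headers.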

TODO

  • Rebase after #7756 merged
  • Sensibly name and split up commits
  • Clean up and split up code
  • Actually handle errors (you can crash a worker thread right now by disconnecting while downloading)
  • The timeout actually runs while downloading, causing it to break off after downloading. I don't understand why this is. You can work around it with -rpcservertimeout=6000 or such.
  • UTXO set dump doesn't contain keys (?) I'm not sure this format is actually useful this way (see #7758) (fixed in #7848)
  • Other formats, potentially ^^

Note that the HTTP streaming API could in principle also be used for other large data (say, wallet backups), or even for websocket-like event notification.

@paveljanik

Contributor

paveljanik commented Mar 29, 2016

Looks like evhttp_send_reply_chunk_with_cb is new in libevent 2.1 which is in alpha as of now.

@laanwj

Member Author

laanwj commented Mar 29, 2016

> Looks like evhttp_send_reply_chunk_with_cb is new in libevent 2.1 which is in alpha as of now.

Weird. What good is a chunk function if you have no clue whether the data was sent? I'll take a look.

@laanwj laanwj removed the REST label Mar 29, 2016
@laanwj laanwj force-pushed the laanwj:2016_03_utxo_streaming branch Apr 28, 2016
@sipa

Member

sipa commented Jun 2, 2016

It seems the last stable release of libevent (2.0.22) was 2.5 years ago, though the master branch is still being updated. Do we want to wait for libevent 2.1 for this, or find another way?

@laanwj laanwj added this to the Future milestone Jun 9, 2016
@laanwj

Member Author

laanwj commented Jun 9, 2016

> It seems the last stable release of libevent (2.0.22) was 2.5 years ago, though the master branch is still being updated. Do we want to wait for libevent 2.1 for this, or find another way?

Yes, libevent version management is like that, unfortunately.

I'd tend to include a newer libevent in depends, then disable the functionality when building with older libevent.

This will be too late for 0.13. Let's hope there will be a stable libevent release again before 0.14 which includes this. Not holding my breath though.

This builds on #7756 and

- Adds a streaming API to the HTTP server. This allows streaming data to
  the client chunk by chunk, which is useful when the entire data is not
  available at once, or is so large that it wouldn't fit (efficiently) in
  memory.

- Allows downloading the entire UTXO set through `/rest/utxoset`. This
  is a raw dump of all outputs, the state normally hashed by
  `gettxoutsetinfo`. The dump is performed in the background by making
  use of leveldb snapshotting, so cs_main does not stay locked.

    - This can be useful for analysis purposes if you don't want to mess
      with Bitcoin Core's database.

    - Filename (via content-disposition) is
      `utxoset-<height>-<bestblockhash>.dat`. Custom `X-Best-Block` and
      `X-Block-Height` headers are also added.
@laanwj laanwj force-pushed the laanwj:2016_03_utxo_streaming branch to 2498324 Sep 28, 2016
@laanwj

Member Author

laanwj commented Sep 28, 2016

Rebased, updated for boost-removal from httpserver.h

// LogPrintf("set_http_chunk_cb\n");
{
std::unique_lock<std::mutex> lock(cs);
evhttp_send_reply_chunk_with_cb(req, databuf, &http_chunk_cb, this);

@jonasschnelli

jonasschnelli Sep 28, 2016

Member

I guess this requires libevent2.1 (depends package is still on 2.0.x IIRC)

@jonasschnelli

jonasschnelli Sep 28, 2016

Member

Sorry.. that was already discussed.

@laanwj

laanwj Sep 28, 2016

Author Member

Yes, I need to find exactly what version this was introduced in and guard the streaming stuff with #if LIBEVENT_VERSION_NUMBER >= 0x0201XXXX. I think it can only be supported for newer libevent.

@laanwj

laanwj Sep 28, 2016

Author Member

For reference, the commit that introduced evhttp_send_reply_chunk_with_cb (libevent/libevent@8d8decf), first appearing in version 0x02010401, is from 2009, and that's still the beta branch. I'm getting a bit concerned about libevent's release process.
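
The version guard discussed here could look roughly like the sketch below. It assumes the usual libevent layout for `LIBEVENT_VERSION_NUMBER` (major, minor and patch in the top three bytes, a status byte at the bottom) and takes 0x02010401 from this thread as the first version with `evhttp_send_reply_chunk_with_cb`. `STREAMING_MIN_LIBEVENT_VERSION`, `HTTP_STREAMING_AVAILABLE`, `LibeventVersion`, `DecodeLibeventVersion` and `SupportsChunkCallback` are hypothetical names, and the fallback define exists only so the example compiles without libevent headers.

```cpp
#include <cstdint>

// Assumption: in a real build this macro comes from <event2/event.h>.
// The fallback (2.0.22-stable) only lets the sketch compile stand-alone.
#ifndef LIBEVENT_VERSION_NUMBER
#define LIBEVENT_VERSION_NUMBER 0x02001600
#endif

// First version number named in this thread as containing
// evhttp_send_reply_chunk_with_cb.
#define STREAMING_MIN_LIBEVENT_VERSION 0x02010401

// Compile-time switch: streaming code would be wrapped in
// #if HTTP_STREAMING_AVAILABLE ... #endif.
#if LIBEVENT_VERSION_NUMBER >= STREAMING_MIN_LIBEVENT_VERSION
#define HTTP_STREAMING_AVAILABLE 1
#else
#define HTTP_STREAMING_AVAILABLE 0
#endif

// Decoded form of a libevent version number, for logging or diagnostics.
struct LibeventVersion {
    unsigned major_version;
    unsigned minor_version;
    unsigned patch_level;
    unsigned status; // assumed: 0 for a release, nonzero for development builds
};

constexpr LibeventVersion DecodeLibeventVersion(std::uint32_t v)
{
    return LibeventVersion{(v >> 24) & 0xff, (v >> 16) & 0xff, (v >> 8) & 0xff, v & 0xff};
}

// Same comparison as the preprocessor guard, usable on a runtime value.
constexpr bool SupportsChunkCallback(std::uint32_t v)
{
    return v >= STREAMING_MIN_LIBEVENT_VERSION;
}
```

With the fallback value the guard evaluates to 0, which corresponds to building against the 2.0.x libevent currently in depends.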

@jgarzik

jgarzik Sep 29, 2016

Contributor

RE libevent release process -- several projects are feeling the limits of libevent http support, and moving to https://github.com/ellzey/libevhtp

I had to do that in one project, in order to support streaming chunked http downloads.

libevent's http was really written for simple app servers with small replies.

@laanwj

laanwj Sep 29, 2016

Author Member

I know of that project, but I'd prefer to avoid adding another dependency. It can be considered if there is really no other way out, but it seems to me that chunked downloads can be done with that function.

This needs a better interface so that HTTPServer's users (such as rest)
can query capabilities.
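
One possible shape for such a capability query is sketched below. Everything in it is hypothetical: `HTTPCapabilities`, `QueryHTTPCapabilities`, `RestReply` and `HandleUtxoSetRequest` do not exist in the patch, and replying with 501 when streaming is unavailable is an assumption, not what the PR does.

```cpp
#include <string>

// Hypothetical capability description; these names do not exist in the patch.
struct HTTPCapabilities {
    bool streaming; // chunked replies with a per-chunk completion callback
};

// Assumption: the server derives this once from how it was built, for example
// from a LIBEVENT_VERSION_NUMBER check. Here the caller passes the result in.
HTTPCapabilities QueryHTTPCapabilities(bool built_with_chunk_callback)
{
    return HTTPCapabilities{built_with_chunk_callback};
}

struct RestReply {
    int status;
    std::string body;
};

// A REST handler asks for the capability before committing to a streamed reply.
RestReply HandleUtxoSetRequest(const HTTPCapabilities& caps)
{
    if (!caps.streaming) {
        // 501 Not Implemented is one possible choice for an unsupported build.
        return RestReply{501, "UTXO set streaming is not available in this build\n"};
    }
    // The streamed body itself would be produced elsewhere.
    return RestReply{200, ""};
}
```

Keeping the check behind a small struct means rest code does not need to know about libevent version numbers at all.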
@laanwj laanwj force-pushed the laanwj:2016_03_utxo_streaming branch to 5cd59bb Sep 29, 2016
@fanquake fanquake referenced this pull request Oct 3, 2016
@laanwj

Member Author

laanwj commented Oct 18, 2016

Closing this. I think it was a nice experiment but I don't expect to get around to it again in the near future.
If anyone needs this functionality feel free to pick it up.

@laanwj laanwj closed this Oct 18, 2016
@sipa

Member

sipa commented Oct 18, 2016

@laanwj

Member Author

laanwj commented Dec 6, 2017

For anyone thinking about picking this up:

The good news here is that libevent 2.1 is out of alpha, and is stable as of 2.1.7 at this moment.

The bad news is that it might be impossible to stream reliably using libevent's http server. At least it's mentioned as one of the motivations for the libevhtp replacement:

> As far as I know, streaming data back to a client is hard, if not impossible without messing with underlying bufferevents.

So I'm not sure "The timeout actually runs while downloading, causing it to break off after downloading." mentioned in the OP is solvable. It might be with some hack.

@NathanFrench


NathanFrench commented Dec 7, 2017

I will be happy to assist with any migration issues to libevhtp if this project decides to go this route.

Cheers!
