Return "all" from scantxoutset #14584

Closed
domob1812 opened this issue Oct 26, 2018 · 7 comments

Comments

@domob1812
Contributor

scantxoutset allows retrieving UTXOs matching certain criteria, based on a whitelist of desired "addresses" (in a wider sense). However, as far as I can see, there is no way to use this command to retrieve the full UTXO set via the JSON-RPC interface.
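For context, a minimal sketch of what a whitelist-based call looks like on the wire. It only builds the JSON-RPC body for `scantxoutset "start"` with a list of output descriptors; the helper name `build_scantxoutset_request` and the example address are illustrative choices, not part of the issue.

```python
import json

def build_scantxoutset_request(descriptors, request_id=1):
    """Build the JSON-RPC body for `scantxoutset start <scanobjects>`.

    `descriptors` is the whitelist: output descriptors such as "addr(...)".
    There is no descriptor meaning "every output", which is the gap
    this issue is about.
    """
    return json.dumps({
        "jsonrpc": "1.0",
        "id": request_id,
        "method": "scantxoutset",
        "params": ["start", list(descriptors)],
    })

# Example address chosen purely for illustration.
body = build_scantxoutset_request(["addr(1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa)"])
print(body)
```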

I think that such functionality would be useful for certain (non-mainstream) applications, for instance block explorers, building rich lists, and processing coin snapshots. I've seen multiple situations (and have done it myself) where such applications either had to patch Bitcoin Core itself to add a custom RPC method for their task, or had to read the full blockchain (rather than just the UTXO set) via RPC calls. Letting external tools process the full UTXO set through the RPC interface would be useful.

One big caveat with this is, of course, that the full UTXO set is very large and returning it from a single RPC call is probably not a good idea (if it is even possible). Thus we would need to make scantxoutset "step-able". For instance, a new optional argument could specify the maximum number of results the caller wants. The RPC would then return that many and leave the operation in a "paused" state. Follow-up RPC calls would be needed to either "abort" or "continue" the paused scan, with "continue" returning the next batch of results until the scan is done.
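The proposed start / continue / abort flow can be modelled in a few lines. This is a toy in-memory sketch of the idea only: the class `PausableScan`, its method names, and the `results`/`done` fields are hypothetical and do not correspond to anything in Bitcoin Core.

```python
from itertools import islice

class PausableScan:
    """Toy model of a step-able scan; NOT Bitcoin Core's API."""

    def __init__(self, utxos):
        self._utxos = utxos
        self._cursor = None  # iterator of the paused scan, if any

    def start(self, max_results):
        # Begin a new scan and return the first batch.
        self._cursor = iter(self._utxos)
        return self.cont(max_results)

    def cont(self, max_results):
        # Return the next batch of at most `max_results` entries.
        if self._cursor is None:
            raise RuntimeError("no scan in progress")
        batch = list(islice(self._cursor, max_results))
        done = len(batch) < max_results  # a short batch means exhausted
        if done:
            self._cursor = None
        return {"results": batch, "done": done}

    def abort(self):
        # Drop the paused scan without reading the rest.
        self._cursor = None

scan = PausableScan(["utxo0", "utxo1", "utxo2", "utxo3", "utxo4"])
reply = scan.start(max_results=2)
while not reply["done"]:
    reply = scan.cont(max_results=2)
```

In this sketch a scan whose size is an exact multiple of `max_results` needs one extra "continue" call that returns an empty batch; a real design would have to decide how to signal completion.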

What do you think about such an extension? Would it be useful and fit into the general goal for scantxoutset (which seems to be wallet support at the moment)? Or is this already possible and I simply missed how to do it?

@promag
Member

promag commented Oct 26, 2018

I think that pulling the entire set (even in chunks) periodically is bad design. Maybe it would be possible to have a "UTXO set log" to ease building the set externally.

@sipa
Member

sipa commented Oct 26, 2018

I don't think RPC is a very good fit for that purpose, given the large amount of overhead.

@laanwj has worked on streaming the UTXO set over the REST interface before, I think, but AFAIK had to wait for some libevent features to allow sending in multiple chunks.

@domob1812
Contributor Author

@promag: The applications I have in mind are not based on continuously keeping an external version of the UTXO set up to date, but rather processing it infrequently. For instance, an external script that is run just once to produce a "rich list" of addresses. In that case, a "UTXO update event" won't help (although it might be useful to have for other purposes).
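A one-off "rich list" script of the kind described here is essentially a single aggregation pass over the UTXO set. The sketch below assumes each UTXO has already been reduced to an `(address, amount_in_satoshis)` pair; that record shape and the function name `rich_list` are illustrative assumptions.

```python
from collections import defaultdict

def rich_list(utxos, top=10):
    """Sum unspent amounts per address and return the `top` largest."""
    balances = defaultdict(int)
    for address, amount_sat in utxos:
        balances[address] += amount_sat
    # Sort by balance, largest first.
    ranked = sorted(balances.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top]

# Made-up sample data, amounts in satoshis.
sample = [("addrA", 50), ("addrB", 20), ("addrA", 5), ("addrC", 30)]
print(rich_list(sample, top=2))
```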

@sipa: Why would the overhead be larger than with the REST interface? I agree that it is a lot of data and thus at least splitting it into chunks is necessary (as per my suggestion). And of course, this is not something that most users will call - but it can be useful for some specific applications. But for someone who knows what they are doing, shouldn't it be fine to fetch the UTXO set over RPC? That said, if it is possible to read it through REST (or that would be a better approach), then that would be equally useful for the purposes I have in mind.

@sipa
Member

sipa commented Oct 26, 2018

@domob1812 By REST interface I mostly meant in binary form rather than the JSON encoding that RPC necessarily uses.
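To give a rough feel for the difference between a JSON and a binary encoding, the sketch below encodes one made-up UTXO record both ways. The field names and the binary layout are assumptions for illustration only; this is not Bitcoin Core's serialization format.

```python
import json
import struct

# One made-up UTXO record.
txid = bytes(range(32))
vout = 1
amount_sat = 5_000_000_000
# P2PKH-shaped script with a zeroed 20-byte hash (25 bytes total).
script = bytes.fromhex("76a914") + bytes(20) + bytes.fromhex("88ac")

# JSON: binary fields have to be hex-encoded, doubling their size.
as_json = json.dumps({
    "txid": txid.hex(),
    "vout": vout,
    "amount": amount_sat,
    "scriptPubKey": script.hex(),
}).encode()

# Hypothetical binary layout: txid | vout (u32) | amount (u64) |
# script length (u16) | script, all little-endian and unpadded.
as_binary = txid + struct.pack("<IQH", vout, amount_sat, len(script)) + script

print(len(as_json), len(as_binary))
```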

But fair enough; no reason why it shouldn't be possible - but the need for chunking makes it nontrivial to implement.

@laanwj
Member

laanwj commented Oct 26, 2018

+1 for using REST for this; it's a stateless request-only interface, after all

See #7759 for my PR; feel free to pick it up. The libevent issue may no longer be an issue.

> By REST interface I mostly meant in binary form rather than the JSON encoding that RPC necessarily uses.

Many of the REST calls have a format parameter, so you could make it stream in JSON format as well. The advantage of REST, using plain http, is that you can just send JSON records for each UTXO, it doesn't have to be wrapped in a valid JSON-RPC envelope.
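On the client side, a stream of bare JSON records can be consumed one record at a time without holding the whole set in memory. The sketch below assumes one JSON object per line; that framing and the field names are assumptions, since the comment does not specify them, and an in-memory `StringIO` stands in for the HTTP response body.

```python
import io
import json

def iter_utxo_records(stream):
    """Yield one decoded record per non-empty line of `stream`."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

# Stand-in for a streamed HTTP response body.
fake_response = io.StringIO(
    '{"txid": "aa", "vout": 0, "amount": 50}\n'
    '{"txid": "bb", "vout": 1, "amount": 20}\n'
)

# Records are processed as they arrive; nothing is buffered.
total = sum(rec["amount"] for rec in iter_utxo_records(fake_response))
print(total)
```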

@domob1812
Contributor Author

domob1812 commented Oct 26, 2018

@sipa: I don't think that a chunking implementation for scantxoutset would be too hard to do; keeping around the "currently active" scan between calls is not much different from the current state data that is already kept while a scan is in progress. (The data itself would be different, but not the general architecture.)

@laanwj: Indeed, REST looks like a good interface for that - thanks for the pointer to your PR. This is not high priority for me (I mainly wanted to check this as a general idea with the community), but I may pick up your PR at some point in the future if I find time for it.

However, by the same argument ("REST is a stateless request-only interface"), all of scantxoutset and many other RPC calls should be done through REST as well. RPC and REST are just two complementary interfaces that can both be convenient. (But I do agree that streaming the entire result through HTTP, if it can be made to work, is an elegant solution that avoids explicit chunking.)

@achow101
Member

dumptxoutset was added in #16899
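A hedged sketch of invoking that RPC from an external script: it only constructs the HTTP request and does not send it. The URL, credentials, and file name are placeholders, and the single-path parameter form is an assumption; depending on the Bitcoin Core version, further parameters may be accepted or required. Note that `dumptxoutset` writes a snapshot file on the node's side rather than returning the set in the RPC reply.

```python
import base64
import json
import urllib.request

def dumptxoutset_request(path, url="http://127.0.0.1:8332/",
                         user="user", password="pass"):
    """Build (but do not send) a JSON-RPC request for `dumptxoutset`."""
    body = json.dumps({
        "jsonrpc": "1.0",
        "id": "dump",
        "method": "dumptxoutset",
        "params": [path],
    }).encode()
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )

req = dumptxoutset_request("utxo.dat")
# urllib.request.urlopen(req) would send it to a running node.
```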

@bitcoin bitcoin locked and limited conversation to collaborators Oct 26, 2023

6 participants