Recording comprehensive balance change history in evaluation nodes #2191
Comments
This leads to producing another level of objects (a dedicated virtual operation) that must somehow be associated with the original operation performing the change. As I understand it, virtual ops should be used for cases that have no equivalent among real ops but still need some action to be taken, which is not the case here. It also needs another storage layer, an external service, and a lot of work to integrate them. It will also be troublesome on the deployment side, because another entity (called eNode) must be spawned and maintained. The same can be done by extending the account_history_api, which doesn't even need additional data storage, just proper op filtering when retrieving ops for a given account.
The correct way to do this in 2018 is not to "run more stuff inside the same container that listens to a fifo socket" (which also has the side effect of pushing the burden of dealing with the data off to another process that hasn't been written yet, which will need to then reimplement RPC), but to write it to some sane, modern, disk-based database (e.g. rocksdb) that can then be easily queried via the standard rpc protocol already supported by steemd. A "balance_history" plugin or something to avoid account_history, that uses an entirely separate db, that can be queried via entirely separate json-rpc methods. The way we get data in and out of steemd (when we're not also steemd speaking p2p) is via json-rpc and is always via json-rpc. There will not be a plugin that writes to a fifo, there will not be a plugin that sends SQL to an external db, there will not be a plugin that does anything except read and write to local disk (in and to files that will only ever be read by steemd) and add json-rpc methods by which to access that file-based datastore. The options are not exclusively "use our existing, bloated datastore" or "do it out of process". The interfaces are defined, and local disk is fast and cheap, and now under appbase, RPC reads should be too.
The goal here should be to index relatively little in steemd, but make all of the relevant data available to external services (which will be json-rpc clients of steemd) in perpetuity so that those external services can do their indexing. That means probably a lot of memoizing to disk, indexed by block height, so that those external services can fetch the full history by iterating over the block numbers and building their own indices. I would love to see what we call "consensus-only nodes" use much more disk, and be useful as historical databases (without many indexes other than block height) out of the box, in a default config. Then, with the maturation of better indexing services (e.g.
We want to record every change to account balances #2173.
We should clearly put the code that emits a note every time a balance changes in `adjust_balance()`, because while obviously we weren't careful to record every balance change in account history, I am pretty sure that during all phases of Steem's development we were pretty careful to always use `adjust_balance()` to change balances. So if you put it in `adjust_balance()`, then it will always Just Work and we don't have to worry about hunting down all the corner cases we missed, now and in the future.
This will produce balance change notes for everything, including things that already have virtual ops, such as `transfer_operation`. Which is why I'm calling the things that are emitted by this code "balance change notes," not "virtual ops."
For each balance change, we also want to know what caused it (an op, a virtual op, or per-block processing -- it would also be useful to know which phase of per-block processing). This means we have a tree structure.
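To make the tree concrete, here is a minimal sketch of what a balance change note and its cause node could look like. Every name in it (`cause_kind`, `cause_node`, `balance_change_note`, `cause_path`) is hypothetical and chosen for illustration; none of them are existing `steemd` types.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical cause kinds; illustrative names, not steemd types.
enum class cause_kind { block, transaction, operation, virtual_operation, block_phase };

// One node in the cause tree. `parent` is an index into the node vector,
// or -1 for a root (a block or an unconfirmed transaction).
struct cause_node {
    cause_kind kind;
    std::string label;  // e.g. "transfer_operation" or a per-block phase name
    int parent;
};

// A balance change note: which account changed, by how much, and why.
struct balance_change_note {
    std::string account;
    int64_t delta;       // signed change, in the asset's smallest unit
    std::string symbol;
    int cause;           // index of the node that caused the change
};

// Walk from a cause node up to its root and render the path root-first.
std::string cause_path(const std::vector<cause_node>& nodes, int cause) {
    assert(cause >= 0 && cause < static_cast<int>(nodes.size()));
    std::string path;
    for (int i = cause; i != -1; i = nodes[i].parent) {
        path = path.empty() ? nodes[i].label : nodes[i].label + "/" + path;
    }
    return path;
}
```

With nodes for a block, a transaction under it, and a `transfer_operation` under that, a note whose `cause` points at the operation resolves to a path like `block/tx/transfer_operation`.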
Now at any point in time, we need to keep track in a data structure somewhere of what cause to record in the resulting balance change note when `adjust_balance()` happens to be called.
Which means we effectively have `database` keep track of a stack data structure of nodes. Each phase of processing that can serve as a cause will need to push a node onto the stack data structure. (With a sufficiently clever class destructor, popping the stack node from the data structure can be made to occur at all exit points automatically.)
Now already `account_history` is one of the most bloated parts of the database. So instead of writing the nodes to `account_history`, let's write them somewhere else -- for example, stdout, or a plain file whose path is specified in the config file. (We can have the file be a UNIX fifo which a script listens to and records in a database for later querying.)
If we regard operation nodes as being children of transaction nodes, and transaction nodes as being children of block nodes, we can see that each tree is attached to a block, or an unconfirmed transaction. This simplifies the fork handling logic that will be needed in the external script.
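The stack-with-destructor idea can be sketched roughly as follows. This is not the actual `database` class: `enode_tracker`, its nested `scope` guard, and the one-line text format are assumptions made up for the example, and the `adjust_balance()` here only emits a note without touching any state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the part of `database` that tracks causes.
class enode_tracker {
public:
    explicit enode_tracker(std::ostream& out) : out_(out) {}

    // RAII guard: pushes a cause on construction and pops it on destruction,
    // so early returns and exceptions unwind the stack automatically.
    class scope {
    public:
        scope(enode_tracker& t, std::string label) : tracker_(t) {
            tracker_.stack_.push_back(std::move(label));
        }
        ~scope() {
            assert(!tracker_.stack_.empty());
            tracker_.stack_.pop_back();
        }
        scope(const scope&) = delete;
        scope& operator=(const scope&) = delete;
    private:
        enode_tracker& tracker_;
    };

    // Stand-in for adjust_balance(): writes one note line tagged with the
    // current cause path. A real implementation would also change the balance.
    void adjust_balance(const std::string& account, int64_t delta, const std::string& symbol) {
        out_ << current_path() << ' ' << account << ' ' << delta << ' ' << symbol << '\n';
    }

    std::size_t depth() const { return stack_.size(); }

private:
    // Join the stack entries root-first, e.g. "block:100/tx:0/transfer_operation".
    std::string current_path() const {
        std::string path;
        for (const auto& label : stack_) {
            if (!path.empty()) path += '/';
            path += label;
        }
        return path.empty() ? "(none)" : path;
    }

    std::ostream& out_;
    std::vector<std::string> stack_;
};
```

Block, transaction, and operation processing would each declare a `scope` as a local variable; any `adjust_balance()` call made while those guards are alive is tagged with the full path. The output stream could be `std::cout` or a `std::ofstream` opened on a configured path, which covers the stdout and plain-file (or fifo) options.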
Let's call the tree nodes "evaluation nodes," or "enodes" for short.
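Because every enode tree hangs off a block (or an unconfirmed transaction), the consumer side can treat fork handling as "replace everything from this height up." A rough sketch of that idea, assuming the consumer sees blocks in the order they are applied and that a fork switch shows up as a block arriving at a height it has already seen; `note_store` and its methods are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical consumer-side store: notes grouped by the block they hang off.
class note_store {
public:
    // Record the notes emitted while applying block `num` with id `block_id`.
    void apply_block(uint32_t num, const std::string& block_id, std::vector<std::string> notes) {
        assert(num > 0);  // block heights are assumed to start at 1
        // A block at an already-seen height means a fork switch: drop that
        // height and everything above it before storing the new block.
        blocks_.erase(blocks_.lower_bound(num), blocks_.end());
        blocks_[num] = entry{block_id, std::move(notes)};
    }

    // True if height `num` is currently held by the block with id `block_id`.
    bool has_block(uint32_t num, const std::string& block_id) const {
        auto it = blocks_.find(num);
        return it != blocks_.end() && it->second.block_id == block_id;
    }

    // Total number of notes across all stored blocks.
    std::size_t note_count() const {
        std::size_t n = 0;
        for (const auto& kv : blocks_) n += kv.second.notes.size();
        return n;
    }

private:
    struct entry {
        std::string block_id;
        std::vector<std::string> notes;
    };
    std::map<uint32_t, entry> blocks_;
};
```

Notes for unconfirmed transactions would need a separate, disposable bucket; that part is left out here.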
I had working code for all of this here, but it has some very rough edges:
`steemd` has had multiple breaking refactors since then.
This system is potentially very useful because it's quite general. There are many occasions when people want to be able to get statistics or other historical information out of `steemd` (for example right in #2173 a desire's expressed to record the VESTS / STEEM exchange rate every block). The data path created by the `enode` system would work quite well for this.
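As an illustration of the exchange rate case, a per-block note could be as simple as the sketch below. The `vesting_totals` struct, its field names, and the line format are assumptions for the example; the two totals are emitted as raw integers so the consumer can compute the rate at whatever precision it wants.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical snapshot of the two global totals that define the rate.
struct vesting_totals {
    int64_t total_vesting_shares;      // VESTS, in smallest units
    int64_t total_vesting_fund_steem;  // STEEM, in smallest units
};

// Format one per-block note carrying the raw VESTS / STEEM ratio.
std::string rate_note(uint32_t block_num, const vesting_totals& t) {
    assert(t.total_vesting_fund_steem > 0);
    std::ostringstream line;
    line << "block:" << block_num << " vests_per_steem_raw "
         << t.total_vesting_shares << '/' << t.total_vesting_fund_steem;
    return line.str();
}
```

Emitting this once per block through the same output path as the balance change notes would give an external script the full rate history without adding any new index inside `steemd`.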