Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Restructure transaction traces to disambiguate and provide more information #6897

Closed
arhag opened this issue Mar 8, 2019 · 8 comments
Closed

Comments

@arhag
Copy link
Contributor

arhag commented Mar 8, 2019

The current transaction trace is structured in a manner that prevents determining some potentially useful information about transaction execution due to ambiguities in the structure. For example, if an action notifies another contract, and there is another inline action in the inline_traces of the original action, it is not clear whether that inline action was sent by the original action or the notified contract. This information can be useful if an application wishes to react to the event of the inline action but only if sent by the appropriate contract. Adding authorizations can help, but it does not fully resolve the potential ambiguities (also authorizations are not available for context-free actions).

Consider a transaction with a single action alice::act1 which initiates the following sequence of operations:

  1. alice::act1 executes in the alice contract which:
    • calls require_recipient with bob;
    • sends two inline actions alice::act2 and then alice::act3;
    • calls require_recipient with charlie.
  2. alice::act1 executes in the bob contract which:
    • sends an inline action bob::foo;
    • calls require_recipient with david.
  3. alice::act1 executes in the charlie contract which:
    • sends an inline action charlie::bar.
  4. alice::act1 executes in the david contract which does nothing.
  5. alice::act2 executes in the alice contract which:
    • calls require_recipient with david and then erin;
    • sends an inline action alice::act4.
  6. alice::act2 executes in the david contract which does nothing.
  7. alice::act2 executes in the erin contract which does nothing.
  8. alice::act4 executes in the alice contract which does nothing.
  9. alice::act3 executes in the alice contract which does nothing.
  10. bob::foo executes in the bob contract which does nothing.
  11. charlie::bar executes in the charlie contract which does nothing.

The current implementation of nodeos will generate a transaction trace with the following structure:

(alice::act1 -> alice) [1]
   ||
   ||===> (alice::act1 -> bob) [2]
   ||
   ||===> (alice::act1 -> charlie) [3]
   ||
   ||===> (alice::act1 -> david) [4]
   ||
   ||===> (alice::act2 -> alice) [5]
   ||        ||
   ||        ||===> (alice::act2 -> david) [6]
   ||        ||
   ||        ||===> (alice::act2 -> erin) [7]
   ||        ||
   ||        ||===> (alice::act4 -> alice) [8]
   ||
   ||===> (alice::act3 -> alice) [9]
   ||
   ||===> (bob::foo -> bob) [10]
   ||
   ||===> (charlie::bar -> charlie) [11]

The numbers in the bracket act as convenient labels to refer to the individual actions but also show the order of execution of the actions. This would correspond to the order of the global sequence numbers included in each action receipt.

One problem with the above structure is that action traces 9 and 10 both appear as children of action trace 1, however inline action 9 was sent by the execution of action 1 while inline action 10 was sent by the execution of action 2.

Another example of the problem of the above structure is that action trace 4 appears as a notification of action 1 to david in a similar manner as the notifications show in action traces 2 and 3; however, action traces 2 and 3 are there because of a require_recipient calls made during the execution of action 1, while action trace 4 is there because of a require_recipient call made during the execution of action 2.

Finally consider what the returned trace would look like if a failure was encountered during execution of action 5 in between the require_recipient calls that notified david and erin:

(alice::act1 -> alice) [1]
   ||
   ||===> (alice::act1 -> bob) [2]
   ||
   ||===> (alice::act1 -> charlie) [3]
   ||
   ||===> (alice::act1 -> david) [4]
   ||
   ||===> (alice::act2 -> alice) [5] (action trace will include error encountered)

However, up to the point of encountering the error, more information was known that was simply discarded because of the way transaction traces are structured. For example, it was known that action 2 sent inline action 10 and action 3 sent inline action 11, even though actions 10 and 11 did not have a chance to execute yet. It was also known that action 5 has notified david, however the notification to erin and inline action 8 being sent would not have been known.

The transaction trace is structured in a way that makes it easy to follow the order of execution. However, enough information already exists in the action receipts of the traces (specifically the global_sequence) to determine the exact order in which the actions executed. What does not exist is the information of the type described in the paragraphs above.

So this issue proposes changing the transaction trace structure to provide more information about the order in which things were scheduled/notified as well as the scheduling/notifying/dispatching relationship between actions. Since the global_sequence information would still exist, clients could transform the returned trace structure back into the original structure, thus giving users a choice of which representation they wish to see.

Under this new system the transaction trace of the example above would instead be:

(alice::act1 -> alice) [1]
   ||
   ||===> (alice::act1 -> bob) [2]
   ||        ||
   ||        ||===> (bob::foo -> bob) [10]
   ||        ||
   ||        ||===> (alice::act1 -> david) [4]
   ||
   ||===> (alice::act2 -> alice) [5]
   ||        ||
   ||        ||===> (alice::act2 -> david) [6]
   ||        ||
   ||        ||===> (alice::act2 -> erin) [7]
   ||        ||
   ||        ||===> (alice::act4 -> alice) [8]
   ||
   ||===> (alice::act3 -> alice) [9]
   ||
   ||===> (alice::act1 -> charlie) [3]
             ||
             ||===> (charlie::bar -> charlie) [11]

Furthermore, the scenario of encountering a failure during execution of action 5 described earlier would now result in a transaction trace of the following:

(alice::act1 -> alice) [1]
   ||
   ||===> (alice::act1 -> bob) [2]
   ||        ||
   ||        ||===> (bob::foo -> bob) [10] (action trace would have an empty receipt)
   ||        ||
   ||        ||===> (alice::act1 -> david) [4]
   ||
   ||===> (alice::act2 -> alice) [5] (action trace would have an empty receipt but would include error encountered)
   ||        ||
   ||        ||===> (alice::act2 -> david) [6]  (action trace would have an empty receipt and empty parent_seq_num)
   ||
   ||===> (alice::act3 -> alice) [9]  (action trace would have an empty receipt)
   ||
   ||===> (alice::act1 -> charlie) [3]
             ||
             ||===> (charlie::bar -> charlie) [11]  (action trace would have an empty receipt)

More information is provided in the above structure which could help contract developers debugging their contracts. Notice that the actions that did not have a chance to execute fully (or at all) would have an empty receipt. Child traces of an action trace with an empty receipt would not necessarily be complete, but would be as accurate as possible with the information known by the time of the failure.

The base_action_trace structure would be modified as follows:

  • Change the type of the receipt field from action_receipt to optional<action_receipt>.
  • Add a new field parent_seq_num of type optional<uint64_t> which be set to the global_sequence number of the parent action (the action that sent this inline action or called require_recipient causing this action to be dispatched). This field would be empty for the original top-level actions of the transaction.

Implementation

The goal is to only change the transaction trace structure as described above, and not to actually change the way actions get executed. Even if there is a more sensible way to do action dispatching, the order cannot be changed without breaking existing consensus rules (and this issue is not about a consensus protocol upgrade). Thus, the core code of the dispatch logic of actions, in apply_context::exec, will remain the same.

However, the general pattern of allocating new space in the inline_traces with an emplace_back at the time right before executing will be replaced by instead doing that trace allocation (as well as a copy of the action data to be eventually dispatched) during the time of scheduling or notification.

The _notified field of apply_context should be modified to be a vector<pair<account_name, size_t>>. The _inline_actions and _cfa_inline_actions fields of apply_context should be modified to be a vector<size_t>. A new field _trace of type action_trace* should be added to apply_context. On execution of apply_context::exec( action_trace& trace ) this field will be set to &trace.

In apply_context::require_recipient, instead of pushing back recipient into _notified, the code should add action_trace initialized with the current act to the end of trace->inline_traces and then do _notified.emplace_back( recipient, trace->inline_traces.size() - 1 ).

In apply_context::execute_inline and apply_context::execute_context_free_inline, instead of doing an emplace_back( move(a) ) on the _inline_action or _cfa_inline_actions field, respectively, the code should add action_trace initialized with the moved a to the end of trace->inline_traces and then call the method emplace_back( trace->inline_traces.size() - 1 ) on the _inline_action or _cfa_inline_actions field, respectively.

When the new action_trace is initialized as described above, the fields act, context_free, trx_id, block_num, block_time, producer_block_id can and should all be properly initialized (and would never have to modified again). The remaining fields should be initialized with their default constructed values and, with the exception of parent_seq_num, may only be changed during/after execution of the action in apply_context::exec_one. The parent_seq_num field of an action will be modified during the apply_context::exec_one execution for the parent action by iterating through all of the action traces in trace.inline_traces after r.global_sequence is determined.

In apply_context::exec, wherever trace.inline_traces.emplace_back() and trace.inline_traces.back() exist, they will have to be replaced with a lookup in the trace.inline_traces vector with the appropriate index retrieved either from one of the three vectors: _retrieved, _cfa_inline_actions, or _inline_actions. Furthermore, when looping through the actions in _cfa_inline_actions or _inline_actions, the inline_action variable can instead be replaced by the act field of the action_trace found via the index.

@abourget
Copy link
Contributor

abourget commented Mar 8, 2019

@arhag I'm sure you saw this thing https://www.dfuse.io/en/blog/eosq-removes-ambiguity-when-debugging-your-smart-contract .. wondering if we had the complete picture or if we missed something?

Is this change going to swap the "Creation Tree" view for the "Execution Tree" view forever? or will it provide both views in the transaction_trace ?

@arhag
Copy link
Contributor Author

arhag commented Mar 8, 2019

@abourget:

Yes, great job with that.

I believe your "Creation Tree" view is the same as the structure I am proposing for the transaction trace in this issue, although it is difficult to tell for sure with just the examples given on the web page. I assume eos121dacsky was the proxy of the gyzdembyg4ge account in the second example given?

I believe there should be enough information in the proposed transaction trace structure combined with basic rules for how action dispatching works on EOSIO to reliably convert the proposed transaction trace structure into the old transaction trace structure (equivalent to the "Execution Tree" view).

@abourget
Copy link
Contributor

abourget commented Mar 8, 2019

re eos121dacsky: exact.

Why not do the contrary? Things build up during execution, like database operations build upon things created in a certain order. Always storing a creation tree will make it more difficult to reliably know how things built up... or explain certain database modifications.

Why not add a bit of metadata about who creates what when?

You think creation tree is more valuable than execution tree?

@arhag
Copy link
Contributor Author

arhag commented Mar 8, 2019

The tree structure proposed in the issue provides more information than the current tree structure of transaction traces to the point where the latter can be fully derived from the former (that's even without the additional metadata of parent_seq_num which is just added as a convenience for any consumers that immediately throw away the structure).

I'm not sure I understand what alternative you are proposing, but are we in agreement that the existing trace structure (with all its ambiguities that, assuming no special knowledge of the contract sources, can only be resolved with modifications made to the core of nodeos) is less useful than the proposed alternative in this issue?

Is the concern about needing to do a transformation on the proposed tree (essentially a "Creation Tree") to get the "Execution Tree" view which may be viewed as the more useful one?

@abourget
Copy link
Contributor

abourget commented Mar 8, 2019

Yes, agreed there's missing bits and pieces to stitch it all.

I'm just wondering about the expectations of nesting people have today.. if it suddenly swaps around, they might be confused.

But it's a good question.. is creation what people are really after when they inspect the nesting? That might the default assumption. And it can be violated (which is never good :).

One could extract stats about such confusion in today's chains through our endpoint, see when creation trees and execution trees are different. I don't have the stats today..

I know nodeos spits out a flattened version of traces, sort of merged with the nested ones.. maybe having all the necessary info in a flat version, for both things would be interesting. Like a parent_create_seq and parent_exec_seq.. so you can rebuild the tree or navigate it to find those relationships. Having two values would also make it explicit that you need to choose :P and you better understand what you're doing!

arhag added a commit that referenced this issue Mar 30, 2019
Now compiles, but the tests fail due to bugs.
arhag added a commit that referenced this issue Apr 1, 2019
arhag added a commit that referenced this issue Apr 1, 2019
…e_traces #6897

Also correct state history ABI for action_trace_v0 to reflect that 
inline_traces are no longer included.
@arhag
Copy link
Contributor Author

arhag commented Apr 1, 2019

A few changes have been made to the design described above.

First, transaction_trace will switch to a flat structure by including all the action traces in the action_trace vector. action_trace will no longer have an inline_traces vector.

Second, action_trace will have the following new fields:

  • action_ordinal: An ordinal (specifically a variable-length positive integer) used to refer to the particular action trace which will be unique within this transaction trace. Note that the ordinals start with 1, not 0, because 0 is a special value used to represent the source of all the top-level actions in the original transaction.
  • creator_action_ordinal: The ordinal of the parent (within the creation tree) of this action, or 0 if this action is a top-level action in the transaction. The parent of this action within the creation tree is the trace of the action that when executing caused this action to be scheduled.
  • parent_action_ordinal (this was renamed to closest_unnotified_ancestor_action_ordinal in Add optional error_code to transaction and action traces #7108): The ordinal of the parent (within the execution tree) of this action, or 0 if this action is a top-level action in the transaction. The execution tree is the same tree modeled under the original transaction_trace structure prior to the changes described in this issue.
  • receiver: The receiver of this action.

Third, action_trace modifies the type of its receipt to an optional action_receipt. This means the converted JSON may not have the action_receipt field or may have one with a null value. If this happens, it means the trace for the action was scheduled but it was not able to successfully execute. It also means that the receipt of the associated transaction_trace would either not be present or would have a status of hard_fail.

Fourth, transaction_trace has a new optional field account_ram_delta which if present records the RAM usage delta of the particular account paying for the storage of the deferred transaction: An input delayed transaction will have the account_ram_delta field present recording the RAM usage increase for the first authorizer of the transaction. When a deferred transaction retires (regardless of which status) the account_ram_delta field will be present recording the RAM usage decrease for the payer of the deferred transaction.

@arhag
Copy link
Contributor Author

arhag commented Apr 2, 2019

Due to the changes in the transaction_trace structure, the returned JSON object from push_transaction and push_transactions (as they currently exist as of commit c62a5f1) will be different than what existing consumers of the API may expect. In particular, the objects in the action_traces vector will not have the inline_traces field and may potentially be missing the receipt field. Furthermore, the structure of the action traces are modified to flatten all action traces from the original tree structure into a single collection stored in the action_traces vector. Also, additional field have been added but that should not be a concern to existing consumers of the API since the extra JSON fields can just be ignored.

It is desirable to continue returning the current modified trace results in a new RPC endpoint, send_transaction, added to the v1 chain API that has the same signature and behavior as push_transaction. However, to maintain backwards compatibility, push_transaction should be modified to reorganize the transaction_trace (after conversion to fc::variant) to be in the same format as previously expected by consumers with the exception of the additional fields which would still remain.

This means that the action traces that are not for top-level actions should be moved into the inline_traces field (in the correct order) of the appropriate action_trace to reconstruct the execution tree structure expected by current consumers of the results of the push_transaction RPC. It also means ensuring that if an action_trace has a non-empty except, then it also set the receipt field to a action_receipt instance with the receiver and act_digest fields appropriately set, and that any other action_trace with an empty receipt (ones that never even started executing) are removed.

The execution tree can be reconstructed from the set of action_traces by using the parent_action_ordinal (this was renamed to closest_unnotified_ancestor_action_ordinal in #7108) of each action_trace to determine its parent within the tree (if 0 then the parent is the transaction_trace) and then ordering the siblings by ascending global_sequence number.

@arhag
Copy link
Contributor Author

arhag commented Apr 10, 2019

Resolved by #7044.

@arhag arhag closed this as completed Apr 10, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants