Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[All] Entity-centric tracing #1243

Merged
merged 75 commits into from Sep 20, 2021
Merged

Conversation

ramtinms
Copy link
Member

@ramtinms ramtinms commented Sep 2, 2021

This PR

  • refactors tracing to be entity centric
    • for entities (block, collection, transaction) entityID is going to be set as traceID
    • a dummy RootEntity span is also constructed to help keeping spans of an entity generated from the same node together and aligned
    • most of new methods returns isSampled to avoid extra computations for tag values
    • provides in-house proper entity sampling so we don't have to tune params from Jaeger
  • removes non-operational traces (e.g. time between two events), those are going to be captured differently in the future PRs
  • refactors components to pass trace spans using contexts
  • injects tracer into Finalizer
  • removes consensusTracingConsumer as it was not in use

@@ -484,13 +484,15 @@ func NewExecutionCollector(tracer module.Tracer, registerer prometheus.Registere
// StartBlockReceivedToExecuted starts a span to trace the duration of a block
// from being received for execution to execution being finished
func (ec *ExecutionCollector) StartBlockReceivedToExecuted(blockID flow.Identifier) {
ec.tracer.StartSpan(blockID, executionBlockReceivedToExecuted).SetTag("block_id", blockID.String)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to define a metric instead of tracing here, e.g. blockExecutionWaitTime

cmd/scaffold.go Outdated
@@ -291,7 +291,8 @@ func (fnb *FlowNodeBuilder) initMetrics() {

fnb.Tracer = trace.NewNoopTracer()
if fnb.BaseConfig.tracerEnabled {
tracer, err := trace.NewTracer(fnb.Logger, fnb.BaseConfig.NodeRole)
serviceName := fnb.BaseConfig.NodeRole + "-" + fnb.BaseConfig.nodeIDHex[:8]
tracer, err := trace.NewTracer(fnb.Logger, serviceName)
Copy link
Member Author

@ramtinms ramtinms Sep 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We consider each node role + the first 8 digits of the id as service id since info like consensus-001 is not available here.

engine/consensus/matching/core.go Show resolved Hide resolved
engine/consensus/sealing/core.go Outdated Show resolved Hide resolved
engine/consensus/compliance/core.go Outdated Show resolved Hide resolved
engine/consensus/compliance/core.go Outdated Show resolved Hide resolved
engine/consensus/compliance/core.go Outdated Show resolved Hide resolved
module/trace/trace.go Show resolved Hide resolved
module/trace/trace.go Show resolved Hide resolved
module/trace/trace.go Show resolved Hide resolved
model/flow/identifier.go Show resolved Hide resolved
module/tracer.go Show resolved Hide resolved
Copy link
Member

@zhangchiqing zhangchiqing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement!

Copy link
Contributor

@m4ksio m4ksio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the change, but (there is always a but) - I really think we should have some tests for it.
Even adding a tracer for existing tests and asserting is traces matches what we would expect will greatly increase confidence

cmd/scaffold.go Show resolved Hide resolved

e.log.Debug().
Hex("tx_id", logging.Entity(txBody)).
Msg("executing transaction")

txView := collectionView.NewChild()

tx := fvm.Transaction(txBody, txIndex)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we want to move this below trace span as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if I understand? you mean the NewChild() call?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetTraceSpan won't start a trace, it would only pass the parent span reference to transaction object

engine/execution/state/state.go Outdated Show resolved Hide resolved
model/flow/identifier.go Outdated Show resolved Hide resolved
model/flow/identifier.go Outdated Show resolved Hide resolved
@ramtinms
Copy link
Member Author

@m4ksio this PR is already really big, going to add tests for the tracers in a follow up PR to this one.

@ramtinms ramtinms merged commit a783037 into master Sep 20, 2021
@ramtinms ramtinms deleted the ramtin/entity-life-cycle-tracing branch September 20, 2021 18:47
bors bot added a commit that referenced this pull request Jul 20, 2022
2824: chore(tests): remove unused mock r=SaveTheRbtz a=SaveTheRbtz

`MutationContext` was removed in 019f7b4 (part of #1243).

PS. `@gomisha` I'm curious how many of these unused mack files we have in the repo.  Maybe we can automagically detect them during the lint phase?

Co-authored-by: Alexey Ivanov <rbtz@dapperlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants