This repository has been archived by the owner on Aug 30, 2019. It is now read-only.

cmd/trace-agent: cleanup code related to APM events. #506

Merged
gbbr merged 4 commits into master from alexjf/event-extractor on Oct 29, 2018

@AlexJF AlexJF commented Oct 25, 2018

Move most of the code dealing with event extraction into its own package and rename samplers to extractors.

Also, change most references of transactions into events following the documented nomenclature.

This sets the ground for #508

package constants

// SamplingPriorityKey is the key of the sampling priority value in the metrics map of the root span
const SamplingPriorityKey = "_sampling_priority_v1"
Author

^ Regarding this file, I added this to fix some cyclic imports. Happy to move this somewhere else that also works 😄

Contributor

Let's move it into model where it originally was before I plucked it out, as discussed.

@gbbr gbbr changed the title from "Cleanup code related to APM events." to "cmd/trace-agent: cleanup code related to APM events." on Oct 25, 2018
@gbbr gbbr added this to the 6.7.0 milestone Oct 25, 2018
Contributor

@gbbr gbbr left a comment

This is a really nice change. I've just added some suggestions (ideas) around naming and exporting/unexporting stuff.

@@ -0,0 +1,52 @@
package eventextractor
Contributor

Let's just have one main package event, as discussed.


// AnalyzedExtractor is an event extractor that extracts APM events from traces based on
// `(service name, operation name) => sampling ratio` mappings.
type AnalyzedExtractor struct {
Contributor

I'd like to suggest calling it like this:

// ratioExtractor extracts APM events from traces based on `(service name, operation name) => sampling ratio`
// mappings.
type ratioExtractor struct {

What do you think? The word "analyzed" was always a bit confusing to me. But maybe I'm misunderstanding something.


// NewAnalyzed returns an APM event extractor that extracts APM events from a trace following the provided
// extraction rates for any spans matching a (service name, operation name) pair.
func NewAnalyzed(analyzedSpansByService map[string]map[string]float64) *AnalyzedExtractor {
Contributor

If this is going back into a main event package, it will probably sound nice like this:

event.NewRatioExtractor

Unless you think "analyzed" is better for some reason, which is fine by me. I just don't understand why this word is used, given that it doesn't appear in the documentation. What was analyzed (past tense)?


// LegacyAnalyzedExtractor is an event extractor that extracts APM events from traces based on `serviceName => sampling
// ratio` mappings.
type LegacyAnalyzedExtractor struct {
Contributor

Maybe this could also just be legacyExtractor?

// LegacyAnalyzedExtractor is an event extractor that extracts APM events from traces based on `serviceName => sampling
// ratio` mappings.
type LegacyAnalyzedExtractor struct {
analyzedRateByService map[string]float64
Contributor

Do we need the word "analyzed"? Could it maybe simply be "rateByService"?


// inspect the WeightedTrace so that we can identify top-level spans
for _, span := range t.WeightedTrace {
	if s.shouldAnalyze(span) {
Contributor

From here I take it that the word analyze is meant for the future, as in: should the backend analyze this? Right? In that case we can remove the word "analyzed" from all the structs, etc. Here it makes sense, on this method.

type DisabledExtractor struct{}

// NewDisabled returns a new APM event extractor that does not extract any events.
func NewDisabled() *DisabledExtractor {
Contributor

This could just be NoopExtractor if you like, or NewDisabledExtractor (the longer version).

// Extractor extracts APM event spans from a trace.
type Extractor interface {
	// Extract extracts APM event spans from the given weighted trace information and returns them.
	Extract(t model.ProcessedTrace, sampledTrace bool) []*model.APMEvent
Contributor

@gbbr gbbr Oct 26, 2018

I really like this! A lot! I think this is a great idea. For this reason, I'd like to make this the main interface to interact with this package. What do you think? This means:

  • Make all Extractor implementations (noopExtractor, ratioExtractor and legacyExtractor) unexported.
  • Keep all constructors exported (NewRatioExtractor, NewLegacyExtractor and NoopExtractor) and have them all return an Extractor, with the documentation of how each one works placed on its constructor.

I think this will make for a great package. Having this interface will make a lot of sense then.
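
The layout suggested above could be sketched roughly as follows. This is only an illustration, not the code that was merged: Span and ProcessedTrace are simplified, hypothetical stand-ins for the model types, the second Extract parameter is left out, and the ratio extractor keeps every span with a non-zero rate instead of sampling, so that the result is deterministic.

```go
package main

import "fmt"

// Span and ProcessedTrace are hypothetical, simplified stand-ins for the
// agent's model types; only the fields needed by this sketch are present.
type Span struct {
	Service string
	Name    string
}

type ProcessedTrace struct {
	Spans []*Span
}

// Extractor is the only exported type callers interact with.
type Extractor interface {
	Extract(t ProcessedTrace) []*Span
}

// noopExtractor is unexported; it is reachable only through NoopExtractor.
type noopExtractor struct{}

func (noopExtractor) Extract(ProcessedTrace) []*Span { return nil }

// NoopExtractor returns an Extractor that never extracts any events.
func NoopExtractor() Extractor { return noopExtractor{} }

// ratioExtractor holds `(service name, operation name) => ratio` mappings.
type ratioExtractor struct {
	rates map[string]map[string]float64
}

// NewRatioExtractor returns an Extractor driven by per-(service, operation)
// rates. Assumption for this sketch: any rate above zero keeps the span.
func NewRatioExtractor(rates map[string]map[string]float64) Extractor {
	return &ratioExtractor{rates: rates}
}

func (e *ratioExtractor) Extract(t ProcessedTrace) []*Span {
	var events []*Span
	for _, s := range t.Spans {
		// Lookups on a missing service or operation yield a zero rate.
		if e.rates[s.Service][s.Name] > 0 {
			events = append(events, s)
		}
	}
	return events
}

func main() {
	pt := ProcessedTrace{Spans: []*Span{
		{Service: "web", Name: "http.request"},
		{Service: "db", Name: "query"},
	}}
	e := NewRatioExtractor(map[string]map[string]float64{"web": {"http.request": 1}})
	fmt.Println(len(e.Extract(pt)), len(NoopExtractor().Extract(pt)))
}
```

With this shape, callers depend only on Extractor and the constructors, so implementations can be renamed or replaced without touching call sites.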

Contributor

One last thing here: is there any way we can get rid of the second parameter? Is it needed? Wasn't this trace already sampled at this point? Or would it make sense to include it in the ProcessedTrace struct instead? I think in your follow-up PR there can be a middle value (no decision) where the trace has been neither accepted nor rejected yet.

}

// FromConf creates a new APM event extractor based on the provided agent configuration.
func FromConf(conf *config.AgentConfig) Extractor {
Contributor

I'm wondering if this is the right place for this. I'm a bit skeptical about passing around the AgentConfig. It always seemed to me that this config was meant for the agent itself only. Maybe this logic belongs there (currently cmd/trace-agent)?

@AlexJF AlexJF force-pushed the alexjf/event-extractor branch 3 times, most recently from b07d45f to 20efb7e Compare October 29, 2018 08:49
Contributor

@gbbr gbbr left a comment

This is a fantastic clean-up. So much better! ❤️ And thanks for fixing the sneaky problem with missing events for un-sampled traces.

Added some comments but none are important to me, just nits and praises.

Feel free to merge whenever you like, but when you do, please squash and use the PR title as the commit message.

@@ -191,7 +169,7 @@ func (a *Agent) Process(t model.Trace) {
ts := a.Receiver.Stats.GetTagStats(info.Tags{})

// Extract priority early, as later goroutines might manipulate the Metrics map in parallel which isn't safe.
- priority, hasPriority := root.Metrics[sampler.SamplingPriorityKey]
+ priority, hasPriority := root.GetSamplingPriority()
Contributor

👍
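
For readers without the diff at hand, an accessor like the one called above might look like the sketch below. The key value is the one quoted earlier in this thread; the Span type is a hypothetical stand-in, and the (float64, bool) return type is an assumption chosen to mirror the map lookup it replaces.

```go
package main

import "fmt"

// SamplingPriorityKey is the metrics key quoted earlier in this thread.
const SamplingPriorityKey = "_sampling_priority_v1"

// Span is a hypothetical stand-in for the model span type.
type Span struct {
	Metrics map[string]float64
}

// GetSamplingPriority returns the sampling priority stored on the span and
// whether it was set at all. The return types are an assumption.
func (s *Span) GetSamplingPriority() (float64, bool) {
	if s == nil {
		return 0, false
	}
	// Reading from a nil map is safe in Go and reports ok == false.
	p, ok := s.Metrics[SamplingPriorityKey]
	return p, ok
}

func main() {
	root := &Span{Metrics: map[string]float64{SamplingPriorityKey: 2}}
	priority, hasPriority := root.GetSamplingPriority()
	fmt.Println(priority, hasPriority)

	_, hasPriority = (&Span{}).GetSamplingPriority()
	fmt.Println(hasPriority)
}
```

Hiding the key behind a method keeps callers from depending on where the constant lives, which is relevant to the cyclic import discussed at the top of the thread.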

@@ -65,7 +43,7 @@ type Agent struct {
// tags based on their type.
obfuscator *obfuscate.Obfuscator

- sampledTraceChan chan *writer.SampledTrace
+ tracePkgChan chan *writer.TracePackage
Contributor

❤️

defer watchdog.LogOnPanic()
// Everything is sent to concentrator for stats, regardless of sampling.
a.Concentrator.Add(pt)
- }()
+ }(pt)
Contributor

💯

return

if sampled {
pt.Sampled = sampled
Contributor

This can be both inside the if and outside. Doesn't really matter, right? 😄

@@ -0,0 +1,11 @@
package event
Contributor

Would be nice to document this package. It'd provide a nice opportunity to explain what events are in APM.

- Transactions []*model.Span
+ // TracePackage represents the result of a trace sampling operation.
+ //
+ // If a trace was sampled, then Trace will be set to that trace. Otherwise, it will be nil.
Contributor

These two lines seem like they belong above each of the fields in the struct below 😊

- type SampledTrace struct {
- Trace *model.Trace
- Transactions []*model.Span
+ // TracePackage represents the result of a trace sampling operation.
Contributor

@gbbr gbbr Oct 29, 2018

I'd add here (if you think it helps) that "A package can still be valid when Trace is nil if it contains Events. This happens when a trace wasn't sampled but events might have been."
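
The case described here can be illustrated with a small sketch. Span and Trace are hypothetical, simplified stand-ins for the model types, and the element type of Events is an assumption; only the Empty condition is taken from the diff under review.

```go
package main

import "fmt"

// Span and Trace are hypothetical stand-ins for the model types.
type Span struct{ Name string }
type Trace []*Span

// TracePackage mirrors the struct under review: Trace is nil when the trace
// was not sampled, and Events holds whatever was extracted from it.
// The element type of Events is an assumption for this sketch.
type TracePackage struct {
	Trace  *Trace
	Events []*Span
}

// Empty reports whether the package carries neither a trace nor events.
func (p *TracePackage) Empty() bool {
	return p.Trace == nil && len(p.Events) == 0
}

func main() {
	// An unsampled trace that still produced an event: valid, not empty.
	eventsOnly := &TracePackage{Events: []*Span{{Name: "http.request"}}}
	fmt.Println(eventsOnly.Empty())

	// Neither a trace nor events: nothing worth sending.
	fmt.Println((&TracePackage{}).Empty())
}
```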

// If a trace was sampled, then Trace will be set to that trace. Otherwise, it will be nil.
// If events were extracted from a trace, then Events will be populated from these events. Otherwise, it will be empty.
type TracePackage struct {
Trace *model.Trace
Contributor

It feels so unusual for this to be a pointer (to a slice, which is already technically a pointer).

- return s.Trace == nil && len(s.Transactions) == 0
+ // Empty returns true if this TracePackage has no data.
+ func (s *TracePackage) Empty() bool {
+ return s.Trace == nil && len(s.Events) == 0
Contributor

I wonder why this wasn't len(s.Trace) == 0 but it's fine.
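
One likely reason: with Trace declared as a pointer, len cannot be applied to it directly, so the check has to compare against nil. If the field were a plain slice, as the earlier comment about the pointer hints, the len form would work. A minimal sketch of that alternative follows; slicePackage is a hypothetical variant, not a type from the PR, and Span and Trace are simplified stand-ins.

```go
package main

import "fmt"

// Span and Trace are hypothetical stand-ins for the model types.
type Span struct{ Name string }
type Trace []*Span

// slicePackage is a hypothetical variant that stores the trace as a slice
// rather than as a pointer to one.
type slicePackage struct {
	Trace  Trace
	Events []*Span
}

// Empty can use len on both fields: a nil slice has length zero.
func (p *slicePackage) Empty() bool {
	return len(p.Trace) == 0 && len(p.Events) == 0
}

func main() {
	var p slicePackage
	fmt.Println(p.Trace == nil, p.Empty())

	// A non-nil but zero-length trace also counts as empty in this variant.
	p.Trace = Trace{}
	fmt.Println(p.Trace == nil, p.Empty())
}
```

The two forms differ in one edge case: a non-nil pointer to a zero-length trace makes the pointer version non-empty, while the slice version treats it as empty.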

- func randomSampledTrace(numSpans, numTransactions int) *SampledTrace {
- if numSpans < numTransactions {
- panic("can't have more transactions than spans in a RandomSampledTrace")
+ func randomSampledTrace(numSpans, numEvents int) *TracePackage {
Contributor

s/randomSampledTrace/randomTracePackage/g

@gbbr gbbr merged commit 45a5f37 into master Oct 29, 2018
@AlexJF AlexJF deleted the alexjf/event-extractor branch November 16, 2018 09:29