Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADDED] Distributed Message Tracing #5014

Merged
merged 7 commits into from Feb 8, 2024
Merged

[ADDED] Distributed Message Tracing #5014

merged 7 commits into from Feb 8, 2024

Conversation

kozlovic
Copy link
Member

When an incoming message contains the header Nats-Trace-Dest set to a valid subject, the server will send there messages representing events happening in each server where the message is processed.

The events will give details about ingress, subject mapping, stream export and service imports, jetstream and egress to subscriptions and/or routes, leafnodes and gateways. Except for a standalone server it is expected to receive multiple trace messages for a given inbound message.

The header Nats-Trace-Only, if set to true, will produce the same tracing events than without that header, except that the message will not be delivered to subscriptions or inserted in the JetStream stream. This is usefull to see the path that a message would take but without affecting running applications.

Note that in a mixed environment where not all servers would be running at 2.11+, when a message is supposed to be traced only and the remote does not understand message tracing, the message will not be forwared and the Egress event for that remote will indicate the reason.

Signed-off-by: Ivan Kozlovic ivan@synadia.com

@ripienaar
Copy link
Contributor

@kozlovic When tracing over an account import/export is it expected that I wont get trace messages once the message cross the account boundary?

Client on account "one" in london -> imports weather.> from account weather -> client on account weather connected across a gateway

In this case I get only 1 trace event with

  "events": [
    {
      "type": "in",
      "kind": 0,
      "cid": 859,
      "name": "NATS CLI Version development",
      "acc": "one",
      "subj": "weather.lon"
    },
    {
      "type": "si",
      "acc": "weather",
      "from": "weather.lon",
      "to": "weather.lon"
    }
  ]

In the same account I get events from otherside of the gateway also

I guess it could be a security concern to trace it through but otoh when debugging would be great to have

@kozlovic
Copy link
Member Author

@ripienaar The decision was to go through service import only if "share: true" is specified in the service import. This was for the reason you just mentioned: security concern. If you have sharing enabled in the service import, then there may be an issue. I have test for service import with sharing/not sharing across super cluster and it passes, but there may still be an issue.

@ripienaar
Copy link
Contributor

@ripienaar The decision was to go through service import only if "share: true" is specified in the service import. This was for the reason you just mentioned: security concern. If you have sharing enabled in the service import, then there may be an issue. I have test for service import with sharing/not sharing across super cluster and it passes, but there may still be an issue.

I didnt have share on, will try

@kozlovic
Copy link
Member Author

@ripienaar I am thinking of adding something to the MsgTraceEgress structure to be able to "link" MsgTraceEvent's from a network topology together. Something like this:

diff --git a/server/msgtrace.go b/server/msgtrace.go
index 88c35d5f..601cf16d 100644
--- a/server/msgtrace.go
+++ b/server/msgtrace.go
@@ -110,6 +110,11 @@ type MsgTraceEgress struct {
        Subscription string `json:"sub,omitempty"`
        Queue        string `json:"queue,omitempty"`
        Error        string `json:"error,omitempty"`
+
+       // This is for applications that unmarshal the trace events
+       // and want to link an egress to route/leaf/gateway with
+       // the MsgTraceEvent from that server.
+       Link *MsgTraceEvent `json:"-"`
 }
 
 // -------------------------------------------------------------

The idea is that the library that assembles message trace events would then return only one event (the one that had a CLIENT as an Ingress) and then by going through the Events list, when getting an Egress that is not a CLIENT kind, there would be the Link field that would not be nil and you could descend the graph. If the Link is nil, it would be an indication that this trace from that server (as given by the Egress's info) is missing.

Any thoughts? Would you think that it should be done differently?

@ripienaar
Copy link
Contributor

Any thoughts? Would you think that it should be done differently?

Yeah anything you can do to make assembling the flows easier would help (been trying to hack up a sorter/linker for them and its difficult!)

@kozlovic
Copy link
Member Author

@ripienaar I will have the library repo ready today.

@kozlovic
Copy link
Member Author

@ripienaar I am coming to the realization that I don't think we need the "Hop" because the Ingress/Egress has already the name of the server the message comes from/sent to, so that should be enough to link them. As for the Hops (the count at the MsgTraceEvent level), again, it is nice, but it is basically the number of Egress that are not a CLIENT kind, so again, not strictly necessary.

@ripienaar
Copy link
Contributor

@ripienaar I am coming to the realization that I don't think we need the "Hop" because the Ingress/Egress has already the name of the server the message comes from/sent to, so that should be enough to link them. As for the Hops (the count at the MsgTraceEvent level), again, it is nice, but it is basically the number of Egress that are not a CLIENT kind, so again, not strictly necessary.

Yeah sounds reasonable :)

Kind of this is why I wanted a kind of draft client to assemble them, we'll discover a lot in doing that

@MauriceVanVeen
Copy link
Contributor

Really cool! Have been trying it out for a bit. 🤩

Not sure if there's also an intention to add tracing between stream => consumer as well as mirroring/sourcing?
Am thinking of a use case where someone publishes into stream 1, a consumer consumes from that stream and publishes 1 or more events into stream 2.
Would that also become available with this tracing, or would that be an extension to this?

Thinking along the lines of a full graph, from the initial message all the way down to how it was processed. (From an OpenTelemetry perspective that could be one 'trace' consisting of multiple 'spans')

@kozlovic
Copy link
Member Author

Would that also become available with this tracing, or would that be an extension to this?

@MauriceVanVeen No, currently the only JetStream related tracing is to know if a message would have been inserted to a stream, with which subject, and if not what caused the message to not be inserted (say presence of a header that causes server checks to reject it).

@hwinkel
Copy link

hwinkel commented Jan 31, 2024

How are the traces exported? Towards "standard" OTEL collectors like jaeger, tempo etc. ? Does the export happens on a per serves bases or centrally like surveyor does ?

@ripienaar
Copy link
Contributor

How are the traces exported? Towards "standard" OTEL collectors like jaeger, tempo etc. ? Does the export happens on a per serves bases or centrally like surveyor does ?

Now the messages are published as JSON (see msgtrace.go for the types) to keep the server light weight, I imagine you'd run a collector process next to your nats servers and the purpose of that would be to receive these, assemble them and send them to otel etc

For now otel/jaeger etc headers do not trigger a trace (though our service latency tracing does support this see https://github.com/nats-io/nats-architecture-and-design/blob/main/adr/ADR-3.md)

The focus of this is primarily debugging though and not, for example, tracing 100% of message routing

@bruth
Copy link
Member

bruth commented Jan 31, 2024

@hwinkel You can think of this as the lower layer of the output. It is published to the designated trace subject which then the subscribed client would do the work to map/export to otel (or anything else). That is the next bit of work to write the exporter.

@bruth
Copy link
Member

bruth commented Jan 31, 2024

And second what @ripienaar said. The scope was initially for trace debugging, albeit, depending on throughput/load of a system, you could opt to trace whatever you want.

@hwinkel
Copy link

hwinkel commented Jan 31, 2024

@hwinkel You can think of this as the lower layer of the output. It is published to the designated trace subject which then the subscribed client would do the work to map/export to otel (or anything else). That is the next bit of work to write the exporter.

I like to have it on a lower, NATS native layer, hence publish to a subject. The OTEL "Formatter and Exporter" could be modelled in the same way as Surveyor already does for the metrics. IMHO a much cleaner NATS native architecture which is location independent for Trace Data.

@ripienaar
Copy link
Contributor

For community who are following these are some renders of traces in various cases

https://gist.github.com/ripienaar/78666080a4296bc5a0e317d1c9c08df8

The cluster is a 3 cluster super cluster each with 3 nodes and a single leafnode hanging off the lon cluster.

@kozlovic
Copy link
Member Author

@ripienaar Just a little comment, I think you should use %q for the strings so they are enclosed with double quotes, it would help for the reading.

@davedotdev
Copy link

I spent some (not negligible) time on this in the networking space working for a large vendor. Multiple approaches were taken, some out-of-band and some in-band, but the outcomes were the same; the desire to have tracing within an OAM domain, for targeted packets or a sample set. A networked system has some configuration to make it part of an OAM domain and each system in that domain must be able to understand the requirements of being part of a tracing capable system, or pass it on without modification exactly like you've done here with the header pass-thru.

What I wanted to raise for debate, was the ability to turn have tracing domains for NATS servers, so that one or more server or group of servers could ignore the header. For instance, this could be useful in secure environments, where some groups can trace and others cannot, or in situations where visibility needs to be reduced. Initial suggestion would be to have a server tag like "OAMDomain": "[X, Y, Z]". Is it something that's been thought about? Would any of the commenters here find it useful? I can see a requirement for it and if it changes any design decisions being made, it's worth raising for future considerations.

A good use-case would be hub and spoke, where the spokes are under your control, but the hubs are not. Traffic would traverse the hub in either case, but the traces would show the hub ingress and egress points and the resulting trace would show the delay, but not reveal anything about the hub.

Nice work Ivan - this really is great (and I'm not griping - just thinking about wider use cases!!).

@ripienaar
Copy link
Contributor

A good use-case would be hub and spoke, where the spokes are under your control, but the hubs are not. Traffic would traverse the hub in either case, but the traces would show the hub ingress and egress points and the resulting trace would show the delay, but not reveal anything about the hub.

for now our only control here is when crossing account barriers if you dont set sharing true you wont get traces from the other side. One can imagine spoke/leafnode on an account and access to it over exports/imports would control visibility

I dont want to over complicate things from the get-go so but this is a good point for future iterations. I imagine we'd want to eventually support disabling tracing on an account level for sensitive cases. Much can be learned from seeing which mappings apply etc

@ripienaar
Copy link
Contributor

For me I am happy with where this is to merge it. We might iterate and once it’s in a bit look at how it would be to trigger it from headers from other systems like we support for service traces

for now in the interest of getting it in peoples hands via nightly builds this works well.

To be clear I didn’t code review I looked at functionality only

@bruth
Copy link
Member

bruth commented Feb 2, 2024

I agree with getting it merged, knowing it may need some iteration. What are your thoughts @Jarema?

Once merged, I want to tag a preview release with this feature so people can access a build to try it out.

Copy link
Member

@derekcollison derekcollison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM - Some comments and questions.

server/client.go Outdated Show resolved Hide resolved
server/client.go Outdated Show resolved Hide resolved
server/client.go Outdated Show resolved Hide resolved
server/client.go Outdated
if mt != nil {
mt.addServiceImportEvent(siAcc.GetName(), string(pacopy.subject), to)
// If we are not sharing and doing trace only, we stop at this level.
if !share {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted I messed this up, let's switch to allow_trace for service export and stream import.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do in a separate commit.

server/client.go Show resolved Hide resolved
server/events.go Outdated Show resolved Hide resolved
server/jetstream_cluster.go Outdated Show resolved Hide resolved
server/msgtrace.go Show resolved Hide resolved
server/stream.go Show resolved Hide resolved
@Jarema
Copy link
Member

Jarema commented Feb 5, 2024

Yes, agree with merging and doing a preview release @bruth .
This will allow us to get more community feedback which could be invaluable for feature like this one.
The only concern is conflicts when cherry picking for patch releases, but that is unavoidable, as anyway this branch would need constant rebases and updates which is similar cost.
Let's merge and iterate.

@ripienaar
Copy link
Contributor

We're working on a fix about import sharing, will look to merge after

When an incoming message contains the header `Nats-Trace-Dest` set
to a valid subject, the server will send there messages representing
events happening in each server where the message is processed.

The events will give details about ingress, subject mapping, stream
export and service imports, jetstream and egress to subscriptions
and/or routes, leafnodes and gateways. Except for a standalone server
it is expected to receive multiple trace messages for a given inbound
message.

The header `Nats-Trace-Only`, if set to `true`, will produce the
same tracing events than without that header, except that the message
will not be delivered to subscriptions or inserted in the JetStream
stream. This is usefull to see the path that a message would take
but without affecting running applications.

Note that in a mixed environment where not all servers would be
running at 2.11+, when a message is supposed to be traced only and
the remote does not understand message tracing, the message will
not be forwared and the Egress event for that remote will indicate
the reason.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
@kozlovic
Copy link
Member Author

kozlovic commented Feb 6, 2024

I'll just add a log line and sleep :)

Not ideal, but this is a more general issue of interest propagation with gateways, unfortunately.

@ripienaar
Copy link
Contributor

thanks @kozlovic

@ripienaar
Copy link
Contributor

I am good to merge this @derekcollison @kozlovic @bruth

@derekcollison
Copy link
Member

ok so ready for final review?

Will try to get to this tomorrow.

@kozlovic
Copy link
Member Author

kozlovic commented Feb 7, 2024

@derekcollison Actually, could you let me check if I could replace addition of boolean in server's INFO/CONNECT in order to know if the remote supports message tracing, with the use of the Proto field. I will update the PR with a note when I am done (either with a new commit or saying that I am not changing and it is ok to review).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
@kozlovic
Copy link
Member Author

kozlovic commented Feb 7, 2024

@derekcollison Ok, I have made the changes. So you are good to go to review the last 2 commits, which were added since you have last reviewed. Thanks!

@derekcollison derekcollison self-requested a review February 7, 2024 23:03
Copy link
Member

@derekcollison derekcollison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - two minor comments.

@@ -2478,7 +2507,11 @@ func (a *Account) AddMappedStreamImportWithClaim(account *Account, from, to stri
a.mu.Unlock()
return ErrStreamImportDuplicate
}
a.imports.streams = append(a.imports.streams, &streamImport{account, from, to, tr, nil, imClaim, usePub, false})
// TODO(ik): When AllowTrace is added to JWT, uncomment those lines:
// if imClaim != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe coordinate with @aricart to get it added.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There would be also some code to add in updateAccountClaimsWithRefresh() to possibly call the function to set the service export allow trace property. And I would like to add a test with JWT of course. I would prefer to wait and add the JWT support before merging, but that would depend on how long it would take to get the JWT library updated. It can of course all be added after the merge. Your call Derek.
@aricart Would you mind adding a AllowTrace boolean property to a ServiceExport and StreamImport? If you are using common structures, validation should make it an error if this property is set on a ServiceImport or a StreamExport (the opposite of where it is valid).

server/accounts.go Show resolved Hide resolved
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
@derekcollison derekcollison merged commit cca23cd into main Feb 8, 2024
4 checks passed
@derekcollison derekcollison deleted the msg_tracing branch February 8, 2024 18:59
derekcollison pushed a commit that referenced this pull request Feb 10, 2024
When an incoming message contains the header `Nats-Trace-Dest` set to a
valid subject, the server will send there messages representing events
happening in each server where the message is processed.

The events will give details about ingress, subject mapping, stream
export and service imports, jetstream and egress to subscriptions and/or
routes, leafnodes and gateways. Except for a standalone server it is
expected to receive multiple trace messages for a given inbound message.

The header `Nats-Trace-Only`, if set to `true`, will produce the same
tracing events than without that header, except that the message will
not be delivered to subscriptions or inserted in the JetStream stream.
This is usefull to see the path that a message would take but without
affecting running applications.

Note that in a mixed environment where not all servers would be running
at 2.11+, when a message is supposed to be traced only and the remote
does not understand message tracing, the message will not be forwared
and the Egress event for that remote will indicate the reason.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>

---------

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
kozlovic added a commit that referenced this pull request Feb 11, 2024
If the `Nats-Trace-Dest` header is not present, but `Traceparent` is
and its last token is `01`, then message tracing is triggered. This
also requires that the account be defined with a `trace_dest` subject
so that traces can be sent there.
Note that `Nats-Trace-Only` is not applicable for `Traceparent`.

Addition to PR #5014
Resolves #5052

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
derekcollison added a commit that referenced this pull request Feb 13, 2024
If the `Nats-Trace-Dest` header is not present, but `Traceparent` is and
its last token is `01`, then message tracing is triggered. This also
requires that the account be defined with a `trace_dest` subject so that
traces can be sent there.
Note that `Nats-Trace-Only` is not applicable for `Traceparent`.

Addition to PR #5014
Resolves #5052

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
kozlovic added a commit that referenced this pull request Feb 14, 2024
Related to #5014

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
kozlovic added a commit that referenced this pull request Feb 15, 2024
Related to #5014

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
kozlovic added a commit that referenced this pull request Feb 15, 2024
Related to #5014

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
derekcollison added a commit that referenced this pull request Feb 15, 2024
Related to #5014

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
kozlovic added a commit that referenced this pull request Feb 22, 2024
If an account has a trace destination and an incoming message has
the `traceparent` header with the proper sampled flag, a trace
message would be triggered. The `sampling` field allows to trace
a certain percentage of the traffic.

The field `trace_dest` or now `msg_trace` can be a simple string
representing the destination, and in this case sampling is 100%
or it can be a structure with the `dest` and `sampling` fields.
Sampling values that are negative or above 100 will trigger an
error on configuration parsing. A value of 0 is converted to 100.
If the sampling is specified without an account trace destination,
it is set to 0 and a warning is issued when parsing configuration.

There is similar support for the property set in a JWT account claim.

Relates to #5014

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
kozlovic added a commit that referenced this pull request Feb 22, 2024
If an account has a trace destination and an incoming message has
the `traceparent` header with the proper sampled flag, a trace
message would be triggered. The `sampling` field allows to trace
a certain percentage of the traffic.

The field `trace_dest` or now `msg_trace` can be a simple string
representing the destination, and in this case sampling is 100%
or it can be a structure with the `dest` and `sampling` fields.
Sampling values that are negative or above 100 will trigger an
error on configuration parsing. A value of 0 is converted to 100.
If the sampling is specified without an account trace destination,
it is set to 0 and a warning is issued when parsing configuration.

There is similar support for the property set in a JWT account claim.

Relates to #5014

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
derekcollison added a commit that referenced this pull request Feb 23, 2024
If an account has a trace destination and an incoming message has the
`traceparent` header with the proper sampled flag, a trace message would
be triggered. The `sampling` field allows to trace a certain percentage
of the traffic.

The field `trace_dest` or now `msg_trace` can be a simple string
representing the destination, and in this case sampling is 100% or it
can be a structure with the `dest` and `sampling` fields. Sampling
values that are negative or above 100 will trigger an error on
configuration parsing. A value of 0 is converted to 100. If the sampling
is specified without an account trace destination, it is set to 0 and a
warning is issued when parsing configuration.

There is similar support for the property set in a JWT account claim.

Relates to #5014

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
@MikaelElkiaer
Copy link

After looking through this PR and related ones, it is still unclear to me whether spans can be exported.

Based on this reply I guess it is not yet possible to export OTEL spans to a service such as OTEL Collector or Grafana Tempo.
If not, is this in the pipeline? I would love to export spans to OTEL Collector to link NATS spans to my application spans in case of erroneous requests.

@ripienaar
Copy link
Contributor

We do not export spans exactly - we have no otel dependencies etc - but a receiver for these messages can be written that turns them into spans and sends them to a collector.

We have some early reviewers who have written such receivers and reported success with that approach

@MikaelElkiaer
Copy link

We do not export spans exactly - we have no otel dependencies etc - but a receiver for these messages can be written that turns them into spans and sends them to a collector.

We have some early reviewers who have written such receivers and reported success with that approach

Thanks for clarifying.
I had assumed it was something written to stdout at first, but it makes sense that it is a message subject.
Does it need to be enabled? Is there any docs on this?

@ripienaar
Copy link
Contributor

It is available in the preview releases and main of the CLI

see https://github.com/nats-io/nats-architecture-and-design/blob/main/adr/ADR-41.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants