Trace an AMQP Workflow #122

Closed
cheveyo20 opened this issue Feb 8, 2018 · 17 comments

Comments

@cheveyo20

commented Feb 8, 2018

Hi All,

I have a distributed process (called Workflow1 in this case) whose steps communicate over AMQP:

Start --amqp--> General Processing --amqp--> Fork into two queues, Task A and B

I want to trace this with Jaeger :) My first attempt, using inject and extract:

import time
from opentracing import Format

# `tracer` is an already initialized Jaeger tracer;
# `json_data` is the decoded body of the received AMQP message.

# Start
span = tracer.start_span('Workflow')
carrier = {}
tracer.inject(span, Format.TEXT_MAP, carrier)
with tracer.start_span('ProcessInstance_Start', child_of=span) as cspan:
    time.sleep(2)  # placeholder: do some initial processing
# Then send it to General Processing via AMQP,
# passing carrier["uber-trace-id"] along with the AMQP message
span.finish()  # I think the problem lies here!

# General Processing
# receiving the message via AMQP
carrier = {}
carrier['uber-trace-id'] = json_data[0]['variables']['Trace']['value']
parentspan = tracer.extract(Format.TEXT_MAP, carrier)
with tracer.start_span('GeneralProcessing', child_of=parentspan) as span:
    time.sleep(1)  # placeholder: do some general processing

# send to the next topic queue: carrier['uber-trace-id'] etc.

So 4 issues / questions:

  1. Sometimes (not always) I get this result for a trace in the UI:
  2. There are gaps between the spans (queue waiting time). Can I fill these gaps somehow?

  3. Related to 2: I think a span needs to be finished in the service where it was started. But that is the general problem: how do I handle the overall 'Workflow' span? I have to finish it in the first service, which is actually wrong.

  4. Dependency analysis via Spark: it should display a dependency graph that mirrors the workflow, but instead it creates a star-like dependency!

I hope you can help me in any way; pseudocode is also fine.

Thanks in advance

@yurishkuro

Member

commented Feb 9, 2018

Q: is the program that starts the "Workflow" span long running, or does it exit quickly? If the latter, it's possible that the tracer doesn't have time to flush the spans.

2 - why do you need to fill the gaps? The messaging use case is naturally async, it's not unusual that you will have gaps in the timeline. The distance between the spans should reflect the queue wait. The only way to fill the gaps would be to instrument the queue broker so that it emits an additional span from receiving the msg to handing it out to the consumer.

3 - indeed, I recently ran into a similar scenario with long running workflows. Unless you have some sort of higher-level operation that keeps an open span for all that time, there's no way today to model the full workflow as a span. It could be done in systems based on X-Trace event model (like Facebook's Canopy). In Jaeger it would require an ability to submit in-progress spans to the backend (see jaegertracing/jaeger#677)

4 - can you post a screenshot?
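
On the first point (spans lost when the program exits quickly): tracers typically hand finished spans to a background reporter, so a process that exits right away can drop whatever is still buffered. The sketch below illustrates that behaviour with the standard library only. `BufferedReporter` and its methods are hypothetical stand-ins, not the jaeger_client reporter. With jaeger_client itself, the project README suggests a short `time.sleep` followed by `tracer.close()` before the program exits.

```python
import queue
import threading
import time
from typing import List


class BufferedReporter:
    """Hypothetical reporter: spans are queued and sent by a background thread."""

    def __init__(self, send_delay: float = 0.05) -> None:
        self._queue: queue.Queue = queue.Queue()
        self._send_delay = send_delay
        self.sent: List[str] = []
        # daemon=True: the thread dies with the process, taking unsent spans with it
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def report(self, span_name: str) -> None:
        """Called when a span finishes; returns immediately."""
        self._queue.put(span_name)

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is None:  # sentinel placed by close()
                return
            time.sleep(self._send_delay)  # simulated network latency
            self.sent.append(item)

    def close(self) -> None:
        """Block until every queued span has been sent."""
        self._queue.put(None)
        self._worker.join()


reporter = BufferedReporter()
reporter.report("Workflow")
reporter.report("ProcessInstance_Start")
# Exiting here could lose both spans, because the worker has not sent them yet.
reporter.close()  # flush before exit
print(reporter.sent)
```

The point of the sketch is only the ordering: report, then flush, then exit.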

@cheveyo20

Author

commented Feb 9, 2018

To 4) Similar to this:
[image]

But I expected something like this:
Start ----> General Processing ----> Fork into two queues, Task A and B
I understand that the current result makes sense from a technical point of view, because I make everything dependent on the overall process. How can I improve this?

To 1) I put a time.sleep at the end and it seems to be fixed, so the error was caused by the fast exit.

To 3) Is this planned to be integrated?

Is Jaeger the right solution for my issue, or do you know of something that matches my use case better? :)

Thanks

@yurishkuro

Member

commented Feb 10, 2018

The shape of the graph depends on how you are starting the spans. In this case it appears that the Task A span is started as a child of the root span, not of the "General Processing" span. If your processing Start ----> General Processing ----> Fork has some form of queued messages between steps, then each message needs to carry the span context of the currently active span. That does not seem to be the case in your instrumentation.
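
The difference between forwarding the root context unchanged and re-injecting the current span's context at every hop can be sketched without a real tracer. Everything below is a hypothetical toy (`ToyTracer`, `ToySpan`, the `toy-span-id` header, `run_pipeline`); it only records parent links so that the resulting shape is visible.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ToySpan:
    name: str
    span_id: int
    parent_id: Optional[int]


class ToyTracer:
    """Hypothetical stand-in for a tracer; records parent/child links only."""

    def __init__(self) -> None:
        self.spans: List[ToySpan] = []

    def start_span(self, name: str, parent_id: Optional[int] = None) -> ToySpan:
        span = ToySpan(name, len(self.spans) + 1, parent_id)
        self.spans.append(span)
        return span

    def inject(self, span: ToySpan, carrier: Dict[str, str]) -> None:
        carrier["toy-span-id"] = str(span.span_id)

    def extract(self, carrier: Dict[str, str]) -> int:
        return int(carrier["toy-span-id"])


def run_pipeline(tracer: ToyTracer, reinject: bool) -> Dict[str, Optional[str]]:
    """Run Workflow -> GeneralProcessing -> TaskA; return each span's parent name."""
    root = tracer.start_span("Workflow")
    headers: Dict[str, str] = {}
    tracer.inject(root, headers)  # headers travel with the queued message

    for stage in ("GeneralProcessing", "TaskA"):
        span = tracer.start_span(stage, parent_id=tracer.extract(headers))
        if reinject:
            # The outgoing message carries the context of the span that is
            # active now, not the context that arrived with the message.
            headers = {}
            tracer.inject(span, headers)

    names = {s.span_id: s.name for s in tracer.spans}
    return {s.name: names.get(s.parent_id) for s in tracer.spans}


print(run_pipeline(ToyTracer(), reinject=False))
# {'Workflow': None, 'GeneralProcessing': 'Workflow', 'TaskA': 'Workflow'}  (star)
print(run_pipeline(ToyTracer(), reinject=True))
# {'Workflow': None, 'GeneralProcessing': 'Workflow', 'TaskA': 'GeneralProcessing'}  (chain)
```

In a real service the same idea would mean calling the tracer's `inject` with the span created in that service before publishing the next message.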

@cheveyo20

Author

commented Feb 27, 2018

Thanks, I worked it out :)
One last question on this topic: if a subsequent service joins Task A and Task B, is it possible to show its dependency on both Task A and Task B?
The "Join Service" receives two different span contexts.

@yurishkuro

Member

commented Feb 27, 2018

You can certainly capture this relationship by passing two follows-from references when creating the join span. It won't have special display in the UI though - suggestions are welcome how to display it.
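
In the OpenTracing Python API, such references are built with `opentracing.follows_from(span_context)` and passed through the `references` argument of `tracer.start_span`; it is worth checking whether the client version in use records more than the first reference. The sketch below stays in the standard library and only shows the data side: turning the two incoming `uber-trace-id` values (format `trace-id:span-id:parent-span-id:flags`, hex encoded) into two follows-from references. `SpanRef`, `parse_uber_trace_id`, `join_references` and the sample header values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SpanRef:
    """Hypothetical, simplified reference record."""
    ref_type: str  # "CHILD_OF" or "FOLLOWS_FROM"
    trace_id: int
    span_id: int


def parse_uber_trace_id(value: str) -> Tuple[int, int]:
    """Return (trace_id, span_id) from a trace-id:span-id:parent-id:flags string."""
    trace_id, span_id, _parent_id, _flags = value.split(":")
    return int(trace_id, 16), int(span_id, 16)


def join_references(*carriers: Dict[str, str]) -> List[SpanRef]:
    """Build one FOLLOWS_FROM reference per incoming message."""
    refs = []
    for carrier in carriers:
        trace_id, span_id = parse_uber_trace_id(carrier["uber-trace-id"])
        refs.append(SpanRef("FOLLOWS_FROM", trace_id, span_id))
    return refs


# Made-up header values as they might arrive from Task A and Task B
from_task_a = {"uber-trace-id": "abc123:1a:f:1"}
from_task_b = {"uber-trace-id": "abc123:2b:f:1"}
refs = join_references(from_task_a, from_task_b)
print([(r.ref_type, hex(r.span_id)) for r in refs])
```

Both references share the trace ID here, so the join span would stay in the same trace as Task A and Task B.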

@Mulkave

commented Oct 29, 2018

@cheveyo20 can you please share some details on how you got it to work? New to tracing with AMQP 😄

@mohit-chawla

commented Aug 5, 2019

@cheveyo20 Can you share details of how you got it to work?

@yurishkuro, is there a tracer for AMQP (RabbitMQ) in Python? I could not find any. (I am trying to set up Jaeger with Istio. Unfortunately istio-proxy (Envoy) does not support protocols other than HTTP, so I am trying to manually set up instrumentation for microservices that use AMQP.)

@yurishkuro

Member

commented Aug 5, 2019

@mohit-chawla I don't know, haven't used rabbitmq/amqp. Sounds like a reasonable case for OpenTracing instrumentation. Have you checked https://opentracing.io/registry?

@mohit-chawla

commented Aug 5, 2019

@yurishkuro, thanks for your prompt reply. Yes, I checked https://opentracing.io/registry and many other sources. There seems to be one for Go (https://github.com/opentracing-contrib/go-amqp) but, surprisingly, nothing for Python.

@yurishkuro

Member

commented Aug 5, 2019

then you may need to write one :-) . Having Go as a prototype should be helpful

@yurishkuro added the question label Aug 5, 2019

@mohit-chawla

commented Aug 5, 2019

Yes, I think I may end up writing one, and having the Go version as a prototype is useful.

@yurishkuro

Member

commented Aug 5, 2019

I'll close this since it does not really pertain to jaeger-client-python

@cheveyo20 if you could share the instrumentation, it might be useful to others.

@yurishkuro closed this Aug 5, 2019

@mohit-chawla

commented Sep 6, 2019

@yurishkuro, I am trying to report spans manually from AMQP-related services. Can you please point me to a resource on how to report a span to the Jaeger backend manually over HTTP?
My services run in a Kubernetes/Istio environment and I can't figure out how exactly Istio reports spans to jaeger-collector.

@yurishkuro

Member

commented Sep 6, 2019

Python client doesn't support reporting spans over http right now.

@mohit-chawla

commented Sep 6, 2019

@yurishkuro, is there any protocol over which I can send traces to the Jaeger collector manually in Python? If yes, how?

@yurishkuro

Member

commented Sep 7, 2019

Not in Python, I think it's the only Jaeger client that does not support reporting over HTTP.

You can try to implement your own reporter, but our long term plan is to support HTTP in Jaeger directly, it's just nobody has time to implement it. At Uber we always report spans to the agent that runs on the host, I don't know if you can use that w/ AMQP.

@mohit-chawla

commented Sep 11, 2019

Okay, thanks for answering my queries. I am trying to implement my own reporter that uses thrift over HTTP as mentioned here: https://www.jaegertracing.io/docs/1.13/apis/#thrift-over-http-stable using https://github.com/jaegertracing/jaeger-idl/blob/master/thrift/jaeger.thrift

Thanks again.
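
For readers attempting the same thing, the HTTP side of such a reporter might look roughly like the sketch below. It assumes the payload is a `Batch` struct from jaeger.thrift that has already been serialized with Thrift's binary protocol (the serialization itself is not shown), and that the collector accepts it on `/api/traces` with the content type given in the linked documentation. `build_collector_request`, the host name `jaeger-collector` and port 14268 are assumptions for illustration.

```python
import urllib.request

# Content type for Thrift binary payloads, as described in the linked Jaeger API docs
THRIFT_CONTENT_TYPE = "application/vnd.apache.thrift.binary"


def build_collector_request(collector_url: str, batch_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a Thrift-encoded Batch."""
    return urllib.request.Request(
        url=collector_url.rstrip("/") + "/api/traces",
        data=batch_bytes,
        headers={"Content-Type": THRIFT_CONTENT_TYPE},
        method="POST",
    )


# Placeholder bytes; a real reporter would pass the serialized Batch here.
request = build_collector_request("http://jaeger-collector:14268", b"\x00")
print(request.get_method(), request.full_url)

# Sending is left commented out because it needs a reachable collector:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.status)
```

Keeping request construction separate from sending makes it possible to unit test the reporter without a running collector.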
