Better support for cluster environments #471
Comments
I'm in a slightly different situation. I'm running several game services in a cluster, and each service has different functionality. It would be great if all services (clients) could connect to a single GUI with their timelines aligned.
Cluster tooling is an entirely different can of worms, but FWIW we'd also be very interested in even just basic support for this use case (where "basic support" would probably mean ingesting data from several processes and aligning the times).
Ingesting data from several processes and aligning the times would be enough for my case. For now I'm hacking around this with a simple proxy that the cluster processes connect to, and which acts as the only client to Tracy.
Making a proxy that would mux multiple clients would be the preferred solution here. To properly handle thread identifiers, which may be duplicated across different processes, you may use the already existing encoding in tracy/import-chrome/src/import-chrome.cpp, lines 143 to 183 (at commit e1395f5).
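For illustration, here is a minimal sketch of that kind of thread-identifier remapping, assuming each incoming event carries a process id and a thread id. This is not the exact code from import-chrome.cpp, just one way to assign a stable synthetic id per (pid, tid) pair:

```cpp
// Sketch only: give each (process, thread) pair a unique synthetic thread id
// when merging streams from several processes. The exact encoding used by
// import-chrome.cpp may differ; see the referenced lines for the real thing.
#include <cstdint>
#include <unordered_map>

class ThreadIdRemapper
{
public:
    // Returns a stable synthetic id for the given (pid, tid) pair.
    uint64_t Remap( uint64_t pid, uint64_t tid )
    {
        const auto key = ( pid << 32 ) | ( tid & 0xFFFFFFFF );  // assumes tids fit in 32 bits
        auto it = m_map.find( key );
        if( it == m_map.end() )
        {
            it = m_map.emplace( key, m_next++ ).first;
        }
        return it->second;
    }

private:
    std::unordered_map<uint64_t, uint64_t> m_map;
    uint64_t m_next = 1;
};
```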
You can see how this works in #213 (comment). In 0.9 there were many changes in how timeline items are handled, which is not really visible to users right now. Each track displayed on the timeline is now an instance of https://github.com/wolfpld/tracy/blob/master/server/TracyTimelineItem.hpp, and the management of these items is now well defined in https://github.com/wolfpld/tracy/blob/master/server/TracyTimelineController.hpp, instead of the mess it was before. The takeaway here is that it should now be relatively easy to rearrange the threads so that threads originating from the same process are next to each other, or to add different colorings to thread backgrounds, etc.
Just wanted to add that at our company we are looking into integrating Tracy into our development environment, and we also need to merge Tracy data from multiple sources: at least two, but perhaps more, that are either on the same machine or distributed across multiple machines. Sounds like we have a very similar problem to everyone else in this thread. A mux is a good idea, especially if we can also use that mux to record a trace for later analysis. Great work on Tracy!
I tried my hand at writing a mux, and I thought I might add some color to this conversation based on my experience over the last few days.

My application involves remote introspection of a target system comprising many Tracy-instrumented processes running at the same time. It would be much more convenient to run a mux/proxy on the target side, which aggregates the streams from all processes into one point, shoving them all onto one unified timeline. The idea is that the mux would then present the aggregated data stream on tcp/w.x.y.z:8085, which would be easy to open up on a firewall and push over the internet to the profiling user interface running on some remote host (i.e. with …).

Towards writing this mux, I was able to fairly easily scan for the UDP broadcast packets sent out on port 8086 by Tracy clients. Decoding them was fairly straightforward, and I was able to extract the TCP listen port, which all Tracy-instrumented processes negotiate to be unique on start-up (it looks like the first one gets 8086, the next one gets 8087, and so on, up to a hard-coded maximum of 20).

This is where things fell apart. I had intended to spin up a thread to start a worker to bind to all TCP streams, collect, and forward. However, I can't seem to work out how the handshake / LZ4 encoding works for the TCP stream, and how the on-demand and regular implementations of the TCP protocol differ from each other! I can probably work it out by following the code.

Here are some hacky implementations of UDP listeners (the first version using the network protocol API in Tracy, and another version using Boost.Asio) for anybody who wants a starting point. Here's a CMakeLists.txt:

```cmake
cmake_minimum_required(VERSION 3.5)
project(tracy_mux)

add_definitions(-DTRACY_ENABLE)

## TRACY CODE ###########################################
# Fetch the core interface library and make available to the next steps
include(FetchContent)
FetchContent_Declare(
  tracy
  GIT_REPOSITORY https://github.com/wolfpld/tracy.git
  GIT_TAG master
  GIT_SHALLOW TRUE
  GIT_PROGRESS TRUE)
FetchContent_MakeAvailable(tracy)
FetchContent_GetProperties(tracy)
message(STATUS "tracy: ${tracy_SOURCE_DIR} ${tracy_BINARY_DIR}")

## TRACY PROFILER UI #######################################
# Build the tracy profiler (server and UI)
include(ExternalProject)
ExternalProject_Add(tracy_profiler
  SOURCE_DIR ${tracy_SOURCE_DIR}/profiler/build/unix
  CONFIGURE_COMMAND ""
  BUILD_COMMAND ${CMAKE_COMMAND} -E env LEGACY=1 make -j all
  INSTALL_COMMAND cp ${tracy_SOURCE_DIR}/profiler/build/unix/Tracy-release ${CMAKE_CURRENT_BINARY_DIR}/tracy
  BUILD_IN_SOURCE TRUE)

## TRACY MUXER ###########################################
find_package(Boost REQUIRED COMPONENTS thread)

add_executable(tracy_muxer_native tracy_muxer_native.cpp)
target_link_libraries(tracy_muxer_native TracyClient)

add_executable(tracy_muxer_boost tracy_muxer_boost.cpp)
target_link_libraries(tracy_muxer_boost TracyClient)
target_link_libraries(tracy_muxer_boost ${Boost_LIBRARIES})
```

One of the strange things about the native version of the UDP listener is that it finds itself! In other words, when you run it, you see something like this...
Also, don't be a numpty like me and forget to …
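For anyone who wants to experiment with the discovery step described above, here is a minimal POSIX-sockets sketch of a listener for Tracy's UDP broadcasts on port 8086. The packet layout is defined by Tracy's BroadcastMessage struct in TracyProtocol.hpp; the field order assumed below (a 16-bit broadcast version followed by the 16-bit listen port, in native byte order) is an assumption and should be checked against the header for the Tracy version in use.

```cpp
// Minimal sketch of a discovery listener for Tracy's UDP broadcasts (port 8086).
// The real packet layout lives in Tracy's TracyProtocol.hpp (BroadcastMessage);
// the offsets below are assumed and must be verified against that header.
#include <arpa/inet.h>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
    const int sock = socket( AF_INET, SOCK_DGRAM, 0 );
    if( sock < 0 ) { perror( "socket" ); return 1; }

    // Allow several listeners (including Tracy clients on the same host) to share the port.
    int opt = 1;
    setsockopt( sock, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof( opt ) );

    sockaddr_in addr {};
    addr.sin_family = AF_INET;
    addr.sin_port = htons( 8086 );           // Tracy broadcast port
    addr.sin_addr.s_addr = INADDR_ANY;
    if( bind( sock, (sockaddr*)&addr, sizeof( addr ) ) < 0 ) { perror( "bind" ); return 1; }

    char buf[512];
    for(;;)
    {
        sockaddr_in from {};
        socklen_t fromLen = sizeof( from );
        const ssize_t len = recvfrom( sock, buf, sizeof( buf ), 0, (sockaddr*)&from, &fromLen );
        if( len < 4 ) continue;              // too short to be a broadcast message

        uint16_t broadcastVersion, listenPort;
        memcpy( &broadcastVersion, buf, 2 );     // assumed field order, see TracyProtocol.hpp
        memcpy( &listenPort, buf + 2, 2 );
        printf( "client at %s announces listen port %u (broadcast version %u)\n",
                inet_ntoa( from.sin_addr ), (unsigned)listenPort, (unsigned)broadcastVersion );
    }
    close( sock );
}
```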
My company is also using Tracy, great work! We could also benefit a lot from the requested enhancement to support merging the traces of multiple clients in one GUI, especially to profile network latencies.
Such a feature would also be very useful for, e.g., profiling applications that spawn child processes. Build systems are one example where it'd be quite nice to have an end-to-end view of the performance timeline across all processes.
Hi all! I am joining the team of people who would be interested in a way to collect multiple process traces into the same GUI window. My company is willing to let me do some work on open-source projects that are important for us, and I'd be happy to contribute here. If you feel like you'd accept a contribution on that topic, I could help. (To be honest, I will surely need some help/guidance to make it happen.)
Sure.
Nice! Can I suggest the following plan?
I just wanted to warn you that I am almost done with an initial multiplexer prototype, so that you don't end up doing duplicate work. I have a few bugs to iron out, but I am at a stage where broadcasting clients are automatically adopted, all client events are woven into a single event stream by splitting at ThreadContext boundaries, and server queries are broadcast to all clients, with the single most appropriate response picked.

Edit: My current progress on the prototype can be found here: https://github.com/cipharius/tracy/blob/feature/multiplex/multiplex/src/multiplex.cpp

And a little preview of how it's looking right now: I have conveniently hidden the Tracy thread zones in that screenshot, because those currently get messed up when new clients connect; I still need to figure that out. On Linux I'm not seeing any thread ID conflicts, so I didn't bother creating pseudo IDs yet.
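To make the "splitting at ThreadContext boundaries" idea more concrete, here is a purely conceptual sketch of the weaving step. The types are hypothetical placeholders, not Tracy's actual protocol structures; the prototype linked above is the authoritative reference.

```cpp
// Conceptual sketch of weaving per-client event chunks into one stream.
// All types here are hypothetical illustrations, not Tracy's real structures.
#include <cstdint>
#include <deque>
#include <vector>

struct Event { /* one decoded queue item from a client stream */ };

struct Chunk
{
    uint64_t clientId;
    uint64_t threadId;              // thread id as reported by that client
    std::vector<Event> events;      // events between two ThreadContext markers
};

// Interleave per-client chunks into one output stream. Before each chunk, a new
// ThreadContext is emitted with a thread id remapped to be unique across clients
// (e.g. using a (clientId, threadId) -> synthetic id map as sketched earlier).
void Weave( std::vector<std::deque<Chunk>>& perClient,
            void (*emitThreadContext)( uint64_t syntheticTid ),
            void (*emitEvent)( const Event& ) )
{
    bool progress = true;
    while( progress )
    {
        progress = false;
        for( auto& queue : perClient )
        {
            if( queue.empty() ) continue;
            Chunk chunk = std::move( queue.front() );
            queue.pop_front();
            // Illustration only: pack client id and thread id into one synthetic id.
            const uint64_t syntheticTid = ( chunk.clientId << 32 ) | chunk.threadId;
            emitThreadContext( syntheticTid );
            for( const auto& ev : chunk.events ) emitEvent( ev );
            progress = true;
        }
    }
}
```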
Very kind of you to warn :) I had started digging into the existing code to get a sense of how things worked, but that's not lost time at all anyway. I can confirm my company is giving me time to work on this (roughly half a day per week). @cipharius would you accept help on your branch to make this happen? The minor caveat is that I am on holidays from mid-April to early May, so if you go too fast you might well be finished before I get back and try to help ^^
Sure, you can try to help; I'll have to update the branch with local changes first. Though with the code being very prototypical and changing a lot, it might be tough to collaborate on it effectively. The most helpful feedback right now would be testing it out. Right now I'm trying to figure out the last crucial bit: normalising the time between clients so that the timeline is displayed correctly. You can try figuring out how time is represented in Tracy, but by that time I might have figured out what's going wrong with my current attempts. The most neutral help would be improving and testing the build scripts, since I have only tested on Linux and didn't pay too much attention to customising them. So it would be good to see if it builds on Windows, for example.
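As a point of reference for the time-normalisation problem, here is one conceptual approach: shift each client's timestamps by an offset captured when its stream is first seen. The names below are hypothetical, and Tracy's actual timestamp handling (clock source, resolution, reference point) should be taken from its own source; aligning on the mux's receive time like this also ignores network and buffering latency, so it is only a rough first pass.

```cpp
// Conceptual sketch: put several clients on a common timeline by recording a
// shared wall-clock reference when each client connects and shifting that
// client's timestamps by the difference. Hypothetical names; not Tracy code.
#include <chrono>
#include <cstdint>
#include <unordered_map>

class TimeAligner
{
public:
    // Called once per client, when its stream starts; firstClientTimestamp is the
    // first timestamp seen from that client, in that client's own time base (ns).
    void RegisterClient( uint64_t clientId, int64_t firstClientTimestamp )
    {
        const int64_t now = std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now().time_since_epoch() ).count();
        m_offset[clientId] = now - firstClientTimestamp;
    }

    // Translate a client-local timestamp into the shared timeline.
    int64_t ToSharedTime( uint64_t clientId, int64_t clientTimestamp ) const
    {
        return clientTimestamp + m_offset.at( clientId );
    }

private:
    std::unordered_map<uint64_t, int64_t> m_offset;   // per-client shift, nanoseconds
};
```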
Anyone interested in this feature should have a look at #766.
Hi @cipharius :) As you suggested, I'll start by testing your branch!
Friendly reminder, in case you have not pushed the update yet (I see your last push dates from April 4th; does that look like the right date to you? cf. https://github.com/cipharius/tracy/commits/feature/multiplex/multiplex/src/multiplex.cpp).
I don't have a working Windows setup right now (I'm on Ubuntu Linux 22.04 / Xorg), but if I get stuck at some point I might try this in a VM to check that the build works.
@wolfpld, @cipharius: I will be documenting my work in #822. If you have a few minutes to spare to read/review my messages as they go along, that would be a great help for me. Otherwise, I am aware you probably have your own work to do, so I will carry on on my own :) (I specifically created a new issue to limit noise on this one.)
Heya: I worked on an alternative, simpler approach (but more limited, at least in this first version) in #825.
Totally understand if this is out of scope and a pretty niche use case.
At our institution we have a planetarium that runs the same instance 6 times in a networked environment. In the past I have used Tracy in this environment by starting the GUI 7 times and connecting remotely to all instances manually. It would be really neat to be able to connect to all of the clients from a single GUI, and possibly also to align the timelines from all of the instances and show the places where one of the instances takes longer to execute a function, for example.
Just to be clear, this would be N instances of the same executable, and they should always go through the same function calls; where they disagree is where the interesting stuff happens.