
Support tracing tensors in triton #3598

Merged: 24 commits merged into triton-inference-server:main on Jan 28, 2022

Conversation

@ZhuYuJin (Contributor) commented Nov 22, 2021

A draft implementation of tracing tensors according to the design doc (https://docs.google.com/document/d/1yL40ctSccNMnbhiAkR-Wg6T0zSaXrfoC/edit) and the companion PR in triton core (triton-inference-server/core#36).

The output of trace_summary.py is shown in the screenshots attached to the PR.

Please check whether the implementation is reasonable. @deadeyegoodwin



TRITON_TYPE_TO_NUMPY = {
1: bool,
Reviewer comment:

What are these magic numbers? At the least, put comments indicating the corresponding enum names.

@@ -34,6 +34,7 @@
#include <queue>
#include <set>
#include <thread>

Reviewer comment:

Remove extra lines between includes added here and elsewhere. Are you running the formatter? It should not be adding these lines.

@@ -39,6 +40,10 @@
#include "src/core/status.h"
#include "src/core/tritonserver_apis.h"

#ifdef TRITON_ENABLE_GPU
#include <cuda_runtime_api.h>
Reviewer comment:

Why do you need this include?

@@ -304,7 +309,12 @@ class InferenceRequest {
void SetTrace(std::unique_ptr<InferenceTrace>&& trace)
{
trace_ = std::move(trace);
#ifdef TRITON_ENABLE_TRACING
response_factory_.SetTrace(std::move(trace_->CopyTrace()));
Reviewer comment:

If we need to have the trace object be used by both request and response(s), then we should change the trace to be a shared pointer everywhere. We should not be creating copies of the trace object. The potential complication is that the activity collection and other functions may need additional synchronization as the response(s) may be reporting trace activities/tensors interleaved with each other and with the request.
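
A rough illustration of the shared-pointer approach described here, using simplified stand-in types rather than the actual Triton classes (Trace, Request, Response and their members are hypothetical): the request and every response hold the same shared_ptr, no copies of the trace object are made, and activity reporting is guarded by a mutex because reports may interleave.

#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Stand-in for InferenceTrace: reporting is synchronized because the request
// and its responses may report activities/tensors concurrently.
struct Trace {
  std::mutex mu;
  std::vector<std::string> activities;
  void Report(const std::string& activity)
  {
    std::lock_guard<std::mutex> lk(mu);
    activities.push_back(activity);
  }
};

// Both the request and each response share ownership of the one trace object;
// it is destroyed only when the last holder releases its reference.
struct Response {
  std::shared_ptr<Trace> trace;
};

struct Request {
  std::shared_ptr<Trace> trace;
  Response MakeResponse() { return Response{trace}; }  // no copy of the trace
};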

@@ -52,7 +53,31 @@ class InferenceTrace {
{
}

InferenceTrace(
Reviewer comment:

Make this the only constructor and change callers to explicitly pass tensor_activity_fn = nullptr when necessary

#include "triton/core/tritonserver.h"

#ifdef TRITON_ENABLE_GPU
#include <cuda_runtime_api.h>
Reviewer comment:

Why did you add this? I don't see any cuda api usage in this file.

@ZhuYuJin (author) commented Nov 29, 2021

@deadeyegoodwin Please check the latest commit. I have made the following modifications: 1) removed extra lines between includes; 2) changed the trace to be a shared pointer; 3) saved strings instead of magic numbers in trace.json.

@@ -31,6 +31,9 @@
#include "src/core/logging.h"
#include "src/core/model.h"
#include "src/core/server.h"
#ifdef TRITON_ENABLE_GPU
#include <cuda_runtime_api.h>
Reviewer comment:

Why is this include needed?

@@ -1222,6 +1222,8 @@ EnsembleScheduler::Enqueue(std::unique_ptr<InferenceRequest>& request)
INFER_TRACE_ACTIVITY(
request->Trace(), TRITONSERVER_TRACE_QUEUE_START,
request->QueueStartNs());
request->TraceTensor();
Reviewer comment:

Surround with "#ifdef TRITON_ENABLE_TRACING"
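
In sketch form, the guard the reviewer asks for around the call shown in the hunk above would look like this (exact surrounding code may differ):

#ifdef TRITON_ENABLE_TRACING
  request->TraceTensor();
#endif  // TRITON_ENABLE_TRACING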

@@ -110,6 +113,70 @@ InferenceRequest::SetPriority(uint32_t p)
}
}

void
Reviewer comment:

Put "#ifdef TRITON_ENABLE_TRACING" around entire function

TRITONSERVER_MemoryType memory_type;
int64_t memory_type_id;

TRITONBACKEND_InputProperties(
Reviewer comment:

Don't call external TRITONBACKEND functions from here. All the information is available directly from this InferenceRequest object.

@@ -102,6 +109,11 @@ class InferenceResponseFactory {
// Delegator to be invoked on sending responses.
std::function<void(std::unique_ptr<InferenceResponse>&&, const uint32_t)>
response_delegator_;

#ifdef TRITON_ENABLE_TRACING
// Inference trace associated with this request.
Reviewer comment:

... "associated with this response."

@@ -303,6 +320,11 @@ class InferenceResponse {
response_delegator_;

bool null_response_;

#ifdef TRITON_ENABLE_TRACING
// Inference trace associated with this request.
Reviewer comment:

... "associated with this response."

@@ -46,13 +46,24 @@ class InferenceTrace {
InferenceTrace(
const TRITONSERVER_InferenceTraceLevel level, const uint64_t parent_id,
TRITONSERVER_InferenceTraceActivityFn_t activity_fn,
TRITONSERVER_InferenceTraceReleaseFn_t release_fn, void* userp)
TRITONSERVER_InferenceTraceReleaseFn_t release_fn, void* userp,
TRITONSERVER_InferenceTraceTensorActivityFn_t tensor_activity_fn =
Reviewer comment:

Move the tensor_activity_fn argument to after activity_fn. You will not be able to use a default argument value (nullptr), so you will need to change the callers to explicitly pass nullptr.
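
Sketching the suggested signature (the argument order is what the reviewer asks for; the exact merged declaration may differ): tensor_activity_fn sits directly after activity_fn, has no default, and callers that do not trace tensors pass nullptr explicitly.

InferenceTrace(
    const TRITONSERVER_InferenceTraceLevel level, const uint64_t parent_id,
    TRITONSERVER_InferenceTraceActivityFn_t activity_fn,
    TRITONSERVER_InferenceTraceTensorActivityFn_t tensor_activity_fn,
    TRITONSERVER_InferenceTraceReleaseFn_t release_fn, void* userp);

// Caller without tensor tracing:
// InferenceTrace trace(level, parent_id, activity_fn, nullptr /* tensor_activity_fn */, release_fn, userp);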

@ZhuYuJin (author) commented Dec 1, 2021

@deadeyegoodwin The above comments should be resolved in the latest commits. Please check whether there are any other problems.

@@ -181,9 +182,6 @@ TraceManager::TraceRelease(TRITONSERVER_InferenceTrace* trace, void* userp)
if (parent_id == 0) {
delete ts;
}

LOG_TRITONSERVER_ERROR(
Reviewer comment:

Why are you removing this?

Author reply (ZhuYuJin, Dec 2, 2021):

TraceRelease is now called in the destructor of InferenceTrace, so there is no need to release the InferenceTrace again. Otherwise, it would cause a deadlock.

Reviewer reply:

We need to keep this call, but some other changes are needed. Currently TRITONSERVER_InferenceTrace represents InferenceTrace, so TRITONSERVER_InferenceTrace* is a cast of InferenceTrace*. But now TRITONSERVER_InferenceTrace needs to represent a shared_ptr, so that TRITONSERVER_InferenceTrace* is a cast of shared_ptr*. That will require changing all of the uses of TRITONSERVER_InferenceTrace.
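
A minimal sketch of the pattern the reviewer describes, with simplified stand-in names (InferenceTraceImpl, TraceHandle, TraceNew, TraceDelete are illustrative, not the Triton API): the opaque C handle points at a heap-allocated shared_ptr, so deleting the handle drops one reference while other holders keep the trace alive.

#include <memory>

struct InferenceTraceImpl {};            // stand-in for InferenceTrace
typedef struct TraceHandle TraceHandle;  // stand-in for TRITONSERVER_InferenceTrace

TraceHandle* TraceNew()
{
  auto* sp = new std::shared_ptr<InferenceTraceImpl>(
      std::make_shared<InferenceTraceImpl>());
  return reinterpret_cast<TraceHandle*>(sp);  // the handle is a cast of shared_ptr*
}

void TraceDelete(TraceHandle* handle)
{
  // Deleting the shared_ptr drops only this reference; the trace object itself
  // is destroyed when the last reference (request, responses, client) is gone.
  delete reinterpret_cast<std::shared_ptr<InferenceTraceImpl>*>(handle);
}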

Author reply:

I split the original TraceRelease into two functions: TraceRelease and TraceStreamRelease.
TraceRelease deletes the InferenceTrace object.
TraceStreamRelease is called in the InferenceTrace destructor.
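
Roughly what this split looks like, with hypothetical simplified types (the real functions live in TraceManager in tracer.cc): the destructor releases the stream and TraceRelease only deletes the trace, so the release path never re-enters itself.

struct TraceStream {};  // stand-in for the per-trace output stream

void TraceStreamRelease(TraceStream* stream) { delete stream; }  // flush/close the stream

struct InferenceTrace {
  TraceStream* stream = nullptr;
  ~InferenceTrace() { TraceStreamRelease(stream); }
};

void TraceRelease(InferenceTrace* trace) { delete trace; }  // destructor handles the stream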

std::stringstream ss_tmp;

// collect and serialize trace details.
ss_tmp << ",{\"id\":" << id << ",\"activity\":\""
Reviewer comment:

Why not write directly to 'ss'... doing it this way introduces unnecessary copies.
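
The reviewer's suggestion, in sketch form: append to the caller's stream rather than building a temporary stringstream and copying it afterwards (AppendActivity is an illustrative name).

#include <cstdint>
#include <sstream>

void AppendActivity(std::stringstream& ss, uint64_t id, const char* activity)
{
  // Serialize the trace detail directly into the caller's stream; no temporary.
  ss << ",{\"id\":" << id << ",\"activity\":\"" << activity << "\"}";
}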

@ZhuYuJin (author) commented Dec 2, 2021

@deadeyegoodwin The above comments should now be resolved. Please check this PR again. Thanks for your patience.

@ZhuYuJin (author) commented Dec 3, 2021

@deadeyegoodwin I have replied to both comments. Do you agree with my opinion?

@deadeyegoodwin (reviewer) replied:

> @deadeyegoodwin I have replied to both comments. Do you agree with my opinion?

I only saw one response from you, and I agree. See my comment above.

@ZhuYuJin (author) commented Dec 5, 2021

(screenshot of a draft review reply marked "Pending")
@deadeyegoodwin Please check this response.

@deadeyegoodwin (reviewer) replied:

"the release timing of InferenceTrace is determined, which should be the release timing of InferenceRequest". That is not true.
Previous to this change, we used a unique_ptr for InferenceTrace because its lifetime was closely tied to InferenceRequest.
With this change the lifetime of InferenceTrace is now complicated. InferrenceTrace is required by both InferenceRequest and (one or more) InferenceResponse. The lifetimes of InferenceRequest and InferenceResponse are completely decoupled... an InferenceResponse can live much longer than the corresponding InferenceRequest, and vice-versa. So we now use a shared_ptr for InferenceTrace and we can no longer assume that its lifetime is tied to InferenceRequest. An InferenceTrace reference can be held by an InferenceRequest object, and one or more InferenceResponse objects, and by the client using the C API. Once once all these references are released should the InferenceTrace object be released and finalized. Careful reading of the entire InferenceTrace lifetime/flow is required to make sure that this new lifetime is correctly handled.... and whatever we do must remain backwards compatible at the API level.

@deadeyegoodwin (reviewer) added:

BTW, the reason I couldn't see your response is that I don't think you actually posted it... note the "Pending" next to it.

@ZhuYuJin ZhuYuJin closed this Dec 8, 2021
@ZhuYuJin ZhuYuJin reopened this Dec 8, 2021
@ZhuYuJin (author) commented Dec 8, 2021

@deadeyegoodwin I have made two modifications:

1. I split the original TraceRelease into two functions: TraceRelease and TraceStreamRelease. TraceRelease deletes the InferenceTrace object; TraceStreamRelease is called in the InferenceTrace destructor.
2. If there is less than one chunk, TraceTensor in InferenceRequest will not copy the buffer.

@deadeyegoodwin (reviewer) replied:

(quoting the two modifications listed above)

The #2 change looks good. I'm not sure about the #1 change. The first thing you need to do is to address the comment I made above:
"But now TRITONSERVER_InferenceTrace needs to represent a shared_ptr, so that TRITONSERVER_InferenceTrace* is a cast of shared_ptr*"
I don't see that you have made that change. Have you carefully audited the full life-cycle of trace objects along with making that change?

@ZhuYuJin (author) commented Dec 10, 2021

(quoting the exchange above)

@deadeyegoodwin Hi. I don't think I get your point that we should represent TRITONSERVER_InferenceTrace* as a shared_ptr*. What's the difference between shared_ptr* and InferenceTrace*?

In the current change, I simply split the original TraceRelease into two functions: TraceRelease and TraceStreamRelease. TraceRelease deletes the InferenceTrace object. The InferenceTrace destructor calls TraceStreamRelease, and TraceStreamRelease releases the TraceStream.

Since the InferenceTrace destructor calls TraceStreamRelease to release the TraceStream, we only need to release the InferenceTrace.

The grpc/http server calls trace_manager->SampleTrace or TRITONSERVER_InferenceTraceNew to get an InferenceTrace*, and calls TraceRelease or TRITONSERVER_InferenceTraceDelete to delete it.

Inside Triton, we call trace->SpawnChildTrace to get a child shared_ptr, which is held by InferenceRequest and InferenceResponse. After the InferenceRequest and InferenceResponse are released, the InferenceTrace is automatically released.

@ZhuYuJin (author) commented Dec 14, 2021

(quoting the Dec 10 comment above)

@deadeyegoodwin Hi, please check the comment above when you are free. I'm confused about where we last diverged.

@ZhuYuJin (author) commented Jan 7, 2022

@deadeyegoodwin Hi, I have updated the cmdline handling and InferenceTrace. Please check the latest commit again.

BTW, I have run the existing trace tests qa/L0_cmdline_trace to ensure there are no regressions.
(screenshot of the L0_cmdline_trace test output)

@deadeyegoodwin left a review:

In TraceManager as well you need to fix the initialization and handling of TraceLevel to account for deprecated MIN and MAX and the new TIMESTAMPS and TENSORS values.

@@ -69,6 +69,10 @@ class InferenceTrace {
void Report(
const TRITONSERVER_InferenceTraceActivity activity, uint64_t timestamp_ns)
{
if (level_ < TRITONSERVER_TRACE_LEVEL_TIMESTAMPS) {
Reviewer comment:

Do not use '<'. level_ is a bitmask of enabled trace levels. So you need to mask TRACE_LEVEL_TIMESTAMPS and only report if that bit is set.
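
In sketch form, the check becomes a bit test rather than a comparison (the same pattern applies to the TENSORS check below):

if ((level_ & TRITONSERVER_TRACE_LEVEL_TIMESTAMPS) == 0) {
  return;  // timestamp tracing not enabled for this trace
}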

@@ -90,6 +98,10 @@ class InferenceTrace {
const int64_t* shape, uint64_t dim_count,
TRITONSERVER_MemoryType memory_type, int64_t memory_type_id)
{
if (level_ < TRITONSERVER_TRACE_LEVEL_TENSORS) {
Reviewer comment:

Do not use '<'. level_ is a bitmask of enabled trace levels. So you need to mask TRACE_LEVEL_TENSORS and only report if that bit is set.

@@ -462,8 +462,9 @@ std::vector<Option> options_
{OPTION_TRACE_FILEPATH, "trace-file", Option::ArgStr,
"Set the file where trace output will be saved."},
{OPTION_TRACE_LEVEL, "trace-level", Option::ArgStr,
"Set the trace level. OFF to disable tracing, MIN for minimal tracing, "
"MAX for maximal tracing. Default is OFF."},
"Set the trace level. OFF to disable tracing, TIMESTAMPES to trace "
Reviewer comment:

We need to allow --trace-level to be specified more than once because there are multiple trace levels that can be enabled. For example, see how --backend-config is handled. TIMESTAMPS enables only timestamps and TENSORS enables only tensors.
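
With a repeatable flag, enabling both levels would then look something like this (illustrative invocation, not a line from the PR):

tritonserver --model-repository=/models --trace-file=trace.json --trace-level=TIMESTAMPS --trace-level=TENSORS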

@@ -69,6 +69,10 @@ class InferenceTrace {
void Report(
Reviewer comment:

In the constructor you need to set level_ in a way that handles the deprecated MIN and MAX settings.

@ZhuYuJin (author) commented Jan 11, 2022

@deadeyegoodwin
For now, I simply convert both the deprecated MIN and MAX to the new TIMESTAMPS, so I do not need to handle them separately.

Besides, if we treat trace_level_ as a bitmask of enabled trace levels, we would have to change the definition of the old interface TRITONSERVER_InferenceTraceNew in tritonserver.h, because level is taken as a single enum value in the current interface.
(screenshot of the current TRITONSERVER_InferenceTraceNew signature in tritonserver.h)

Maybe it should look like:
TRITONSERVER_DECLSPEC TRITONSERVER_Error* TRITONSERVER_InferenceTraceNew(
    TRITONSERVER_InferenceTrace** trace,
    TRITONSERVER_InferenceTraceLevel* level_settings, uint64_t parent_id,
    TRITONSERVER_InferenceTraceActivityFn_t activity_fn,
    TRITONSERVER_InferenceTraceReleaseFn_t release_fn, void* trace_userp);

But we can keep the old interface TRITONSERVER_InferenceTraceNew as before, and only fit the new level bitmap into the new interface TRITONSERVER_InferenceTraceTensorNew. Then TRITONSERVER_InferenceTraceNew will support only one trace level, and TRITONSERVER_InferenceTraceTensorNew will support multiple trace levels.

However, if we take trace_level_ as an enum and use comparison operators to check the trace level, we should not have the above problem.

What's your suggestion?

@deadeyegoodwin (reviewer) replied:

The trace level remains an enum. It is just that we only allow power-of-2 enum values (except for the old deprecated values). A "trace level" is still a "trace level" and it is still an enum... the documentation is updated already to indicate what the trace level enum means. On the cmdline and C API we must still accept min and max for backwards compatibility, but internally we should convert those to TIMESTAMPS, so that internally in the code we can assume that the trace level will never be min or max and so we can always treat it as a bitmask. There is no need to change the InferenceTraceNew APIs.

@ZhuYuJin (author) replied:

@deadeyegoodwin Please check the latest commit.

I convert MIN and MAX to TIMESTAMPS in the cmdline, the C API and the grpc/http server.
Besides, I convert the trace level bitmap to the enum type and continue with the rest of the logic.

@@ -807,6 +807,10 @@ TRITONSERVER_InferenceTraceNew(
TRITONSERVER_InferenceTraceReleaseFn_t release_fn, void* trace_userp)
{
#ifdef TRITON_ENABLE_TRACING
if ((level == TRITONSERVER_TRACE_LEVEL_MIN) ||
Reviewer comment:

This is the right idea, but level could be (TRITONSERVER_TRACE_LEVEL_TIMESTAMPS | TRITONSERVER_TRACE_LEVEL_TENSORS). So if either min or max is set you need to turn on TIMESTAMPS (level |= TRITONSERVER_TRACE_LEVEL_TIMESTAMPS), and unconditionally you need to mask out MIN and MAX.
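
A sketch of that handling (not necessarily the exact merged code): if either deprecated value is present, turn on TIMESTAMPS, then unconditionally clear the MIN/MAX bits.

if ((level & TRITONSERVER_TRACE_LEVEL_MIN) || (level & TRITONSERVER_TRACE_LEVEL_MAX)) {
  level = static_cast<TRITONSERVER_InferenceTraceLevel>(
      level | TRITONSERVER_TRACE_LEVEL_TIMESTAMPS);
}
// Mask out the deprecated bits so the rest of the code only sees TIMESTAMPS/TENSORS.
level = static_cast<TRITONSERVER_InferenceTraceLevel>(
    level & ~(TRITONSERVER_TRACE_LEVEL_MIN | TRITONSERVER_TRACE_LEVEL_MAX));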

"Set the trace level. OFF to disable tracing, TIMESTAMPES to trace "
"timestamps, TENSORS to trace both timestamps and tensors. Default is "
"OFF."},
"Specify a trace level. OFF to disable tracing, TIMESTAMPES to "
Reviewer comment:

Fix spelling of TIMESTAMPS. Also you need to say something about how the flag can be specified multiple times. That is needed so that you can turn on both TIMESTAMPS and TENSORS.

Reviewer comment:

Still need to fix TIMESTAMPES -> TIMESTAMPS

@ZhuYuJin (author) replied:

@deadeyegoodwin Please check the latest commit. I have masked out MIN and MAX, and added a description of the --trace-level option.

@deadeyegoodwin left a review:

Just need to fix the one spelling issue, rebuild and make sure existing tests still work and then I will try your changes in a full CI run. The last step will be to add new testing for the TENSORS trace.

@ZhuYuJin (author) replied:

@deadeyegoodwin Please check the latest commit. To make sure existing tests still work, I removed the trace from the InferenceResponse constructor and added a SetTrace interface to InferenceResponse. BTW, I have added a test for the TENSORS level in L0_cmdline_trace.

@@ -104,8 +104,8 @@ fi

set -e

# trace-rate == 1, trace-level=MIN make sure every request is traced
Reviewer comment:

Keep one of the MIN and one of the MAX test cases to cover backwards compatibility

@deadeyegoodwin commented Jan 19, 2022

Just need a couple of tests updated. Then you need to rebase everything onto the main branch and resolve any conflicts, and then I will run CI. Be sure to rebase your change in the core repo as well.

@ZhuYuJin (author) replied:

@deadeyegoodwin I have merged the main branch in both the server and core repos. Please check the latest commit.

@deadeyegoodwin (reviewer) replied:

I ran the CI and Linux looks good. There is a strange Windows build failure that I will need to triage before we can merge.

@deadeyegoodwin (reviewer) added:

I fixed the Windows build issue. It wasn't directly related to your change but was somehow triggered by it.
You need to rebase again and then I can run another CI.

@deadeyegoodwin (reviewer) added:

The CI passed so this is ready to merge. I can merge my local rebased copy of these changes but I would prefer to just merge your PR. We have other, conflicting changes pending so we need to get this merged ASAP. Are you able to rebase it soon?

@ZhuYuJin (author) commented Jan 28, 2022

@deadeyegoodwin I have merged the latest main branch of server/core and also fixed a print problem in trace_summary.py.

(payload_state == Payload::State::EXECUTING) ||
(payload_state == Payload::State::RELEASED));
namespace nvidia {
namespace inferenceserver {
Reviewer comment:

Why did you change this formatting and other formatting within the file? This needs to be reverted.

#ifdef TRITON_ENABLE_TRACING
request->TraceInputTensors(
TRITONSERVER_TRACE_TENSOR_QUEUE_INPUT, "DynamicBatchScheduler Enqueue");
request->TraceInputTensors(TRITONSERVER_TRACE_TENSOR_QUEUE_INPUT,
Reviewer comment:

Move this to just after line 170, it needs to be inside of the conditional above.

@deadeyegoodwin deadeyegoodwin merged commit 2beb3ae into triton-inference-server:main Jan 28, 2022
@deadeyegoodwin (reviewer) added:

I went ahead and merged. I will fix the formatting and other issues in a follow-up PR: #3867
The last task is to update the documentation. You should add appropriate content to https://github.com/triton-inference-server/server/blob/main/docs/trace.md
