Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changing TimeProfiler.cpp clock from microseconds to nanoseconds #68096

Closed
wants to merge 2 commits into from

Conversation

fel-cab
Copy link
Contributor

@fel-cab fel-cab commented Oct 3, 2023

This PR was part of PR #68016. This change allow the profiling to trace better some short events, which can help application developers.

…mprove LIBOMPTARGET profiling, as it allows to measure more events. It was part of PR llvm#68016
@fel-cab fel-cab self-assigned this Oct 3, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 3, 2023

@llvm/pr-subscribers-lld-macho
@llvm/pr-subscribers-lld

@llvm/pr-subscribers-llvm-support

Changes

This PR was part of PR #68016. This change allow the profiling to trace better some short events, which can help application developers.


Full diff: https://github.com/llvm/llvm-project/pull/68096.diff

1 Files Affected:

  • (modified) llvm/lib/Support/TimeProfiler.cpp (+25-25)
diff --git a/llvm/lib/Support/TimeProfiler.cpp b/llvm/lib/Support/TimeProfiler.cpp
index 4d625b3eb5b1709..df412b6225a8490 100644
--- a/llvm/lib/Support/TimeProfiler.cpp
+++ b/llvm/lib/Support/TimeProfiler.cpp
@@ -30,7 +30,7 @@ namespace {
 
 using std::chrono::duration;
 using std::chrono::duration_cast;
-using std::chrono::microseconds;
+using std::chrono::nanoseconds;
 using std::chrono::steady_clock;
 using std::chrono::system_clock;
 using std::chrono::time_point;
@@ -76,18 +76,18 @@ struct TimeTraceProfilerEntry {
       : Start(std::move(S)), End(std::move(E)), Name(std::move(N)),
         Detail(std::move(Dt)) {}
 
-  // Calculate timings for FlameGraph. Cast time points to microsecond precision
+  // Calculate timings for FlameGraph. Cast time points to nanosecond precision
   // rather than casting duration. This avoids truncation issues causing inner
   // scopes overruning outer scopes.
-  ClockType::rep getFlameGraphStartUs(TimePointType StartTime) const {
-    return (time_point_cast<microseconds>(Start) -
-            time_point_cast<microseconds>(StartTime))
+  ClockType::rep getFlameGraphStartNs(TimePointType StartTime) const {
+    return (time_point_cast<nanoseconds>(Start) -
+            time_point_cast<nanoseconds>(StartTime))
         .count();
   }
 
-  ClockType::rep getFlameGraphDurUs() const {
-    return (time_point_cast<microseconds>(End) -
-            time_point_cast<microseconds>(Start))
+  ClockType::rep getFlameGraphDurNs() const {
+    return (time_point_cast<nanoseconds>(End) -
+            time_point_cast<nanoseconds>(Start))
         .count();
   }
 };
@@ -114,16 +114,16 @@ struct llvm::TimeTraceProfiler {
 
     // Check that end times monotonically increase.
     assert((Entries.empty() ||
-            (E.getFlameGraphStartUs(StartTime) + E.getFlameGraphDurUs() >=
-             Entries.back().getFlameGraphStartUs(StartTime) +
-                 Entries.back().getFlameGraphDurUs())) &&
+            (E.getFlameGraphStartNs(StartTime) + E.getFlameGraphDurNs() >=
+             Entries.back().getFlameGraphStartNs(StartTime) +
+                 Entries.back().getFlameGraphDurNs())) &&
            "TimeProfiler scope ended earlier than previous scope");
 
     // Calculate duration at full precision for overall counts.
     DurationType Duration = E.End - E.Start;
 
     // Only include sections longer or equal to TimeTraceGranularity msec.
-    if (duration_cast<microseconds>(Duration).count() >= TimeTraceGranularity)
+    if (duration_cast<nanoseconds>(Duration).count() >= TimeTraceGranularity)
       Entries.emplace_back(E);
 
     // Track total time taken by each "name", but only the topmost levels of
@@ -162,15 +162,15 @@ struct llvm::TimeTraceProfiler {
 
     // Emit all events for the main flame graph.
     auto writeEvent = [&](const auto &E, uint64_t Tid) {
-      auto StartUs = E.getFlameGraphStartUs(StartTime);
-      auto DurUs = E.getFlameGraphDurUs();
+      auto StartNs = E.getFlameGraphStartNs(StartTime);
+      auto DurNs = E.getFlameGraphDurNs();
 
       J.object([&] {
         J.attribute("pid", Pid);
         J.attribute("tid", int64_t(Tid));
         J.attribute("ph", "X");
-        J.attribute("ts", StartUs);
-        J.attribute("dur", DurUs);
+        J.attribute("ts", StartNs / 1000);
+        J.attribute("dur", DurNs / 1000);
         J.attribute("name", E.Name);
         if (!E.Detail.empty()) {
           J.attributeObject("args", [&] { J.attribute("detail", E.Detail); });
@@ -218,7 +218,7 @@ struct llvm::TimeTraceProfiler {
     // Report totals on separate threads of tracing file.
     uint64_t TotalTid = MaxTid + 1;
     for (const NameAndCountAndDurationType &Total : SortedTotals) {
-      auto DurUs = duration_cast<microseconds>(Total.second.second).count();
+      auto DurNs = duration_cast<nanoseconds>(Total.second.second).count();
       auto Count = AllCountAndTotalPerName[Total.first].first;
 
       J.object([&] {
@@ -226,11 +226,11 @@ struct llvm::TimeTraceProfiler {
         J.attribute("tid", int64_t(TotalTid));
         J.attribute("ph", "X");
         J.attribute("ts", 0);
-        J.attribute("dur", DurUs);
-        J.attribute("name", "Total " + Total.first);
+        J.attribute("dur", DurNs / 1000);
+        J.attribute("name", "Total: " + Total.first);
         J.attributeObject("args", [&] {
           J.attribute("count", int64_t(Count));
-          J.attribute("avg ms", int64_t(DurUs / Count / 1000));
+          J.attribute("avg us", int64_t(DurNs / Count / 1000));
         });
       });
 
@@ -261,10 +261,10 @@ struct llvm::TimeTraceProfiler {
     // Emit the absolute time when this TimeProfiler started.
     // This can be used to combine the profiling data from
     // multiple processes and preserve actual time intervals.
-    J.attribute("beginningOfTime",
-                time_point_cast<microseconds>(BeginningOfTime)
-                    .time_since_epoch()
-                    .count());
+    J.attribute("beginningOfTime", time_point_cast<nanoseconds>(BeginningOfTime)
+                                           .time_since_epoch()
+                                           .count() /
+                                       1000);
 
     J.objectEnd();
   }
@@ -281,7 +281,7 @@ struct llvm::TimeTraceProfiler {
   SmallString<0> ThreadName;
   const uint64_t Tid;
 
-  // Minimum time granularity (in microseconds)
+  // Minimum time granularity (in nanoseconds)
   const unsigned TimeTraceGranularity;
 };
 

@jdoerfert
Copy link
Member

So, this allows to include nanoseconds events but it still will list their time in microseconds, correct? Should there be a switch?

"TimeProfiler scope ended earlier than previous scope");

// Calculate duration at full precision for overall counts.
DurationType Duration = E.End - E.Start;

// Only include sections longer or equal to TimeTraceGranularity msec.
if (duration_cast<microseconds>(Duration).count() >= TimeTraceGranularity)
if (duration_cast<nanoseconds>(Duration).count() >= TimeTraceGranularity)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TimeTraceGranularity is in microseconds wherever timeTraceProfilerInitialize gets called. Is there a plan to update the interface for compatibility (i.e. adjust microseconds to nanoseconds internally)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think about it, but I can do it. We need a easy way for users, to change it. Do you have an idea of how to do it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Mark, I've been looking at this, but now I'm not sure what you mean. After looking at the tracing format ( Perfetto/chronium ) I see that it requires that the events duration have to be passed in microseconds. I made this change because I thought it was possible to provide trace duration in nanoseconds. I can obtain basically the same behavior on on the libomptarget trace by changing the TimeTraceGranularity at the construction of the trace, to 0. Given this, at this point I think that this patch is not necessary. But maybe I'm missing something, what do you mean with "update the interface for compatibility".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This silently changes the meaning of TimeTraceGranularity from microseconds to nanoseconds, and there are users of the code besides OpenMP, and it changes the meaning for them, too.
I usually add an environment variable (LIBOMPTARGET_PROFILE_GRANULARITY) to change the granularity, and set it to zero for profiling. I thought there was a review adding such a variable, but I can't find it now.

@fel-cab
Copy link
Contributor Author

fel-cab commented Oct 4, 2023

So, this allows to include nanoseconds events but it still will list their time in microseconds, correct? Should there be a switch?

Yes, it list the time still in microseconds, to use the Perfetto/chronium defaults. The switch would be better than leave it fixed. But we need an easy way to set it for the user.

@fel-cab
Copy link
Contributor Author

fel-cab commented Oct 11, 2023

At this point, I think this patch is not necessary, given that the events duration in the profiles have to be generated in microseconds, therefore having the times in nanoseconds do not have any benefit. For this reason, I'm going to close it.

@fel-cab fel-cab closed this Oct 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants