Skip to content

Upgrade Kineto to C++20#1397

Closed
scotts wants to merge 8 commits into
pytorch:mainfrom
scotts:cpp_20
Closed

Upgrade Kineto to C++20#1397
scotts wants to merge 8 commits into
pytorch:mainfrom
scotts:cpp_20

Conversation

@scotts
Copy link
Copy Markdown
Contributor

@scotts scotts commented May 11, 2026

The rest of the PyTorch ecosystem has upgraded, so now it's our turn. At the same time, we're adopting better C++20 APIs. Changes:

  1. Bump C++ standard from 17 to 20 — CXX_STANDARD 17→20 in all five CMakeLists.txt
  2. Use map/set::contains() in place of find/end and count
  3. Remove AbstractConfig::endsWith helper now just callsstd::string::ends_with; I'll try to remove all callers in a follow-up.
  4. using enum ActivityType in CUPTI enable/disable — drops the ActivityType:: prefix from ~16 comparisons.
  5. Use designated initializers for DeviceInfo / ResourceInfo — convert the two POD sinks in IActivityProfiler.h to aggregates and update call sites. Incidentally fixes a footgun: ResourceInfo's constructor took (deviceId, id, sortIndex, name) but stored fields in (id, sortIndex, deviceId, name) order, so positional brace-init at call sites was effectively swapping the first two arguments.
  6. Use std::ranges algorithms and std::erasefind / sort / find_if / all_of / replace and the erase-remove idiom.

scotts added 6 commits May 11, 2026 07:35
Updates CXX_STANDARD in all CMakeLists.txt files (libkineto, benchmarks,
test, and the xpupti test subdirectories) to 20. No source code changes;
subsequent commits will pick up C++20 constructs.

Verified: CPU-only build configures, compiles, and all 28 tests pass
with GCC 11.5.
C++20 added an explicit contains() member to associative containers.
Replace the equivalent find() != end() and count() > 0 idioms wherever
the iterator is not used afterward.

Sites:
  CuptiActivityProfiler.cpp  count(...) == 0  ->  !contains(...)  (x3)
                             find(...) == end()  ->  !contains(...)  (x1)
  CuptiCbidRegistry.cpp      find(...) != end()  ->  contains(...)
  CuptiRangeProfiler.cpp     count(...) > 0      ->  contains(...)
                             find(...) == end()  ->  !contains(...)
  GenericActivityProfiler.{cpp,h}  count(...) >0/find!=end  -> contains (x5)
  output_csv.cpp             find(...) == end()  ->  !contains(...)
  output_json.cpp            find(...) == end()  ->  !contains(...)
C++20 added std::string::ends_with, making the hand-rolled helper
redundant. Drop the declaration and definition; no callers in tree.
C++20's using-enum-declaration lets the long if-chains in
enableCuptiActivities and disableCuptiActivities drop the
ActivityType:: prefix from every comparison.

Note: CuptiActivityApi.cpp is only built with KINETO_BACKEND=cuda,
so the change isn't exercised by the CPU-only CI build.
These two structs in IActivityProfiler.h are small POD-like sinks used
by output backends. Converting them to aggregates (drop the constructors)
lets every call site spell out which field is which.

The ResourceInfo constructor took (deviceId, id, sortIndex, name) but the
struct stored (id, sortIndex, deviceId, name) — every positional brace
init had to swap the first two arguments mentally. Designated initializers
remove that footgun.

Sites updated:
  GenericActivityProfiler.h    recordThreadInfo, recordStream, recordDevice
  GenericActivityProfiler.cpp  CPU + GPU handleDeviceInfo calls
  plugin/xpupti/XpuptiActivityProfilerSession.cpp  getResourceInfos
Replace begin/end-pair std algorithms with their C++20 range counterparts:
  std::find        -> std::ranges::find
  std::find_if     -> std::ranges::find_if
  std::sort        -> std::ranges::sort
  std::all_of      -> std::ranges::all_of
  std::replace     -> std::ranges::replace
  erase-remove     -> std::erase

Sites: ApproximateClock.cpp, CuptiCallbackApi.cpp, GenericActivityProfiler.cpp,
output_json.cpp, plugin/xpupti/XpuptiActivityProfilerSession.cpp.
@meta-codesync
Copy link
Copy Markdown

meta-codesync Bot commented May 11, 2026

@scotts has imported this pull request. If you are a Meta employee, you can view this in D104671263.

scotts added 2 commits May 11, 2026 08:30
fbsource clang-format flagged the multi-line designated init blocks in
the previous commit. Reformat the four call sites so the field
initializers go on one line where they fit. No behavior change.
The previous removal broke fb/FBConfig.cpp (a Meta-internal file not
visible in OSS), which still calls endsWith() in a few places. Restore
the method, but reimplement it as a thin wrapper over the new
std::string::ends_with so the duplicated hand-rolled logic stays gone.

Header comment marks it as not-for-new-callers; existing callers should
migrate to s.ends_with(suffix) directly and the wrapper can be removed
once they have.
@meta-codesync
Copy link
Copy Markdown

meta-codesync Bot commented May 12, 2026

@scotts merged this pull request in 77e2b46.

pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request May 22, 2026
Includes the following commits:

- ci: declare workflow-level `contents: read` on 5 workflows (pytorch/kineto#1404) 5902263
- Remove deprecated `REQUEST_TIMESTAMP` config key (pytorch/kineto#1409) 55883de
- Fix intermittent Mac CI failure from conda channel reset (pytorch/kineto#1407) ee27b5c
- Add nlohmann/json as a top-level third_party submodule (pytorch/kineto#1406) c044281
- Remove SIGUSR2 on-demand profiling path (pytorch/kineto#1408) 471ed38
- Fix ROCm HtoD memcpy stream attribution (pytorch/kineto#1398) 799b5f4
- Fix UST_LOGGER_MARK_COMPLETED build failure in manifold_trace_logger (pytorch/kineto#1389) 60967ce
- Remove `DefaultTimeConverterIsIdentity` test (pytorch/kineto#1401) 81d31cd
- Re-enable most PyTorch tests (pytorch/kineto#1403) 212f9a5
- Daily `arc lint --take CLANGFORMAT` (pytorch/kineto#1402) 6481fac
- Resolve CUPTI cbid names via cuptiGetCallbackName (pytorch/kineto#1400) e07e121
- XPUPTI: Fix ts=0 trace events on Windows (pytorch/kineto#1381) 4c8d01c
- Remove LIBKINETO_NO* compatibility shim (pytorch/kineto#1399) ea8bc18
- Upgrade Kineto to C++20 (pytorch/kineto#1397) 77e2b46
- Update the rocm api filtering (pytorch/kineto#1392) e0ac578
Pull Request resolved: #184784
Approved by: https://github.com/NicolasHug, https://github.com/malfet
pytorchmergebot pushed a commit to khushi-411/pytorch that referenced this pull request May 24, 2026
Includes the following commits:

- ci: declare workflow-level `contents: read` on 5 workflows (pytorch/kineto#1404) 5902263
- Remove deprecated `REQUEST_TIMESTAMP` config key (pytorch/kineto#1409) 55883de
- Fix intermittent Mac CI failure from conda channel reset (pytorch/kineto#1407) ee27b5c
- Add nlohmann/json as a top-level third_party submodule (pytorch/kineto#1406) c044281
- Remove SIGUSR2 on-demand profiling path (pytorch/kineto#1408) 471ed38
- Fix ROCm HtoD memcpy stream attribution (pytorch/kineto#1398) 799b5f4
- Fix UST_LOGGER_MARK_COMPLETED build failure in manifold_trace_logger (pytorch/kineto#1389) 60967ce
- Remove `DefaultTimeConverterIsIdentity` test (pytorch/kineto#1401) 81d31cd
- Re-enable most PyTorch tests (pytorch/kineto#1403) 212f9a5
- Daily `arc lint --take CLANGFORMAT` (pytorch/kineto#1402) 6481fac
- Resolve CUPTI cbid names via cuptiGetCallbackName (pytorch/kineto#1400) e07e121
- XPUPTI: Fix ts=0 trace events on Windows (pytorch/kineto#1381) 4c8d01c
- Remove LIBKINETO_NO* compatibility shim (pytorch/kineto#1399) ea8bc18
- Upgrade Kineto to C++20 (pytorch/kineto#1397) 77e2b46
- Update the rocm api filtering (pytorch/kineto#1392) e0ac578
Pull Request resolved: pytorch#184784
Approved by: https://github.com/NicolasHug, https://github.com/malfet
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant