Releases: NERSC/timemory

macOS Python Fixes

18 Jul 11:20
  • Fixes: Fatal Python error: PyMUTEX_LOCK(gil->mutex) failed on macOS

Python tools submodule + various stability fixes

15 Jul 02:57
  • Python gotcha fixes
    • Fixed issues with mallocp segfaulting from Python
    • Fixed storage merge() segfaulting
  • New Python tools submodule (timemory.tools)
    • tools.function_wrappers combines {start,stop}_{mpip,ompt,ncclp,mallocp}
      into one configurable handle and provides decorator + context-manager features
  • New Python functions which are used within tools.function_wrappers
    • timemory.start_function_wrappers
    • timemory.stop_function_wrappers
  • Fixed the timemory-python-line-profiler script incorrectly calling timemory.profiler
  • API change in ring_buffer template
    • read/write member functions now return a pointer to the object read or
      written instead of the number of bytes
  • API change in storage and tsettings
    • Classes are declared as final to optimize any vtable calls
  • Removed runtime_configurable restriction for do_enumerator_generate
    • This enables user_bundles to be used again in Python
  • Added operation::python_class_name
  • Updated examples:
    • ex_python_bindings (and libex_python_bindings)
  • Fix to get_hash_identifier
  • Removed concurrency comparison when generating a diff between two runs
  • Fixed popen.cpp being guarded by TIMEMORY_WINDOWS, a macro that was never defined

pytimem fix + various build system improvements

11 Jul 15:55
76ff978
  • pytimem fix
    • fix missing import of component_bundle and component_tuple
  • added additional python tests
  • Ability to build with static libraries: python bindings, mpip library, mallocp library, ompt library, ncclp library, KokkosP libraries
  • Setting TIMEMORY_BUILD_PYTHON to OFF now results in searching for external pybind11 install
  • Renamed some CMake files in cmake/Modules
  • Updated caliper and gotcha submodules to support {CALIPER,GOTCHA}_INSTALL_{CONFIG,HEADER} options
  • Added TIMEMORY_INSTALL_PYTHON option
  • Fixed BUILD_STATIC_LIBS=ON + CMAKE_POSITION_INDEPENDENT_CODE=ON
  • Fixed TIMEMORY_USE_CUDA=ON + TIMEMORY_REQUIRE_PACKAGES=ON so that configuration fails when CUDA is unavailable
  • If TIMEMORY_REQUIRE_PACKAGES=OFF, search for packages first before adding submodule
  • Extended setup.py to support more options and support non-development install (no headers or cmake config)
  • Removed TIMEMORY_EMBED_PYTHON option
  • Disable timemory-jump when no shared libraries are built since dlopen isn't possible
  • Replaced allocator member functions construct, destroy, allocate, deallocate with calls to static functions of allocator traits
  • added support for CMAKE_ARGS env variable in setup.py
  • remove absolute rpath when SKBUILD/SPACK_BUILD (since these have staging directories)
  • timemory-{c,cxx,fortran} alias libraries in build tree
  • toggled python function profiler to not include line number by default
    • Including line numbers can cause strange results when generators are used
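
As a rough illustration of the CMAKE_ARGS support added to setup.py, the sketch below shows one way a build script could fold that environment variable into its CMake argument list. The function name cmake_args_from_env and the default argument are hypothetical; the notes do not describe how timemory's setup.py actually parses the variable.

```python
import os
import shlex


def cmake_args_from_env(environ=None, defaults=()):
    """Return the default CMake arguments followed by those in CMAKE_ARGS."""
    environ = os.environ if environ is None else environ
    # shlex.split honours shell-style quoting, so a quoted value that
    # contains spaces stays a single argument
    extra = shlex.split(environ.get("CMAKE_ARGS", ""))
    return list(defaults) + extra


# an explicit mapping is passed here so the example does not depend on
# the caller's real environment
example_env = {"CMAKE_ARGS": "-DTIMEMORY_BUILD_PYTHON=OFF -DBUILD_STATIC_LIBS=ON"}
args = cmake_args_from_env(example_env, defaults=["-DCMAKE_BUILD_TYPE=Release"])
```

Appending the user-supplied arguments after the defaults lets a later -D definition override an earlier one, which is how CMake treats repeated cache definitions on the command line.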

Compiler instrumentation + Fortran module + New tool libraries + NCCL support + NVML support + Python tracing + Hatchet + User Metadata + CUPTI PCSampling

29 Jun 08:42
2bdd28e
  • Numerous stability fixes
  • Fortran module
  • Compiler instrumentation
  • NCCL support
  • timemory-mallocp
  • timemory-ncclp
  • timemory-nvml
  • Python line-by-line tracing
  • I/O {read,write}_{char,bytes}
  • Network stats components
  • libunwind support
  • CMake minimum upgraded to 3.15
  • Type-traits for tree/flat/timeline
  • Hierarchical serialization (hatchet support)
  • Concepts
  • Improved settings
  • Python tracer (line-by-line)
  • CTestNotes support
  • Command-line options for settings
  • Migrated cereal to internal (i.e. cereal:: -> tim::cereal::)
  • Dramatically improved Windows support
  • Improved kokkos support
    • Command-line options
    • Print help
  • XML serialization support
  • Shared caches for components
  • Support for C++17 string_view
  • Python bindings to storage classes
  • Windows support for different CPU timers
  • CUDA Cupti PCSampling support (CUDA v11+)
  • User metadata
  • Sampling support in opaque (i.e. within user-bundles)
  • Static polymorphic base for bundlers
  • Namespace re-organization
  • CUDA compilation with Clang compiler
  • Piecewise installation
  • timem support for md5sum hashing of the command line
  • papi_threading setting
  • is_invalid in base_state
  • New operations
    • stack_push
    • stack_pop
    • insert
    • set_depth_change
    • set_is_flat
    • set_is_on_stack
    • set_is_invalid
    • set_iterator
    • get_is_flat
    • get_is_invalid
    • get_is_on_stack
    • get_depth
    • get_storage
    • get_iterator
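
The md5sum hashing of the command line mentioned for timem can be sketched as follows. This is an assumption-laden illustration only: the function name command_line_hash is hypothetical, joining the arguments with single spaces is a guess, and the notes do not say what timem does with the resulting hash.

```python
import hashlib


def command_line_hash(argv):
    """Return the MD5 hex digest of the space-joined command line."""
    joined = " ".join(argv)  # assumption: arguments are joined by one space
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# identical command lines map to the same digest, different ones do not
first = command_line_hash(["sleep", "2"])
second = command_line_hash(["sleep", "2"])
other = command_line_hash(["sleep", "3"])
```

A fixed-length digest of this kind gives a stable identifier for a command regardless of its length or the characters it contains.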

New command-line tools, dynamic instrumentation, profiling libraries, python profiling, C++14 migration

10 Jul 21:24
  • New command line tools
    • timemory-run for (Linux) dynamic instrumentation support
    • timemory-avail for component/settings/hw-counter availability
    • timem-mpi for timem + MPI
    • timemory-python-profiler for python profiling
    • timemory-python-line-profiler for python line-by-line profiling
  • New instrumentation libraries
    • Kokkos profiling libraries
    • MPI profiling libraries
    • OpenMP profiling libraries
  • New components
    • CrayPAT components
    • AllineaMap components
    • Additional Caliper components
    • papi_vector
    • data_tracker for tracking values in application
  • Aggregation of MPI/UPC++ per-process results
  • New variadic bundlers component_bundle, auto_bundle, lightweight_tuple
  • Functional alternative to variadic bundlers

MT fix and integral_constant for priority

02 Jan 20:40
  • Storage fix for MT
    • Previously, when a thread had multiple entries at a depth of +1 from the master bookmark, only the first subgraph from the thread was merged into the master. Flat profiles did not appear to be affected.
  • trait::start_priority<T> and trait::stop_priority<T> use integral_constant instead of true/false
  • Updated copyright
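
Moving trait::start_priority<T> and trait::stop_priority<T> from true/false to an integral_constant means a priority can express an ordering rather than a yes/no flag. The Python sketch below mimics that idea with integer class attributes. The class names are hypothetical, and the rule that lower values run first is an assumption of this sketch, not something stated in these notes.

```python
class Component:
    # integer priorities, analogous to an integral_constant value;
    # 0 is used here as the neutral default
    start_priority = 0
    stop_priority = 0

    def __init__(self, log):
        self.log = log

    def start(self):
        self.log.append(("start", type(self).__name__))

    def stop(self):
        self.log.append(("stop", type(self).__name__))


class WallClock(Component):
    start_priority = 10   # start after everything else ...
    stop_priority = -10   # ... and stop before everything else


class Marker(Component):
    start_priority = -10  # start first
    stop_priority = 10    # stop last


class Counter(Component):
    pass  # keeps the neutral priorities


def start_all(components):
    # assumption: lower priority values are started first
    for component in sorted(components, key=lambda c: c.start_priority):
        component.start()


def stop_all(components):
    # assumption: lower priority values are stopped first
    for component in sorted(components, key=lambda c: c.stop_priority):
        component.stop()


log = []
bundle = [WallClock(log), Counter(log), Marker(log)]
start_all(bundle)
stop_all(bundle)
```

With a boolean trait only two groups are possible; an integer lets any number of components be ordered relative to each other.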

Modularity Support

29 Dec 05:14
  • Essentially re-written from scratch to support modularity
  • This version supports the following "components" which can be assembled into a multiplexing measurement handle
COMPONENT
caliper
cpu_clock
cpu_roofline<Types...>
cpu_util
cuda_event
cuda_profiler
cupti_activity
cupti_counters
data_rss
gotcha<size_t, Bundle, Diff>
gperf_cpu_profiler
gperf_heap_profiler
gpu_roofline<Types...>
likwid_nvmon
likwid_perfmon
monotonic_clock
monotonic_raw_clock
num_io_in
num_io_out
num_major_page_faults
num_minor_page_faults
num_msg_recv
num_msg_sent
num_signals
num_swap
nvtx_marker
page_rss
papi_array<size_t>
papi_tuple<int...>
peak_rss
priority_context_switch
process_cpu_clock
process_cpu_util
read_bytes
stack_rss
system_clock
tau_marker
thread_cpu_clock
thread_cpu_util
trip_count
user_bundle<size_t, Tag>
user_clock
virtual_memory
voluntary_context_switch
vtune_event
vtune_frame
wall_clock
written_bytes
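
The idea of assembling components into a multiplexing measurement handle can be illustrated with a small standard-library sketch. The classes below borrow the names wall_clock, cpu_clock and trip_count from the list above purely as labels; they are simplified stand-ins written for this example, not timemory's implementations, and the bundle class and its methods are hypothetical.

```python
import time


class wall_clock:
    label = "wall_clock"

    def start(self):
        self._begin = time.perf_counter()

    def stop(self):
        self.value = time.perf_counter() - self._begin


class cpu_clock:
    label = "cpu_clock"

    def start(self):
        self._begin = time.process_time()

    def stop(self):
        self.value = time.process_time() - self._begin


class trip_count:
    label = "trip_count"

    def __init__(self):
        self.value = 0

    def start(self):
        pass  # nothing to record when the region begins

    def stop(self):
        self.value += 1  # one completed start/stop pair


class bundle:
    """Hypothetical handle that forwards start/stop to every component."""

    def __init__(self, name, *component_types):
        self.name = name
        self.components = [ctype() for ctype in component_types]

    def __enter__(self):
        for component in self.components:
            component.start()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # stop in reverse order so the first component started stops last
        for component in reversed(self.components):
            component.stop()
        return False

    def get(self):
        return {c.label: c.value for c in self.components}


with bundle("sum_of_squares", wall_clock, cpu_clock, trip_count) as handle:
    total = sum(i * i for i in range(10000))

results = handle.get()
```

A single start/stop on the handle fans out to every component, so the set of measurements can change without touching the instrumented code.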

Release v2.3.0

11 Feb 00:30
  • Release v2.3.0
  • This is a tag of the version of TiMemory before the string ABI modifications

Performance improvement + C interface + env control

12 Jun 04:22
  • Significant performance improvement (~2x)
  • new C interface for TiMemory
    • requires variable assignment and freeing
      • void* atimer = TIMEMORY_AUTO_TIMER("")
      • FREE_TIMEMORY_AUTO_TIMER(atimer)
  • command-line tools: timem (UNIX-only) and pytimem
  • Environment control
    • TIMEMORY_VERBOSE
    • TIMEMORY_DISABLE_TIMER_MEMORY
    • TIMEMORY_NUM_THREADS_ENV
    • TIMEMORY_NUM_THREADS
    • TIMEMORY_ENABLE
    • TIMEMORY_TIMING_FORMAT
    • TIMEMORY_TIMING_PRECISION
    • TIMEMORY_TIMING_WIDTH
    • TIMEMORY_TIMING_UNITS
    • TIMEMORY_TIMING_SCIENTIFIC
    • TIMEMORY_MEMORY_FORMAT
    • TIMEMORY_MEMORY_PRECISION
    • TIMEMORY_MEMORY_WIDTH
    • TIMEMORY_MEMORY_UNITS
    • TIMEMORY_MEMORY_SCIENTIFIC
    • TIMEMORY_TIMING_MEMORY_FORMAT
    • TIMEMORY_TIMING_MEMORY_PRECISION
    • TIMEMORY_TIMING_MEMORY_WIDTH
    • TIMEMORY_TIMING_MEMORY_UNITS
    • TIMEMORY_TIMING_MEMORY_SCIENTIFIC
  • Ability to push/pop default formatting
  • improved thread-local singleton using C++ shared_ptrs
    • automatic merge and deletion of manager instances at sub-thread exit
  • Hard-code python exe into timemory python scripts
  • Various fixes (plotting, argparse, etc.)
  • Minor fix to avoid very rare FPE when serializing
  • fix to TiMemoryConfig.cmake when installed via sudo
  • self-cost available in manager + plotting safeguards
  • Improved singleton deletion
  • alternative colors for when len(_types) == 1 in plotting
  • plotting label fix
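
To make the environment control concrete, the sketch below reads a few of the variables listed above and applies them when formatting a timing value. Only the variable names come from these notes. The default values, the accepted boolean spellings, and the helper names load_timing_settings and format_timing are assumptions made for illustration.

```python
import os


def _as_bool(text):
    # accepted spellings are an assumption of this sketch
    return text.strip().lower() in ("1", "on", "true", "yes", "y")


def load_timing_settings(environ=None):
    """Collect timing-related settings from the environment."""
    environ = os.environ if environ is None else environ
    return {
        # default values below are illustrative, not timemory's defaults
        "enable": _as_bool(environ.get("TIMEMORY_ENABLE", "ON")),
        "precision": int(environ.get("TIMEMORY_TIMING_PRECISION", "3")),
        "width": int(environ.get("TIMEMORY_TIMING_WIDTH", "8")),
        "scientific": _as_bool(environ.get("TIMEMORY_TIMING_SCIENTIFIC", "OFF")),
    }


def format_timing(seconds, settings):
    """Format a value using the width, precision and notation settings."""
    kind = "e" if settings["scientific"] else "f"
    spec = "{width}.{precision}{kind}".format(kind=kind, **settings)
    return format(seconds, spec)


# an explicit mapping stands in for os.environ in this example
settings = load_timing_settings(
    {"TIMEMORY_TIMING_PRECISION": "2", "TIMEMORY_TIMING_WIDTH": "10"}
)
text = format_timing(1.23456, settings)
```

Reading the variables once into a settings object keeps the formatting code independent of the environment, which also makes it straightforward to test.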

TiMemoryConfig.cmake fixes

26 Apr 07:31
  • Patches for TiMemoryConfig.cmake
    • no longer using add_library alias
    • fix for when TIMEMORY_USE_PYTHON_BINDINGS=OFF