feat[next]: Add instrumentation package and user-defineable hooks#2437
feat[next]: Add instrumentation package and user-defineable hooks#2437egparedes merged 44 commits intoGridTools:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces an instrumentation subpackage to gt4py.next containing performance metrics infrastructure and a new hook system for user-defined callbacks. The existing metrics module is moved from gt4py.next.metrics to gt4py.next.instrumentation.metrics and refactored to simplify the API by replacing the SourceHandler class with a more direct approach.
Changes:
- Adds new hook machinery supporting event and context hooks with signature validation and callback registration
- Refactors metrics collection to simplify source management (removes
SourceHandlerclass) - Creates three public hooks:
program_call_hook,embedded_program_call_hook, andcompile_variant_hook
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
src/gt4py/next/instrumentation/__init__.py |
New package initialization file |
src/gt4py/next/instrumentation/_hook_machinery.py |
Core hook implementation with EventHook and ContextHook classes |
src/gt4py/next/instrumentation/hooks.py |
Public hook exports for program and compilation callbacks |
src/gt4py/next/instrumentation/metrics.py |
Refactored metrics with simplified API (removed SourceHandler, renamed functions) |
src/gt4py/next/otf/compiled_program.py |
Integrated compile_variant_hook and updated metrics API calls |
src/gt4py/next/ffront/decorator.py |
Integrated program call hooks and updated metrics API calls |
tests/next_tests/unit_tests/instrumentation_tests/__init__.py |
Test package initialization |
tests/next_tests/unit_tests/instrumentation_tests/test_hook_machinery.py |
Comprehensive unit tests for hook machinery |
tests/next_tests/integration_tests/feature_tests/instrumentation_tests/__init__.py |
Integration test package initialization |
tests/next_tests/integration_tests/feature_tests/instrumentation_tests/test_hooks.py |
Integration tests for hooks with different backends |
tests/next_tests/unit_tests/instrumentation_tests/test_metrics.py |
Updated import path |
tests/next_tests/integration_tests/feature_tests/ffront_tests/test_decorator.py |
Updated import paths in mocks |
src/gt4py/next/program_processors/runners/gtfn.py |
Updated import path |
src/gt4py/next/program_processors/runners/dace/workflow/translation.py |
Updated import path |
src/gt4py/next/program_processors/runners/dace/workflow/decoration.py |
Updated import path |
Comments suppressed due to low confidence (2)
src/gt4py/next/instrumentation/metrics.py:197
- Assigning a class directly to
collectchanges the API frommetrics.collect()(function call) tometrics.collect()(class instantiation). While both syntactically work, this is a subtle API change. Consider documenting this or maintaining the function-based API for clarity by definingcollect = CollectorContextManagerasdef collect() -> CollectorContextManager: return CollectorContextManager().
src/gt4py/next/instrumentation/metrics.py:135 - The function name
is_current_source_setcould be more concise. Consider renaming tohas_current_sourceoris_collectingfor better readability, especially since the previous function wasin_collection_mode()which was more intuitive.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if metrics.is_level_enabled(metrics.MINIMAL): | ||
| # Create a new metrics entity for this compiled program variant and | ||
| # attach relevant metadata to it. | ||
| source_key = f"{_metrics_prefix_from_pool_root(_make_pool_root(program_definition, backend))}[{hash(key)}]" | ||
| assert source_key not in metrics.sources, ( | ||
| "The key for the program variant being compiled is already set!!" | ||
| ) | ||
|
|
||
| metrics.sources[source_key].metadata |= dict( | ||
| name=program_definition.definition.__name__, | ||
| backend=backend.name, | ||
| compiled_program_pool_key=hash(key), | ||
| **{ | ||
| f"{eve_utils.CaseStyleConverter.convert(key.__name__, 'pascal', 'snake')}s": value | ||
| for key, value in argument_descriptors.items() | ||
| }, | ||
| ) |
There was a problem hiding this comment.
From a repository structure point of few, wouldn't it make more sense to register this later? Now we have the function _metrics_prefix_from_pool_root in this file which seems wrong.
There was a problem hiding this comment.
The comment got detached by your recent changes, but still a question. Also the empty function stuff could just disappear if we don't have a default registered thing.
There was a problem hiding this comment.
As discussed offline, it's not clear where to put metrics-related pool functions. The built-in implementation of the hooks is meant to be not optional, so user can not remove it. It might worth to revise this in the future, but it seems like a good idea to move away from the current implementation in small steps.
| if debug: | ||
| print(f"Using cached fencil for key {cache_key}") | ||
| return typing.cast(stages.CompiledProgram, _FENCIL_CACHE[cache_key]) | ||
| return _FENCIL_CACHE[cache_key] # A CompiledProgram is just a Callable |
There was a problem hiding this comment.
off topic: maybe CompiledProgram -> ExecutableProgram to include embedded in the naming?
There was a problem hiding this comment.
Yes, you are right. It looks like a better name for the concept.
This PR adds the instrumentation subpackage to
gt4py.nextcontaining the already existing performance metrics infrastructure and a new hook infrastructure to allow users to attach custom callbacks to different parts of the gt4py programs execution.