Profiling kernels with additional function in psyclone tools#339
Profiling kernels with additional function in psyclone tools#339Matthew Walker (mattatmet) wants to merge 4 commits intoMetOffice:mainfrom
Conversation
0c9439a to
da0d83a
Compare
Matthew Hambley (MatthewHambley)
left a comment
There was a problem hiding this comment.
A couple of comments for your consideration. If, after considering them, you decide not to act on them, close the comment. Otherwise leave it open for me to use as a marker.
I notice there's no code reviewer yet. It will be easier if one is sorted out before I approve this change.
|
|
||
| :param psyir: the PSyIR of the PSy-layer. | ||
| :param colours_only: profile only the coloured kernels. Default True. | ||
| :type psyir: :py:class:`psyclone.psyir.nodes.FileContainer` |
There was a problem hiding this comment.
You can use type hinting to include this information in a syntactically significant fashion. e.g.
def profile_loops(psyir: FileContainer, colours_only=True):
This is more succinct than the sphinx form and can be used by tools such as mypy for static type checking.
Note that there is no need to specify colours_only: bool = True, although you can, because the type may be inferred from the default.
There was a problem hiding this comment.
Thanks for this tip Matthew. I've added the type hint to this function and the other functions in psyclone_tools.
| # Add timing calipers to coloured loops. This should be done | ||
| # before the application of the openmp transformation. |
There was a problem hiding this comment.
Is there a means to enforce this ordering?
There was a problem hiding this comment.
Thank you for this comment Matthew, it's been really good to check. I hadn't thought about how strict this condition should be so I tested it and it turns out that a check for this ordering was definitely warranted as well as another (making sure the function isn't called before colour_loops as well). I've added a few raise TransformationError lines so that it fails gracefully/clearly when the function is called out of order.
There was a problem hiding this comment.
Hacka Fett (@christophermaynard) Joerg Henrichs (@hiker): one thing that came up here was that Psyclone raises an error when a Profile node is placed between an OMPParallelDoDirective node and a Loop node. An idea for the future could be setting Psyclone up so that it allows it when you want to profile different threads?
There was a problem hiding this comment.
Hacka Fett (Hacka Fett (@christophermaynard)) Joerg Henrichs (Joerg Henrichs (@hiker)): one thing that came up here was that Psyclone raises an error when a Profile node is placed between an OMPParallelDoDirective node and a Loop node. An idea for the future could be setting Psyclone up so that it allows it when you want to profile different threads?
I don't think that's valid Fortran: after an omp parallel do there must be a loop, we can't put a call there.
Tools that support measurements for threads (well ... at least tau does :) ) either instrument the omp directives (i.e. source to source transformation, inserting calls), or using the ompt call back functionality.
| | mo-lucy-gordon | Lucy Gordon | Met Office | 2026-03-18 | | ||
| | shreybh1 | Shrey Bhardwaj | Met Office | 2026-03-26 | | ||
|
|
||
| | mattatmet | Matthew Walker | Met Office | 2026-04-21 | |
There was a problem hiding this comment.
Unless it states otherwise in the developer documentation, it is probably best to insert your name in alphabetical order by family name. This may be difficult if others have not been doing so.
There was a problem hiding this comment.
Sadly, there is no clear structure or order to this file, alphabetical or otherwise.
Oakley Brunt (oakleybrunt)
left a comment
There was a problem hiding this comment.
Thanks Matt, looking like a very useful change! There are just a few things that could be cleaner and some style changes I'd like to see as well.
Feel free to push back on any of my suggestions :)
| # Insert profiler calls before loop over colours | ||
| if (not colours_only and not loop.loop_type in leave_loops) or loop.loop_type == "colours": |
There was a problem hiding this comment.
I think this check could be more clearly signalled, e.g.:
| # Insert profiler calls before loop over colours | |
| if (not colours_only and not loop.loop_type in leave_loops) or loop.loop_type == "colours": | |
| # Always profile coloured loops. Optionally profile all loops if | |
| # colours_only=False and loop not over anything in `leave_loops` | |
| if ((loop.loop_type == "colours") or | |
| (colours_only is False and loop.loop_type not in leave_loops)): |
I know that I have used an extra pair of parentheses which are not strictly required, but it visually separates the two checks which I think is easier to read.
Please check that this works if you agree with this suggestion. I am trusting myself here and haven't tried to run this.
| k_name = loop.ancestor(InvokeSchedule).coded_kernels()[count].name | ||
| invoke_name = loop.ancestor(InvokeSchedule).invoke.name | ||
| file_name = loop.ancestor(Container).name | ||
| options = {"region_name": (file_name,invoke_name + ":" + k_name + "_k" + str(count))} |
There was a problem hiding this comment.
This line is too long, probably worth creating the region name above:
| options = {"region_name": (file_name,invoke_name + ":" + k_name + "_k" + str(count))} | |
| # Make region name | |
| region_name = invoke_name + ":" + k_name + "_k" + str(count) | |
| options = {"region_name": (file_name, region_name)} |
| k_name = loop.ancestor(InvokeSchedule).coded_kernels()[count].name | ||
| invoke_name = loop.ancestor(InvokeSchedule).invoke.name | ||
| file_name = loop.ancestor(Container).name |
There was a problem hiding this comment.
A comment would be nice here to let others know you are getting the parts to name the profiling calliper.
| invoke_name = loop.ancestor(InvokeSchedule).invoke.name | ||
| file_name = loop.ancestor(Container).name | ||
| options = {"region_name": (file_name,invoke_name + ":" + k_name + "_k" + str(count))} | ||
| profile_trans.apply(loop,options=options) |
There was a problem hiding this comment.
| profile_trans.apply(loop,options=options) | |
| profile_trans.apply(loop, options=options) |
| file_name = loop.ancestor(Container).name | ||
| options = {"region_name": (file_name,invoke_name + ":" + k_name + "_k" + str(count))} | ||
| profile_trans.apply(loop,options=options) | ||
| count += 1 |
There was a problem hiding this comment.
A comment about why we are using a counter would go a long way here.
Something like "Allows invokes of the same name to be profiled individually"
| if loop.ancestor(OMPParallelDirective) or loop.ancestor(OMPParallelDoDirective) \ | ||
| or loop.ancestor(OMPDoDirective): | ||
| raise TransformationError( | ||
| "Must apply profile_loops BEFORE openmp_parellelise_loops function in optimisation script.") |
There was a problem hiding this comment.
Line too long, we should be using an 80 character limit.
| "Must apply profile_loops BEFORE openmp_parellelise_loops function in optimisation script.") | |
| "Must apply profile_loops BEFORE " | |
| "openmp_parellelise_loops function in optimisation " | |
| "script.") |
| # Check if the profiling calipers have been added before the colouring. | ||
| if isinstance(child, ProfileNode): | ||
| raise TransformationError( | ||
| "Must apply colour_loops BEFORE profile_loops function in optimisation script.") |
There was a problem hiding this comment.
Line too long:
| "Must apply colour_loops BEFORE profile_loops function in optimisation script.") | |
| "Must apply colour_loops BEFORE profile_loops function in " | |
| "optimisation script.") |
| ctrans.apply(child, options={"tiling": enable_tiling}) | ||
|
|
||
| # ----------------------------------------------------------------------------- | ||
| def profile_loops(psyir: FileContainer,colours_only=True): |
There was a problem hiding this comment.
| def profile_loops(psyir: FileContainer,colours_only=True): | |
| def profile_loops(psyir: FileContainer, colours_only=True): |
| if loop.ancestor(OMPParallelDirective) or loop.ancestor(OMPParallelDoDirective) \ | ||
| or loop.ancestor(OMPDoDirective): |
There was a problem hiding this comment.
Line too long:
| if loop.ancestor(OMPParallelDirective) or loop.ancestor(OMPParallelDoDirective) \ | |
| or loop.ancestor(OMPDoDirective): | |
| if (loop.ancestor(OMPParallelDirective) or | |
| loop.ancestor(OMPParallelDoDirective) or | |
| loop.ancestor(OMPDoDirective)): |
|
Oakley Brunt (@oakleybrunt) Thanks for those suggested changes, the code looks much better now. I've added them, tested them with both core and apps and committed them. I've updated the trac.log to reflect the recent core test as well. Cheers. |
PR Summary
Sci/Tech Reviewer: Oakley Brunt (@oakleybrunt)
Code Reviewer: Ed Hone (@EdHone)
This PR consists of the addition of a function called
profile_loopsto thepsyclone_tools.pyfile. This function is called by users in the optimisation scripts in apps and usesProfileTransfrom Psyclone to wrap the outermost loops of individual kernels with timing callipers. The names of the callipers are generated in the function, using the file name, invoke name, kernel name and the number of each kernel in that particular invoke.Code Quality Checklist
Testing
trac.log
Test Suite Results - lfric_core - profiling_dev_core/run7
Suite Information
Task Information
✅ succeeded tasks - 388
Security Considerations
Performance Impact
AI Assistance and Attribution
Documentation
PSyclone Approval
Sci/Tech Review
(Please alert the code reviewer via a tag when you have approved the SR)
Code Review