-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature request: always report model throughput after finalize #1930
Comments
@lizziel Huh. You are right. There is a comment from William that it will be removed. I'm not sure why. I depend on it quite a bit (of course, I wrote it so, well... 😄). Obviously as you say, the reason it is more accurate overall is that the during-run timers never see the Finalize actions (and perhaps any History writing still going on). When I wrote that, I used the brain dead method of just wall time divided by model time. Nothing fancy. Frankly, I think that comment probably should go away or maybe change. Perhaps @tclune had a thought to revamp how we calculate or report that a few years ago? I guess the other question for @tclune is is there a "logger" that always exists? That is, even if a user doesn't provide a |
Even if we revamp, it is clear the comment is wrong. People now rely on it, so it cannot be deleted. The whole point of pflogger is to be able to control whether some outputs happen and also to control where they go. I want the timing profile to go through pflogger as in theory we really should want that in a separate file rather than hidden at the bottom of the console log. (Can be both, of course.) Isn't INFO the default, which means this in "always on" unless you don't want it to be? Could be I've not thought it through. |
I just checked and, yes, if you don't pass in a But maybe what @lizziel is asking is what if you build MAPL without pFlogger support? Then I suppose the print is never done since it is going through logger? Maybe there is there a way to "if pflogger". So we could do: if (pflogger) then
lgr => logging%get_logger('MAPL')
call lgr%info("Model Throughput: %f12.3 days per day", model_days_per_day)
else
write (*,*) "Model Throughput: %f12.3 days per day", model_days_per_day
end if (with the right format statement of course). I suppose we could do it with |
I'm pretty sure GCHP uses pFlogger - we've added several diagnostics in ExtData using that layer. And having a switch on top of pFlogger is the way of insanity. |
Indeed. Since it looks like the print happens no matter what, I'll make a PR to at least remove the comment as we will want this. |
Yes, we do use pFlogger but recently changed the default setting for MAPL to be WARNINGS rather than INFO. However, we still want model throughput printed. I'll revisit the reasoning behind changing INFO to WARNINGS. I think we were trying to avoid certain prints that were showing up in the log. |
You can leave the default settings for MAPL to be INFO, and we can spin off a sublogger for profiling. Then you can set the reporting to be INFO for that but WARNING for everything else. I'll have to look at the code, but I thought I was already using a sublogger for profiling. But maybe not. |
For the routine small throughput bursts, this line should change: MAPL/gridcomps/Cap/MAPL_Cap.F90 Line 339 in 5d0c6ec
Should instead be lgr => logging%get_logger('MAPL.profile') And then the level for logging the profile can be set independently. Things are a bit more complicated for the profiles of individual components. I see now that I never finished the other request to push those writes to pFlogger: Lines 2327 to 2334 in 5d0c6ec
This could at least check the appropriate logger to see if its level is above some threshold. Baby step. The problem is should it check |
And the same issue for the global timers: MAPL/base/ApplicationSupport.F90 Lines 187 to 193 in 5d0c6ec
But at least there it is clearer that it should be |
A profiling sublogger would be great. This reminds me, if there are changes to the loggers what is the best way for me to get that update? We have a logging.yml file that sits in the GEOS-Chem repository rather than be extracted from MAPL. Are the MAPL release notes the best place to get notified that GCHP's logging.yml should be updated for new features? |
@lizziel Well pFlogger itself is very stable. (Or at least mostly neglected at this time.) So the only reason to update the YAML file would be to change your defaults or add new sections because someone has instrumented a new GEOS component. AFAIK, there is no current activity on that front either. But if there were, you could monitor our main YAML file: https://github.com/GEOS-ESM/GEOSgcm_App/blob/develop/logging.yaml (Has not changed since March 2022) With MAPL3 there will probably be a bit more thought given to how to arrange MAPL specific loggers. In the mean time, hopefully we can quickly provide the changes you have requested in this ticket. |
Thank you! |
I'm taking a look at this. First, I'm just working on the "Model Throughput" timer and if I take the suggestion of @tclune for a MAPL:
handlers: [mpi_shared]
level: WARNING
root_level: INFO
MAPL.PROF:
handlers: [console_plain]
propagate: FALSE
level: WARNING
root_level: INFO we get out:
because it's just inheriting the MAPL logger setup. but if we change to: MAPL.profiler:
handlers: [console_plain]
propagate: FALSE
level: WARNING
root_level: INFO where we edit to use the old
which I kind of like. I'll try and figure out the timers now... |
@lizziel Can you look at #1938 I think if you update your console_plain:
class: streamhandler
formatter: plain
unit: OUTPUT_UNIT
level: INFO and a new logger: MAPL.profiler:
handlers: [console_plain]
propagate: FALSE
level: WARNING
root_level: INFO you should be able to do WARNING for all of MAPL, but set this at info for the profiler output |
This issue has been automatically marked as stale because it has not had recent activity. If there are no updates within 7 days, it will be closed. You can add the "long term" tag to prevent the Stale bot from closing this issue. |
Thanks @mathomp4! |
Reporting the model throughput to log at the end of the run is currently tied to pFlogger and requires MAPL logger level INFO (see here).
It would be useful if it was always output regardless of logging settings. Currently both the throughput estimation per timestep and the model timing info are always printed and are not tied to pFlogger. End-of-run model throughput reporting would fall into the same category of information as these, and always doing it would not impact performance or clutter the log.
The code has a comment that it is going to be removed soon. We use this in GCHP as a handy performance metric so request that it stay. It also seems to be much more accurate than the throughput reported per timestep. For example, my 6-hr run has the following stats. If you do the math with the total run time the correct estimation is 48 days/day.
The text was updated successfully, but these errors were encountered: