Count FLOPs and HitRate inside functor and output them to a log file #850

Open
wants to merge 97 commits into
base: master

Conversation

SamNewcome
Contributor

@SamNewcome SamNewcome commented Apr 25, 2024

Description

The FLOP counter functor was removed as it was too rigid. See #841 for more details on this "rigidity". The proposed alternative for counting FLOPs was for it to be handled internally by a functor. The advantages are outlined in #841. This has been implemented.

At the same time, md-flexible's FLOP outputting mechanic was misleading. See #707 for more details. A solution to this was sketched as an idea in #695.

This pull request

  • Refactors the FLOP counting interface into the functors themselves.
  • Implements the FLOP counting mechanic for LJFunctor. Implementing the FLOP counting mechanism inside the other vectorisations of this functor is not worthwhile, as their future is questionable in the face of potential vectorised wrapper variants, as discussed in Intrinsics Wrappers #833. The MultisiteLJFunctor will not get an implementation in this PR, as the implementation of the functor is suboptimal and will be replaced in Multi Site LJ AVX512 Functor #810.
  • Refactors and extends the FLOP counting tests.
  • Adds an extra optional logger to AutoPas which logs FLOPs and HitRate.
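As a rough illustration of counting inside the functor, here is a minimal sketch. `CountingFunctorSketch`, `processPair`, and the per-operation FLOP costs are hypothetical stand-ins invented for this example and are not the actual LJFunctor code; only the names `getNumFLOPs` and `getHitRate` and the hit rate definition (kernel calls divided by distance calculations) come from this PR.

```cpp
#include <cstddef>

// Hypothetical stand-in for a pairwise functor; NOT the AutoPas LJFunctor.
class CountingFunctorSketch {
 public:
  explicit CountingFunctorSketch(double cutoff) : _cutoffSquared(cutoff * cutoff) {}

  // Processes one particle pair given its squared distance.
  // Returns true if the pair lies within the cutoff, i.e. the kernel runs.
  bool processPair(double distanceSquared) {
    ++_numDistanceCalls;
    if (distanceSquared > _cutoffSquared) {
      return false;
    }
    ++_numKernelCalls;
    return true;
  }

  // Total FLOPs, using the assumed per-operation costs below.
  [[nodiscard]] std::size_t getNumFLOPs() const {
    return _numDistanceCalls * flopsPerDistanceCall + _numKernelCalls * flopsPerKernelCall;
  }

  // (number of kernel calls) / (number of distance calculations)
  [[nodiscard]] double getHitRate() const {
    if (_numDistanceCalls == 0) {
      return 0.0;
    }
    return static_cast<double>(_numKernelCalls) / static_cast<double>(_numDistanceCalls);
  }

  static constexpr std::size_t flopsPerDistanceCall = 8;  // assumed value, for illustration only
  static constexpr std::size_t flopsPerKernelCall = 15;   // assumed value, for illustration only

 private:
  double _cutoffSquared;
  std::size_t _numDistanceCalls = 0;
  std::size_t _numKernelCalls = 0;
};
```

Keeping the counters next to the kernel means each functor can define its own costs, which is the flexibility the separate FLOP counter functor lacked. A real implementation would also need thread-safe counters.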

Related Pull Requests

Resolved Issues

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

  • Existing FLOP counter functor test is refactored as a test for LJFunctor's FLOP counter.
  • Existing tests are extended to also test FLOP counting with globals.
  • Existing tests are extended to also test for data races in FLOP counting with SoA Verlet.
  • The FLOP counts produced by the old functor are compared against those produced by the new functor for the first iteration of falling drop.
  • Tested that the SVE functor compiles and runs on an ARM cluster with FLOP logging disabled, and that it compiles with FLOP logging enabled.

@SamNewcome SamNewcome self-assigned this Apr 25, 2024
@SamNewcome SamNewcome mentioned this pull request May 16, 2024
2 tasks
AutoPasLog(WARN, "LJFunctorAVX::getHitRate called but is not implemented and will return 0.");
return 0;
}

Contributor

Sorry to be so annoying about the warnings.
But my default setup when I'm not interested in flops would use the LJ-AVX functor with AUTOPAS_LOG_FLOPS=OFF and log-level=warn.
This leads to a completely full terminal output :(

I'm also not so sure how to handle this if you really want a runtime warning within the functor...
Maybe the return value could be std::numeric_limits<double>::quiet_NaN(), to make it very clear in the output that these are not correct values?

@SamNewcome
Contributor Author

SamNewcome commented Jul 9, 2024

To summarise a recent change to this branch:

  • @thesamriel had an issue with constant log spam due to FLOP counting metrics not being implemented in the case that AUTOPAS_LOG_FLOPS=ON.
  • To avoid this issue, and to improve the maintainability of FLOP logging, the handling of functors without FLOP counting implemented is moved into FLOPLogger itself.
  • The default implementation returns negative FLOPs and hit rate.
  • When the FLOPLogger receives negative FLOPs or hit rate, it outputs blank fields in the CSV.
  • If this happens at least once, FLOPLogger outputs an INFO level log upon destruction e.g. at the end of the simulation.
  • If a relevant function is not implemented, "Not Implemented" is outputted instead in the relevant field.
  • The md-flexible functors that don't implement these functions still output a DEBUG level log upon construction if countFLOPs is enabled.
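The blank-field behaviour described in this list could look roughly like the sketch below. `BlankFieldFormatter` and `formatLine` are hypothetical names; the real FLOPLogger writes through spdlog, which is left out here so the example stays self-contained. The negative-value check mirrors the `>= 0` test in the reviewed code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper mirroring the described FLOPLogger behaviour.
class BlankFieldFormatter {
 public:
  // Builds one CSV line; negative inputs mean "not implemented" and yield blank fields.
  std::string formatLine(std::size_t iteration, long long numFLOPs, double hitRate) {
    const std::string numFLOPsStr = numFLOPs >= 0 ? std::to_string(numFLOPs) : std::string{};
    const std::string hitRateStr = hitRate >= 0 ? std::to_string(hitRate) : std::string{};
    // Remember that a blank field was written, so a single notice can be emitted later.
    if (numFLOPsStr.empty() || hitRateStr.empty()) {
      _loggedEmptyFields = true;
    }
    return std::to_string(iteration) + "," + numFLOPsStr + "," + hitRateStr;
  }

  [[nodiscard]] bool loggedEmptyFields() const { return _loggedEmptyFields; }

 private:
  bool _loggedEmptyFields = false;
};
```

The flag is what allows one summary message instead of a warning per iteration.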

* leaves a blank space in the log.
* @return number of FLOPs
*/
[[nodiscard]] virtual int getNumFLOPs() const { return -1; }
Member

This should return a 64-bit value like long, otherwise, I fear we risk overflows here. Also, why sacrifice one full bit for the error state? Why not simply use 0 or std::numeric_limits<size_t>::max() (=2^64 - 1)?

Contributor Author

Could use std::numeric_limits<size_t>::max(). I think 0 is too potentially problematic.
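A minimal sketch of the std::numeric_limits<size_t>::max() sentinel idea, assuming a simplified interface. `SketchFunctor`, `SketchCountingFunctor`, `flopsNotImplemented`, and `flopsField` are hypothetical names for illustration and not part of AutoPas.

```cpp
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical sentinel meaning "this functor does not count FLOPs".
constexpr std::size_t flopsNotImplemented = std::numeric_limits<std::size_t>::max();

// Hypothetical base class; the real AutoPas Functor interface differs.
struct SketchFunctor {
  virtual ~SketchFunctor() = default;
  // Default: signal that FLOP counting is not implemented.
  [[nodiscard]] virtual std::size_t getNumFLOPs() const { return flopsNotImplemented; }
};

// A functor that does count FLOPs simply overrides the default.
struct SketchCountingFunctor : SketchFunctor {
  std::size_t flops = 0;
  [[nodiscard]] std::size_t getNumFLOPs() const override { return flops; }
};

// Converts the count into a CSV field; the sentinel becomes a blank field.
std::string flopsField(const SketchFunctor &functor) {
  const std::size_t numFLOPs = functor.getNumFLOPs();
  return numFLOPs == flopsNotImplemented ? std::string{} : std::to_string(numFLOPs);
}
```

With this choice a genuine count of 0 stays distinguishable from "not implemented", which is the concern raised about using 0 as the error value.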

*
* @return (number of kernel calls) / (number of distance calculations)
*/
[[nodiscard]] virtual double getHitRate() const { return -1; }
Member

Here I'd go for a special double value like std::numeric_limits<T>::quiet_NaN (not sure if there is something more fitting).
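A small sketch of the NaN variant, with hypothetical helper names (`defaultHitRate`, `hitRateField`). One detail worth keeping in mind: every ordered comparison with NaN is false, so a check like `hitRate >= 0` also rejects NaN, but `std::isnan` states the intent more directly.

```cpp
#include <cmath>
#include <limits>
#include <string>

// Hypothetical default for functors that do not track a hit rate.
double defaultHitRate() { return std::numeric_limits<double>::quiet_NaN(); }

// Converts the hit rate into a CSV field; NaN becomes a blank field.
std::string hitRateField(double hitRate) {
  return std::isnan(hitRate) ? std::string{} : std::to_string(hitRate);
}
```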

Comment on lines 56 to 58
const auto numFLOPsStr = numFLOPs >= 0 ? std::to_string(numFLOPs) : "";
const auto hitRateStr = hitRate >= 0 ? std::to_string(hitRate) : "";
spdlog::get(_loggerName)->info("{},{},{}", iteration, numFLOPsStr, hitRateStr);
Member

What's the reason you are logging anything at all in that case and not just skip the call to the logger?

Something like:

Suggested change
const auto numFLOPsStr = numFLOPs >= 0 ? std::to_string(numFLOPs) : "";
const auto hitRateStr = hitRate >= 0 ? std::to_string(hitRate) : "";
spdlog::get(_loggerName)->info("{},{},{}", iteration, numFLOPsStr, hitRateStr);
const auto numFLOPsStr = numFLOPs >= 0 ? std::to_string(numFLOPs) : "";
const auto hitRateStr = hitRate >= 0 ? std::to_string(hitRate) : "";
// Only write if at least one value is not empty
if (not numFLOPsStr.empty() or not hitRateStr.empty()) {
spdlog::get(_loggerName)->info("{},{},{}", iteration, numFLOPsStr, hitRateStr);
}

Contributor Author

I don't want to skip the logger call, because logging a line makes it clear that the functor was still called but provided no metrics, rather than suggesting that it wasn't called at all. Furthermore, consider the scenario that two functors are used, one of which has only one function implemented and the other has none (niche scenario, but illustrates the wider point). One functor results in a log with one empty field, implying that empty fields are outputted, and the other doesn't log anything.

Ultimately, I don't think the scenario where the user has AUTOPAS_LOG_FLOPS=ON but is using a functor which has no implementation is that important - I just want something which works and is not going to leave the user wondering why the FLOP log makes no sense.

Member

consider the scenario that two functors are used,

For that case you should probably also write the functor name in every line or create one log file per functor.
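One way the first option could look, as a sketch only: each CSV line carries the functor name as an extra column. The function name `formatFlopLogLine` and the column order are assumptions for illustration, not something this PR implements.

```cpp
#include <cstddef>
#include <string>

// Hypothetical formatter: iteration, functor name, FLOPs, hit rate.
// Blank strings for the last two fields mean the metric was not provided.
std::string formatFlopLogLine(std::size_t iteration, const std::string &functorName,
                              const std::string &numFLOPsStr, const std::string &hitRateStr) {
  return std::to_string(iteration) + "," + functorName + "," + numFLOPsStr + "," + hitRateStr;
}
```

With the name in every line, a functor that provides only one metric and a functor that provides none remain distinguishable in the same file.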

Comment on lines 44 to 45
if (_loggedEmptyFields) {
AutoPasLog(INFO,
Member

  1. I think this should be a warning. It's not spammy and if the user is annoyed by it they should disable the feature they did not implement because that is improper utilization.
  2. Maybe also write this message the first time this happens otherwise a user will only read this at the end of their simulation.
  3. I'm unsure if I would like this message only at the start, only the end, or both. Pro/Cons:
    • Start: Can be seen quickly after starting the simulation leading to better reaction times but easy to miss.
    • End: Harder to miss but very late.
    • Both: Best of both worlds but duplicated warning.

Contributor Author

You're a bit too premature with this comment haha - it wasn't ready for review and this doesn't really apply anymore.

FYI, this didn't actually work, I think, because the AutoPas logger gets deregistered before this object is destructed. So I opted for a much more heavy-handed approach of just logging "Not Implemented".

The Start option doesn't work because we would need to force the user's simulator to tell the logger which functors it will use (unless I am missing something). This could be problematic.

Option 2 could work, but might be problematic if this is in the depths of the log file and not that noticeable. As a WARN this might not be too bad.

I will post a comment summarising this further change.


github-actions bot commented Jul 9, 2024

DocTagChecker

Unchanged Documentation

The following doc files are unchanged, but some related sources were changed. Make sure the documentation is up to date!



@SamNewcome
Contributor Author

Just to summarise the even more recent change:

FLOPLogger now logs "Not Implemented" in the CSV field if a functor does not implement getNumFLOPs or getHitRate
This is heavy-handed but

  • It does not spam the regular AutoPas log at all
  • If the user does not really care about FLOP logging then they aren't going to read the file
  • If the user does care, they will see the problem and can do something about that.
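The "Not Implemented" behaviour might be sketched as follows. The helper names are hypothetical, and a negative value is assumed to be the signal for a missing implementation, as in the default return values discussed earlier in this thread; the real FLOPLogger writes via spdlog rather than returning a string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers: a negative value is assumed to mean "not implemented".
std::string flopsFieldOrNotImplemented(long long numFLOPs) {
  return numFLOPs >= 0 ? std::to_string(numFLOPs) : std::string{"Not Implemented"};
}

std::string hitRateFieldOrNotImplemented(double hitRate) {
  return hitRate >= 0 ? std::to_string(hitRate) : std::string{"Not Implemented"};
}

// Builds one CSV line with explicit markers instead of blank fields.
std::string formatNotImplementedLine(std::size_t iteration, long long numFLOPs, double hitRate) {
  return std::to_string(iteration) + "," + flopsFieldOrNotImplemented(numFLOPs) + "," +
         hitRateFieldOrNotImplemented(hitRate);
}
```

Compared with blank fields, the explicit text explains itself to anyone reading the CSV, at the cost of a non-numeric value in a numeric column.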

3 participants