Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LTS Wiggle Factor #758

Merged
merged 38 commits into from Feb 1, 2023
Merged

Add LTS Wiggle Factor #758

merged 38 commits into from Feb 1, 2023

Conversation

krenzland
Copy link
Contributor

(Draft PR because currently NOT well tested!)

This PR implements the "wiggle factor" described in https://doi.ieeecomputersociety.org/10.1109/IPDPS53621.2022.00046
The idea is to make the time step of all clusters smaller. This can change the clustering and can move some elements towards a higher cluster - leading to a faster time-to-solution.

Note: Searching for this wiggle factor can be quite expensive. It can be made cheaper by setting "LtsWiggleFactorEnforceMaximumDifference = 0" but this may lead to a slightly worse wiggle factor.

(The default behavior is that this wiggle factor is NOT used because it can make unstable CFL constants seem stable.)

@codecov-commenter
Copy link

codecov-commenter commented Dec 13, 2022

Codecov Report

Merging #758 (4841e50) into master (c25ae5d) will increase coverage by 5.34%.
The diff coverage is 48.75%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@            Coverage Diff             @@
##           master     #758      +/-   ##
==========================================
+ Coverage   10.18%   15.52%   +5.34%     
==========================================
  Files         262      291      +29     
  Lines       18189    21759    +3570     
==========================================
+ Hits         1852     3378    +1526     
- Misses      16337    18381    +2044     
Impacted Files Coverage Δ
src/Geometry/MeshReaderFBinding.cpp 0.00% <0.00%> (ø)
src/Initializer/MemoryManager.h 7.69% <0.00%> (-1.01%) ⬇️
src/Initializer/time_stepping/LtsLayout.cpp 0.61% <0.00%> (-1.10%) ⬇️
.../Initializer/time_stepping/LtsWeights/LtsWeights.h 100.00% <ø> (ø)
...tializer/time_stepping/LtsWeights/WeightsFactory.h 0.00% <0.00%> (ø)
src/Initializer/time_stepping/MultiRate.hpp 0.00% <0.00%> (ø)
src/Reader/readpar.f90 0.00% <ø> (ø)
src/SeisSol.h 8.16% <ø> (-2.05%) ⬇️
src/Initializer/time_stepping/LtsParameters.cpp 33.33% <33.33%> (ø)
...ializer/time_stepping/LtsWeights/WeightsModels.cpp 23.07% <33.33%> (-5.13%) ⬇️
... and 67 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Contributor

@Thomas-Ulrich Thomas-Ulrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand.
For me, cost will always be cost_withoutwiggleFactor/wiggleFactor > cost_withoutwiggleFactor

src/Initializer/time_stepping/LtsWeights/LtsWeights.cpp Outdated Show resolved Hide resolved
@krenzland
Copy link
Contributor Author

For me, cost will always be cost_withoutwiggleFactor/wiggleFactor > cost_withoutwiggleFactor

The clustering changes with the wiggle factor because the time step width of the clusters change

EnforceMaximumDifference must be called before merging groups to get accurate cost estimate
Also fix small bug that sometimes created empty buffers when not merging clusters.
@krenzland krenzland marked this pull request as ready for review January 12, 2023 16:45
Copy link
Contributor

@Thomas-Ulrich Thomas-Ulrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First round of comments on parameters to clarify what we are talking about.

Documentation/parameters.par Outdated Show resolved Hide resolved
Documentation/parameters.par Outdated Show resolved Hide resolved
Documentation/parameters.par Outdated Show resolved Hide resolved
Documentation/parameters.par Show resolved Hide resolved
Copy link
Contributor

@sebwolf-de sebwolf-de left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, the code looks good to me, most of my comments are about the term group, which is misleading IMHO.
Also, I think that the setup is too complicated to document it with the parameter file only. There are, in fact, two features: The wiggle factor and the maximum number of allowed clusters (auto merge). So please add some more documentation in its own RtD chapter.

src/Initializer/MemoryManager.h Show resolved Hide resolved
src/Initializer/time_stepping/LtsWeights/LtsWeights.cpp Outdated Show resolved Hide resolved
src/Initializer/time_stepping/LtsWeights/LtsWeights.cpp Outdated Show resolved Hide resolved
src/Initializer/time_stepping/LtsWeights/LtsWeights.cpp Outdated Show resolved Hide resolved
src/Initializer/time_stepping/LtsWeights/LtsWeights.cpp Outdated Show resolved Hide resolved
src/Initializer/time_stepping/LtsWeights/LtsWeights.cpp Outdated Show resolved Hide resolved
src/Initializer/time_stepping/LtsWeights/LtsWeights.h Outdated Show resolved Hide resolved
Documentation/parameters.par Outdated Show resolved Hide resolved
src/Initializer/time_stepping/LtsWeights/LtsWeights.cpp Outdated Show resolved Hide resolved

// Compute baseline cost. No wiggle, no group merge
m_clusterIds = computeClusterIds(maxWiggleFactor);
totalWiggleFactorReductions += enforceMaximumDifference();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change to (see also comment above):
if (ltsParameters->getWiggleFactorEnforceMaximumDifference()) { totalWiggleFactorReductions += enforceMaximumDifference();}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I m sorry I don't understand why you resolved

// performance.
bool foundAdmissibleMerge = false;
for (const auto& [noOfGroups, cost] : mapNoOfGroupsToLowestCost) {
if (cost <= maxAdmissibleCost) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that you would choose the wiggle factor leading to the lowest number of clusters rather than the lowest cost?
I think it would make more sense to compute maxAdmissibleCost based on the lowest cost (e.g. lowest*1.2), and then choose from all wiggle factor the one with the lowest number of clusters.
Then you won't miss a wiggle factor with very low costs but many clusters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, if you gain 0.2% with the wiggle factor and then allow for 10%, then the wiggle factor appears useless in most cases.

src/Initializer/time_stepping/LtsLayout.cpp Outdated Show resolved Hide resolved
src/Initializer/time_stepping/MultiRate.hpp Outdated Show resolved Hide resolved
Copy link
Contributor

@Thomas-Ulrich Thomas-Ulrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

partial review

Documentation/local-timestepping.rst Outdated Show resolved Hide resolved
Documentation/local-timestepping.rst Outdated Show resolved Hide resolved
Documentation/local-timestepping.rst Show resolved Hide resolved
Documentation/local-timestepping.rst Outdated Show resolved Hide resolved
Documentation/local-timestepping.rst Outdated Show resolved Hide resolved
Documentation/local-timestepping.rst Show resolved Hide resolved
Documentation/local-timestepping.rst Outdated Show resolved Hide resolved
Documentation/local-timestepping.rst Outdated Show resolved Hide resolved
const auto should = 0;
for (int i = 1; i <= 5; ++i) {
const auto is = computeMaxClusterIdAfterAutoMerge(
enforceMaxClusterId(clusterIds, 0), cellCosts, 1, i, 0, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this enforceMaxClusterId and put should=2

src/tests/Initializer/time_stepping/LTSWeights.t.h Outdated Show resolved Hide resolved
src/Initializer/time_stepping/LtsWeights/LtsWeights.cpp Outdated Show resolved Hide resolved

// Compute baseline cost. No wiggle, no group merge
m_clusterIds = computeClusterIds(maxWiggleFactor);
totalWiggleFactorReductions += enforceMaximumDifference();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I m sorry I don't understand why you resolved

auto maxClusterIdToEnforce = ltsParameters->getMaxNumberOfClusters() - 1;
if (ltsParameters->isAutoMergeUsed()) {
const auto maxClusterIdAfterMerging =
computeMaxClusterIdAfterAutoMerge(m_clusterIds,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the maxAdmissibleCost does not depend on the wiggle, then why not applying auto merge after you found the best wiggle?

@krenzland
Copy link
Contributor Author

With the most recent commit, this PR should now be bug free™. It at least seems to work for a production scenario of mine.

Copy link
Contributor

@Thomas-Ulrich Thomas-Ulrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@sebwolf-de sebwolf-de left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for providing the documentation, I still see some improvements there ;-)

SeisSol enforces some constraints on the clustering, for example, neighboring elements are always either in the same cluster,
or a cluster which has at most a time step size difference of 2.
Elements that are connected by a dynamic rupture face have to be in the same time-cluster.
THis is called the maximum difference property.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo

Comment on lines 23 to 24
To enable it, use the setting :code:`ClusteredLTS = 2`.
To disable it, use the value :code:`ClusteredLTS = 1`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it -> LTS

----------------------------

SeisSol enforces some constraints on the clustering, for example, neighboring elements are always either in the same cluster,
or a cluster which has at most a time step size difference of 2.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or in a cluster

@krenzland
Copy link
Contributor Author

Thanks @sebwolf-de, I've fixed these issues.

@krenzland krenzland merged commit b4d32e9 into master Feb 1, 2023
@krenzland krenzland deleted the lts-wiggle-factor branch February 1, 2023 11:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants