
Adding Solverstatisticscheck #36

Open
wants to merge 5 commits into base: develop

Conversation

CusiniM
Contributor

@CusiniM CusiniM commented Sep 12, 2023

In this PR I have added a new check called performancecheck (feel free to suggest a different name).

To summarize:

  1. It parses the log to extract information about the number of time steps (cycles), attempts, configuration iterations, nonlinear iterations, and linear iterations.
  2. It writes this information to a file similar to what timehistory may provide for us in the future.
  3. It compares this file against a baseline file, using two different tolerances for nonlinear and linear iterations.
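A rough sketch of steps 1 and 3 in Python (the log format, field names, and tolerance scheme below are illustrative assumptions, not the PR's actual implementation):

```python
import re

# Hypothetical GEOS-like log excerpt; the real format may differ.
log = """\
0 : Time: 0.0e+00 s, dt: 1.0e+00 s, Cycle: 0
Attempt: 0, ConfigurationIter: 0, NewtonIter: 3
Last LinSolve(iter,res) = ( 12, 1.0e-08 )
1 : Time: 1.0e+00 s, dt: 1.0e+00 s, Cycle: 1
Attempt: 0, ConfigurationIter: 0, NewtonIter: 4
Last LinSolve(iter,res) = ( 15, 1.0e-08 )
"""

def extract_statistics(text):
    """Collect per-cycle nonlinear and linear iteration counts from a log."""
    return {
        "cycles": len(re.findall(r"Cycle: (\d+)", text)),
        "newton": [int(n) for n in re.findall(r"NewtonIter: (\d+)", text)],
        "linear": [int(n) for n in re.findall(r"LinSolve\(iter,res\) = \( (\d+),", text)],
    }

def compare(current, baseline, newton_tol=0.0, linear_tol=0.1):
    """Step-by-step comparison, with a looser relative tolerance for linear iterations."""
    if current["cycles"] != baseline["cycles"]:
        return False  # differing step counts: the open question in this PR
    ok = all(abs(a - b) <= newton_tol * max(b, 1)
             for a, b in zip(current["newton"], baseline["newton"]))
    return ok and all(abs(a - b) <= linear_tol * max(b, 1)
                      for a, b in zip(current["linear"], baseline["linear"]))
```

Step 2 (writing the extracted dictionary to an HDF5 file) is omitted here for brevity.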

I could add some plots of the number of iterations if we think they could be useful.

I still see 2 main issues:

  • It is hard to know what to do when the number of time steps differs. Even though it is probably rare, this can happen across platforms just because of an extra iteration. Maybe we could have the check compare the totals first and then do the time-step by time-step comparison.
  • I think that a mechanism to run the larger tests that use iterative solvers only upon request will soon become necessary.

@castelletto1, @francoishamon @victorapm we should also list the MGR strategies we want to test and pick a case for each one of them so that I can start adding tests.

For now, I have added:

  • PoroElastic_Mandel_smoke_fim_mgr.xml (mgr strategy SinglephasePoromechanics)
  • PoroElastic_staircase_co2_3d_mgr.xml (mgr strategy MultiphasePoromechanicsWithWells)
  • Sneddon_embeddedFrac_benchmark (mgr strategy SolidMechanicsEmbeddedFractures)

@paveltomin
Contributor

it is hard to know what to do if the number of time steps is different
How is the time history currently compared in that case? If the data lengths are not the same, the comparison fails, right?
We could think about some interpolation here and computing a norm between the interpolated curves, if that is possible.
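The interpolation-plus-norm idea could look something like the following sketch (all function and variable names are made up for illustration):

```python
import numpy as np

def iteration_curve_distance(times_a, iters_a, times_b, iters_b):
    """Relative L2 distance between two iteration-count histories after
    interpolating both onto a common physical-time grid, so that runs with
    different numbers of time steps can still be compared."""
    t = np.union1d(times_a, times_b)          # merged time axis
    a = np.interp(t, times_a, iters_a)        # resample run A
    b = np.interp(t, times_b, iters_b)        # resample run B
    denom = np.linalg.norm(b)
    diff = np.linalg.norm(a - b)
    return diff / denom if denom > 0 else diff

# Identical constant curves sampled at different resolutions compare as equal:
# iteration_curve_distance([0, 1, 2], [4, 4, 4], [0, 2], [4, 4])  -> 0.0
```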

@paveltomin
Contributor

Please have a look at GEOS-DEV/GEOS#2680

@CusiniM
Contributor Author

CusiniM commented Sep 15, 2023

it is hard to know what to do if the number of time steps is different. How is the time history currently compared in that case? If the data lengths are not the same, the comparison fails, right? We could think about some interpolation here and computing a norm between the interpolated curves, if that is possible.

For the curve_check, that's what happens: the solution is interpolated. The restart checks will fail if the timestepping is different (and they should). I feel that interpolating the number of iterations does not make much sense; it won't give us much more information than comparing the total number of iterations or the average number per time step. We could decide to only check that on all machines apart from the one used to create the baseline, where we require a one-to-one match.
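The aggregate fallback described here (totals or per-step averages instead of a step-by-step match) might be sketched as follows; the function name and tolerance are illustrative:

```python
def totals_match(current_iters, baseline_iters, rel_tol=0.05):
    """Compare total and average iteration counts with a relative tolerance,
    so that runs whose time-step counts differ can still be checked."""
    cur_total, base_total = sum(current_iters), sum(baseline_iters)
    cur_avg = cur_total / len(current_iters)
    base_avg = base_total / len(baseline_iters)
    total_ok = abs(cur_total - base_total) <= rel_tol * base_total
    avg_ok = abs(cur_avg - base_avg) <= rel_tol * base_avg
    return total_ok and avg_ok
```

On the baseline-producing machine one could instead require an exact, element-wise match, as suggested above.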

@CusiniM
Contributor Author

CusiniM commented Nov 8, 2023

Can someone review this PR so that we can move forward with it?

@CusiniM
Contributor Author

CusiniM commented Nov 8, 2023

Please have a look at GEOS-DEV/GEOS#2680

I don't think I will add checks for sequential strategies for now. The goal is to add some coverage for MGR strategies which are only used for fully coupled.

@wrtobin
Contributor

wrtobin commented Nov 8, 2023

Conceptually this is a good start.

I worry that we might want to use a different term than performance, as I tend to think of performance checks in terms of timings/flops/etc, and this is more about convergence characteristics. I don't think performance is wrong, I just wonder if there is a more applicable term that doesn't carry the same connotation.

Generically, what we're doing with this is checking for any divergence from a baseline that is not strictly about problem correctness.

Also, longer-term I think we would want to make the tool more generic: extract arbitrary metrics using regexes (or just Python functions that consume strings and return metric info, if we don't want to be strongly tied to regexes) to construct plots we do curve-checking against.

@victorapm

How about verification?

@wrtobin
Contributor

wrtobin commented Nov 8, 2023

How about verification?

Classically, yeah: validation == correctness of the solution, verification == correctness of execution.

@CusiniM CusiniM force-pushed the feature/cusini/performancecheck branch from 73a6165 to 7a434bf on November 8, 2023 at 18:41
@CusiniM
Contributor Author

CusiniM commented Nov 8, 2023

Conceptually this is a good start.

I worry that we might want to use a different term than performance, as I tend to think of performance checks in terms of timings/flops/etc, and this is more about convergence characteristics. I don't think performance is wrong, I just wonder if there is a more applicable term that doesn't carry the same connotation.

Generically, what we're doing with this is checking for any divergence from a baseline that is not strictly about problem correctness.

Maybe, since at the moment that's the only thing it is doing, we can simply call it SolverStatisticsCheck? I feel people already struggle to understand how our testing works, so the less abstract a name we pick the better.

Also, longer-term I think we would want to make the tool more generic to extract arbitrary metrics using regex's (or just python functions which consume strings and return metric info -- if we don't want to be strongly tied to regexes) to construct plots we do curve-checking against.

To be honest, I really hope that eventually we can output this data through time-history and avoid parsing the log entirely. I only did it this way for now because it felt like we needed a quick solution.

@victorapm

the less abstract of a name we pick the better

Good point! SolverStatisticsCheck sounds accurate here

@CusiniM CusiniM changed the title Adding Performancecheck Adding Solverstatisticscheck Nov 8, 2023
errors:
"""
# Define regular expressions
cycle_pattern = r"\d+\s*:\s*Time: [\d.e+-]+ s, dt: [\d.e+-]+ s, Cycle: (\d+)"

With the new changes in log timestamp format, this would miss anything that is in minutes, years, etc. Maybe switch s, to \s*,?
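One unit-tolerant variant of the cycle pattern (an illustration only; the exact replacement adopted in the PR may differ) would accept any unit token after the numeric value instead of hard-coding "s":

```python
import re

# Illustrative pattern: \w* matches whatever unit follows the number
# (s, min, years, ...). The log lines below are assumed examples.
cycle_pattern = r"\d+\s*:\s*Time: [\d.e+-]+ \w*,? dt: [\d.e+-]+ \w*,? Cycle: (\d+)"

line_s   = "0 : Time: 1.0e+00 s, dt: 1.0e+00 s, Cycle: 3"
line_min = "0 : Time: 2.5 min, dt: 0.5 min, Cycle: 7"

# Both unit variants now yield the cycle number:
# re.search(cycle_pattern, line_min).group(1)  -> "7"
```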


# Create an HDF5 file for storing the data
output_fileName = os.path.join(os.path.dirname(fname), 'extracted_solverStat_data.h5')
with h5py.File(output_fileName, 'w') as hdf5_file:

Since you already have hdf5_wrapper as a dependency, you could switch this to:

with hdf5_wrapper.hdf5_wrapper(output_fileName, 'w') as hdf5_file:
...

And then write to the object as if it were a simple python dictionary:

hdf5_file['some_key'] = {'some': ['value']}


def solver_statistics_check_parser():
"""
Build the curve check parser

Build the solver statistics parser

@paveltomin
Contributor

I recently saw this:

********************************************************************************
Error: /Problem/Solvers/linearElasticity/SolverStatistics/numSuccessfulNonlinearIterations/__values__
	Scalar values of types int32 and int32 differ: 91, 58.
********************************************************************************
********************************************************************************
Error: /Problem/Solvers/linearElasticity/SolverStatistics/numSuccessfulLinearIterations/__values__
	Scalar values of types int32 and int32 differ: 91, 58.
********************************************************************************

in the restartcheck outputs, and I am wondering whether that mechanism could be extended instead of parsing the log?
