Performance testing via chron jobs #151

Open
jpmorgan98 opened this issue Jan 25, 2024 · 3 comments
@jpmorgan98
Collaborator

The test/performance directory currently contains only one test. I am thinking we start with that test on a given number of Quartz nodes and go from there, but I've got some other questions:

  • Are we going to compare the solutions for correctness? If so, where do we store that tally data (which might be massive)?
  • What are the jobs (C5G7, pulsed sphere)?
  • What job parameters are we shooting for (machine, # of nodes, # of MPI ranks, Numba vs. Python mode, etc.)?
  • Where will we store the job runtimes (I say another GitHub repo in CEMeNT)?
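
One way to make the job-parameter question concrete is to enumerate the combinations explicitly. The following is only a sketch: the `PerfJob` class, the `build_job_matrix` helper, and every value passed to it (problem names, node counts, ranks per node) are hypothetical placeholders, not decisions made in this thread.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class PerfJob:
    """One performance run; all fields are illustrative assumptions."""
    problem: str
    machine: str
    nodes: int
    ranks_per_node: int
    mode: str  # "numba" or "python"

    @property
    def total_ranks(self) -> int:
        # Total MPI ranks across all nodes of the job.
        return self.nodes * self.ranks_per_node


def build_job_matrix(problems, machines, node_counts, ranks_per_node, modes):
    """Return one PerfJob per combination of the given parameters."""
    return [
        PerfJob(p, m, n, r, mode)
        for p, m, n, r, mode in product(
            problems, machines, node_counts, ranks_per_node, modes
        )
    ]


# Placeholder values only; the real choices are still open in this thread.
jobs = build_job_matrix(
    problems=["c5g7", "pulsed_sphere"],
    machines=["quartz"],
    node_counts=[1, 2, 4],
    ranks_per_node=[36],
    modes=["numba", "python"],
)
print(len(jobs), "jobs")
```

Listing the matrix up front makes the cost of each added parameter visible, since every new value multiplies the number of runs.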
jpmorgan98 added the hpc label (Issues relating to HPC deployments) on Jan 25, 2024
@ilhamv
Collaborator

ilhamv commented Jan 25, 2024

I think we don't need to check for correctness, as that is covered by regression and verification tests. But we still need to run problems with huge tally sizes.

For the runtime record, I think we can store it as a text file in the repo. We will also add a Python script that generates plots of the runtime record. As I think more about it, we may want to exclude the performance test from automatic testing. Instead, we should run it manually and commit new runtime records as a PR as needed.
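
A minimal sketch of what a text-file runtime record could look like, assuming a CSV layout. The column names in `FIELDS` and the helpers `append_record` and `load_series` are hypothetical and not part of the repository. The plotting step is left out; the grouped series returned by `load_series` could be passed to any plotting library.

```python
import csv
from collections import defaultdict
from pathlib import Path

# Assumed columns for the runtime record; adjust to the agreed metrics.
FIELDS = ["date", "commit", "problem", "nodes", "mode", "runtime_s"]


def append_record(path, record):
    """Append one runtime record to a CSV text file, writing the header once."""
    path = Path(path)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(record)


def load_series(path):
    """Group records into {(problem, mode, nodes): [(date, runtime_s), ...]}."""
    series = defaultdict(list)
    with Path(path).open(newline="") as f:
        for row in csv.DictReader(f):
            key = (row["problem"], row["mode"], int(row["nodes"]))
            series[key].append((row["date"], float(row["runtime_s"])))
    return dict(series)
```

Because the file is append-only plain text, each new set of runtimes shows up as a small, reviewable diff in the PR that commits it.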

I'll make lists of the test problems and the performance metrics and post them here (going to refer to our "metrics of victory").

@ilhamv
Collaborator

ilhamv commented Jan 25, 2024

Performance test problems

SMR

  • k-eigenvalue
  • Four-phase transient
  • Off-critical transient

C5G7

  • k-eigenvalue
  • Four-phase transient
  • Off-critical transient

Transient shielding

  • Kobayashi dog-leg
  • Pulsed sphere

Burst transient

  • Dragon

@jpmorgan98
Collaborator Author

jpmorgan98 commented Jan 26, 2024

Got it! For this point:

"As I think more about it, we may want to exclude the performance test from automatic testing."

We could do this as nightly or weekly jobs. This would also help burn some cycles on the machines we have access to.

Also, any idea how many nodes/threads we want to use for each of these?
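
For the nightly or weekly option, a scheduled job could call a small timing wrapper like the sketch below. This assumes cron (or a similar scheduler) is available on the target machine, which has not been confirmed here. The `time_command` helper, the script path in the crontab comment, and the stand-in workload are all hypothetical.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return time.perf_counter() - start


# Example crontab line for a weekly run (Sundays at 02:00); the script
# path is a placeholder:
#   0 2 * * 0 python /path/to/run_perf.py
if __name__ == "__main__":
    # Stand-in workload; a real job would launch a performance test instead.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"runtime: {elapsed:.3f} s")
```

Note that this measures the whole process, including interpreter start-up and any compilation, so it is only a coarse end-to-end number.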
