Skip to content

Conversation

@jakob-fritz
Copy link
Collaborator

Splitting tests from base-environment into separate jobs to increase parallelism and speed

@jakob-fritz jakob-fritz self-assigned this Apr 29, 2024
@jakob-fritz
Copy link
Collaborator Author

@brownbaerchen is it on purpose, that the tests for monodomain are marked with "monodomain" instead of e.g. "base"? See all test-files in directory. E.g. this one

I was surprised and assume, that those tests won't run, because currently only tests for base or other keywords will run.

However, we can also "invert" the decision and I run all tests in projects, that are not marked with any of the keywords, that need another environment-file

@jakob-fritz
Copy link
Collaborator Author

Ignore question above. I saw, that monodomain needs packages, that are not included in "base" environment. Therefore, I created a separate env and moved the monodomain tests, so that they don't run with the base-env (and fail)

@brownbaerchen
Copy link
Contributor

The libpressio and monodomain tests need compiled code. That's why they're treated separately already and are not part of the matrix. You can just leave them as they are.

@jakob-fritz
Copy link
Collaborator Author

@brownbaerchen do you know, why this test on SDC_showdown always fails?

I tries to convert a float into a matrix, but I don't see what i changed that triggered this error.

@brownbaerchen
Copy link
Contributor

Cool! Can you add a short description of how to add a new project in the contribution guide.
The steps are:

  • Add a folder in pySDC/projects
  • Add a non-empty environment update file called env_update.yml in the project directory
  • Add project name to the GitHub pipeline
  • Add tests to a directory called tests inside the project folder

That's it, right? I am wondering if we can forgo the need by the user to add the project name to the pipeline. Can we just run the tests for all directories in pySDC/projects except ones we specifically exclude?

@brownbaerchen
Copy link
Contributor

@brownbaerchen do you know, why this test on SDC_showdown always fails?

I tries to convert a float into a matrix, but I don't see what i changed that triggered this error.

Hm. I think @pancetta wrote this test and probably knows best. The petsc environment limits the version to petsc4py<3.20. Have you tried that?

@jakob-fritz
Copy link
Collaborator Author

The petsc environment limits the version to petsc4py<3.20. Have you tried that?

Thank you. That fixed it.

@jakob-fritz
Copy link
Collaborator Author

One job fails for python 3.12. The issue arises for higher number of Threads (5 and 8) but does not arise for older versions of python <=3.11 (or lower numbers of threads).
See log here
This is not a one-time-issue, but occured at least twice for two pushes.

The issue seems to come from using a different MPI. When comparing the installation logs, in Python 3.11 impi (sometimes listed as impi_rt) is used. In Python 3.12 OpenMPI is installed instead. Maybe these two packages have different default options (maybe for oversubscription or the likes?)

Do you @brownbaerchen want to have a look there, or shall I dig in, but that will take more time, as I haven't worked with the pySDC code itself until now.

@pancetta
Copy link
Member

pancetta commented May 3, 2024

Looks like a oversubscription issue which is indeed fixed by either specifying a parameter during mpirun or by using another MPI version (I think MPICH does not have this problem).

@jakob-fritz
Copy link
Collaborator Author

Looks like a oversubscription issue which is indeed fixed by either specifying a parameter during mpirun or by using another MPI version (I think MPICH does not have this problem).

Yes, adding mpich in the requirements solved it. Thank you @pancetta for the pointer!

@brownbaerchen
Copy link
Contributor

Looking good! Testing the past 5 versions is perhaps a bit excessive. Which ones do you want to keep @pancetta?

@pancetta
Copy link
Member

pancetta commented May 3, 2024

Looking here, v3.8 to v3.12 would be great to have. I'd be fine dropping the oldest one, though.

@pancetta
Copy link
Member

pancetta commented May 3, 2024

So, runtime of the tests went up to 1h.. is that intended?

@jakob-fritz
Copy link
Collaborator Author

jakob-fritz commented May 3, 2024

No it is not. This is only for the project "parallelSDC" and in combination with different versions of matplotlib. When I limit the version, the duration goes down to 10-20 minutes. But fails for Python 3.11 and Python 3.12., because those older versions of matplotlib are not compatible with the later python versions.

I am currently at it and try to find where this issue with matplotlib comes from

@pancetta
Copy link
Member

pancetta commented May 3, 2024

Ahhh, yes, I recall that I did not pursue 3.12 because Matplotlib was causing trouble... right.

@brownbaerchen
Copy link
Contributor

Hm. That's annoying. What about testing projects only up to 3.10 and the rest with everything? Since only the projects do plots (right? Tutorials..?) we don't need to worry about matplotlib for the rest.

As I could not find and fix the specific reason for the issue
@jakob-fritz
Copy link
Collaborator Author

I added a statement to skip the combination of parallelSDC and Python 3.11 or Python 3.12. So all other projects are tested with 5 versions of python and the project parallelSDC is tested with "only" 3 versions of python. By this, we can cover as much as possible for the other projects, while keeping the version-limits for matplotlib

@brownbaerchen
Copy link
Contributor

By chance, the relative paths for the plots still work out. So the website and coverage report look like before. Time is down to ~30 mins from ~45. More Python versions including the latest 3.12 are tested.

To get time down further, we would have to rewrite some tests, mainly in the DAE, resilience and the parallelSDC_reloaded projects. But as they generate some plots, this may be difficult and definitely shouldn't be part of this PR.

I am happy now. Any more requests, @pancetta?

@pancetta
Copy link
Member

pancetta commented May 6, 2024

Ok, let me try to understand this: each project now has a separate environment file, yes? And then there are environment files for the tutorials (and other stuff?) in the env folder? Does it make sense to keep those separate? Why not do this consistently?

@jakob-fritz
Copy link
Collaborator Author

Yes, you got this right, @pancetta. The environment-file for each project is located in the directory of that project. Furthermore, there are more general environment-files (e.g. for tutorial, the tests outside projects and with testing-dependencies in general) in the etc directory. We could move the env-files of the projects also there, but then the projects are less self-contained. If we keep the environment-files in the project-dirs, the environment-files are in two places (conceptually; of course, many more, since the files are in the project-dirs).

@brownbaerchen
Copy link
Contributor

brownbaerchen commented May 6, 2024

Ok, let me try to understand this: each project now has a separate environment file, yes? And then there are environment files for the tutorials (and other stuff?) in the env folder? Does it make sense to keep those separate? Why not do this consistently?

I am not sure what the best strategy is here. I think we should keep the four environments we had earlier because many (hundreds of thousands) users of pySDC probably don't care about existing projects. I suggest leaving them where they are and leaving the project environment files in the projects to keep them self-contained as Jakob says.
We can put more environment files in the tutorial directories if you want. But maybe it get's a bit out of hand at that point? Because if I remember correctly, some tutorials in the same folder use MPI and some don't, for instance.

@pancetta
Copy link
Member

pancetta commented May 6, 2024

Alright, there is probably no single great solution to this. I really appreciate that the projects are now more self-contained and that the tests are faster/more modular. Any famous last words here, @brownbaerchen @jakob-fritz ?

@jakob-fritz
Copy link
Collaborator Author

Alright, there is probably no single great solution to this. I really appreciate that the projects are now more self-contained and that the tests are faster/more modular. Any famous last words here, @brownbaerchen @jakob-fritz ?

Not last words: Please don't merge yet. I am adding dependencies for the documentation (did not raise errors, but only warnings in the logs, I saw this morning).

@jakob-fritz
Copy link
Collaborator Author

jakob-fritz commented May 6, 2024

Now, the documentation has all needed dependencies installed.
I just pushed a minor change, where I removed print-statements in the CI, that I added earlier today for debugging.

Once this is done, I am fine with squash and merge

(Not so famous) last words @pancetta : Fire at will

@jakob-fritz jakob-fritz marked this pull request as ready for review May 6, 2024 08:22
@brownbaerchen brownbaerchen merged commit 8c0bcc5 into Parallel-in-Time:master May 6, 2024
@jakob-fritz jakob-fritz deleted the split_project_tests branch May 6, 2024 09:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants