Home
Welcome to the TeaLeaf_ref wiki!
4/11/14
Adding new test problems with pass criteria.
Test 2 takes about 20 seconds on a 16-core Sandy Bridge machine, running 4 OpenMP threads per rank with 4 MPI ranks.
I changed eps to 1.0e-15 and it now takes about 33 seconds.
Flat OpenMP is slow, so I need to look at first-touch and NUMA issues.
13/11/15 - Richard P. Smedley-Stevenson
Commit 338 on branch dev/deflation is functionally correct, provided we uncomment the line "inner_use_ppcg = .TRUE.". As we increase the size of the deflation space (i.e. tiles_per_task*tasks), performance becomes limited by the cost of the internal halo exchanges between patches. There is a sweet spot where the gains from faster compute on the fine grid and fewer outer iterations are counterbalanced by the additional costs of the halo exchanges and coarse grid solves.
16/11/15 - Richard P. Smedley-Stevenson
Hopefully I've found all the issues so the code is functionally correct! The two-level scheme test problem has not been optimised and the parameters should be varied as specified below.
The default test problem can be modified to use different numbers of tiles:
- total_tiles=integer, which must be >= 1 and divisible by the number of MPI ranks (square tiles will work best)
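The constraint on total_tiles can be checked before launching a run. The following is a minimal Python sketch, not part of TeaLeaf itself; the function names `tiles_per_task` and `valid_total_tiles` are hypothetical helpers introduced here for illustration.

```python
def tiles_per_task(total_tiles: int, mpi_ranks: int) -> int:
    """Return the number of tiles each MPI rank would own.

    Hypothetical helper: raises ValueError if total_tiles breaks the
    rule stated above (>= 1 and divisible by the number of MPI ranks).
    """
    if mpi_ranks < 1:
        raise ValueError("mpi_ranks must be >= 1")
    if total_tiles < 1:
        raise ValueError("total_tiles must be >= 1")
    if total_tiles % mpi_ranks != 0:
        raise ValueError(
            f"total_tiles={total_tiles} is not divisible by {mpi_ranks} MPI ranks"
        )
    return total_tiles // mpi_ranks


def valid_total_tiles(mpi_ranks: int, max_tiles: int) -> list[int]:
    """List every allowed total_tiles value up to max_tiles (inclusive)."""
    if mpi_ranks < 1:
        raise ValueError("mpi_ranks must be >= 1")
    # Multiples of mpi_ranks are exactly the values that are >= 1 and divisible.
    return list(range(mpi_ranks, max_tiles + 1, mpi_ranks))


if __name__ == "__main__":
    # With 4 MPI ranks, list the candidate tile counts up to 16.
    for total in valid_total_tiles(4, 16):
        print(f"total_tiles={total} -> {tiles_per_task(total, 4)} tiles per task")
```

Note that total_tiles is the product tiles_per_task*tasks, so each candidate value here is also a candidate size for the deflation space.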
We can also specify the tolerance and maximum number of iterations for the coarse grid solve:
- coarse_solve_eps=real e.g. 0.1
- coarse_solve_max_iters=integer e.g. 10
These parameters should be optimised to minimise the total run time.
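One simple way to tune the coarse grid solve is a small grid search over the two parameters. The sketch below is an assumption about workflow, not something the code provides: `sweep_settings` and `fastest` are hypothetical helpers, the candidate values are placeholders, and running each case and collecting its wall-clock time is left to the user.

```python
from itertools import product


def sweep_settings(eps_values, max_iters_values):
    """Yield the pair of parameter lines for every combination to try.

    Hypothetical helper: the key=value form follows the parameter names
    listed above; where the lines go in the input file is not assumed here.
    """
    for eps, max_iters in product(eps_values, max_iters_values):
        if eps <= 0:
            raise ValueError("coarse_solve_eps must be positive")
        if max_iters < 1:
            raise ValueError("coarse_solve_max_iters must be >= 1")
        yield [
            f"coarse_solve_eps={eps}",
            f"coarse_solve_max_iters={max_iters}",
        ]


def fastest(timings):
    """Return the (eps, max_iters) key with the smallest measured run time.

    `timings` maps (eps, max_iters) to a wall-clock time in seconds that
    the user has measured; no timings are supplied by this sketch.
    """
    if not timings:
        raise ValueError("no timings supplied")
    return min(timings, key=timings.get)


if __name__ == "__main__":
    # Placeholder candidate values; choose ranges that suit the problem.
    for lines in sweep_settings([0.1, 0.01], [10, 20]):
        print(" ".join(lines))
```

Once each combination has been run, passing the measured times to `fastest` picks out the setting to keep.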