# rocALUTION GPU solvers

### Description 
Tests of several iterative linear solvers and preconditioners from the rocALUTION library (https://github.com/ROCmSoftwarePlatform/rocALUTION). These solvers run either on an AMD GPU or on the CPU.

The test cases are run with TRUST both sequentially and in parallel on several cores.

NB: They will be fully available in TRUST version 1.9.1.

In [None]:
from trustutils import run
run.TRUST_parameters()
run.introduction('Pierre LEDAC (CEA/DES/ISAS/DM2S/STMF/LGLS)')
# Create the data sets
NP=4
seuil="seuil 1.e-4 impr"
cases=[
       #("gcp_ssor"          ,"CG/SSOR"         ,"gcp  { precond ssor { omega 1.6 } %s }" % seuil), 
       #("gc"            ,"CG"             ,"rocalution gcp      { precond null   { } %s }" % seuil),
       #("gcp_jacobi"    ,"CG/Jacobi"      ,"rocalution gcp      { precond jacobi { } %s }" % seuil),
       #("bicgstab_multicoloredgs" ,"BiCGStab/MulticoloredGS","rocalution bicgstab { precond multicoloredgs { } %s }" % seuil),
       #("bicgstab_ilu0" ,"BiCGStab/ILU(0)","rocalution bicgstab { precond ilu    { level 0 } %s }" % seuil),
       #("fgcp_pairwiseamg" ,"FCG/Pairwise-AMG","rocalution fgcp { precond pairwiseamg { } %s }" % seuil), # flexible variant crashes
       ("gcp_pairwiseamg"  ,"CG/Pairwise-AMG" ,"rocalution gcp  { precond pairwise-amg { } %s }" % seuil), 
       ("gcp_uaamg"        ,"CG/UA-AMG"       ,"rocalution gcp  { precond ua-amg       { } %s }" % seuil),
       ("gcp_saamg"        ,"CG/SA-AMG"       ,"rocalution gcp  { precond sa-amg       { } %s }" % seuil),
       ("bicgstab_gs"      ,"BiCGStab/GS"     ,"rocalution bicgstab { precond gs       { } %s }" % seuil),
       #("bicgstab_sgs"     ,"BiCGStab/SGS"    ,"rocalution bicgstab { precond sgs      { } %s }" % seuil), # diverges in parallel
      ]
run.reset()
run.initCaseSuite()
for case,label,syntax in cases:
    # Create test case:
    run.executeCommand("cas=%s;mkdir -p $cas;cd $cas;cp ../base.data $cas.data;ln -s -f ../post_run ." % case, verbose=False)
    cas = run.addCase(case,"%s.data" % case)
    cas.substitute("_solveur_",syntax)
    # Create a parallel test case:
    run.executeCommand("cas=%s;cd $cas;make_PAR.data $cas %s;exit 0" % (case,NP), verbose=False)
    cas = run.addCase(case,"PAR_%s.data" % case, NP)
    
run.printCases()

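To make the case setup concrete, here is a minimal sketch of how one solver entry expands and is injected into a data file. It assumes `cas.substitute` behaves like a plain text replacement of the `_solveur_` placeholder; the one-line `template` is a hypothetical stand-in for `base.data`, whose real content is not shown here.

```python
# Threshold and print option shared by every solver entry
seuil = "seuil 1.e-4 impr"

# One entry of the `cases` list, expanded with the shared options
syntax = "rocalution gcp { precond sa-amg { } %s }" % seuil

# Hypothetical one-line stand-in for base.data (assumption, not the real file)
template = "solveur_pression _solveur_"

# Assumed behaviour of cas.substitute: plain text replacement of the placeholder
expanded = template.replace("_solveur_", syntax)
print(expanded)
```
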
In [None]:
run.runCases()

# Convergence

In [None]:
from trustutils import plot
    
a = plot.Graph("Relative residual ||Ax(it)-b||/||Ax(0)-b|| during the first time step:","",1,1,[10,5])

for case,label,syntax in cases:
    cols = plot.loadText(case+"/%s.res" % case)
    a.add(cols[0],cols[1],label="%s" % label, marker='-')
    cols = plot.loadText(case+"/PAR_%s.res" % case)
    a.add(cols[0],cols[1],label="%s (%s MPI cores)" % (label,NP), marker='o')

a.label("Iteration","Residual")
a.subplot.set_yscale('log')

Although the fastest sequential convergence is obtained with the SA-AMG multigrid preconditioner, the pairwise aggregation multigrid is currently the most interesting choice, since it is the only multigrid preconditioner implemented in parallel in rocALUTION.
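
The relative residual ||Ax(it)-b||/||Ax(0)-b|| plotted above can be illustrated with a small standalone sketch. This is not the rocALUTION or TRUST implementation: it is a plain-Python Jacobi-preconditioned conjugate gradient applied to a 1D Poisson matrix, chosen here as a hypothetical stand-in for the pressure matrix. The names `matvec`, `dot`, `jacobi_pcg` and the tolerance `tol` are assumptions made for the example.

```python
import math

def matvec(x):
    """Return A x for the 1D Poisson matrix tridiag(-1, 2, -1)."""
    n = len(x)
    return [2.0 * x[i]
            - (x[i - 1] if i > 0 else 0.0)
            - (x[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def dot(u, v):
    """Euclidean dot product of two lists."""
    return sum(a * b for a, b in zip(u, v))

def jacobi_pcg(b, tol=1e-4, max_iter=200):
    """Jacobi-preconditioned CG; returns x and the relative residual history."""
    x = [0.0] * len(b)
    r = list(b)                      # r = b - A x(0), with x(0) = 0
    z = [ri / 2.0 for ri in r]       # Jacobi: divide by diag(A) = 2
    p = list(z)
    rz = dot(r, z)
    r0 = math.sqrt(dot(r, r))        # ||A x(0) - b||
    history = [1.0]
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        history.append(math.sqrt(dot(r, r)) / r0)
        if history[-1] < tol:        # stop on the relative residual
            break
        z = [ri / 2.0 for ri in r]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, history

b = [1.0] * 50
x, history = jacobi_pcg(b)
for it, res in enumerate(history):
    print(it, res)
```

The printed pairs have the same meaning as the two columns plotted in the graph: the iteration number and the residual normalized by its initial value.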

# Memory used

In [None]:
a = plot.Graph("Max RAM per core used during calculation:","",1,1,[10,5])
for case,label,syntax in cases:
    cols = plot.loadText(case+"/%s.ram" % case)
    a.add(cols[0],cols[1],label="%s" % label)
    cols = plot.loadText(case+"/PAR_%s.ram" % case)
    a.add(cols[0],cols[1],label="%s (%s MPI cores)" % (label,NP), marker='-o')
a.label("Time [s]","RAM [MB]")

# CPU time evolution

In [None]:
a = plot.Graph("CPU time of pressure solve during calculation:","",1,1,[10,5])
for case,label,syntax in cases:
    cols = plot.loadText(case+"/%s.cpu" % case)
    a.add(cols[0],cols[1],label="%s" % label)
    cols = plot.loadText(case+"/PAR_%s.cpu" % case)
    a.add(cols[0],cols[1],label="%s (%s MPI cores)" % (label,NP), marker='-o')
a.label("Time step","CPU [s]")
a.scale(yscale='log')
#a.subplot.set_xticks(range(1,6))

The fastest solver in parallel is CG with the Pairwise-AMG preconditioner. Note that the SA-AMG preconditioner is fast only in sequential, and that using the LU solver as the coarse grid solver (only available for SA-AMG and UA-AMG) improves CPU times.

### Computer performance:

In [None]:
# run.plotTU()
run.tablePerf()

# Conclusions

XXX