# Linear Solver Performance: XDG Poisson, MPI-parallel

### Part 1, Benchmark Setup and Execution

Console.WriteLine("Execution Date/time is " + DateTime.Now);

In [1]:
#r "BoSSSpad.dll"
//#r "../../../../src/L4-application/BoSSSpad/bin/Debug/net5.0/BoSSSpad.dll"
using System;
using System.Collections.Generic;
using System.Linq;
using ilPSP;
using ilPSP.Utils;
using BoSSS.Platform;
using BoSSS.Platform.LinAlg;
using BoSSS.Foundation;
using BoSSS.Foundation.XDG;
using BoSSS.Foundation.Grid;
using BoSSS.Foundation.Grid.Classic;
using BoSSS.Foundation.Grid.RefElements;
using BoSSS.Foundation.IO;
using BoSSS.Solution;
using BoSSS.Solution.Control;
using BoSSS.Solution.GridImport;
using BoSSS.Solution.Statistic;
using BoSSS.Solution.Utils;
using BoSSS.Solution.AdvancedSolvers;
using BoSSS.Solution.Gnuplot;
using BoSSS.Application.BoSSSpad;
using BoSSS.Application.XNSE_Solver;
using BoSSS.Application.XNSFE_Solver;
using static BoSSS.Application.BoSSSpad.BoSSSshell;
Init();


In [2]:
using BoSSS.Application.XdgPoisson3;

In [3]:
string PROJECT_NAME = System.Environment.GetEnvironmentVariable("LinslvPerfPar") ?? "LinslvPerfPar"; // the project name can be overridden through an environment variable, for testing purposes
wmg.Init(PROJECT_NAME);
wmg.SetNameBasedSessionJobControlCorrelation();
wmg.AllJobs

Project name is set to 'WIP-k-LinslvPerfPar'.
Opening existing database 'X:\jenkins\databases\WIP-k-LinslvPerfPar'.




## Utility definitions

In [4]:
static class Utils {
    // number of DOFs per cell in 3D for polynomial degree p, i.e. (p+1)(p+2)(p+3)/6
    public static int Np(int p) {
        return (p*p*p + 6*p*p + 11*p + 6)/6;
    }    
    
    /*
    //Non-equidistant nodes
    public static double[] SinLinSpacing(double l, double r, double a, int n) {
        double[] linnodes = GenericBlas.Linspace(-Math.PI * 0.5, Math.PI * 0.5, n);
        double[] linnodes2 = GenericBlas.Linspace(-1, 1, n);
        double[] nodes = new double[n];

        for (int i = 0; i < n; i++)
            //nodes[i] = linnodes2[i] * (1 - a) + (1.0 - Math.Sin(linnodes[i])) * a;
            nodes[i] = linnodes2[i] * (1 - a) + Math.Sin(linnodes[i])*a;

        for (int i = 0; i < n; i++)
            nodes[i] = nodes[i] * (r - l)*0.5 + l;
        return nodes;
    }*/
}
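
The closed form in `Utils.Np` is the expanded version of $(p+1)(p+2)(p+3)/6$, the number of monomials $x^a y^b z^c$ with $a+b+c \le p$. As a quick sanity check, the following sketch compares the formula against a brute-force count for the polynomial degrees used in this study; `CountMonomials` is a helper introduced here for illustration only and is not part of BoSSS.

```csharp
using System;

// Closed form from the worksheet: (p^3 + 6p^2 + 11p + 6)/6 = (p+1)(p+2)(p+3)/6.
int Np(int p) => (p*p*p + 6*p*p + 11*p + 6) / 6;

// Hypothetical helper: counts all exponent triples (a,b,c) with a + b + c <= p.
int CountMonomials(int p) {
    int n = 0;
    for (int a = 0; a <= p; a++)
        for (int b = 0; a + b <= p; b++)
            for (int c = 0; a + b + c <= p; c++)
                n++;
    return n;
}

foreach (int p in new[] { 2, 3, 5 }) {
    Console.WriteLine($"p = {p}: Np = {Np(p)}, counted = {CountMonomials(p)}");
}
```

For the degrees 2, 3 and 5, both functions give 10, 20 and 56 DOFs per cell, respectively.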

## Init grids and save to database

In [5]:
wmg.Grids

Opening existing database 'C:\BoSSS-Databases\WIP-k-LinslvPerfPar'.


#0: { Guid = 4276c806-f69c-43e1-86ef-1ce920cdc92b; Name = WIP-k-LinslvPerfPar-SIP_Poisson_J10485760; Cell Count = 10485760; Dim = 3 }
#1: { Guid = 15ba8bcc-a9f1-4cb7-a605-f30461111732; Name = WIP-k-LinslvPerfPar-SIP_Poisson_J1310720; Cell Count = 1310720; Dim = 3 }
#2: { Guid = a978fef8-2202-4332-acb3-ef6cd17ffb0b; Name = WIP-k-LinslvPerfPar-SIP_Poisson_J552960; Cell Count = 552960; Dim = 3 }
#3: { Guid = ea4a0f29-66d2-492c-b11f-aa95bb86c947; Name = WIP-k-LinslvPerfPar-SIP_Poisson_J163840; Cell Count = 163840; Dim = 3 }
#4: { Guid = 26992ede-1732-4424-a72f-153ea7b6f643; Name = WIP-k-LinslvPerfPar-SIP_Poisson_J69120; Cell Count = 69120; Dim = 3 }
#5: { Guid = 81c39bfd-bc9a-4fbc-af54-bbd1a581d37a; Name = WIP-k-LinslvPerfPar-SIP_Poisson_J20480; Cell Count = 20480; Dim = 3 }
#6: { Guid = 9a2c6da0-49a0-4574-ac5d-1c6228038000; Name = WIP-k-LinslvPerfPar-SIP_Poisson_J2560; Cell Count = 2560; Dim = 3 }
#7: { Guid = 84cadb65-7876-441a-b4d3-017dd01d62f9; Name = WIP-k-LinslvPerfPar-SIP_Poi

Create meshes in various resolutions:
- domain $\Omega = (-1,1)^3$; 
- a Dirichlet boundary condition is set everywhere; from a numerical point of view,
  the challenge of this benchmark is the 1:1000 ratio in the diffusion coefficient
In [None]:
int[] Resolutions_3D = new int[] { 4, 8, 16, 32, 64, 128, 256, 512 };
IGridInfo[] grids = new IGridInfo[Resolutions_3D.Length];
for(int cnt = 0; cnt < Resolutions_3D.Length; cnt++) {
    int Res = Resolutions_3D[cnt];    
    
    double[] xNodes = GenericBlas.Linspace(-1, +1, Res + 1);
    double[] yNodes = GenericBlas.Linspace(-1, +1, Res + 1);
    double[] zNodes = GenericBlas.Linspace(-1, +1, Res + 1);
    int J = (xNodes.Length - 1)*(yNodes.Length - 1)*(zNodes.Length - 1);
    
    string GridName = wmg.CurrentProject + "-XdgPoisson_J" + J;
    
    grids[cnt] = wmg.Grids.SingleOrDefault(grd => grd.Name.Contains(GridName)); // check if an appropriate grid is already present in the database
    if(grids[cnt] == null){
        Console.WriteLine("Creating grid with " + J + " cells.");
        
        GridCommons g;
        g      = Grid3D.Cartesian3DGrid(xNodes, yNodes, zNodes);
        g.Name = GridName;
        
        g.DefineEdgeTags(delegate (double[] X) {
            return "Dirichlet";
        });
      
        g = wmg.SaveGrid(g);  
        grids[cnt] = g;
    } else {
        Console.WriteLine("Found Grid: " + grids[cnt]);
        if(grids[cnt].NumberOfCells != J)
            throw new Exception("J mismatch");
        
        if(grids[cnt].SpatialDimension != 3)
            throw new Exception("D mismatch");
    }
}

Found Grid: { Guid = 74a506df-42b2-4bde-a5c8-028de5262648; Name = WIP-k-LinslvPerfPar-XdgPoisson_J64; Cell Count = 64; Dim = 3 }
Found Grid: { Guid = 9d05b26d-a0bb-4f23-863a-e7624ab0cf79; Name = WIP-k-LinslvPerfPar-XdgPoisson_J512; Cell Count = 512; Dim = 3 }
Found Grid: { Guid = e0c3502c-2d3e-4f9a-9396-57c9f05434e3; Name = WIP-k-LinslvPerfPar-XdgPoisson_J4096; Cell Count = 4096; Dim = 3 }
Found Grid: { Guid = 79c05d5e-c840-444c-ad42-7da0e60c3724; Name = WIP-k-LinslvPerfPar-XdgPoisson_J32768; Cell Count = 32768; Dim = 3 }
Found Grid: { Guid = 3bc2997c-0f50-440d-991a-f5489919819e; Name = WIP-k-LinslvPerfPar-XdgPoisson_J262144; Cell Count = 262144; Dim = 3 }
Creating grid with 2097152 cells.
Grid Edge Tags changed.
Creating grid with 16777216 cells.
Grid Edge Tags changed.
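
For orientation, the cell counts that appear in the grid names follow directly from the resolutions: an equidistant grid with `Res` cells per direction on $(-1,1)^3$ has $J = \mathrm{Res}^3$ cells of edge length $h = 2/\mathrm{Res}$. A minimal, BoSSS-independent sketch that tabulates these numbers (the variable names are chosen here for illustration):

```csharp
using System;

int[] resolutions = { 4, 8, 16, 32, 64, 128, 256, 512 };
foreach (int res in resolutions) {
    long J = (long)res * res * res;  // number of cells of a res x res x res grid
    double h = 2.0 / res;            // cell edge length, the domain being (-1,1)^3
    Console.WriteLine($"Res = {res,3}: J = {J,9}, h = {h}");
}
```

The coarsest grid therefore has 64 cells, and each refinement step multiplies the cell count by 8.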


In [None]:
grids

## Setup Control Object for a Solver Run

### Setup of Parameter Study

Polynomial degrees to test:

In [None]:
// polynomial degrees to test
int[] PolyDegS = new int[] {2, 3, 5};

Solvers which we want to instrument:

In [None]:
using BoSSS.Solution.AdvancedSolvers;

In [None]:
// Solvers which we want to instrument:
LinearSolverCode[] solver_nameS = new LinearSolverCode[] {
    LinearSolverCode.exp_Kcycle_schwarz
}; 

Number of processors:

In [None]:
int[] MPIsizes = new int[] { 1, 2, 4, 8, 16, 32, 64, 128 };

Loop over all combinations of parameters and define a control object for each combination:

In [None]:
using BoSSS.Solution.XNSECommon;
using BoSSS.Foundation.XDG;

In [None]:
var controls = new List<(XdgPoisson3Control ctrl, int NoOfProcs)>();
LinearSolverCode solver_name = LinearSolverCode.exp_Kcycle_schwarz;
foreach(int k in PolyDegS) {
foreach(IGridInfo grd in grids) {
foreach(int MPIsize in MPIsizes) {
    int Np = Utils.Np(k);
    int J  = grd.NumberOfCells;
    if(J / MPIsize < 16) {
        // fewer than 16 cells per processor - too low for a multigrid.
        continue; 
    }
    if((long)J*Np/MPIsize > 500000) {
        // not interested in doing more than 500'000 DOFs per processor;
        // 64-bit arithmetic, since J*Np can exceed the int range on the finest grid
        continue;
    } 
    
    var ctrl = new XdgPoisson3Control();
    controls.Add((ctrl, MPIsize));

    string caseName = string.Format("XdgPoisson-J{0}_k{1}_{2}", J, k, solver_name);
    Console.WriteLine("setting up: " + caseName);
    ctrl.SessionName = caseName;
    
    ctrl.SetGrid(grd);
    ctrl.savetodb = true; //for debug's sake
    ctrl.SetDGdegree(k);
    
    ctrl.LinearSolver           = solver_name.GetConfig();
    var isc = ctrl.LinearSolver as IterativeSolverConfig;
    if(isc != null) {
        //Console.WriteLine(isc.ConvergenceCriterion);
        //ctrl.LinearSolver.TargetBlockSize      = Math.Min(J*Np-1,10000);
        isc.ConvergenceCriterion = 1e-8;
    }
    
    double radius           = 0.7;
    ctrl.ExcactSolSupported = false;
    ctrl.InitialValues.Add("Phi", new Formula("X => X[0].Pow2()+X[1].Pow2()+X[2].Pow2()-"+radius+".Pow2()"));
    ctrl.MU_A = -1;
    ctrl.MU_B = -1000;
    ctrl.InitialValues.Add("rhs#A", new Formula("X => 1"));
    ctrl.InitialValues.Add("rhs#B", new Formula("X => 1"));

    //ctrl.CutCellQuadratureType = XQuadFactoryHelper.MomentFittingVariants.Classic;
    //ctrl.SetDefaultDiriBndCnd  = true;
    //ctrl.ViscosityMode         = XLaplace_Interface.Mode.SIP;
        
    ctrl.AgglomerationThreshold = 0.1;
    
    ctrl.NoOfMultigridLevels = 100; // maximum number of multigrid levels to use; actual number will be far lower.
    ctrl.TracingNamespaces = "*";
}
}
}
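
To see which parameter combinations survive the two filters (at least 16 cells and at most 500'000 DOFs per processor), the selection logic can be replayed without any BoSSS objects. The following is a sketch under the assumption that all eight grid resolutions are available; the tuple layout and variable names are chosen here for illustration. The DOF count is evaluated in 64-bit arithmetic, because `J*Np` does not fit into an `int` on the finest grid for the higher degrees.

```csharp
using System;
using System.Collections.Generic;

int Np(int p) => (p*p*p + 6*p*p + 11*p + 6) / 6;  // DOFs per cell in 3D

int[] polyDegS    = { 2, 3, 5 };
int[] resolutions = { 4, 8, 16, 32, 64, 128, 256, 512 };
int[] mpiSizes    = { 1, 2, 4, 8, 16, 32, 64, 128 };

var runs = new List<(int k, long J, int MPIsize, long DofPerProc)>();
foreach (int k in polyDegS) {
    foreach (int res in resolutions) {
        long J = (long)res * res * res;
        foreach (int size in mpiSizes) {
            if (J / size < 16)
                continue;  // fewer than 16 cells per processor
            long dofPerProc = J * Np(k) / size;
            if (dofPerProc > 500000)
                continue;  // more than 500'000 DOFs per processor
            runs.Add((k, J, size, dofPerProc));
        }
    }
}

Console.WriteLine("Number of runs: " + runs.Count);
foreach (var r in runs)
    Console.WriteLine($"k = {r.k}, J = {r.J}, MPI size = {r.MPIsize}, DOFs/proc = {r.DofPerProc}");
```

The listing makes the structure of the study visible: small grids are only run on few processors, while large grids only appear at the high processor counts.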

In [None]:
//string path = @"C:\Users\flori\Documents\BoSSS-kummer\public\src\L4-application\XdgPoisson3\bin\Release\net5.0\bench";
//foreach(var ctrl in controls) {
//    ctrl.savetodb = false;
//    ctrl.SaveToFile(System.IO.Path.Combine(path, "control-" + ctrl.SessionName + ".obj"));
//}

Total number of simulations:

In [None]:
controls.Count

## Launch Jobs

Use the default queue defined on this machine:

In [6]:
ExecutionQueues

index,type,DeploymentBaseDirectory,DeployRuntime,RuntimeLocation,Name,DotnetRuntime,BatchInstructionDir,AllowedDatabasesPaths,Username,Password,ServerName,PrivateKeyFilePath,AdditionalBatchCommands,ExecutionTime,DeploymentBaseDirectoryAtRemote,SlurmAccount,Email,MonoDebug
0,BoSSS.Application.BoSSSpad.MiniBatchProcessorClient,C:\Users\flori\AppData\Local\BoSSS-LocalJobs,False,<null>,LocalPC,dotnet,<null>,"[ C:\BoSSS-Databases, C:\Users\flori ]",,,,,,,,,,
1,BoSSS.Application.BoSSSpad.SlurmClient,X:\jenkins\deploy,False,linux/amd64-openmpi,Lb2-specialPrj-Jenkins,dotnet,,[ X:\jenkins\databases == /work/scratch/fk69umer/jenkins/databases ],fk69umer,<null>,lcluster16.hrz.tu-darmstadt.de,C:\Users\flori\.ssh\id_rsa,"[ #SBATCH -C avx512, #SBATCH --mem-per-cpu=8000 ]",05:00:00,/work/scratch/fk69umer/jenkins/deploy,special00006,<null>,False


In [7]:
var myBatch = GetDefaultQueue();
myBatch

DeploymentBaseDirectory,DeployRuntime,RuntimeLocation,Name,DotnetRuntime,BatchInstructionDir,AllowedDatabasesPaths
C:\Users\flori\AppData\Local\BoSSS-LocalJobs,False,<null>,LocalPC,dotnet,<null>,"[ C:\BoSSS-Databases, C:\Users\flori ]"


In [None]:
foreach((var ctrl, int MPIsize) in controls) {
    Console.WriteLine(" Submitting: " + ctrl.SessionName); 
    var j = ctrl.CreateJob();
    j.RetryCount = 1;
    j.NumberOfMPIProcs = MPIsize;
    j.Activate(myBatch);
    //ctrl.RunBatch();
}

In [None]:
//foreach(var j in wmg.AllJobs.Values)
//    j.DeleteOldDeploymentsAndSessions();

### Wait for Completion and Check Job Status

In [None]:
wmg.BlockUntilAllJobsTerminate(3600*24*2); // wait at most two days for the jobs to finish

In [None]:
wmg.AllJobs

In [None]:
wmg.Sessions.Where(sess => sess.Name.StartsWith("XdgPoisson-J"))

In [None]:
var NoSuccess = controls.Select(c => c.ctrl.GetJob()).Where(job => job.Status != JobStatus.FinishedSuccessful).ToArray();
NoSuccess

In [None]:
// If any job failed, print its deployment directory for further inspection:
foreach(var fail in NoSuccess)
   Console.WriteLine(fail + ":  @" + ((fail.LatestDeployment?.DeploymentDirectory?.FullName) ?? " no deployment directory"));
    //Console.WriteLine(fail.LatestDeployment);

In [None]:
//foreach(var j in NoSuccess)
//    j.DeleteOldDeploymentsAndSessions();

In [None]:
/*
string PathOffset = @"C:\Users\jenkinsci\Desktop\LinSlvPerfFail-20apr22";
foreach(var fail in NoSuccess) {
    var C = fail.GetControl();
    C.savetodb = false;
    C.SaveToFile(System.IO.Path.Combine(PathOffset, fail.Name + ".obj"));
    
    string Stdout = fail.Stdout;
    System.IO.File.WriteAllText(System.IO.Path.Combine(PathOffset, fail.Name + "-stdout.txt"), Stdout);
    
    string Stderr = fail.Stderr;
    System.IO.File.WriteAllText(System.IO.Path.Combine(PathOffset, fail.Name + "-stderr.txt"), Stderr);
}
*/

In [None]:
var FailedSessions = wmg.Sessions.Where(Si => Si.Name.Contains("XdgPoisson") &&
                                        (Si.SuccessfulTermination == false
                                        || Convert.ToInt32(Si.KeysAndQueries["Conv"]) == 0));
FailedSessions

#### Asserting Success:

Remark: since this is currently (22 Apr. 2022) work in progress, we allow some jobs to fail.
At this intermediate milestone, I want to record (by means of tests) what **is already working**.
That way, I hope not to break the working cases while trying to fix the failing ones.

In [None]:
var prelim_allowedFails = new[] { "XdgPoisson-J32768_k5_pMultigrid" };

In [None]:
NUnit.Framework.Assert.Zero(NoSuccess.Where(job => !prelim_allowedFails.Contains(job.Name)).Count(), "Some Jobs Failed");

In [None]:
NUnit.Framework.Assert.Zero(FailedSessions.Where(s => !prelim_allowedFails.Contains(s.Name)).Count(), "Some Sessions did not terminate successfully.");
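
The allow-list pattern used in the two assertions can be illustrated without BoSSS or NUnit. In the following sketch, the case names and outcomes are made up for illustration. Note that `Contains` on a string array compares whole names, so an allow-list entry only takes effect if it matches a job or session name exactly.

```csharp
using System;
using System.Linq;

// Hypothetical job names and outcomes, for illustration only.
var outcomes = new (string Name, bool Success)[] {
    ("case-A", true),
    ("case-B", false),
    ("case-C", false),
};
var allowedFails = new[] { "case-B" };

// Keep only the failures that are not explicitly tolerated.
string[] unexpected = outcomes
    .Where(o => !o.Success)
    .Select(o => o.Name)
    .Where(name => !allowedFails.Contains(name))
    .ToArray();

Console.WriteLine("Unexpected failures: " + string.Join(", ", unexpected));
```

Here only `case-C` is reported, since `case-B` is on the allow-list and `case-A` succeeded.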

Show the output of one job (arbitrarily, the first one):

In [None]:
wmg.AllJobs.First().Value.Stdout