<a href="https://colab.research.google.com/github/robertopsouto/invmultifis_notebooks/blob/main/english/INVMULTIFIS_CSEM3D_tcmalloc.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **INVMULTIFIS Project: Development of multi-physics data inversion software with optimization via artificial intelligence**

The project proposes an innovative inversion technology for characterizing and monitoring deep-water reservoirs for Petrobras (the Brazilian oil company). It is based on CSEM (controlled-source electromagnetic) methods, a robust tool for reducing risk when drilling in oil basins, and uses multiphysics data in the 3D domain. One of the project's main objectives is to develop, optimize, and parallelize CSEM codes to improve their performance.

# Steps to install the CSEM3D program

### Installing GNU Make v4.3 to correctly handle the CSEM3D `makefile`

In [2]:
%%bash
wget https://ftp.gnu.org/gnu/make/make-4.3.tar.gz
tar xfz make-4.3.tar.gz

--2023-08-07 15:55:26--  https://ftp.gnu.org/gnu/make/make-4.3.tar.gz
Resolving ftp.gnu.org (ftp.gnu.org)... 209.51.188.20, 2001:470:142:3::b
Connecting to ftp.gnu.org (ftp.gnu.org)|209.51.188.20|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2317073 (2.2M) [application/x-gzip]
Saving to: ‘make-4.3.tar.gz’

     0K .......... .......... .......... .......... ..........  2%  138K 16s
    50K .......... .......... .......... .......... ..........  4%  275K 12s
   100K .......... .......... .......... .......... ..........  6% 72.5M 8s
   150K .......... .......... .......... .......... ..........  8% 1.26M 6s
   200K .......... .......... .......... .......... .......... 11%  351K 6s
   250K .......... .......... .......... .......... .......... 13% 62.0M 5s
   300K .......... .......... .......... .......... .......... 15%  146M 4s
   350K .......... .......... .......... .......... .......... 17% 1.26M 4s
   400K .......... .......... .......... .......... ..

In [3]:
%%bash
cd make-4.3
./configure
make
make install

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /usr/bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether make supports the include directive... yes (GNU style)
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking dependency style of gcc... gcc3
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /usr/bi

src/main.c: In function ‘main’:
 1938 |       p[-1] = '\0';
      |       ~~~~~~^~~~~~
src/main.c:1935:15: note: destination object of size [0, 9223372036854775807] allocated by ‘quote_for_env’
 1935 |           p = quote_for_env (p, eval_strings->list[i]);
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from src/makeint.h:31,
                 from src/main.c:17:
lib/alloca.h:46:18: note: at offset -1 into destination object of size [0, 9223372036854775807] allocated by ‘__builtin_alloca’
   46 | #  define alloca __builtin_alloca
src/main.c:1930:19: note: in expansion of macro ‘alloca’
 1930 |       p = value = alloca (len);
      |                   ^~~~~~
In file included from src/makeint.h:31,
                 from src/read.c:17:
src/read.c: In function ‘eval_makefile’:
   46 | #  define alloca __builtin_alloca
src/read.c:443:3: note: in expansion of macro ‘alloca’
  443 |   alloca (0);
      |   ^~~~~~
src/read.c: In function ‘eval_buffer’:
   46 | #

In [4]:
%%bash
make -v

GNU Make 4.3
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
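
Before building GNU Make from source, it can help to test whether the installed version is already recent enough. The following is a minimal sketch, assuming GNU `sort` with the `-V` (version sort) option is available; the helper names `version_ge` and `make_version` are illustrative and not part of the project.

```bash
# Return success when version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Extract the version number from the first line of `make -v`,
# e.g. "GNU Make 4.3" -> "4.3".
make_version() {
    make -v 2>/dev/null | awk 'NR == 1 { print $3 }'
}

required=4.3
found=$(make_version)
if [ -n "$found" ] && version_ge "$found" "$required"; then
    echo "make $found is new enough (>= $required)"
else
    echo "make ${found:-not found} does not satisfy >= $required; build it from source"
fi
```

Version sort is used instead of a plain string comparison so that, for example, 4.10 is treated as newer than 4.3.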


## Creating a user, since Open MPI does not recommend running MPI as the `root` user



In [1]:
%%bash
adduser csem

Adding user `csem' ...
Adding new group `csem' (1000) ...
Adding new user `csem' (1000) with group `csem' ...
Creating home directory `/home/csem' ...
Copying files from `/etc/skel' ...
Try again? [y/N] Changing the user information for csem
Enter the new value, or press ENTER for the default
	Full Name []: 	Room Number []: 	Work Phone []: 	Home Phone []: 	Other []: Is the information correct? [Y/n] 

New password: Password change has been aborted.
passwd: Authentication token manipulation error
passwd: password unchanged
Use of uninitialized value $answer in chop at /usr/sbin/adduser line 595.
Use of uninitialized value $answer in pattern match (m//) at /usr/sbin/adduser line 596.
Use of uninitialized value $answer in chop at /usr/sbin/adduser line 625.
Use of uninitialized value $answer in pattern match (m//) at /usr/sbin/adduser line 626.
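
The prompts and warnings above appear because `adduser` expects interactive input that a notebook cell cannot provide. A possible non-interactive variant is sketched below, assuming the Debian/Ubuntu `adduser` with its `--disabled-password` and `--gecos` options; the helper names `user_exists` and `ensure_user` are illustrative.

```bash
# Succeed when the named account already exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Create the account only if it is missing. Must be run as root.
# --disabled-password and --gecos "" (Debian/Ubuntu adduser options)
# suppress the password and contact-detail prompts.
ensure_user() {
    if user_exists "$1"; then
        echo "user $1 already exists"
    else
        adduser --disabled-password --gecos "" "$1"
    fi
}

# Usage (as root): ensure_user csem
```

Checking first makes the cell safe to re-run, since a second `adduser` call for the same name would otherwise fail.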


## PETSC v3.18.4

### Download and extract the source code, then run `configure`

In [None]:
%%bash
su csem
mkdir -p ${HOME}/petsc/gnu
cd ${HOME}/petsc/gnu
wget https://gitlab.com/petsc/petsc/-/archive/v3.18.4/petsc-v3.18.4.tar.gz
tar zxvf petsc-v3.18.4.tar.gz
cd petsc-v3.18.4
./configure \
 --prefix=${PWD}/installdir \
 --with-fortran \
 --with-fortran-kernels=true \
 --with-cuda \
 --download-fblaslapack \
 --with-scalar-type=complex \
 --with-precision=double \
 --with-debugging=0 \
 --with-x=0 \
 --with-gnu-compilers=1 \
 --with-cc=mpicc \
 --with-cxx=mpicxx \
 --with-fc=mpif90 \
 --with-make-exec=make \
 2>&1 | tee ../configure.out
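
The `configure` call above names `mpicc`, `mpicxx`, `mpif90`, and `make` explicitly, so it fails early if any of them is missing. A small pre-flight check could look like the sketch below; `missing_tools` is a hypothetical helper, not part of PETSc.

```bash
# Print every name from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools mpicc mpicxx mpif90 make)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "not found on PATH:" $missing
else
    echo "all tools named in the configure call were found"
fi
```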

### Run the `make all` phase

If it finishes successfully, this message should appear:

```bash

=========================================
Now to install the libraries do:
make PETSC_DIR=/home/csem/petsc/gnu/petsc-v3.18.4 PETSC_ARCH=arch-linux-c-opt install
=========================================
```

In [None]:
%%bash
su csem
cd /home/csem/petsc/gnu/petsc-v3.18.4
make PETSC_DIR=/home/csem/petsc/gnu/petsc-v3.18.4 PETSC_ARCH=arch-linux-c-opt all

### Run the `make install` phase

If it finishes successfully, this message should appear:

```bash

====================================
Install complete.
Now to check if the libraries are working do (in current directory):
make PETSC_DIR=/home/csem/petsc/gnu/petsc-v3.18.4/installdir PETSC_ARCH="" check
====================================
```

In [None]:
%%bash
su csem
cd /home/csem/petsc/gnu/petsc-v3.18.4
make PETSC_DIR=/home/csem/petsc/gnu/petsc-v3.18.4 PETSC_ARCH=arch-linux-c-opt install

### Run the `make check` phase

If it finishes successfully, this message should appear:

```bash

Running check examples to verify correct installation
Using PETSC_DIR=/home/csem/petsc/gnu/petsc-v3.18.4/installdir and PETSC_ARCH=
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
C/C++ example src/snes/tutorials/ex19 run successfully with cuda
Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process
Completed test examples
```

In [None]:
%%bash
su csem
cd /home/csem/petsc/gnu/petsc-v3.18.4
make PETSC_DIR=/home/csem/petsc/gnu/petsc-v3.18.4/installdir PETSC_ARCH="" check
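
Rather than reading the `make check` output by eye, the expected lines can be searched for in a saved log. This is a minimal sketch, assuming the output was captured to a file (the name `check.log` is hypothetical); it only looks for the two strings quoted above and does not prove that every example passed.

```bash
# Succeed when the log holds the closing banner and at least one
# "run successfully" line.
petsc_check_ok() {
    grep -q 'Completed test examples' "$1" &&
        grep -q 'run successfully' "$1"
}

# Print how many examples reported success.
petsc_check_count() {
    grep -c 'run successfully' "$1"
}

# Usage (hypothetical log name):
#   make PETSC_DIR=... PETSC_ARCH="" check 2>&1 | tee check.log
#   petsc_check_ok check.log && echo "passed: $(petsc_check_count check.log) examples"
```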

### Test an example that uses complex numbers (`ex11f.F90`)

In [None]:
%%bash
su csem
cd /home/csem/petsc/gnu/petsc-v3.18.4/src/ksp/ksp/tutorials
make ex11f
mpirun -n 1 ./ex11f -norandom -pc_type none -ksp_monitor_short -ksp_gmres_cgs_refinement_type refine_always

### Compare with the reference output (`output/ex11f_1.out`)

In [None]:
%%bash
su csem
cd /home/csem/petsc/gnu/petsc-v3.18.4/src/ksp/ksp/tutorials
cat output/ex11f_1.out
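
Printing the reference file leaves the comparison to the reader. As an alternative, the run can be captured and compared with `diff`, as in the sketch below. The helper name `matches_reference` and the file `ex11f.tmp` are illustrative; small floating-point differences in the last digits would also be reported as mismatches.

```bash
# Compare a captured run ($1) against a reference file ($2).
# Prints a unified diff and returns non-zero when they differ.
matches_reference() {
    diff -u "$2" "$1"
}

# Usage (hypothetical capture file):
#   mpirun -n 1 ./ex11f -norandom -pc_type none -ksp_monitor_short \
#       -ksp_gmres_cgs_refinement_type refine_always > ex11f.tmp 2>&1
#   matches_reference ex11f.tmp output/ex11f_1.out && echo "output matches"
```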

### Run an example that requires CUDA, following the instructions at https://petsc.org/release/developers/testing/

In [None]:
%%bash
su csem
cd /home/csem/petsc/gnu/petsc-v3.18.4/
make print-test query='suffix' queryval='2_aijcusparse'

In [None]:
%%bash
su csem
cd /home/csem/petsc/gnu/petsc-v3.18.4/
make test search=ksp_ksp_tutorials-ex1_2_aijcusparse

In [None]:
%%bash
su csem
cd /home/csem/petsc/gnu/petsc-v3.18.4/
cat arch-linux-c-opt/tests/ksp/ksp/tutorials/runex1_2_aijcusparse/ksp_ksp_tutorials-ex1_2_aijcusparse.sh

In [None]:
%%bash
su csem
cd /home/csem/petsc/gnu/petsc-v3.18.4/
cd src/ksp/ksp/tutorials/
make ex1
/usr/bin/mpiexec --oversubscribe  -n 1  ./ex1 \
-petsc_ci \
-pc_type sor \
-pc_sor_symmetric \
-ksp_monitor_short \
-ksp_gmres_cgs_refinement_type refine_always \
-mat_type aijcusparse \
-vec_type cuda \
-use_gpu_aware_mpi 0


If it finishes successfully, output like this should appear:
```bash
  0 KSP Residual norm 0.968764
  1 KSP Residual norm 0.361001
  2 KSP Residual norm 0.247329
  3 KSP Residual norm 0.0808915
  4 KSP Residual norm 0.01289
  5 KSP Residual norm 0.00375064
  6 KSP Residual norm 0.000294092
  7 KSP Residual norm 1.40861e-05
  8 KSP Residual norm 3.48863e-07
KSP Object: 1 MPI process
  type: gmres
    restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with one step of iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI process
  type: sor
    type = symmetric, iterations = 1, local iterations = 1, omega = 1.
  linear system matrix = precond matrix:
  Mat Object: 1 MPI process
    type: seqaijcusparse
    rows=10, cols=10
    total: nonzeros=28, allocated nonzeros=28
    total number of mallocs used during MatSetValues calls=0
      not using I-node routines
Norm of error 4.10316e-07, Iterations 8
  0 KSP Residual norm 0.377523
  1 KSP Residual norm 0.0140399
  2 KSP Residual norm 0.000364106
  3 KSP Residual norm 7.83047e-06
  4 KSP Residual norm 1.33045e-07

  ```
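
The iteration counts can be pulled out of a saved copy of this output for a quick check. The sketch below assumes the output was redirected to a file (the name `ex1.log` is hypothetical) and relies only on the `Norm of error ..., Iterations N` line format shown above.

```bash
# Print the iteration count from each "Norm of error ..., Iterations N" line.
ksp_iterations() {
    awk '/^Norm of error/ { print $NF }' "$1"
}

# Usage (hypothetical log name):
#   /usr/bin/mpiexec --oversubscribe -n 1 ./ex1 ... > ex1.log 2>&1
#   ksp_iterations ex1.log
```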

In [None]:
%%bash
su csem
cd /home/csem/petsc/gnu/petsc-v3.18.4/
cd src/ksp/ksp/tutorials/
cat output/ex1_2_aijcusparse.out

### Profiling with `nvprof`

In [None]:
%%bash
su csem
cd /home/csem/petsc/gnu/petsc-v3.18.4/
cd src/ksp/ksp/tutorials/
make ex1
export PATH=/usr/local/cuda/bin:$PATH
/usr/bin/mpiexec --oversubscribe  -n 1 nvprof -f -o ex1.%q{OMPI_COMM_WORLD_RANK}.nvprof ./ex1 \
-petsc_ci \
-pc_type sor \
-pc_sor_symmetric \
-ksp_monitor_short \
-ksp_gmres_cgs_refinement_type refine_always \
-mat_type aijcusparse \
-vec_type cuda \
-use_gpu_aware_mpi 0

### Show profiling obtained with `nvprof`

In [None]:
%%bash
su csem
cd /home/csem/petsc/gnu/petsc-v3.18.4/
cd src/ksp/ksp/tutorials/
export PATH=/usr/local/cuda/bin:$PATH
nvprof -i ex1.0.nvprof

## `CSEM3D` program

### Download the `CSEM3D` source code into the `root` user's working directory

In [5]:
%%bash
wget --no-check-certificate 'https://docs.google.com/uc?export=download&id=10WfuzFuv9bfr9MTeyphTyRM7i9rjGlxf' -O csem3d_w-v1.0.2.tar.gz

--2023-08-07 15:56:15--  https://docs.google.com/uc?export=download&id=10WfuzFuv9bfr9MTeyphTyRM7i9rjGlxf
Resolving docs.google.com (docs.google.com)... 64.233.189.100, 64.233.189.139, 64.233.189.113, ...
Connecting to docs.google.com (docs.google.com)|64.233.189.100|:443... connected.
HTTP request sent, awaiting response... 303 See Other
Location: https://doc-0s-38-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/pj14jdoj6fm99tis4snnqc0qkooaeh42/1691423775000/11313332932477869617/*/10WfuzFuv9bfr9MTeyphTyRM7i9rjGlxf?e=download&uuid=413ea65b-6016-4f14-b2fa-0bdb59243266 [following]
--2023-08-07 15:56:20--  https://doc-0s-38-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/pj14jdoj6fm99tis4snnqc0qkooaeh42/1691423775000/11313332932477869617/*/10WfuzFuv9bfr9MTeyphTyRM7i9rjGlxf?e=download&uuid=413ea65b-6016-4f14-b2fa-0bdb59243266
Resolving doc-0s-38-docs.googleusercontent.com (doc-0s-38-docs.googleusercontent.com)... 64.233.187.132, 2404:6800:
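
Downloads from Google Drive links sometimes return an HTML page instead of the requested file, so it may be worth confirming that the result is a real archive before unpacking it. A minimal sketch, with `is_tarball` as an illustrative helper name:

```bash
# Succeed when the file is a readable gzip-compressed tar archive.
is_tarball() {
    tar tzf "$1" >/dev/null 2>&1
}

# Usage:
#   is_tarball csem3d_w-v1.0.2.tar.gz || echo "download is not a valid tarball"
```

Listing the archive with `tar tzf` reads it end to end, so a truncated download is also caught.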

### Copy the tarball to the `csem` user's home directory and change its owner to `csem`

In [6]:
%%bash
cp csem3d_w-v1.0.2.tar.gz /home/csem/
chown csem:csem /home/csem/csem3d_w-v1.0.2.tar.gz

### Unpacking the tarball file

In [7]:
%%bash
su csem
cd /home/csem
tar zxvf csem3d_w-v1.0.2.tar.gz

csem3d_w-v1.0.2/
csem3d_w-v1.0.2/CSEM3D_W/
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W.sln
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/.vscode/
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/.vscode/launch.json
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/B_Tx_B_Rx.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/B_Tx_E_Rx.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/CSEM3D_W.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/CSEM3D_W.srm
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/CSEM3D_W.u2d
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/CSEM3D_W.vfproj
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/CSEM3D_mod.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/E_Tx_B_Rx.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/E_Tx_E_Rx.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/abs_to_rel.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/addair.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/allvars.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/bipole2.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/biprho.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/blocks.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/bottom.F
csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W/chk_rx_tx.F
csem3d_w-v1.0.

### Run `make all` to build the `CSEM3D` program

If it finishes successfully, this message should appear:

```bash

mpif90 -o CSEM3D_W outfields_E_Tx.o spline.o bottom.o allvars.o intpol1.o biprho.o dimped.o in3dmod.o compute_src_wts.o set_resist_vector.o locals.o kinds.o outfields_B_Tx.o B_Tx_B_Rx.o dimens.o set_P.o set_bv_e.o d1imped.o set_src.o grid.o in3drho.o CSEM3D_mod.o CSEM3D_W.o set_bv_h.o bipole2.o set_A.o chk_rx_tx.o set_1d_resist.o txrx.o splint.o set_rhs.o blocks.o E_Tx_E_Rx.o convres.o abs_to_rel.o B_Tx_E_Rx.o E_Tx_B_Rx.o addair.o -L/home/csem/petsc/gnu/petsc-v3.18.4/installdir/lib -lpetsc  
```

In [10]:
%%bash
su csem
cd /home/csem/csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W
sed 's/FLFLAGS=-L${PETSC_DIR}\/lib -lpetsc/FLFLAGS=-L${PETSC_DIR}\/lib -lpetsc -ltcmalloc/' scripts/makefile_gnu > scripts/makefile_gnu_tcmalloc
cat scripts/makefile_gnu_tcmalloc
make PETSC_DIR=${HOME}/petsc/gnu/petsc-v3.18.4/installdir -f scripts/makefile_gnu_tcmalloc clean

FC=mpif90
FFLAGS=-ffixed-line-length-none -g -O2 -ffpe-trap=zero,invalid,overflow -Wcompare-reals -Wconversion -I${PETSC_DIR}/include

FLFLAGS=-L${PETSC_DIR}/lib -lpetsc -ltcmalloc 

SRCS=$(wildcard *.F)
OBJS=$(patsubst %.F,%.o,$(SRCS))

include make/depend.mk

PROGRAM=CSEM3D_W

all: $(PROGRAM)

# Linker
$(PROGRAM) : $(OBJS)
	$(FC) -o $@ $^ $(FLFLAGS) 

clean:
	rm -rf *.o $(PROGRAM) *.mod

include ${PETSC_DIR}/lib/petsc/conf/variables


scripts/makefile_gnu_tcmalloc:22: /home/csem/petsc/gnu/petsc-v3.18.4/installdir/lib/petsc/conf/variables: No such file or directory
make: *** No rule to make target '/home/csem/petsc/gnu/petsc-v3.18.4/installdir/lib/petsc/conf/variables'.  Stop.


CalledProcessError: ignored
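
The error above occurs because the makefile includes `${PETSC_DIR}/lib/petsc/conf/variables`, which exists only after PETSc has been installed. The sketch below shows a guard for that condition and a shorter form of the `sed` edit that appends `-ltcmalloc`; the helper names `petsc_installed` and `add_tcmalloc` are illustrative.

```bash
# Succeed when PETSc is installed under the given prefix, judged by the
# presence of the file that the makefile includes.
petsc_installed() {
    [ -f "$1/lib/petsc/conf/variables" ]
}

# Append -ltcmalloc to the FLFLAGS line read from stdin.
# '|' is used as the sed delimiter so the slashes need no escaping,
# and '&' stands for the matched text.
add_tcmalloc() {
    sed 's|FLFLAGS=-L${PETSC_DIR}/lib -lpetsc|& -ltcmalloc|'
}

# Usage:
#   add_tcmalloc < scripts/makefile_gnu > scripts/makefile_gnu_tcmalloc
#   petsc_installed "${HOME}/petsc/gnu/petsc-v3.18.4/installdir" ||
#       echo "PETSc is not installed yet; run the PETSc cells first"
```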

In [None]:
%%bash
su csem
cd /home/csem/csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W
make PETSC_DIR=${HOME}/petsc/gnu/petsc-v3.18.4/installdir -f scripts/makefile_gnu_tcmalloc all

### Run this bash script to execute the generated `CSEM3D_W` binary

If it finishes successfully, a message similar to the one below should appear:

```bash

 16 KSP preconditioned resid norm 3.863008301354e-18 true resid norm 1.284864871592e-05 ||r(i)||/||b|| 3.405644702963e-01
 17 KSP preconditioned resid norm 2.061402843466e-18 true resid norm 1.054741071689e-05 ||r(i)||/||b|| 2.795681805313e-01
 18 KSP preconditioned resid norm 1.062033155132e-18 true resid norm 3.992776343547e-06 ||r(i)||/||b|| 1.058319664983e-01
 converged reason            2
 total number of relaxations           18
 ========================================


 ************************************************
  3D finished
  Total CPU time:    18.7500000      seconds
 ************************************************

 total cpu time:    18.7500000      seconds
 CSEM3D_W finished
```
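
The `converged reason` line can be checked automatically in a saved log. The sketch below assumes that the number printed by `CSEM3D_W` is PETSc's `KSPConvergedReason`, where positive values indicate convergence and negative values indicate divergence; the helper names are illustrative.

```bash
# Print the value from the last "converged reason" line of a log.
last_converged_reason() {
    awk '/converged reason/ { reason = $NF }
         END { if (reason != "") print reason }' "$1"
}

# Succeed when the last reported reason is a positive number.
solver_converged() {
    reason=$(last_converged_reason "$1")
    [ -n "$reason" ] && [ "$reason" -gt 0 ]
}

# Usage (log name pattern taken from the run script):
#   solver_converged csem3d_w-${TIMESTART}.out || echo "solver did not converge"
```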



In [None]:
%%bash
su csem
cd /home/csem/csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W

PETSC_DIR=${HOME}/petsc/gnu/petsc-v3.18.4/installdir

dataset=Sintetico
ntasks=1
nnodes=1

TIMESTART=$(date +%Y%m%d%H%M%S)

if [[ -L ${dataset} ]]
then
    echo "Link already exists for dataset ${dataset}"
else
    ln -s dataset/${dataset}
fi
sed 's/\.\//'${dataset}'\//g' ${dataset}/Parameters.inp | \
sed 's/'${dataset}'\/OutData/OutData/g' > Parameters.inp

outputdir="OutData"
if [[ -d ${outputdir} ]]
then
    echo "OutData already exists."
    rm -fr ${outputdir}
fi
mkdir ${outputdir}


resultsdir=results/${dataset}/NUMNODES-${nnodes}/MPI-${ntasks}/EXECSTART-${TIMESTART}
mkdir -p ${resultsdir}

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${PETSC_DIR}/lib

executable=CSEM3D_W

echo "mpirun -np $ntasks ./${executable}"
mpirun -np $ntasks ./${executable} \
 -A_mat_type mpiaij \
 -P_mat_type mpiaij \
 -em_ksp_monitor_true_residual \
 -em_ksp_type bcgs \
 -em_pc_type bjacobi \
 -em_sub_pc_type ilu \
 -em_sub_pc_factor_levels 3 \
 -em_sub_pc_factor_fill 6 \
 < ./Parameters.inp \
 2>&1 | tee csem3d_w-${TIMESTART}.out

mv $outputdir/ ${resultsdir}/
cp csem3d_w-${TIMESTART}.out ${resultsdir}/
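
The two `sed` commands in the script rewrite the paths in `Parameters.inp`: every `./` prefix becomes `<dataset>/`, and paths that point into `OutData` are then restored so results are written locally. The sketch below reproduces that rewrite on two sample lines; the file names `model.dat` and `fields.out` are made up for illustration.

```bash
# Rewrite "./" path prefixes to "<dataset>/", then restore OutData paths
# so that results are written to the local OutData directory.
rewrite_paths() {
    sed "s|\./|$1/|g" | sed "s|$1/OutData|OutData|g"
}

printf '%s\n' './model.dat' './OutData/fields.out' | rewrite_paths Sintetico
# prints:
#   Sintetico/model.dat
#   OutData/fields.out
```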


In [None]:
%%bash
su csem
cd /home/csem/csem3d_w-v1.0.2/CSEM3D_W/CSEM3D_W

PETSC_DIR=${HOME}/petsc/gnu/petsc-v3.18.4/installdir


export PATH=/usr/local/cuda/bin:$PATH

dataset=Sintetico
ntasks=1
nnodes=1

TIMESTART=$(date +%Y%m%d%H%M%S)

if [[ -L ${dataset} ]]
then
    echo "Link already exists for dataset ${dataset}"
else
    ln -s dataset/${dataset}
fi
sed 's/\.\//'${dataset}'\//g' ${dataset}/Parameters.inp | \
sed 's/'${dataset}'\/OutData/OutData/g' > Parameters.inp

outputdir="OutData"
if [[ -d ${outputdir} ]]
then
    echo "OutData already exists."
    rm -fr ${outputdir}
fi
mkdir ${outputdir}


resultsdir=results/${dataset}/NUMNODES-${nnodes}/MPI-${ntasks}/EXECSTART-${TIMESTART}
mkdir -p ${resultsdir}

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${PETSC_DIR}/lib

executable=CSEM3D_W

echo "mpirun -np $ntasks nvprof -f -o ${executable}.%q{OMPI_COMM_WORLD_RANK}.nvprof ./${executable}"
mpirun -np $ntasks nvprof -f -o ${executable}.%q{OMPI_COMM_WORLD_RANK}.nvprof ./${executable} \
 -A_mat_type aijcusparse \
 -P_mat_type aijcusparse \
 -vec_type cuda \
 -use_gpu_aware_mpi 0 \
 -em_ksp_monitor_true_residual \
 -em_ksp_type bcgs \
 -em_pc_type bjacobi \
 -em_sub_pc_type ilu \
 -em_sub_pc_factor_levels 3 \
 -em_sub_pc_factor_fill 6 \
 < ./Parameters.inp \
 2>&1 | tee csem3d_w-${TIMESTART}.out

mv $outputdir/ ${resultsdir}/
cp csem3d_w-${TIMESTART}.out ${resultsdir}/
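
With logs from both the CPU run and the GPU run, the reported timings can be compared. This is a minimal sketch that parses the `Total CPU time:` line shown earlier; the helper names and the log file names in the usage comment are hypothetical. The reported value is CPU time rather than wall-clock time, and a run under `nvprof` includes profiling overhead, so the ratio is only a rough indication.

```bash
# Print the value from the "Total CPU time:" line of a CSEM3D_W log.
total_cpu_time() {
    awk '/Total CPU time:/ { print $4; exit }' "$1"
}

# Print the ratio of two timings with two decimals.
# The second argument must be non-zero.
speedup() {
    awk -v base="$1" -v other="$2" 'BEGIN { printf "%.2f\n", base / other }'
}

# Usage (hypothetical log names):
#   cpu=$(total_cpu_time results/cpu-run.out)
#   gpu=$(total_cpu_time results/gpu-run.out)
#   echo "CPU/GPU time ratio: $(speedup "$cpu" "$gpu")"
```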