# Regression tests for the mulskips suites

## Some general considerations

This notebook lets developers and users run a batch of regression tests for the mulskips code.
In order, the notebook:
- runs several tests, each starting from a selected `start.dat` input file and each in a dedicated folder that holds the mulskips output files;
- compares selected output \*.xyz files and the global `log.txt` against reference files;
- summarizes the passed and failed tests, marked green and red respectively.

Mulskips is a Kinetic Monte Carlo (KMC) code that simulates the kinetic evolution of an atomistic system from a-priori defined event frequencies. The stochastic evolution is driven by a sequence of random numbers generated by well-tested Fortran routines, and these numbers can depend on the machine and compiler.
For this reason, the regression tests in this notebook are based on mulskips runs driven by a stored sequence of random numbers.
To activate this mode, set a "T" at line `31` of the `start.dat` input file, which means "KMC run in test modality".
To check that development work has not broken existing functionality, the output files produced by the runs below are compared with reference output files.
For each test we compare:
- the initial KMC structure `I00000000.xyz` of undercoordinated atoms, at KMC time = 0;
- the final KMC structure `I00000001.xyz` of undercoordinated atoms;
- the global log file `log.txt`.
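
Since every test depends on the "T" flag at line `31`, it can help to verify it before launching a run. The following is a minimal sketch, not part of mulskips: the helper name `is_test_mode` is hypothetical, and it assumes that the flag is the first whitespace-separated token on that line, which the text above does not specify.

```python
from pathlib import Path

def is_test_mode(start_file, line_number=31):
    """Return True if the given line of a start.dat file begins with "T".

    Assumption: the flag is the first whitespace-separated token on the line.
    """
    lines = Path(start_file).read_text().splitlines()
    if len(lines) < line_number:
        # The file is too short to contain the flag line.
        return False
    tokens = lines[line_number - 1].split()
    return bool(tokens) and tokens[0] == "T"
```

A call such as `is_test_mode('start.dat')` could be placed before each run to skip, or warn about, inputs that are not in test modality.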

After modifying any mulskips routine, and before committing and pushing the changes to the GitHub repository, each mulskips developer is expected to check that existing functionality is not broken. This is done by running the present regression tests successfully.

To introduce a new test, add the following to the `input-and-references-files` folder:
- a `start.dat` file for a mulskips run that exercises the modified or newly developed routines. Mulskips has to run in test modality, that is, with a "T" at line `31` of the `start.dat` input file;
- the corresponding reference output files.

Then add the `testname` to the list `tests` below.

Each test has a name, which the input `start.dat` file and the reference output files carry.
For example, for the test run "pippo" the input file is `start-pippo.dat`, the reference output files are `I00000000-pippo.xyz` and `I00000001-pippo.xyz`, and the reference log is `log-pippo.txt`. These files have to be placed in the `input-and-references-files` folder. During the regression tests a folder `pippo-test` is created, where mulskips runs with `start-pippo.dat` as input (copied as `start.dat`). The current output files in this folder are then compared with the reference files. If they match, the test passes and is marked green.
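
The naming convention above can be checked before running anything. The sketch below uses two hypothetical helpers, `expected_files` and `missing_files`, which are not part of the notebook; the `log-<test>.txt` name is taken from the comparison cell further down.

```python
import os

def expected_files(test, inputrefdir):
    """Map each role to the path that the naming convention implies for `test`."""
    names = {
        'start': 'start-{}.dat'.format(test),
        'xyz0': 'I00000000-{}.xyz'.format(test),
        'xyz1': 'I00000001-{}.xyz'.format(test),
        'log': 'log-{}.txt'.format(test),
    }
    return {role: os.path.join(inputrefdir, name) for role, name in names.items()}

def missing_files(test, inputrefdir):
    """Return the expected paths that do not exist on disk."""
    paths = expected_files(test, inputrefdir).values()
    return [path for path in paths if not os.path.isfile(path)]
```

For instance, `missing_files('pippo', inputrefdir)` returns an empty list when the input and all three reference files for "pippo" are in place.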

## Settings

To run the regression tests, the user only needs to set the variables in this section.

**IMPORTANT: Make sure that the KMC box has the size the tests expect: 480 x 480 x 480 for all tests in the default list, and 480 x 240 x 240 for `LA` and `CAD`. Change it in `modules/defsystem.f`.**

In [35]:
import os

# Directory containing the compiled mulskips executable (mulskips.e).
# For LA or CAD, make sure that the mulskips box is 480 x 240 x 240 instead of 480 x 480 x 480.
mulskipsdir = os.getenv('PWD')+'/mulskips-source/'
# mulskipsdir = os.getenv('PWD')+'/mulskips-source-CAD-LA/'

# Directory with all input `start.dat` and reference files.
inputrefdir = os.getenv('PWD')+'/input-and-references-files/'

# List of test names:
tests = ['cube','cube-zigzag','sphere','surface','inverted-pyramid',
         'finfet', 'Si-SL', 'CVD-Si-SiH2']
# tests = ['CAD', 'LA']

##### A few important notes before starting:
* `LA` and `CVD-Si-SiH2` might take more than 5 minutes to run.
* Before running `LA`, execute `tar xvfz LA-input.tar.gz` inside the `inputrefdir` folder to extract the input geometry and temperature map.

## Test runs

Below we run mulskips for every `testname` in `tests`, with the following steps:
1. create a directory `testname`-`test`;
2. copy the associated `start-testname.dat` file into `testname`-`test` as `start.dat`;
3. change from the current working directory to `testname`-`test`;
4. run mulskips.
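
The four steps can also be sketched as a single function that never changes the notebook's own working directory. This is an illustrative alternative to the cell below, not the notebook's method: the name `run_test` and its parameters are hypothetical, `command` is given as an argument list rather than a shell string, and the log is written from Python instead of through `tee`.

```python
import shutil
import subprocess
from pathlib import Path

def run_test(test, input_dir, command, base_dir="."):
    """Prepare <test>-test, copy the input as start.dat and run `command` there."""
    testdir = Path(base_dir) / (test + "-test")
    # Step 1: start from a clean test directory.
    shutil.rmtree(testdir, ignore_errors=True)
    testdir.mkdir(parents=True)
    # Step 2: copy start-<test>.dat into it under the name start.dat.
    shutil.copy(Path(input_dir) / ("start-" + test + ".dat"), testdir / "start.dat")
    # Steps 3-4: run the command with the test directory as its working
    # directory and store its standard output in log.txt.
    result = subprocess.run(command, cwd=testdir, stdout=subprocess.PIPE, text=True)
    (testdir / "log.txt").write_text(result.stdout)
    return result.returncode
```

Passing `cwd=` to `subprocess.run` means a failed run cannot leave the notebook in the wrong directory. The executable in `command` should be given as an absolute path, since a relative one may be resolved against the new working directory.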

In [25]:
# import the standard-library modules used to manage directories and run mulskips
import os,shutil,subprocess,sys

# detect the current working directory and print it
cwd = os.getcwd()
print ("The current working directory is %s" % cwd)

The current working directory is /mnt/c/Users/tanoc/OneDrive - University of Pisa/CNR-IMM/mulskips-dev/regression-tests-CVD


In [26]:
from tempfile import mkstemp
def replace(file_path, pattern, subst):
    #Create temp file
    fh, abs_path = mkstemp()
    with os.fdopen(fh,'w') as new_file:
        with open(file_path) as old_file:
            for line in old_file:
                new_file.write(line.replace(pattern, subst))
    #Copy the file permissions from the old file to the new file
    shutil.copymode(file_path, abs_path)
    #Remove original file
    os.remove(file_path)
    #Move new file
    shutil.move(abs_path, file_path)

In [27]:
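# Shell command that runs mulskips and copies its standard output to log.txt.
# os.path.join avoids the double slash, since mulskipsdir already ends with '/'.
mulskipsexecutable = os.path.join(mulskipsdir, 'mulskips.e') + ' | tee log.txt;'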

for test in tests:
    print('Processing test: {}'.format(test))
    print('')

    # define the name of the directory to be created
    testdir = test+'-test'

    filenamesrc = inputrefdir+'start-'+test+'.dat'
    filedest = 'start.dat'
    
    try:
        # remove the testdir directory if it already exists
        shutil.rmtree(testdir)
    except OSError as e:
        print("Error: %s : %s" % (testdir, e.strerror))

    try:
        # create the testdir directory
        os.mkdir(testdir)
        print('Created directory: {}'.format(testdir))
    except OSError:
        print('Can not create the Directory: {}'.format(testdir)) 

    try:
        # Change from current working Directory to testdir    
        os.chdir(testdir)
        print('Directory changed to: {}'.format(testdir))
    except OSError:
        print("Can't change the Current Working Directory") 

    shutil.copy(filenamesrc, filedest)

    if test == 'LA':
        # Replace path to LattGeo.dat and TempStat.dat
        replace(filedest, "LattGeo.480x240x240.dat", inputrefdir+"LattGeo.480x240x240.dat")
        replace(filedest, "TempStat.480x240x240.dat", inputrefdir+"TempStat.480x240x240.dat")
        
    # execute the mulskips.e program with the input file start.dat
    make_process = subprocess.Popen(mulskipsexecutable, shell=True, stdout=subprocess.PIPE)
    while True:
        line = make_process.stdout.readline()
        if not line:
            break
        print(line)  # echo the mulskips output to the console as it arrives
        sys.stdout.flush()

    try:
        # Change back from testdir to the original working directory
        os.chdir(cwd)
        print('Directory changed to: {}'.format(cwd))
    except OSError:
        print("Can't change the Current Working Directory") 
    print('')

Processing test: CAD

Error: CAD-test : No such file or directory
Created directory: CAD-test
Directory changed to: CAD-test
b' KMC box size:          480         240         240\n'
b'            ***** Simulating CVD *****\n'
b' Crystal species (NCrystal=           1 ) --> Z:          14\n'
b' Coverage species (NCov=           1 ) --> Z:           1\n'
b' KMC Super-Lattice parameter (ang):   0.45250000000000001     \n'
b' Exit strategy: Iter\n'
b' MaxIter:               3000000\n'
b' OutMolMol:               3000000\n'
b' ATTENTION! You are running mulskips in test modality,  that is random numbers are not so random!!\n'
b' IDUM     9117116\n'
b' SaveFinalState flag is not "T": checkpoint file will  not be written.\n'
b' SaveCoo flag is not "T": Coor file not stored\n'
b' maxElines           9\n'
b' numParticelle=      500000 Levels =          20 , SizeTree =     1048575\n'
b' Reading input geometry file: /home/tano/CNR-IMM/mulskips-dev/regression-tests-CVD/input-and-references-files/L

## Comparison of current and reference runs

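Below we compare the current and reference files line by line and entry by entry. Integer and string entries must be equal, while floating-point entries may differ by at most the tolerance `tol`. The comparison stops at the first mismatching entry and prints the line number, the position within the line, and the current and reference values.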
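
As a compact illustration of this entry-by-entry comparison, the sketch below returns the first mismatch instead of printing it. The names `parse` and `first_difference` are hypothetical and are not used by the cells that follow; the default tolerance mirrors the `tol = 1.0E-10` value set later in the notebook.

```python
def parse(token):
    """Convert a token to int if possible, else to float, else keep the string."""
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    return token

def first_difference(cur_lines, ref_lines, tol=1.0e-10):
    """Return (line, position, current, reference) for the first mismatch, or None.

    Line and position are 1-based; position is None when two lines
    have a different number of entries.
    """
    if len(cur_lines) != len(ref_lines):
        raise ValueError("files have different numbers of lines")
    for ind, (cline, rline) in enumerate(zip(cur_lines, ref_lines)):
        cs, rs = cline.split(), rline.split()
        if len(cs) != len(rs):
            return (ind + 1, None, cline, rline)
        for li, (ctok, rtok) in enumerate(zip(cs, rs)):
            cval, rval = parse(ctok), parse(rtok)
            if type(cval) is not type(rval):
                return (ind + 1, li + 1, ctok, rtok)
            if isinstance(cval, float):
                # Floats are compared within the tolerance.
                if abs(cval - rval) > tol:
                    return (ind + 1, li + 1, ctok, rtok)
            elif cval != rval:
                # Integers and strings must be equal.
                return (ind + 1, li + 1, ctok, rtok)
    return None

print(first_difference(["Si 0.0 1.0"], ["Si 0.0 1.5"]))  # (1, 3, '1.0', '1.5')
```

Returning the mismatch as a tuple keeps the comparison separate from the reporting, so the same function could feed either a printed message or the summary table.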

In [28]:
def get_type(user_input):
    vtype = 'notfound'
    try:
        val = int(user_input)
#        print("Input is an integer number. Number = ", val)
        vtype = 'int'
    except ValueError:
        try:
            val = float(user_input)
#            print("Input is a float  number. Number = ", val)
            vtype = 'float'
        except ValueError:
#            print("No.. input is not a number. It's a string")
            vtype = 'str'
    return vtype

def reset_value(user_value):
    vtype = get_type(user_value)
    if vtype == 'int':
        new_value = int(user_value)
    elif vtype == 'float':
        new_value = float(user_value)
    elif vtype == 'str':
        new_value = str(user_value)
    else:
        new_value = 'bah'
        print('type not found for {}, with type {}'.format(user_value,type(user_value)))
    return new_value

def print_error(cvalue,rvalue,ind,li):
    cv = get_type(cvalue)
    rv = get_type(rvalue)
    print('Current and Reference values are different: line: {}; position {}.'.format(ind+1,li+1))
    print('Current value {}: {}'.format(cvalue,cv))
    print('Reference value {}: {}'.format(rvalue,rv))
    return    

def compare_values(cvalue_input,rvalue_input,tol,ind,li):

    cv = get_type(cvalue_input)
    rv = get_type(rvalue_input)
    cvalue = reset_value(cvalue_input)
    rvalue = reset_value(rvalue_input)

    if cv != rv:
        print('Current and Reference value types are different')
        print_error(cvalue,rvalue,ind,li)
        teststatus = False
    elif cv == rv:
        teststatus = True

    if teststatus:
        if cv == 'str':
            if cvalue != rvalue:
                print_error(cvalue,rvalue,ind,li)              
                teststatus = False
            elif cvalue == rvalue:
                teststatus = True
        elif cv == 'int':
            if cvalue != rvalue:
                print_error(cvalue,rvalue,ind,li)              
                teststatus = False
            elif cvalue == rvalue:
                teststatus = True
        elif cv == 'float':
            diff = abs(cvalue-rvalue)
            if diff > tol:
                print('Current and Reference float values differ by: {}'.format(diff))
                print('The tolerance has been set to: {}'.format(tol))
                print_error(cvalue,rvalue,ind,li)              
                teststatus = False
            else:
                teststatus = True
        else:
            print('status not present in the type case list. Type: {}'.format(cv))
            print_error(cvalue,rvalue,ind,li)
            teststatus = False
    return teststatus

def compare_files(currfile,reffile,tol):
    teststatus = False
    
    with open (currfile) as cf, open (reffile) as rf:
        ccontent = cf.read().splitlines()
        rcontent = rf.read().splitlines()

        # Compare the length of current and reference files.
        if len(ccontent) != len(rcontent):
            print('Current and Reference files have different lengths:')
            print('Length of the current file: {}'.format(len(ccontent)))
            print('Length of the reference file: {}'.format(len(rcontent)))
            teststatus = False
        elif len(ccontent) == len(rcontent):
            teststatus = True

        if teststatus:
            for ind in range(len(ccontent)):
                cline = ccontent[ind]
                rline = rcontent[ind]
                cs = cline.split()
                rs = rline.split()
                # Compare the number of entries in the current and reference lines.
                if len(cs) != len(rs):
                    print('Current and Reference lines have different lengths:')
                    print('Length of the current line {}: {}'.format(ind+1,len(cs)))
                    print('Length of the reference line {}: {}'.format(ind+1,len(rs)))
                    teststatus = False
                if teststatus:
                    for li in range(len(cs)):
                        if teststatus:
                            cvalue = cs[li]
                            rvalue = rs[li]
                            teststatus = compare_values(cvalue,rvalue,tol,ind,li)
    return teststatus

In [29]:
test_results = {}
for test in tests:
    testdir = test+'-test'
    xyzfile0 = 'I00000000' # Input file at KMC time = 0
    xyzfile1 = 'I00000001' # Final file at the end of the KMC run
    tol = 1.0E-10
    teststatus = False
    
    currfilexyz0 = testdir+'/'+xyzfile0+'.xyz'
    reffilexyz0 = inputrefdir+'/'+xyzfile0+'-'+test+'.xyz'
    xyz0_teststatus = compare_files(currfilexyz0,reffilexyz0,tol)
    print('Status xyz0 test for {}: {}'.format(test,xyz0_teststatus))

    currfilexyz1 = testdir+'/'+xyzfile1+'.xyz'
    reffilexyz1 = inputrefdir+'/'+xyzfile1+'-'+test+'.xyz'
    xyz1_teststatus = compare_files(currfilexyz1,reffilexyz1,tol)
    print('Status xyz1 test for {}: {}'.format(test,xyz1_teststatus))

    currlog = testdir+'/'+'log.txt'
    reflog = inputrefdir+'/'+'log-'+test+'.txt'
    log_teststatus = compare_files(currlog,reflog,tol)
    print('Status log test for {}: {}'.format(test,log_teststatus))
    print('')
    
    if test == 'LA': 
        if log_teststatus == False and (xyz0_teststatus == True and xyz1_teststatus == True):
            print("WARNING:\nlog.txt does not coincide with reference, but it's probably only because of the different LattGeo and TempStat paths. CHECK IT!")
    
    test_results[test] = {'xyz0file': xyz0_teststatus, 'xyz1file': xyz1_teststatus, 'log': log_teststatus}

Status xyz0 test for CAD: True
Status xyz1 test for CAD: True
Status log test for CAD: True

Status xyz0 test for LA: True
Status xyz1 test for LA: True
Status log test for LA: True



In [30]:
print(test_results)

{'CAD': {'xyz0file': True, 'xyz1file': True, 'log': True}, 'LA': {'xyz0file': True, 'xyz1file': True, 'log': True}}


## Regression test report

Below we report the results of all regression tests. A green test passed: its current and reference files match. A red test failed, and the developer is expected to identify and fix the cause.

In [11]:
def get_color(test):
    color = 'green'
    if test['xyz0file'] and test['xyz1file'] and test['log']:
        color = 'green'
    else:
        color = 'red'
    return color

In [12]:
# tabulate and termcolor are third-party packages and must be installed
# (e.g. `pip install tabulate termcolor`) before running this cell.
from tabulate import tabulate
from termcolor import colored

# Build one table row per test, colored green (passed) or red (failed).
results = []
for key, res in test_results.items():
    color = get_color(res)
    cells = (key, res['xyz0file'], res['xyz1file'], res['log'])
    results.append(tuple(colored(cell, color=color) for cell in cells))
print(tabulate(results, headers=["Test name", "initial xyz file", "final xyz file", "log file"]))

ModuleNotFoundError: No module named 'tabulate'
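
When `tabulate` or `termcolor` is not installed, as in the error above, the same report can be produced with the standard library only. The sketch below is a hypothetical fallback, not part of the notebook: `format_report` is an illustrative name, and the colors rely on ANSI escape codes, which are assumed to be supported by the terminal or notebook front end.

```python
# ANSI escape codes for green, red and reset.
GREEN, RED, RESET = "\033[32m", "\033[31m", "\033[0m"

def format_report(test_results):
    """Return a plain-text table with one green or red row per test."""
    headers = ["Test name", "initial xyz file", "final xyz file", "log file"]
    keys = ["xyz0file", "xyz1file", "log"]
    rows = [[name] + [str(res[k]) for k in keys]
            for name, res in test_results.items()]
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(row[i]) for row in [headers] + rows)
              for i in range(len(headers))]

    def fmt(row):
        return "  ".join(cell.ljust(w) for cell, w in zip(row, widths))

    lines = [fmt(headers), fmt(["-" * w for w in widths])]
    for row, res in zip(rows, test_results.values()):
        # A test is green only if all three comparisons passed.
        color = GREEN if all(res[k] for k in keys) else RED
        lines.append(color + fmt(row) + RESET)
    return "\n".join(lines)
```

Calling `print(format_report(test_results))` prints the header, a separator line, and one colored row per test.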