# Read, clean, merge, and write PLIC facets

The Basilisk simulations write PLIC facets to the folder *plic/* of each simulation. The PLIC points are written after **10** iterations. Each process stores its points in a separate file. The files are named as

points_**iteration**_n**process**.txt

For example, the file of process 8 after 120 iterations is called

points_**000120**_n**008**.txt
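
The naming scheme can be sketched in a few lines of Python. The helper names `plic_file_name` and `parse_plic_file_name` are hypothetical and not part of the notebook; the zero-padding widths (six digits for the iteration, three for the process) are taken from the example file name above.

```python
import re


def plic_file_name(iteration, process):
    """Build a file name of the form points_<iteration>_n<process>.txt."""
    return "points_{:06d}_n{:03d}.txt".format(iteration, process)


def parse_plic_file_name(name):
    """Return (iteration, process) encoded in a file name, or None if it does not match."""
    match = re.fullmatch(r"points_(\d+)_n(\d+)\.txt", name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(plic_file_name(120, 8))
print(parse_plic_file_name("points_000120_n008.txt"))
```

Parsing the name back into integers is handy when files have to be sorted or filtered by iteration rather than by their string representation.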

Each file contains all points in Cartesian coordinates. The points belonging to one PLIC facet are separated from those of the next facet by one empty line. In the following example, the first facet consists of **three** vertices and the second of **four**:
```
# the order is px / py / pz
-0.472504 3.39844 -0.0585938 # first point of facet one
-0.46875 3.38866 -0.0585938
-0.46875 3.39844 -0.0823362

-0.474992 3.39844 -0.0292969 # first point of facet two
-0.46875 3.38206 -0.0292969
-0.46875 3.38931 -0.0585938
-0.47223 3.39844 -0.0585938
...
```
In two-dimensional data sets, the entry **pz** is missing. For 2D cases, pass *shape_2D=True* as an argument to the *process_shape* function.
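
To make the file layout concrete, the following is a minimal sketch that groups the points of such a file into facets using only the standard library. It is not the approach used later in this notebook (which relies on pandas); the function `split_facets` and its parameter `n_dims` are hypothetical names introduced for illustration, and the handling of trailing `#` comments is an assumption based on the annotated example above.

```python
def split_facets(text, n_dims=3):
    """Group whitespace-separated points into facets; a blank line ends a facet."""
    facets, current = [], []
    for raw_line in text.splitlines():
        stripped = raw_line.strip()
        if not stripped:
            # blank line: close the facet collected so far
            if current:
                facets.append(current)
                current = []
            continue
        # drop an optional trailing comment (assumption, see lead-in)
        content = stripped.split("#", 1)[0].strip()
        if not content:
            # comment-only line: ignore it without closing the facet
            continue
        values = tuple(float(v) for v in content.split())
        if len(values) != n_dims:
            raise ValueError("expected {} values, got {}".format(n_dims, len(values)))
        current.append(values)
    if current:
        facets.append(current)
    return facets


example = """\
# the order is px / py / pz
-0.472504 3.39844 -0.0585938 # first point of facet one
-0.46875 3.38866 -0.0585938
-0.46875 3.39844 -0.0823362

-0.474992 3.39844 -0.0292969 # first point of facet two
-0.46875 3.38206 -0.0292969
-0.46875 3.38931 -0.0585938
-0.47223 3.39844 -0.0585938
"""

facets = split_facets(example)
print([len(facet) for facet in facets])
```

For a two-dimensional file, the same sketch would be called with `n_dims=2`.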

## Dependencies

This notebook does not depend on any other notebook. It accesses the raw, distributed Basilisk output files provided in the data set accompanying this repository, and writes merged PLIC files to the *plic_clean* folder of each Basilisk simulation.

## Data processing

In [7]:
# paths to define from where to read and where to write the processed data
source_base = "../data"
target_base = "../data"
# the processed data will be stored in the newly created folder plic_clean
plic_source = "plic"
plic_target = "plic_clean"

bhaga_cases = ["bhaga_{:02d}".format(case) + "_l" + str(level) for case in [2, 3, 4] for level in [14, 15, 16]]
water_cases = ["water_{:02d}".format(case) + "_l" + str(level) for case in [1, 3, 5] for level in [14, 15, 16]]
cases = water_cases + bhaga_cases

for i, case in enumerate(cases):
    print(i, case)

0 water_01_l14
1 water_01_l15
2 water_01_l16
3 water_03_l14
4 water_03_l15
5 water_03_l16
6 water_05_l14
7 water_05_l15
8 water_05_l16
9 bhaga_02_l14
10 bhaga_02_l15
11 bhaga_02_l16
12 bhaga_03_l14
13 bhaga_03_l15
14 bhaga_03_l16
15 bhaga_04_l14
16 bhaga_04_l15
17 bhaga_04_l16


In [8]:
import glob
import pandas as pd
import numpy as np

def countNan(px):
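    '''Count how often a NaN has been passed as argument.

    The running count is stored in the function attribute countNan.counter,
    which must be set (e.g., to zero) before the first call.

    Parameters
    ----------
    px - float : either a float or NaN

    Returns
    -------
    count - int : number of times a NaN has been passed so far
    '''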
    if pd.isna(px):
        countNan.counter += 1   
    return countNan.counter
    

def process_shape(path, iteration, shape_2D=False):
    '''Read PLIC intersection points from disk.
       The function reads all available processor files and concatenates them.

    Parameters
    ----------
    path - string : path to the file location
    iteration - integer : iteration to load
    shape_2D - boolean : if True, read only the columns px and py

    Returns
    -------
    facets - DataFrame : DataFrame containing the x, y, z coordinates and
             the number of the facet to which a point belongs

    '''
    base_name = path + "/points_{:06d}_n".format(iteration)
    files = sorted(glob.glob(base_name + "*"))
    points = []
    if shape_2D:
        columns = ["px", "py"]
    else:
        columns = ["px", "py", "pz"]
    for file in files:
        points.append(pd.read_csv(file, sep=" ", names=columns, engine='c', dtype=np.float32))
    all_points = pd.concat(points)
    countNan.counter = 0
    all_points["element"] = all_points["px"].apply(countNan)
    all_points.dropna(inplace=True)
    return all_points.reset_index(drop=True)


def get_iterations(path):
    ''' Find all iterations based on the file names.
    
    Parameters
    ----------
    path - string : where to search for files
    
    Returns
    -------
    iterations - list : sorted list of all iterations
    
    '''
    file_paths = glob.glob(path + "/*_n000.txt")
    # use a separate loop variable to avoid shadowing the argument "path"
    iterations = sorted([int(file_path.split("/")[-1].split("_")[1]) for file_path in file_paths])
    return iterations
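
The facet numbering in *process_shape* counts the NaN rows seen so far and assigns that count to each point. A vectorized variant of the same idea can be sketched with a cumulative sum. This is an alternative formulation, not the code used in this notebook, and it assumes that the separator lines between facets show up as NaN rows in the DataFrame, which is what *process_shape* relies on. The small DataFrame below is made-up demonstration data.

```python
import numpy as np
import pandas as pd

# made-up 2D points; the NaN row plays the role of the empty separator line
points = pd.DataFrame({
    "px": [0.1, 0.2, np.nan, 0.3, 0.4, 0.5],
    "py": [1.0, 1.1, np.nan, 1.2, 1.3, 1.4],
})

# the number of NaN rows encountered so far is the facet number of each row
points["element"] = points["px"].isna().cumsum()

# remove the separator rows and renumber the index
points = points.dropna().reset_index(drop=True)
print(points["element"].tolist())
```

Compared with a counter stored in a function attribute, the cumulative sum needs no state that has to be reset before each file set is processed.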

In [9]:
from tqdm import tqdm
import os

for case in cases:
    print("Processing simulation folder {}".format(case))
    source = source_base + "/" + case + "/" + plic_source
    iterations = get_iterations(source)
    target = target_base + "/" + case + "/" + plic_target
    if not os.path.exists(target):
        os.makedirs(target)
    for it in tqdm(iterations):
        data = process_shape(source, it, shape_2D=True)
        file_name = "plic_{:06d}.pkl".format(it)
        data.to_pickle(target + "/" + file_name)

Processing simulation folder water_01_l14
Processing simulation folder water_01_l15
Processing simulation folder water_01_l16
Processing simulation folder water_03_l14
Processing simulation folder water_03_l15
Processing simulation folder water_03_l16
Processing simulation folder water_05_l14
Processing simulation folder water_05_l15
Processing simulation folder water_05_l16
Processing simulation folder bhaga_02_l14
Processing simulation folder bhaga_02_l15
Processing simulation folder bhaga_02_l16
Processing simulation folder bhaga_03_l14
Processing simulation folder bhaga_03_l15
Processing simulation folder bhaga_03_l16
Processing simulation folder bhaga_04_l14
Processing simulation folder bhaga_04_l15
Processing simulation folder bhaga_04_l16
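
The merged files can be read back with `pandas.read_pickle`. The following sketch is self-contained and therefore writes a small, made-up DataFrame to a temporary folder first; with the real data, the path would instead point into the *plic_clean* folder of a simulation. The column names follow the output of *process_shape* for a 2D case.

```python
import os
import tempfile

import pandas as pd

# made-up stand-in for the output of process_shape (2D case)
demo = pd.DataFrame({
    "px": [0.1, 0.2, 0.3, 0.4, 0.5],
    "py": [1.0, 1.1, 1.2, 1.3, 1.4],
    "element": [0, 0, 1, 1, 1],
})

with tempfile.TemporaryDirectory() as tmp:
    # same file name pattern as used when writing the merged files
    file_path = os.path.join(tmp, "plic_{:06d}.pkl".format(120))
    demo.to_pickle(file_path)
    facets = pd.read_pickle(file_path)

# number of points per facet
points_per_facet = facets.groupby("element").size()
print(points_per_facet.to_dict())
```

Grouping by the *element* column recovers the individual facets, for example to check how many vertices each facet has.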
