This notebook aims to provide a tutorial on how to run TRACK from Python, both locally and on the HPCC.
By the end of this notebook, you should be able to:

1. Create custom sclinac.dat and track.dat inputs so that you can run any TRACK lattice file.

2. Run TRACK on n cores on Windows or Linux, such as on the MSU HPCC.

3. Save all the data as an .hdf5 file.

## 1. Load packages

In [1]:
import sys
sys.path.insert(0, 'src')  # Change this to location of PyTrack src
from runTrack import *
from config import Config
from utils import *
from hdf5Track import *

Loading runTrack...
Loading config...
Loading utils...
Loading hdf5Track...


File path guide:

   /   = Root directory
   
   .   = This location
   
   ..  = Up a directory
   
   ./  = Current directory
   
   ../ = Parent of current directory
   
   ../../ = Two directories backwards

'D:' tells which drive you are on in an absolute path (Windows)
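The path notation above can be checked with a few lines of Python. This is a minimal sketch: the starting directory `/home/user/PyTrack/notebooks` and the helper `resolve` are hypothetical and only serve to show how `.`, `..`, and `/` are resolved.

```python
import posixpath

# Hypothetical current directory, used only for illustration.
base = "/home/user/PyTrack/notebooks"

def resolve(relative, start=base):
    """Lexically resolve a path against a starting directory."""
    return posixpath.normpath(posixpath.join(start, relative))

print(resolve("."))       # /home/user/PyTrack/notebooks
print(resolve("./src"))   # /home/user/PyTrack/notebooks/src
print(resolve("../src"))  # /home/user/PyTrack/src
print(resolve("../../"))  # /home/user
print(resolve("/"))       # /
```

An absolute path (one starting with `/`) ignores the starting directory entirely, which is why the last call returns the root.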

## Check that src/runTrack.py works

Go to the unit_testing folder and run the unit tests. Make sure they pass.

If you are using Linux, you will need a track executable compiled for your Linux hardware. A version of track that was compiled on the MSU HPCC is included. If that doesn't work, contact Kei Fukushima for the track build. Then put the compiled track executable into the **TRACK folder** and the **parentTRACK** folder. Make sure you give it permission to execute. This can be done using the command "chmod 750 track" or "chmod +x track".

You can contact Kei Fukushima for a Linux version of TRACK:
fukushim@frib.msu.edu
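The execute permission can also be set and checked from Python. The sketch below is an illustration only: `make_executable` is a hypothetical helper, not part of PyTrack, and it is demonstrated on a throwaway file standing in for the real `track` binary.

```python
import os
import stat
import tempfile

def make_executable(path):
    """Add execute permission for owner and group, then report whether it is executable."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP)
    return os.access(path, os.X_OK)

# Demonstrate on a temporary placeholder file, not the real TRACK executable.
with tempfile.TemporaryDirectory() as tmp:
    fake_track = os.path.join(tmp, "track")
    with open(fake_track, "w") as f:
        f.write("#!/bin/sh\necho ok\n")
    is_exec = make_executable(fake_track)
    print(is_exec)
```

In practice you would call the helper on the compiled track file inside the TRACK and parentTRACK folders.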

## 2. Setting up config.py

You need to make sure you have the correct track executable on your machine. The following code should all run correctly if your paths are set correctly and you have the track executable in trackFiles.

Make sure paths are set correctly. Change the paths in src/config.py

In [2]:
cf = Config()  # This is how you will access the config files

In [3]:
trackdir = cf.TRACK_DIRECTORY
trackexe = cf.TRACK_EXE

In [4]:
runTrack(trackexe, trackdir)

(CompletedProcess(args='C:\\Users\\trana\\Desktop\\TRACK_Development\\TRACK_reinforcement_learning\\TRACK\\TRACKv39C.exe', returncode=0),
 0.5815575122833252)

If TRACK works, then the paths were set correctly. The return code should be 0 and the execution time, the number at the end, should be around 0.5 seconds. Next we will set up the track.dat, sclinac.dat, and HDF5 configs.
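The output above is a `(CompletedProcess, elapsed_seconds)` pair. A wrapper with that return shape could be sketched as follows. This is not the actual `runTrack` implementation, which is not shown here. `run_and_time` is a hypothetical name, and a Python one-liner stands in for the TRACK executable.

```python
import subprocess
import sys
import time

def run_and_time(command, workdir=None):
    """Run a command and return (CompletedProcess, elapsed_seconds)."""
    start = time.time()
    result = subprocess.run(command, cwd=workdir)
    return result, time.time() - start

# Stand-in command: the Python interpreter instead of the TRACK executable.
result, elapsed = run_and_time([sys.executable, "-c", "pass"])

if result.returncode != 0:
    raise RuntimeError(f"command failed with return code {result.returncode}")
print(f"finished in {elapsed:.3f} s")
```

Checking `returncode` explicitly, as in the last lines, makes a misconfigured path fail loudly instead of silently producing empty output files.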

### 2.1 Set up track.dat

To set up the default parameters of track.dat, change the corresponding parameters of the TRACK_DAT dictionary. Make sure the paths are set correctly. Some numbers might need to be written as strings, for example if they contain letters to indicate the exponent (such as '0.12d0'). Also make sure to download the following Field folder or, if you know what you are doing, create your own field folder. https://michiganstate-my.sharepoint.com/:f:/g/personal/tranant2_msu_edu/EkNo130XJrlBrLdz6ZS-HIYBE6peozQ279coF6U-dMDyXw?e=pJ55jt
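Values such as '0.12d0' have to be passed as strings because Python has no literal of that form. Assuming the trailing `d0` is the Fortran double-precision exponent marker, a small helper can build these strings from ordinary floats. `to_fortran_double` is a hypothetical name, not part of PyTrack.

```python
def to_fortran_double(value, digits=2):
    """Format a number as a Fortran-style double literal with a zero exponent."""
    return f"{value:.{digits}f}d0"

# Build a few track.dat-style settings from plain Python floats.
settings = {
    "epsnx": to_fortran_double(0.12),
    "alfax": to_fortran_double(1.0),
    "betax": to_fortran_double(100.0, digits=1),
}
print(settings)  # {'epsnx': '0.12d0', 'alfax': '1.00d0', 'betax': '100.0d0'}
```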

### 2.2 Set up sclinac.dat

To set up the default sclinac.dat file, make a list of lists where each inner list corresponds to one beam element and its settings. The wrapper writes each inner list as one line of the sclinac file, with spaces between the entries. Don't put the '0 stop' element at the end, as the wrapper adds it automatically.
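The mapping from a list of lists to file lines can be sketched as below. This illustrates the behaviour described above and is not the wrapper's actual code. `format_sclinac` is a hypothetical name, and the real wrapper may format numbers differently.

```python
def format_sclinac(elements):
    """Join each element's entries with spaces, one element per line, then add '0 stop'."""
    lines = [" ".join(str(entry) for entry in element) for element in elements]
    lines.append("0 stop")  # appended automatically, so it is not part of `elements`
    return "\n".join(lines) + "\n"

# Two drift elements; the values are placeholders.
elements = [
    [1, "drift", 42.7, 3.0, 3.0],
    [1, "drift", 10.0, 3.0, 3.0],
]
text = format_sclinac(elements)
print(text)
# 1 drift 42.7 3.0 3.0
# 1 drift 10.0 3.0 3.0
# 0 stop
```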

### 2.3 Set up HDF5

This is a bit complicated, but it allows you to specify exactly what data to save and how much of it.
You will need to specify a few things in the hdf5 dictionary:
1. 00_start - the initial settings in track.dat you want to save
2. location - the beam elements you want to save
3. inputs - the settings on the specific beam elements you want to save
4. outputs - the outputs from beam.out you want to save
5. size - the total number of simulations you want to save
6. index - the index of the sclinac element to get the distribution from. index=3 means that the 3 elements up to element 3 will be recorded

**00_start**
This is a list of all the different track.dat settings you want to save. Make sure each item in the list is one of the variables in track.dat. These will all be stored in a folder called 00_start.

**location**
This is a list of all the different beam elements. It creates top-level folders with the name of each beam element. Names should follow the convention '##_XXXX'.

**Setting up inputs**
This is a dictionary which gives the inputs of the beam elements at different locations. The keys are the different locations. Each key maps to a list of tuples. Each tuple is two elements long: the first element gives the name of the dataset, and the second gives the index at which the variable sits in the element's line. For example, [1, 'drift', 42.7, 3.0, 3.0] is the line that describes a drift space. If we want to save index 2, which is the length, then we would add the tuple ('length', 2) to the list. Example: 'inputs': {'00_drift':[('length',2),('xrap',3),('yrap',4)],'01_drift':[],'02_drift':[]},

**Setting up outputs**
This is also a dictionary, which saves the outputs at the location of a beam element. The keys are the different locations. Each key maps to a list of tuples, but this time the first element names the variable being saved and the second tells how many particles are being saved. Because of how TRACK was programmed, this is n + 1 to account for the reference particle.

**size**
This is the number of simulations we want to make

**index**
This is the index of the sclinac element from which the distribution is taken, as described for `index` in the list above.
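To make the six entries concrete, the sketch below builds a small example dictionary and lists the datasets it would imply. Everything here is illustrative: the element names, particle counts, and the helper `planned_datasets` are hypothetical. The layout (`<location>/inputs/<name>` with one value per simulation, `<location>/outputs/<name>` with one row per simulation) is an assumption inferred from how the datasets are read back later in this notebook.

```python
def planned_datasets(config):
    """Return {dataset_path: shape} implied by an HDF5 config dictionary."""
    size = config["size"]
    shapes = {}
    for location, fields in config["inputs"].items():
        for name, _column in fields:          # (dataset name, index in the element line)
            shapes[f"{location}/inputs/{name}"] = (size,)
    for location, fields in config["outputs"].items():
        for name, n_particles in fields:      # (variable name, particles incl. reference)
            shapes[f"{location}/outputs/{name}"] = (size, n_particles)
    return shapes

# Hypothetical config: 1000 particles + 1 reference particle, 100 simulations.
example_config = {
    "00_start": ["epsnx", "alfax"],
    "location": ["00_drift", "01_drift"],
    "inputs": {"00_drift": [("length", 2), ("xrap", 3), ("yrap", 4)], "01_drift": []},
    "outputs": {"00_drift": [("x", 1001)], "01_drift": [("x", 1001), ("y", 1001)]},
    "size": 100,
    "index": 2,
}

shapes = planned_datasets(example_config)
for path, shape in shapes.items():
    print(path, shape)
```

Listing the planned datasets before generating data is a cheap way to catch a mismatch between the `inputs` and `outputs` keys and the `location` list.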

## 3. Run a single TRACK instance and save the data in an hdf5 file

A run function does the following, in order:

1. update the track config with the new settings specified in run_config['track']
2. update the sclinac config with the new settings specified in run_config['sclinac']
3. update track.dat from the track config using make_track
4. save track inputs using save_inputs
5. update sclinac.dat from the sclinac config
6. save outputs of sclinac.dat at the corresponding locations (this is where the code runs)

Before you run everything, copy the contents of the parent folder into a child folder using copy2folder. This is where TRACK will run, to keep things organized.

Next, set up the hdf5 file using makehdf5().

Next, set up the run_config dictionary. Note that any change must be entered in both 'sclinac' and 'inputs'. The run_config dictionary is the main thing you will change in order to update each TRACK simulation.
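One way to read the `'sclinac'` entry of run_config is as a list of `[element_index, field_index, new_value]` updates applied to the default list of elements. That reading is an assumption based on how the entries are used in this notebook, and `apply_sclinac_updates` is a hypothetical helper, not the PyTrack implementation.

```python
import copy

def apply_sclinac_updates(elements, updates):
    """Return a copy of `elements` with each [element_index, field_index, value] applied."""
    updated = copy.deepcopy(elements)  # leave the defaults untouched
    for element_index, field_index, value in updates:
        updated[element_index][field_index] = value
    return updated

# Placeholder defaults: two drift lines.
defaults = [
    [1, "drift", 42.7, 3.0, 3.0],
    [1, "drift", 42.7, 3.0, 3.0],
]
new_elements = apply_sclinac_updates(defaults, [[1, 2, 50.0]])
print(new_elements[1])  # [1, 'drift', 50.0, 3.0, 3.0]
print(defaults[1])      # [1, 'drift', 42.7, 3.0, 3.0]
```

Working on a copy keeps the defaults intact, so each run starts from the same baseline.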

In [5]:
# setup
cf = Config()
hdf5_path = './test0.hdf5'
hdf5_config = cf.HDF5

copy2folder(cf.PARENT_TRACK,cf.CHILD_TRACK,'test')  # create folder
sim_folder = str(cf.CHILD_TRACK)+'/test'
makehdf5(hdf5_path, hdf5_config)  # make hdf5 file

# This dictionary contains the variables you want to change.
# Make sure the run_config matches the input and output hdf5 configs
run_config = {'track': {'epsnx':'0.12d0',
                        'alfax':'1.00d0',
                        'betax':'100.0d0',
                        'epsny':'0.12d0',
                        'alfay':'1.00d0',
                        'betay':'100.0d0'},
             'sclinac': [[4,2,-6],[5,2,6],[7,2,-6],[8,2,6],[10,2,-6],[11,2,6]],
             'inputs':{'05_eq3d':{'voltage':-6},  # key is the dataset name, value is the setting
                       '06_eq3d':{'voltage':6},
                      '08_eq3d':{'voltage':-6},
                      '09_eq3d':{'voltage':6},
                      '11_eq3d':{'voltage':-6},
                      '12_eq3d':{'voltage':6}}
             }

run(run_config, sim_folder, hdf5_path, 0)

In [None]:
#  This is mainly how each run is configured: the run_config dictionary is changed and then
#  a run is started. Here is the case for 6 different runs.
sclinac_q1 = [-6,-5,-4,-3,-2,-1]
for i in range(len(sclinac_q1)):
    run_config['sclinac'][0][2] = sclinac_q1[i]
    run_config['inputs']['05_eq3d']['voltage'] = sclinac_q1[i]
    run(run_config, sim_folder, hdf5_path, i)

In [None]:
import h5py
import numpy as np

with h5py.File('./test0.hdf5', "a") as database:
    print(database['05_eq3d'].keys())
    print(np.array(database['05_eq3d/inputs/voltage'][:6]))
    print(np.array(database['12_eq3d/inputs/voltage'][:6]))
    print(np.array(database['06_eq3d/outputs/x'][0]))
    print(np.array(database['06_eq3d/outputs/x'][7]))
    print(np.array(database['12_eq3d/outputs/y'][0]))
    print(np.array(database['12_eq3d/outputs/y'][7]))
    print(np.array(database['00_start/outputs/y'][0]))
    print(np.array(database['00_start/outputs/y'][7]))

The results should look something like this: zeros for the runs that were never written (such as index 7) and non-zero values for the runs that were.

<KeysViewHDF5 ['inputs', 'outputs']>

[-6. -5. -4. -3. -2. -1.]

[6. 6. 6. 6. 6. 6.]

[ 4.3201604e-07 -1.0199331e+00 -7.6650989e-01 ...  8.0134898e-02
 -2.4582863e-01 -9.5469528e-01]
 
[0. 0. 0. ... 0. 0. 0.]

[-8.7229802e-07  7.7979213e-01 -9.5388597e-01 ... -4.1215295e-01
 -5.0551176e-01 -1.0993438e+00]
 
[0. 0. 0. ... 0. 0. 0.]

[ 0.         -0.52694255  0.51611817 ...  0.7277629   0.7714185
  0.49240583]
  
[0. 0. 0. ... 0. 0. 0.]
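The "zeros where there should be zeros" check can be automated. The sketch below uses a toy nested list in place of a real dataset, and `filled_rows` is a hypothetical helper. It treats a row as written if it contains at least one non-zero value.

```python
def filled_rows(rows):
    """Return the indices of rows that contain at least one non-zero value."""
    return [i for i, row in enumerate(rows) if any(value != 0 for value in row)]

# Toy stand-in for an outputs dataset: two written runs and one empty slot.
data = [
    [0.0, -0.5, 0.5],
    [0.1, 0.2, -0.3],
    [0.0, 0.0, 0.0],
]
print(filled_rows(data))  # [0, 1]
```

After 6 runs you would expect the indices 0 through 5 to be reported and nothing else.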

In [None]:
import os

# remove hdf5 file
if os.path.exists(hdf5_path):
    os.remove(hdf5_path)
# remove test folder
rmfolder(cf.CHILD_TRACK,'test')

## 4. Run TRACK on multiple cores

We will now do the last test: running TRACK on multiple cores and saving all the data in hdf5 files. Each core gets its own child folder, copied from the parent folder, to run in, and saves to its own hdf5 file. First, we need a function which fills an hdf5 file by iteratively running simulations with different configurations. Call this datagen. It has to be customized for different configurations of inputs, as it tells where each input goes. Start by testing it out here, then copy datagen into the hdf5Track.py file in order for the next section to work. The next section cannot use a function defined in a notebook; it has to be defined in a separate .py file. Then comment out datagen in the notebook and run everything again.
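The idea of one child folder per core can be sketched with the standard library. This is an illustration of the concept only, not the implementation of copy2folder. `copy_to_child` is a hypothetical name, and temporary folders stand in for the real parent and child directories.

```python
import shutil
import tempfile
from pathlib import Path

def copy_to_child(parent, child_root, name):
    """Copy the whole parent folder to child_root/name and return the new path."""
    target = Path(child_root) / name
    shutil.copytree(parent, target)
    return target

with tempfile.TemporaryDirectory() as tmp:
    # Placeholder parent folder containing a single dummy input file.
    parent = Path(tmp) / "parentTRACK"
    parent.mkdir()
    (parent / "track.dat").write_text("placeholder\n")

    child_root = Path(tmp) / "childTRACK"
    child_root.mkdir()

    # One independent working copy per core, named '0', '1', '2'.
    folders = [copy_to_child(parent, child_root, str(i)) for i in range(3)]
    names = sorted(p.name for p in child_root.iterdir())
    copied = all((folder / "track.dat").exists() for folder in folders)
    print(names, copied)
```

Because every core works in its own copy, simultaneous runs never overwrite each other's input or output files.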

In [None]:
# def datagen(inputs, child, hdf5):
#     """
#     inputs:
#     inputs - dictionary of input vectors. All vectors must have the same length
#     child - folder where the simulation will run
#     hdf5 - path to the hdf5 file where all the data is stored
#     """
#     # This dictionary contains the variables you want to change.
#     # Make sure the run_config matches the input and output hdf5 configs
#     # At the moment, you have to specify voltages for both the sclinac and inputs.
#     for i in range(len(inputs['v1'])):  # one run per simulation, not per dictionary key
#         v1 = inputs['v1']
#         v2 = inputs['v2']
#         v3 = inputs['v3']
#         v4 = inputs['v4']
#         v5 = inputs['v5']
#         v6 = inputs['v6']
#         run_config = {'track': {'epsnx':'0.12d0',
#                                 'alfax':'1.00d0',
#                                 'betax':'100.0d0',
#                                 'epsny':'0.12d0',
#                                 'alfay':'1.00d0',
#                                 'betay':'100.0d0'},
#                      'sclinac': [[4,2,v1[i]],[5,2,v2[i]],[7,2,v3[i]],[8,2,v4[i]],[10,2,v5[i]],[11,2,v6[i]]],
#                      'inputs':{'05_eq3d':{'voltage':v1[i]},  # key is the dataset name, value is the setting
#                                '06_eq3d':{'voltage':v2[i]},
#                               '08_eq3d':{'voltage':v3[i]},
#                               '09_eq3d':{'voltage':v4[i]},
#                               '11_eq3d':{'voltage':v5[i]},
#                               '12_eq3d':{'voltage':v6[i]}}
#                      }
#         run(run_config, child, hdf5, i)

In [None]:
# setup
cf = Config()
hdf5_dir = cf.HDF5_DIR
hdf5_path = f'{hdf5_dir}/test.hdf5'
hdf5_config = cf.HDF5

copy2folder(cf.PARENT_TRACK,cf.CHILD_TRACK,'test')  # create folder
sim_folder = str(cf.CHILD_TRACK)+'/test'
makehdf5(hdf5_path, hdf5_config)  # make hdf5 file

n = 10
vv1 = np.random.rand(n)*8
vv2 = -np.random.rand(n)*8
vv3 = np.random.rand(n)*8
vv4 = -np.random.rand(n)*8
vv5 = np.random.rand(n)*8
vv6 = -np.random.rand(n)*8
inputs = {'v1':vv1,
         'v2':vv2,
         'v3':vv3,
         'v4':vv4,
         'v5':vv5,
         'v6':vv6,}
datagen(inputs, sim_folder, hdf5_path)

In [None]:
with h5py.File(hdf5_path, "a") as database:
    print(database[f'05_eq3d'].keys())
    print(np.array(database['05_eq3d/inputs/voltage'][:10]))
    print(np.array(database['12_eq3d/inputs/voltage'][:10]))
    print(np.array(database['06_eq3d/outputs/x'][0]))
    print(np.array(database['06_eq3d/outputs/x'][1]))
    print(np.array(database['06_eq3d/outputs/x'][2]))
    print(np.array(database['06_eq3d/outputs/x'][6]))
    print(np.array(database['06_eq3d/outputs/x'][7]))
    print(np.array(database['06_eq3d/outputs/x'][10]))

In [None]:
# remove hdf5 file
if os.path.exists(hdf5_path):
    os.remove(hdf5_path)
# remove test folder
rmfolder(cf.CHILD_TRACK,'test')

## Pooling

In [None]:
import os
import shutil
from multiprocessing import Pool
import numpy as np

n = 160  # number of simulations
vv1 = np.random.rand(n)*8
vv2 = -np.random.rand(n)*8
vv3 = np.random.rand(n)*8
vv4 = -np.random.rand(n)*8
vv5 = np.random.rand(n)*8
vv6 = -np.random.rand(n)*8
distro = np.random.choice(np.array([0,1,3,4,5,6,7,8,9]),n)

n = 4  # number of cores; change this to use a different number of cores
cf = Config()
parent_dir = cf.PARENT_TRACK
child_dir = cf.CHILD_TRACK
hdf5_dir =  cf.HDF5_DIR
hdf5_config = cf.HDF5

for file in os.listdir(child_dir):  # Remove everything in child_dir. If you skip this, then running n=32 followed by n=4 will still run all 32 first
    shutil.rmtree(str(child_dir)+"/"+str(file))
    
sim_folder = [None]*n
for i in range(n):  # creates the folders
    copy2folder(parent_dir, child_dir, str(i))  # create child folders named after numbers
    sim_folder[i] = str(child_dir)+'/' + str(i)  # create a list of paths to the child folders
    makehdf5(f'{hdf5_dir}/{i}.hdf5', hdf5_config)  # make hdf5 files named after numbers '0','1',...,'n-1'
    
inputs = [None]*n    
nsim = len(vv1)
vsize = int(nsim/n)
for i in range(n):
    inputs[i] = {'v1':vv1[vsize*i: vsize*(i+1)],  # split voltages evenly among the cores
         'v2':vv2[vsize*i: vsize*(i+1)],
         'v3':vv3[vsize*i: vsize*(i+1)],
         'v4':vv4[vsize*i: vsize*(i+1)],
         'v5':vv5[vsize*i: vsize*(i+1)],
         'v6':vv6[vsize*i: vsize*(i+1)]}
    
if nsim % n != 0:  # any remainder is dropped by the even split above
    print("Number of simulations is not evenly divisible by n cores!")
    print(f"{nsim} % {n} = {nsim % n}")

all_index=np.arange(n)
items = [(inputs[i], sim_folder[i], f'{hdf5_dir}/{i}.hdf5') for i in all_index]

with Pool(processes=n) as pool:
    pool.starmap(datagen, items)

Note for future work: Pool.starmap needs a function to distribute to all the cores. That function cannot be defined in this notebook. It has to be defined in a separate .py file, such as hdf5Track.py in this case, because that is what enables the computer to distribute the function to the different cores. I would even recommend doing this for every function that could potentially be distributed, just to be safe.
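The input splitting used for pooling can also be written so that no simulations are dropped when their number is not divisible by the number of cores. This is a variation on the even split used in this notebook, not the original code. `split_inputs` and the folder and file names are hypothetical, and plain lists stand in for the NumPy arrays.

```python
def split_inputs(inputs, n_cores):
    """Split every vector in `inputs` into n_cores contiguous, near-equal chunks."""
    n_sim = len(next(iter(inputs.values())))
    base, extra = divmod(n_sim, n_cores)
    chunks, start = [], 0
    for core in range(n_cores):
        stop = start + base + (1 if core < extra else 0)  # first `extra` cores get one more
        chunks.append({key: values[start:stop] for key, values in inputs.items()})
        start = stop
    return chunks

inputs = {"v1": list(range(10)), "v2": list(range(10, 20))}
chunks = split_inputs(inputs, 4)
sizes = [len(chunk["v1"]) for chunk in chunks]
print(sizes)  # [3, 3, 2, 2]

# One argument tuple per core, in the order (inputs, child folder, hdf5 path).
items = [(chunk, f"child/{i}", f"hdf5/{i}.hdf5") for i, chunk in enumerate(chunks)]
```

The `items` list has the shape that `Pool.starmap` expects: one tuple of positional arguments per call. The target function must still be imported from a .py module, and when the code runs as a script on Windows the pool should be created under an `if __name__ == "__main__":` guard.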