# trace_retrace.ipynb
## Training ML Models Using Trace and Retrace Data

This notebook aims to improve the ML models tested in `benchmark.ipynb` and `simulator.ipynb` by adding both **trace and retrace** data to the dataset fed into the ML algorithms.

Several pickle files have been added to the repo alongside this notebook. Each pickle file contains the following information (in addition to the `.ibw` file data):
- __traces:__ shape of (N, M, C, L), forward and backward scan lines for the "Height", "Amplitude", "Phase", and "ZHeight" channels.
- __x_measured:__ shape of (N, D), controlling parameters of "drive", "setpoint", and "I gain" for each set of the traces. Both drive and setpoint are given in nanometers.
- __y_measured:__ shape of (N, P), recorded reward values in the optimization process.
- __param:__ important global parameters for the dataset, including ['ScanSize', 'ScanRate', 'PointsLines', 'IntegralGain', 'InvOLS', 'SpringConstant', 'DriveFrequency']
- __header:__ all the global parameters recorded in the topo.ibw file by the SPM controller
- __topo:__ topography with the optimized parameters for visualizing the area where the traces were taken
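To check which of the entries listed above a given file actually contains, a quick summary of keys and array shapes can help. The following is a minimal sketch: `summarize_pickle` is a hypothetical helper, and the `mock` dictionary is a synthetic stand-in with placeholder values, not real data. Loading the real files may additionally require the class behind `topo` to be importable, which is an assumption about how it was pickled.

```python
import os
import pickle
import tempfile

import numpy as np


def summarize_pickle(path):
    """Return {key: array shape or type name} for a pickled dictionary."""
    with open(path, 'rb') as fopen:
        obj = pickle.load(fopen)
    summary = {}
    for key, value in obj.items():
        # Arrays report their shape; everything else reports its type name
        if isinstance(value, np.ndarray):
            summary[key] = value.shape
        else:
            summary[key] = type(value).__name__
    return summary


# Synthetic stand-in for one dataset file (placeholder values, not real data)
mock = {
    'traces': np.zeros((12, 5, 8, 256)),
    'x_measured': np.zeros((12, 3)),
    'y_measured': np.zeros((12, 1)),
    'param': {'ScanSize': None, 'ScanRate': None},
}

with tempfile.TemporaryDirectory() as tmp:
    mock_path = os.path.join(tmp, 'mock.pickle')
    with open(mock_path, 'wb') as fopen:
        pickle.dump(mock, fopen)
    summary = summarize_pickle(mock_path)

for key, value in summary.items():
    print(key, value)
```

Pointing `summarize_pickle` at one of the real files should show whether its `traces` entry follows the (N, M, C, L) layout described here.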

***Explanation of the shape values:***
- __N:__ number of different sets of controlling parameters. This value corresponds to the length of x_measured and y_measured. The first 5-10 sets are usually the random seeding points.
- __M:__ number of repeats for each set of controlling parameters. Usually this value is set to 5 to avoid random errors in the scan lines.
- __C:__ number of different channels recorded for the traces, usually 8 (forward and backward scan lines for each of the four channels)
- __L:__ number of pixels in each trace line, usually 256
- __D:__ number of controlling parameters, usually 3
- __P:__ number of rewards. For BO, it's 1. For MOBO, it can be 3-4
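The shape conventions above can be illustrated on a synthetic array. This is a sketch under stated assumptions, not the layout confirmed by the dataset: the array is random placeholder data, and the split into trace and retrace assumes the two directions are interleaved along the channel axis (even indices forward, odd indices backward). Check the real channel order before relying on it.

```python
import numpy as np

# Typical sizes quoted above: N parameter sets, M repeats, C channels, L pixels
N, M, C, L = 12, 5, 8, 256
rng = np.random.default_rng(0)
traces = rng.normal(size=(N, M, C, L))  # synthetic placeholder, not real data

# Average over the M repeats to suppress random errors in the scan lines
mean_traces = traces.mean(axis=1)  # shape (N, C, L)

# ASSUMPTION: trace and retrace alternate along the channel axis
trace = mean_traces[:, 0::2, :]    # shape (N, C // 2, L)
retrace = mean_traces[:, 1::2, :]  # shape (N, C // 2, L)

# Flatten both directions into one feature row per parameter set
features = np.concatenate([trace, retrace], axis=1).reshape(N, -1)

print(mean_traces.shape, trace.shape, retrace.shape, features.shape)
```

Each row of `features` then lines up with the matching row of `x_measured` and `y_measured`, since all three share the leading dimension N.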

Note that the description above was taken from another notebook. Although it mentions rewards used for reinforcement learning (RL) or Bayesian optimization (BO), **this notebook focuses solely on using the data in these pickle files to detect failures in the SPM image.**




In [2]:
import numpy as np
import pickle
import os
import matplotlib.pyplot as plt
import aespm as ae

In [4]:

sample_folder = 'scan_traces/MOBO'

# Sort the directory listing first so file_name and save_name stay index-aligned
files = sorted(os.listdir(sample_folder))

file_name = []
save_name = []

for f in files:
    if f.endswith('.pickle'):
        file_name.append(os.path.join(sample_folder, f))
        save_name.append(f.split('.')[0])

file_name

with open(file_name[-7], 'rb') as fopen:
    obj = pickle.load(fopen)

topo = obj['topo']
titles = ['Height', 'Amplitude', 'Phase', 'ZHeight']
fig, ax = plt.subplots(1,4,figsize=[12, 2.5])
for i in range(4):
    # topo.data[::2] keeps every other channel; if topo.data holds only 4 channels, use topo.data[i] instead
    im = ax[i].imshow(topo.data[::2][i], origin='lower')
    ax[i].set_title(titles[i])
    plt.colorbar(im, ax=ax[i])

plt.tight_layout()

FileNotFoundError: [Errno 2] No such file or directory: 'scan_traces/MOBO'
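The traceback above means the relative path `scan_traces/MOBO` did not resolve from the kernel's working directory. A minimal sketch of a guard that reports the working directory in the error message is shown below. `list_pickles` is a hypothetical helper introduced here for illustration, not part of the original notebook.

```python
import os


def list_pickles(folder):
    """Return the sorted paths of the .pickle files in folder."""
    if not os.path.isdir(folder):
        # Include the working directory so a wrong relative path is easy to spot
        raise FileNotFoundError(
            f"Folder '{folder}' not found "
            f"(current working directory: {os.getcwd()})"
        )
    return sorted(
        os.path.join(folder, f)
        for f in os.listdir(folder)
        if f.endswith('.pickle')
    )


try:
    paths = list_pickles('scan_traces/MOBO')
except FileNotFoundError as err:
    paths = []
    print(err)
```

If the message shows an unexpected working directory, either launch the notebook from the repo root or set `sample_folder` to an absolute path.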