The goal of this workbook is to generate all of the simulated ellipsometric spectra that will be used for training neural networks. First, load any previously defined functions. These include the functions developed to simulate Tauc-Lorentz and Cody-Lorentz oscillators and the Bruggeman effective medium approximation (EMA). 

In [2]:
#Imports libraries and defines functions
%run Functions.ipynb

Now load in the needed optical properties

In [7]:
# input location of files
os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY TO THE FOLDER LABELED "Optical Properties" in the XXXX space ####################
file =  "SLG.csv"
SLG = pd.read_csv(file)
SLG.name = 'SLG'

#Load in optical properties for void
file =  "Void.csv"
Void = pd.read_csv(file)
Void.name = "Void"

E = SLG['Energy (eV)'] # define Energy range
wv = SLG['Wavelength (nm)'] # Define Wavelength range 

# Now we also need to load in the Si native oxide and the Si wafer. 

file =  "Si_JAW.csv"
Si_JAW = pd.read_csv(file)
Si_JAW.name = "Si_JAW"


file =  "NTVE_JAW.csv"
NTVE_JAW = pd.read_csv(file)
NTVE_JAW.name = "NTVE_JAW"
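
SE_Sim_Substrate takes its row count from the wavelength column of the first layer and then reads every other table by row position, so the tables loaded above are expected to share one wavelength grid. The following is a minimal sketch of a check that could be run before simulating. `check_common_grid` and its tolerance are hypothetical and are not part of Functions.ipynb; the toy lists stand in for the 'Wavelength (nm)' columns.

```python
import math

def check_common_grid(grids, rel_tol=1e-9):
    """Return the names of tables whose wavelength grid differs from the first one.

    grids maps a material name to its sequence of wavelengths in nm, for example
    {"Void": Void["Wavelength (nm)"], "SLG": SLG["Wavelength (nm)"]}.
    """
    names = list(grids)
    reference = list(grids[names[0]])
    mismatched = []
    for name in names[1:]:
        candidate = list(grids[name])
        same_length = len(candidate) == len(reference)
        same_values = same_length and all(
            math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(reference, candidate)
        )
        if not same_values:
            mismatched.append(name)
    return mismatched

# Toy grids standing in for the CSV columns (assumed values, not real data)
grids = {
    "Void": [300.0, 400.0, 500.0],
    "SLG": [300.0, 400.0, 500.0],
    "Si_JAW": [300.0, 450.0, 500.0],
}
print(check_common_grid(grids))  # prints ['Si_JAW']
```

An empty list means every table lines up with the first one row by row.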


In [9]:

"""
Author: Alex Bordovalos
Last Edited: 3/12/2025

Description: This function simulates spectroscopic ellipsometry data for film stacks
in the substrate configuration. This means that the substrate is the bottom-most layer of the stack, and the 
simulated measurement is in the reflection configuration. 

Calculations are based on the Abeles-notation scattering matrix.

Structure is a list of pandas data frames. Each data frame corresponds to one material in
the film stack and must contain at least the following columns: 'Wavelength (nm)', 'n', 'k'.
Each film's data frame also needs a .name attribute, which is used to label the output.

For example, for a film stack of SnO2 on glass measured in lab ambient air, the input for Structure should look like

Structure = [Air, SnO2, Glass] with Air, SnO2, and Glass all being pandas data frames with optical properties.

The next input is Theta_Incident. This is the angle of incidence you want to simulate, in degrees.

Example: Theta_Incident = 60

The next input is Mat_Thick. This input is a list of lists with one inner list per film, so its length is len(Structure) - 2.
Entry k of every inner list belongs to simulation k. Using the SnO2 example from before, to simulate
3 different SnO2 thickness values, use Mat_Thick = [[100, 200, 150]] (units in nm).

If you had 2 films of different thicknesses, say film 1 can be 50 or 100 and film 2 can be 200 or 250, you would input 
Mat_Thick = [[50, 100] , [200, 250]]
This simulates the pairs (50, 200) and (100, 250), not every combination.

The next input, Ang_Offset, is a list of angle-of-incidence offsets in degrees, with one value per simulated spectrum.
If you do not want an offset, pass a list of zeros. 

When write_data = False, no data is saved and the data frame for the first thickness set is returned.
When write_data = True, a CSV file is written to the current working directory for each thickness set.


To Run this function properly, have these in your import block: 
import pandas as pd
import numpy as np
import os
import cmath
import math
import sympy
from sympy import Symbol
import matplotlib.pyplot as plt
import glob
import random
from datetime import datetime


"""

def SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, Ang_Offset, write_data=False ):
     
    # Define Terms Used Throughout the Calculation
    
    wv = Structure[0]['Wavelength (nm)']
    

    # S1: User input may only have n and k, This will add N, e1, e2, and e to the tabulated data frames  
    
    for i in range(len(Structure)):
        N = []
        e1 =[]
        e2 =[]
        e = []
        for j in range(len(wv)):
            N.append(complex (Structure[i]['n'][j] , Structure[i]['k'][j] ))
            e1.append( (Structure[i]['n'][j] * Structure[i]['n'][j]) - (Structure[i]['k'][j] * Structure[i]['k'][j]) )
            e2.append( 2 * (Structure[i]['n'][j] * Structure[i]['k'][j]) )
            e.append(complex(e1[j], e2[j]))

        Structure[i]['N'] = N
        Structure[i]['ε1'] = e1
        Structure[i]['ε2'] = e2
        Structure[i]['ε'] = e   


    
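    # S2: Defines the first angle based on the user input Theta_Incident.
    # Note: only Ang_Offset[0] is applied to the angle here; the other entries of
    # Ang_Offset are used only when naming the output for each thickness set.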
    
    t = []
    new_angle = Theta_Incident + Ang_Offset[0]
    for i in range(len(wv)):
        t.append(complex(math.radians(new_angle), 0))
    Structure[0]['Angle'] = t

    
    # S3: Defines the other angles from Snell's law - (N1)(sin(t1)) = (N2)(sin(t2))
    
    for i in range(len(Structure) - 1):
        t =[]
        for j in range(len(wv)):
            t.append(cmath.asin( (Structure[i]['N'][j] / Structure[i+1]['N'][j]) * cmath.sin(Structure[i]['Angle'][j]) ) )

        Structure[i+1]['Angle'] = t
        
    
        
    # S4: Next the terms of the interface matrices and propagation matrices will be defined
    # If Mat_Thick[0] is a list, then multiple simulations will be done for each thickness
    
    for k in range(len(Mat_Thick[0])):

       
        Is01 =[]
        Ip01 =[]
        L1 = []
        Is12 = []
        Ip12 =[]
        L2 = []
        Is23 = []
        Ip23 =[]
        L3 = []
        Is34 = []
        Ip34 =[]
        L4 =[]
        Is45 =[]
        Ip45 =[]
        
        L5 =[]
        Is56 =[]
        Ip56 = []
        
        L6 =[]
        Is67 =[]
        Ip67 =[]
        
        L7 =[]
        Is78 =[]
        Ip78 =[]
        
        L8 =[]
        Is89 =[]
        Ip89 =[]
        
        L9 =[]
        Is910 =[]
        Ip910 =[]
        
        L10 =[]
        Is1011 =[]
        Ip1011 =[]
        
        L11 =[]
        Is1112 =[]
        Ip1112 =[]
        
        L12 =[]
        Is1213 =[]
        Ip1213 =[]
        
        L13 =[]
        Is1314 =[]
        Ip1314 =[]
        
        L14 =[]
        Is1415 =[]
        Ip1415 =[]
        
        L15 =[]
        Is1516 =[]
        Ip1516 =[]


        # List of terms for the interface and propagation matrices
        
        Mlist = [
                 [Is01,Ip01], 
                 [L1, Is12,Ip12], 
                 [L2, Is23, Ip23], 
                 [L3, Is34, Ip34], 
                 [L4, Is45, Ip45], 
                 [L5, Is56, Ip56], 
                 [L6, Is67, Ip67], 
                 [L7, Is78, Ip78], 
                 [L8, Is89, Ip89],
                 [L9, Is910, Ip910],
                 [L10, Is1011, Ip1011],
                 [L11, Is1112, Ip1112],
                 [L12, Is1213, Ip1213],
                 [L13, Is1314, Ip1314],
                 [L14, Is1415, Ip1415],
                 [L15, Is1516, Ip1516]
                ]
        
    
        
        for j in range(len(Structure) - 1):

            for i in range(len(wv)):
                
                # Fresnel coefficients are calculated for the interface between layer j and layer j+1

                tsjk = ( 2 * Structure[j]['N'][i] * cmath.cos(Structure[j]['Angle'][i]) ) / ( Structure[j+1]['N'][i] * cmath.cos(Structure[j+1]['Angle'][i]) + Structure[j]['N'][i] * cmath.cos(Structure[j]['Angle'][i]) )
                tpjk = ( 2 * Structure[j]['N'][i] * cmath.cos(Structure[j]['Angle'][i]) ) / ( Structure[j+1]['N'][i] * cmath.cos(Structure[j]['Angle'][i]) + Structure[j]['N'][i] * cmath.cos(Structure[j+1]['Angle'][i]) )
                rsjk = ( Structure[j]['N'][i] * cmath.cos(Structure[j]['Angle'][i]) - Structure[j+1]['N'][i] * cmath.cos(Structure[j+1]['Angle'][i]) ) / ( Structure[j]['N'][i] * cmath.cos(Structure[j]['Angle'][i]) + Structure[j+1]['N'][i] * cmath.cos(Structure[j+1]['Angle'][i]) )
                rpjk=  ( Structure[j+1]['N'][i] * cmath.cos(Structure[j]['Angle'][i]) - Structure[j]['N'][i] * cmath.cos(Structure[j+1]['Angle'][i]) ) / ( Structure[j+1]['N'][i] * cmath.cos(Structure[j]['Angle'][i]) + Structure[j]['N'][i] * cmath.cos(Structure[j+1]['Angle'][i]) )

                if j == 0:
                    Mlist[j][0].append( 1 / tsjk * np.array([[1, rsjk],
                                                             [rsjk, 1]]))

                    Mlist[j][1].append( 1 / tpjk * np.array([[1, rpjk],
                                                             [rpjk, 1]]))
                else:
                    e = (2 * math.pi * Structure[j]['N'][i] * cmath.cos(Structure[j]['Angle'][i]) ) / Structure[j]['Wavelength (nm)'][i]

                    Mlist[j][0].append( np.array([[cmath.exp( complex(0,-1) * Mat_Thick[j-1][k]* e), 0],
                                            [0, cmath.exp(complex(0,1) * Mat_Thick[j-1][k] * e)]]) )

                    Mlist[j][1].append( 1 / tsjk * np.array([[1, rsjk],
                                                             [rsjk, 1]]))

                    Mlist[j][2].append( 1 / tpjk * np.array([[1, rpjk],
                                                             [rpjk, 1]]))

                    

        # S5: Now we can define the S matrix and calculate rho, psi, delta, N, C, and S
    
        Ss = []
        Sp = []
        rho_calc = []

        rs = []
        rp = []
        ts = []
        tp = []

        pdivs = []
        psi = []
        delta = []
        N = []
        C = []
        S =[]


        # S6: This logic accounts for different numbers of layers. The current range is 3-16 layers
        # (ambient and substrate included). It can be extended by adding terms to S4 and S6.
        # This section calculates the S matrix for both the p and s polarizations
        
        
        
        for i in range(len(Ip01)):

            if len(Structure) == 3: 

                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] ))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] ))
            
            if len(Structure) == 4: 

                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i]))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i]))

            if len(Structure) == 5: 

                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i] @ L3[i] @ Is34[i] ))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i] @ L3[i] @ Ip34[i]))
                
            if len(Structure) == 6: 
               
                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i] @ L3[i] @ Is34[i] @ L4[i] @ Is45[i] ))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i] @ L3[i] @ Ip34[i] @ L4[i] @ Ip45[i] ))
                
                
            if len(Structure) == 7: 
               
                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i] @ L3[i] @ Is34[i] @ L4[i] @ Is45[i] @ L5[i] @ Is56[i]))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i] @ L3[i] @ Ip34[i] @ L4[i] @ Ip45[i] @ L5[i] @ Ip56[i]))
                
              
                      
            if len(Structure) == 8: 
               
                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i] @ L3[i] @ Is34[i] @ L4[i] @ Is45[i] @ L5[i] @ Is56[i] @ L6[i] @ Is67[i]))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i] @ L3[i] @ Ip34[i] @ L4[i] @ Ip45[i] @ L5[i] @ Ip56[i] @ L6[i] @ Ip67[i]))
                
    
            if len(Structure) == 9: 
               
                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i] @ L3[i] @ Is34[i] @ L4[i] @ Is45[i] @ L5[i] @ Is56[i] @ L6[i] @ Is67[i] @ L7[i] @ Is78[i]))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i] @ L3[i] @ Ip34[i] @ L4[i] @ Ip45[i] @ L5[i] @ Ip56[i] @ L6[i] @ Ip67[i] @ L7[i] @ Ip78[i]))
               
               
            if len(Structure) == 10: 
               
                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i] @ L3[i] @ Is34[i] @ L4[i] @ Is45[i] @ L5[i] @ Is56[i] @ L6[i] @ Is67[i] @ L7[i] @ Is78[i] @ L8[i] @ Is89[i]))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i] @ L3[i] @ Ip34[i] @ L4[i] @ Ip45[i] @ L5[i] @ Ip56[i] @ L6[i] @ Ip67[i] @ L7[i] @ Ip78[i] @ L8[i] @ Ip89[i]))
                
    
            if len(Structure) == 11: 
               
                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i] @ L3[i] @ Is34[i] @ L4[i] @ Is45[i] @ L5[i] @ Is56[i] @ L6[i] @ Is67[i] @ L7[i] @ Is78[i] @ L8[i] @ Is89[i] @ L9[i] @ Is910[i]))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i] @ L3[i] @ Ip34[i] @ L4[i] @ Ip45[i] @ L5[i] @ Ip56[i] @ L6[i] @ Ip67[i] @ L7[i] @ Ip78[i] @ L8[i] @ Ip89[i] @ L9[i] @ Ip910[i]))
              
            if len(Structure) == 12: 
               
                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i] @ L3[i] @ Is34[i] @ L4[i] @ Is45[i] @ L5[i] @ Is56[i] @ L6[i] @ Is67[i] @ L7[i] @ Is78[i] @ L8[i] @ Is89[i] @ L9[i] @ Is910[i] @ L10[i] @ Is1011[i]))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i] @ L3[i] @ Ip34[i] @ L4[i] @ Ip45[i] @ L5[i] @ Ip56[i] @ L6[i] @ Ip67[i] @ L7[i] @ Ip78[i] @ L8[i] @ Ip89[i] @ L9[i] @ Ip910[i] @ L10[i] @ Ip1011[i]))
      
            
            if len(Structure) == 13: 
               
                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i] @ L3[i] @ Is34[i] @ L4[i] @ Is45[i] @ L5[i] @ Is56[i] @ L6[i] @ Is67[i] @ L7[i] @ Is78[i] @ L8[i] @ Is89[i] @ L9[i] @ Is910[i] @ L10[i] @ Is1011[i] @ L11[i] @ Is1112[i]))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i] @ L3[i] @ Ip34[i] @ L4[i] @ Ip45[i] @ L5[i] @ Ip56[i] @ L6[i] @ Ip67[i] @ L7[i] @ Ip78[i] @ L8[i] @ Ip89[i] @ L9[i] @ Ip910[i] @ L10[i] @ Ip1011[i] @ L11[i] @ Ip1112[i]))
      
            
            if len(Structure) == 14: 
               
                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i] @ L3[i] @ Is34[i] @ L4[i] @ Is45[i] @ L5[i] @ Is56[i] @ L6[i] @ Is67[i] @ L7[i] @ Is78[i] @ L8[i] @ Is89[i] @ L9[i] @ Is910[i] @ L10[i] @ Is1011[i] @ L11[i] @ Is1112[i] @ L12[i] @ Is1213[i] ))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i] @ L3[i] @ Ip34[i] @ L4[i] @ Ip45[i] @ L5[i] @ Ip56[i] @ L6[i] @ Ip67[i] @ L7[i] @ Ip78[i] @ L8[i] @ Ip89[i] @ L9[i] @ Ip910[i] @ L10[i] @ Ip1011[i] @ L11[i] @ Ip1112[i] @ L12[i] @ Ip1213[i]))
      
    
      
            if len(Structure) == 15: 
               
                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i] @ L3[i] @ Is34[i] @ L4[i] @ Is45[i] @ L5[i] @ Is56[i] @ L6[i] @ Is67[i] @ L7[i] @ Is78[i] @ L8[i] @ Is89[i] @ L9[i] @ Is910[i] @ L10[i] @ Is1011[i] @ L11[i] @ Is1112[i] @ L12[i] @ Is1213[i] @ L13[i] @ Is1314[i]  ))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i] @ L3[i] @ Ip34[i] @ L4[i] @ Ip45[i] @ L5[i] @ Ip56[i] @ L6[i] @ Ip67[i] @ L7[i] @ Ip78[i] @ L8[i] @ Ip89[i] @ L9[i] @ Ip910[i] @ L10[i] @ Ip1011[i] @ L11[i] @ Ip1112[i] @ L12[i] @ Ip1213[i] @ L13[i] @ Ip1314[i] ))
      
            if len(Structure) == 16: 
               
                Ss.append(np.conj(Is01[i] @ L1[i] @ Is12[i] @ L2[i] @ Is23[i] @ L3[i] @ Is34[i] @ L4[i] @ Is45[i] @ L5[i] @ Is56[i] @ L6[i] @ Is67[i] @ L7[i] @ Is78[i] @ L8[i] @ Is89[i] @ L9[i] @ Is910[i] @ L10[i] @ Is1011[i] @ L11[i] @ Is1112[i] @ L12[i] @ Is1213[i] @ L13[i] @ Is1314[i] @ L14[i] @ Is1415[i] ))
                Sp.append(np.conj(Ip01[i] @ L1[i] @ Ip12[i] @ L2[i] @ Ip23[i] @ L3[i] @ Ip34[i] @ L4[i] @ Ip45[i] @ L5[i] @ Ip56[i] @ L6[i] @ Ip67[i] @ L7[i] @ Ip78[i] @ L8[i] @ Ip89[i] @ L9[i] @ Ip910[i] @ L10[i] @ Ip1011[i] @ L11[i] @ Ip1112[i] @ L12[i] @ Ip1213[i] @ L13[i] @ Ip1314[i] @ L14[i] @ Ip1415[i] ))
      
        

            # S7: From the S matrix we can calculate rho, psi, delta, N, C, and S  
            
            rho_calc.append ( (Sp[i][1][0] / Sp[i][0][0])  * (Ss[i][0][0] / Ss[i][1][0]) )

            rs.append(Ss[i][1][0] / Ss[i][0][0])
            rp.append(Sp[i][1][0] / Sp[i][0][0])
            pdivs.append(rp[i] / rs[i])
            ts.append( 1 / Ss[i][0][0] )
            tp.append( 1 / Sp[i][0][0] )

            psi.append( (cmath.atan(abs(rp[i] / rs[i]))).real )

            delta.append(  ( np.angle(rp[i]) - cmath.phase(rs[i]))) 

            N.append( (cmath.cos(2* psi[i])).real)
            C.append( (cmath.sin(2* psi[i]) * cmath.cos(delta[i])).real)
            S.append( (cmath.sin(2* psi[i])*cmath.sin(delta[i])).real)
            
            # S8: Now the N, C, and S are all collected into a dataframe
            # The data frame is named based on the structure and thicknesses
            # The data frame is then written to a CSV file before the next set of calculations
            
            
            
            #S9 now we want to have the option to save the optical properties of a film of interest. 
            

            

        my_dict =  {'Wavelength (nm)': wv, 'N': N, 'C': C, 'S': S}

        df = pd.DataFrame(my_dict)
        name = str()
        for z in range(len(Mat_Thick)):
            name = name + str(Structure[z+1].name) + "_"
            name = name + str(Mat_Thick[z][k]) + "nm"
            if z < len(Mat_Thick) - 1:
                name = name + "_"
        name = name + "_AO_" + str(Ang_Offset[k]) 

        name_row = pd.DataFrame([[name] + [''] * (df.shape[1] - 1)], columns=df.columns)
        df = pd.concat([name_row, df], ignore_index=True)

        timestamp = datetime.now()
        timestamp_string = timestamp.strftime('%Y-%m-%d-%H-%M-%S') + f"-{timestamp.microsecond // 1000:03d}"
        title = "trial_" + timestamp_string  + ".csv"

        if write_data: 
            df.to_csv(title, index=False)  # title already ends in ".csv"
            #return(df)
        else: 
            return(df)
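
The S6 branches in the function above spell out one matrix product per stack size. The same product can be accumulated for any number of layers. The following is a minimal sketch of that idea, not a drop-in replacement: `chain_product`, `interface`, and `propagation` are hypothetical helpers, and the toy coefficients stand in for the per-wavelength interface (I) and propagation (L) matrices built in S4.

```python
from functools import reduce
import numpy as np

def chain_product(matrices):
    """Multiply 2x2 matrices left to right: M0 @ M1 @ ... @ Mn."""
    return reduce(np.matmul, matrices)

def interface(r, t):
    """Interface matrix in the form used in S4: (1/t) * [[1, r], [r, 1]]."""
    return (1 / t) * np.array([[1, r], [r, 1]], dtype=complex)

def propagation(phase):
    """Propagation matrix for a layer with complex phase thickness `phase` (radians)."""
    return np.array([[np.exp(-1j * phase), 0],
                     [0, np.exp(1j * phase)]], dtype=complex)

# Toy values for a 4-layer stack (ambient / film 1 / film 2 / substrate)
I01 = interface(0.2 + 0.1j, 0.8)
L1 = propagation(0.7 + 0.05j)
I12 = interface(-0.3, 1.1 + 0.2j)
L2 = propagation(1.3)
I23 = interface(0.5 - 0.2j, 0.6)

S_chain = chain_product([I01, L1, I12, L2, I23])
S_explicit = I01 @ L1 @ I12 @ L2 @ I23
print(np.allclose(S_chain, S_explicit))  # prints True
```

Inside SE_Sim_Substrate the list would be built once from the s matrices and once from the p matrices, and the result would still be wrapped in np.conj as in S6.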

Data_Set_1: This data set will be generated with a Cody-Lorentz oscillator on a native-oxide-coated Si wafer. The structure will consist of Si Wafer / Native Oxide / Bulk Film. This structure mimics the structure determined from least-squares regression on samples MV1519 and MV1523. 

The train set will be 100,000 files. The validation set will be 10,000. The test set will be 10,000.
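
The train, validation, and test cells below repeat the same parameter draws. As a minimal sketch, the draws could be gathered into one function with a seeded generator so that each set can be regenerated. `sample_cl_parameters` and the seed value are hypothetical; the ranges are copied from the cells below, and `randrange` excludes its upper bound.

```python
import random

def sample_cl_parameters(rng):
    """Draw one set of Cody-Lorentz parameters using the ranges from the cells below."""
    Eo_temp = rng.randrange(300, 500)
    Eg_temp = rng.randrange(100, 250)
    Eg = Eg_temp / 100
    Et = 0  # a value of 0 neglects the Urbach energy
    return {
        "E_inf": 1,
        "Amp": rng.randrange(500, 1500) / 10,
        "Br": rng.randrange(150, 350) / 100,
        "Eo": Eo_temp / 100,
        "Eg": Eg,
        # Upper bound depends on the draw: Ep stays below Eo - Eg
        "Ep": rng.randrange(50, Eo_temp - Eg_temp) / 100,
        "Egt": Eg + Et,
    }

# A seeded generator makes the draws reproducible (the seed is an arbitrary choice)
train_rng = random.Random(0)
params = sample_cl_parameters(train_rng)
print(params)
```

The returned values map onto the arguments passed to Get_CL_Material in the cells below.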

In [10]:
#GENERATE TRAIN SET

os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY WHERE YOU WOULD LIKE TO STORE THE TRAINING DATA in the XXXX space ####################

for i in range(100000): #How many files we want to generate 

    #start by randomly generating the parameters from a pre-defined range. 
    E_inf = (1) # fixed at 1
    Amp  = (random.randrange(500,1500) / 10 ) # range from [50:150.0]
    Br = (random.randrange(150,350) / 100) # range from [1.50 :3.50]
   
    Eo_temp = random.randrange(300,500) 
    Eo = Eo_temp / 100 # range from [3.00:5.00]
    
    Eg_temp = random.randrange(100,250) 
    Eg = Eg_temp / 100 # range from [1.00:2.5]

    Ep = (random.randrange(50, (Eo_temp - Eg_temp))  / 100 ) # range from 0.50 up to, but not including, Eo - Eg
    Et = 0 # a value of 0 effectively neglects the Urbach energy and simplifies the equations.
    Egt = Eg + Et



    # Now we generate our Cody-Lorentz material
    CL = Get_CL_Material(E, Ep, Eg, Eo, Br, Amp, Egt, E_inf, wv)

    #Now for each model, lets make some simulated SE data 
    
    #Generate Thicknesses
    Bulk_Thickness = (random.randrange(2500,12500) / 100 ) # range from [25.00:125.00]
    NTVE_JAW_Thickness = (random.randrange(160,170) / 100 ) # range from [1.60:1.70] # fixed
    ang_off = 0 # range from [0:0]

    Structure = [Void, CL, NTVE_JAW, Si_JAW]
    Mat_Thick = [ [Bulk_Thickness], [NTVE_JAW_Thickness] ]
    Theta_Incident = 64.93
    SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, [ang_off], write_data=True )
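
Each file written by the loop above starts with a label row that SE_Sim_Substrate builds from the layer names, the thicknesses in nm, and the angle offset. The following is a minimal sketch of reading those values back from such a label. `parse_label` is hypothetical, and the material name "CL" in the example is an assumption, since the name assigned by Get_CL_Material is not shown here.

```python
import re

def parse_label(label):
    """Extract the layer thicknesses (nm) and the angle offset (degrees) from a label row."""
    # Thicknesses appear as "_<number>nm"; layer names may themselves contain underscores
    thicknesses = [float(v) for v in re.findall(r"_(-?\d+(?:\.\d+)?)nm", label)]
    # The angle offset follows the final "_AO_" marker
    offset = float(label.rsplit("_AO_", 1)[1])
    return thicknesses, offset

# Example label in the layout <name>_<t>nm_<name>_<t>nm_AO_<offset>
print(parse_label("CL_55.3nm_NTVE_JAW_1.65nm_AO_0"))  # prints ([55.3, 1.65], 0.0)
```

The pattern assumes that layer names do not themselves contain text of the form `_<number>nm`.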

In [11]:
#GENERATE Validation Set

os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY WHERE YOU WOULD LIKE TO STORE THE VALIDATION DATA in the XXXX space ####################

for i in range(10000): #How many oscillators we want to generate

    #start by randomly generating the parameters from a range. 

    E_inf = (1) # fixed at 1
    Amp  = (random.randrange(500,1500) / 10 ) # range from [50:150.0]
    Br = (random.randrange(150,350) / 100) # range from [1.50 :3.50]
   
    Eo_temp = random.randrange(300,500) 
    Eo = Eo_temp / 100 # range from [3.00:5.00]
    
    Eg_temp = random.randrange(100,250) 
    Eg = Eg_temp / 100 # range from [1.00:2.5]

    Ep = (random.randrange(50, (Eo_temp - Eg_temp))  / 100 ) # range from 0.50 up to, but not including, Eo - Eg
    Et = 0 # a value of 0 effectively neglects the Urbach energy and simplifies the equations.
    Egt = Eg + Et



    # Now we generate our Cody-Lorentz material
    CL = Get_CL_Material(E, Ep, Eg, Eo, Br, Amp, Egt, E_inf, wv)

    #Now for each model, lets make some simulated SE data 
    
    #Generate Thicknesses
    Bulk_Thickness = (random.randrange(2500,12500) / 100 ) # range from [25.00:125.00]
    NTVE_JAW_Thickness = (random.randrange(160,170) / 100 ) # range from [1.60:1.70] # fixed
    ang_off = 0 # range from [0:0]

    Structure = [Void, CL, NTVE_JAW, Si_JAW]
    Mat_Thick = [ [Bulk_Thickness], [NTVE_JAW_Thickness] ]
    Theta_Incident = 64.93
    SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, [ang_off], write_data=True )

In [12]:
#GENERATE TEST SET

os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY WHERE YOU WOULD LIKE TO STORE THE TEST DATA in the XXXX space ####################

for i in range(10000): #How many oscillators we want to generate

    #start by randomly generating the parameters from a range. 

    E_inf = (1) # fixed at 1
    Amp  = (random.randrange(500,1500) / 10 ) # range from [50:150.0]
    Br = (random.randrange(150,350) / 100) # range from [1.50 :3.50]
   
    Eo_temp = random.randrange(300,500) 
    Eo = Eo_temp / 100 # range from [3.00:5.00]
    
    Eg_temp = random.randrange(100,250) 
    Eg = Eg_temp / 100 # range from [1.00:2.5]

    Ep = (random.randrange(50, (Eo_temp - Eg_temp))  / 100 ) # range from 0.50 up to, but not including, Eo - Eg
    Et = 0 # a value of 0 effectively neglects the Urbach energy and simplifies the equations.
    Egt = Eg + Et



    # Now we generate our Cody-Lorentz material
    CL = Get_CL_Material(E, Ep, Eg, Eo, Br, Amp, Egt, E_inf, wv)

    #Now for each model, lets make some simulated SE data 
    
    #Generate Thicknesses
    Bulk_Thickness = (random.randrange(2500,12500) / 100 ) # range from [25.00:125.00]
    NTVE_JAW_Thickness = (random.randrange(160,170) / 100 ) # range from [1.60:1.70] # fixed
    ang_off = 0 # range from [0:0]

    Structure = [Void, CL, NTVE_JAW, Si_JAW]
    Mat_Thick = [ [Bulk_Thickness], [NTVE_JAW_Thickness] ]
    Theta_Incident = 64.93
    SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, [ang_off], write_data=True )

Data_Set_2: This data set will be generated with a Cody-Lorentz oscillator on soda-lime glass (SLG). The structure will consist of SLG / Bulk Film. This structure mimics the structure determined from least-squares regression on samples MV1519 and MV1523, except that the substrate for the MV1530 sample is used. 

The train set will be 100,000 files. The validation set will be 10,000. The test set will be 10,000

Data_Set_2: This data set will be generated with a Tauc-Lorentz oscillator on soda-lime glass. The structure will consist of Soda-Lime Glass / Bulk Film / Surface Layer. The surface layer will consist of 50% void and 50% of the randomly generated bulk film. This structure mimics the structure determined from least-squares regression on sample MV1530. 
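
The 50% void surface layer is modeled with a Bruggeman EMA. The Bruggeman_EMA function used in the cells below is defined in Functions.ipynb and is not reproduced here. As a rough illustration only, the following sketch solves the generic two-phase Bruggeman equation for spherical inclusions at a single photon energy. `bruggeman_two_phase`, the root-selection rule, and the toy dielectric values are assumptions and may differ from what Bruggeman_EMA does.

```python
import cmath

def bruggeman_two_phase(eps_a, eps_b, frac_a):
    """Effective dielectric function of a two-phase Bruggeman mixture.

    Solves frac_a*(eps_a - eps)/(eps_a + 2*eps) + frac_b*(eps_b - eps)/(eps_b + 2*eps) = 0,
    which reduces to the quadratic 2*eps**2 - b*eps - eps_a*eps_b = 0.
    """
    frac_b = 1.0 - frac_a
    b = (3 * frac_a - 1) * eps_a + (3 * frac_b - 1) * eps_b
    root = cmath.sqrt(b * b + 8 * eps_a * eps_b)
    candidates = [(b + root) / 4, (b - root) / 4]
    # Assumed selection rule: keep the root with the larger imaginary part,
    # and break ties (for example, purely real inputs) with the larger real part
    return max(candidates, key=lambda eps: (round(eps.imag, 12), eps.real))

eps_film = complex(12.0, 3.0)  # toy bulk-film value at one photon energy
eps_void = complex(1.0, 0.0)
eps_surface = bruggeman_two_phase(eps_film, eps_void, 0.5)
print(eps_surface)
```

Applying the function at every photon energy of the bulk film gives the dielectric function of the surface layer; a production version may need a stricter test for the physical root.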

In [13]:
#Generate Training Set
# Now lets change directory to the train set directory
os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY WHERE YOU WOULD LIKE TO STORE THE TRAINING DATA in the XXXX space ####################

for i in range(50000): #How many oscillators we want to generate

    #start by randomly generating the parameters from a range. 

    E_inf = (1) # fixed at 1
    Amp  = (random.randrange(500,1500) / 10 ) # range from [50:150.0]
    Br = (random.randrange(150,350) / 100) # range from [1.50 :3.50]
   
    Eo_temp = random.randrange(300,500) 
    Eo = Eo_temp / 100 # range from [3.00:5.00]
    
    Eg_temp = random.randrange(100,250) 
    Eg = Eg_temp / 100 # range from [1.00:2.5]

    Ep = (random.randrange(50, (Eo_temp - Eg_temp))  / 100 ) # range from 0.50 up to, but not including, Eo - Eg
    Et = 0 # a value of 0 effectively neglects the Urbach energy and simplifies the equations.
    Egt = Eg + Et



    # Now we generate our Cody-Lorentz material
    CL = Get_CL_Material(E, Ep, Eg, Eo, Br, Amp, Egt, E_inf, wv)
    EMA = Bruggeman_EMA( CL, Void , 0.5 )
    #Now for each model, lets make some simulated SE data 
    
    #Generate Thicknesses
    Bulk_Thickness = (random.randrange(2500,12500) / 100 ) # range from [25.00:125.00]
    EMA_Thickness = (random.randrange(10,200) / 100 ) # range from [0.10:2.00]
    ang_off = 0 # range from [0:0]

    Structure = [Void, EMA, CL, SLG]
    Mat_Thick = [ [EMA_Thickness], [Bulk_Thickness]]
    Theta_Incident = 64.93
    SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, [ang_off], write_data=True )

In [15]:
# Generate Validation Set

os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY WHERE YOU WOULD LIKE TO STORE THE VALIDATION DATA in the XXXX space ####################

for i in range(5000): # How many files you want to generate
    #start by randomly generating the parameters from a range. 

    E_inf = (1) # fixed at 1
    Amp  = (random.randrange(500,1500) / 10 ) # range from [50:150.0]
    Br = (random.randrange(150,350) / 100) # range from [1.50 :3.50]
   
    Eo_temp = random.randrange(300,500) 
    Eo = Eo_temp / 100 # range from [3.00:5.00]
    
    Eg_temp = random.randrange(100,250) 
    Eg = Eg_temp / 100 # range from [1.00:2.5]

    Ep = (random.randrange(50, (Eo_temp - Eg_temp))  / 100 ) # range from 0.50 up to, but not including, Eo - Eg
    Et = 0 # a value of 0 effectively neglects the Urbach energy and simplifies the equations.
    Egt = Eg + Et



    # Now we generate our Cody-Lorentz material
    CL = Get_CL_Material(E, Ep, Eg, Eo, Br, Amp, Egt, E_inf, wv)
    EMA = Bruggeman_EMA( CL, Void , 0.5 )
    #Now for each model, lets make some simulated SE data 
    
    #Generate Thicknesses
    Bulk_Thickness = (random.randrange(2500,12500) / 100 ) # range from [25.00:125.00]
    EMA_Thickness = (random.randrange(10,200) / 100 ) # range from [0.10:2.00]
    ang_off = 0 # range from [0:0]

    Structure = [Void, EMA, CL, SLG]
    Mat_Thick = [ [EMA_Thickness], [Bulk_Thickness]]
    Theta_Incident = 64.93
    SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, [ang_off], write_data=True )

In [14]:
#Generate Test Set

os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY WHERE YOU WOULD LIKE TO STORE THE TEST DATA in the XXXX space ####################

for i in range(5000): #How many oscillators we want to generate

    #start by randomly generating the parameters from a range. 

    E_inf = (1) # fixed at 1
    Amp  = (random.randrange(500,1500) / 10 ) # range from [50:150.0]
    Br = (random.randrange(150,350) / 100) # range from [1.50 :3.50]
   
    Eo_temp = random.randrange(300,500) 
    Eo = Eo_temp / 100 # range from [3.00:5.00]
    
    Eg_temp = random.randrange(100,250) 
    Eg = Eg_temp / 100 # range from [1.00:2.5]

    Ep = (random.randrange(50, (Eo_temp - Eg_temp))  / 100 ) # range from 0.50 up to, but not including, Eo - Eg
    Et = 0 # a value of 0 effectively neglects the Urbach energy and simplifies the equations.
    Egt = Eg + Et



    # Now we generate our Cody-Lorentz material
    CL = Get_CL_Material(E, Ep, Eg, Eo, Br, Amp, Egt, E_inf, wv)
    EMA = Bruggeman_EMA( CL, Void , 0.5 )
    #Now for each model, lets make some simulated SE data 
    
    #Generate Thicknesses
    Bulk_Thickness = (random.randrange(2500,12500) / 100 ) # range from [25.00:125.00]
    EMA_Thickness = (random.randrange(10,200) / 100 ) # range from [0.10:2.00]
    ang_off = 0 # range from [0:0]

    Structure = [Void, EMA, CL, SLG]
    Mat_Thick = [ [EMA_Thickness], [Bulk_Thickness]]
    Theta_Incident = 64.93
    SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, [ang_off], write_data=True )

Data_Set_3-1: This data set will be generated with a Cody-Lorentz oscillator on soda-lime glass (SLG). The structure will consist of SLG / Bulk Film.

Data set 3 will contain 12,500 files from Data_Set_1, 12,500 files from Data_Set_2, 12,500 files from Data_Set_3-1, and 12,500 files from Data_Set_3-2. Each of the 4 data sets that make up Data Set 3 will use a different structure.
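
One way to assemble such a mixed set is to draw the same number of files from each source without replacement and then shuffle the result. The following is a minimal sketch under that assumption; `build_mixed_set`, the seed, and the placeholder file names are hypothetical, and the notebook does not state how the files were actually combined.

```python
import random

def build_mixed_set(files_by_source, per_source, seed=0):
    """Draw per_source files from each source without replacement, then shuffle."""
    rng = random.Random(seed)
    mixed = []
    for source in sorted(files_by_source):
        mixed.extend(rng.sample(files_by_source[source], per_source))
    rng.shuffle(mixed)
    return mixed

# Placeholder file names; real lists could come from glob.glob on each folder
files_by_source = {
    name: [f"{name}_{i}.csv" for i in range(20)]
    for name in ("Data_Set_1", "Data_Set_2", "Data_Set_3-1", "Data_Set_3-2")
}
mixed = build_mixed_set(files_by_source, per_source=5)
print(len(mixed))  # prints 20
```

With per_source=12500 and four sources, the same call would return 50,000 file paths.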

In [19]:
#GENERATE TRAIN SET

os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY WHERE YOU WOULD LIKE TO STORE THE TRAINING DATA in the XXXX space ####################

for i in range(100000): #How many oscillators we want to generate

    #start by randomly generating the parameters from a range. 

    E_inf = (1) # fixed at 1
    Amp  = (random.randrange(500,1500) / 10 ) # range from [50:150.0]
    Br = (random.randrange(150,350) / 100) # range from [1.50 :3.50]
   
    Eo_temp = random.randrange(300,500) 
    Eo = Eo_temp / 100 # range from [3.00:5.00]
    
    Eg_temp = random.randrange(100,250) 
    Eg = Eg_temp / 100 # range from [1.00:2.5]

    Ep = (random.randrange(50, (Eo_temp - Eg_temp))  / 100 ) # range from 0.50 up to, but not including, Eo - Eg
    Et = 0 # a value of 0 effectively neglects the Urbach energy and simplifies the equations.
    Egt = Eg + Et



    # Now we generate our Cody-Lorentz material
    CL = Get_CL_Material(E, Ep, Eg, Eo, Br, Amp, Egt, E_inf, wv)

    #Now for each model, lets make some simulated SE data 
    
    #Generate Thicknesses
    Bulk_Thickness = (random.randrange(2500,12500) / 100 ) # range from [25.00:125.00]
    #NTVE_JAW_Thickness = (random.randrange(160,170) / 100 ) # range from [1.60:1.70] # fixed
    ang_off = 0 # range from [0:0]

    Structure = [Void, CL, SLG]
    Mat_Thick = [ [Bulk_Thickness] ]
    Theta_Incident = 64.93
    SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, [ang_off], write_data=True )

In [21]:
#GENERATE Validation SET

os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY WHERE YOU WOULD LIKE TO STORE THE VALIDATION DATA in the XXXX space ####################

for i in range(10000): #How many oscillators we want to generate

    #start by randomly generating the parameters from a range. 

    E_inf = (1) # fixed at 1
    Amp  = (random.randrange(500,1500) / 10 ) # range from [50:150.0]
    Br = (random.randrange(150,350) / 100) # range from [1.50 :3.50]
   
    Eo_temp = random.randrange(300,500) 
    Eo = Eo_temp / 100 # range from [3.00:5.00]
    
    Eg_temp = random.randrange(100,250) 
    Eg = Eg_temp / 100 # range from [1.00:2.5]

    Ep = (random.randrange(50, (Eo_temp - Eg_temp))  / 100 ) # range from 0.50 up to, but not including, Eo - Eg
    Et = 0 # a value of 0 effectively neglects the Urbach energy and simplifies the equations.
    Egt = Eg + Et



    # Now we generate our Cody-Lorentz material
    CL = Get_CL_Material(E, Ep, Eg, Eo, Br, Amp, Egt, E_inf, wv)

    #Now for each model, lets make some simulated SE data 
    
    #Generate Thicknesses
    Bulk_Thickness = (random.randrange(2500,12500) / 100 ) # range from [25.00:125.00]
    #NTVE_JAW_Thickness = (random.randrange(160,170) / 100 ) # range from [1.60:1.70] # fixed
    ang_off = 0 # range from [0:0]

    Structure = [Void, CL, SLG]
    Mat_Thick = [ [Bulk_Thickness] ]
    Theta_Incident = 64.93
    SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, [ang_off], write_data=True )

In [20]:
#GENERATE Test SET

os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY WHERE YOU WOULD LIKE TO STORE THE TEST DATA in the XXXX space ####################

for i in range(10000): #How many oscillators we want to generate

    #start by randomly generating the parameters from a range. 

    E_inf = (1) # fixed at 1
    Amp  = (random.randrange(500,1500) / 10 ) # range from [50:150.0]
    Br = (random.randrange(150,350) / 100) # range from [1.50 :3.50]
   
    Eo_temp = random.randrange(300,500) 
    Eo = Eo_temp / 100 # range from [3.00:5.00]
    
    Eg_temp = random.randrange(100,250) 
    Eg = Eg_temp / 100 # range from [1.00:2.5]

    Ep = (random.randrange(50, (Eo_temp - Eg_temp))  / 100 ) # range from 0.50 up to, but not including, Eo - Eg
    Et = 0 # a value of 0 effectively neglects the Urbach energy and simplifies the equations.
    Egt = Eg + Et



    # Now we generate our Cody-Lorentz material
    CL = Get_CL_Material(E, Ep, Eg, Eo, Br, Amp, Egt, E_inf, wv)

    # Now, for each model, let's make some simulated SE data

    # Generate thicknesses
    Bulk_Thickness = (random.randrange(2500,12500) / 100 ) # 25.00 to 124.99 in steps of 0.01
    #NTVE_JAW_Thickness = (random.randrange(160,170) / 100 ) # not used: this stack has no native oxide layer
    ang_off = 0 # fixed at 0

    Structure = [Void, CL, SLG]
    Mat_Thick = [ [Bulk_Thickness] ]
    Theta_Incident = 64.93
    SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, [ang_off], write_data=True )
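
The train, validation, and test cells repeat the same parameter draws. As a minimal sketch, those draws could be collected in one helper so the ranges are defined in a single place. The name `sample_cl_parameters` and the dictionary layout are assumptions made for illustration; they are not part of Functions.ipynb. The draw order matches the cells above.

```python
import random

def sample_cl_parameters(rng=random):
    """Draw one set of Cody-Lorentz parameters using the ranges from the cells above.

    `rng` can be the `random` module or a seeded `random.Random` instance.
    Note that randrange excludes its upper bound.
    """
    E_inf = 1                                    # fixed at 1
    Amp = rng.randrange(500, 1500) / 10          # 50.0 to 149.9
    Br = rng.randrange(150, 350) / 100           # 1.50 to 3.49
    Eo_temp = rng.randrange(300, 500)            # Eo in units of 0.01
    Eg_temp = rng.randrange(100, 250)            # Eg in units of 0.01
    Eo = Eo_temp / 100                           # 3.00 to 4.99
    Eg = Eg_temp / 100                           # 1.00 to 2.49
    # Ep is drawn from 0.50 up to, but not including, Eo - Eg
    Ep = rng.randrange(50, Eo_temp - Eg_temp) / 100
    Et = 0                                       # neglects the Urbach energy
    return {"E_inf": E_inf, "Amp": Amp, "Br": Br, "Eo": Eo,
            "Eg": Eg, "Ep": Ep, "Et": Et, "Egt": Eg + Et}

# Example: a seeded generator makes a data set reproducible
params = sample_cl_parameters(random.Random(0))
```

The returned values could then be passed to `Get_CL_Material` in the same order as in the cells above.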

Data_Set_3-2: This data set will be generated with a Cody-Lorentz oscillator on a native-oxide-coated Si wafer. The structure will consist of Si Wafer / Native Oxide / Bulk Film / Surface Layer. The surface layer will consist of 50% void and 50% of the randomly generated bulk film. 
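
For reference, a two-phase Bruggeman mixture has a closed-form solution. The sketch below is not the `Bruggeman_EMA` function from Functions.ipynb, whose internals are not shown here. It is a standalone illustration that works on complex dielectric values, with ε = (n + ik)², and it assumes spherical inclusions and the convention that absorbing materials have a non-negative imaginary part. The function name `bruggeman_two_phase` is hypothetical.

```python
import cmath

def bruggeman_two_phase(eps_a, eps_b, frac_a):
    """Effective dielectric value of a two-phase Bruggeman mixture.

    Solves  f_a (eps_a - eps)/(eps_a + 2 eps) + f_b (eps_b - eps)/(eps_b + 2 eps) = 0,
    which reduces to the quadratic  2 eps^2 - b eps - eps_a eps_b = 0  with
    b = (3 f_a - 1) eps_a + (3 f_b - 1) eps_b.
    """
    frac_b = 1.0 - frac_a
    b = (3.0 * frac_a - 1.0) * eps_a + (3.0 * frac_b - 1.0) * eps_b
    root = cmath.sqrt(b * b + 8.0 * eps_a * eps_b)
    candidates = [(b + root) / 4.0, (b - root) / 4.0]
    # Assumed convention: the physical root has a non-negative imaginary part.
    physical = [e for e in candidates if e.imag >= -1e-12] or candidates
    # For purely real inputs both roots are real; keep the positive one.
    return max(physical, key=lambda e: e.real)

# Example: 50% void (eps = 1) mixed with 50% of a film with n = 2, k = 0.1
eps_film = complex(2.0, 0.1) ** 2
eps_surface = bruggeman_two_phase(eps_film, 1.0, 0.5)
n_surface = cmath.sqrt(eps_surface)  # real part is n, imaginary part is k
```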

In [37]:
#GENERATE TRAIN SET

os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY WHERE YOU WOULD LIKE TO STORE THE TRAINING DATA in the XXXX space ####################

for i in range(12500): #How many oscillators we want to generate

    #start by randomly generating the parameters from a range. 

    E_inf = (1) # fixed at 1
    Amp  = (random.randrange(500,1500) / 10 ) # 50.0 to 149.9 in steps of 0.1 (randrange excludes the upper bound)
    Br = (random.randrange(150,350) / 100) # 1.50 to 3.49 in steps of 0.01

    Eo_temp = random.randrange(300,500) 
    Eo = Eo_temp / 100 # 3.00 to 4.99 in steps of 0.01

    Eg_temp = random.randrange(100,250) 
    Eg = Eg_temp / 100 # 1.00 to 2.49 in steps of 0.01

    Ep = (random.randrange(50, (Eo_temp - Eg_temp))  / 100 ) # 0.50 up to, but not including, Eo - Eg; the upper bound changes with each draw
    Et = 0 # setting this term to 0 effectively neglects the Urbach energy and simplifies the equations
    Egt = Eg + Et



    # Now we generate our Cody-Lorentz material
    CL = Get_CL_Material(E, Ep, Eg, Eo, Br, Amp, Egt, E_inf, wv)
    EMA = Bruggeman_EMA( CL, Void , 0.5 )
    # Now, for each model, let's make some simulated SE data

    # Generate thicknesses
    Bulk_Thickness = (random.randrange(2500,12500) / 100 ) # 25.00 to 124.99 in steps of 0.01
    EMA_Thickness = (random.randrange(10,200) / 100 ) # 0.10 to 1.99 in steps of 0.01
    NTVE_JAW_Thickness = (random.randrange(160,170) / 100 ) # 1.60 to 1.69 in steps of 0.01 (varied slightly, not fixed)
    ang_off = 0 # fixed at 0

    Structure = [Void, EMA, CL, NTVE_JAW, Si_JAW]
    Mat_Thick = [ [EMA_Thickness], [Bulk_Thickness], [NTVE_JAW_Thickness] ]
    Theta_Incident = 64.93
    SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, [ang_off], write_data=True )

In [36]:
#GENERATE VALIDATION SET

os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY WHERE YOU WOULD LIKE TO STORE THE VALIDATION DATA in the XXXX space ####################

for i in range(1250): #How many oscillators we want to generate

    #start by randomly generating the parameters from a range. 

    E_inf = (1) # fixed at 1
    Amp  = (random.randrange(500,1500) / 10 ) # 50.0 to 149.9 in steps of 0.1 (randrange excludes the upper bound)
    Br = (random.randrange(150,350) / 100) # 1.50 to 3.49 in steps of 0.01

    Eo_temp = random.randrange(300,500) 
    Eo = Eo_temp / 100 # 3.00 to 4.99 in steps of 0.01

    Eg_temp = random.randrange(100,250) 
    Eg = Eg_temp / 100 # 1.00 to 2.49 in steps of 0.01

    Ep = (random.randrange(50, (Eo_temp - Eg_temp))  / 100 ) # 0.50 up to, but not including, Eo - Eg; the upper bound changes with each draw
    Et = 0 # setting this term to 0 effectively neglects the Urbach energy and simplifies the equations
    Egt = Eg + Et



    # Now we generate our Cody-Lorentz material
    CL = Get_CL_Material(E, Ep, Eg, Eo, Br, Amp, Egt, E_inf, wv)
    EMA = Bruggeman_EMA( CL, Void , 0.5 )
    # Now, for each model, let's make some simulated SE data

    # Generate thicknesses
    Bulk_Thickness = (random.randrange(2500,12500) / 100 ) # 25.00 to 124.99 in steps of 0.01
    EMA_Thickness = (random.randrange(10,200) / 100 ) # 0.10 to 1.99 in steps of 0.01
    NTVE_JAW_Thickness = (random.randrange(160,170) / 100 ) # 1.60 to 1.69 in steps of 0.01 (varied slightly, not fixed)
    ang_off = 0 # fixed at 0

    Structure = [Void, EMA, CL, NTVE_JAW, Si_JAW]
    Mat_Thick = [ [EMA_Thickness], [Bulk_Thickness], [NTVE_JAW_Thickness] ]
    Theta_Incident = 64.93
    SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, [ang_off], write_data=True )

In [35]:
#GENERATE TEST SET

os.chdir(r"XXXX") ########## PLEASE PUT THE DIRECTORY WHERE YOU WOULD LIKE TO STORE THE TEST DATA in the XXXX space ####################

for i in range(1250): #How many oscillators we want to generate

    #start by randomly generating the parameters from a range. 

    E_inf = (1) # fixed at 1
    Amp  = (random.randrange(500,1500) / 10 ) # 50.0 to 149.9 in steps of 0.1 (randrange excludes the upper bound)
    Br = (random.randrange(150,350) / 100) # 1.50 to 3.49 in steps of 0.01

    Eo_temp = random.randrange(300,500) 
    Eo = Eo_temp / 100 # 3.00 to 4.99 in steps of 0.01

    Eg_temp = random.randrange(100,250) 
    Eg = Eg_temp / 100 # 1.00 to 2.49 in steps of 0.01

    Ep = (random.randrange(50, (Eo_temp - Eg_temp))  / 100 ) # 0.50 up to, but not including, Eo - Eg; the upper bound changes with each draw
    Et = 0 # setting this term to 0 effectively neglects the Urbach energy and simplifies the equations
    Egt = Eg + Et



    # Now we generate our Cody-Lorentz material
    CL = Get_CL_Material(E, Ep, Eg, Eo, Br, Amp, Egt, E_inf, wv)
    EMA = Bruggeman_EMA( CL, Void , 0.5 )
    # Now, for each model, let's make some simulated SE data

    # Generate thicknesses
    Bulk_Thickness = (random.randrange(2500,12500) / 100 ) # 25.00 to 124.99 in steps of 0.01
    EMA_Thickness = (random.randrange(10,200) / 100 ) # 0.10 to 1.99 in steps of 0.01
    NTVE_JAW_Thickness = (random.randrange(160,170) / 100 ) # 1.60 to 1.69 in steps of 0.01 (varied slightly, not fixed)
    ang_off = 0 # fixed at 0

    Structure = [Void, EMA, CL, NTVE_JAW, Si_JAW]
    Mat_Thick = [ [EMA_Thickness], [Bulk_Thickness], [NTVE_JAW_Thickness] ]
    Theta_Incident = 64.93
    SE_Sim_Substrate(Structure, Theta_Incident, Mat_Thick, [ang_off], write_data=True )
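
In every cell above, `Structure` lists the ambient first and the substrate last, and `Mat_Thick` holds one entry for each layer in between. A small check like the one sketched here could catch a mismatch before a long generation loop starts. The function `check_stack` is a hypothetical helper, and the rule it enforces (two semi-infinite end media, one thickness list per finite layer) is inferred from the calls above rather than taken from the `SE_Sim_Substrate` source.

```python
def check_stack(structure, mat_thick):
    """Check that a film stack and its thickness list line up.

    Assumes structure[0] is the semi-infinite ambient and structure[-1] is the
    semi-infinite substrate, so only the layers in between need a thickness.
    Returns the number of finite layers.
    """
    n_finite = len(structure) - 2
    if n_finite < 0:
        raise ValueError("A stack needs at least an ambient and a substrate.")
    if len(mat_thick) != n_finite:
        raise ValueError(
            f"Expected {n_finite} thickness entries, got {len(mat_thick)}."
        )
    for entry in mat_thick:
        # Each entry is a list of thickness values, as in the cells above
        if len(entry) == 0 or any(t < 0 for t in entry):
            raise ValueError("Each layer needs non-negative thickness values.")
    return n_finite

# Example with placeholder names standing in for the material data frames
layers = check_stack(["Void", "EMA", "CL", "NTVE_JAW", "Si_JAW"],
                     [[1.2], [80.0], [1.65]])
```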

Data set 3 will contain 12,500 files from Data_Set_1, 12,500 files from Data_Set_2, 12,500 files from Data_Set_3-1, and 12,500 files from Data_Set_3-2. Each of the 4 data sets that make up Data Set 3 will use a different structure.

These files will be combined manually outside of this notebook. 
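
If the manual step ever needs to be scripted, a sketch along these lines might work. It copies every file from several source folders into one destination folder and prefixes each copy with the name of its source folder, because the file-naming scheme used by `SE_Sim_Substrate` is not shown here and names could collide. The function name and all folder names are hypothetical.

```python
import shutil
from pathlib import Path

def combine_folders(source_dirs, destination):
    """Copy all files from each source folder into one destination folder.

    Each copy is renamed to "<source folder name>__<original file name>" so
    that files with the same name in different folders do not overwrite each
    other. Returns the number of files copied.
    """
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied = 0
    for source in map(Path, source_dirs):
        for path in sorted(source.iterdir()):
            if path.is_file():
                shutil.copy2(path, destination / f"{source.name}__{path.name}")
                copied += 1
    return copied

# Example call (hypothetical folder names):
# combine_folders(["Data_Set_1/Train", "Data_Set_2/Train"], "Data_Set_3/Train")
```

The prefix only prevents collisions when the source folders have different names, so folders that share a name such as `Train` would need to be renamed or given a different prefix first.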