## Generation of light-curves for training NNs

This script generates data sets of light curves for training neural networks. It covers:

1. Paths to directories and Input parameters
2. Extracting data from microlensing light-curves
3. Creating mock light-curves for a given magnification map and velocity
4. Storing the generated light curves 

Rewritten by: Soumya Shreeram <br>
Script adapted from: Eric Paic <br>
Date: 02nd March 2020

In [13]:
import numpy as np
import pickle as pkl
from astropy.io import fits
import glob

from time import sleep
import os,sys
from tempfile import TemporaryFile

### 1.1 Paths to directories and Input parameters

In [14]:
current_dir = os.getcwd()
root_dir = os.path.abspath(os.path.join(current_dir, os.pardir))
data_dir = os.path.join(root_dir, "TP4b")
print("Does directory exists? \n>",os.path.isdir(data_dir))

# setting the paths
datadir = os.path.join(data_dir,  'Data')
resultdir = os.path.join(datadir,  'results')
trainingsetdir = os.path.join(resultdir,  'LC_training_set')
mapdir = os.path.join(datadir,  'maps', 'unconvolved')
storagedir = os.path.join(datadir,  'maps', 'storage')

Does directory exists? 
> True


### 1.2 Input Parameters

* Constants used to convert pixels to physical length

* The Einstein radius, $E_R$, of QJ0158 assumes a mean microlens mass of $ \left< M \right>=0.3, \ 0.1$ or $0.001\ M_{\odot}$ (selected with `mass_idx = 0/1/2`).

In [15]:
einstein_r_1131= 2.5e16 # Einstein radius of RXJ1131 in cm, for 0.3 M_sun
# Einstein radius of QJ0158 in cm, for 0.3, 0.1 and 0.001 M_sun
einstein_r_0158_arr = [3.414e16, 3.414e16/np.sqrt(3), 3.414e16/np.sqrt(30)] 

# choose the mass for which you want QJ0158's Einstein radius
mass_idx = 0
einstein_r_0158 = einstein_r_0158_arr[mass_idx]

# pixel scale assuming the map is 20R_E x 20R_E and 8192 pxl x 8192pxl
cm_per_pxl = 20*einstein_r_0158/8192 
ld_per_pxl = cm_per_pxl/(3e10*3600*24) # light-days per pixel, with c ~ 3e10 cm/s
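
The radii listed in `einstein_r_0158_arr` follow from the Einstein radius scaling with the square root of the mean microlens mass. The sketch below illustrates that scaling together with the pixel-scale conversion. The function names `einstein_radius_cm` and `pixel_scales` are hypothetical helpers, not part of the original script, and the speed of light is rounded to 3e10 cm/s as in the cell above.

```python
import numpy as np

REF_RADIUS_CM = 3.414e16     # Einstein radius of QJ0158 for 0.3 M_sun (value from the notebook)
REF_MASS = 0.3               # reference mean mass in solar masses
C_CM_PER_S = 3e10            # speed of light, rounded as in the notebook
SECONDS_PER_DAY = 3600 * 24

def einstein_radius_cm(mean_mass):
    """Scale the reference Einstein radius as sqrt(mean_mass / REF_MASS)."""
    return REF_RADIUS_CM * np.sqrt(mean_mass / REF_MASS)

def pixel_scales(einstein_r_cm, map_size_re=20, map_size_pxl=8192):
    """Return (cm per pixel, light-days per pixel) for a square map."""
    cm_per_pxl = map_size_re * einstein_r_cm / map_size_pxl
    ld_per_pxl = cm_per_pxl / (C_CM_PER_S * SECONDS_PER_DAY)
    return cm_per_pxl, ld_per_pxl

cm_per_pxl, ld_per_pxl = pixel_scales(einstein_radius_cm(0.3))
print(cm_per_pxl, ld_per_pxl)
```

Under this scaling, dividing the reference radius by `np.sqrt(3)` corresponds to a mean mass of 0.1 $M_{\odot}$, dividing by `np.sqrt(30)` to 0.01 $M_{\odot}$, and dividing by `np.sqrt(300)` to 0.001 $M_{\odot}$.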

* `list_r0` sets the source radii (in pixels) for which the microlensed light curves are generated
* the source velocity is set to $500\ {\rm km\ s}^{-1}$
* the boolean `season_gaps` defines whether season gaps are included in the light-curves

In [16]:
def decideNpix(v_source,season_gaps):
    """
    Function that defines the number of timestamps per light curve
    @v_source :: the source velocity initially selected
    @season_gaps :: boolean that decides whether the generated data contains season gaps
    """
    # n_pixels = [v500 euler sampling, v500 non-euler sampling, v300 non-euler sampling] 
    n_pix_arr = [955, 4546, 1137]
    if season_gaps:
        n_pix  = n_pix_arr[0] 
    elif v_source == 500:
        n_pix  = n_pix_arr[1]
    else:
        n_pix  = n_pix_arr[2] 
    return n_pix
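
The branching in `decideNpix` means that the velocity only matters when no season gaps are requested. A minimal sketch of the same logic written as a lookup, with a hypothetical name `decide_npix_sketch`, makes the three cases explicit:

```python
def decide_npix_sketch(v_source, season_gaps):
    """Mirror of the branching in decideNpix, written as a lookup."""
    if season_gaps:
        return 955                          # Euler sampling, independent of the velocity
    return {500: 4546}.get(v_source, 1137)  # any velocity other than 500 falls back to 1137

for v_source in (500, 300):
    for season_gaps in (True, False):
        print(v_source, season_gaps, decide_npix_sketch(v_source, season_gaps))
```

With `season_gaps = True`, both velocities give 955 time stamps, which is the configuration used in this notebook.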

In [17]:
#radii of the source in pxl, the reference radius is 15pxl
list_r0 = [2,4,10,15,20,30,40,60,80,100] 
list_comb = [('A1', 'B4'),('A2','B3'),('A3','B2'),('A4','B1'),('A2','B4'),('A1','B3')]

n_curves = 100000 #number of generated curves
n_good_curves = 10000 #upper bound on number of curves that are not flat
select_curves = 5000 # select only this number of curves from n_good_curves
v = [500, 300]
v_source = v[0] #in km.s^-1
v = v_source * np.ones(n_curves)


# generate mock light-curves with season gaps?
season_gaps = True

# defines the number of timestamps per light curve
n_pix = decideNpix(v_source, season_gaps)

### 2. Extracting data from microlensing light curves

In [18]:
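def projectVelocities(v, angle, cm_per_pxl):
    """
    Function to project the velocity on x and y axis and converts units from km/s to pixels per day
    @v :: magnitude of the velocity in km/s
    @angle :: direction of the trajectory in radians
    @cm_per_pxl :: pixel scale of the map in cm
    @Returns (v_x, v_y) in pixels per day
    """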
    v_x = np.multiply(v, np.cos(angle))
    v_x = np.divide(np.multiply(100000 * 3600 * 24, v_x), cm_per_pxl)
    
    v_y = np.multiply(v, np.sin(angle))
    v_y = np.divide(np.multiply(100000 * 3600 * 24, v_y), cm_per_pxl)
    return v_x, v_y

def calTrajectory(params, v_x, v_y, time, mjhd):
    """
    Function calculates the trajectory of the source in the map
    @params :: [x_start, y_start, v, angle]
    @(v_x, v_y) :: projection of v on x-y axis, in pixels per day
    @time :: time stamps at which the trajectory is sampled
    @mjhd :: observed time vector (not used in the calculation; kept so existing calls still work)
    Returns:
    @(path_x, path_y) :: pixel coordinates of the trajectory at every time stamp
    """
    if v_x == 0:
        path_x = params[0] * np.ones(len(time))
    else:
        path_x = np.add(np.multiply(np.add(time, -time[0]), v_x), params[0])
    if v_y == 0:
        # sized by `time` so that path_x and path_y always have the same length
        path_y = params[1] * np.ones(len(time))
    else:
        path_y = np.add(np.multiply(np.add(time, -time[0]), v_y), params[1])

    # truncate to integer pixel indices
    path_x = path_x.astype(int)
    path_y = path_y.astype(int)
    return path_x, path_y

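def checkTrajectoryCalMag(path_x, path_y, map_name, err_data, add_shut_noise):
    """
    Function does the following:
    1. checks if the end point of the trajectory is bounded within the (square) map 
    2. gathers the values of the corresponding pixels, which give the flux magnification, and converts them to magnitudes with -2.5*log10()
    @map_name :: 2D array holding the magnification map
    @err_data :: photometric errors; their mean sets the width of the Gaussian noise
    @add_shut_noise :: boolean, adds Gaussian noise to the light curve if True
    Returns an empty list if the trajectory leaves the map
    """
    lc = []
    if path_x[-1] <= len(map_name)-1 and path_y[-1] <= len(map_name)-1 and path_x[-1] >= 0 and path_y[-1] >= 0:
        if add_shut_noise:
            # index the map array itself (the built-in `map` is not subscriptable)
            temp = np.add(np.multiply(-2.5, np.log10(map_name[path_y, path_x])),np.random.normal(0, np.mean(err_data), len(path_y)))
        else:
            temp = np.multiply(-2.5, np.log10(map_name[path_y, path_x]))
        
        # normalizes the light curves to their first point
        lc = temp - temp[0] * np.ones(len(temp))
    return lc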

def drawLightCurves(params, map_name, time, cm_per_pxl , err_data, add_shut_noise):
    """
    Function to draw a light curve in a microlensing map
    @params:: list composed of the starting coordinates of the trajectory, velocity and direction [x_start, y_start, velocity, angle]
    @map_name:: map used to draw the curve
    @time:: decides the sampling of the microlensing curve
    @cm_per_pxl:: scale of the map that is calculated for a 20 R_e x 20 R_e map 
    @err_data, add_shut_noise:: errors and boolean passed on to checkTrajectoryCalMag
    
    @Returns:
    Light curve, coordinates of the starting and ending point of the trajectory (latter is used only to display the trajectory)
    """    
    v = params[2]
    angle= params[3]
    
    # projects velocities
    v_x, v_y = projectVelocities(v, angle, cm_per_pxl)
    
    # draws the trajectories; `time` is also passed as the mjhd argument so the function does not rely on a global variable
    path_x, path_y = calTrajectory(params, v_x, v_y, time, time)
    
    # check if trajectory is bounded & calculates magnification per pixel
    lc = checkTrajectoryCalMag(path_x, path_y, map_name, err_data, add_shut_noise)
    return lc, [path_x[0], path_y[0], path_x[-1], path_y[-1]]
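
The three steps above (velocity projection, trajectory, magnitude conversion) can be tried end to end on a synthetic map. The following is a minimal sketch, not the notebook's own implementation: `toy_map`, the time sampling and `toy_light_curve` are hypothetical stand-ins, and the map values are random numbers rather than a real magnification pattern.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a magnification map (assumption: strictly positive values)
toy_map = rng.uniform(0.5, 5.0, size=(1000, 1000))

cm_per_pxl = 20 * 3.414e16 / 8192      # pixel scale used in the notebook
time = np.arange(0, 1000, 10.0)        # hypothetical sampling in days

def toy_light_curve(x0, y0, v_kms, angle, toy_map, time, cm_per_pxl):
    """Draw a light curve along a straight track; empty array if the track leaves the map."""
    # km/s -> cm/day -> pixels/day
    v_pxl_per_day = v_kms * 1e5 * 3600 * 24 / cm_per_pxl
    dt = time - time[0]
    path_x = (x0 + v_pxl_per_day * np.cos(angle) * dt).astype(int)
    path_y = (y0 + v_pxl_per_day * np.sin(angle) * dt).astype(int)

    # the track is a straight line, so checking its end point is enough
    n_rows, n_cols = toy_map.shape
    inside = (0 <= path_x[-1] < n_cols) and (0 <= path_y[-1] < n_rows)
    if not inside:
        return np.array([])

    # flux magnification -> magnitude, normalised to the first epoch
    mag = -2.5 * np.log10(toy_map[path_y, path_x])
    return mag - mag[0]

lc = toy_light_curve(500, 500, 500, np.pi / 4, toy_map, time, cm_per_pxl)
print(len(lc))
```

At 500 km/s and this pixel scale the source moves roughly 0.05 pixels per day, so a track of a thousand days covers only a few tens of pixels.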


Read a sample file to extract the time stamps and the magnitude errors of the observed microlensing light curve.

In [19]:
def getFilename(rootdir, string_name, params, no_params=False):
    """
    Function generates the filenames for reading/writing out data
    @rootdir, string_name :: root directory containing the file, file name
    @params :: parameters that distinguish the file name
    """
    if no_params:
        return os.path.join(rootdir, string_name)
    return os.path.join(rootdir, string_name%params)

def readFile(datadir):
    """
    Function reads the sample file and outputs the mjhd and the errors on mag_ml
    @Returns 
    @mjhd :: time
    @err_mag_ml :: error on the microlensing magnitude
    (the microlensing magnitude mag_ml is read but not returned)
    """
    filename = getFilename(datadir, "J0158_Euler_microlensing_upsampled_B-A.rdb", '', no_params=True)
    # open and read the file; the context manager closes it afterwards
    with open(filename, "r") as f:
        lines = f.read().split("\n")
    # skips the first two lines of the file
    data = lines[2:]
    
    mjhd, mag_ml, err_mag_ml = [], [], []
    
    # fills the lists, skipping empty lines
    for elem in data:
        if not elem.strip():
            continue
        columns = elem.split("\t")
        mjhd.append(float(columns[0]))
        mag_ml.append(float(columns[1]))
        # strips a trailing carriage return, if present
        err_mag_ml.append(float(columns[2].split("\r")[0]))
    return np.array(mjhd), np.array(err_mag_ml)
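
As an alternative to splitting the lines by hand, the same three columns can be read with `np.loadtxt`. This is a sketch under the assumption that the `.rdb` file is tab-separated with two header rows, as implied by `readFile`; the sample content and the name `read_rdb_columns` are hypothetical.

```python
import io
import numpy as np

# Hypothetical three-column .rdb content: a name row, a format row, then data
sample_rdb = (
    "mhjd\tmag\tmagerr\n"
    "====\t===\t======\n"
    "53600.5\t0.012\t0.004\n"
    "53601.5\t-0.003\t0.005\n"
    "53900.5\t0.020\t0.006\n"
)

def read_rdb_columns(source):
    """Return (mjhd, mag_ml, err_mag_ml) from a tab-separated file with two header rows."""
    mjhd, mag_ml, err_mag_ml = np.loadtxt(
        source, delimiter="\t", skiprows=2, usecols=(0, 1, 2), unpack=True
    )
    return mjhd, mag_ml, err_mag_ml

mjhd, mag_ml, err_mag_ml = read_rdb_columns(io.StringIO(sample_rdb))
print(mjhd, err_mag_ml)
```

`np.loadtxt` accepts a file path as well as a file-like object, so the same helper could be pointed at the file on disk.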

In [20]:
if season_gaps:
    mjhd, err_mag_ml = readFile(datadir)
else:
    mjhd, err_mag_ml = np.arange(n_pix), []

### 3. Creating mock light-curves for a given magnification map and velocity

In [21]:
def getFinalMap(resultdir, comb, r0):
    "Function retrieves the convolved magnification map from the results directory"
    map_name = getFilename(resultdir, 'map%s-%s_fml09_R%s_thin_disk.fits', (comb[0],comb[1],r0))
    img = fits.open(map_name)[0]
    final_map = img.data[:, :]
    return final_map

def generateRandomVals(final_map, n_curves):
    "Function generates random (x, y) start points for trajectories at random angles"
    x = np.random.randint(200, len(final_map) - 200, n_curves)
    y = np.random.randint(200, len(final_map) - 200, n_curves)
    angle = np.random.uniform(0, 2 * np.pi, n_curves)
    return x, y, angle

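def checkFlatLCs(temp, k, lc):
    """
    Function keeps only the light curves that are "not flat", i.e. whose maximum absolute deviation from the first point exceeds 0.5 mag
    @temp :: candidate light curve (empty if the trajectory left the map)
    @k :: running count of accepted curves
    @lc :: list of accepted curves
    """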
    if np.any(temp):
        if np.amax(np.absolute(temp))>0.5:
            lc.append(temp)
            k+=1
    return lc, k

def checkNumLCurves(select_curves, lc, n_good_curves):
    "Function warns if fewer than select_curves light curves were found and returns the number found"
    if len(lc) < select_curves:
        print("\nNumber of light curves selected (set to %d): %d; Total curves available: %d"%(select_curves, len(lc), n_good_curves))  
    return len(lc)

def saveFile(resultdir, select_curves, v_source, r0, lc, mjhd, err_mag_ml):
    "function saves the light curves per radius"
    with open(getFilename(resultdir, 'simLC_A-B_n%s_v%s_R%s_M0,3.pkl', (select_curves, v_source, r0), no_params=False), 'wb') as handle:
        pkl.dump((lc, mjhd, err_mag_ml), handle, protocol=pkl.HIGHEST_PROTOCOL)  
    return
 
def showProgress(idx, n):
    """
    Function prints the progress bar for a running function
    @param idx :: iterating index
    @param n :: total number of iterating variables/ total length
    """
    j = (idx+1)/n
    sys.stdout.write('\r')
    sys.stdout.write("[%-20s] %d%%" % ('='*int(20*j), 100*j))
    sys.stdout.flush()
    sleep(0.25)
    return

def saveNumLCurves(resultdir, total_lcurves, v_source, season_gaps):
    """
    Function saves the total number of light curves
    @resultdir :: directory to save the curves
    @total_lcurves :: np array with no. of light curves per radius
    @v_source :: the source velocity initially selected
    @season_gaps :: boolean that decides whether the generated data contains season gaps
    """
    # file names change based on whether the data contain gaps
    if season_gaps:
        with open(os.path.join(resultdir, 'numLcurvesPerRadius_v%d_gaps.npy'%v_source), 'wb') as f:
            np.save(f, total_lcurves)

    else:
        with open(os.path.join(resultdir, 'numLcurvesPerRadius_v%d.npy'%v_source), 'wb') as f:
            np.save(f, total_lcurves)
    return    
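
The flatness criterion in `checkFlatLCs` can be checked in isolation on toy curves. The sketch below uses a hypothetical helper `is_not_flat` and made-up curve shapes; for a positive threshold it accepts the same curves as the `np.any` plus `np.amax` test above.

```python
import numpy as np

def is_not_flat(curve, threshold=0.5):
    """True if the curve is non-empty and deviates from zero by more than threshold (mag)."""
    curve = np.asarray(curve)
    return curve.size > 0 and np.amax(np.abs(curve)) > threshold

flat = np.zeros(50)
varying = np.concatenate([np.zeros(25), np.linspace(0, -1.2, 25)])  # toy shape
out_of_bounds = []     # what the drawing step returns for a rejected trajectory

kept = [c for c in (flat, varying, out_of_bounds) if is_not_flat(c)]
print(len(kept))
```

Because the light curves are normalised to their first point, the absolute value of a curve is its deviation from the starting magnitude.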

* `mjhd` is the time vector extracted from the data, so mock curves generated with it already contain the season gaps.
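
Season gaps show up as large jumps between consecutive time stamps. A minimal sketch for locating them in a time vector follows; the helper `find_season_gaps`, the 60-day threshold and the toy time vector are assumptions made for illustration only.

```python
import numpy as np

def find_season_gaps(mjhd, min_gap_days=60.0):
    """Return (start, end) time pairs where consecutive epochs are more than min_gap_days apart."""
    mjhd = np.asarray(mjhd, dtype=float)
    steps = np.diff(mjhd)
    idx = np.where(steps > min_gap_days)[0]
    return [(mjhd[i], mjhd[i + 1]) for i in idx]

# toy time vector: two observing seasons separated by a gap (hypothetical values)
toy_mjhd = np.concatenate([np.arange(0, 200, 4.0), np.arange(365, 565, 4.0)])
print(find_season_gaps(toy_mjhd))
```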

In [22]:
# variable counts the total light curves
total_lcurves = []

for index, r0 in enumerate(list_r0):
    # retrieve the convolved, microlensed magnification map
    final_map = getFinalMap(resultdir, list_comb[0], r0)

    params = []
    # generating random starting coordinates, angles of the trajectories
    x, y, angle = generateRandomVals(final_map, n_curves)    
    for i in range(len(x)):
        params.append([x[i], y[i], v[i], angle[i]])
    
    # l indexes params; at most n_good_curves trajectories are tried
    lc = []
    k, l = 0, 0
    for k in range(n_good_curves):
        temp, _ = drawLightCurves(params[l], final_map, mjhd, cm_per_pxl, err_mag_ml, add_shut_noise=False)
        l+=1
        
        # checks for flat light curves, eliminates them
        # (k is reset by the for loop on every iteration; len(lc) is the actual count of non-flat curves)
        lc, k = checkFlatLCs(temp, k, lc)
        
        # if the no. of required curves is reached, exits loop
        if len(lc) == select_curves:
            break
          
    # if there are not enough curves
    len_lc = checkNumLCurves(select_curves, lc, n_good_curves)
    total_lcurves.append(len_lc)

    # saves the file for every R0
    saveFile(trainingsetdir, select_curves, v_source, r0, lc, mjhd, err_mag_ml)
    
    # shows progress for every radius
    print('\nCurrently processing R0 = %d'%r0)
    showProgress(index, len(list_r0)) 
    
# file saves the array with number of curves found per radius
saveNumLCurves(resultdir, total_lcurves, v_source, season_gaps)


Currently processing R0 = 2
[==                  ] 10%
Currently processing R0 = 4
[====                ] 20%
Currently processing R0 = 10
Currently processing R0 = 15
Currently processing R0 = 20
Currently processing R0 = 30
Currently processing R0 = 40
Currently processing R0 = 60
Currently processing R0 = 80
Currently processing R0 = 100
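
The pickle files written by `saveFile` hold the tuple `(lc, mjhd, err_mag_ml)`. The sketch below shows one way to read such a file back for training. The helper `load_light_curves` is hypothetical, and the round trip uses toy arrays in a temporary directory rather than the real training set.

```python
import os
import pickle as pkl
import tempfile
import numpy as np

def load_light_curves(path):
    """Load the (lc, mjhd, err_mag_ml) tuple written with pkl.dump."""
    with open(path, "rb") as handle:
        lc, mjhd, err_mag_ml = pkl.load(handle)
    return np.asarray(lc), np.asarray(mjhd), np.asarray(err_mag_ml)

# Round trip with toy data, using the same file name pattern as saveFile
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "simLC_A-B_n%s_v%s_R%s_M0,3.pkl" % (2, 500, 15))
    toy_lc = [np.zeros(5), np.ones(5)]
    with open(path, "wb") as handle:
        pkl.dump((toy_lc, np.arange(5), np.full(5, 0.01)), handle, protocol=pkl.HIGHEST_PROTOCOL)
    lc, mjhd, err = load_light_curves(path)

print(lc.shape, mjhd.shape, err.shape)
```

Since all curves for a given radius share the same time vector, the list of curves converts to a 2D array with one row per light curve.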