#NIRC2 preprocessing

This notebook presents functions for fits file handling, basic preprocessing, determination of the VORTEX center in a raw image, image registration, and cube creation.

More specifically, this notebook covers:
0. [Import](#import)
1. [Basic preprocessing](#prepro)
   0. [Open a fits file](#open)
   1. [Master flat](#mflat)
   2. [Preprocess files](#prefiles)
   3. [Create a cube from fits images](#cube)
2. [Find the VORTEX center](#center)
   0. [Introduction](#intro)
   1. [Step-by-step procedure](#proc)
      1. [Initialization](#ini)
      2. [Minimization](#min)
      3. [Representation](#res)
   2. [Routine](#routine)
3. [Registration](#regi)
4. [Crop the cube](#crop)

##Import <a id='import'></a>

We import all the functions from **NIRC2_Preprocessing**. We also use a few functions from the VIP and numpy packages. 

NB: Whichever part of the notebook you plan to execute, run this cell first. 

In [None]:
from NIRC2_Preprocessing import *
import numpy as np
from vip.fits import display_array_ds9, write_fits

%matplotlib inline
#%matplotlib

##Basic preprocessing <a id='prepro'></a> 

###Open a fits file <a id='open'></a>

**Summary**: Open a fits file and display it with DS9. Optionally, the header is extracted.

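The function open_fits( ) opens a fits image, including files which raise a *missing END card* error. 

+ If **header** is *False* (default value), only one output is returned: image = open_fits(path).
+ If **header** is *True*, the header is returned as a second output: image, header = open_fits(path, header=True).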

In [None]:
path = '/Users/Olivier/Documents/ULg/VORTEX/Data/RAW/Keck/HR8799/20150609/sci/sci_Tint_0p2_coadds_100/flatted/n0377_flatted.fits'
image, header = open_fits(path, header=True, verbose=False)

The variable **image** contains the fits image as a numpy array and can be displayed with DS9 using the VIP function display_array_ds9( ). As explained in the VIP docstring, DS9 must already be installed on your system, along with XPA.

In [None]:
print type(image)
print image.shape
display_array_ds9(image)

The variable **header** stores all the fits header cards in a dictionary. 

In [None]:
# Specific header card
card = 'NAXIS'
print '{} = {}'.format(card,header[card])
print ''

# All header cards
for key, value in header.items():
    print '{} = {}'.format(key,value)

###Master flat <a id='mflat'></a>

**Summary**: create and save a master flat from a set of fits images. 

We define the directory which contains all flats.

In [None]:
path_flat = '/Users/Olivier/Documents/ULg/VORTEX/Data/RAW/Keck/HR8799/20150610/flat/'
#'HR8799_Keck/20150610/flat/'

We list all the fits files using the function listing( ). If **selection** is *True*, each fits image will be opened with DS9 and you will be asked to keep it or discard it. 

In [None]:
fileList_flat = listing(path_flat, selection=True, ext='fits')

for filename in fileList_flat:
    print filename

From the flat images listed in **fileList_flat**, the master flat is created with the function masterFlat( ).

+ If **header** is *True*, the list of headers of the individual flat images is returned.
+ If **norm** is *True*, the master flat is normalized.
+ If **display** is *True*, the master flat is automatically displayed with DS9.
+ If **save** is *True*, the master flat is saved into the 'mflat.fits' file.

In [None]:
mflat, headers = masterFlat(fileList_flat, header=True, norm=True, display=True, save=True)

print type(mflat)
print mflat.shape
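
The internals of masterFlat( ) are not shown in this notebook. As a rough illustration only, a normalized master flat is commonly built by median-combining the individual flats and dividing by the median of the result. The sketch below assumes that approach; the helper name `master_flat_sketch` and the synthetic frames are hypothetical and are not part of NIRC2_Preprocessing.

```python
import numpy as np

def master_flat_sketch(flats, norm=True):
    """Median-combine a list of 2-D flat frames (hypothetical helper)."""
    # Stack the N frames into an array of shape (N, rows, columns)
    stack = np.stack([np.asarray(f, dtype=float) for f in flats])
    # The pixel-wise median rejects outliers present in a single frame
    mflat = np.median(stack, axis=0)
    if norm:
        # Normalize so that the typical pixel response is 1
        mflat = mflat / np.median(mflat)
    return mflat

# Three synthetic, uniform flats with different illumination levels
flats_demo = [np.full((4, 4), 100.0), np.full((4, 4), 110.0), np.full((4, 4), 90.0)]
mflat_demo = master_flat_sketch(flats_demo)
print(mflat_demo.shape)
```

With uniform input frames the normalized result is 1 everywhere, which is a quick sanity check for the normalization step.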

From **headers**, one can retrieve information about all the flats. For instance:

In [None]:
for j in range(len(fileList_flat)):
    print '{}: {} at {}'.format(fileList_flat[j],headers[j]['DATE-OBS'],headers[j]['EXPSTART'])

###Preprocess files <a id='prefiles'></a>
**Summary**: divide a set of fits images by the master flat and save them into a folder. 

We define two paths, which respectively point to: 
+ all the files to process
+ the master flat 

Then, we list all files in the files-to-process directory with the function listing( ). 
For example, we start with all Keck sky images taken on 2015-06-09 and characterized by the coaddition of 100 frames (coadds) of 0.2 s integration time (Tint).

In [None]:
## Master flat
path_mflat = '/Users/Olivier/Documents/ULg/VORTEX/Data/RAW/Keck/Calibration/mflat_20150610.fits'

## Directory which includes a set of files - OR - a single file
##  + Directory
path_files = '/Users/Olivier/Documents/ULg/VORTEX/Data/RAW/Keck/HR8799/20150609/sci/sci_Tint_0p2_coadds_100/'
file_list = listing(path_files, selection=False)

## + Single file
#file_list = ['HR8799_Keck/20150609/sky/n0404.fits']

for filename in file_list:
    print filename

Then, using the function applyFlat( ), we divide all images by the master flat and (optionally) save them into a specific folder. 

+ If **header** is *True*, the individual headers of all the processed images are returned.
+ If **display** is *True*, the preprocessed images are automatically displayed with DS9.
+ If **save** is *True*, the preprocessed images are saved into the 'path_files/flatted/' directory.

In [None]:
preprocessed, headers = applyFlat(file_list, path_mflat, header=True, display=False, save=True, verbose=True)

The variable **headers** is a dictionary which contains the headers of all preprocessed images. The variable **preprocessed** is a dictionary, keyed by file name, which contains the preprocessed images for possible later use. We display one of them with DS9.  

In [None]:
key = file_list[0]
print 'Flat preprocessing for image: {}'.format(key)
display_array_ds9(preprocessed[key])
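
For readers who want to see the flat-field operation itself, here is a minimal sketch of dividing a frame by a master flat. It is not the code of applyFlat( ): the helper name `apply_flat_sketch` and the choice to set pixels to zero where the flat is zero are assumptions made for this illustration.

```python
import numpy as np

def apply_flat_sketch(frame, mflat):
    """Divide a frame by a master flat, skipping zero-valued flat pixels."""
    frame = np.asarray(frame, dtype=float)
    mflat = np.asarray(mflat, dtype=float)
    out = np.zeros_like(frame)
    # Only divide where the flat is non-zero, to avoid infinities
    good = mflat != 0
    out[good] = frame[good] / mflat[good]
    return out

raw_demo = np.array([[10.0, 20.0], [30.0, 40.0]])
flat_demo = np.array([[1.0, 2.0], [0.0, 0.5]])
flatted_demo = apply_flat_sketch(raw_demo, flat_demo)
```

A pixel with a flat response of 0.5 is scaled up by a factor of 2, while the pixel with a zero flat value is left at 0 instead of producing a division error.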

###Create a cube from fits images <a id='cube'></a>
**Summary**: create and save a cube from a set of fits images.

From a set of $N$ fits images (registered or not), we create a cube with the shape $N \times l \times c$ where $l \times c$ corresponds to the size of each image in pixels. 

+ If **header** is *True*, a list with all fits image headers is returned.
+ If **save** is *True*, the cube is saved into the current directory.

In [None]:
path_files = '/Users/Olivier/Documents/ULg/VORTEX/Data/RAW/Keck/HR8799/20150609/sci/sci_Tint_0p2_coadds_100/flatted/'
#'HR8799_Keck/20150609/sci/sci_Tint_0p2_coadds_100/flatted/'
file_list = listing(path_files, selection=False)

cube, headers = create_cube_from_frames(path_files, header=True, verbose=False, save=False)    
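
The $N \times l \times c$ layout can be reproduced with plain numpy. The sketch below stacks synthetic frames and does not read any fits file; it only illustrates the shape convention and is not the implementation of create_cube_from_frames( ).

```python
import numpy as np

# Four synthetic frames of l = 3 rows and c = 5 columns; frame k is filled with k
frames_demo = [np.zeros((3, 5)) + k for k in range(4)]

# Stacking along a new first axis yields a cube of shape (N, l, c)
cube_demo = np.stack(frames_demo, axis=0)
print(cube_demo.shape)

# Frame k of the sequence is recovered by indexing the first axis
third_frame = cube_demo[2]
```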

One can display the cube with DS9 and retrieve header cards of all original files. For instance: 

In [None]:
# Display the cube with DS9
display_array_ds9(cube)

# Extract some header cards
dec = [h['DEC'] for h in headers]
ra = [h['RA'] for h in headers]
exp = [h['EXPSTART'] for h in headers]

for j, filename in enumerate(file_list):
    print '{}: [{},{}] at {}'.format(filename,ra[j],dec[j],exp[j])

##Find the VORTEX center <a id='center'></a>

0. [Introduction](#intro)
1. [Step-by-step procedure](#proc)
    1. [Initialization](#ini)
    2. [Minimization](#min)
    3. [Representation](#res)
2. [Routine](#routine)

###Introduction <a id='intro'></a> 
We model the VORTEX signature with a 2-D Gaussian profile. So far this is a good approximation: it is sufficient to obtain the position of the center and to limit the effect of the model choice on the result. To fit the model, we minimize a merit function which compares the intensity of every pixel in a box with the corresponding intensity obtained from the model. The merit function is a reduced $\chi^2$ defined by:

$\chi^2_r = \frac{1}{N-n_p}\sum_{j=1}^{N} \frac{\left(I_j - I_j^{model}\right)^2}{I_j}$,

where $n_p$ is the number of model parameters and $N$ the total number of pixels in the box (if $L$ is the size of the box in pixels, we have $N = L^2$). The error associated with $I_j$ is $\sigma_j = \sqrt{I_j}$. The 2-D Gaussian profile is defined by:


$I^{Gaussian}(x,y) = \mbox{bkg} + I_0 \exp{\left[- \left(\frac{(x-x_0)^2}{2 \sigma_x^2} + \frac{(y-y_0)^2}{2 \sigma_y^2}\right)\right]}$,

where $\mbox{bkg}$ represents the background, $I_0$ the peak intensity above the background and $(x_0,y_0)$ the position of the peak, where $I = \mbox{bkg} + I_0$. Other models are available, such as a cone or a Moffat profile defined by: 


$I^{Moffat}(x,y) = \mbox{bkg} + I_0 \left[1 + \left(\frac{(x-x_0)^2 + (y-y_0)^2}{\alpha^2}\right) \right]^{-\beta}$,
where $\alpha$ is a scale parameter and $\beta$ the parameter which determines the overall shape of the profile.
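
The formulas above translate directly into numpy. The sketch below is only an illustration: the function names `gaussian_sketch`, `moffat_sketch` and `reduced_chi2` are hypothetical and are not the gauss2d( ) and moffat( ) functions of the module, whose exact signatures are not shown here. The data are synthetic and noise-free.

```python
import numpy as np

def gaussian_sketch(x, y, x0, y0, i0, bkg, sigma_x, sigma_y):
    """2-D Gaussian profile: bkg + I0 * exp(-(...))."""
    arg = (x - x0)**2 / (2.0 * sigma_x**2) + (y - y0)**2 / (2.0 * sigma_y**2)
    return bkg + i0 * np.exp(-arg)

def moffat_sketch(x, y, x0, y0, i0, bkg, alpha, beta):
    """Moffat profile: bkg + I0 * [1 + r^2 / alpha^2]^(-beta)."""
    r2 = (x - x0)**2 + (y - y0)**2
    return bkg + i0 * (1.0 + r2 / alpha**2)**(-beta)

def reduced_chi2(data, model, n_params):
    """Reduced chi2 with sigma_j^2 = I_j (requires strictly positive data)."""
    data = np.asarray(data, dtype=float)
    terms = (data - model)**2 / data
    return terms.sum() / (data.size - n_params)

# Synthetic box of L = 31 pixels, so N = L^2 = 961
size_demo = 31
y_demo, x_demo = np.mgrid[0:size_demo, 0:size_demo]
p_true = [15.0, 15.0, 500.0, 100.0, 4.0, 4.0]   # x0, y0, I0, bkg, sigma_x, sigma_y
data_demo = gaussian_sketch(x_demo, y_demo, *p_true)

# The merit function vanishes for the true parameters ...
chi2_true = reduced_chi2(data_demo, data_demo, len(p_true))
# ... and grows when the center is displaced by one pixel
p_off = [16.0, 15.0, 500.0, 100.0, 4.0, 4.0]
chi2_off = reduced_chi2(data_demo, gaussian_sketch(x_demo, y_demo, *p_off), len(p_off))
```

Because the merit function increases as soon as $(x_0,y_0)$ moves away from the true peak, minimizing it over the parameters recovers the center position.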

###Step-by-step procedure <a id='proc'></a> 

####Initialization <a id='ini'></a> 
We first display the image with DS9 in order to roughly estimate the position of the VORTEX center.

In [None]:
path_file = '/Users/Olivier/Documents/ULg/VORTEX/Data/RAW/Keck/HR8799/20150609/sci/sci_Tint_0p2_coadds_100/flatted/n0377_flatted.fits'

image = open_fits(path_file, header=False, verbose=False)
display_array_ds9(image)

We initialize the center position of a box and its size (in pixels).

In [None]:
# Initialize the center and the size (in pixels) of the box
center, size =  (592,606), 31 # Dust
#(521,731), 16 # VORTEX center

# Display a 3-D surface plot of the box
plot_surface(image, center, size, output=False, figsize=(16,14), cmap='jet')

We adopt a model for the VORTEX signature and define the associated additional parameters. This module already includes various models:
+ Gaussian profile, callable through **gauss2d( )**, 2 additional parameters = $\sigma_x$, $\sigma_y$
+ Symmetric Gaussian profile, callable through **gauss2d_sym( )**, 1 additional parameter = $\sigma$
+ Moffat profile, callable through **moffat( )**, 2 additional parameters = $\alpha$, $\beta$
+ Cone profile, callable through **cone( )**, 1 additional parameter = radius

In [None]:
fun = gauss2d_sym
p_additional = [12]

Then, the vector which contains all the initial parameter values is created.

In [None]:
# The box
box = image[center[0]-size//2:center[0]+size//2,center[1]-size//2:center[1]+size//2]

# The center of the box
x_ini, y_ini = (size//2,size//2) 

# The background estimation
bkg_ini = np.median(image) 

# The maximum intensity
i0_ini = np.max(box) - bkg_ini

# The initial values for all parameters to optimize
p_initial = np.array([x_ini,y_ini,i0_ini,bkg_ini]+p_additional)

####Define a model (optional)

Let us note that you can also define your own model. For instance, we adopt a Sersic profile defined by:

$I^{Sersic}(x,y) = \mbox{bkg} + I_0 \exp{\left[- \left(\frac{\sqrt{(x-x_0)^2 + (y-y_0)^2}}{\alpha}\right)^{1/n}\right]}$

Then, we only have to define a function **sersic( )** and the corresponding **p_initial** vector. The only requirement is that the first 2 arguments must be the (x,y) grid: these arguments are fixed, while the others are optimized during the minimization procedure. However, we recommend organizing the arguments as follows: $x, y, x_0, y_0, I_0, \mbox{bkg}$, [additional parameters], where 
+ $(x_0,y_0)$ locates the position of the maximum intensity
+ $I_0$ is the maximum intensity above the background, i.e. $I_0 = I(x_0,y_0) - \mbox{bkg}$
+ $\mbox{bkg}$ is the background

For the Sersic profile, there are 2 additional parameters, $\alpha$ and $n$. The code is then: 

In [None]:
# Define the Sersic profile as a callable function
def sersic(x, y, x0, y0, i0, bkg, alpha, n):
    r = ((x-x0)**2+(y-y0)**2)**0.5
    return bkg + i0 * np.exp(-(r/(alpha))**(1/float(n)))

# *fun* is passed to vortex_center(); sersic (or any other model function) can also be passed directly.
fun = sersic
p_additional = [20, 0.7]

# Define the vector which contains the initial values for all parameters, i.e. x0, y0, i0, bkg, [additional parameters]
box = image[center[0]-size//2:center[0]+size//2,center[1]-size//2:center[1]+size//2]
x_ini, y_ini = (size//2,size//2) 
bkg_ini = np.median(image)
i0_ini = np.max(box) - bkg_ini

p_initial = np.array([x_ini,y_ini,i0_ini,bkg_ini]+p_additional)
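
Before starting a minimization with a user-defined model, it can help to evaluate it once on a grid and check that the peak sits at $(x_0,y_0)$ with value $\mbox{bkg} + I_0$. The sketch below does this for the Sersic profile with synthetic parameter values. The grid built with `np.mgrid` is an assumption about how the (x,y) grid is laid out; the grid actually used by vortex_center( ) may differ.

```python
import numpy as np

def sersic(x, y, x0, y0, i0, bkg, alpha, n):
    r = ((x - x0)**2 + (y - y0)**2)**0.5
    return bkg + i0 * np.exp(-(r / alpha)**(1.0 / n))

# Pixel grid of a 31 x 31 box (assumed layout: y along rows, x along columns)
size_demo = 31
y_grid, x_grid = np.mgrid[0:size_demo, 0:size_demo]

# x0, y0, I0, bkg, alpha, n (illustrative values)
p_demo = [15, 15, 200.0, 50.0, 20.0, 0.7]

# The first two arguments are the fixed grid; the rest is the parameter vector
model_demo = sersic(x_grid, y_grid, *p_demo)

# Locate the maximum of the evaluated model
peak_row, peak_col = np.unravel_index(np.argmax(model_demo), model_demo.shape)
```

Calling the model as `fun(x, y, *p)` mirrors the argument order recommended above: the grid comes first, followed by the parameters to optimize.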

####Minimization <a id='min'></a> 

To start the minimization, we call the function vortex_center( ). The first 5 arguments are mandatory; the others are optional and include:

+ **display**: if *True*, some figures are displayed during the minimization.
+ **verbose**: if *True*, the results are displayed at the end of the minimization.
+ **\*\*kwargs**: options passed to scipy.optimize.minimize( ). 

The vortex_center( ) function returns a tuple of 3 objects:
1. the position of the VORTEX center in the original image
2. the information returned by the minimization tool
3. the box grid as a tuple of 2 numpy.array (useful to represent the model)

In [None]:
solver_options = {'xtol': 1e-04, 'maxiter': 1e+05,'maxfev': 1e+05}
center_vor, minimization_output, grid = vortex_center(image, 
                                                      center, 
                                                      size, 
                                                      p_initial,
                                                      fun,
                                                      display= True, 
                                                      verbose=True,
                                                      savefig=False,
                                                      method = 'Nelder-Mead',
                                                      options = solver_options)

p_optimized = minimization_output.x
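
The fit is performed in the coordinates of the box, whereas the first output is expressed in the coordinates of the original image. The sketch below shows one way such a conversion can be done, assuming the box is cut as `image[c0-size//2:c0+size//2, c1-size//2:c1+size//2]` as in the initialization cells. The helper `box_to_image_sketch` is hypothetical; how vortex_center( ) performs this step internally is not shown in this notebook.

```python
import numpy as np

def box_to_image_sketch(pos_in_box, center, size):
    """Convert a position in the box into a position in the full image."""
    half = size // 2
    # The box origin (0, 0) corresponds to image pixel (center - half)
    return (center[0] - half + pos_in_box[0], center[1] - half + pos_in_box[1])

# Synthetic image with a single bright pixel at (42, 57)
image_demo = np.zeros((100, 100))
image_demo[42, 57] = 1.0

center_demo, size_demo = (40, 60), 16
half = size_demo // 2
box_demo = image_demo[center_demo[0]-half:center_demo[0]+half,
                      center_demo[1]-half:center_demo[1]+half]

# Position of the maximum inside the box, then in the full image
pos_box = np.unravel_index(np.argmax(box_demo), box_demo.shape)
pos_image = box_to_image_sketch(pos_box, center_demo, size_demo)
```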

####Representation <a id='res'></a> 

In [None]:
print 'Representation of the best model (in terms of chi2)'
print '---------------------------------------------------'
print 'Model adopted: {}'.format(fun)

model = fun(grid[0],grid[1],*p_optimized)
plot_surface(model, figsize=(12,10), cmap='jet')

###Routine <a id='routine'></a> 

The following routine determines the center of the VORTEX for a set of raw images. In a loop, each image is optionally preprocessed (flat) and the position is determined (Nelder-Mead minimization). 

Parameter initialization

In [None]:
# Directory which contains all files to process (or a list of file paths) and, if required, the path to the master flat.
path_files = '/Users/Olivier/Documents/ULg/VORTEX/Data/RAW/Keck/HR8799/20150609/sci/sci_Tint_0p2_coadds_100/'
path_mflat = None #'/Users/Olivier/Documents/ULg/VORTEX/Data/RAW/Keck/Calibration/mflat_20150610.fits'

# Box parameters
center, size = (592,606), 31 #(520,730), 100

# Model and additional parameter(s), depending on which model we've adopted
fun = gauss2d
p_model = [5,5]

# Routine parameters
preprocess = False
cards = ['EXPSTART','DATE-OBS','RA','DEC']
verbose = 1

Let's go !

In [None]:
center_all, success_all, file_list, header_cards = vortex_center_routine(path_files, 
                                                                         center, 
                                                                         size, 
                                                                         fun,
                                                                         preprocess=preprocess, 
                                                                         path_mflat=path_mflat,
                                                                         additional_parameters=p_model, 
                                                                         verbose=verbose,
                                                                         cards=cards)

print ''
print 'Convergence reached for all minimizations? {}'.format(success_all.all())

Information extracted from the headers.

In [None]:
for k, filename in enumerate(file_list):
    print '{} = [{:.2f},{:.2f}]'.format(filename,header_cards['RA'][k], header_cards['DEC'][k])

In [None]:
header_cards['EXPSTART']

Evolution of the VORTEX position during the night

In [None]:
# From the observation date and time, we determine the time elapsed between the first and all other observations
t_start = timeExtract(header_cards['DATE-OBS'],header_cards['EXPSTART'])
# total_seconds() keeps the days component of the timedelta, which .seconds drops
delta_time = [(t-t_start[0]).total_seconds()/3600. for t in t_start]

# Then, we illustrate the obtained VORTEX position as a function of time
import matplotlib.pyplot as plt
plt.figure(figsize=(14,7))
plt.plot(delta_time,center_all[:,0]-center_all[0,0],'.r', markersize=14)
plt.xlabel(r'$\Delta t$ (hour)',fontsize=20)
plt.ylabel(r'$\Delta x$ (pixels)',fontsize=20)
plt.title(path_files)
#plt.ylim([-0.03,0.06])
#plt.savefig(path_files+'DUST_delta_x_'+'Gaussian'+'_'+path_files.split('/')[:-1][-1]+'.pdf')
plt.show()

plt.figure(figsize=(14,7))
plt.plot(delta_time,center_all[:,1]-center_all[0,1],'.b', markersize=14)
plt.xlabel(r'$\Delta t$ (hour)',fontsize=20)
plt.ylabel(r'$\Delta y$ (pixels)',fontsize=20)
plt.title(path_files)
#plt.ylim([-0.86,-0.74])
#plt.savefig(path_files+'DUST_delta_y_'+'Gaussian'+'_'+path_files.split('/')[:-1][-1]+'.pdf')
plt.show()
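
The elapsed time on the horizontal axis comes from combining the date and start-time cards. The sketch below illustrates this with the standard library only. The card formats ('YYYY-MM-DD' and 'HH:MM:SS.ss') and the sample values are assumptions made for the example, and `parse_start_sketch` is a hypothetical stand-in, not the timeExtract( ) function.

```python
from datetime import datetime

def parse_start_sketch(date_obs, exp_start):
    """Combine a date card and a start-time card into a datetime (assumed formats)."""
    return datetime.strptime(date_obs + ' ' + exp_start, '%Y-%m-%d %H:%M:%S.%f')

# Illustrative values, including an exposure taken after midnight
dates_demo = ['2015-06-09', '2015-06-09', '2015-06-10']
starts_demo = ['23:30:00.00', '23:45:00.00', '00:30:00.00']

t_demo = [parse_start_sketch(d, s) for d, s in zip(dates_demo, starts_demo)]

# Elapsed time in hours since the first exposure
delta_demo = [(t - t_demo[0]).total_seconds() / 3600.0 for t in t_demo]
```

Including the date in the parsed value is what keeps the elapsed time increasing across midnight, since the start time alone would wrap around.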

##Registration<a id='regi'></a> 

From the previous section ([Find the VORTEX center](#center)), we are now able to determine the VORTEX center position from a sky image. The next stage consists in registering a sequence of images with respect to the corresponding VORTEX center.

The registration routine is split into four cells: 
+ The first cell determines the VORTEX center (a shortened version of what was done before).
+ The second cell creates the **initial_position** array. Manual adjustments can be made there.
+ The third cell performs the registration.
+ The fourth cell determines the parallactic angles.

###VORTEX center

In [None]:
# Path: sky images
path_files = '/Users/Olivier/Documents/ULg/VORTEX/Data/RAW/Keck/HR8799/20150609/sci/sci_Tint_0p2_coadds_100/flatted/'
#'/Users/Olivier/Documents/ULg/VORTEX/Data/RAW/Keck/HD219196/20150609/sky/flatted/'

# VORTEX center routine
center, size = (592,606), 31 # DUST
#(520,730), 100 # VORTEX center
fun = gauss2d_sym
p_model = [12]
center_from_sky, _, _ = vortex_center_routine(path_files, center, size, fun, additional_parameters=p_model, verbose=True)

# Results
print ''
print center_from_sky

###Create the **initial_position** array

We create the **initial_position** array from center_from_sky and from information recorded in the night log file.

In [None]:
# Path: files to register
path_files = '/Users/Olivier/Documents/ULg/VORTEX/Data/RAW/Keck/HR8799/20150609/sci/sci_Tint_0p2_coadds_100/flatted/' 
#path_files = '/Users/Olivier/Documents/ULg/VORTEX/Data/RAW/Keck/HR8799/20150609/sci/sci_Tint_0p2_coadds_100/flatted/'

file_list = listing(path_files)
#for filename in file_list:
#    print filename

# IF YOU HAVE DETERMINED THE POSITION OF THE CENTER OF THE VORTEX
# FROM A DUST ON ALL SCI IMAGES, THE MATRIX center_from_sky CREATED IN
# THE PREVIOUS CELL ALREADY CONTAINS ALL THE POSITIONS OF THE CENTER
# OF THE VORTEX. THEREFORE, CHOOSE OPTION 1. OTHERWISE, CHOOSE
# OPTION 2.
#
# OPTION 1
#----------
center_all = center_from_sky

# OPTION 2
#----------
# YOU HAVE TO CONSTRUCT THE center_all MATRIX IN SUCH A WAY THAT
# center_all.shape = (N,2) where N CORRESPONDS TO THE NUMBER OF 
# IMAGES TO REGISTER.
#
# Initial position from the VORTEX center position
#part_0 = np.tile(center_from_sky[0,:],(10,1))
#part_1 = np.tile(np.array([520.06,730.5]),(len(file_list)-10,1))
#
#part_0 = np.tile(center_from_sky[0,:],(9,1))
#part_1 = np.tile(center_from_sky[1,:],(10+10,1))#(len(file_list)-9,1))
#part_2 = np.tile(np.array([520.06,730.5]),(len(file_list)-9-10-10,1))
#
# Concatenation
#center_all = np.concatenate((part_0,part_1))

 
print 'Number of files: {},  center_all shape: {}'.format(len(file_list),center_all.shape)
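
For option 2, the (N,2) array can be assembled by repeating each known center for the frames it applies to. The sketch below follows the pattern of the commented lines above with synthetic numbers: the file count, the first center and the split after 10 frames are illustrative assumptions, not values taken from the night log.

```python
import numpy as np

n_files_demo = 25                        # illustrative number of images to register
center_a = np.array([520.31, 730.12])    # center valid for the first 10 frames (made up)
center_b = np.array([520.06, 730.50])    # center valid for the remaining frames

# np.tile repeats a (2,) vector as the rows of a (k, 2) array
part_0 = np.tile(center_a, (10, 1))
part_1 = np.tile(center_b, (n_files_demo - 10, 1))

# Concatenating along the first axis gives one row per image
center_all_demo = np.concatenate((part_0, part_1))
print(center_all_demo.shape)
```

The number of rows must match the number of files, which is what the print statement at the end of the cell above lets you verify.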

###Registration

In [None]:
# Let's define the pixel coordinates to which all the VORTEX centers will be shifted
target = np.array([512,512]) # For a VORTEX center determination based on a DUST, *target* should take into account
                             # the offset between the DUST and the VORTEX center. This offset can be retrieved from a
                             # sky image taken the same night.

# Registration
cube_reg, headers = registration(file_list, 
                                 initial_position = center_all, 
                                 final_position = target, 
                                 header=True, 
                                 verbose=False, 
                                 display=True,
                                 save=False)
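
Conceptually, registering a frame means translating it by the vector that goes from its measured center to the target position. The sketch below illustrates this with a whole-pixel shift and zero filling. It is a simplification under stated assumptions: the helper `integer_shift_sketch` is hypothetical, and registration( ) may well use sub-pixel interpolation, which is not reproduced here.

```python
import numpy as np

def integer_shift_sketch(frame, dy, dx):
    """Shift a frame by whole pixels, filling uncovered pixels with zeros."""
    out = np.zeros_like(frame)
    ny, nx = frame.shape
    # Source and destination windows that stay inside the frame
    src_y = slice(max(0, -dy), min(ny, ny - dy))
    dst_y = slice(max(0, dy), min(ny, ny + dy))
    src_x = slice(max(0, -dx), min(nx, nx - dx))
    dst_x = slice(max(0, dx), min(nx, nx + dx))
    out[dst_y, dst_x] = frame[src_y, src_x]
    return out

# Synthetic frame with a bright pixel at (3, 6)
frame_demo = np.zeros((10, 10))
frame_demo[3, 6] = 1.0

initial_demo = np.array([3.2, 5.9])   # measured center (illustrative)
target_demo = np.array([5, 5])        # position where the center should end up

# Shift vector, rounded to whole pixels for this simplified example
shift_demo = np.rint(target_demo - initial_demo).astype(int)
dy_demo, dx_demo = int(shift_demo[0]), int(shift_demo[1])
registered_demo = integer_shift_sketch(frame_demo, dy_demo, dx_demo)
```

The zero-valued borders created by the shift are the reason a cropping step is useful after registration.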

###Parallactic angles 

We extract the parallactic angles from the fits headers and save them into a 1-column fits file.

In [None]:
parallactic_angles = np.array([header['PARANG'] for header in headers])

filename = '/Users/Olivier/Documents/ULg/VORTEX/Data/Cube_PSF_PA/Keck/HR8799/20150609/pa_HR8799_20150609.fits'     
write_fits(filename,parallactic_angles)

##Crop the cube<a id='crop'></a> 

This part crops the cube of registered frames. The optimal size of the cube frames is determined automatically: it maximizes the area covered by non-zero pixel values.

In [None]:
path = '/Users/Olivier/Documents/ULg/VORTEX/Data/Cube_PSF_PA/Keck/HR8799/20150609/'
filename = 'cube_HD219196_20150609.fits' #'cube_HR8799_20150609.fits'
cube_reg_crop = cube_crop_frames_optimized(cube_reg, 
                                           target[1], target[0], 
                                           ds9_indexing=True, 
                                           verbose=True, 
                                           display=True,
                                           save=False,
                                           filename=path+filename)
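
One possible reading of the cropping criterion is: find the largest square centered on the target position that contains no zero-valued pixel in any frame. The sketch below implements that interpretation on a synthetic cube. It is an assumption about the approach, and `largest_valid_half_width` is a hypothetical helper, not the algorithm of cube_crop_frames_optimized( ).

```python
import numpy as np

def largest_valid_half_width(cube, cy, cx):
    """Largest half-width of a centered square without zeros in any frame."""
    # A pixel is valid if it is non-zero in every frame of the cube
    valid = np.all(cube != 0, axis=0)
    ny, nx = valid.shape
    max_half = min(cy, cx, ny - 1 - cy, nx - 1 - cx)
    best = -1   # stays at -1 if even the central pixel is invalid
    for half in range(max_half + 1):
        if valid[cy-half:cy+half+1, cx-half:cx+half+1].all():
            best = half
        else:
            break
    return best

# Synthetic cube: zero borders such as those left by the registration shifts
cube_demo = np.ones((3, 11, 11))
cube_demo[0, :2, :] = 0     # frame 0: two empty rows at the top
cube_demo[2, :, 9:] = 0     # frame 2: two empty columns on the right

half_demo = largest_valid_half_width(cube_demo, 5, 5)
cropped_demo = cube_demo[:, 5-half_demo:5+half_demo+1, 5-half_demo:5+half_demo+1]
print(cropped_demo.shape)
```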

##Work in progress...

Display

In [None]:
path = '/Users/Olivier/Documents/ULg/VORTEX/Data/Cube_PSF_PA/Keck/HR8799/20150609/cube_HR8799_20150609.fits'
cube = open_fits(path)

display_array_ds9(cube)

PCA RDI

In [None]:
path_cube = '/Users/Olivier/Documents/ULg/VORTEX/Data/Cube_PSF_PA/Keck/HR8799/20150609/cube_HR8799_20150609.fits'
path_ref = '/Users/Olivier/Documents/ULg/VORTEX/Data/Cube_PSF_PA/Keck/HR8799/20150609/cube_HD219196_20150609.fits'
path_pa = '/Users/Olivier/Documents/ULg/VORTEX/Data/Cube_PSF_PA/Keck/HR8799/20150609/pa_HR8799_20150609.fits'

cube, header = open_fits(path_cube, header=True, verbose=False)
cube_ref, header = open_fits(path_ref, header=True, verbose=False)
angs, header = open_fits(path_pa, header=True, verbose=False)

display_array_ds9(cube_ref)

In [None]:
import vip
vip.pca.pca?

In [None]:
import vip
out = vip.pca.pca(cube, angs, cube_ref=None,ncomp=4, svd_mode='randsvd', full_output=False)
out_RDI = vip.pca.pca(cube, angs, cube_ref=cube_ref,ncomp=4, svd_mode='randsvd', full_output=False, center='global')

In [None]:
display_array_ds9(out_RDI)

Removing bad pixels

In [None]:
path_cube = '/Users/Olivier/Documents/ULg/VORTEX/Data/Cube_PSF_PA/Keck/HR8799/20150609/cube_HR8799_20150609.fits'
cube, header = open_fits(path_cube, header=True, verbose=False)

In [None]:
frame = cube[0,:,:].copy()
display_array_ds9(frame)
import vip
ind = vip.stats.clip_array(frame,3,3,neighbor=True,num_neighbor=3)  # create bad pixel map

mapp = np.zeros_like(frame)
mapp[ind]=1

In [None]:
#vip.calib.frame_bad_pixel_correction()
frame_corrected = vip.calib.frame_bad_pixel_correction(frame,mapp,3)
display_array_ds9(frame_corrected)
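
To make the two steps above more concrete (build a bad pixel map, then correct the flagged pixels), here is a simplified numpy sketch. It assumes a global sigma clipping and a replacement by the median of the valid neighbours in a 3 x 3 window. Both helpers are hypothetical and do not reproduce the VIP functions used above, whose algorithms may differ.

```python
import numpy as np

def bad_pixel_map_sketch(frame, n_sigma=3.0):
    """Flag pixels deviating from the frame median by more than n_sigma * std."""
    med = np.median(frame)
    std = np.std(frame)
    return np.abs(frame - med) > n_sigma * std

def correct_bad_pixels_sketch(frame, bad_map):
    """Replace flagged pixels by the median of their valid 3 x 3 neighbours."""
    out = frame.copy()
    ny, nx = frame.shape
    for y, x in zip(*np.nonzero(bad_map)):
        y0, y1 = max(0, y - 1), min(ny, y + 2)
        x0, x1 = max(0, x - 1), min(nx, x + 2)
        patch = frame[y0:y1, x0:x1]
        good = ~bad_map[y0:y1, x0:x1]
        if good.any():
            out[y, x] = np.median(patch[good])
    return out

# Synthetic flat frame with a single hot pixel
frame_demo = np.full((9, 9), 10.0)
frame_demo[4, 4] = 1000.0

map_demo = bad_pixel_map_sketch(frame_demo)
fixed_demo = correct_bad_pixels_sketch(frame_demo, map_demo)
```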

In [None]:
print frame[116-1,394-1]
print frame_corrected[116-1,394-1]

In [None]:
display_array_ds9(frame,frame_corrected)