# 1. Introduction
This notebook shows how to run the Stage 3 pipeline, version 1.12.5 (released 20.10.2023). Previous JWST pipeline releases can be checked here: https://github.com/spacetelescope/jwst/releases 

## 1.1 JWST PIPELINE INSTALLATION<a name="pipeline_installation"></a>
In this section I explain how to install the JWST pipeline, which is necessary for the data reduction. 

The easiest way to install the pipeline is via pip. Below we show how to create a new conda environment and activate it; the pipeline itself is then installed inline with pip later in this notebook. You can name your environment anything you like. In the lines below, replace < env_name > with your chosen environment name.

<code> conda create -n &lt;env_name&gt; python </code>
<code> conda activate &lt;env_name&gt; </code>
    
After creating the conda environment, you need to add the CRDS settings to your bash_profile. 
You can simply run these two commands in your terminal:
    
<code> touch ~/.bash_profile; open ~/.bash_profile </code>
    
Then paste these two lines inside the text file: 
    
<code> export CRDS_PATH=$HOME/crds_cache </code>
<code> export CRDS_SERVER_URL=https://jwst-crds.stsci.edu </code>

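Adding the path for the CRDS reference files to the bash_profile is recommended over the inline <code>!export</code> option inside a Jupyter notebook, because <code>!export</code> runs in a temporary subshell and does not change the environment of the notebook kernel.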
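
A quick way to confirm from inside the notebook that both variables were picked up is sketched below. The helper name `missing_crds_vars` is hypothetical; only the two variable names come from the text above.

```python
import os

# The two variables this notebook expects to find in the environment.
REQUIRED_CRDS_VARS = ("CRDS_PATH", "CRDS_SERVER_URL")


def missing_crds_vars(env=None):
    """Return the names of required CRDS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_CRDS_VARS if not env.get(name)]


missing = missing_crds_vars()
if missing:
    print("Missing CRDS variables:", ", ".join(missing))
else:
    print("CRDS environment looks complete")
```

If anything is reported as missing, the notebook was most likely started from a shell that had not yet read the updated bash_profile.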

Now, in your terminal, activate the environment < env_name >, launch Jupyter Notebook, and you are done: 

<code> conda activate < env_name > </code>
<code> jupyter notebook </code>


In [None]:
## INSTALL THE REQUIRED VERSION OF THE JWST PIPELINE INLINE ##
## AFTER INSTALLING, I NEEDED TO CLOSE AND HALT THE NOTEBOOK AND RE-OPEN IT TO IMPORT THE RIGHT VERSION OF THE PIPELINE

!pip install jwst==1.15.0

#### Please check that the pipeline version is 1.15.0

In [1]:
import jwst
print("PIPELINE VERSION = ",jwst.__version__)

PIPELINE VERSION =  1.15.0
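
As a complementary check, the installed version can be compared against the expected one programmatically. This is a minimal sketch: `installed_version` is a hypothetical helper, and it reads the installed package metadata rather than the already imported module, so it reflects a fresh `pip install` even before the kernel is restarted.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


EXPECTED = "1.15.0"  # version pinned by the pip install cell above
found = installed_version("jwst")
if found == EXPECTED:
    print(f"jwst {found} is installed")
else:
    print(f"Expected jwst {EXPECTED}, found {found}")
```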


**SET UP CRDS_CONTEXT FILE FOR REPRODUCIBILITY**

Check the list of context files here: https://jwst-crds.stsci.edu/ 

In [2]:
import os
os.environ["CRDS_CONTEXT"] = "jwst_1256.pmap"  #### RELEASE DATE 2024-07-26

In [3]:
### LOAD NECESSARY PACKAGES #####
import numpy as np
from glob import glob
import shutil
from pathlib import Path
import json
import os
from astropy.io import fits
import matplotlib.pyplot as plt
from astropy.table import vstack
import multiprocessing


In [4]:
####################### SET WORKING DIRECTORY, WHERE THE RAW DATA IS LOCATED ###########################
out_dir = "/Users/dumont/Documents/ReveaLLGN/RESULTS/M87/"
in_dir = "/Users/dumont/Documents/ReveaLLGN/DATA/M87/MAST_2024-07-10T1403/JWST/"

saving_path = out_dir + "DRS3/" 
list_folders = sorted(glob(in_dir+"*", recursive = True)) # List of folders of the RAW data

# Make sure the output directory exists before copying any data
if not os.path.exists(out_dir):
    os.makedirs(out_dir)
if not os.path.exists(saving_path):
    os.makedirs(saving_path)
###### COPY FILES TO WORKING DIRECTORY ################
print("Copying uncal files")
for folder in list_folders: 
    for uncal_file in glob(folder + "/*uncal.fits"):
        shutil.copy(uncal_file, out_dir)
print("Done copying uncal files")

Copying uncal files
Done copying uncal files
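
The copy step above can also be written as a small reusable helper. The sketch below is an alternative using `pathlib`, not the code the notebook runs; the function name `copy_matching` is hypothetical.

```python
import shutil
from pathlib import Path


def copy_matching(src_root, pattern, dest):
    """Copy files matching `pattern` in each sub-folder of `src_root` to `dest`.

    Returns the sorted list of copied destination paths.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)  # creates missing parents too
    copied = []
    for folder in sorted(p for p in Path(src_root).iterdir() if p.is_dir()):
        for src in sorted(folder.glob(pattern)):
            copied.append(Path(shutil.copy(src, dest)))
    return sorted(copied)
```

With the variables defined earlier, a call such as `copy_matching(in_dir, "*uncal.fits", out_dir)` would reproduce the loop above, and the same helper could be reused for the association files later on.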


# RUN STAGE 1 PIPELINE
### with multiprocessing

Here we load and run the STAGE1 pipeline, saving the JUMP output so that snowballs can be flagged with an external code called "snowblind". The notebook uses the Python script **JWST_PIPELINE.py** to run specific steps of the pipeline and to enable multiprocessing, which speeds up the data reduction.
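
The multiprocessing pattern used throughout this notebook pairs every input file with the output directory and hands each pair to a worker through `Pool.starmap`. Here is a minimal sketch of that pattern; `fake_pipeline` is a hypothetical stand-in for the functions in **JWST_PIPELINE.py**, which are not shown here.

```python
import multiprocessing


def build_jobs(files, out_dir):
    """Pair every input file with the same output directory."""
    return [(f, out_dir) for f in files]


def fake_pipeline(input_file, out_dir):
    """Hypothetical stand-in for a pipeline function taking (file, out_dir)."""
    return f"{out_dir}/{input_file}"


def run_parallel(worker, jobs, n_cores):
    """Call worker(*job) for every job using a pool of n_cores processes."""
    with multiprocessing.Pool(n_cores) as pool:
        return pool.starmap(worker, jobs)
```

A call like `run_parallel(fake_pipeline, build_jobs(files, out_dir), n_cores=5)` returns one result per input file, in input order. The worker must be defined at module level (as the pipeline functions are, since they are imported from a script) so that it can be sent to the worker processes.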

<span style="color:red"> IMPORTANT !!! </span> The STAGE1 pipeline requires OpenCV to run JUMP detection. Please pip install OpenCV in your terminal before opening this Jupyter notebook:

<code> pip install -q opencv-python </code>
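
To check whether OpenCV is already available in the active environment before launching a long run, something like the following can be used. The helper name `has_module` is hypothetical; `cv2` is the module name that the opencv-python package provides.

```python
import importlib.util


def has_module(name):
    """Return True if `name` can be imported in the current environment."""
    return importlib.util.find_spec(name) is not None


if not has_module("cv2"):  # opencv-python installs the `cv2` module
    print("OpenCV not found: run `pip install opencv-python` and restart the kernel")
```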


### FIRST RUN STAGE 1 PIPELINE ONLY TO SAVE JUMP FILES TO REMOVE SNOWBALLS
If you want to skip the snowball flagging, simply run the STAGE1 pipeline normally, i.e. remove the jump.expand_large_events = False setting below.
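
**JWST_PIPELINE.py** is not reproduced in this notebook, so the following is only a guess at what the first pass could look like when written directly against the `jwst` package. It assumes the step names `jump`, `ramp_fit` and `gain_scale` and the `Detector1Pipeline.call` interface; the two function names are hypothetical.

```python
def det1_steps_before_snowblind():
    """Step overrides: stop after jump detection and keep the *_jump.fits file."""
    return {
        # Save the jump product and leave large-event (snowball) handling
        # to the external snowblind code.
        "jump": {"save_results": True, "expand_large_events": False},
        # The two steps after jump detection are run in a later pass.
        "ramp_fit": {"skip": True},
        "gain_scale": {"skip": True},
    }


def run_det1_before_snowblind(uncal_file, out_dir):
    """Run calwebb_detector1 up to the jump step (requires the jwst package)."""
    from jwst.pipeline import Detector1Pipeline  # imported lazily on purpose

    return Detector1Pipeline.call(
        uncal_file, output_dir=out_dir, steps=det1_steps_before_snowblind()
    )
```

Keeping the overrides in a separate function makes it easy to inspect or log the exact settings before any data are processed.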

In [None]:
from JWST_PIPELINE import PIPELINE_DETECTOR1_BEFORE_SNOWBLIND,mk_config_det1_before_snowblind

# get the uncal files to run
uncal_files_list =  glob(out_dir + "/*_uncal.fits")
## MAKE CONFIG FILE ##
mk_config_det1_before_snowblind(out_dir,uncal_files_list[0]) ### CHOOSE RANDOM UNCAL FILE
print('Will run the pipeline on {} files'.format(len(uncal_files_list)))
# the output list should be the same length as the files to run
outptd = [out_dir for _ in range(len(uncal_files_list))]
# set the pool and run multiprocess
cores2use = 5 ### NUMBER OF CORES TO USE 
with multiprocessing.Pool(cores2use) as pool:
    pool.starmap(PIPELINE_DETECTOR1_BEFORE_SNOWBLIND, zip(uncal_files_list, outptd))
    
### REMOVE TEMPORARY UNCAL FILES OUTPUT DIRECTORY ################
for file in glob(out_dir + "*_uncal.fits"):
    os.remove(file)  
### COPY JUMP TO SAVING PATH #############
for jump_file in glob(out_dir + "*_jump.fits"):
    jump_folder = saving_path + 'JUMPS_B_snowblind/'
    if not os.path.exists(jump_folder):
        os.makedirs(jump_folder)
    shutil.copy(jump_file, jump_folder)     

## RUN SNOWBALL REMOVAL WITH SNOWBLIND

> __IMPORTANT CHANGES FROM PREVIOUS PIPELINE VERSION__
In the previous pipeline routine we used Chris Willott's code _DoSnowballFlag_ to remove large cosmic ray hits. 
Since then, a new code, _Snowblind_, has been developed by James Davies at MPIA, and based on my tests it performs better. Thus, we use it here instead of _DoSnowballFlag_.

Based on tests on leak images from M81 and NGC4395, the best parameters are:
1. min_radius = 3
2. growth_factor = 1.5
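
The two parameters above could be passed to Snowblind roughly as follows. This is a sketch under assumptions: the `run_snowblind` function in **JWST_PIPELINE.py** is not shown, and the call below assumes that `SnowblindStep` accepts `min_radius`, `growth_factor`, `save_results`, `suffix` and `output_dir` as keyword arguments. Both function names are hypothetical.

```python
def snowblind_params(min_radius=3, growth_factor=1.5):
    """Collect the snowball-flagging parameters quoted in the list above."""
    return {
        "min_radius": min_radius,
        "growth_factor": growth_factor,
        "save_results": True,
        "suffix": "snowblind",  # output files end in _snowblind.fits
    }


def run_snowblind_sketch(out_dir, jump_file):
    """Flag snowballs in one jump file (requires the snowblind package)."""
    from snowblind import SnowblindStep  # imported lazily on purpose

    return SnowblindStep.call(jump_file, output_dir=out_dir, **snowblind_params())
```

The argument order `(out_dir, jump_file)` mirrors the order in which the notebook zips its arguments for the snowblind run.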

In [None]:
!pip install snowblind

In [None]:
from JWST_PIPELINE import run_snowblind
### LOAD ALL JUMP FILES FROM THE OUTPUT DIRECTORY
jump_files_list  =  glob(out_dir + "/*_jump.fits")
# set the pool and run multiprocess
# the output list should be the same length as the files to run
outptd = [out_dir for _ in range(len(jump_files_list))]
cores2use = 5
with multiprocessing.Pool(cores2use) as pool:
    pool.starmap(run_snowblind, zip(outptd,jump_files_list))
       
#### CHANGE THE SUFFIX OF TEMPORARY SNOWBLIND_JUMPS TO JUMPS
for jump_file in glob(out_dir + "*_snowblind.fits"):
    new_name = os.path.basename(jump_file).replace("snowblind","jump")
    os.replace(jump_file, os.path.join(os.path.dirname(jump_file),new_name) )
    

## RUN THE REST OF STAGE1 PIPELINE ON JUMP.FITS FILES 

Since the previous pass skipped the two steps that follow JUMP detection (RampFitStep and GainScaleStep), we now need to run the pipeline again, skipping every step that comes before RampFitStep.
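
One way to express "skip everything before ramp fitting" is a step-override dictionary like the one sketched here. The list of step names is an assumption for near-infrared data and should be checked against the documentation of the installed `jwst` version; the function name is hypothetical and the actual configuration lives in **JWST_PIPELINE.py**.

```python
# Steps assumed to run before ramp fitting in calwebb_detector1 for
# near-infrared data (verify against your jwst version before relying on it).
PRE_RAMPFIT_STEPS = (
    "group_scale", "dq_init", "saturation", "ipc", "superbias", "refpix",
    "linearity", "persistence", "dark_current", "charge_migration", "jump",
)


def det1_steps_after_snowblind():
    """Step overrides that skip every step listed in PRE_RAMPFIT_STEPS."""
    return {name: {"skip": True} for name in PRE_RAMPFIT_STEPS}
```

Passing this dictionary as `steps=` when calling the detector pipeline on a `_jump.fits` file would leave only ramp fitting and gain scaling active.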

#### Running DETECTOR1 pipeline starting from RampFitStep


In [None]:
from JWST_PIPELINE import PIPELINE_DETECTOR1_AFTER_SNOWBLIND,mk_config_det1_after_snowblind

# get the jump files to run
jump_files_list  =  glob(out_dir + "/*_jump.fits")
## MAKE CONFIG FILE ##
mk_config_det1_after_snowblind(out_dir,jump_files_list[0]) ### CHOOSE RANDOM JUMP FILE
print('Will run the pipeline on {} files'.format(len(jump_files_list)))
# the output list should be the same length as the files to run
outptd = [out_dir for _ in range(len(jump_files_list))]
# set the pool and run multiprocess
cores2use = 5 ### NUMBER OF CORES TO USE 
with multiprocessing.Pool(cores2use) as pool:
    pool.starmap(PIPELINE_DETECTOR1_AFTER_SNOWBLIND, zip(jump_files_list, outptd))
    

### REMOVE TEMPORARY RATEINTS FILES OUTPUT DIRECTORY ################
for file in glob(out_dir + "*_rateints.fits"):
    os.remove(file)  
### REMOVE JUMP FILES FROM OUTPUT DIRECTORY ################
for file in glob(out_dir + "*_jump.fits"):
    os.remove(file)  
### COPY RATE TO SAVING PATH #############
for rate_file in glob(out_dir + "*_rate.fits"):
    rate_folder = saving_path + 'RATES/'
    if not os.path.exists(rate_folder):
        os.makedirs(rate_folder)
    shutil.copy(rate_file, rate_folder)   


# RUN STAGE 2 PIPELINE

NOW THAT SNOWBALLS HAVE BEEN IDENTIFIED AND THERMAL FLUCTUATION REMOVED, WE CAN RUN THE STAGE 2 PIPELINE ON THE SCIENCE AND IMPRINT EXPOSURES
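
Stage 2 is driven by association (ASN) JSON files, which list the exposures that belong together and their role (for example science or imprint). To see what a given association contains before running the pipeline, a small reader like this sketch can help. It assumes the usual ASN layout with a `products` list whose entries hold `members` with `expname` and `exptype` keys; the function name is hypothetical.

```python
import json


def asn_members(asn_path):
    """Return (expname, exptype) pairs for every member of every product."""
    with open(asn_path) as fh:
        asn = json.load(fh)
    return [
        (member["expname"], member["exptype"])
        for product in asn.get("products", [])
        for member in product.get("members", [])
    ]
```

Calling `asn_members` on one of the copied JSON files shows which rate files the pipeline will look for in the working directory.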


In [None]:
###### COPY STAGE 2 ASSOCIATION FILES (NRS1) TO WORKING DIRECTORY ################
print("Copying nrs1 json files")
for folder in list_folders:
    for json_file in glob(folder.rsplit("/",1)[0] + "/*_nrs1/*json") :
        shutil.copy(json_file, out_dir)
print("Done copying nrs1 json files")
###### COPY STAGE 2 ASSOCIATION FILES (NRS2) TO WORKING DIRECTORY ################
print("Copying nrs2 json files")
for folder in list_folders:
    for json_file in glob(folder.rsplit("/",1)[0] + "/*_nrs2/*json") :
        shutil.copy(json_file, out_dir)
print("Done copying nrs2 json files")

In [5]:
from JWST_PIPELINE import PIPELINE_DETECTOR2,mk_config_det2

###### LOAD ASSOCIATION FILES
asn_files = glob(out_dir+"/*.json")
## MAKE CONFIG FILE ##
mk_config_det2(out_dir,asn_files[0]) ### CHOOSE RANDOM ASSOCIATION FILE
print('Will run the pipeline on {} files'.format(len(asn_files)))
# the output list should be the same length as the files to run
outptd = [out_dir for _ in range(len(asn_files))]
# set the pool and run multiprocess
cores2use = 5 ### NUMBER OF CORES TO USE 
with multiprocessing.Pool(cores2use) as pool:
    pool.starmap(PIPELINE_DETECTOR2, zip(asn_files, outptd))

### COPY CAL TO SAVING PATH #############
for cal_file in glob(out_dir + "*_cal.fits"):
    cal_folder = saving_path + 'CAL/'
    if not os.path.exists(cal_folder):
        os.makedirs(cal_folder)
    shutil.copy(cal_file, cal_folder)           

######################### CREATING CONFIGURATION FILE ###############################


2024-07-16 18:35:46,163 - CRDS - ERROR -  Error determining best reference for 'pars-badpixselfcalstep'  =   Unknown reference type 'pars-badpixselfcalstep'
2024-07-16 18:35:46,165 - CRDS - ERROR -  Error determining best reference for 'pars-nscleanstep'  =   Unknown reference type 'pars-nscleanstep'
2024-07-16 18:35:46,194 - stpipe - INFO - PARS-RESAMPLESPECSTEP parameters found: /Users/dumont/crds_cache/references/jwst/nirspec/jwst_nirspec_pars-resamplespecstep_0001.asdf
2024-07-16 18:35:46,229 - stpipe.Spec2Pipeline - INFO - Spec2Pipeline instance created.
2024-07-16 18:35:46,232 - stpipe.Spec2Pipeline.assign_wcs - INFO - AssignWcsStep instance created.
2024-07-16 18:35:46,234 - stpipe.Spec2Pipeline.badpix_selfcal - INFO - BadpixSelfcalStep instance created.
2024-07-16 18:35:46,235 - stpipe.Spec2Pipeline.msa_flagging - INFO - MSAFlagOpenStep instance created.
2024-07-16 18:35:46,237 - stpipe.Spec2Pipeline.nsclean - INFO - NSCleanStep instance created.
2024-07-16 18:35:46,238 - stpip

######################### CONFIGURATION FILE SAVED ###############################
Will run the pipeline on 20 files


2024-07-16 18:35:49,445 - stpipe.Spec2Pipeline - INFO - Spec2Pipeline instance created.
2024-07-16 18:35:49,446 - stpipe.Spec2Pipeline.assign_wcs - INFO - AssignWcsStep instance created.
2024-07-16 18:35:49,447 - stpipe.Spec2Pipeline - INFO - Spec2Pipeline instance created.
2024-07-16 18:35:49,448 - stpipe.Spec2Pipeline.badpix_selfcal - INFO - BadpixSelfcalStep instance created.
2024-07-16 18:35:49,448 - stpipe.Spec2Pipeline.msa_flagging - INFO - MSAFlagOpenStep instance created.
2024-07-16 18:35:49,449 - stpipe.Spec2Pipeline.assign_wcs - INFO - AssignWcsStep instance created.
2024-07-16 18:35:49,449 - stpipe.Spec2Pipeline.nsclean - INFO - NSCleanStep instance created.
2024-07-16 18:35:49,450 - stpipe.Spec2Pipeline.badpix_selfcal - INFO - BadpixSelfcalStep instance created.
2024-07-16 18:35:49,450 - stpipe.Spec2Pipeline.msa_flagging - INFO - MSAFlagOpenStep instance created.
2024-07-16 18:35:49,451 - stpipe.Spec2Pipeline.bkg_subtract - INFO - BackgroundStep instance created.
2024-07-16

# RUN STAGE3 PIPELINE

Here I run the STAGE3 pipeline with <span style="color:blue"> OUTLIER_DETECTION = ON </span>, <span style="color:blue"> cube_build.weighting='drizzle' </span> and instrument-aligned cubes <span style="color:blue"> cube_build.coord_system= "ifualign"  </span>. To change the weighting function used for the cube reconstruction, go to the function <code>mk_config_det3() </code> in JWST_PIPELINE.py and set <span style="color:blue"> cube_build.weighting='emsm' </span>; for sky-aligned cubes (the default output of the pipeline) set <span style="color:blue"> cube_build.coord_system = "skyalign"  </span>
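
The options described above can be collected in a step-override dictionary. The sketch below only covers the values named in this notebook and is not the content of <code>mk_config_det3()</code>, which is not shown here; the function and constant names are hypothetical.

```python
# Values mentioned in this notebook; the pipeline may accept others as well.
WEIGHTINGS = ("drizzle", "emsm")
COORD_SYSTEMS = ("ifualign", "skyalign")


def spec3_steps(weighting="drizzle", coord_system="ifualign"):
    """Build Stage 3 step overrides for outlier detection and cube building."""
    if weighting not in WEIGHTINGS:
        raise ValueError(f"weighting must be one of {WEIGHTINGS}")
    if coord_system not in COORD_SYSTEMS:
        raise ValueError(f"coord_system must be one of {COORD_SYSTEMS}")
    return {
        "outlier_detection": {"skip": False},
        "cube_build": {"weighting": weighting, "coord_system": coord_system},
    }
```

For example, `spec3_steps("emsm", "skyalign")` gives the settings for EMSM-weighted, sky-aligned cubes, and the result could be passed as `steps=` to a `Spec3Pipeline.call(...)` invocation.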




#### Now copy the STAGE3 ASN JSON FILES
PLEASE DELETE THE STAGE 2 JSON FILES AND COPY THE STAGE 3 JSON FILES INTO THE OUT_DIR

In [6]:
#### Let's first loop over the JSON files in the working directory and delete them.
for file in glob(out_dir + "*.json"):
    os.remove(file)  
###### COPY STAGE 3 ASSOCIATION FILES TO WORKING DIRECTORY ################
print("Copying json files")
for folder in list_folders:
    for json_file in glob(folder.rsplit("/",1)[0]+"/jw02016-*" + "/*.json" ) :
        shutil.copy(json_file, out_dir)
print("Done copying json files")

Copying json files
Done copying json files


In [5]:
# LOAD THE calwebb_spec3 pipeline
from JWST_PIPELINE import PIPELINE_DETECTOR3,mk_config_det3

###### LOAD ASSOCIATION FILES
asn_files = glob(out_dir+"/*.json") ## ALL JSON FILES IN WORKING DIRECTORY 
## MAKE CONFIG FILE ##
mk_config_det3(out_dir,asn_files[0]) ### CHOOSE RANDOM ASSOCIATION FILE
print('Will run the pipeline on {} files'.format(len(asn_files)))
# the output list should be the same length as the files to run
outptd = [out_dir for _ in range(len(asn_files))]
# set the pool and run multiprocess
cores2use = 5 ### NUMBER OF CORES TO USE 
with multiprocessing.Pool(cores2use) as pool:
    pool.starmap(PIPELINE_DETECTOR3, zip(asn_files, outptd))


### COPY CUBES TO SAVING PATH #############
for cube in glob(out_dir + "*_s3d.fits"):
    drizzle_folder = saving_path + 'DRIZZLE/'
    if not os.path.exists(drizzle_folder):
        os.makedirs(drizzle_folder)
    shutil.copy(cube, drizzle_folder)   
#############################################################################################
### Remove all files from working directory   
#### Loop over the FITS and JSON files in the working directory and delete them.
for file in glob(out_dir + "*.fits"):
    os.remove(file)  
for file in glob(out_dir + "*.json"):
    os.remove(file)  
for file in glob(saving_path + "*_crf.fits"):
    os.remove(file)        
    
### COPY LOG FILES TO SAVING PATH #############
for log in glob(out_dir + "/*"):
    log_folder = saving_path + 'LOG/'
    if not os.path.exists(log_folder):
        os.makedirs(log_folder)
    if os.path.isfile(log):
        shutil.copy(log, log_folder)    
        os.remove(log)  

######################### CREATING CONFIGURATION FILE ###############################


2024-07-16 22:54:08,081 - stpipe - INFO - PARS-OUTLIERDETECTIONSTEP parameters found: /Users/dumont/crds_cache/references/jwst/nirspec/jwst_nirspec_pars-outlierdetectionstep_0001.asdf
2024-07-16 22:54:08,099 - stpipe - INFO - PARS-RESAMPLESPECSTEP parameters found: /Users/dumont/crds_cache/references/jwst/nirspec/jwst_nirspec_pars-resamplespecstep_0001.asdf
2024-07-16 22:54:08,119 - CRDS - ERROR -  Error determining best reference for 'pars-spectralleakstep'  =   Unknown reference type 'pars-spectralleakstep'
2024-07-16 22:54:08,148 - stpipe.Spec3Pipeline - INFO - Spec3Pipeline instance created.
2024-07-16 22:54:08,150 - stpipe.Spec3Pipeline.assign_mtwcs - INFO - AssignMTWcsStep instance created.
2024-07-16 22:54:08,151 - stpipe.Spec3Pipeline.master_background - INFO - MasterBackgroundStep instance created.
2024-07-16 22:54:08,154 - stpipe.Spec3Pipeline.mrs_imatch - INFO - MRSIMatchStep instance created.
2024-07-16 22:54:08,157 - stpipe.Spec3Pipeline.outlier_detection - INFO - OutlierD

######################### CONFIGURATION FILE SAVED ###############################
Will run the pipeline on 2 files


2024-07-16 22:54:12,618 - stpipe.Spec3Pipeline - INFO - Spec3Pipeline instance created.
2024-07-16 22:54:12,619 - stpipe.Spec3Pipeline - INFO - Spec3Pipeline instance created.
2024-07-16 22:54:12,620 - stpipe.Spec3Pipeline.assign_mtwcs - INFO - AssignMTWcsStep instance created.
2024-07-16 22:54:12,620 - stpipe.Spec3Pipeline.assign_mtwcs - INFO - AssignMTWcsStep instance created.
2024-07-16 22:54:12,621 - stpipe.Spec3Pipeline.master_background - INFO - MasterBackgroundStep instance created.
2024-07-16 22:54:12,621 - stpipe.Spec3Pipeline.master_background - INFO - MasterBackgroundStep instance created.
2024-07-16 22:54:12,622 - stpipe.Spec3Pipeline.mrs_imatch - INFO - MRSIMatchStep instance created.
2024-07-16 22:54:12,622 - stpipe.Spec3Pipeline.mrs_imatch - INFO - MRSIMatchStep instance created.
2024-07-16 22:54:12,623 - stpipe.Spec3Pipeline.outlier_detection - INFO - OutlierDetectionStep instance created.
2024-07-16 22:54:12,624 - stpipe.Spec3Pipeline.outlier_detection - INFO - Outlier