# Imaging Mode Data Calibration

**Date**: May 29, 2020

## Table of Contents
* [Introduction](#intro)
* [Pipeline Resources and Documentation](#resources)
   * [Installation](#installation)
   * [Reference Files](#reference_files)
* [Imports](#Imports_ID)
* [Convenience Functions](#convenience_functions)
* [Download Data](#download_data)
* [Methods for calling steps/pipelines](#calling_methods)
   * [run() method](#run_method)
   * [ASDF configuration files](#asdf_config)
   * [call() method](#call_method)
   * [command line](#command_line)
* [calwebb_detector1](#detector1)
* [calwebb_image2](#image2)
* [calwebb_image3](#image3)


<a id='intro'></a>
## Introduction

The full set of steps run by this first stage of the CalWebb pipeline can be found [<a href="https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_detector1.html#calwebb-detector1">here</a>]. In what follows, we dedicate one section to each step, calibrating the data sequentially and exploring the outputs along the way.

There are several ways to call the pipeline: the `run()` method, the `call()` method, and the command line. This notebook shows all three.
All step parameters have default values that the pipeline will use if the user does not provide values.

<a id='resources'></a>
## Pipeline Resources and Documentation

There are several different places to find information on installing and running the pipeline. This notebook will give a shortened description of the steps pulled from the detailed pipeline information pages, but to find more in-depth instructions use the links below.

* [GitHub repository, with installation instructions](https://github.com/spacetelescope/jwst/blob/master/README.md)
* [Pipeline documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/introduction.html)
* [Help Desk](https://stsci.service-now.com/jwst?id=sc_cat_item&sys_id=27a8af2fdbf2220033b55dd5ce9619cd&sysparm_category=e15706fc0a0a0aa7007fc21e1ab70c2f)

<a id='installation'></a>
### Installation

The easiest way to install the pipeline is via `pip`. Below we show how to create a new conda environment, activate that environment, and then install the pipeline. For more detailed instructions, see the [installation instructions](https://github.com/spacetelescope/jwst/blob/master/README.md) on GitHub. You can name your environment anything you like. In the lines below, replace `<env_name>` with your chosen environment name.

>`conda create -n <env_name> python`<br>
>`conda activate <env_name>`<br>
>`pip install jwst`

<a id='reference_files'></a>
### Reference Files

People at STScI should automatically have access to the Calibration Reference Data System (CRDS) cache for running the pipeline. For outside users, it is recommended to have the CRDS server download the reference files to your local system and use that local cache when running the pipeline. To do that, there are two environment variables that should be set prior to calling the pipeline. These are the CRDS_PATH and CRDS_SERVER_URL variables. In the example below, reference files will be downloaded to the "crds_cache" directory under the home directory.

>`$ export CRDS_PATH=$HOME/crds_cache`<br>
>`$ export CRDS_SERVER_URL=https://jwst-crds.stsci.edu`

The first time you invoke the pipeline, the CRDS server should download all of the context and reference files that are needed for that pipeline run, and dump them into the CRDS_PATH directory. Subsequent executions of the pipeline will first look to see if it has what it needs in CRDS_PATH and anything it doesn't have will be downloaded from the STScI cache. 
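If you prefer to configure CRDS from within Python (for example, at the top of a notebook), the same two variables can be set through `os.environ`. The sketch below reuses the `crds_cache` directory name from the example above and assumes, to be on the safe side, that the variables should be defined before the `jwst` package is first imported.

```python
import os

# Only set the variables if they are not already defined in the shell,
# so that an existing configuration is left untouched.
os.environ.setdefault('CRDS_PATH',
                      os.path.join(os.path.expanduser('~'), 'crds_cache'))
os.environ.setdefault('CRDS_SERVER_URL', 'https://jwst-crds.stsci.edu')

print(os.environ['CRDS_PATH'])
print(os.environ['CRDS_SERVER_URL'])
```

Using `setdefault` rather than a plain assignment is a design choice: users at STScI who already have a working CRDS configuration keep it.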

<a id='Imports_ID'></a>
## Imports

Here are the libraries being imported and why:

- `numpy` for numerical calculations.
- `matplotlib.pyplot` for plots.
- `astropy.io.fits` for importing fits files.
- `jwst.pipeline.calwebb_detector1` for using the CalWebb Detector 1 stage.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import os

from astropy.io import fits
from jwst.pipeline import calwebb_detector1
from jwst.pipeline import Detector1Pipeline  # used when running the full pipeline
from jwst.group_scale import GroupScaleStep
from jwst.dq_init import DQInitStep
from jwst.saturation import SaturationStep
from jwst.superbias import SuperBiasStep
from jwst.ipc import IPCStep
from jwst.refpix import RefPixStep
from jwst.linearity import LinearityStep
from jwst.dark_current import DarkCurrentStep
from jwst.jump import JumpStep
from jwst.ramp_fitting import RampFitStep
from jwst.gain_scale import GainScaleStep
from jwst import datamodels
from jwst.datamodels import dqflags  # DQ flag definitions, used in the saturation section

Version of the pipeline we are running:

In [None]:
import jwst
print(jwst.__version__)

<a id='convenience_functions'></a>
## Define convenience functions and parameters

Here we define some functions that we will use repeatedly throughout the notebook.

In [None]:
# Files created in this notebook will be saved in a subdirectory
# of the current working directory called `output`
output_dir = 'output'

In [None]:
def show(data):
    """Show the input image on the screen
    """
    plt.figure(figsize=(8, 8))
    plt.imshow(data, origin='lower')
    plt.colorbar()
    plt.show()

In [None]:
def side_by_side(data1, data2):
    """Show two images side by side for easy comparison
    """
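    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6))
    ax1.imshow(data1, origin='lower')
    ax2.imshow(data2, origin='lower')
    plt.show()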

<a id='download_data'></a>
## Download Data

In [None]:
def download(url, directory='./'):
    """Download a file from a given URL
    
    Parameters
    ----------
    url : str
        URL of the file to be downloaded
        
    directory : str
        Directory into which the file should be saved
    """
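    from urllib.request import urlretrieve

    # Save the file under its original name inside `directory`
    filename = os.path.join(directory, os.path.basename(url))
    urlretrieve(url, filename)
    return filename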

In [None]:
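# TODO: download the example files from Box here, then set `uncal_file`
# to the name of the downloaded *_uncal.fits file (it is used throughout
# the rest of this notebook).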

<a id='calling_methods'></a>
## Methods for calling steps/pipelines

There are three common methods by which the pipeline or pipeline steps can be called. From within python, the `run()` and `call()` methods can be used. Or, the `strun` command can be used from the command line. When using the `call()` method or `strun`, optional input parameters can be specified via [configuration files](#asdf_config). When using the `run()` method, these parameters are specified within python. See below for details on all three methods.

For the pipeline/step calls in this notebook, we will show how to use all three methods.

<a id='run_method'></a>
### run() method

When using the `run()` method, optional input parameters are specified using attributes of the pipeline or step class, rather than configuration files. 
[example usage of run() method](https://jwst-pipeline.readthedocs.io/en/stable/jwst/stpipe/call_via_run.html)

<a id='asdf_config'></a>
### ASDF Configuration Files

When calling a pipeline or pipeline step using the `call()` method or the command line, optional input parameters can be specified in an ASDF configuration file.

[ASDF configuration file details](https://jwst-pipeline.readthedocs.io/en/stable/jwst/stpipe/config_asdf.html#config-asdf-files)

<a id='call_method'></a>
### call() method

[example usage of call() method](https://jwst-pipeline.readthedocs.io/en/stable/jwst/stpipe/call_via_call.html)

<a id='command_line'></a>
### Command line

[example usage of command line calls](https://jwst-pipeline.readthedocs.io/en/stable/jwst/introduction.html?highlight=%22command%20line%22#running-from-the-command-line)
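As a rough illustration of what a command-line call looks like, the sketch below assembles an `strun` command as a list of arguments without running it. The helper `build_strun_command` and the file name `example_uncal.fits` are hypothetical, introduced only for this example. The `--output_dir` and `--steps.<step>.<parameter>` options follow the pipeline documentation linked above, so check that page for the options supported by your installed version.

```python
import shlex


def build_strun_command(pipeline, input_file, output_dir=None, step_params=None):
    """Assemble an strun command line as a list of arguments.

    This only builds the command; it does not execute anything.
    """
    cmd = ['strun', pipeline, input_file]
    if output_dir is not None:
        cmd.append('--output_dir={}'.format(output_dir))
    # Per-step parameters are passed as --steps.<step>.<parameter>=<value>
    for step, params in (step_params or {}).items():
        for name, value in params.items():
            cmd.append('--steps.{}.{}={}'.format(step, name, value))
    return cmd


cmd = build_strun_command('jwst.pipeline.Detector1Pipeline',
                          'example_uncal.fits',  # hypothetical file name
                          output_dir='output',
                          step_params={'jump': {'rejection_threshold': 6}})

# Print the command as it would be typed in a terminal
print(' '.join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` to execute the call from within Python, or the printed string can be pasted into a terminal.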

---
<a id='detector1'></a>
## The calwebb_detector1 pipeline

The calwebb_detector1 pipeline takes raw multiaccum ramps as input and produces slope images as output. It is composed of multiple steps, and which steps are run or skipped depends on the instrument. Data for all observation modes go through calwebb_detector1.

0. [Run the entire pipeline](#detector1_at_once)
1. [The `group_scale` step](#groupscale)
2. [The `dq_init` step](#dq_init)
3. [The `saturation` step](#saturation)
4. [The `superbias` step](#superbias)
5. [The `refpix` step](#refpix)
6. [The `linearity` step](#linearity)
7. [The `darkcurrent` step](#dc)
8. [The `jump` step](#jump)
9. [The `ramp_fitting` step](#ramp_fitting)
10. [The `gain_scale` step](#gain_scale)

<a id='detector1_at_once'></a>
## Run the entire `calwebb_detector1` pipeline

In this section we show how to run the entire calwebb_detector1 pipeline with a single call. We set parameter values for some of the individual steps, save some outputs, etc, and then call the pipeline.

[Pipeline output suffixes](https://jwst-pipeline.readthedocs.io/en/stable/jwst/introduction.html#pipeline-step-suffix-definitions)

In subsequent sections, we show how to run each step individually.


In [None]:
detector1_output_file = 'XXXXXX_rate.fits'

##### Using the run() method

In [None]:
# Using the run() method
detector1 = Detector1Pipeline()
detector1.output_dir = output_dir
detector1.save_results = True

# Set some parameters for some of the steps, in
# order to show how it's done
detector1.refpix.use_side_ref_pix = True
detector1.linearity.save_results = True
detector1.jump.rejection_threshold = 6

# Run the pipeline
detector1.run(uncal_file)

##### Using the call() method

In [None]:
# Using the call() method - !!!!!need to go into cfg files here!!!!
Detector1Pipeline.call(uncal_file, output_dir=output_dir, save_results=True)

##### From the command line

In [None]:
# Calling from the command line
# strun XXXXXXXXXXXX

### Examine the outputs

The primary output of the calwebb_detector1 pipeline is a file containing a rate image for the exposure. The units of the data are ADU/sec.  

In [None]:
# The pipeline was told to save its results in output_dir
rate_file = os.path.join(output_dir, uncal_file.replace('uncal.fits', 'rate.fits'))

In [None]:
rate_data = fits.getdata(rate_file)

In [None]:
show(rate_data)

Also, since we set the `detector1.linearity.save_results` parameter to True in the call above, the pipeline saved the results of the linearity step. In this case, the output file will have the same name as the input uncal file, but with the suffix 'linearity' rather than 'uncal'. 

**NOTE:** This differs slightly from the case where we call the linearity step itself and save the results. In that case, the output file will have the suffix 'linearitystep' rather than 'linearity'.
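Because the suffix depends on how a step was called, it can be convenient to build the expected output names in one place. The helper below, `step_output_name`, is a hypothetical convenience function (not part of the pipeline), and `example_uncal.fits` is a made-up file name. It simply swaps the `uncal` suffix for the requested one and prepends the output directory.

```python
import os


def step_output_name(uncal_name, suffix, output_dir='output'):
    """Return the expected output path for a given product suffix."""
    base = os.path.basename(uncal_name)
    if not base.endswith('_uncal.fits'):
        raise ValueError('Expected a file name ending in _uncal.fits')
    stem = base[:-len('uncal.fits')]  # keeps the trailing underscore
    return os.path.join(output_dir, stem + suffix + '.fits')


# Suffix used when the full pipeline saves the linearity step results
print(step_output_name('example_uncal.fits', 'linearity'))

# Suffix used when the linearity step is called on its own
print(step_output_name('example_uncal.fits', 'linearitystep'))
```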

In [None]:
linear_file = os.path.join(output_dir, uncal_file.replace('uncal.fits', 'linearity.fits'))

In [None]:
lin_data = fits.getdata(linear_file)

In [None]:
# Let's look at the data in the final group of the linearized data:
show(lin_data[0, -1, :, :])

## Individual Steps

In the sections below we run the steps contained within calwebb_detector1 one at a time, in order to more clearly see what each step is doing.

<a id='groupscale'></a>
## The `group_scale` step

#### Summary

This step rescales pixel values in the raw JWST science products in cases where multiple [frames](https://jwst-docs.stsci.edu/understanding-exposure-times#UnderstandingExposureTimes-uptherampHowup-the-rampreadoutswork) were averaged on-board to create the [groups](https://jwst-docs.stsci.edu/understanding-exposure-times#UnderstandingExposureTimes-uptherampHowup-the-rampreadoutswork) in the multiaccum ramp, but the number of frames per group is not a power of 2. This occurs primarily in [NIRSpec IRS^2 data that use the NRSIRS2 readout pattern (see Table 1)](https://jwst-docs.stsci.edu/near-infrared-spectrograph/nirspec-instrumentation/nirspec-detectors/nirspec-detector-readout-modes-and-patterns). Data with no frame averaging, or where the number of frames is a power of 2, will not be affected by this step. See Figure 2 in the [NIRCam detector readout](https://jwst-docs.stsci.edu/near-infrared-camera/nircam-instrumentation/nircam-detector-overview/nircam-detector-readout-patterns) page for some examples of readout patterns where multiple frames are averaged to create each group.
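To make the rescaling concrete, here is a small numerical sketch. It assumes, based on our reading of the step description in the pipeline documentation, that the on-board electronics divide the summed frames by a power of 2 (the frame divisor) rather than by the true number of frames, and that the step corrects for this by multiplying by `frame_divisor / nframes`. The function name `rescale_groups` and the numbers used are illustrative only.

```python
import numpy as np


def rescale_groups(data, nframes, frame_divisor):
    """Rescale group values that were averaged with the wrong divisor."""
    if nframes == frame_divisor:
        # Averaging was already correct; nothing to do
        return data.astype(float)
    return data * (frame_divisor / nframes)


true_signal = 100.0              # signal in every frame (arbitrary units)
nframes, frame_divisor = 5, 4    # 5 frames summed, but divided by 4 on-board

# Value recorded on-board: sum of 5 frames divided by 4
onboard_value = np.array([true_signal * nframes / frame_divisor])
rescaled_value = rescale_groups(onboard_value, nframes, frame_divisor)

print(onboard_value)
print(rescaled_value)
```

When the number of frames is itself a power of 2, the two numbers agree and the data pass through unchanged, which is the situation for the example file used in this notebook.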

#### Documentation

[Full description](https://jwst-pipeline.readthedocs.io/en/stable/jwst/group_scale/description.html) of the step.

#### Arguments

There are no optional arguments for this step

#### Reference files used

This step does not use any reference files.

#### Run the step

##### run() method

In [None]:
# Using the run() method
group_scale = GroupScaleStep()
group_scale.output_dir = output_dir
group_scale.save_results = True
group_scale.run(uncal_file)

In [None]:
# When the output is saved, the group_scale step will
# attach a suffix of 'groupscalestep' to the input filename.
group_scale_output_file = os.path.join(output_dir, 
                                       uncal_file.replace('uncal.fits', 'groupscalestep.fits'))

##### Parameter reference file

##### call() method

Show default asdf configuration file here...

In [None]:
# Using the call() method - need to go into cfg files here...
GroupScaleStep.call(uncal_file, output_dir=output_dir,
                    output_file=group_scale_output_file, save_results=True)


In [None]:
# Calling from the command line
# strun group_scale.asdf jw00017001001_01101_00001_nrs1_uncal.fits

#### Check the results

Since our example file does not average frames into groups, the group_scale step will not change the data at all.

In [None]:
uncal_data = fits.getdata(uncal_file)
group_scale_data = fits.getdata(group_scale_output_file)

Load the input uncal data and resulting output from the step (`output/data_groupscalestep.fits`) and compute the difference between the values from both. The expectation is that there will be no difference between the two since the input data do not create groups by averaging frames.

In [None]:
difference = uncal_data - group_scale_data

In [None]:
idx = np.where(difference.flatten() == 0.)[0]
print('{} unchanged signal values out of {} measurements.'.format(len(idx), difference.size))


Move this cell and the cell below to a step where the header is changed, such as assign_wcs or flux_cal

**Indeed, the step does not do anything. Both outputs are exactly equal, as expected**. What about the headers? To check this out, let's use `fits.HeaderDiff`:

In [None]:
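# Compare the SCI extension headers of the input and output files
results = fits.HeaderDiff(fits.getheader(uncal_file, 'SCI'),
                         fits.getheader(group_scale_output_file, 'SCI'))
print(results.report())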

## <a id='dq_init'> The `dq_init` step </a>

#### Summary

The next step in Stage 1 is the `dq_init` step. This step populates the Data Quality (DQ) mask that is associated with the data file. The DQ flags from the `MASK` bad pixel reference file are copied into the `PIXELDQ` extension of the input file. A table showing the [mapping of bit values](https://jwst-pipeline.readthedocs.io/en/stable/jwst/references_general/references_general.html#data-quality-flags) in the `MASK` file describes what types of bad pixels can be flagged. Any other bad pixel types will be ignored.
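The DQ arrays store flags as bit values, so several flags can be set on the same pixel. The toy example below illustrates the idea with plain `numpy`: flags from a mock `MASK` array are merged into a mock `PIXELDQ` array with a bitwise OR, and an individual flag is then tested with a bitwise AND. The flag values are assumed to match the table linked above (`DO_NOT_USE` = 2^0, `DEAD` = 2^10, `HOT` = 2^11), so check that table before relying on them. The arrays are invented for illustration.

```python
import numpy as np

# Flag values (powers of two), assumed to match the pipeline's DQ flag table
DO_NOT_USE = 2**0
DEAD = 2**10
HOT = 2**11

# Toy 2x2 PIXELDQ array (all pixels good) and toy MASK reference DQ array
pixeldq = np.zeros((2, 2), dtype=np.uint32)
mask_dq = np.array([[DO_NOT_USE | DEAD, 0],
                    [HOT, 0]], dtype=np.uint32)

# Combine the flags with a bitwise OR so that existing flags are preserved
pixeldq = np.bitwise_or(pixeldq, mask_dq)

# Test for an individual flag with a bitwise AND
is_dead = (pixeldq & DEAD) > 0

print(pixeldq)
print(is_dead)
```

This is also why comparing DQ arrays by simple subtraction only tells you that something changed; to find out which flag was set, a bitwise AND against the flag value is needed.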

#### Documentation

[Full description](https://jwst-pipeline.readthedocs.io/en/stable/jwst/dq_init/description.html) of the step.

#### Arguments

There are no optional arguments for this step

#### Reference files used

This step uses the `MASK` reference file. 


#### Run the step

##### run() method

In [None]:
# Using the run() method
dq_init = DQInitStep()
dq_init.output_dir = 'output'
dq_init.save_results = True

# Note that you can call the run() method using EITHER:

# the datamodel returned by the group_scale step (the return
# value of run(), not the step instance itself)
dq_init.run(group_scale.run(uncal_file))

# OR the output file from the group_scale step
# BE CAREFUL WITH THIS. PREVIOUS PIPELINE STEPS CAN ATTACH SUFFIXES TO THE PROVIDED
# OUTPUT FILE NAME, IF THE PROVIDED NAME DOESN'T END WITH THE RECOGNIZED SUFFIX.
# THEREFORE, THE ACTUAL SAVED FILENAME MAY DIFFER FROM WHAT YOU REQUESTED.
dq_init.run(group_scale_output_file)


##### call() method

In [None]:
# Using the call() method - need to go into cfg files here...

# also show call using the datamodel or the filename

DQInitStep.call(group_scale_output_file, output_dir=output_dir, save_results=True)

##### command line

In [None]:
# Call from the command line
# strun XXXXXX

The step finished without crashing, but there are some errors and warnings worth noting:
1. There is a CRDS ERROR "Error determining best reference for 'pars-dqinitstep'  =   Unknown reference type 'pars-dqinitstep'"
2. `CDP_WARM` and `CDP_NOISY` do not correspond to existing `DQ` mnemonics, so they are ignored (this is normal, see below).
3. There is also a WARNING with the `T_SECONDARY` Keyword. It appears it is greater than 8 characters, or contains characters not allowed by the FITS standard. Also normal (at least we see it with other steps as well).

Let's take a look at the `hdul`'s of both the uncal data and the `dqinit` products:

The pixel values in the `SCI` extension are not changed in this step. Instead, the DQ flags are copied into the `PIXELDQ` extension. The `GROUPDQ` values are not changed in this step. Let's check the `PIXELDQ` values and see what has changed.

In [None]:
dq_init_output_file = os.path.join(output_dir, uncal_file.replace('uncal.fits', 'dqinitstep.fits'))

In [None]:
group_scale_pixeldq = fits.getdata(group_scale_output_file, 'PIXELDQ')
dq_init_pixeldq = fits.getdata(dq_init_output_file, 'PIXELDQ')

In [None]:
difference_pixelDQ = dq_init_pixeldq - group_scale_pixeldq

In [None]:
idx_pixelDQ = np.where(difference_pixelDQ.flatten() == 0.)[0]
print('Total pixels in PIXELDQ: {}'.format(difference_pixelDQ.size))
print('{} pixels did not change.'.format(len(idx_pixelDQ)))
print('{} pixels did change.'.format(difference_pixelDQ.size - len(idx_pixelDQ)))

Let's check the unique DQ values in the `PIXELDQ` array. Note that unrecognized bad pixel types (such as XXXXXX) were not copied over to the `PIXELDQ` array.

In [None]:
unique_dq_vals = np.unique(dq_init_pixeldq)
print(unique_dq_vals)

Show here that there are no XXXX flags, where XXXX is for an unrecognized bad pixel type in the mask reffile

Note the pipeline product is much less populated. This is because the `CDP_WARM` and `CDP_NOISY` flags are not propagated, as these are not recognized by the pipeline explicitly. These are just "extra" flags from the reference files. 

## <a id='saturation'> The `saturation` step </a>

#### Summary

This step checks the signal values in all pixels across all groups, and adds a [`saturated` flag](https://jwst-pipeline.readthedocs.io/en/stable/jwst/references_general/references_general.html#data-quality-flags) to the `GROUPDQ` extension for pixels and groups where the signal is above the saturation limit.
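The flagging logic can be sketched in a few lines of `numpy`. This is a simplified illustration, not the pipeline's implementation. It assumes that a group is flagged when its signal reaches the per-pixel threshold, and that once a pixel saturates all later groups in that integration are flagged too. The flag value 2 for `SATURATED` is taken from the DQ flag table, and the ramp and threshold values are invented.

```python
import numpy as np

SATURATED = 2  # assumed value of the SATURATED flag in the DQ flag table


def flag_saturation(ramp, threshold):
    """Flag groups at or above the threshold, and all later groups.

    ramp has shape (ngroups, ny, nx); threshold has shape (ny, nx).
    """
    above = ramp >= threshold
    # Once a pixel has saturated, keep it flagged for the rest of the ramp
    saturated = np.logical_or.accumulate(above, axis=0)
    return np.where(saturated, SATURATED, 0).astype(np.uint8)


# One pixel, four groups; the signal crosses the threshold in the third group
ramp = np.array([1000., 20000., 61000., 59000.]).reshape(4, 1, 1)
threshold = np.array([[60000.]])

groupdq = flag_saturation(ramp, threshold)
print(groupdq[:, 0, 0])
```

Note that the last group stays flagged even though its value dropped back below the threshold, because the accumulation carries the flag forward.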

#### Documentation

[Full description](https://jwst-pipeline.readthedocs.io/en/stable/jwst/saturation/description.html) of the step.

#### Arguments

There are no optional arguments for this step

#### Reference files used

This step uses the [`SATURATION`](https://jwst-pipeline.readthedocs.io/en/stable/jwst/saturation/reference_files.html) reference file. This file contains a map of the saturation threshold in ADU for each pixel on the detector.

#### Run the step

##### run() method

In [None]:
# Using the run() method
saturation = SaturationStep()
saturation.output_dir = 'output'
saturation.save_results = True

# Note that you can call the run() method using EITHER:

# the datamodel returned by the dq_init step (the return
# value of run(), not the step instance itself)
saturation.run(dq_init.run(group_scale_output_file))

# OR the output file
saturation.run(dq_init_output_file)

##### call() method

In [None]:
SaturationStep.call(dq_init_output_file, output_dir=output_dir,
                    save_results=True)

##### command line

In [None]:
# strun XXXX

The step finished without crashing, but as with the `dq_init` step, there are some errors and warnings worth noting:
1. There is a CRDS ERROR "Error determining best reference for 'pars-saturationstep'  =   Unknown reference type 'pars-saturationstep'"
2. The same warning about `T_SECONDARY` Keyword shows up. 

If there are any saturated values, they should appear in the `GROUPDQ` arrays. Let's examine the `GROUPDQ` data and see if there are any detected:

In [None]:
saturation_output_file = os.path.join(output_dir,
                                      uncal_file.replace('uncal.fits', 'saturationstep.fits'))

In [None]:
saturation_groupdq = fits.getdata(saturation_output_file, 'GROUPDQ')

In [None]:
# GROUPDQ flags are defined in dqflags.group; the parentheses make the
# bitwise test explicit
saturated = np.where((saturation_groupdq & dqflags.group['SATURATED']) > 0)

In [None]:
num_sat_flags = len(saturated[0])
print(('Found {} saturated flags. This may include multiple saturated '
       'groups in a given pixel'.format(num_sat_flags)))

Let's look at the DQ flags for one of these pixels.

In [None]:
sat_int, sat_group, sat_y, sat_x = saturated
sat_index = int(len(sat_y) / 2)
print(saturation_groupdq[sat_int[sat_index], :, sat_y[sat_index], sat_x[sat_index]])

This pixel saturated in group XX, and is flagged as saturated from that group to the end of the integration. This means that in the linearity correction step later, groups XX - XX will be ignored and not corrected. They will also be ignored in the jump step, and the ramp-fitting step at the end of calwebb_detector1.

In [None]:
sat_reffile = fits.getheader(saturation_output_file)['R_SATURA']

Need to find CRDS_CACHE directory so that we can load the saturation reference file and look at the threshold for this pixel

<a id='superbias'></a>
## The `superbias` step

#### Summary

This step subtracts the superbias reference frame from each group of the science exposure.
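Since the superbias is a single 2D frame and the science data are 4D (integrations, groups, rows, columns), the subtraction amounts to a broadcast over the first two axes. The sketch below shows this on invented data. The array sizes and signal levels are arbitrary, and the real step also handles subarrays and data quality flags, which are ignored here.

```python
import numpy as np

rng = np.random.default_rng(0)
nints, ngroups, ny, nx = 2, 3, 4, 4

# Mock 2D superbias frame and a mock ramp with 100 counts added per group
superbias = rng.normal(10000., 50., size=(ny, nx))
signal = 100. * np.arange(ngroups)[np.newaxis, :, np.newaxis, np.newaxis]
ramp = superbias[np.newaxis, np.newaxis] + signal + np.zeros((nints, 1, 1, 1))

# numpy broadcasts the (ny, nx) frame across all integrations and groups
corrected = ramp - superbias

print(corrected.shape)
print(corrected[0, :, 0, 0])
```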

#### Documentation

[Full description](https://jwst-pipeline.readthedocs.io/en/stable/jwst/superbias/description.html) of the step.

#### Arguments

There are no optional arguments for this step

#### Reference files used

This step uses the [`SUPERBIAS`](https://jwst-pipeline.readthedocs.io/en/stable/jwst/superbias/reference_files.html) reference file. This file contains a map of the superbias signal in ADU for each pixel on the detector.


#### Run the step

##### run() method

##### call() method

In [None]:
SuperBiasStep.call(saturation_output_file, output_dir=output_dir,
                   save_results=True)

##### Command line

Let's compare the superbias-subtracted data with the raw `uncal` data for the last group of the first integration.

In [None]:
superbias_output_file = os.path.join(output_dir, uncal_file.replace('uncal.fits', 'superbiasstep.fits'))

In [None]:
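superbias_data = fits.getdata(superbias_output_file)

# Compare the last group of the first integration
side_by_side(uncal_data[0, -1, :, :], superbias_data[0, -1, :, :])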

## <a id='refpix'> The `refpix` step </a>

This step uses the reference pixels to remove extra counts added by the readout electronics. The reference pixels form a 4-pixel-wide strip around the edge of the detector that does not detect light, so they can be used to measure these effects. Let's apply this step to the superbias products obtained before (`refpix` assumes the superbias step has been carried out):

In [None]:
calwebb_detector1.refpix_step.RefPixStep.call('output/data_k2-141_superbiasstep_corrected.fits', output_dir='output',save_results=True)

Let's explore the products:

In [None]:
hdul_refpix = fits.open('output/data_k2-141_superbiasstep_corrected_refpixstep.fits')
print(hdul_refpix.info())

All right, there are some changes. It is important to note here that the reference pixels of our simulations are zeroes, so in theory this step shouldn't do anything. Let's see what changed in the last group of the first integration, for example:

In [None]:
plt.figure(figsize=(20,10))
plt.title('Superbias corrected image (pipeline product):')
im = plt.imshow(hdul_superbias['SCI'].data[0,2,:,:] - hdul_refpix['SCI'].data[0,2,:,:])
im.set_clim(-100,100)

## <a id='linearity'> The `linearity` step </a>

This step applies the [classical linearity correction](https://jwst-pipeline.readthedocs.io/en/latest/jwst/linearity/description.html) to the data in a pixel-by-pixel, integration-by-integration, group-by-group manner. Let's apply it to our saturation-flagged (real) products:

In [None]:
calwebb_detector1.linearity_step.LinearityStep.call('output/data_k2-141_superbiasstep_corrected_refpixstep.fits', output_dir='output',save_results=True)

Once again, quick exploration of the products:

In [None]:
hdul_linearity = fits.open('output/data_k2-141_superbiasstep_corrected_linearitystep.fits')
print(hdul_linearity.info())

All looks good! To see how well this is being done, let's once again use the third group of the first integration as a reference. Let's use the NIRISS linearity reference file used by the pipeline to run the corrections ourselves:

In [None]:
# Extract coefficients for our subarray:
linearity_ref = fits.open('/grp/crds/cache/references/jwst/jwst_niriss_linearity_0011.fits')
coeffs = linearity_ref['COEFFS'].data[:,-256:,:]

# Extract third group of first integration before linearity correction:
third_group = hdul_refpix['SCI'].data[0,2,:,:]
third_group_corrected = np.zeros(third_group.shape)

# Correct for linearity pixel-to-pixel:
for i in range(third_group.shape[0]):
    for j in range(third_group.shape[1]):
        third_group_corrected[i,j] = np.polyval(coeffs[::-1,i,j],third_group[i,j])

And let's compare:

In [None]:
plt.figure(figsize=(20,10))
plt.title('Linearity correction (our) - Linearity correction (pipeline product):')
difference_linearity = third_group_corrected - hdul_linearity['SCI'].data[0,2,:,:]
im = plt.imshow(difference_linearity)
im.set_clim(0,100)

In [None]:
nonzero_values = np.where(difference_linearity.flatten() != 0)[0]
print(len(nonzero_values),'pixels not identical. All the rest give the same result.')
print('(Flattened) Positions:',nonzero_values)

These 19 non-identical pixels were expected (in fact, we expected 20!). Remember that, above, 20 saturated pixels were identified in each group. As expected, all but one of them were left uncorrected by the pipeline (but were corrected by our code). The missing pixel is at (flattened) pixel position 162970 (see the [saturation](#saturation) section). That pixel is a bit of a special fella. Let's see what its original value was, and what its corrected value looks like:

In [None]:
index = 162970
print('Original value:',third_group.flatten()[index])
print('Our corrected value:',third_group_corrected.flatten()[index])
print('Pipeline corrected value:',hdul_linearity['SCI'].data[0,2,:,:].flatten()[index])
print('Coefficients:',coeffs.reshape(6,len(third_group.flatten()))[:,index])

That pixel not only has a negative value, but its linearity correction coefficients are also essentially all zero (or very small, judging from the rest of the coefficients). The only non-zero coefficient is the second one, which equals one; this is the linear coefficient in the polynomial expansion of the linearity correction ($c_0 + c_1F + c_2F^2 + \dots$, so $c_1$ in this case). This gives the pixel a corrected value equal to its input value, which is why our search above counted it as "corrected by the pipeline". In reality, the pixel was not touched by the pipeline (as expected, because it is marked as saturated), but because our correction returned the same input value, we didn't detect it above.

Given these results make total sense, <font color='green'>**we consider the step validated from the NIRISS/SOSS point of view.**</font>
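As a side note, the per-pixel polynomial evaluation done with a double loop and `np.polyval` earlier in this section can also be written in vectorized form using Horner's scheme. The sketch below uses invented coefficients and signal values, with the coefficients ordered $c_0, c_1, c_2, \dots$ along the first axis as in the loop above. The function name `apply_linearity` is ours, not the pipeline's.

```python
import numpy as np


def apply_linearity(coeffs, signal):
    """Evaluate c0 + c1*F + c2*F**2 + ... for every pixel at once.

    coeffs has shape (ncoeffs, ny, nx), ordered c0, c1, c2, ...
    signal has shape (ny, nx).
    """
    corrected = np.zeros_like(signal, dtype=float)
    # Horner's scheme: start from the highest-order coefficient
    for c in coeffs[::-1]:
        corrected = corrected * signal + c
    return corrected


rng = np.random.default_rng(1)
signal = rng.uniform(0., 1000., size=(3, 4))

# Invented coefficients: c0 = 0, c1 = 1, c2 = 1e-6 for every pixel
coeffs = np.zeros((3, 3, 4))
coeffs[1] = 1.0
coeffs[2] = 1e-6

corrected = apply_linearity(coeffs, signal)

# Cross-check against the pixel-by-pixel np.polyval approach
reference = np.array([[np.polyval(coeffs[::-1, i, j], signal[i, j])
                       for j in range(signal.shape[1])]
                      for i in range(signal.shape[0])])
print(np.allclose(corrected, reference))
```

On a full 256 x 2048 subarray the vectorized version avoids about half a million Python-level calls, which makes this kind of cross-check much quicker to run.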

## <a id='dc'> The `darkcurrent` step </a>

This step simply subtracts the dark current signal. Let's run it:

In [None]:
calwebb_detector1.dark_current_step.DarkCurrentStep.call('output/data_k2-141_superbiasstep_corrected_linearitystep.fits', output_dir='output',save_results=True)

Let's look at the products:

In [None]:
hdul_dark = fits.open('output/data_k2-141_superbiasstep_corrected_darkcurrentstep.fits')
print(hdul_dark.info())

All looks good. Let's see how much changed for the third group on the first integration:

In [None]:
# Load reference file for darks:
darkcurrent = fits.open('/grp/crds/cache/references/jwst/jwst_niriss_dark_0114.fits')
print(darkcurrent['SCI'].data.shape)

# Plot:
plt.figure(figsize=(20,10))
plt.title('Linearity-corrected data (i.e., no dark-correction):')
im = plt.imshow(hdul_linearity['SCI'].data[0,2,:,:])
im.set_clim(-1000,1000)
plt.figure(figsize=(20,10))
plt.title('Dark-frame corrected data:')
im = plt.imshow(hdul_dark['SCI'].data[0,2,:,:])
im.set_clim(-1000,1000)
plt.figure(figsize=(20,10))
plt.title('Difference:')
im = plt.imshow(hdul_dark['SCI'].data[0,2,:,:] - hdul_linearity['SCI'].data[0,2,:,:])
im.set_clim(-10,10)

plt.figure(figsize=(20,10))
plt.title('Difference minus dark frame in third group:')
im = plt.imshow(hdul_dark['SCI'].data[0,2,:,:] - hdul_linearity['SCI'].data[0,2,:,:] + darkcurrent['SCI'].data[2,:,:])
im.set_clim(-10,10)

This is the expected behaviour: we can see evident, small, vertical strips of signal being removed; these come from the 1/f components of the noise, showing up in the columns as expected for NIRISS/SOSS. The last plot shows that the pipeline is doing what is expected: removing, for the third group, the third dark frame (which is also the third group of the reference data itself) of the set of 50 available. Given this shows the expected behaviour, <font color='green'>**we consider the step validated from the NIRISS/SOSS point of view.**</font>
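The group-by-group matching described above can be sketched as follows. This is a simplified illustration on invented arrays. It assumes the dark reference and the science data share the same readout pattern, so that group *n* of the dark is subtracted from group *n* of every integration. The real step also deals with differences in frame averaging between the two, which is ignored here.

```python
import numpy as np


def subtract_dark(science, dark):
    """Subtract the matching dark group from each science group.

    science has shape (nints, ngroups, ny, nx);
    dark has shape (ndark_groups, ny, nx) with ndark_groups >= ngroups.
    """
    ngroups = science.shape[1]
    if dark.shape[0] < ngroups:
        raise ValueError('Dark reference has fewer groups than the science data')
    # Use only the first ngroups dark groups, broadcast over integrations
    return science - dark[np.newaxis, :ngroups]


# Mock dark with 5 groups whose signal grows by 2 counts per group
dark = 2.0 * np.arange(5).reshape(5, 1, 1) * np.ones((5, 2, 2))

# Mock science data: 2 integrations, 3 groups, constant 100 counts plus dark
science = 100.0 + np.repeat(dark[np.newaxis, :3], 2, axis=0)

corrected = subtract_dark(science, dark)
print(corrected[0, :, 0, 0])
```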

## <a id='jump'> The `jump` step </a>

This step detects and flags jumps in the ramps. Let's go ahead and run it on our dark-frame corrected data:

In [None]:
calwebb_detector1.jump_step.JumpStep.call('output/data_k2-141_superbiasstep_corrected_darkcurrentstep.fits', output_dir='output',save_results=True)

Let's check the products:

In [None]:
hdul_jump = fits.open('output/data_k2-141_superbiasstep_corrected_jumpstep.fits')
print(hdul_jump.info())

All looks good. Now, let's check the `GROUPDQ` of the third group of the first integration, to see if there are any differences, which we will attribute to the jump step detecting jumps:

In [None]:
diff_dark_jump = hdul_dark['GROUPDQ'].data[0,2,:,:] - hdul_jump['GROUPDQ'].data[0,2,:,:]
idx_dark_jump = np.where(diff_dark_jump!=0)
print(len(idx_dark_jump[0]),'detected jumps out of {0:} pixels (i.e., {1:.1f} percent of pixels in group 3)!'.format(2048*256,100*len(idx_dark_jump[0])/(2048*256)))

Wow, that's a lot of jumps. Let's plot some of them:

In [None]:
x,y = idx_dark_jump[0],idx_dark_jump[1]
for i in range(len(x)):
    idx = np.where(hdul_jump['GROUPDQ'].data[0,:,x[i],y[i]] == 4)[0]
    plt.errorbar([0,1,2],hdul_jump['SCI'].data[0,:,x[i],y[i]],yerr=hdul_jump['ERR'].data[0,:,x[i],y[i]],fmt='-.')
    plt.plot(idx,hdul_jump['SCI'].data[0,idx,x[i],y[i]],'o')
    if i == 15:
        break

Interesting. Most likely this is due to the errorbars not being calculated correctly to detect the jumps (due to mismatches between the reference files used to generate the data and the ones used by the pipeline, see below for details). We'll revisit this step once this is fixed.
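To build some intuition for why the noise estimate matters so much, here is a deliberately simplified jump detector for a single pixel. It is not the pipeline's algorithm: it compares each group-to-group difference with the median difference and flags outliers, using only a read-noise term for the expected scatter (Poisson noise is ignored). The function name, threshold and numbers are invented, while the flag value 4 matches the `JUMP_DET` value used in the cell above.

```python
import numpy as np

JUMP_DET = 4  # flag value used for detected jumps in GROUPDQ


def flag_jumps(ramp, read_noise, threshold=4.0):
    """Flag groups whose first difference is an outlier (single pixel)."""
    diffs = np.diff(ramp)
    median_diff = np.median(diffs)
    # Noise on the difference of two reads, ignoring Poisson noise
    sigma = np.sqrt(2.0) * read_noise
    ratio = np.abs(diffs - median_diff) / sigma
    groupdq = np.zeros(ramp.size, dtype=np.uint8)
    # A large difference between groups i and i+1 flags group i+1
    groupdq[1:][ratio > threshold] = JUMP_DET
    return groupdq


# Linear ramp of 10 counts per group with a 500-count jump before group 5
ramp = 10.0 * np.arange(8)
ramp[5:] += 500.0

groupdq = flag_jumps(ramp, read_noise=5.0)
print(groupdq)

# With a badly overestimated noise level the same jump goes undetected
print(flag_jumps(ramp, read_noise=500.0))
```

The second call shows the flip side of the problem seen above: if the assumed noise does not match the data, the same ramp yields a very different set of flags.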

## <a id='ramp_fitting'> The `ramp_fitting` step </a>

This step fits a ramp to the data. Let's go ahead and run it on our jump-product data:

In [None]:
calwebb_detector1.ramp_fit_step.RampFitStep.call('output/data_k2-141_superbiasstep_corrected_jumpstep.fits', output_dir='output',save_results=True)

All seems normal. An important detail here is that there are two outputs: output `*_0_rampfitstep.fits` is the classic "rate" output which weights all the ramps for each integration (i.e., it is a 2D product). Output `*_1_rampfitstep.fits` are the "rateints" products, the slopes of each integration separately. Let's load the products:

In [None]:
hdul_rate = fits.open('output/data_k2-141_superbiasstep_corrected_0_rampfitstep.fits')
print(hdul_rate.info())
hdul_rateints = fits.open('output/data_k2-141_superbiasstep_corrected_1_rampfitstep.fits')
print(hdul_rateints.info())

All looks good. Note errors have been added to the ramps (`ERR`), as well as `INT_TIMES` (which should save the timestamps of the observations), and variances. To test this step, let's do our own ramp fitting on the `output/data_k2-141_superbiasstep_corrected_jumpstep.fits` products:

In [None]:
# Plot:
plt.figure(figsize=(20,10))
plt.title('Pipeline ramp-fitting:')
im = plt.imshow(hdul_rateints['SCI'].data[0,:,:])
im.set_clim(-100,100)

Let's plot a column:

In [None]:
plt.figure(figsize=(20,5))
plt.plot(hdul_rateints['SCI'].data[0,:,1000],label='Pipeline product')
plt.legend()
plt.ylabel('Counts/second')
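
For comparison with the pipeline's rate products, a basic version of "our own" ramp fit can be written with `np.polyfit`. This is a minimal sketch under strong assumptions: an unweighted straight-line fit per pixel, no handling of saturated or jump-flagged groups, and an invented group time. The pipeline's own fitting is more elaborate, so small differences from its products would be expected.

```python
import numpy as np


def fit_slopes(ramp, group_time):
    """Unweighted linear fit of each pixel's ramp.

    ramp has shape (ngroups, ny, nx); returns slopes of shape (ny, nx)
    in counts per second.
    """
    ngroups = ramp.shape[0]
    times = np.arange(ngroups) * group_time
    flat = ramp.reshape(ngroups, -1)  # one column per pixel
    slopes, intercepts = np.polyfit(times, flat, 1)
    return slopes.reshape(ramp.shape[1:])


group_time = 10.0                      # seconds per group (illustrative)
true_rates = np.array([[3.0, 5.0]])    # counts/s for a 1x2 image
times = np.arange(6) * group_time

# Noise-free ramps with a 50-count offset
ramp = 50.0 + true_rates[np.newaxis] * times[:, np.newaxis, np.newaxis]

rates = fit_slopes(ramp, group_time)
print(rates)
```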


## <a id='gain_scale'> The `gain_scale` step </a>

We note this step only applies to NIRSpec, so it should be skipped. Let's test this:

In [None]:
calwebb_detector1.gain_scale_step.GainScaleStep.call('output/data_k2-141_superbiasstep_corrected_1_rampfitstep.fits', output_dir='output',save_results=True)

Indeed, it is properly skipped (really because there is no `GAINFACT` in the headers).