# The HST/STIS Pixel Property Database User's Guide

Created by Eddie Woods, Colton Parker, and Doug Branton

## Table of Contents:
* [Introduction](#intro)
* [The HST/STIS CCD Pixel Property Database](#db)
* [Python Query Interface](#python)

## Introduction <a class="anchor" id="intro"></a>

### STScI, HST, and STIS
The Space Telescope Science Institute (STScI) in Baltimore, MD is operated by the
Association of Universities for Research in Astronomy (AURA), and partners with NASA on a
number of astronomical missions. Most notably, STScI handles the scientific operations of the
Hubble Space Telescope (HST), and also handles the science and mission operations for the
upcoming James Webb Space Telescope (JWST).

<img src="plots/stsci-logo-dark.png" width=300 height=300 />

The Space Telescope Imaging Spectrograph (STIS) is a scientific instrument installed on HST. STIS is a versatile instrument with three detectors
that together allow observations from visible to far ultraviolet wavelengths. One of these detectors is the
STIS Charge Coupled Device (CCD). The CCD works much like the sensor in a commercial camera:
the detector plane is composed of a grid of small buckets called "pixels", which capture
incoming light as electrical charge. That charge is then read out as a signal and assembled into an accurate
representation of the observed scene. Below is an image of Jupiter's aurora captured by STIS:

<img src="plots/jupiter_aurora.gif" width=500 height=500 />

### Calibration
Any astronomical detector requires calibration to remove noise, meaning contamination of the image from sources other than the observed scene. The HST/STIS CCD calibration is supported by a robust data pipeline that applies a number of corrections to remove noise signatures from images. One gap in the current infrastructure, however, is the ability to track the health and behavior of individual pixels on the detector over time. Current calibration procedures either apply corrections tuned to the "average" pixel behavior, or produce files from which pixel-specific information is difficult to extract quickly. Pixel-specific information matters when certain pixels are crucial to the quality of a specific observation, and more generally when creating data quality maps that identify individual pixels with problematic behavior.

<img src="plots/ccd.png" width=500 height=500 />

## The HST/STIS CCD Pixel Property Database <a class="anchor" id="db"></a>

The HST/STIS CCD Pixel Property Database stores individual pixels and their properties in a way that gives STIS instrument scientists and analysts easy access to them. The STIS CCD is "annealed" (the detector is thermally flushed) on a roughly monthly cadence, which means that pixel properties change from month to month. This drives the primary database design: a separately derived set of pixel properties is stored for each annealing period. Anneal periods are listed at https://www.stsci.edu/~STIS/monitors/anneals/anneal_periods.html; a small snippet is shown below:

<img src="plots/anneal_periods.png" width=800 height=400 />

The database tracks information about the annealing periods, the pixels, their properties for each annealing period, the calibration files used to generate those properties, the detector, and the instrument itself. The pixel properties are an assortment of metrics derived from the "dark" calibration images taken during that anneal period. Darks are observations taken against a blank field (often by exposing against a closed shutter). These metrics include:

* `Sci_Mean`: the mean value of the pixel across all of the dark images, also known as the "dark rate" once converted to counts/second.
* `Err_Mean`: the same mean, computed from the error array.
* `Stability`: the variance of the pixel dark rate throughout the anneal period.
* `NaN_Count`: the number of instances where the pixel value was NaN.
* `Readnoise`: the noise added when the pixel charge is read out and converted into a voltage.

Below is an EER diagram of the database contents for reference:

<img src="plots/eerd.png" width=400 height=800 />
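As a rough illustration of what these metrics summarize, the sketch below derives per-pixel statistics from a short series of samples, one per dark. The helper name `summarize_pixel`, the sample values, and the exact definitions (mean over non-NaN samples, population variance for `Stability`) are assumptions made for illustration; the database's actual derivation may differ.

```python
import math
import statistics


def summarize_pixel(sci_values, err_values):
    """Summarize one pixel's samples across the darks of an anneal period.

    Hypothetical helper: these definitions are assumptions, not the
    database's confirmed derivation.
    """
    # Keep only the darks where the science value is a real number.
    valid = [(s, e) for s, e in zip(sci_values, err_values) if not math.isnan(s)]
    nan_count = len(sci_values) - len(valid)
    if not valid:
        return {"Sci_Mean": math.nan, "Err_Mean": math.nan,
                "Stability": math.nan, "NaN_Count": nan_count}
    sci = [s for s, _ in valid]
    err = [e for _, e in valid]
    return {
        "Sci_Mean": statistics.fmean(sci),
        "Err_Mean": statistics.fmean(err),
        "Stability": statistics.pvariance(sci),  # assumed: population variance
        "NaN_Count": nan_count,
    }


# Made-up samples for one pixel across four darks (one sample is NaN).
sci_samples = [30.0, 32.0, float("nan"), 34.0]
err_samples = [8.0, 9.0, 8.5, 10.0]
summary = summarize_pixel(sci_samples, err_samples)
print(summary)
```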

## Python Query Interface <a class="anchor" id="python"></a>

### Implementation
The primary front end for the HST/STIS Pixel Property Database is an installable Python package that uses the `mysql.connector` package to handle connections and queries to the database. A Python-based query interface was chosen over a web tool or standalone GUI because it can be integrated into existing Python-based monitoring and pipeline workflows. Pixel properties returned by the package can be used directly in analysis scripts or embedded in reference file generation, for example to generate data quality flags for pixels whose properties exceed defined thresholds.
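The threshold-based flagging idea can be sketched with nothing more than the standard library. The rows below are copied from the region query output shown later in this guide; the threshold values and the helper name `flag_pixels` are hypothetical and only illustrate the pattern.

```python
import csv
import io

# Rows copied from the region query output shown later in this guide.
REGION_CSV = """AnnealNumber,RowNum,ColumnNum,Stability,Sci_Mean,Err_Mean,NaN_Count,Readnoise
200,1,2,2.2004,32.1551,8.9409,1,6.9447
200,1,3,1.1931,14.4951,7.647,1,6.6358
200,1,5,1.9485,48.3635,9.8661,1,7.031
200,2,5,0.7582,53.4981,10.1331,1,7.0319
"""

# Hypothetical thresholds; real limits would be chosen by the instrument team.
SCI_MEAN_LIMIT = 40.0
READNOISE_LIMIT = 7.0


def flag_pixels(rows, sci_mean_limit, readnoise_limit):
    """Return (row, column, exceeded_properties) for each pixel over a limit."""
    flagged = []
    for row in rows:
        exceeded = []
        if float(row["Sci_Mean"]) > sci_mean_limit:
            exceeded.append("Sci_Mean")
        if float(row["Readnoise"]) > readnoise_limit:
            exceeded.append("Readnoise")
        if exceeded:
            flagged.append((int(row["RowNum"]), int(row["ColumnNum"]), exceeded))
    return flagged


rows = list(csv.DictReader(io.StringIO(REGION_CSV)))
flags = flag_pixels(rows, SCI_MEAN_LIMIT, READNOISE_LIMIT)
for pixel_row, pixel_col, exceeded in flags:
    print(pixel_row, pixel_col, ", ".join(exceeded))
```

With a DataFrame returned by the package, the same check could be written as a boolean mask, for example `result[result["Sci_Mean"] > SCI_MEAN_LIMIT]`.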

### Setup and Database Connection
To install, run `python setup.py install` from the top-level directory of the repository. This allows the import statement below to work. Alternatively, run `python setup.py develop` so that code changes are picked up without reinstalling (a re-import is still required for any change to take effect).

In [1]:
#Import needed packages
from stispixeldb import pixeldb,utils
import pandas as pd
import datetime

The first step is to create a connection object using the package:

In [2]:
# Create a connection object
pixdb = pixeldb.PixelDB(host="localhost", user="root", password="", database="anneals_pixels")

With a connection made, we can now execute transactions against the database. Below is an example of a custom query, which executes an arbitrary SQL statement, with some exceptions to prevent drastic actions. The result of the query is stored in a pandas DataFrame for easy manipulation after retrieval.

In [4]:
#Explore Available Database Tables
result = pixdb.custom_query("SHOW TABLES") #custom querying
result

| | 0 |
|---|---|
| 0 | ANNEAL_PERIOD |
| 1 | DARKS |
| 2 | DETECTOR |
| 3 | HAS_PROPERTIES_IN |
| 4 | INSTRUMENT |
| 5 | PIXEL |
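This guide does not list which statements `custom_query` rejects. Purely as an illustration, a guard of this kind could compare the leading keyword of a statement against a blocklist. The function name `is_allowed_statement` and the blocked keywords below are assumptions, not the package's actual rules.

```python
# Hypothetical blocklist; the package's real exceptions are not documented here.
BLOCKED_KEYWORDS = {"DROP", "TRUNCATE", "DELETE", "ALTER"}


def is_allowed_statement(sql):
    """Return True if the statement does not start with a blocked keyword."""
    stripped = sql.strip()
    if not stripped:
        return False
    # Compare only the first word, ignoring case and a trailing semicolon.
    first_word = stripped.split(None, 1)[0].rstrip(";").upper()
    return first_word not in BLOCKED_KEYWORDS


print(is_allowed_statement("SHOW TABLES"))
print(is_allowed_statement("drop table PIXEL"))
```

A leading-keyword check is deliberately simple and would not catch, for example, several statements chained in one string, so a production guard would need to be stricter.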


### Inserting Data

The package interface supports data insertion through two primary functions, in addition to the custom query. The first function inserts the initial mapping of pixels into the database; **this only needs to be done once for the database**. It looks for a file called `pixel_map.csv`, which must first be generated using the `create_pix_csv.py` script in the top level of the repository.

In [4]:
pixdb.load_pixel_mapping() # Load the initial pixel mapping of row and column indices into the PIXEL table

The next function is the main driver for updating the database with everything from a new annealing period: the anneal period itself, the darks used for the anneal period, and the pixel properties. It looks in the path `csv_loc` for a file named `anneal_{ANNEAL_NUMBER}.csv`. These files are generated and available on central store at: {LOCATION TO BE DETERMINED}

In [5]:
#Insert an anneal into the database, populates the HAS_PROPERTIES_IN, ANNEAL_PERIOD, and DARKS tables.
pixdb.insert_anneal(200,csv_loc='example_input\\')

[]
Starting First Half Insertion
Starting Second Half Insertion
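Because the insert depends on a file that follows the `anneal_{ANNEAL_NUMBER}.csv` naming pattern, it can help to check for that file before starting. The sketch below builds the expected path with `pathlib`; the helper name `anneal_csv_path` is hypothetical and is not part of the package.

```python
from pathlib import Path


def anneal_csv_path(anneal_num, csv_loc):
    """Build the expected path of the input file for one anneal period."""
    return Path(csv_loc) / f"anneal_{anneal_num}.csv"


path = anneal_csv_path(200, "example_input")
print(path.name)
print(path.exists())  # True only if the file is present at that location
```

Using `pathlib` also avoids hard-coding a platform-specific separator such as the trailing `\\` in the example call above.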


### Querying the Database

The package interface supports a number of predefined query functions that query the database given a set of input parameters. For example, `pixdb.query_pixel` accepts a pixel row and column location plus a time input: either an anneal number, if it is known, or a datetime string, in which case the function locates the correct anneal number to query.

In [11]:
#Querying for Pixel Properties

#Based on anneal number
result = pixdb.query_pixel(pixel_row=5,pixel_col=20, anneal_num=200)
result

| | AnnealNumber | RowNum | ColumnNum | Stability | Sci_Mean | Err_Mean | NaN_Count | Readnoise |
|---|---|---|---|---|---|---|---|---|
| 0 | 200 | 5 | 20 | 27.7727 | 31.7564 | 8.7142 | 2 | 6.7804 |


In [12]:
#Based on input date string
result = pixdb.query_pixel(pixel_row=50,pixel_col=50, date='2017-09-26 13:04:04')
result

| | AnnealNumber | RowNum | ColumnNum | Stability | Sci_Mean | Err_Mean | NaN_Count | Readnoise |
|---|---|---|---|---|---|---|---|---|
| 0 | 200 | 50 | 50 | 1.9933 | 482.4206 | 24.3288 | 1 | 10.3887 |
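How a datetime string might be resolved to an anneal number can be sketched as a lookup over period start and end dates. The dates for anneal 200 are taken from the anneal query output later in this guide; the entry for anneal 201, the helper name `find_anneal_number`, and the choice to treat both boundary dates as inclusive are assumptions. The package presumably performs this lookup in SQL rather than in Python.

```python
from datetime import datetime

# Anneal 200 dates come from the anneal query output in this guide;
# the anneal 201 entry is a made-up placeholder for illustration.
ANNEAL_PERIODS = [
    (200, "2017-09-26", "2017-10-25"),
    (201, "2017-10-26", "2017-11-20"),  # hypothetical dates
]


def find_anneal_number(date_string, periods):
    """Return the anneal number whose period contains the datetime, or None."""
    day = datetime.strptime(date_string, "%Y-%m-%d %H:%M:%S").date()
    for number, start, end in periods:
        start_day = datetime.strptime(start, "%Y-%m-%d").date()
        end_day = datetime.strptime(end, "%Y-%m-%d").date()
        # Assumed: both the start and end dates belong to the period.
        if start_day <= day <= end_day:
            return number
    return None


print(find_anneal_number("2017-09-26 13:04:04", ANNEAL_PERIODS))
```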


Additionally, we can query for a set of pixels using `pixdb.query_pixel_region`, which returns all pixels within the specified ranges of rows and columns.

In [3]:
#Query Region
#result = pixdb.query_pixel_row_col(1,5,2,7,date='2017-09-26 13:04:04')
result = pixdb.query_pixel_region(pixel_row_range=[1,5],pixel_col_range=[2,7],date='2017-09-26 13:04:04')
result

| | AnnealNumber | RowNum | ColumnNum | Stability | Sci_Mean | Err_Mean | NaN_Count | Readnoise |
|---|---|---|---|---|---|---|---|---|
| 0 | 200 | 1 | 2 | 2.2004 | 32.1551 | 8.9409 | 1 | 6.9447 |
| 1 | 200 | 1 | 3 | 1.1931 | 14.4951 | 7.647 | 1 | 6.6358 |
| 2 | 200 | 1 | 4 | 1.7486 | 15.4256 | 7.7171 | 1 | 6.6699 |
| 3 | 200 | 1 | 5 | 1.9485 | 48.3635 | 9.8661 | 1 | 7.031 |
| 4 | 200 | 1 | 6 | 2.4125 | 21.5264 | 8.2304 | 2 | 6.8293 |
| 5 | 200 | 1 | 7 | 0.6065 | 42.1326 | 9.4633 | 1 | 6.9035 |
| 6 | 200 | 2 | 2 | 3.4083 | 22.7597 | 8.2728 | 1 | 6.7985 |
| 7 | 200 | 2 | 3 | 4.791 | 15.2545 | 7.6959 | 1 | 6.6646 |
| 8 | 200 | 2 | 4 | 0.3008 | 17.3033 | 7.8661 | 1 | 6.6948 |
| 9 | 200 | 2 | 5 | 0.7582 | 53.4981 | 10.1331 | 1 | 7.0319 |
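A region query of this kind maps naturally onto a parameterized SQL statement with inclusive `BETWEEN` ranges. The sketch below uses an in-memory `sqlite3` database as a stand-in for MySQL so that it runs anywhere. The table name comes from the `SHOW TABLES` output earlier, but the column layout, the helper name `query_region`, and the SQL itself are assumptions and not the package's actual implementation.

```python
import sqlite3

# In-memory stand-in for the MySQL database; the schema is assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE HAS_PROPERTIES_IN ("
    "AnnealNumber INTEGER, RowNum INTEGER, ColumnNum INTEGER, Sci_Mean REAL)"
)
# Values copied from query outputs in this guide.
conn.executemany(
    "INSERT INTO HAS_PROPERTIES_IN VALUES (?, ?, ?, ?)",
    [
        (200, 1, 2, 32.1551),
        (200, 1, 7, 42.1326),
        (200, 2, 3, 15.2545),
        (200, 5, 20, 31.7564),  # row is in range, column is not
    ],
)


def query_region(connection, anneal_num, row_range, col_range):
    """Return (row, column, Sci_Mean) for pixels inside inclusive ranges."""
    cursor = connection.execute(
        "SELECT RowNum, ColumnNum, Sci_Mean FROM HAS_PROPERTIES_IN "
        "WHERE AnnealNumber = ? AND RowNum BETWEEN ? AND ? "
        "AND ColumnNum BETWEEN ? AND ? ORDER BY RowNum, ColumnNum",
        (anneal_num, row_range[0], row_range[1], col_range[0], col_range[1]),
    )
    return cursor.fetchall()


region_rows = query_region(conn, 200, [1, 5], [2, 7])
print(region_rows)
```

Passing the bounds as parameters rather than formatting them into the SQL string avoids quoting mistakes. Note that MySQL Connector/Python uses `%s` placeholders where `sqlite3` uses `?`.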


We can also query for the properties of an annealing period, again either by giving the anneal number if it's known or by providing a datetime string.

In [13]:
#Query for an anneal period
result = pixdb.query_anneal(date='2017-09-26 13:04:04')
result

| | AnnealNumber | StartDate | EndDate | NumberOfDarks |
|---|---|---|---|---|
| 0 | 200 | 2017-09-26 | 2017-10-25 | 58 |


Given an anneal number or a datetime string, we can also query for the list of darks tied to that anneal period.

In [14]:
pixdb.query_anneal_darks(anneal_num=200)

| | Darks | AnnealNumber |
|---|---|---|
| 0 | odbridnaq | 200 |
| 1 | odbriengq | 200 |
| 2 | odbrifovq | 200 |
| 3 | odbrigozq | 200 |
| 4 | odbrihrrq | 200 |
| 5 | odbriisaq | 200 |
| 6 | odbrijxcq | 200 |
| 7 | odbrikxgq | 200 |
| 8 | odbrilb7q | 200 |
| 9 | odbrimbbq | 200 |
