# Identifying hot pixels


## Graph distribution of pixel values?

Yeah, and maybe show distribution from a 90-second dark and a 1000-second dark.

## Some pixels are too hot

Recall from the [notebook about dark current](03.02-Real-dark-current-noise-and-other-artifacts.ipynb) that even a cryogenically-cooled camera with low dark current has some pixels with much higher dark current. In the [discussion of "ideal" dark current](03.01-Dark-current-The-ideal-case.ipynb) we noted that the counts in a dark image should be proportional to the exposure time.

For some hot pixels that is not the case, unfortunately, which means that those pixels cannot easily be corrected by subtracting a combined dark. Those pixels can be identified by taking darks with two different (but long) exposure times and comparing the dark current derived from each of the images. The dark current, measured in electron/sec, should be the same in both images if the dark current is really constant. 

Fortunately, the pixels whose response is not proportional to exposure time are usually also pixels with high dark current. It is straightforward to identify those pixels and create a mask to exclude them when processing images. If this weren't the case it might be necessary to take some dark frames with much longer exposure time than otherwise needed.
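
As a rough illustration of the masking idea, the sketch below builds a hot-pixel mask from two small synthetic dark current arrays using NumPy. The array values, the 1 $e^-$/sec threshold, and the 25% tolerance are all assumptions chosen for illustration; they are not values recommended by this notebook.

```python
import numpy as np

# Synthetic dark current maps in electron/sec (made-up values, not real data)
current_short = np.array([[0.01, 0.02], [5.0, 40.0]])
current_long = np.array([[0.01, 0.02], [5.1, 12.0]])

# Assumed threshold above which a pixel is treated as hot
hot_threshold = 1.0  # electron/sec
# Assumed fractional disagreement allowed between the two measurements
tolerance = 0.25

# Pixels with high dark current in the long exposure
hot = current_long > hot_threshold

# Pixels whose dark current differs between the two exposures, i.e. whose
# counts are not proportional to exposure time
inconsistent = np.abs(current_short - current_long) > tolerance * current_long

# True means "exclude this pixel"
mask = hot | inconsistent
print(mask)
```

In this toy data the one pixel with inconsistent dark current is also above the hot threshold, so a mask based only on `hot` would already exclude it.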

In [None]:
from matplotlib import pyplot as plt
import numpy as np

from astropy.visualization import hist
from astropy import units as u
from astropy.nddata import CCDData

import ccdproc as ccdp

## Example 

There are two combined dark images available for the thermoelectrically-cooled Andor Aspen CG16M discussed as "Example 2" in previous notebooks. One is an average of ten 90 second exposures taken during observations of the transiting exoplanet Kelt 16b. The other is an average of twenty 1,000 second exposures taken during commissioning of the camera. Typically one will not have even a single dark with an exposure time that long, let alone several of them.

We begin by reading each combined dark and calculating the dark current from the counts in the image using

$$
\text{dark current} = \text{gain} \times \text{dark counts}~/ \text{ exposure time }.
$$

The gain for this camera is 1.5 $e^-$/adu. The 1,000 second exposure also needs to be trimmed to remove the overscan region.
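
As a quick check of the formula, here is a minimal sketch in plain Python. The helper name `dark_current` and the 60 adu pixel value are hypothetical, chosen only to make the arithmetic easy to follow; the gain of 1.5 $e^-$/adu is the value quoted above.

```python
def dark_current(counts_adu, exposure_s, gain=1.5):
    """Return dark current in electron/sec from dark counts in adu.

    gain is in electron/adu; 1.5 is the value quoted for this camera.
    """
    return gain * counts_adu / exposure_s


# A hypothetical pixel with 60 adu of dark counts
print(dark_current(60, 90))    # if accumulated in a 90 sec dark
print(dark_current(60, 1000))  # if accumulated in a 1,000 sec dark
```

The same number of counts corresponds to a much smaller dark current when it accumulates over the longer exposure.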

In [None]:
dark_90 = CCDData.read('example2-reduced/combined_dark_90.000.fit')
dark_1000 = CCDData.read('master_dark_exposure_1000.0.fit.bz2')
dark_1000 = ccdp.trim_image(dark_1000[:, :4096])

dark_90 = dark_90.multiply(1.5 * u.electron / u.adu).divide(90 * u.second)
dark_1000 = dark_1000.multiply(1.5 * u.electron / u.adu).divide(1000 * u.second)

The histogram below shows the distribution of dark current values in each image. There are some differences we should *expect* to see between the two images.

Small values of dark current are much more accurately measured in the long exposure. The exposure time in that image was chosen to be as short as possible while still measuring the nominal dark current of 0.01 $e^-$/sec from the manufacturer given that the camera's read noise is 10 $e^-$.

For the average of ten 90 second exposures, that read noise will be reduced to 10 $e^-/\sqrt{10} \approx$ 3.2 $e^-$. After dividing by exposure time this is equivalent to a "dark current" of 0.035 $e^-$/sec. Roughly twice that is the smallest dark current that can be accurately measured in the 90 second dark.
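
The arithmetic above can be sketched in a few lines of Python. The helper name `read_noise_floor` is hypothetical, and applying the same estimate to the average of twenty 1,000 second exposures is an extension of the reasoning in the text, not a figure quoted by it.

```python
import math


def read_noise_floor(read_noise, n_frames, exposure_s):
    """Read noise of an n_frames average, expressed as an equivalent
    dark current in electron/sec."""
    return read_noise / math.sqrt(n_frames) / exposure_s


# Average of ten 90 sec exposures, read noise of 10 electrons
floor_90 = read_noise_floor(10, 10, 90)
# Average of twenty 1,000 sec exposures, same read noise
floor_1000 = read_noise_floor(10, 20, 1000)

print(f"90 sec dark:   {floor_90:.4f} electron/sec")
print(f"1000 sec dark: {floor_1000:.4f} electron/sec")
```

Under these assumptions the noise floor of the long combined dark is more than ten times lower than that of the short one.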

For large values of dark current the shorter exposure is more accurate. Some of the pixels saturate (i.e. reach the maximum value the chip can read out, roughly 65,000 adu) in under 90 sec, and more of them saturate at some time between 90 seconds and 1,000 seconds. None of those pixels is accurately measured in the 1,000 second exposure.
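
To make the saturation limit concrete, the sketch below estimates the largest dark current each exposure time can register. It assumes saturation at exactly 65,000 adu and ignores any bias level or offset, so the numbers are approximate; the helper name `max_dark_current` is hypothetical.

```python
# Values quoted in the text
saturation_adu = 65_000  # approximate maximum value the chip can read out
gain = 1.5               # electron/adu


def max_dark_current(exposure_s):
    """Largest dark current (electron/sec) measurable before saturation."""
    return gain * saturation_adu / exposure_s


print(f"90 sec exposure:   {max_dark_current(90):.1f} electron/sec")
print(f"1000 sec exposure: {max_dark_current(1000):.1f} electron/sec")
```

Under these assumptions the long exposure cannot register dark currents much above about 100 $e^-$/sec, while the 90 second exposure reaches roughly ten times higher.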

In [None]:
plt.figure(figsize=(20, 10))

hist(dark_90.data.flatten(), bins=5000, density=False, label='90 sec dark', alpha=0.4)
hist(dark_1000.data.flatten(), bins=5000, density=False, label='1000 sec dark', alpha=0.4)
plt.xlabel('dark current, $e^-$/sec')
plt.ylabel('Number of pixels')
plt.loglog()
plt.legend();

Overall, there appear to be more hot pixels in the 90 sec exposure than in the 1,000 sec exposure. For dark current under 0.1 $e^-$/sec that impression is certainly affected by the read noise in the 90 sec exposure.

To get a better idea of how consistent the dark current measurements are, we construct a scatter plot of the measured dark current from each image for those pixels in which the dark current is larger than 1 $e^-$/sec as measured in the longer exposure.

In [None]:
hots = (dark_1000.data > 1)

In [None]:
plt.figure(figsize=(10, 10))
plt.plot(dark_90.data[hots].flatten(), dark_1000.data[hots].flatten(), '.', alpha=0.2, label='Data')
plt.xlabel("dark current ($e^-$/sec), 90 sec exposure time")
plt.ylabel("dark current ($e^-$/sec), 1000 sec exposure time")
plt.plot([0, 100], [0, 100], label='Ideal relationship')
plt.grid()
plt.legend();

The upper limit on dark current that can be measured with the long exposure time can be clearly seen in the plot above; there is a ceiling at roughly 95$e^-$/sec above which the dark current in the long exposure does not go. 

It looks like the dark current as measured in each frame is equal for lower values of the dark current, so we replot to get a better look at that region.

In [None]:
plt.figure(figsize=(10, 10))
plt.plot(dark_90.data[hots].flatten(), dark_1000.data[hots].flatten(), '.', alpha=0.2, label='Data')
plt.xlabel("dark current ($e^-$/sec), 90 sec exposure time")
plt.ylabel("dark current ($e^-$/sec), 1000 sec exposure time")
plt.plot([0, 100], [0, 100], label='Ideal relationship')
plt.grid()
plt.xlim(0.5, 10)
plt.ylim(0.5, 10)
plt.legend();

In [None]:
hots.sum()

In [None]:
super_hots = (dark_1000.data > 4)

In [None]:
super_hots.sum()

In [None]:
n_ran = 10_000
# randint excludes the upper bound, so 4096 is needed to reach index 4095
randos_x = np.random.randint(0, 4096, n_ran)
randos_y = np.random.randint(0, 4096, n_ran)

In [None]:
randos_x.shape

In [None]:
dark_90.data[randos_x, randos_y].shape

In [None]:
plt.figure()
plt.plot(dark_90.data[randos_x, randos_y].flatten(), dark_1000.data[randos_x, randos_y].flatten(), '.', alpha=0.1)
plt.xlabel("dark current, 90 sec exposure time")
plt.ylabel("1000 sec exposure time")
plt.plot([0, 100], [0, 100])
plt.grid()

In [None]:
read_90 = 10 / np.sqrt(10) / 90
read_90

In [None]:
plt.figure()
hist(dark_90.data[randos_x, randos_y].flatten(), bins=5000, density=True)
plt.vlines([read_90, 2 * read_90], 0, 1)
plt.grid();

In [None]:
# The 1,000 sec combined dark is an average of twenty exposures
read_1000 = 10 / np.sqrt(20) / 1000

In [None]:
read_1000

In [None]:
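# Copy the calibrated 90 sec dark; the CCDData constructor takes no dtype argument
d_fu = dark_90.copy()

In [None]:
# CCDData.dtype is a read-only property, so convert the underlying array instead
d_fu.data = d_fu.data.astype('float32')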

In [None]:
d_fu.dtype