# Image combination

Image combination serves a few purposes. Combining images

+ reduces noise in images.
+ can remove transient artifacts like cosmic rays and satellite tracks.
+ can remove stars in twilight flats from the combined image.

It is essential that several of each type of calibration image (bias, dark, flat) be taken. Combining them reduces the noise in the images by roughly a factor of $1/\sqrt{N}$, where $N$ is the number of images being combined. As shown in the previous notebook, using a single calibration image actually *increases* the noise in your image.
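As a rough numerical check of the $1/\sqrt{N}$ scaling, the sketch below averages $N$ simulated images of pure Gaussian noise and compares the measured scatter with the prediction. The noise level, pixel count, random seed, and variable names are illustrative assumptions, not values from a real detector.

```python
import numpy as np

rng = np.random.default_rng(seed=8675309)  # arbitrary seed, for repeatability only
read_noise = 1.0     # assumed noise per pixel in a single image
n_pixels = 50_000    # assumed number of pixels in each simulated image

measured_noise = {}
for n_images in (1, 4, 16, 64):
    # Each row is one simulated image containing nothing but Gaussian noise
    images = rng.normal(loc=0.0, scale=read_noise, size=(n_images, n_pixels))
    combined = images.mean(axis=0)
    measured_noise[n_images] = combined.std()
    expected = read_noise / np.sqrt(n_images)
    print(f"N = {n_images:2d}: measured {measured_noise[n_images]:.3f}, "
          f"expected {expected:.3f}")
```

The measured and expected values should agree to within the sampling error of the simulation.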

There are a few ways to combine images; if done properly, features that show up in only one of the images (like cosmic rays) are not present in the combination. Done incorrectly, those features show up in your combined images and then contaminate your calibrated science images too.

### The bottom line: combine by averaging images, but clip extreme values

The remainder of the notebook motivates that conclusion and explains how to do that combination with [ccdproc](https://ccdproc.readthedocs.io/en/latest/).

In [None]:
import numpy as np

%matplotlib inline
from matplotlib import pyplot as plt
from matplotlib import rc

rc('font', size=20)
rc('axes', grid=True)

from astropy.visualization import hist

## Combination method: average or median?

In this section we'll look at a simplified version of the problem one faces in combining images to reduce noise. It is fair to think of the pixel values in astronomical images (especially bias and dark images) as following a Gaussian distribution centered on the bias level, with a width related to the read noise of the detector. In properly done flat images the noise technically follows a Poisson distribution, but with a large enough number of counts that distribution is indistinguishable from a Gaussian distribution whose width is related to the square root of the number of counts. While some regions of a science image are dominated by Poisson noise from sources in the image, most of the image will be dominated by Gaussian read noise from the detector or Poisson noise from the sky background.
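To illustrate the claim about flat images, the sketch below draws Poisson-distributed counts with a large mean and checks that they behave like a Gaussian whose width is the square root of the mean. The mean of 20,000 counts, the sample size, and the seed are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(seed=2021)  # arbitrary seed
mean_counts = 20_000   # assumed counts per pixel in a well-exposed flat
counts = rng.poisson(lam=mean_counts, size=200_000)

measured_width = counts.std()
expected_width = np.sqrt(mean_counts)

# Fraction of pixels within one standard deviation of the mean;
# for a Gaussian distribution this is about 0.683
within_one_sigma = np.mean(np.abs(counts - mean_counts) <= expected_width)

print(f"measured width: {measured_width:.1f}")
print(f"sqrt(mean):     {expected_width:.1f}")
print(f"fraction within one sigma: {within_one_sigma:.3f}")
```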

Instead of working with a combination of images, we'll create 100 Gaussian distributions, each with a mean of zero and a standard deviation of 1, and combine them in two different ways: by finding the average and by finding the median.

You should think of each of these 100 distributions as representing an image, like a bias or dark. To make the analogy to real images a little more direct, a "bias" of 1000 is added to each distribution.

In [None]:
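n_distributions = 100
# Each row is one simulated "image" of 100,000 pixels: Gaussian noise around a bias of 1000
bits = np.random.randn(n_distributions, 100000) + 1000
# Combine along the image axis, so each pixel is combined across all of the images
average = np.average(bits, axis=0)
median = np.median(bits, axis=0)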

Now that we've created the distributions and combined them in two different ways, let's take a look at them. The [`hist` function from astropy.visualization](https://astropy.readthedocs.io/en/stable/visualization/histogram.html) is used below because it can figure out for you how to bin the data.

In [None]:
fig, ax = plt.subplots(1, 2, sharey=True, tight_layout=True, figsize=(20, 10))

hist(bits[0, :], bins='freedman', ax=ax[0]);
ax[0].set_title('One sample distribution')

hist(average, bins='freedman', label='average', alpha=0.5, ax=ax[1]);
hist(median, bins='freedman', label='median', alpha=0.5, ax=ax[1]);
ax[1].set_title('{} distributions combined'.format(n_distributions))
ax[1].legend()
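The widths of these histograms can be summarized by the standard deviation of each combination. The sketch below regenerates the same kind of simulated data (with an arbitrary seed and fewer pixels, both assumptions made to keep it quick) and compares the scatter of the two combinations. For Gaussian noise, the median is expected to be noisier than the average by a factor that approaches $\sqrt{\pi/2} \approx 1.25$ as the number of images grows.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed
n_images, n_pixels = 100, 20_000      # fewer pixels than above, for speed
images = rng.normal(loc=1000.0, scale=1.0, size=(n_images, n_pixels))

std_single = images[0].std()
std_average = images.mean(axis=0).std()
std_median = np.median(images, axis=0).std()

print(f"single image: {std_single:.3f}")
print(f"average:      {std_average:.3f}")
print(f"median:       {std_median:.3f}")
print(f"median / average: {std_median / std_average:.2f}")
```

Both combinations are much less noisy than a single image, but the average is the less noisy of the two when no outliers are present.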

## Combination method: rejecting outliers

In [None]:
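# Mimic a transient artifact that appears in only one image (like cosmic ray hits)
# by setting the first 50 pixels of the first "image" to a very large value
bits[0, 0:50] = 2000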

In [None]:
# Recombine now that one of the images contains outliers
average = np.average(bits, axis=0)
median = np.median(bits, axis=0)

In [None]:
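# A logarithmic y-axis makes the small number of outlier-affected pixels visible
hist(average, bins='freedman', label='average', alpha=0.5);
hist(median, bins='freedman', label='median', alpha=0.5);
plt.semilogy()
plt.legend();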
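A single extreme value pulls the average of the affected pixels well away from the bias level, while the median barely moves. One way to keep the lower noise of the average and still reject such values is to clip extreme values before averaging. The sketch below is a minimal, NumPy-only illustration of that idea under stated assumptions: a clip threshold of 5, the median and median absolute deviation as robust estimators, a single clipping pass, and hypothetical variable names. It is not ccdproc's implementation.

```python
import numpy as np

rng = np.random.default_rng(seed=1234)  # arbitrary seed
n_images, n_pixels = 100, 20_000
images = rng.normal(loc=1000.0, scale=1.0, size=(n_images, n_pixels))
images[0, 0:50] = 2000  # same artificial outliers as above

# Robust per-pixel center and width, computed along the image axis
center = np.median(images, axis=0)
mad = np.median(np.abs(images - center), axis=0)
robust_std = 1.4826 * mad  # scales the MAD to a Gaussian standard deviation

clip_threshold = 5  # assumed: reject values more than 5 robust sigma from the median
is_outlier = np.abs(images - center) > clip_threshold * robust_std

# Average only the values that survived clipping
clipped = np.ma.masked_array(images, mask=is_outlier)
clipped_average = clipped.mean(axis=0).filled(np.nan)
plain_average = images.mean(axis=0)

print("plain average, affected pixels:  ", plain_average[:50].mean())
print("clipped average, affected pixels:", clipped_average[:50].mean())
print("noise of clipped average:        ", clipped_average[50:].std())
```

The median and the median absolute deviation are used to define the clip limits because, unlike the mean and standard deviation, they are barely affected by the outliers being rejected.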