<center><i>Made possible by the Astropy Project and ScienceBetter Consulting through financial support from the Community Software Initiative at the Space Telescope Science Institute.</i></center>

<a id="title_ID"></a>

<a href="http://photutils.readthedocs.io/en/stable/index.html"><img src="https://photutils.readthedocs.io/en/stable/_static/photutils_banner.svg" width=300></a>

# Background Estimation with `photutils`
---

##### What is background estimation?
Accurate photometry of celestial sources in image data requires estimating and subtracting the image background. Every astronomical image contains a background signal, due to both detector effects and emission from the night sky. This background can be modeled as uniform, or as varying with position on the detector.

The `photutils` package provides tools for estimating 2-dimensional background flux, which can then be subtracted from an image to ensure the most accurate photometry possible.

##### What does this tutorial include?
This tutorial covers the basics of background estimation and subtraction, including the following methods:
- Scalar Background Estimation
- 2-D Background Estimation

##### Which data are used in this tutorial?
We will be manipulating Hubble eXtreme Deep Field (XDF) data, which was collected using the Advanced Camera for Surveys (ACS) on Hubble between 2002 and 2012. The image we use here is the result of 1.8 million seconds (500 hours!) of exposure time, and includes some of the faintest and most distant galaxies that have ever been observed. 

Background subtraction is essential for accurate photometric analysis of astronomical data like the XDF.

*The methods demonstrated here are available in narrative form within the `photutils.background` [documentation](http://photutils.readthedocs.io/en/stable/background.html).*

<div class="alert alert-block alert-info">

<b>Note:</b> This notebook focuses on <i>global background estimation</i>. Local background subtraction with <i>annulus apertures</i> is demonstrated in the <a href="../03_aperture_photometry/03_aperture_photometry.ipynb">aperture photometry notebook</a>.

</div>

<div class="alert alert-block alert-warning">
    
<b>Important:</b> Before proceeding, please be sure to update your versions of <code>astropy</code>, <code>matplotlib</code>, and <code>photutils</code>, or this notebook may not work properly. Or, if you don't want to handle packages individually, you can always use (and keep updated!) the <a href="https://astroconda.readthedocs.io">AstroConda</a> distribution.
 
</div>

---

## Import necessary packages

First, let's import packages that we will use to perform arithmetic functions and visualize data:

In [None]:
import numpy as np

from astropy.io import fits
import astropy.units as u
from astropy.stats import sigma_clipped_stats, SigmaClip
from astropy.visualization import ImageNormalize, LogStretch
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator

# Show plots in the notebook
%matplotlib inline

Let's also define some `matplotlib` parameters, such as title font size and the dpi, to make sure our plots look nice. To make it quick, we'll do this by loading a [style file shared with the other photutils tutorials](../photutils_notebook_style.mplstyle) into `pyplot`. We will use this style file for all the notebook tutorials. (See [here](https://matplotlib.org/users/customizing.html) to learn more about customizing `matplotlib`.)

In [None]:
plt.style.use('../photutils_notebook_style.mplstyle')

## Retrieve data

As described in the introduction, we will be using Hubble eXtreme Deep Field (XDF) data. Since this file is too large to store on GitHub, we will just use `astropy` to directly download the file from the STScI archive: https://archive.stsci.edu/prepds/xdf/ 

(Generally, the best package for web queries of astronomical data is [Astroquery](https://astroquery.readthedocs.io/en/latest/); however, the dataset we are using is a High Level Science Product (HLSP) and thus is not located within a catalog that could be queried with Astroquery.)

In [None]:
url = 'https://archive.stsci.edu/pub/hlsp/xdf/hlsp_xdf_hst_acswfc-60mas_hudf_f435w_v1_sci.fits'
with fits.open(url) as hdulist:
    hdulist.info()
    data = hdulist[0].data
    header = hdulist[0].header

#### Modifying data
For the purposes of this notebook example, we're going to add a linear background effect from the top to the bottom of these data. But don't worry about this (pay no attention to that man behind the curtain!).

In [None]:
mask = data == 0
n_data_pixels = len(data[~mask])
background = np.linspace(-1e-4, 5e-4, num=n_data_pixels)

modified_data = np.copy(data)
modified_data[~mask] += background[:]

#### Data representation

Throughout this notebook, we are going to store our images in Python using a `CCDData` object (see [Astropy documentation](http://docs.astropy.org/en/stable/nddata/index.html#ccddata-class-for-images)), which contains a `numpy` array in addition to metadata such as uncertainty, masks, or units. In this case, each image has units of electrons (counts) per second.

In [None]:
from astropy.nddata import CCDData
unit = u.electron / u.s
xdf_image = CCDData(modified_data, unit=unit, meta=header)

Let's look at the data:

In [None]:
# Set up the figure with subplots
fig, ax1 = plt.subplots(1, 1, figsize=(8, 8))

# Set up the normalization and colormap
norm_image = ImageNormalize(vmin=1e-4, vmax=5e-2, stretch=LogStretch(), clip=False)
cmap = plt.get_cmap('viridis')
cmap.set_over(cmap.colors[-1])
cmap.set_under(cmap.colors[0])
cmap.set_bad('white') # Show masked data as white
xdf_image_clipped = np.clip(xdf_image, 1e-4, None) # clip to plot with logarithmic stretch

# Plot the data
fitsplot = ax1.imshow(np.ma.masked_where(xdf_image.mask, xdf_image_clipped), 
                      norm=norm_image, cmap=cmap)

# Define the colorbar and fix the labels
cbar = plt.colorbar(fitsplot, fraction=0.046, pad=0.04, ticks=LogLocator(subs=range(10)))
labels = ['$10^{-4}$'] + [''] * 8 + ['$10^{-3}$'] + [''] * 8 + ['$10^{-2}$']
cbar.ax.set_yticklabels(labels)

# Define labels
cbar.set_label(r'Flux Count Rate ({})'.format(xdf_image.unit.to_string('latex')), 
               rotation=270, labelpad=30)
ax1.set_xlabel('X (pixels)')
ax1.set_ylabel('Y (pixels)')

*Tip: Double-click on any inline plot to zoom in.*

## Mask data

You probably noticed that a large portion of the data is equal to zero. The data we are using is a reduced mosaic that combines many different exposures, and that has been rotated such that not all of the array holds data. 

We want to **mask** out the non-data portions of the image array, so all of those pixels that have a value of zero don't interfere with our statistics and analyses of the data.
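To see why the zero-valued pixels matter, here is a minimal sketch on a toy array (not the XDF data; the array values are made up for illustration). It compares the mean of the whole array with the mean of only the unmasked pixels:

```python
import numpy as np

# Toy 4x4 "mosaic": the zero-valued border stands in for pixels with no data
image = np.zeros((4, 4))
image[1:3, 1:3] = [[2.0, 4.0], [6.0, 8.0]]

# Boolean mask: True where the pixel holds no data
mask = image == 0

# The unmasked mean is dragged toward zero by the empty pixels
mean_all = image.mean()

# The masked mean uses only the pixels that hold real data
mean_valid = np.ma.masked_array(image, mask=mask).mean()

print(mean_all, mean_valid)
```

The same logic applies to the XDF mosaic, where the empty corners would otherwise bias any statistic toward zero.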

In [None]:
# Define the mask
xdf_image.mask = xdf_image.data == 0

In [None]:
# Set up the figure with subplots
fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(12, 6), sharey=True)
plt.tight_layout()

# Plot the mask
ax1.imshow(xdf_image.mask, cmap='Greys')
ax1.set_xlabel('X (pixels)')
ax1.set_ylabel('Y (pixels)')
ax1.set_title('Mask')

# Plot the masked data
fitsplot = ax2.imshow(np.ma.masked_where(xdf_image.mask, xdf_image_clipped), 
                      norm=norm_image, cmap=cmap)

# Define the colorbar and fix the labels
cbar_ax = fig.add_axes([1, 0.09, 0.03, 0.87])
cbar = fig.colorbar(fitsplot, cbar_ax, ticks=LogLocator(subs=range(10)))
labels = ['$10^{-4}$'] + [''] * 8 + ['$10^{-3}$'] + [''] * 8 + ['$10^{-2}$']
cbar.ax.set_yticklabels(labels)
cbar.set_label(r'Flux Count Rate ({})'.format(xdf_image.unit.to_string('latex')), 
               rotation=270, labelpad=30)
ax2.set_xlabel('X (pixels)')
ax2.set_title('Masked Data')

On the left we have plotted this mask, which has a value of 1 (or True) shown in black where the data is bad, and 0 (or False) shown in white where the data is good. 

After the mask is applied to the data (on the right above), the masked pixels are shown in white.

## Perform scalar background estimation

Now that the data are properly masked, we can calculate some basic statistical values to do a scalar estimation of the image background. 

By "scalar estimation", we mean the calculation of a single value (such as the mean or median) to represent the value of the background for our entire two-dimensional dataset. This is in contrast to a two-dimensional background, where the estimated background is represented as an array of values that can vary spatially across the image. We will calculate a 2D background in the upcoming section.

### Calculate scalar background value

Here we will calculate the mean, median, and standard deviation of the dataset using sigma clipping. With sigma clipping, the data is iteratively clipped to exclude data points that lie more than a certain number of standard deviations (sigma) from the central value, thus removing outliers such as bright sources before the statistics are computed.
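The iteration can be sketched with `numpy` alone. The helper below, `simple_sigma_clip_stats`, is a hypothetical illustration written for this notebook, not the `astropy` implementation; it assumes clipping around the median with the plain standard deviation, and runs on synthetic pixels rather than the XDF data:

```python
import numpy as np

def simple_sigma_clip_stats(values, sigma=3.0, maxiters=5):
    """Hypothetical helper: iteratively reject points far from the median."""
    values = np.asarray(values, dtype=float).ravel()
    for _ in range(maxiters):
        center = np.median(values)
        spread = np.std(values)
        # Keep only points within sigma * std of the current median
        keep = np.abs(values - center) <= sigma * spread
        if keep.all():
            break  # nothing left to reject
        values = values[keep]
    return np.mean(values), np.median(values), np.std(values)

# Synthetic pixels: Gaussian background plus a handful of bright "sources"
rng = np.random.default_rng(seed=0)
background = rng.normal(loc=1.0, scale=0.1, size=10_000)
sources = np.full(200, 50.0)
pixels = np.concatenate([background, sources])

raw_mean = pixels.mean()
clipped_mean, clipped_median, clipped_std = simple_sigma_clip_stats(pixels)
print(raw_mean, clipped_mean, clipped_median, clipped_std)
```

In this toy case the bright pixels pull the raw mean well above the true background level of 1.0, while the clipped mean stays close to it.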

In [None]:
# Calculate statistics with masking
mean, median, std = sigma_clipped_stats(xdf_image.data, sigma=3.0, maxiters=5, mask=xdf_image.mask)

# Calculate statistics without masking
stats_nomask = sigma_clipped_stats(xdf_image.data, sigma=3.0, maxiters=5)

But what difference does this sigma clipping make? And how important is masking, anyway? Let's visualize these statistics to get an idea:

In [None]:
# Set up the figure with subplots
fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(12, 4), sharey=True)

# Plot histograms of the data
flux_range = (-.5e-3, 1.5e-3)
ax1.hist(xdf_image[~xdf_image.mask], bins=100, range=flux_range)
ax2.hist(xdf_image[~xdf_image.mask], bins=100, range=flux_range)

# Plot lines for each kind of mean
ax1.axvline(mean, label='Masked and Clipped', c='C1', ls='-.', lw=3)
ax1.axvline(np.average(xdf_image[~xdf_image.mask]), label='Masked', c='C2', lw=3)
ax1.axvline(stats_nomask[0], label='Clipped', c='C3', ls=':', lw=3)
ax1.axvline(np.average(xdf_image), label='Neither', c='C5', ls='--', lw=3)

ax1.set_xlim(flux_range)
ax1.set_xlabel(r'Flux Count Rate ({})'.format(xdf_image.unit.to_string('latex')), fontsize=14)
ax1.set_ylabel('Frequency', fontsize=14)
ax1.set_title('Effect of Sigma-Clipping \n and Masking on Mean', fontsize=16)

# Plot lines for each kind of median
# Note: use np.ma.median rather than np.median for masked arrays
ax2.axvline(median, label='Masked and Clipped', c='C1', ls='-.', lw=3)
ax2.axvline(np.ma.median(xdf_image[~xdf_image.mask]), label='Masked', c='C2', lw=3)
ax2.axvline(stats_nomask[1], label='Clipped', c='C3', ls=':', lw=3)
ax2.axvline(np.ma.median(xdf_image), label='Neither', c='C5', ls='--', lw=3)

ax2.set_xlim(flux_range)
ax2.set_xlabel(r'Flux Count Rate ({})'.format(xdf_image.unit.to_string('latex')), fontsize=14)
ax2.set_title('Effect of Sigma-Clipping \n and Masking on Median', fontsize=16)

# Add legend
ax1.legend(fontsize=11, loc='lower center', bbox_to_anchor=(1.1, -0.45), ncol=2, handlelength=6)

Just from looking at the distribution of the data, it is easy to see how sigma-clipping and masking improve the calculation of the mean and median: the masked and sigma-clipped values are closest to the center of the distribution in both cases. It's also worth noting that the median does a better job than the mean even without masking or clipping!
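The robustness of the median is easy to check by hand. A minimal sketch with seven made-up pixel values, six near a background level of 1.0 and one bright outlier:

```python
import numpy as np

# Six background-like pixels and one bright "source" pixel (made-up values)
pixels = np.array([1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 30.0])

toy_mean = np.mean(pixels)      # pulled upward by the single bright pixel
toy_median = np.median(pixels)  # stays at the background level

print(toy_mean, toy_median)
```

A single bright pixel shifts the mean by several times the background level but leaves the median untouched, which is why the median is often the safer scalar estimate in crowded fields.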

### Subtract scalar background value

But enough looking at numbers, let's actually remove the background from the data. By using the `subtract()` method of the `CCDData` class, we can subtract the mean background while maintaining the metadata and mask of our original CCDData object:

In [None]:
# Calculate the scalar background subtraction, maintaining metadata, unit, and mask
xdf_scalar_bkgdsub = xdf_image.subtract(mean * u.electron / u.s)

In [None]:
# Set up the figure with subplots
fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(12, 6), sharey=True)
plt.tight_layout()

# Plot the original data
fitsplot = ax1.imshow(np.ma.masked_where(xdf_image.mask, xdf_image_clipped), norm=norm_image)
ax1.set_xlabel('X (pixels)')
ax1.set_ylabel('Y (pixels)')
ax1.set_title('Original Data')

# Plot the subtracted data
xdf_scalar_bkgdsub_clipped = np.clip(xdf_scalar_bkgdsub, 1e-4, None) # clip to plot with logarithmic stretch
fitsplot = ax2.imshow(np.ma.masked_where(xdf_scalar_bkgdsub.mask, xdf_scalar_bkgdsub_clipped), norm=norm_image)
ax2.set_xlabel('X (pixels)')
ax2.set_title('Scalar Background-Subtracted Data')

# Define the colorbar and fix the labels
cbar_ax = fig.add_axes([1, 0.09, 0.03, 0.87])
cbar = fig.colorbar(fitsplot, cbar_ax, ticks=LogLocator(subs=range(10)))
labels = ['$10^{-4}$'] + [''] * 8 + ['$10^{-3}$'] + [''] * 8 + ['$10^{-2}$']
cbar.ax.set_yticklabels(labels)
cbar.set_label(r'Flux Count Rate ({})'.format(xdf_image.unit.to_string('latex')), 
               rotation=270, labelpad=30)

Note that both plots above use the same normalization scheme, represented by the colorbar on the right. That is to say, if two pixels have the same color in both arrays, they have the same value.

That looks better! You can tell that the background is darker, especially in the top corner. However, the background still does not seem to be completely removed. In this case, the background varies spatially; it is two-dimensional. Thankfully, `photutils` includes functions to remove background like this.
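Why a single number cannot remove a gradient can be shown with a small synthetic example (a pure linear ramp with no sources, chosen for illustration only):

```python
import numpy as np

# Synthetic 100x100 image: a linear ramp from 0.0 (top row) to 1.0 (bottom row)
ramp = np.linspace(0.0, 1.0, 100)
image = np.tile(ramp[:, np.newaxis], (1, 100))

# Scalar estimate: one number for the whole image
scalar_bkg = np.median(image)
scalar_residual = image - scalar_bkg

# The residual is centered on zero, but the ramp itself is unchanged
range_before = image.max() - image.min()
range_after = scalar_residual.max() - scalar_residual.min()
print(scalar_bkg, range_before, range_after)
```

Subtracting a scalar shifts every pixel by the same amount, so the top-to-bottom difference is exactly as large after subtraction as before.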

<div class="alert alert-block alert-info">
    
<h3>Exercises:</h3><br>

Perform a median scalar background subtraction on our sigma-clipped data. Plot it and visually inspect it. How does it compare to the original data?
<br><br>
Compare the median background subtraction to the mean background subtraction. Which is better?

</div>

## Perform 2-D background estimation

The `Background2D` class allows users to model 2-dimensional backgrounds, by calculating the mean or median in small boxes, and smoothing these boxes to reconstruct a continuous 2D background. The class includes the following arguments/attributes:
* **`box_size`** &mdash; the size of the boxes used to calculate the background. This should be larger than individual sources, yet still small enough to encompass changes in the background.
* **`filter_size`** &mdash; the size of the median filter used to smooth the final 2D background.
* **`filter_threshold`** &mdash; threshold below which the smoothing median filter will not be applied.
* **`sigma_clip`** &mdash; an `astropy.stats.SigmaClip` object that is used to specify the sigma and number of iterations used to sigma-clip the data before background calculations are performed.
* **`bkg_estimator`** &mdash; the method used to perform the background calculation in each box (mean, median, SExtractor algorithm, etc.).

For this example, we will use the `MeanBackground` estimator.
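Before running `photutils` itself, the box idea can be illustrated with `numpy` alone. The `box_background` helper below is a hypothetical sketch, not how `Background2D` is implemented: it assumes the image dimensions are exact multiples of the box size, takes a plain median in each box, and rebuilds the full-size map by repeating each box value, whereas `Background2D` additionally sigma-clips, filters, and interpolates the low-resolution map.

```python
import numpy as np

def box_background(image, box_size):
    """Hypothetical helper: median in each box, then blocky upsampling."""
    ny, nx = image.shape
    # Assumes ny and nx are exact multiples of box_size
    boxes = image.reshape(ny // box_size, box_size, nx // box_size, box_size)
    low_res = np.median(boxes, axis=(1, 3))
    # Repeat each box value to rebuild a full-size background map
    return np.kron(low_res, np.ones((box_size, box_size)))

# Synthetic image: vertical gradient, a little noise, one compact bright source
rng = np.random.default_rng(seed=1)
ny, nx, box_size = 200, 200, 50
gradient = np.linspace(0.0, 1.0, ny)[:, np.newaxis] * np.ones((1, nx))
image = gradient + rng.normal(scale=0.01, size=(ny, nx))
image[20:25, 30:35] += 100.0

bkg_map = box_background(image, box_size)
residual = image - bkg_map
print(bkg_map[0, 0], bkg_map[-1, -1])
```

Because the source covers only a small fraction of its box, the box median still tracks the local background, which is the reason `box_size` should be larger than individual sources.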

In [None]:
from photutils.background import Background2D, MeanBackground

In [None]:
sigma_clip = SigmaClip(sigma=3., maxiters=5)
bkg_estimator = MeanBackground()
bkg = Background2D(xdf_image, box_size=200, filter_size=(10, 10), mask=xdf_image.mask,
                   sigma_clip=sigma_clip, bkg_estimator=bkg_estimator)

So, what does this 2D background look like? Where were the boxes placed?

In [None]:
# Set up the figure with subplots
fig, ax1 = plt.subplots(1, 1, figsize=(8, 8))

# Plot the background
background_clipped = np.clip(bkg.background, 1e-4, None) # clip to plot with logarithmic stretch
fitsplot = ax1.imshow(np.ma.masked_where(xdf_image.mask, background_clipped), norm=norm_image)

# Plot the meshes
bkg.plot_meshes(outlines=True, color='lightgrey')

# Define the colorbar
cbar = plt.colorbar(fitsplot, fraction=0.046, pad=0.04, ticks=LogLocator(subs=range(10)))
labels = ['$10^{-4}$'] + [''] * 8 + ['$10^{-3}$'] + [''] * 8 + ['$10^{-2}$']
cbar.ax.set_yticklabels(labels)

# Define labels
cbar.set_label(r'Flux Count Rate ({})'.format(xdf_image.unit.to_string('latex')), 
               rotation=270, labelpad=30)
ax1.set_xlabel('X (pixels)')
ax1.set_ylabel('Y (pixels)')
ax1.set_title('2D Estimated Background')

You might notice that not all areas of the background array have mesh boxes over them (look for those boxes that do not have a `+`). If you compare this background array with the original data, you'll see that these un-boxed areas contain particularly bright sources, and thus are not being included in the background estimate.

And how does the data look if we use this background subtraction method (again maintaining the attributes of the CCDData object)?

In [None]:
# Calculate the 2D background subtraction, maintaining metadata, unit, and mask
xdf_2d_bkgdsub = xdf_image.subtract(bkg.background * u.electron / u.s)

In [None]:
# Set up the figure with subplots
fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(12, 6), sharey=True)
plt.tight_layout()

# Clip the 2D-subtracted data to plot with logarithmic stretch
xdf_2d_bkgdsub_clipped = np.clip(xdf_2d_bkgdsub, 1e-4, None)

# Plot the scalar-subtracted data
fitsplot = ax1.imshow(np.ma.masked_where(xdf_scalar_bkgdsub.mask, xdf_scalar_bkgdsub_clipped), norm=norm_image)
ax1.set_ylabel('Y (pixels)')
ax1.set_xlabel('X (pixels)')
ax1.set_title('Scalar Background-Subtracted Data')

# Plot the 2D-subtracted data
fitsplot = ax2.imshow(np.ma.masked_where(xdf_2d_bkgdsub.mask, xdf_2d_bkgdsub_clipped), norm=norm_image)
ax2.set_xlabel('X (pixels)')
ax2.set_title('2D Background-Subtracted Data')

# Plot the colorbar
cbar_ax = fig.add_axes([1, 0.09, 0.03, 0.87])
cbar = fig.colorbar(fitsplot, cbar_ax, ticks=LogLocator(subs=range(10)))
labels = ['$10^{-4}$'] + [''] * 8 + ['$10^{-3}$'] + [''] * 8 + ['$10^{-2}$']
cbar.ax.set_yticklabels(labels)
cbar.set_label(r'Flux Count Rate ({})'.format(xdf_image.unit.to_string('latex')), 
               rotation=270, labelpad=30)

Note how much more even the 2D background-subtracted image looks, especially when comparing the bottom and top corners of the two images. This makes sense, as the background that `Background2D` identified was a gradient from the top corner down to the bottom!

<div class="alert alert-block alert-info">
    
<h3>Exercises:</h3><br>

Calculate the standard deviation (with sigma-clipping and masking!) for the original data, the scalar background-subtracted data, and the 2D background-subtracted data. How do the values compare? Which has the smallest standard deviation?<br><br>

Notice that the difference between each dataset's standard deviation is small - why might this be?

</div>

---
## Conclusions

The `photutils` package provides a powerful tool in the `Background2D` class, allowing users to easily estimate and subtract spatially variant background signals from their data.

**To continue with this `photutils` tutorial, go on to the [source detection notebook](../02_source_detection/02_source_detection.ipynb).**

---
## Additional Resources
For more examples and details, please visit the [photutils](http://photutils.readthedocs.io/en/stable/index.html) documentation.

---
## About this Notebook
**Authors:** Lauren Chambers (lchambers@stsci.edu), Erik Tollerud (etollerud@stsci.edu)
<br>**Updated:** May 2019

[Top of Page](#title_ID)
<img style="float: right;" src="https://raw.githubusercontent.com/spacetelescope/notebooks/master/assets/stsci_pri_combo_mark_horizonal_white_bkgd.png" alt="STScI logo" width="200px"/>