# HSC data

The cutouts are already centered on the RA and Dec of each galaxy, with a 40 arcsec radius. We choose the `coadd/bg` option to avoid the aggressive background subtraction that shreds the extended wings of our galaxies.
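For reference, a 40 arcsec radius gives a cutout 80 arcsec on a side. A quick check of the corresponding size in pixels, assuming the 0.168 arcsec per pixel scale that the cropping step later in this notebook uses:

```python
radius_arcsec = 40
pixel_scale = 0.168  # arcsec per pixel, the value used in the cropping step

# Full width of the cutout in pixels, rounded to the nearest integer
size_pix = int(2 * radius_arcsec / pixel_scale + 0.5)
print(size_pix)
```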

1. Align the data with North
2. Add the PSF from the [PSF picker](https://hsc-release.mtk.nao.ac.jp/psf/pdr3/)
    * Make sure the PSF is for the same run as the data
    
...Actually, the PSF picker didn't return all galaxies, so could just do this by hand?
#### Imports

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from astropy.io import fits
import pandas as pd
from astropy.wcs import WCS
from astropy.nddata import bitfield_to_boolean_mask
from tqdm.notebook import tqdm

## Organize the data

The default name of an HSC download is `[line number]-coadd+bg-HSC-[filter]-[tract number]-[rerun].fits`, where
* `[line number]` is the line number in the input CSV file
* `[filter]` is `I`
* `[tract number]` is a four- or five-digit number
* `[rerun]` is `pdr3_wide` or `pdr3_dud_rev`
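
As a minimal sketch of how this naming scheme can be unpacked, the hypothetical helper below (`parse_hsc_name` is not part of the notebook) splits a file name on `-`. It assumes the name follows the template above exactly, and that line 1 of the input CSV is the header, so that line number `n` corresponds to zero-based row `n - 2`. The example file name is made up for illustration.

```python
def parse_hsc_name(filename):
    """Split an HSC cutout file name into its components (illustrative helper)."""
    # Drop the '.fits' extension if present
    stem = filename[:-len('.fits')] if filename.endswith('.fits') else filename
    vals = stem.split('-')
    # vals: [line number, 'coadd+bg', 'HSC', filter, tract number, rerun]
    return {
        'row': int(vals[0]) - 2,   # assumed: header on line 1, first galaxy on line 2
        'filter': vals[3],
        'tract': int(vals[4]),
        'rerun': vals[-1],
    }

# Made-up file name that follows the template
info = parse_hsc_name('2-coadd+bg-HSC-I-9812-pdr3_wide.fits')
print(info)
```

Splitting on `-` is safe here because the rerun names use underscores rather than hyphens.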

Load in the galaxies list:

In [2]:
data = pd.read_csv('data/catalogs/data.csv')
# hsc_cat = data[data.hsc]
data.head(2)

Unnamed: 0,name,ra,dec,lmass50,z,dataset,cfis,hsc
0,J000318+004844,0.825888,0.812301,10.82,0.138889,spog,False,True
1,J001145-005431,2.938389,-0.908503,10.221996,0.047883,spog,False,True


For galaxies with HSC data, save their tract number and rerun:

In [3]:
hsc_files = ! ls data/hsc/raw/
hsc_files = [f[:-5] for f in hsc_files]  # strip the '.fits' extension

# Extract line number, tract number, and rerun info
hsc_cat = []
for filename in hsc_files:
    vals = filename.split('-')
    # Line numbers are 1-based and count the CSV header line, hence the -2 to get the row index in `data`
    row = {'idx' : int(vals[0])-2, 'tract' : int(vals[4]), 'rerun' : vals[-1], 'filename' : filename}
    hsc_cat.append(row)
hsc_cat = pd.DataFrame(hsc_cat)
hsc_cat.index = hsc_cat.idx
# Inner join on the row index: only galaxies with a downloaded file are kept
hsc_cat = pd.merge(data, hsc_cat, left_index=True, right_index=True).reset_index().drop(columns=['idx', 'index'])
# hsc_cat.to_csv('data/catalogs/hsc.csv', index=False)

## PSF

Generate a coordinate list to download the PSFs in bulk.

In [4]:
psf_cat = hsc_cat[['ra','dec','tract','rerun']].copy()
psf_cat['filter'] = 'i'
psf_cat['type'] = 'coadd'
psf_cat.to_csv('data/catalogs/hsc_psfs.csv', index=False, sep='\t')

Get the list of PSF files:

In [5]:
psffiles = ! ls data/hsc/psf
psffiles = [f[:-5] for f in psffiles]  # strip the '.fits' extension
psf_names = []
for filename in psffiles:
    vals = filename.split('-')
    # Line numbers here refer to hsc_psfs.csv, whose rows follow the order of hsc_cat
    row = {'idx' : int(vals[0])-2, 'psf_file' : filename}
    psf_names.append(row)
psf_names = pd.DataFrame(psf_names)
psf_names.index = psf_names.idx
psf_names.drop(columns=['idx'], inplace=True)

# Inner join on the row index: galaxies without a PSF file are dropped
hsc_cat = pd.merge(hsc_cat, psf_names, left_index=True, right_index=True)
hsc_cat.to_csv('data/catalogs/hsc.csv', index=False)

## Attach the PSF

For each galaxy, read in the PSF file and attach it as the last FITS extension of the galaxy file.

In [6]:
for idx, row in hsc_cat.iterrows():

    # Context managers close both files even if writing fails
    with fits.open(f'data/hsc/raw/{row.filename}.fits') as galaxy_f, \
         fits.open(f'data/hsc/psf/{row.psf_file}.fits') as psf_f:

        # Append the PSF image as the last extension and save under the galaxy name
        galaxy_f.append(psf_f[0])
        galaxy_f.writeto(f'data/hsc/{row["name"]}.fits', overwrite=True)

## Fix the file

We want every galaxy to have a four-HDU FITS file containing, in order, the image, error, mask, and PSF. Each HDU should also have an `EXTNAME` equal to `IMAGE`, `ERR`, `MASK`, or `PSF`, respectively.

In [7]:
for idx, row in tqdm(data[data.hsc].iterrows(), total=len(data[data.hsc])):
    
    galaxy_f = fits.open(f'data/hsc/{row["name"]}.fits')
    
    # Crop the images so they are all 80" x 80" (at 0.168 arcsec per pixel)
    size = int(80/0.168 + 0.5)
    
    # Fix order and add extnames
    hdus = [galaxy_f[1], galaxy_f[3], galaxy_f[2], galaxy_f[4]]
    names = ['IMAGE', 'ERR', 'MASK', 'PSF']
    for hdu, name in zip(hdus, names):
        if name != 'PSF':
            img = hdu.data[:size, :size]
            hdu.data = img
        hdu.header['EXTNAME'] = name
        
    # Add the FLUXMAG0 keyword to header
    hdus[0].header['FLUXMAG0'] = galaxy_f[0].header['FLUXMAG0']
        
    # Fix the mask: convert the bit planes to a 0/1 mask, ignoring the bits listed below.
    # Keep CR pixels, they are interpolated over. Ignore saturated pixels.
    things_to_ignore = [1, 2, 3, 5, 6, 9, 10, 11, 13, 14, 15, 16]
    mask = bitfield_to_boolean_mask(galaxy_f["MASK"].data, ignore_flags=[2**n for n in things_to_ignore]).astype(int)
    hdus[2].data = mask
    
    # Make the image a primary HDU
    hdus[0] = fits.PrimaryHDU(hdus[0].data, hdus[0].header)
    
    # Save file
    hdul = fits.HDUList(hdus)
    hdul.writeto(f'data/hsc/{row["name"]}.fits', overwrite=True)
    
    galaxy_f.close()    

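
To make the mask step concrete, here is a minimal NumPy sketch of the logic that `bitfield_to_boolean_mask` applies when given `ignore_flags`: a pixel is flagged if any bit outside the ignored set is on. The helper name `boolean_mask` and the toy bit planes are hypothetical; the values are illustrative and do not correspond to actual HSC mask planes.

```python
import numpy as np

def boolean_mask(bitfield, ignored_bits):
    """Flag pixels that have any non-ignored bit set (illustrative helper)."""
    # Build an integer with all ignored bits switched on
    ignore = 0
    for n in ignored_bits:
        ignore |= 1 << n
    bitfield = np.asarray(bitfield, dtype=np.int64)
    # Clear the ignored bits, then flag whatever is left
    return (bitfield & ~ignore) != 0

# Toy 2x2 bit-plane image: clean, bit 3 only, bit 4 only, bits 3 and 8
planes = np.array([[0, 1 << 3],
                   [1 << 4, (1 << 3) | (1 << 8)]])

mask = boolean_mask(planes, ignored_bits=[1, 2, 3, 5]).astype(int)
print(mask)
```

Here the pixel with only bit 3 set stays unmasked because bit 3 is in the ignored list, while the pixels with bit 4 or bit 8 set are masked.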