# Generating ZTF cutouts given RA and DEC

More information on ZTF: https://www.ztf.caltech.edu <br>
ztfquery repo: https://github.com/MickaelRigault/ztfquery

### Generating ZTF cutouts:

ZTF cutouts are generated by requesting a URL that encodes both the full-sized image and the position of the object. A separate URL must be built for each object. The general format is:

`https://irsa.ipac.caltech.edu/ibe/data/ztf/products/sci/[year]/[month+day]/[fracday]/ztf_[filefracday]_[field]_[filtercode]_c[ccdid]_[imgtypecode]_q[qid]_sciimg.fits?center=[ra],[dec]&size=[cutout_size]arcsec&gzip=false`
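As a rough illustration of how the template is filled in, here is a minimal sketch. The function name `example_cutout_url` and all metadata values are hypothetical. It also assumes that the first eight digits of `filefracday` are the observation date (YYYYMMDD), followed by the six-digit `fracday`.

```python
def example_cutout_url(filefracday, field, filtercode, ccdid, imgtypecode, qid,
                       ra, dec, cutout_size):
    """Fill in the cutout URL template from image metadata and a sky position."""
    filefracday = str(filefracday)
    year = filefracday[0:4]        # assumed layout: YYYYMMDD + 6-digit fracday
    monthday = filefracday[4:8]
    fracday = filefracday[8:14]
    base = 'https://irsa.ipac.caltech.edu/ibe/data/ztf/products/sci/'
    # The field is zero-padded to 6 digits and the CCD ID to 2 digits
    filename = (f'ztf_{filefracday}_{int(field):06d}_{filtercode}'
                f'_c{int(ccdid):02d}_{imgtypecode}_q{qid}_sciimg.fits')
    return (f'{base}{year}/{monthday}/{fracday}/{filename}'
            f'?center={ra},{dec}&size={cutout_size}arcsec&gzip=false')

# Hypothetical metadata values, for illustration only
url = example_cutout_url(20190401123456, 600, 'zi', 5, 'o', 2, 150.0, 2.5, 20)
print(url)
```

The `generate_url` function defined later in this notebook does the same job, but reads these values from the metadata table returned by ztfquery.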

To generate a ZTF cutout in this way, you need to:
- Get a full-sized ZTF image that contains your object (which can be done with ztfquery).
- Retrieve the relevant information from that image (everything in square brackets in the above URL template), and generate the URL string using that information and the RA and DEC of your object.
- Open the URL to get the cutout image data for that specific object!
- (We then save the data as ".fits" images, for convenience of use with the next notebook.)

FITS format is a very common format for astronomical images. More information can be found here: https://en.wikipedia.org/wiki/FITS

<br>

In [None]:
from ztfquery import query
import pandas as pd
import numpy as np
import astropy.io.fits as fits
from tqdm import tqdm
import os
import matplotlib.pyplot as plt
import warnings

### Read in the star and galaxy data, and combine into one pandas DataFrame

In [None]:
stars = pd.read_csv('stars.csv')
galaxies = pd.read_csv('gals.csv')

# Note: building the DataFrame this way requires both catalogs
# to have the same number of rows
ra_dec = pd.DataFrame({
    'star_ra': stars['ra'].values,
    'star_dec': stars['dec'].values,
    'galaxy_ra': galaxies['ra'].values,
    'galaxy_dec': galaxies['dec'].values,
})

In [None]:
# Optional: You can shuffle the DataFrame to make cutouts in a random order.
ra_dec = ra_dec.sample(frac=1)

### Set and define various parameters for generating cutouts

***Note:*** For this example, `num_cutouts` is set to 10, so 10 cutouts of each type are created. To reproduce the number of cutouts we have supplied, change this number.

***Note:*** We loop over the full list of star/galaxy positions (`ra_dec`) and stop once `num_cutouts` cutouts have been made, instead of truncating `ra_dec` to the number of cutouts we want. A valid cutout is not guaranteed for every object: there may be no full-sized image at our specified seeing or filter, the cutout may be generated at the wrong size, etc. Passing through 10 objects might yield only 9 cutouts, so we set how many cutouts we *want* and go one object at a time until that number is reached.
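The "keep trying until enough succeed" pattern described in this note can be sketched on its own, without any ZTF queries. The names `collect_until` and `toy_cutout` are hypothetical and exist only for this illustration; the toy function simply pretends that odd-numbered objects fail.

```python
def collect_until(candidates, wanted, try_make):
    """Try candidates in order until `wanted` of them have succeeded."""
    made = []
    for candidate in candidates:
        if len(made) >= wanted:
            break
        result = try_make(candidate)
        if result is not None:   # None marks a failed attempt
            made.append(result)
    return made

# Toy stand-in: pretend that odd-numbered objects fail to produce a cutout
def toy_cutout(obj_id):
    return f'cutout_{obj_id}' if obj_id % 2 == 0 else None

cutouts = collect_until(range(100), wanted=3, try_make=toy_cutout)
print(cutouts)  # ['cutout_0', 'cutout_2', 'cutout_4']
```

Five objects are visited to obtain three cutouts, which is why the list of positions is not truncated in advance.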

In [None]:
# The size of the cutout, in arcseconds
image_size = 20

# The number of cutouts (of each type) to make
num_cutouts = 10

# Create an empty pd.DataFrame to store cutout URLs
cut_outs_df = pd.DataFrame({})

# Create the directory for the cutout images to be stored (if it does not exist yet)
cutout_path = 'cutouts_test/'
os.makedirs(cutout_path, exist_ok=True)

# We append to this later for visualizing some cutouts
plot_data = []

# This suppresses 'ugly' warnings from ztfquery 
# when there are no images within our specified parameters
warnings.simplefilter("ignore", UserWarning)

### Create a function for generating ZTF cutout URLs:
- Takes in the full-sized image, and the ra and dec of the object
- Gets all relevant information from the full image
- Returns the cutout URL for that object

In [None]:
def generate_url(image, ra, dec):
    # `image` is the metadata table from ztfquery; we use its first row
    obsdate = image['obsdate'].values[0]
    year = obsdate[0:4]
    month = obsdate[5:7]
    day = obsdate[8:10]
    filefracday = str(image['filefracday'].values[0])
    fracday = filefracday[8:14]
    imgtypecode = str(image['imgtypecode'].values[0])
    qid = str(image['qid'].values[0])
    filtercode = image['filtercode'].values[0]
    
    # Pad the ZTF field to 6 digits and the CCD ID to 2 digits
    field = str(image['field'].values[0]).zfill(6)
    ccdid = str(image['ccdid'].values[0]).zfill(2)

    # `image_size` is the global cutout size (in arcseconds) set earlier
    cut_out = (
        'https://irsa.ipac.caltech.edu/ibe/data/ztf/products/sci/'
        f'{year}/{month}{day}/{fracday}/'
        f'ztf_{filefracday}_{field}_{filtercode}_c{ccdid}_{imgtypecode}_q{qid}_sciimg.fits'
        f'?center={ra},{dec}&size={image_size}arcsec&gzip=false'
    )
    
    return cut_out

### Create a function that opens each URL and saves cutout as a .fits image:
 
- Open each URL
- Check the data shape
    - If shape is wrong: correct the cutout and continue, or skip the cutout and exit
- Append first 3 images of each type to `plot_data` (for plotting later)
- Save image locally in .fits format

***Note:*** For ZTF, 1 arcsecond (1") is approximately 1 pixel. Even though we request 20" cutouts, a rare few are generated with more or fewer pixels. We want all cutouts to be the same size, so if a cutout is 21 pixels along either axis we trim it to 20, but if it is <20 or >21 we skip it (too much effort for a very rare case).
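The size check and trimming can be tried in isolation on dummy arrays. This is only a sketch of the same logic used in `save_image` below; the helper name `crop_to_size` is hypothetical.

```python
import numpy as np

def crop_to_size(image, size=20):
    """Return a size x size array, or None if the cutout should be skipped."""
    rows, cols = image.shape
    # Accept only `size` or `size + 1` pixels along each axis
    if not (size <= rows <= size + 1 and size <= cols <= size + 1):
        return None
    if rows == size + 1:
        image = np.delete(image, 0, axis=0)  # drop the first row
    if cols == size + 1:
        image = np.delete(image, 0, axis=1)  # drop the first column
    return image

print(crop_to_size(np.zeros((21, 20))).shape)  # (20, 20)
print(crop_to_size(np.zeros((21, 21))).shape)  # (20, 20)
print(crop_to_size(np.zeros((19, 20))))        # None
```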

In [None]:
def save_image(url, num, plot_data, pbar):
    # Download the cutout; with header=True, fits.getdata returns (image, header)
    image, header = fits.getdata(url, header=True)
    
    # Skip cutouts that are not 20 or 21 pixels along both axes
    if not (20 <= image.shape[0] <= 21 and 20 <= image.shape[1] <= 21):
        return num, pbar
    
    if image.shape[0] == 21:
        # Drop first row
        image = np.delete(image, 0, 0)
    if image.shape[1] == 21:
        # Drop first column
        image = np.delete(image, 0, 1)

    # Save as .fits (`obj_type` and `cutout_path` are globals set outside this function)
    fits.writeto(f'{cutout_path}{obj_type}_{num}.fits', image, header=header, overwrite=True)
    
    # Append first 3 objects for plotting
    if num < 3:
        plot_data.append(image)
    
    # For successful cutouts, add 1 to the counter and update progress bar
    num += 1
    pbar.update(1)
    return num, pbar

### For each object:
- Run ztfquery to get a full-sized ZTF science image that contains that object
- Generate a URL for that object's cutout with `generate_url`
- Open the URL and save cutout as .fits image using `save_image`

<u>***Note:***</u> We query the data with 4 conditions:

1. kind='sci'
    - This queries only "science" images, which we want for our cutouts.
2. Seeing < 1.85
    - "Seeing" quantifies the quality of the atmosphere (lower is better). Only images with seeing < 1.85 are kept. More info on "seeing" here: https://www.handprint.com/ASTRO/seeing3.html
3. filtercode = 'zi'
    - 'zi' stands for "ZTF i-band". We select i-band images because they seem to work best for classification.
4. mcen=True
    - Each object appears in MANY full-size science images, which makes querying slow when we only need 1 image to generate a cutout. Passing `mcen=True` returns only 1 full-sized image that contains the object's RA and DEC (the one in which the object is most centered). This makes the query faster and also avoids issues such as partial cutouts (when the object is too close to the edge of the image). The choice of which image to make the cutout from is also done for us!

In [None]:
for obj_type in ['star', 'galaxy']:
    
    num = 0 # This will keep track of how many cutouts are made of each type
    
    # Set a progress bar for cutout generation
    pbar = tqdm(total=num_cutouts, desc='Generating '+str(obj_type)+' cutouts', leave=True)
    
    # Loop over each object
    for idx, ra in enumerate(ra_dec[str(obj_type)+'_ra']):
        
        # Only continue if you have not satisfied 'num_cutouts'
        if num < num_cutouts:
            
            dec = ra_dec[str(obj_type)+'_dec'].values[idx]
            
            # Query ZTF images
            zquery = query.ZTFQuery()
            zquery.load_metadata(kind='sci', radec=[ra, dec], mcen=True, sql_query="seeing<1.85 and filtercode='zi'")
            image = zquery.metatable
            
            # Check that at least 1 ZTF image was returned by the query
            if len(image) > 0:
                # Generate cutout URL
                url = generate_url(image, ra, dec)
                
                # Save the cutout images
                num, pbar = save_image(url, num, plot_data, pbar)
                
        # Once `num_cutouts` is satisfied, exit loop, and start on next obj_type
        else:
            break

    # Close the progress bar, even if the catalog ran out before `num_cutouts` was reached
    pbar.close()

### Plot a few objects to verify cutouts were created correctly

In [None]:
fig, axes = plt.subplots(nrows=2, ncols=3, sharex='col', sharey='row', figsize=(12, 8))

# plot_data holds the first 3 stars followed by the first 3 galaxies
titles = ['Star 1', 'Star 2', 'Star 3', 'Galaxy 1', 'Galaxy 2', 'Galaxy 3']
for ax, data, title in zip(axes.flat, plot_data, titles):
    ax.imshow(data, cmap='gray')
    ax.set_title(title, fontsize=18)

plt.show()