# Create Synthetic Galaxies

This notebook takes cleaned training data as input, creates a cutout image of each galaxy in the dataset, then uses these to create synthetic galaxy cutouts with a pixel-level transform. Each cutout is embedded into a plain black 450x450 image, so the galaxy is the only object in it. The annotations for each galaxy are unchanged.
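The embedding step can be illustrated with a minimal NumPy sketch. It is not the code in `cut_and_paste`. The helper name `embed_cutout`, the single-channel image, and the `top`/`left` placement arguments are all assumptions made for illustration. Only the black 450x450 canvas comes from the description above.

```python
import numpy as np

CANVAS_SIZE = 450  # canvas side length in pixels, as described above


def embed_cutout(cutout, top, left, canvas_size=CANVAS_SIZE):
    """Paste a 2-D cutout onto an all-black square canvas (hypothetical helper)."""
    height, width = cutout.shape
    if top < 0 or left < 0 or top + height > canvas_size or left + width > canvas_size:
        raise ValueError("cutout does not fit inside the canvas at this position")
    # Start from an all-zero (black) canvas with the same dtype as the cutout.
    canvas = np.zeros((canvas_size, canvas_size), dtype=cutout.dtype)
    # Copy the galaxy pixels into place; everything else stays black.
    canvas[top:top + height, left:left + width] = cutout
    return canvas


# Example: a uniformly bright 32x32 cutout placed at row 100, column 200.
cutout = np.full((32, 32), 255, dtype=np.uint8)
image = embed_cutout(cutout, top=100, left=200)
print(image.shape)
```

Pasting the cutout back at the position of its original bounding box is one way the annotations could stay valid without modification. Whether the library does this is not stated here.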

The notebook also separates the cutout images by the category (class) of the object they contain. The output directory will therefore contain four directories numbered 0-3, one for each category of galaxy in the dataset.

Note that it only makes sense to create cutouts for training data, so this step does not apply to the validation or test sets.

**Prerequisites**: The data needs to have been downloaded and cleaned already. The `clean.ipynb` notebook walks through this process.

In [1]:
import sys

sys.path.insert(1, "../..")
from data_processing.data_proc_lib.cut_and_paste import create_cutouts
from data_processing.data_proc_lib.pixel_transforms import create_pixel_distributions, transform_pixel_distributions

## 1 Create Cutout Galaxies

Set `CLEANED_PATH` to be the location of the cleaned data.

Set `CUTOUTS_PATH` to be the path where the cutout images and annotations should be placed.

To create cutouts that are scaled to a 32x32 pixel size, call `create_cutouts(CLEANED_PATH, CUTOUTS_PATH, scale=True)`.

In [2]:
CLEANED_PATH = "/mnt/data/rgn_ijcnn/cleaned"
CUTOUTS_PATH = "/mnt/data/rgn_ijcnn/augmented/cutouts_scaled_by_class"

In [None]:
create_cutouts(CLEANED_PATH, CUTOUTS_PATH, scale=True)
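Once `create_cutouts` has run, a quick sanity check is to count the files in each of the four category directories. The sketch below uses only the standard library. `count_files_per_category` is a hypothetical helper, not part of `data_proc_lib`, and it counts every regular file it finds, so annotation files stored alongside the images would be included.

```python
import tempfile
from pathlib import Path


def count_files_per_category(cutouts_path, categories=("0", "1", "2", "3")):
    """Return {category: number of regular files directly inside that directory}."""
    root = Path(cutouts_path)
    counts = {}
    for category in categories:
        category_dir = root / category
        if category_dir.is_dir():
            counts[category] = sum(1 for p in category_dir.iterdir() if p.is_file())
        else:
            counts[category] = None  # the category directory does not exist
    return counts


# Demonstration on a throwaway directory tree (file names are made up).
with tempfile.TemporaryDirectory() as tmp:
    for category, n_files in (("0", 2), ("1", 1), ("2", 0)):
        category_dir = Path(tmp) / category
        category_dir.mkdir()
        for i in range(n_files):
            (category_dir / f"cutout_{i}.png").touch()
    demo_counts = count_files_per_category(tmp)

print(demo_counts)
```

In the notebook the same helper would be called as `count_files_per_category(CUTOUTS_PATH)`. A value of `None` indicates a missing category directory.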

## 2 Create Pixel Distributions per Category

The following step does two things:

1. Creates a directory called `pixel_distributions` in each category-specific subdirectory.
2. Creates the pixel distributions and places them there. Note that each pixel distribution image file has the same filename as the cutout file it is derived from, but is stored in a different directory.

In [None]:
create_pixel_distributions(CUTOUTS_PATH)
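Because each pixel distribution shares its filename with the cutout it came from, the output of this step can be verified by comparing filenames. The following is a minimal sketch under two assumptions: the cutout images sit directly inside each category directory, and they use a `.png` extension. `missing_distributions` is a hypothetical helper, not a library function.

```python
import tempfile
from pathlib import Path


def missing_distributions(category_dir, subdir="pixel_distributions", suffix=".png"):
    """List cutout filenames with no same-named file in the distributions subdirectory."""
    category_dir = Path(category_dir)
    dist_dir = category_dir / subdir
    # Cutouts are assumed to be the image files directly inside the category directory.
    cutouts = sorted(
        p.name for p in category_dir.iterdir() if p.is_file() and p.suffix == suffix
    )
    return [name for name in cutouts if not (dist_dir / name).is_file()]


# Demonstration: two cutouts, only one of which has a pixel distribution.
with tempfile.TemporaryDirectory() as tmp:
    category_dir = Path(tmp) / "0"
    (category_dir / "pixel_distributions").mkdir(parents=True)
    (category_dir / "a.png").touch()
    (category_dir / "b.png").touch()
    (category_dir / "pixel_distributions" / "a.png").touch()
    demo_missing = missing_distributions(category_dir)

print(demo_missing)
```

An empty list for every category directory would indicate that each cutout has a matching pixel distribution file.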

## 3 Transform Pixel Distributions

This section performs the pixel transformation on the pixel distributions and then converts each one back to a galaxy image, giving a pixel-transformed galaxy.

In [None]:
transform_pixel_distributions(CUTOUTS_PATH)