# Deep learning course for astro PhD students

##### Alexandre Boucaud (APC) & Marc Huertas-Company (LERMA)

1. [Introduction](#Introduction)
2. [Code](#Code)
3. [Data](#Data)
4. [Load data](#Load-data)
5. [Model](#Model)
6. [Main](#Main)

## Introduction

In astronomical images, projection effects can cause two or more galaxies to overlap. When they are barely distinguishable from one another, they are referred to as _blended_, and this can bias astrophysical estimators such as the morphology of galaxies or the shear (weak gravitational lensing distortion).  
As the sensitivity of imaging devices grows, a high fraction of galaxies appear _blended_ in the images, which is a known and important issue for current and upcoming galaxy surveys.  

To avoid discarding such a wealth of information, it is key to develop methods that let astronomers mitigate this effect.
Machine learning could help with several tasks:
- classify an image as containing isolated/blended objects  
  ___binary classification___
- count the number of blended sources in a blended image  
  ___regression / object detection___
- find the contours of each object  
  ___object detection/segmentation___
- ...

In this exercise, we will tackle the third item, the detection of contours, but in a constrained way: the images will only contain **two galaxies** and the goal will be to find the **contours of the central galaxy**.
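To make the task concrete, here is a minimal toy sketch of what a blend and its target could look like. It is not how the course data were produced: the circular Gaussian profiles, the positions, the fluxes, and the `threshold` value are all assumptions chosen for illustration, and `gaussian_blob` is a hypothetical helper.

```python
import numpy as np


def gaussian_blob(size, center, sigma, flux=1.0):
    """Return a size x size image of a circular 2D Gaussian (a toy 'galaxy')."""
    y, x = np.mgrid[:size, :size]
    r2 = (x - center[0]) ** 2 + (y - center[1]) ** 2
    return flux * np.exp(-r2 / (2 * sigma ** 2))


size = 128
# Toy central galaxy and an off-centre companion (all values are arbitrary)
central = gaussian_blob(size, center=(64, 64), sigma=6.0)
companion = gaussian_blob(size, center=(80, 70), sigma=4.0, flux=0.7)

# Network input: the sum of both galaxies (noise is omitted here)
blend = central + companion

# Target: binary mask of the central galaxy alone (threshold is an assumption)
threshold = 0.1
central_mask = (central > threshold).astype(np.uint8)
```

The network sees only `blend` and has to recover `central_mask`, which means ignoring the flux contributed by the companion.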

## Code

Install the `dltools` library.

In [None]:
!git clone https://github.com/aboucaud/deeplearning4astro_labs_2019
# `!cd` runs in its own subshell and does not persist, so give pip the path directly
!pip install ./deeplearning4astro_labs_2019/labs/deblending

## Data

Download the data

In [4]:
import os
import sys
import tarfile
from urllib.request import urlretrieve

URL = "https://www.apc.univ-paris7.fr/Downloads/comput/aboucaud"
FOLDER = "ed127"
FILES = [
    "test_blends_mini.npy",
    "test_target_img_mini.npy",
    "test_target_masks_mini.npy",
    "train_blends_mini.npy",
    "train_target_img_mini.npy",
    "train_target_masks_mini.npy",
]
BIG_FILES = [
    "masks.tar.gz",
    "single_imgs.tar.gz",
    "blends.tar.gz",
]

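def main(output_dir, delete=False, full=False):
    """Download the mini dataset (or the full tarballs if `full`) into `output_dir`.

    With `delete=True`, each tarball is removed once it has been extracted.
    """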
    if full:
        files = BIG_FILES
    else:
        files = FILES

    urls = [
        f"{URL}/{FOLDER}/{filename}"
        for filename in files
    ]

    if not os.path.exists(output_dir):
        print(f"Creating directory {output_dir}")
        # makedirs also creates missing parent directories
        os.makedirs(output_dir, exist_ok=True)

    for url, filename in zip(urls, files):
        output_file = os.path.join(output_dir, filename)

        if os.path.exists(output_file):
            print(f"{filename} already downloaded.")
            continue

        print(f"Downloading from {url} ...")
        urlretrieve(url, filename=output_file)
        print(f"=> File saved as {output_file}")

        if filename.endswith("tar.gz"):
            print("Extracting tarball..")
            with tarfile.open(output_file, "r:gz") as f:
                f.extractall(output_dir)
            print("Done.")

            if delete:
                os.remove(output_file)
                print(f"=> File {output_file} removed.")



In [None]:
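# Download the mini dataset
main(output_dir='data', full=False)
# Uncomment to download the full dataset instead (tarballs are removed after extraction)
# main(output_dir='data', delete=True, full=True)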

## Load data

In [1]:
import os

import numpy as np
import matplotlib.pyplot as plt

from dltools.detector import ObjectDetector
%matplotlib inline

Using TensorFlow backend.


In [None]:
datadir = "data"
suffix = "_mini"
# suffix = ""

In [None]:
X_train = np.load(os.path.join(datadir, f"train_blends{suffix}.npy"), mmap_mode='r')
Y_train = np.load(os.path.join(datadir, f"train_target_masks{suffix}.npy"), mmap_mode='r')

X_test = np.load(os.path.join(datadir, f"test_blends{suffix}.npy"), mmap_mode='r')
Y_test = np.load(os.path.join(datadir, f"test_target_masks{suffix}.npy"), mmap_mode='r')
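The arrays above are opened with `mmap_mode='r'`, so NumPy maps each file read-only instead of reading it fully into memory. The sketch below illustrates that behaviour on a small temporary file. The file name `toy_blends.npy` and the `(10, 128, 128, 1)` shape are assumptions made for the example, not properties of the course data.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for a dataset file: 10 images of 128 x 128 x 1 pixels
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "toy_blends.npy")
np.save(path, np.zeros((10, 128, 128, 1), dtype=np.float32))

# mmap_mode="r" maps the file read-only instead of loading it into memory
X = np.load(path, mmap_mode="r")
is_memmap = isinstance(X, np.memmap)
is_writeable = X.flags.writeable

# Slicing, then copying with np.array, brings only that batch into memory
batch = np.array(X[:4])
```

Because the mapped array is read-only, any in-place preprocessing has to be applied to an in-memory copy such as `batch`.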

## Model

In [None]:
# CHANGE HERE
# ===========
filename = "alex_testfcnn.py"

from keras.models import Sequential, Model
from keras.layers import (
    Conv2D,
    Dropout,
    Input,
    MaxPooling2D,
    concatenate,
    Conv2DTranspose,
    UpSampling2D,
)
# GaussianNoise is exported by keras.layers; the keras.layers.noise path is missing in recent Keras
from keras.layers import GaussianNoise


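def model():
    """Build a plain fully convolutional network.

    `n_layers` 3x3 convolutions with "same" padding keep the 128 x 128 size,
    and a final sigmoid convolution outputs one value in [0, 1] per pixel.
    """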
    input_shape = (128, 128, 1)
    output_channels = 1
    depth = 16
    n_layers = 6
    conv_size0 = (3, 3)
    conv_size = (3, 3)
    last_conv_size = (3, 3)
    activation = "relu"
    last_activation = "sigmoid"
    dropout_rate = 0
    sigma_noise = 0.01
    initialization = "he_normal"

    model = Sequential()
    model.add(
        Conv2D(
            depth,
            conv_size0,
            input_shape=input_shape,
            activation=activation,
            padding="same",
            name="conv0",
            kernel_initializer=initialization,
        )
    )
    if dropout_rate > 0:
        model.add(Dropout(dropout_rate))

    for layer_n in range(1, n_layers):
        model.add(
            Conv2D(
                depth,
                conv_size,
                activation=activation,
                padding="same",
                name="conv{}".format(layer_n),
                kernel_initializer=initialization,
            )
        )
        if dropout_rate > 0:
            model.add(Dropout(dropout_rate))

    if sigma_noise > 0:
        model.add(GaussianNoise(sigma_noise))

    model.add(
        Conv2D(
            output_channels,
            last_conv_size,
            activation=last_activation,
            padding="same",
            name="last",
            kernel_initializer=initialization,
        )
    )

    return model
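As a sanity check on the architecture above, the number of trainable parameters and the receptive field can be worked out by hand. The sketch below uses the standard formula for a 2D convolution with bias and the hyperparameters set in `model()`. `conv2d_params` is a hypothetical helper, not part of Keras or `dltools`.

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Weights plus one bias per output channel of a 2D convolution."""
    kh, kw = kernel_size
    return (kh * kw * in_channels + 1) * out_channels


# Hyperparameters copied from model()
depth, n_layers, kernel = 16, 6, (3, 3)

n_params = conv2d_params(kernel, 1, depth)                        # conv0
n_params += (n_layers - 1) * conv2d_params(kernel, depth, depth)  # conv1..conv5
n_params += conv2d_params(kernel, depth, 1)                       # last

# Dropout and GaussianNoise add no trainable parameters.
# Each stride-1 3x3 convolution widens the receptive field by 2 pixels.
n_convs = n_layers + 1
receptive_field = 1 + n_convs * (kernel[0] - 1)

print(n_params, receptive_field)  # 11905 15
```

With a receptive field of 15 pixels on a 128 x 128 image, each output pixel only sees a small neighbourhood. This is one motivation for trying the pooling and upsampling layers already imported in the cell above.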

## Main

In [None]:
modulename = os.path.splitext(filename)[0]

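# Expected file name pattern: <binome>_<submission>.py
binome, submission = modulename.split('_', 1)

# exist_ok avoids an error when this cell is run again
os.makedirs(modulename, exist_ok=True)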
history_file = os.path.join(modulename, f"{modulename}_history.png")
sample_file = os.path.join(modulename, f"{modulename}_image_sample.png")

print(f"\nTraining submission {submission} by {binome}...\n")

obj = ObjectDetector(model(), batch_size=16, epoch=10, model_check_point=False)

In [None]:
print("Training...")
history = obj.fit(X_train, Y_train)

In [6]:
print("Testing...")
score = obj.predict_score(X_test.squeeze(), Y_test)

print(f"Score: {score:.2f}")
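The metric behind `predict_score` is defined inside `dltools` and is not shown in this notebook. As an assumption for illustration only, a common way to score a predicted segmentation mask against a target is the intersection over union (IoU). The `iou` function and its `threshold` parameter below are hypothetical and not the `dltools` implementation.

```python
import numpy as np


def iou(pred, target, threshold=0.5):
    """Intersection over union between a thresholded prediction and a binary mask."""
    pred_mask = np.asarray(pred) > threshold
    target_mask = np.asarray(target) > 0.5
    intersection = np.logical_and(pred_mask, target_mask).sum()
    union = np.logical_or(pred_mask, target_mask).sum()
    # Two empty masks agree perfectly by convention
    return 1.0 if union == 0 else intersection / union


# Toy 2 x 2 example: sigmoid-like outputs against a binary target
pred = np.array([[0.9, 0.8], [0.2, 0.1]])
target = np.array([[1, 0], [1, 0]])
toy_score = iou(pred, target)  # 1 shared pixel out of 3 in the union
```

An IoU of 1 means the predicted mask matches the target exactly, while 0 means the two masks do not overlap at all.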