<a href="https://colab.research.google.com/github/aritraghsh09/GaMorNet/blob/master/tutorials/gamornet_train_tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Google Colab Stuff

Although this tutorial can be run on any machine which has GaMorNet installed, it's pretty handy to run this on Google Colab as you can easily use Colab's GPUs for this tutorial.

Note that the free version of Colab provides only a limited amount of memory. Thus, the number of images we use here for training/testing is very small and serves only as a demonstration. In reality, GaMorNet can handle hundreds of thousands of images.

This first section is meant to be run only when following this tutorial in Google Colab.


### Make things Fast!

Before we dive in, let's make sure we're using a GPU for this tutorial.  

To do this, select "Runtime" -> "Change runtime type" -> "Hardware accelerator" -> "GPU".

The following snippet will verify that we have access to a GPU.

In [1]:
import os
# Suppressing TF warnings and info for a cleaner environ
# Set this to 0,1 for info and warnings respectively.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 
 
# Magic telling Colab we want TF version ~=1.0
%tensorflow_version 1.x

#Checking access to GPU
import tensorflow as tf
if tf.test.gpu_device_name() != '/device:GPU:0':
  print('WARNING: GPU device not found.')
else:
  print('SUCCESS: Found GPU: {}'.format(tf.test.gpu_device_name()))

TensorFlow 1.x selected.
SUCCESS: Found GPU: /device:GPU:0


### Install GaMorNet

In [2]:
!pip install -q --upgrade gamornet

  Building wheel for wget (setup.py) ... done
  Building wheel for gast (setup.py) ... done


In [3]:
##Checking which version of Tensorflow is being used and whether the installation worked.
import tensorflow as tf
print(tf.__version__)
from gamornet.keras_module import gamornet_train_keras, gamornet_tl_keras, gamornet_predict_keras
from gamornet.tflearn_module import gamornet_train_tflearn, gamornet_tl_tflearn, gamornet_predict_tflearn

1.15.2


Using TensorFlow backend.










# Reference

All mentions of "the paper" in this tutorial refer to [Ghosh et al. (2020)](https://iopscience.iop.org/article/10.3847/1538-4357/ab8a47).

# Installing Libraries Needed for this Tutorial

In [4]:
!pip install matplotlib
!pip install astropy
!pip install numpy



# Training with GaMorNet

GaMorNet can quite easily be trained from scratch using images. 

In this demonstration, we will use 90 simulated SDSS images for the purposes of training and 10 simulated SDSS images for validation. All these simulated images come from the set of simulated galaxies created for the paper. 

All these images contain disk + bulge components. As described in the paper, we have also convolved these simulations with a representative PSF and added representative noise. 

# Downloading the Data

First, let's download the images that we are going to use to train GaMorNet. We will download these into the local filesystem from Yale Astronomy's FTP service, where these are hosted.

We are going to download all 100 images (90+10) as a single archive and then extract it into a single folder called `train_images`. The images are in the FITS format and are named `output_img_xx.fits`, where xx runs from 0 to 99.

We are also going to download the `sim_para.txt` file containing the ground-truth parameters for the above galaxies. Using these values, we are going to calculate the bulge-to-total light ratio of each galaxy and determine the labels to be used during the training process. 


*Tip: The `%%bash` cell magic tells Colab to pass all the commands in this cell to the shell of the local Unix virtual environment.*

*Tip: To view the files in use on Colab, click the folder icon on the left sidebar.*

In [5]:
%%bash
#get zip and txt file from server
wget ftp://ftp.astro.yale.edu/pub/aghosh/gamornet_tutorial_files/train_images/training_imgs.tar.gz
wget ftp://ftp.astro.yale.edu/pub/aghosh/gamornet_tutorial_files/train_images/sim_para.txt

#Unzip the Archive
tar -xvf training_imgs.tar.gz

./train_images/output_img_0.fits
./train_images/output_img_10.fits
./train_images/output_img_11.fits
./train_images/output_img_12.fits
./train_images/output_img_13.fits
./train_images/output_img_14.fits
./train_images/output_img_15.fits
./train_images/output_img_16.fits
./train_images/output_img_17.fits
./train_images/output_img_18.fits
./train_images/output_img_19.fits
./train_images/output_img_1.fits
./train_images/output_img_20.fits
./train_images/output_img_21.fits
./train_images/output_img_22.fits
./train_images/output_img_23.fits
./train_images/output_img_24.fits
./train_images/output_img_25.fits
./train_images/output_img_26.fits
./train_images/output_img_27.fits
./train_images/output_img_28.fits
./train_images/output_img_29.fits
./train_images/output_img_2.fits
./train_images/output_img_30.fits
./train_images/output_img_31.fits
./train_images/output_img_32.fits
./train_images/output_img_33.fits
./train_images/output_img_34.fits
./train_images/output_img_35.fits
./train_images/ou

--2020-06-12 03:57:22--  ftp://ftp.astro.yale.edu/pub/aghosh/gamornet_tutorial_files/train_images/training_imgs.tar.gz
           => ‘training_imgs.tar.gz’
Resolving ftp.astro.yale.edu (ftp.astro.yale.edu)... 128.36.139.12
Connecting to ftp.astro.yale.edu (ftp.astro.yale.edu)|128.36.139.12|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /pub/aghosh/gamornet_tutorial_files/train_images ... done.
==> SIZE training_imgs.tar.gz ... 13849775
==> PASV ... done.    ==> RETR training_imgs.tar.gz ... done.
Length: 13849775 (13M) (unauthoritative)

     0K .......... .......... .......... .......... ..........  0%  103K 2m11s
    50K .......... .......... .......... .......... ..........  0%  221K 95s
   100K .......... .......... .......... .......... ..........  1%  213K 84s
   150K .......... .......... .......... .......... ..........  1%  197M 63s
   200K .......... .......... .......... .......... ..........
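
Before moving on, it can be worth confirming that the archive extracted completely. The following is a minimal sketch, assuming the images ended up in `./train_images/` and follow the `output_img_xx.fits` naming described above. The helper name `missing_images` is ours and is not part of GaMorNet.

```python
from pathlib import Path


def missing_images(folder="./train_images", n_images=100):
    """Return the indices xx for which output_img_xx.fits is absent from folder."""
    folder = Path(folder)
    return [i for i in range(n_images)
            if not (folder / "output_img_{}.fits".format(i)).is_file()]


if __name__ == "__main__":
    missing = missing_images()
    if missing:
        print("Missing image indices:", missing)
    else:
        print("All expected images are present.")
```

An empty list means every expected file was found; any listed index points to an image that should be downloaded or extracted again.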

# Preparing the Data

In this section, we will generate the training and validation image arrays as well as the corresponding labels to be used during the training process.


First, let's read in the `.txt` file and calculate the difference in disk and bulge magnitudes for each of the galaxies.

In [0]:
import numpy as np

#Let's read in the sim_para.txt file 
gal_para = np.genfromtxt("./sim_para.txt",names=True,usecols = (4,11))
 
#difference b/w the integrated magnitude of the disk and bulge components. 
#The rows in the file and thus the elements in the array below correspond to
#the numbers in the names of the image files. (i.e. the 0th element corresponds
#to output_img_0.fits)
disk_bulge_mag = gal_para["Inte_Mag"] - gal_para["Inte_Mag_2"]

Next, let's define two convenience functions, which will assist us in creating the image and label arrays.

In [0]:
# Convenience Function to get and return images as numpy arrays
import numpy as np
from astropy.io import fits

def image_handler(i):
  #We use the reshape command just to add the extra 3rd dimension. The image is 
  #originally 167*167. So, in essence no re-sizing is taking place in the X or Y
  #directions.
  return np.reshape(fits.getdata("./train_images/output_img_"+str(i)+".fits",
                                 memmap=False),(167,167,1))


# Convenience Function to get and return the training labels of each galaxy
# in the one-hot encoding format, i.e. disk-dominated galaxies will be represented
# by the array [1,0,0], bulge-dominated by [0,0,1] and indeterminate by [0,1,0]

def label_handler(i):

  target_vect = [0]*3

  if (disk_bulge_mag[i] < -0.22): #  (Lb/LT) < 0.45
    target_vect[0] = 1  #disk-dominated

  elif ( -0.22 <=  disk_bulge_mag[i] <= 0.22):
    target_vect[1] = 1 #indeterminate

  else: #  (Lb/LT) > 0.55
    target_vect[2] = 1 #bulge-dominated

  return target_vect
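
The ±0.22 magnitude thresholds in `label_handler` correspond to bulge-to-total light ratios of roughly 0.45 and 0.55. The sketch below shows the conversion, assuming that `disk_bulge_mag` is the disk magnitude minus the bulge magnitude and using the standard relation between a magnitude difference and a luminosity ratio. The function name `bulge_to_total` is ours, introduced only for illustration.

```python
import numpy as np


def bulge_to_total(disk_minus_bulge_mag):
    """Convert (disk magnitude - bulge magnitude) into L_bulge / L_total.

    Uses m_disk - m_bulge = -2.5 * log10(L_disk / L_bulge), so
    L_disk / L_bulge = 10 ** (-0.4 * (m_disk - m_bulge)).
    """
    delta = np.asarray(disk_minus_bulge_mag, dtype=float)
    disk_to_bulge = 10.0 ** (-0.4 * delta)
    # L_bulge / (L_bulge + L_disk) = 1 / (1 + L_disk / L_bulge)
    return 1.0 / (1.0 + disk_to_bulge)


# The two thresholds used in label_handler, plus the equal-light case
for delta in (-0.22, 0.0, 0.22):
    ratio = float(bulge_to_total(delta))
    print("delta = {:+.2f} mag -> Lb/LT = {:.3f}".format(delta, ratio))
```

A difference of 0 mag means the two components are equally bright, which gives a ratio of exactly 0.5. The function also accepts arrays, so it can be applied to `disk_bulge_mag` directly.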

Now, we are going to use the first 90 images to create the training set and the last 10 to create the validation set. We parallelize the process below using a pool of worker processes. Although this is overkill for 100 images, it's very handy when dealing with large numbers of images.

In [0]:
from multiprocessing import Pool
import numpy as np
from astropy.io import fits

NUM_THREADS = 2

# The with-block shuts down the worker processes once all the arrays are built
with Pool(NUM_THREADS) as pl:
  training_imgs = np.array(pl.map(image_handler,range(0,90)))
  training_labels = np.array(pl.map(label_handler,range(0,90)))

  valdiation_imgs = np.array(pl.map(image_handler,range(90,100)))
  validation_labels = np.array(pl.map(label_handler,range(90,100)))
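
With so few images, it is worth checking how many galaxies fall into each class before training. The sketch below counts the classes in a one-hot label array; the class order follows `label_handler`, and the helper name `class_counts` is ours rather than part of GaMorNet. The small `example_labels` list is a stand-in used only to keep the snippet self-contained.

```python
import numpy as np

CLASS_NAMES = ("disk-dominated", "indeterminate", "bulge-dominated")


def class_counts(one_hot_labels):
    """Count how many one-hot rows belong to each of the three classes."""
    labels = np.asarray(one_hot_labels)
    class_indices = np.argmax(labels, axis=1)  # position of the 1 in each row
    counts = np.bincount(class_indices, minlength=len(CLASS_NAMES))
    return dict(zip(CLASS_NAMES, counts.tolist()))


# Stand-in labels for illustration; in this tutorial, pass training_labels instead.
example_labels = [[1, 0, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0]]
print(class_counts(example_labels))
```

A heavily skewed count is a hint that accuracy alone may be a misleading metric for the trained model.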

# Training GaMorNet using Keras

Now, we will use the images and the labels generated above to train GaMorNet.

In [9]:
from gamornet.keras_module import gamornet_train_keras

model = gamornet_train_keras(training_imgs,training_labels,valdiation_imgs,
                             validation_labels,input_shape='SDSS', epochs=50, 
                             checkpoint_freq=25, batch_size=64, lr=0.0001, 
                             loss='categorical_crossentropy')

Instructions for updating:
If using Keras pass *_constraint arguments to layers.


Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where



Train on 90 samples, validate on 10 samples
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50

Epoch 00025: saving model to ./model_25.hdf5
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50

Epoch 00050: saving model to ./model_50.hdf5


The above command trains a model for 50 epochs on the images we prepared, using a learning rate of 0.0001 and a categorical cross-entropy loss function. The `checkpoint_freq = 25` parameter ensures that a snapshot of the model is saved every 25 epochs. These snapshots are named `model_x.hdf5`, where x refers to the epoch at which the model was saved. The `input_shape` parameter specifies the shape of the input images; setting it to `SDSS` automatically sets the shape to `(167,167,1)`.

For an explanation of the different input parameters of `gamornet_train_keras`, please have a look at the [API documentation](https://gamornet.readthedocs.io/en/latest/api_docs.html).

In the output above, the `accuracy` and `loss` refer to the metrics calculated on the training set at the end of each epoch while `val_loss` and `val_accuracy` refer to the metrics calculated on the validation data. 

You have now trained your first GaMorNet model! You can have a look at the model's structure using the command below.

In [10]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 42, 42, 96)        11712     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 21, 21, 96)        0         
_________________________________________________________________
local_response_normalization (None, 21, 21, 96)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 21, 21, 256)       614656    
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 11, 11, 256)       0         
_________________________________________________________________
local_response_normalization (None, 11, 11, 256)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 11, 11, 384)      

**Important:**
The above process also generates a `metrics.csv` file, which contains the loss and accuracy calculated on the validation as well as the training data. 

We highly recommend using the data in this file to check how the loss and accuracies vary with training. This is extremely helpful in judging whether the model was trained properly and sufficiently. 
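
One way to inspect this file is sketched below. It assumes that `metrics.csv` has a header row followed by purely numeric columns, and that one of the columns is called `val_loss`, as in the training output. Both are assumptions, so check the printed column names against your own file. The helper names `load_metrics` and `best_epoch` are ours.

```python
import csv
import os


def load_metrics(path="./metrics.csv"):
    """Read a metrics CSV into a dict mapping column name -> list of floats."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    if not rows:
        return {}
    return {name: [float(row[name]) for row in rows] for name in rows[0]}


def best_epoch(metrics, column="val_loss"):
    """Return (row_index, value) for the smallest entry in the given column."""
    values = metrics[column]
    index = min(range(len(values)), key=values.__getitem__)
    return index, values[index]


if __name__ == "__main__" and os.path.exists("./metrics.csv"):
    metrics = load_metrics("./metrics.csv")
    print("Columns found:", sorted(metrics))
    if "val_loss" in metrics:
        print("Lowest val_loss (row, value):", best_epoch(metrics, "val_loss"))
```

Each value in the returned dictionary is a plain list, so it can be passed directly to a plotting call such as matplotlib's `plt.plot` to compare the training and validation curves.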

# Training GaMorNet using TFLearn

Now, we will use the images and the labels generated above to train GaMorNet.

In [12]:
from gamornet.tflearn_module import gamornet_train_tflearn

model = gamornet_train_tflearn(training_imgs,training_labels,valdiation_imgs,
                             validation_labels,input_shape='SDSS', epochs=20, 
                             max_checkpoints=2, batch_size=64, lr=0.0001, 
                             loss='categorical_crossentropy',clear_session=True)

Training Step: 39  | total loss: 0.66875 | time: 0.059s
| Momentum | epoch: 020 | loss: 0.66875 - acc: 0.7354 -- iter: 64/90
Training Step: 40  | total loss: 0.70279 | time: 1.171s
| Momentum | epoch: 020 | loss: 0.70279 - acc: 0.7345 | val_loss: 0.30441 - val_acc: 1.0000 -- iter: 90/90
--


The above command trains a model for 20 epochs on the images we prepared, using a learning rate of 0.0001 and a categorical cross-entropy loss function. The `max_checkpoints = 2` parameter ensures that the latest 2 snapshots are always kept during training. Three files are saved for each snapshot, named `check-x.data`, `check-x.index` and `check-x.meta`, where x refers to the step number at which the model was saved. The `input_shape` parameter specifies the shape of the input images; setting it to `SDSS` automatically sets the shape to `(167,167,1)`.
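
Note that the step number in the checkpoint names counts mini-batch updates rather than epochs. The sketch below works through the arithmetic for this run, assuming that every epoch performs one update per batch, including the final partial batch. This is consistent with the output above, where epoch 20 ends at training step 40. The function name `total_training_steps` is ours.

```python
import math


def total_training_steps(n_samples, batch_size, epochs):
    """Number of mini-batch updates, assuming ceil(n_samples / batch_size) per epoch."""
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch * epochs


# 90 training images, batch size 64 and 20 epochs, as in the call above
print(total_training_steps(90, 64, 20))  # 40
```

With 90 images and a batch size of 64, each epoch consists of one full batch and one partial batch of 26 images, giving 2 steps per epoch.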

In the output above, the `acc` and `loss` refer to the accuracy and loss calculated on the training set at the end of each epoch while `val_loss` and `val_acc` refer to the metrics calculated on the validation data. 

The `clear_session = True` parameter value instructs GaMorNet to clear the TensorFlow graphs created earlier. We highly recommend setting `clear_session` to `True` in notebooks when using the `tflearn_module`, as training might otherwise fail.

For an explanation of the different input parameters of `gamornet_train_tflearn`, please have a look at the [API documentation](https://gamornet.readthedocs.io/en/latest/api_docs.html).

You have now trained a GaMorNet model using TFLearn!

--- 

**Tip:** Unlike the Keras module, the TFLearn module doesn't automatically save the metrics. Instead, you have to redirect the Python output to a file in order to keep track of the metrics.

When running a Python script, this can be done simply using `python script.py > out.txt`. This will save all the screen output in `out.txt`.

Thereafter, the following snippet of Python code can search for the relevant metrics in the screen output file.

```python
###################################
# accParser.py
#
# Takes tflearn screen output and extracts loss, acc and val_acc every epoch for visualization
####################################
import sys

if len(sys.argv) != 2:
    # Stop here: without the path argument there is nothing to parse
    sys.exit("Exiting Program....\nUsage: python accParser.py /path/to/screen/output")

dataPath = sys.argv[1]  # the first argument is the path to the screen grab of the TF Learn run

# The output path is the input path with its last 6 characters replaced by 'out.txt'
with open(dataPath, 'r') as dataFile, open(dataPath[:-6] + 'out.txt', 'w') as outFile:

    outFile.write("epoch loss acc val_acc\n")

    for line in dataFile:
        if 'val_acc' in line:
            words = line.split()

            # validation step: such lines end with "iter: <done>/<total>"
            if words[-2:-1] != ['iter:']:
                print("Something doesn't look right. Skipping an occurrence of val_acc")
                continue

            outFile.write(words[words.index("epoch:")+1] + " ")
            outFile.write(words[words.index("loss:")+1] + " ")
            outFile.write(words[words.index("acc:")+1] + " ")
            outFile.write(words[words.index("val_acc:")+1] + "\n")

```

**Important:** We highly recommend checking how the loss and accuracies vary with training. This is extremely helpful in judging whether the model was trained properly and sufficiently. 

# Summary & Takeaways

* `gamornet_train_keras` and `gamornet_train_tflearn` are the two functions that can be used to train GaMorNet models. 

* For understanding the differences between the Keras and TFLearn modules, please refer to the [PDR Handbook](https://gamornet.readthedocs.io/en/latest/usage_guide.html). 

* The [PDR Handbook](https://gamornet.readthedocs.io/en/latest/usage_guide.html) also contains advice about which situations warrant the training of models from scratch and in which cases you can use the models which we have released. 