# Section 13

## Preliminaries
### Data Sources
This repository contains three files, `helical.fits`, `starData.pkl`, and `galaxyData.pkl`, which you will need for the exercises in this section.

You will use the `tensorflow` framework extensively in this section. Run the following cell to enable `tensorflow`'s _eager_ execution mode. This has a small impact on training performance, but it will enable you to convert tensors into `numpy` arrays and plot their contents. The cell applies to TensorFlow 1.x only: in TensorFlow 2.x, eager execution is enabled by default and the top-level `tf.enable_eager_execution()` function no longer exists. See the [online documentation for eager execution](https://www.tensorflow.org/guide/eager) for more information.

In [None]:
import tensorflow as tf
tf.enable_eager_execution()

## Assignment

### Part I: AST4031 and AST5031

In this part we will construct a fully connected neural network **similar** to the one that was shown in the _Live Demo_ during the lecture. We will also construct loss and accuracy curves to determine whether overfitting occurs during training.

1. Open the `helical.fits` file. The file contains a table with three columns of data to use for training and three columns to use for validation. Plot the training and validation data separately on the same axes in 3D.
2. Using the approach that is demonstrated in the `LiveDemo.ipynb` file, construct two `tensorflow.data.Dataset` objects: one containing the training data and one containing the validation data. Remember that we will treat the `Z` coordinates as _labels_ and the `X` and `Y` coordinates as _features_.
3. Configure the dataset objects to **shuffle** the data they contain and to **repeatedly iterate** over those data as many times as required. Also configure both datasets to provide examples in **batches of 8**.
4. Construct a fully connected neural network using the `tensorflow.keras.Sequential` class to fit our training data. For the input layer, specify **50** units using a **sigmoid** activation function. Add one hidden layer with **100** units and an activation function of your choice. For the output layer, choose an **appropriate** number of units and an **appropriate** activation function.
5. Explain why you chose the activation functions and output unit counts that you did in _Question 4_. How many trainable parameters does your model have?
6. Compile and train the model. Choose an appropriate loss function and number of training epochs. Use the `adam` optimizer and compute the `accuracy` metric. Don't forget to pass the validation dataset to the `fit()` method. The `fit()` method returns a `tensorflow.keras.callbacks.History` object. Keep a reference to this returned object! You will use it to determine whether your model has overfit the training data.
7. The `tensorflow.keras.callbacks.History` object that was returned by the `fit()` method has an attribute named `history`. The `history` attribute is a Python `dict` that contains several `numpy` arrays containing the computed `loss` and `acc`(uracy) values for each training epoch for the training and validation data sets. 
    1. Plot the _loss_ for the training and validation datasets on the same axes. If the loss data are noisy, you can smooth them using the `scipy.interpolate.UnivariateSpline` class before plotting.
    2. Now, in a separate figure, plot the _accuracy_ for the training and validation datasets on the same axes.
    3. Did your model overfit the training data? Explain your answer. What steps could you take to mitigate any overfitting that did occur?
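
The overfitting check in _Question 7_ can be sketched in plain Python. This is a minimal illustration, not the method from the lecture: the loss values are made up, the helper names `moving_average` and `first_overfit_epoch` are hypothetical, and a simple moving average stands in for the spline smoothing mentioned above. It assumes the `history` dictionary provides the `loss` and `val_loss` keys, which Keras records when validation data are passed to `fit()`.

```python
def moving_average(values, window=3):
    """Smooth a sequence with a trailing moving average.

    The first few points use a shorter window so the output has the
    same length as the input.
    """
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def first_overfit_epoch(train_loss, val_loss, patience=2):
    """Return the 0-based epoch of the lowest validation loss if the
    validation loss has not improved for at least `patience` later epochs
    while the training loss kept falling. Return None otherwise.
    """
    best_epoch = min(range(len(val_loss)), key=lambda i: val_loss[i])
    epochs_since_best = len(val_loss) - 1 - best_epoch
    training_still_improving = train_loss[-1] < train_loss[best_epoch]
    if epochs_since_best >= patience and training_still_improving:
        return best_epoch
    return None


# Made-up values shaped like the `history` attribute (not real results).
history = {
    "loss": [1.0, 0.6, 0.4, 0.3, 0.2, 0.1],
    "val_loss": [1.1, 0.7, 0.5, 0.55, 0.6, 0.7],
}

smoothed_val_loss = moving_average(history["val_loss"], window=2)
suspect_epoch = first_overfit_epoch(history["loss"], history["val_loss"])
print("Validation loss was lowest at epoch index:", suspect_epoch)
```

A training loss that keeps falling while the validation loss turns upward is the usual signature of overfitting; the `patience` threshold is an arbitrary choice made for this sketch.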

### Part II: AST4031 and AST5031
In this part, we will train a _Convolutional Neural Network_ to distinguish between images of stars and galaxies that were observed as part of the Sloan Digital Sky Survey. 

The image data are stored as lists of pre-prepared _tensors_ in two `pickle` files named `starData.pkl` and `galaxyData.pkl`. To save time, you can use this function to load the data from the files.

In [2]:
def loadTensorsFromFile(path):
    import pickle
    import os
    
    if os.path.exists(path):
        with open(path, mode='rb') as pickleFile:
            return pickle.load(pickleFile)
    raise RuntimeError('The specified path does not exist')            
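
To try the loading logic without the real data files, the following sketch writes a small pickle file to a temporary directory and reads it back the same way. Everything here is illustrative: `load_pickled_list` is a hypothetical stand-in that mirrors the function above, and the nested lists are made-up data, not the format of `starData.pkl` or `galaxyData.pkl`.

```python
import os
import pickle
import tempfile


def load_pickled_list(path):
    """Hypothetical stand-in mirroring the loader above: unpickle `path`."""
    if os.path.exists(path):
        with open(path, mode="rb") as pickle_file:
            return pickle.load(pickle_file)
    raise RuntimeError("The specified path does not exist")


# Made-up data: four 2x2 single-channel "images" stored as nested lists.
fake_images = [[[[0.0], [1.0]], [[2.0], [3.0]]] for _ in range(4)]

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write the fake data, then read it back through the loader.
    fake_path = os.path.join(tmp_dir, "fakeData.pkl")
    with open(fake_path, mode="wb") as pickle_file:
        pickle.dump(fake_images, pickle_file)
    loaded_images = load_pickled_list(fake_path)

    # A path that does not exist should raise RuntimeError.
    try:
        load_pickled_list(os.path.join(tmp_dir, "missing.pkl"))
        missing_raised = False
    except RuntimeError:
        missing_raised = True

print("Number of images loaded:", len(loaded_images))
```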

1. Load the image tensors for the stars and galaxies. Use the `print()` function to determine the **shapes** of the image tensors.
2. Use the data you loaded to construct training and validation `tensorflow.data.Dataset` objects for each object type.
    1. Use 1/4 of the images in each category as validation data and use the remainder for training.
    2. Generate lists of labels for both object types. For the stars generate lists of zeros to act as labels. For the galaxies generate lists of ones. 
    3. As we did in _Part I_, you can supply a tuple with a matched list of features (our image tensors) and labels to the `from_tensor_slices()` function. You should not need to transpose the feature data this time. 
3. Configure all your datasets to **shuffle** their data and to **repeatedly iterate** over those data as many times as required. Configure all your datasets to deliver examples in batches of **10 images**.
4. Review the online documentation for the [`tensorflow.keras.layers.Conv2D` layer class](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D), the [`tensorflow.keras.layers.MaxPool2D` layer class](https://www.tensorflow.org/api_docs/python/tf/keras/layers/MaxPool2D) and the [`tensorflow.keras.layers.Flatten` layer class](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Flatten). Use the `tensorflow.keras.Sequential` class to construct a simple convolutional neural network.
    1. Use a `tensorflow.keras.layers.Conv2D` as the input layer for the network. You will need to specify an appropriate value for the `input_shape` argument. Use **20**, **10x10** filters (kernels) and a `relu` activation function.
    2. After the convolutional layer, add a `tensorflow.keras.layers.MaxPool2D` layer with a pool size of **5x5** pixels.
    3. The convolution and max-pooling layers will form the feature extraction component of our network. Now we need to pass the extracted features to a fully connected network to perform the star/galaxy classification. To achieve this we must _flatten_ the tensor output of the max-pooling layer. Add a `tensorflow.keras.layers.Flatten` layer to the network. This will be the final layer in the feature extraction section of our network.
    4. Now add a fully connected layer with 128 units. You are free to choose an appropriate activation function.
    5. Add an output layer with an appropriate unit count and activation function. Remember that we would like the output of our network to separate stars and galaxies into **two** categories.
    6. Compile and train your model. An appropriate loss function is [`binary_crossentropy`](https://towardsdatascience.com/understanding-binary-cross-entropy-log-loss-a-visual-explanation-a3ac6025181a). You should only need to train for about 5 epochs to achieve reasonable accuracy.
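
Before building the network, it can help to work out by hand how the tensor shape changes through the feature extraction layers. The sketch below does that arithmetic in plain Python. It assumes the Keras defaults for these layers (stride 1 and no padding for `Conv2D`, stride equal to the pool size and no padding for `MaxPool2D`). The 64x64x3 input shape is a made-up placeholder, not the shape of the survey images, and the helper names are hypothetical.

```python
def conv2d_output_shape(height, width, kernel, filters):
    """Output shape of a stride-1 convolution without padding."""
    return (height - kernel + 1, width - kernel + 1, filters)


def max_pool_output_shape(height, width, channels, pool):
    """Output shape of non-overlapping max pooling without padding."""
    return (height // pool, width // pool, channels)


def conv2d_param_count(kernel, in_channels, filters):
    """Each filter has kernel*kernel*in_channels weights plus one bias."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_param_count(inputs, units):
    """Each unit has one weight per input plus one bias."""
    return (inputs + 1) * units


# Placeholder input shape (height, width, channels); replace it with the
# shape you printed in Question 1.
input_shape = (64, 64, 3)

conv_shape = conv2d_output_shape(input_shape[0], input_shape[1], kernel=10, filters=20)
pool_shape = max_pool_output_shape(*conv_shape, pool=5)
flat_size = pool_shape[0] * pool_shape[1] * pool_shape[2]

print("After Conv2D:   ", conv_shape)
print("After MaxPool2D:", pool_shape)
print("After Flatten:  ", flat_size)
print("Conv2D parameters:       ", conv2d_param_count(10, input_shape[2], 20))
print("Dense(128) parameters:   ", dense_param_count(flat_size, 128))
```

Comparing these hand-computed numbers with the output of the model's `summary()` method is a quick way to confirm that the `input_shape` argument was specified as intended.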