# **DSFM Exercise**: Open-source - style transfer & time series

Creator: [Data Science for Managers - EPFL Program](https://www.dsfm.ch)  
Source:  [https://github.com/dsfm-org/code-bank.git](https://github.com/dsfm-org/code-bank.git)  
License: [MIT License](https://opensource.org/licenses/MIT). See open source [license](LICENSE) in the Code Bank repository. 

-------------

## Overview

In this exercise, we leverage open-source tools to show the power of re-using existing work from the data science community. We will (1) convert the style of an image using a pre-trained open-source model, and (2) use cutting-edge models for time series predictions.

__Main exercise 1__

This *neural style transfer* takes a *content image* and a *style reference image* (e.g. by Picasso, Kandinsky, Van Gogh). The goal is to "paint" the content image in the style of the reference image, using neural networks.

Original paper: *A Neural Algorithm of Artistic Style* by [Gatys et al. (2015)](https://arxiv.org/abs/1508.06576)

__Main exercise 2__

In the second part, we will explore how to build and manipulate a time series, train a forecasting model, and evaluate the predictions.

---------

## Part 0: Setup

In [None]:
# Imports
import os

# Style transfer
import tensorflow as tf
import tensorflow_hub as hub
import IPython.display as display
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
import PIL.Image
import time
import functools

# Load compressed models from tensorflow_hub
os.environ['TFHUB_MODEL_LOAD_FORMAT'] = 'COMPRESSED'

# Time series
import pandas as pd
from darts import TimeSeries
from darts.models import (
    NaiveDrift,
    Prophet,
    ExponentialSmoothing,
    AutoARIMA,
    Theta
)
from darts.metrics import mape, mase

# Define plotting format
mpl.rcParams['figure.figsize'] = (12, 12)
mpl.rcParams['axes.grid'] = False

import warnings
warnings.filterwarnings("ignore")
import logging
logging.disable(logging.CRITICAL)

In [None]:
# Helper functions
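def tensor_to_image(tensor):
    """Convert a float tensor with values in [0, 1] to a PIL image."""
    tensor = tensor*255
    tensor = np.array(tensor, dtype=np.uint8)
    # Drop the leading batch dimension if there is one
    if np.ndim(tensor)>3:
        assert tensor.shape[0] == 1
        tensor = tensor[0]
    return PIL.Image.fromarray(tensor)

def load_img(path_to_img):
    """Load an image as a float32 batch of one, with its longest side scaled to max_dim."""
    max_dim = 512
    img = tf.io.read_file(path_to_img)
    img = tf.image.decode_image(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)

    # Height and width only (the channel dimension is dropped)
    shape = tf.cast(tf.shape(img)[:-1], tf.float32)
    long_dim = max(shape)
    scale = max_dim / long_dim

    # Scale both sides by the same factor to preserve the aspect ratio
    new_shape = tf.cast(shape * scale, tf.int32)

    img = tf.image.resize(img, new_shape)
    # Add a batch dimension, as expected by the model
    img = img[tf.newaxis, :]
    return img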

def imshow(image, title=None):
    if len(image.shape) > 3:
        image = tf.squeeze(image, axis=0)

    plt.imshow(image)
    if title:
        plt.title(title)
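
The resizing step in `load_img()` keeps the aspect ratio: both sides are multiplied by the same factor so that the longer side ends up at `max_dim`. The arithmetic can be sketched in plain Python as follows. The function name `scaled_shape` is hypothetical and only serves this illustration, and because TensorFlow computes in float32, results for other image sizes may differ by a pixel.

```python
def scaled_shape(height, width, max_dim=512):
    """Return (new_height, new_width) with the longer side equal to max_dim."""
    scale = max_dim / max(height, width)
    # int() truncates toward zero, mirroring the tf.cast(..., tf.int32) step
    return int(height * scale), int(width * scale)

print(scaled_shape(768, 1024))  # (384, 512): the width is the long side
print(scaled_shape(512, 256))   # (512, 256): long side is already 512
```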

# **MAIN EXERCISE 1**

## Part 1: Upload image (optional)

Upload an image to the directory of this notebook. You can (1) upload an image from your computer or (2) copy an image from the web. We encourage the former – it's more fun.

**Q 1:** Upload an image titled `myImage.jpg`, replacing the existing `myImage.jpg`.

## Part 2: Choose style image

Now comes the creative part. Choose one of the available style reference images below.

**Q 1:** Define the path to the style image in the `/styles` directory. 

Hint: Create a variable called `style_path` to reference the style image you want to use.

**Q 2:** Show the style image.

Hint: Use the `load_img()` and `imshow()` helper functions.

## Part 3: Apply open-source model

We now download an open-source, pre-trained neural network to "paint" our image in the style above. The model is available on the TensorFlow Hub [here](https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2).

In [None]:
# Download the model
hub_model = hub.load('https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2')

**Q 1:** Apply the style to your image.

Hint: Call `hub_model()` with `content_image` and `style_image` as inputs.
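
One possible shape for this call is sketched below. It assumes that the model takes the content image first and the style image second, and that it returns a sequence whose first element is the stylized image batch; please check the model page to confirm. The wrapper `stylize` and the stand-in `fake_model` are hypothetical names, used so that the snippet runs without TensorFlow or a model download.

```python
def stylize(model, content_image, style_image):
    """Run a two-input style transfer model and return its first output."""
    outputs = model(content_image, style_image)
    return outputs[0]

# Stand-in so the snippet runs on its own. It is NOT the real network:
# it simply echoes the content input inside a list.
def fake_model(content, style):
    return [content]

result = stylize(fake_model, "content", "style")
print(result)  # content
```

In the notebook, the same wrapper would be called as `stylize(hub_model, content_image, style_image)`, and the result could be passed to `tensor_to_image()` or `imshow()`.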

**Q 2:** Plot the stylized image.

# **MAIN EXERCISE 2**

## Part 4: Load and prepare data

We will use the well-known [monthly airline passengers dataset](https://github.com/jbrownlee/Datasets/blob/master/monthly-airline-passengers.csv).

A `TimeSeries` simply represents a univariate or multivariate time series, with a proper time index. It is a wrapper around a `pandas.DataFrame`, and it can be built in a few different ways:
* From an entire Pandas `DataFrame` directly
* From a time index and an array of corresponding values
* From a subset of Pandas `DataFrame` columns, indicating which are the time column and the values columns. 

In [None]:
df = pd.read_csv('data/AirPassengers.csv', delimiter=",")
series = TimeSeries.from_dataframe(df, 'Month', ['#Passengers'])
mpl.rcParams['figure.figsize'] = (8, 8)
series.plot(grid=True, lw=3)

**Q 1:** Create a training and validation series and plot.

Let's split our `TimeSeries` into a training and a validation series. Note: in general, it is also good practice to keep a test series aside and never touch it until the end of the process. Here, we just build a training and a validation series for simplicity.

The training series will be a `TimeSeries` containing values until January 1958 (excluded), and the validation series a `TimeSeries` containing the rest:
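
The meaning of "excluded" can be illustrated with a plain Python sketch that does not depend on `darts`. The monthly index below spans January 1949 to December 1960, the values are placeholders rather than the real passenger counts, and the helper name `split_before` is chosen for this illustration only. `darts` offers a `TimeSeries.split_before()` method for the same purpose; check the documentation of your installed version for its exact arguments.

```python
from datetime import date

def split_before(points, cutoff):
    """Split (date, value) pairs into those strictly before cutoff and the rest."""
    train = [(d, v) for d, v in points if d < cutoff]
    val = [(d, v) for d, v in points if d >= cutoff]
    return train, val

# Assumed monthly index; the values are placeholders, not real data
months = [date(year, month, 1) for year in range(1949, 1961) for month in range(1, 13)]
points = [(d, float(i)) for i, d in enumerate(months)]

train, val = split_before(points, date(1958, 1, 1))
print(len(train), len(val))      # 108 36
print(train[-1][0], val[0][0])   # 1957-12-01 1958-01-01
```

The last training observation is December 1957, and January 1958 is the first validation observation.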

## Part 5: Fit different time series models

`darts` is built to make it easy to train and validate several models in a unified way. Let's train a few models and compute their respective mean absolute percentage error (MAPE) on the validation set.
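
As a reminder of what the metric measures, MAPE is the average of |actual - predicted| / |actual| over all time steps, expressed as a percentage. Below is a minimal plain Python sketch of that definition. The name `mape_percent` is hypothetical, chosen so that it does not shadow the `mape` function imported from `darts.metrics`, and the numbers are made up for illustration.

```python
def mape_percent(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Illustrative numbers only: relative errors of 10%, 10% and 0%
score = mape_percent([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
print(round(score, 2))  # 6.67
```

A lower MAPE means the forecast is closer to the actual values, which is how the models are ranked in this part.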

**Q 1:** Evaluate the following time series models: `NaiveDrift()`, `ExponentialSmoothing()`, `Prophet()`, `AutoARIMA()`, `Theta()`.

Hint 1: The above models are all classes built into `darts` and were imported in Part 0.

Hint 2: Write a model evaluation helper function that takes one of the above models as an input.
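
One way to structure such a helper is sketched below. It assumes only that the model exposes `fit(series)` and `predict(n)`, which is the pattern `darts` forecasting models follow, and that the metric is called as `metric(actual, predicted)`. The names `eval_model`, `LastValueModel` and `mean_abs_pct_error` are hypothetical stand-ins that work on plain lists so that the snippet runs without `darts`.

```python
def eval_model(model, train, val, metric):
    """Fit on train, forecast len(val) steps ahead, and score against val."""
    model.fit(train)
    forecast = model.predict(len(val))
    return metric(val, forecast)

class LastValueModel:
    """Stand-in model (not part of darts): repeats the last training value."""
    def fit(self, series):
        self.last = series[-1]
        return self

    def predict(self, n):
        return [self.last] * n

def mean_abs_pct_error(actual, predicted):
    """Stand-in metric on plain lists, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Made-up numbers for illustration
score = eval_model(LastValueModel(), [100.0, 110.0, 120.0], [120.0, 150.0], mean_abs_pct_error)
print(f"MAPE: {score:.2f}%")
```

With the notebook objects, the same helper would be called once per model, for example with `NaiveDrift()` as the model, the training and validation `TimeSeries` as data, and the imported `mape` as the metric.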

Here, we only built these models with their default parameters. We can probably do better if we fine-tune model-specific parameters to our problem. We skip this step here, but encourage you to try it out yourself and see by how much you can improve model performance.

## Part 6: Plot the best model

Finally, we plot how well the predictions fit the actual values in the validation set.

**Q 1:** Re-fit the best performing model from the preceding part and save the predictions in a variable, so we can plot the predictions later.

**Q 2:** Plot predicted vs. actual values in the validation dataset.