
#  Train a Neural Image Field in Keras

In this tutorial you will learn how to train a simple neural network model using Keras on Graphcloud
IPUs. If you have any problems following this guide then please ask questions on the dedicated channel
in Graphcore’s Slack Community: https://graphcorecommunity.slack.com.

## Instructions

First you need to launch an IPU machine: click the 'start machine' button above. You will be prompted to log in or create an account (we recommend signing in with GitHub if you use it). Note that free machines can take up to ten minutes to start up.

![Start Machine Screenshot](images/start_machine_poplar.png)

Once the machine has launched you can run through the rest of the notebook.

## Training a Neural Image Field (NIF)

The model we will train is a co-ordinate network that reconstructs (and in effect compresses) an image using a small network of fully connected layers. We refer to this as a neural image field (NIF) because of the parallels with [neural radiance fields](https://arxiv.org/abs/2003.08934) (NeRF) and because there is no consistent term for it in the literature. The model has a simple MLP structure as shown below.

![NIF Model Architecture](images/nif/nif_architecture.png)

The training inputs are pairs of vectors: the input 2D image coordinates (u, v) (normalised to [0,1)) and the corresponding red, green, blue (r,g,b) value at that position in the image. In effect, the task is to regress a function:

$$
  f([u,v]) \rightarrow [r,g,b]
$$

This function allows the network to reconstruct the entire image (by feeding a batch of inputs that represents coordinates of a regular grid) or to sample the image at any sparse set of image coordinates. If you want to learn more about approximating images with neural networks we suggest referring to the paper [Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains](https://arxiv.org/abs/2006.10739) which has a good theoretical analysis of the problem.
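
The two ways of querying the function described above (a regular grid for full reconstruction, or a sparse set of coordinates for training) can be sketched in NumPy. This is a minimal illustration, not the code used by `train_nif.py`: the helper names `make_uv_grid` and `sample_training_pairs` are hypothetical, and the random array stands in for a real image.

```python
import numpy as np

def make_uv_grid(height, width):
    """Return a (height * width, 2) array of (u, v) coordinates in [0, 1)."""
    # Dividing pixel indices by the image size keeps every coordinate below 1.
    v, u = np.meshgrid(np.arange(height) / height,
                       np.arange(width) / width, indexing="ij")
    # Row-major order, so predictions can be reshaped to (height, width, 3).
    return np.stack([u.ravel(), v.ravel()], axis=-1).astype(np.float32)

def sample_training_pairs(image, count, rng):
    """Pick `count` random pixels; return (u, v) inputs and (r, g, b) targets."""
    height, width, _ = image.shape
    rows = rng.integers(0, height, size=count)
    cols = rng.integers(0, width, size=count)
    uv = np.stack([cols / width, rows / height], axis=-1).astype(np.float32)
    rgb = image[rows, cols].astype(np.float32)
    return uv, rgb

rng = np.random.default_rng(0)
image = rng.random((4, 6, 3), dtype=np.float32)  # stand-in for a real image
grid = make_uv_grid(*image.shape[:2])
uv, rgb = sample_training_pairs(image, 5, rng)
print(grid.shape, uv.shape, rgb.shape)
```

Feeding `grid` to a trained network and reshaping the output to `(height, width, 3)` would give the full reconstruction, while `uv` and `rgb` form one batch of training pairs.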

### Clone Graphcore's GitHub Examples Repository

Graphcore's examples repository contains many models from its [model garden](https://www.graphcore.ai/resources/model-garden). Run the cell below to clone it and install the example's pip requirements. (Note: you can also launch a terminal using the button on the left and type these commands instead of running them in cells if you prefer):

In [None]:
!git clone https://github.com/graphcore/examples.git
%cd "examples/vision/neural_image_fields/tensorflow2"
!pip install -r requirements.txt

Let's run the NIF training script on the provided image to check that everything is working. The following will train for a small number of epochs on the example image. It should complete within a few minutes (or you can terminate it after a few epochs):

In [None]:
%set_env TF_CPP_MIN_LOG_LEVEL=3
%run train_nif.py --train-samples 1000000 --epochs 50 --input Mandrill_portrait_2_Berlin_Zoo.jpg --disable-psnr

Let's check that we can reconstruct the original image from the trained model:

In [None]:
%run predict_nif.py --model saved_model --output reconstruction.jpg --original Mandrill_portrait_2_Berlin_Zoo.jpg

You can open the reconstruction [reconstruction.jpg](examples/vision/neural_image_fields/tensorflow2/reconstruction.jpg) and compare it with the original image [Mandrill_portrait_2_Berlin_Zoo.jpg](examples/vision/neural_image_fields/tensorflow2/Mandrill_portrait_2_Berlin_Zoo.jpg).
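
If you would like a number to go with the visual comparison, peak signal-to-noise ratio (PSNR) is a common choice. The sketch below uses the standard textbook definition on synthetic data; the `psnr` helper is hypothetical and may not match how the training script computes its own PSNR.

```python
import numpy as np

def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in decibels; higher means a closer match."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstruction, dtype=np.float64)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))

# Synthetic stand-ins for the original and reconstructed 8-bit images.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
noise = rng.integers(-5, 6, size=original.shape)
noisy = np.clip(original.astype(np.int64) + noise, 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(original, noisy):.2f} dB")
```

To apply this to the real files you could load both images with `cv2.imread` and pass the resulting arrays to `psnr`, provided they have the same shape.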

### Train an HDRI Model

We now want to train the model for longer with a different type of input. The image used above was a low dynamic range image (i.e. each colour channel is stored in an 8-bit unsigned integer). The image was also compressed with lossy compression (JPG). We will now train the same network with a high-dynamic-range image (HDRI) where each colour channel is stored in 32-bit floating point and the image has been compressed losslessly (if at all). Such images are often used in rendering and later we will use this neural network as part of a simple “neural” ray-tracing program that runs on the IPU (specifically a Monte-Carlo path-tracing program).
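
To see why the storage format matters, the following sketch pushes a few made-up radiance values through 8-bit storage. It is a simplification (real LDR formats also apply a gamma encoding before quantising), and the values are hypothetical, but it shows how everything brighter than 1.0 collapses to the same code.

```python
import numpy as np

# Hypothetical linear radiance values; anything above 1.0 is "brighter than white".
radiance = np.array([0.001, 0.25, 1.0, 4.0, 60.0], dtype=np.float32)

# LDR storage: clip to [0, 1], then quantise to one of 256 levels.
ldr = np.round(np.clip(radiance, 0.0, 1.0) * 255.0).astype(np.uint8)

# Converting back to float shows what survived the round trip.
recovered = ldr.astype(np.float32) / 255.0

print("float32  :", radiance)
print("uint8    :", ldr)
print("recovered:", recovered)
```

The three brightest values become indistinguishable after the round trip, whereas the 32-bit floating point representation keeps them apart.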

First, we need to download a suitable image. I am going to use this image for illustration purposes: [polyhaven: studio small 09](https://polyhaven.com/a/studio_small_09). We can download the 2k version directly to our machine and begin training by running the cell below. Training will take about 10 minutes with this configuration:

In [None]:
!wget https://dl.polyhaven.org/file/ph-assets/HDRIs/exr/2k/studio_small_09_2k.exr -O studio_small_09_2k.exr
%run train_nif.py --train-samples 8000000 --epochs 100 --callback-period 20 --batch-size 1024 --layer-count 6 --layer-size 320 --input studio_small_09_2k.exr --model hdri_model

### Viewing the Result

Unlike the first training session, this one saved evaluation images as it went. You can download the reconstructed image with this link: [HDRI reconstruction](examples/vision/neural_image_fields/tensorflow2/hdri_model/tmp_eval_image.exr) (or using the file browser on the left). Because this image is in EXR format you will need to use an application that supports high dynamic range images, such as [TEV](https://github.com/Tom94/tev/releases) or [pfsview](https://pfstools.sourceforge.net/). You can also load the image and perform a simple tone-mapping for display in Python:

In [None]:
import matplotlib.pyplot as plt
import cv2
import numpy as np

# Function to apply simple gamma correction, rescale,
# and clip values into range 0-255:
def gamma_correct(x, exposure, gamma):
  scale = 2.0 ** exposure
  # Clamp negatives first: a fractional power of a negative value is NaN.
  y = np.power(np.maximum(x * scale, 0.0), 1.0 / gamma) * 255.0
  return np.clip(y, 0.0, 255.0)

# Function to plot an opencv image:
def display_image(img):
  plt.figure(figsize=(8, 8))
  plt.style.use('dark_background')
  # Use the function argument (not the global `ldr`); OpenCV stores images as BGR.
  plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB), interpolation='bicubic')
  plt.show()

EXR_FLAGS = cv2.IMREAD_UNCHANGED | cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH
hdr = cv2.imread('hdri_model/tmp_eval_image.exr', EXR_FLAGS)
print(f"HDR image shape: {hdr.shape} type: {hdr.dtype} min: {np.min(hdr)} max: {np.max(hdr)}")

ldr = gamma_correct(hdr, exposure=1.0, gamma=2.4).astype(np.uint8)
display_image(ldr)

You can play with the exposure settings to see that the neural network preserves the high dynamic range of the input image. For example, note the detail in the brightest regions has not been clipped:

In [None]:
ldr = gamma_correct(hdr, exposure=-3.0, gamma=1.6).astype(np.uint8)
display_image(ldr)
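
For reference, the tone-mapping applied by `gamma_correct` can be written as a single expression, where $x \ge 0$ is a linear HDR channel value, $e$ is the exposure in stops, and $\gamma$ is the gamma value:

$$
  y = \mathrm{clip}\left(255 \, (2^{e} x)^{1/\gamma},\ 0,\ 255\right)
$$

With $e = -3$ the scale factor is $2^{-3} = 1/8$, so input values up to $x = 8$ stay at or below the clipping threshold. This is why detail in the brightest regions becomes visible at lower exposures.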

### Comparing Training Time with a GPU

Paperspace also provides access to GPUs (including some on the free tier). You could try running the same training script on a GPU to compare the training time per epoch. To run on a GPU you only need to change the following in the training command:
 - Add `--no-ipu` to the training script options.
 - Disable parallel PSNR evaluation: `--disable-psnr` (otherwise the script would try to use multiple GPUs which you may not have available).

## ACA Workshop

If you are running this notebook because you plan to attend an ACA workshop then you can download and save the model folder: `examples/vision/neural_image_fields/tensorflow2/hdri_model`. You could also choose a different image from [Polyhaven](https://polyhaven.com) and build a NIF for that instead if you like. We will load your model for inference inside a "neural renderer". The path-tracer, implemented in Poplar C++, will query the neural network to calculate realistic lighting effects with results like this:

![Image Rendered with Neural Environment Lighting](images/nif/nif_render.png)

Don't worry, a pre-trained model will be provided for those who skipped this tutorial.