# Rendering with NeRF

## Introduction

### Preface

This task improves upon last year's delivery, "Rendering with NeRF" by Vegard Skui, by:

  1. Providing the task and resources as a Jupyter notebook, which makes it easier to access the data and perform the calculations. It also lets you adjust parameters to see how they affect the result.

  2. Providing more context on NeRFs and on what the task actually computes, so that its goal is clearer.

  3. Fixing an error in the suggested solution.

### What are NeRFs?

*This is a simplified explanation. For a more in-depth explanation, see the [original paper](https://arxiv.org/abs/2003.08934)*.

Essentially, NeRFs aim to "overfit" a neural net to a specific scene, allowing us to render the scene from any angle, including angles not previously seen. The neural net is trained to predict the color and density of a point in space, given its position and viewing direction. The training data is a set of images of the scene and the corresponding camera poses. For each pixel in the training images, a ray is cast through the scene, the predicted colors and densities along the ray are combined into a single rendered color, and the error between this rendered color and the actual pixel color is computed. The errors are summed over the pixels, and the weights of the neural net are updated to reduce the total. This is repeated until the error is sufficiently small.

### What are we doing here?

This task focuses on getting the RGB value of a pixel generated by a pretrained NeRF. Which scene the NeRF represents is not important for this task. NeRFs represent the scene with a pair of multilayer perceptrons (MLPs), which are essentially neural networks with at least three layers.

The first MLP takes in a 3D point in space, and outputs volume density $\sigma$ and a feature vector. This feature vector is then concatenated with the viewing direction, and fed into the second MLP, which outputs the RGB value $c$.
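To make the data flow concrete, here is a minimal sketch of the two-stage forward pass in NumPy. Everything in it is an assumption made for illustration: the layer sizes, the random (untrained) weights, the helper names `dense` and `nerf_forward`, and the ReLU and sigmoid activations. The original paper additionally applies a positional encoding to the inputs, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(n_in: int, n_out: int):
    # Hypothetical helper: random, untrained weights and zero bias for one layer
    return rng.normal(scale=0.5, size=(n_in, n_out)), np.zeros(n_out)


def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


FEATURES = 8  # assumed feature-vector size, chosen only for illustration

# First MLP: position (3) -> hidden -> [sigma, feature vector]
W1, b1 = dense(3, 16)
W2, b2 = dense(16, 1 + FEATURES)

# Second MLP: [feature vector, viewing direction (3)] -> hidden -> RGB
W3, b3 = dense(FEATURES + 3, 16)
W4, b4 = dense(16, 3)


def nerf_forward(position: np.ndarray, direction: np.ndarray):
    # First MLP: depends on the position only
    hidden = relu(position @ W1 + b1)
    out = hidden @ W2 + b2
    sigma = relu(out[0])  # density must be non-negative
    features = out[1:]

    # Second MLP: feature vector concatenated with the viewing direction
    hidden = relu(np.concatenate([features, direction]) @ W3 + b3)
    rgb = sigmoid(hidden @ W4 + b4)  # squashes each channel into [0, 1]
    return sigma, rgb


sigma, rgb = nerf_forward(np.array([2.0, 0.0, 0.0]),
                          np.array([-1.0, 2.0, 0.0]) / np.sqrt(5))
print(sigma, rgb)
```

Note that the viewing direction only enters the second MLP, so in this sketch the density of a point is the same from every angle while its color may change.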

Each pixel in the image corresponds to a ray $\mathbf{r}(t) = \mathbf{o} + t\mathbf{d}$. To calculate the color of $\mathbf{r}$, we randomly sample $N$ distances $t_i$ ($1 \leq i \leq N$) along the ray, and pass each point $\mathbf{r}(t_i)$ and the direction $\mathbf{d}$ to the MLPs to obtain the density $\sigma_i$ and color $c_i$ at that point. The final color is then calculated using

$$c_{out} = \sum^N_{i=1} w_ic_i$$
where
$$w_i = T_i(1 - e^{-\Delta _i \sigma _i})$$
$$T_i = \exp\left(-\sum_{j \lt i} \Delta _j \sigma _j \right)$$
$$\Delta _i = t_i - t_{i - 1}$$

In our case we will use a constant $\Delta _i$ for simplicity's sake, and space the $t_i$ evenly. How $t_i$ is selected is not important for this task, but if you are curious you can read more about it in the paper above.
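With a constant $\Delta$, the rendering formulas can be sketched in NumPy as follows. The function name `composite` and the sample densities and colors are made up for illustration and are not the task data. The sketch indexes samples from 0 and uses a cumulative sum to get the transmittance in front of each sample.

```python
import numpy as np


def composite(sigmas: np.ndarray, colors: np.ndarray, delta: float) -> np.ndarray:
    # Optical depth accumulated *before* each sample: sum_{j < i} delta * sigma_j
    depth_before = delta * np.concatenate([[0.0], np.cumsum(sigmas)[:-1]])
    transmittance = np.exp(-depth_before)      # T_i
    alpha = 1.0 - np.exp(-delta * sigmas)      # 1 - exp(-delta * sigma_i)
    weights = transmittance * alpha            # w_i
    return weights @ colors                    # sum_i w_i * c_i


# Hypothetical values: three samples along a ray, one RGB color per sample
sigmas = np.array([0.0, 0.5, 2.0])
colors = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])

result = composite(sigmas, colors, delta=1.0)
print(result)
```

Samples in empty space ($\sigma_i = 0$) get zero weight, and because $w_i = T_i - T_{i+1}$ the weights always sum to at most 1.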

## Problem description

In this problem you are tasked with calculating the final RGB color value of a single pixel in an image rendered from a NeRF model. A predefined ray is provided which you will sample along, and then use volumetric rendering to combine the values into a single color value.

The ray $\mathbf{r}$ originates at $\mathbf{o} = \begin{bmatrix}
3 & -2 & 0
\end{bmatrix}^T$, and points in the direction $\mathbf{d} = \begin{bmatrix}
-1/\sqrt{5} & 2/\sqrt{5} & 0
\end{bmatrix}^T$. It is to be sampled up to 5 times with spacing $\Delta = \sqrt{5}$, i.e. at $t_i = i\cdot\Delta$ for $i = 0, \dots, 4$. You may optimize your calculation by skipping occluded regions based on transmittance with a threshold of $1\%$, i.e. where $T_i < 0.01$. Densities $\sigma$ and colors $c$ are provided as dictionaries in the code blocks below, and are also displayed as tables for readability. Use the formulas provided above to calculate the final values.
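As a sanity check on the sampling, the sketch below evaluates $\mathbf{r}(t_i) = \mathbf{o} + t_i\mathbf{d}$ for the ray above. It assumes the sample index starts at $i = 0$, which is the convention under which every sample lands on a grid point covered by the provided data. The variable names are illustrative.

```python
import numpy as np

o = np.array([3.0, -2.0, 0.0])
d = np.array([-1.0, 2.0, 0.0]) / np.sqrt(5)
delta = np.sqrt(5)

# t_i = i * delta for i = 0, ..., 4 (assumed zero-based sample index)
ts = delta * np.arange(5)

# One row per sample: r(t_i) = o + t_i * d
points = o + ts[:, None] * d

# Round away floating-point noise to obtain integer grid coordinates
grid_points = np.rint(points).astype(int)
print(grid_points)
```

Because $\Delta\mathbf{d} = \begin{bmatrix} -1 & 2 & 0 \end{bmatrix}^T$, each step moves one column to the left and one row up in the tables.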

*Note that for simplicity's sake, all $z$ values have been left at $0$.*

### Generated data from NeRF

All tables and dictionaries use $y$ as the major (outer) axis and $x$ as the minor (inner) axis, so a value is looked up as `densities[y][x]`. $z$ is excluded, and assumed to always be $0$. We use dictionaries because values are not provided for all points, and a dictionary lets us list only the points we need.

#### Densities

Densities $\sigma$ predicted by the NeRF for $z = 0$.

|    |   -1 |   0 |   1 |   2 |   3 |
|---:|-----:|----:|----:|----:|----:|
|  6 |  1.5 | 2.1 | 1.8 | 0   | 0.1 |
|  4 |  1.8 | 2   | 0.9 | 0   | 0   |
|  2 |  2.3 | 0   | 0   | 0   | 0   |
|  0 |  0   | 0   | 0   | 0.1 | 0   |
| -2 |  0   | 0   | 0.1 | 0.1 | 0   |

In [None]:
densities = {
    6: {-1: 1.5, 0: 2.1, 1: 1.8, 2: 0.0, 3: 0.1},
    4: {-1: 1.8, 0: 2.0, 1: 0.9, 2: 0.0, 3: 0.0},
    2: {-1: 2.3, 0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0},
    0: {-1: 0.0, 0: 0.0, 1: 0.0, 2: 0.1, 3: 0.0},
    -2: {-1: 0.0, 0: 0.0, 1: 0.1, 2: 0.1, 3: 0.0},
}
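Since the outer key is $y$ and the inner key is $x$, a point $(x, y)$ is looked up with $y$ first. The sketch below illustrates this on a two-row excerpt of the table. The helper name `density_at` is hypothetical, and raising a `KeyError` with a clearer message for missing points is a design choice, not part of the task.

```python
# Two-row excerpt of the densities table (outer key: y, inner key: x)
densities_excerpt = {
    4: {-1: 1.8, 0: 2.0, 1: 0.9, 2: 0.0, 3: 0.0},
    0: {-1: 0.0, 0: 0.0, 1: 0.0, 2: 0.1, 3: 0.0},
}


def density_at(x: int, y: int, table: dict) -> float:
    # Hypothetical helper: note the [y][x] order, y first
    try:
        return table[y][x]
    except KeyError:
        raise KeyError(f"no density provided for point (x={x}, y={y})") from None


print(density_at(0, 4, densities_excerpt))  # x = 0, y = 4
print(density_at(2, 0, densities_excerpt))  # x = 2, y = 0
```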

#### Color values

Colors predicted by the NeRF for $z = 0$ and viewing direction $\mathbf{d} = \begin{bmatrix}
-1/\sqrt{5} & 2/\sqrt{5} & 0
\end{bmatrix}^T$.

**Red**

|    |   -1 |   0 |   1 |   2 |   3 |
|---:|-----:|----:|----:|----:|----:|
|  6 |  0.8 | 0.7 | 0.8 | 0.8 | 0.4 |
|  4 |  0.9 | 0.9 | 0.9 | 0.3 | 0.5 |
|  2 |  0.9 | 0.9 | 0.8 | 0.7 | 0.4 |
|  0 |  0.8 | 0.8 | 0.6 | 0.9 | 0.7 |
| -2 |  0.7 | 0.9 | 0.9 | 0.8 | 0.7 |

**Green**

|    |   -1 |   0 |   1 |   2 |   3 |
|---:|-----:|----:|----:|----:|----:|
|  6 |  0.7 | 0.8 | 0.7 | 0.3 | 0.1 |
|  4 |  0.9 | 0.9 | 0.9 | 0.4 | 0.1 |
|  2 |  0.8 | 0.8 | 0.8 | 0.8 | 0.3 |
|  0 |  0.9 | 0.7 | 0.4 | 0.5 | 0.2 |
| -2 |  0.8 | 0.3 | 0.9 | 0.8 | 0.2 |


**Blue**

|    |   -1 |   0 |   1 |   2 |   3 |
|---:|-----:|----:|----:|----:|----:|
|  6 |  0.2 | 0.1 | 0.2 | 0.1 | 0.1 |
|  4 |  0.2 | 0.1 | 0.1 | 0.3 | 0   |
|  2 |  0.3 | 0.5 | 0.4 | 0.3 | 0.1 |
|  0 |  0.1 | 0.3 | 0   | 0.3 | 0.3 |
| -2 |  0.2 | 0.1 | 0.9 | 0.8 | 0.8 |

In [None]:
red = {
    6: {-1: 0.8, 0: 0.7, 1: 0.8, 2: 0.8, 3: 0.4},
    4: {-1: 0.9, 0: 0.9, 1: 0.9, 2: 0.3, 3: 0.5},
    2: {-1: 0.9, 0: 0.9, 1: 0.8, 2: 0.7, 3: 0.4},
    0: {-1: 0.8, 0: 0.8, 1: 0.6, 2: 0.9, 3: 0.7},
    -2: {-1: 0.7, 0: 0.9, 1: 0.9, 2: 0.8, 3: 0.7},
}

green = {
    6: {-1: 0.7, 0: 0.8, 1: 0.7, 2: 0.3, 3: 0.1},
    4: {-1: 0.9, 0: 0.9, 1: 0.9, 2: 0.4, 3: 0.1},
    2: {-1: 0.8, 0: 0.8, 1: 0.8, 2: 0.8, 3: 0.3},
    0: {-1: 0.9, 0: 0.7, 1: 0.4, 2: 0.5, 3: 0.2},
    -2: {-1: 0.8, 0: 0.3, 1: 0.9, 2: 0.8, 3: 0.2},
}

blue = {
    6: {-1: 0.2, 0: 0.1, 1: 0.2, 2: 0.1, 3: 0.1},
    4: {-1: 0.2, 0: 0.1, 1: 0.1, 2: 0.3, 3: 0.0},
    2: {-1: 0.3, 0: 0.5, 1: 0.4, 2: 0.3, 3: 0.1},
    0: {-1: 0.1, 0: 0.3, 1: 0.0, 2: 0.3, 3: 0.3},
    -2: {-1: 0.2, 0: 0.1, 1: 0.9, 2: 0.8, 3: 0.8},
}

## Suggested solution

The correct value is $c_{out} = \begin{bmatrix}
0.89177933 & 0.81163112 & 0.13916069
\end{bmatrix}^T$. The calculation is shown below, using NumPy.

In [None]:
import numpy as np


def get_color(point: np.ndarray) -> np.ndarray:
    # RGB color at an integer grid point; the dictionaries are indexed [y][x]
    x, y, _ = point
    return np.array([red[y][x], green[y][x], blue[y][x]])


def get_density(point: np.ndarray) -> float:
    # Density sigma at an integer grid point
    x, y, _ = point
    return densities[y][x]


def T(i: int, delta: float, sigmas: np.ndarray) -> float:
    # Transmittance T_i = exp(-delta * sum_{j < i} sigma_j) for constant delta
    return np.exp(-delta * np.sum(sigmas[:i]))


def w(i: int, delta: float, sigmas: np.ndarray) -> float:
    t = T(i, delta, sigmas)
    # Skip occluded samples, i.e. transmittance below the 1% threshold
    if t < 0.01:
        return 0.0

    return t * (1 - np.exp(-delta * sigmas[i]))


def calculate_color(o: np.ndarray, d: np.ndarray, delta: float) -> np.ndarray:
    # Use rint and astype to get the nearest integer coordinates
    points = [np.rint(o + i * delta * d).astype(int) for i in range(5)]

    # Named sigmas so the global densities dictionary is not shadowed
    sigmas = np.array([get_density(point) for point in points])
    colors = np.array([get_color(point) for point in points])

    result = np.zeros(3)

    for i, (sigma, color) in enumerate(zip(sigmas, colors)):
        # We can skip densities that are 0, as these will result in w_i = 0
        if sigma == 0:
            continue

        result += w(i, delta, sigmas) * color

    return result


o = np.array([3, -2, 0])
d = np.array([-1 / np.sqrt(5), 2 / np.sqrt(5), 0])
delta = np.sqrt(5)

print(calculate_color(o, d, delta))
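
As a cross-check, the same result can be worked out by hand. With $\Delta = \sqrt{5}$ and zero-based sample points $(3, -2)$, $(2, 0)$, $(1, 2)$, $(0, 4)$, $(-1, 6)$, the sampled densities are $0$, $0.1$, $0$, $2.0$, $1.5$, so only samples $1$ and $3$ can contribute. Values are rounded to four decimals.

$$T_1 = 1, \quad w_1 = 1 - e^{-0.1\sqrt{5}} \approx 0.2004$$
$$T_3 = e^{-0.1\sqrt{5}} \approx 0.7996, \quad w_3 = T_3\left(1 - e^{-2\sqrt{5}}\right) \approx 0.7905$$
$$T_4 = e^{-2.1\sqrt{5}} \approx 0.0091 < 0.01 \Rightarrow w_4 = 0$$
$$c_{out} \approx 0.2004 \begin{bmatrix} 0.9 & 0.5 & 0.3 \end{bmatrix}^T + 0.7905 \begin{bmatrix} 0.9 & 0.9 & 0.1 \end{bmatrix}^T \approx \begin{bmatrix} 0.8918 & 0.8116 & 0.1392 \end{bmatrix}^T$$

The last sample at $(-1, 6)$ has a high density, but it is skipped because the transmittance in front of it has already dropped below the $1\%$ threshold.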