## Supervised Learning with ITensor: MPS for Classification

[Supervised Learning with Quantum-Inspired Tensor Networks](https://arxiv.org/pdf/1605.05775), *NeurIPS 2016.*

In [2]:
using Random, Statistics
using ITensors, ITensorMPS

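*Why machine learning?* Tensor trains (matrix product states, MPS) provide a linear model over an exponentially large feature space of dimension $d^N$, using only $O(N d m^2)$ parameters. Here $N$ is the number of sites, $d$ the local feature dimension, and $m$ the bond dimension.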
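
To make the parameter count concrete, here is a minimal sketch that compares the size of a full order-$N$ weight tensor with that of an MPS. The helper `mps_param_count` is a hypothetical name, not part of ITensors. It assumes open boundaries and a uniform bond dimension $m$ on every bond, and it ignores the extra label index a classifier would carry.

```julia
# Hypothetical helper (not part of ITensors): number of entries in an
# open-boundary MPS with N sites, local dimension d, and uniform bond dimension m.
function mps_param_count(N::Integer, d::Integer, m::Integer)
    N == 1 && return d
    boundary = 2 * d * m          # first and last tensors are d × m
    bulk = (N - 2) * d * m^2      # interior tensors are m × d × m
    return boundary + bulk
end

# Compare against the d^N entries of the full weight tensor.
let N = 64, d = 2, m = 10
    println("full weight tensor: ", big(d)^N, " entries")
    println("MPS:                ", mps_param_count(N, d, m), " entries")
end
```

With $N = 64$, $d = 2$, and $m = 10$, the MPS holds 12,440 entries, while the full tensor would need $2^{64} \approx 1.8 \times 10^{19}$.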

*Goal:* Classify $8\times 8$ grayscale images into two classes. The same approach extends to multi-class classification (e.g. MNIST).

In [6]:
# synthetic data generation

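const H, W = 8, 8

# Return an H×W image with a single bar of ones on a zero background.
# The bar position is drawn from the interior rows/columns (2 to H-1 or W-1).
function make_bar(imgtype::Symbol; rng=Random.default_rng())
    X = zeros(H, W)
    if imgtype == :vertical
        j = rand(rng, 2:W-1) # random interior column
        X[:, j] .= 1.0
    elseif imgtype == :horizontal
        i = rand(rng, 2:H-1) # random interior row
        X[i, :] .= 1.0
    else
        throw(ArgumentError("unknown imgtype: " * repr(imgtype)))
    end
    return X
end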

# Build a shuffled, balanced dataset with n_per images of each class.
function synth_dataset(n_per=200; rng=Random.default_rng())
    Xv = [make_bar(:vertical; rng=rng) for _ in 1:n_per]
    Xh = [make_bar(:horizontal; rng=rng) for _ in 1:n_per]
    X = vcat(Xv, Xh)
    y = vcat(fill(1, n_per), fill(0, n_per)) # 1 = vertical, 0 = horizontal
    perm = randperm(rng, length(X)) # named perm to avoid shadowing Random.shuffle
    return X[perm], y[perm]
end

X, y = synth_dataset(200)
N = H * W # number of pixels, i.e. number of MPS sites

64

*How are tensors useful?* Let $x = (x_1, x_2, \ldots, x_N) \in \mathbb R^N$ be an input data point (image with $N$ pixels). 

We choose a *local* feature map
$$ \phi: \mathbb R \to \mathbb R^d,\quad x_j \mapsto \phi(x_j) = \left(\phi_1(x_j), \phi_2(x_j), \ldots, \phi_d(x_j)\right) $$
Then define the *tensor-product* feature $\Phi(x) = \phi(x_1) \otimes \phi(x_2) \otimes \cdots \otimes \phi(x_N)$, a rank-$1$, order-$N$ tensor with components
$$ \Phi(x)_{s_1, s_2, \ldots, s_N} = \phi_{s_1}(x_1)\, \phi_{s_2}(x_2) \cdots \phi_{s_N}(x_N),\quad s_j \in \{1, 2, \ldots, d\} $$
For grayscale images with pixel values $x_j \in [0, 1]$, a simple $d=2$ choice is
$$ \phi(x_j) = \left( \cos\left(\frac{\pi}{2} x_j\right), \sin\left(\frac{\pi}{2} x_j\right) \right) $$
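
The local map and the tensor-product feature above can be sketched in plain Julia for a tiny input. The names `local_feature` and `product_feature` are hypothetical helpers introduced only for illustration. The sketch materializes all $d^N$ components with `kron`, which is feasible only for very small $N$; an MPS-based model would never form this vector explicitly.

```julia
using LinearAlgebra

# Local feature map for one pixel value x in [0, 1] (d = 2).
local_feature(x::Real) = [cos(pi / 2 * x), sin(pi / 2 * x)]

# Flattened tensor-product feature of length d^N (hypothetical helper).
# kron orders the entries so that the last pixel's index varies fastest.
product_feature(x::AbstractVector{<:Real}) = reduce(kron, local_feature.(x))

xa = [0.0, 1.0, 0.5]
xb = [1.0, 1.0, 0.0]
Φa = product_feature(xa)
Φb = product_feature(xb)

@show length(Φa)  # d^N = 2^3 = 8 components
@show norm(Φa)    # each local vector has unit norm, so the product does too

# The inner product of two product features factorizes over pixels.
overlap = prod(dot(local_feature(a), local_feature(b)) for (a, b) in zip(xa, xb))
@show dot(Φa, Φb) ≈ overlap
```

Because $\cos^2 + \sin^2 = 1$, every $\phi(x_j)$ is a unit vector, so $\Phi(x)$ is normalized for any input. A black pixel ($x_j = 0$) maps to $(1, 0)$ and a white pixel ($x_j = 1$) maps to $(0, 1)$, up to floating-point rounding.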