# Project on Machine Learning
## Overview
The aim of this project is to apply machine learning methods to data from Monte Carlo simulations of a familiar system from statistical mechanics, namely the Ising model. We will use a simple model without any external magnetic field. The energy of a spin configuration is
    $$E=-J\sum\limits^N_{\langle kl\rangle}s_ks_l$$
Here $s_k$ and $s_l$ denote spins on a lattice, with $s_k=\pm 1$ and $N$ being the total number of spins. $J$ is a coupling constant representing the strength of the interaction between neighbouring pairs of spins. The notation $\langle kl\rangle$ indicates that the sum runs over nearest neighbours only.
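
As an illustration of the nearest-neighbour sum, the following minimal sketch evaluates the energy of a single configuration. The text does not fix the lattice geometry, so the two-dimensional square lattice with periodic boundaries, as well as the function name `ising_energy_2d`, are assumptions made only for this example.

```python
import numpy as np

def ising_energy_2d(spins, J=1.0):
    """Energy of a 2D spin configuration (assumed square lattice, periodic boundaries).

    Each nearest-neighbour pair is counted once by pairing every site
    with its right and its lower neighbour only.
    """
    spins = np.asarray(spins)
    right = np.roll(spins, -1, axis=1)  # neighbour to the right (wraps around)
    down = np.roll(spins, -1, axis=0)   # neighbour below (wraps around)
    return -J * np.sum(spins * (right + down))

# All spins aligned on a 4x4 lattice: 16 sites with 2 bonds each
aligned = np.ones((4, 4), dtype=int)
print(ising_energy_2d(aligned))  # -32.0

# Checkerboard pattern: every bond connects opposite spins
checkerboard = np.indices((4, 4)).sum(axis=0) % 2 * 2 - 1
energy_checkerboard = ising_energy_2d(checkerboard)
```

With $J>0$ the aligned configuration has the lowest energy, which is the ferromagnetic ground state.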

The data used and the methods explored closely follow the article >> ref <<. The methods explored here are logistic regression, the random forest algorithm and deep neural networks.

The interesting physical properties to be extracted are the states above, below and around a critical temperature $T_c$. When the temperature is lower than $T_c$, the system is in a so-called ferromagnetic phase. Close to the critical point the magnetization becomes smaller, and above $T_c$ the net magnetization is zero.

## Theory
We will first present the theory for the methods mentioned.

### Linear Regression
A linear regression model fits data points to a functional form that is linear in its parameters.

Given a data set
    $$\{y_i, \boldsymbol{x}_i\}_{i=1}^n$$
of $n$ points, let $\boldsymbol{X}$ be the $n\times (m+1)$ matrix representing the regressors. Assuming the relationship between the regressors and $y_i$ is linear, the model is
    $$\boldsymbol{y} = \boldsymbol{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$$
with 
    $$
    \boldsymbol{y} =
        \begin{pmatrix}
            y_1 \\
            \vdots \\
            y_n
        \end{pmatrix},
    $$
    $$
    \boldsymbol{X} =
        \begin{pmatrix}
            1 & f_1\left(x_{11}\right) & \dots & f_m\left(x_{1m}\right) \\
            \vdots & \vdots & \ddots & \vdots \\
            1 & f_1\left(x_{n1}\right) & \dots & f_m\left(x_{nm}\right)
        \end{pmatrix}
    $$
and
    $$
    \boldsymbol{\beta} = 
        \begin{pmatrix}
            \beta_0 \\
            \vdots \\
            \beta_m
        \end{pmatrix}.
    $$
The vector $\boldsymbol{\varepsilon}$ represents the noise in the system (i.e. the variance in the Monte Carlo simulation) and each $f_j$ is a pre-defined function. This can for instance be a monomial
    $$f_j(x) = x^j,$$
or a sine function
    $$f_j(x) = \sin(jx),$$
or any other suitable choice.
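
A minimal sketch of how such a matrix could be assembled is given below. It assumes, for simplicity, a single scalar regressor per data point, so that column $j$ holds the $j$-th basis function evaluated at that point. The function name `design_matrix` and its `basis` argument are hypothetical and only serve this illustration.

```python
import numpy as np

def design_matrix(x, m, basis="poly"):
    """Return the n x (m+1) matrix with columns 1, f_1(x), ..., f_m(x).

    Assumes one scalar regressor per sample (an illustrative simplification).
    """
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x)]            # intercept column multiplying beta_0
    for j in range(1, m + 1):
        if basis == "poly":
            cols.append(x ** j)         # monomial basis x^j
        elif basis == "sine":
            cols.append(np.sin(j * x))  # sine basis sin(j x)
        else:
            raise ValueError(f"unknown basis: {basis}")
    return np.column_stack(cols)

X_poly = design_matrix([0.0, 1.0, 2.0], m=2)
X_sine = design_matrix([0.0, np.pi / 2], m=2, basis="sine")
```

For the three points above, the rows of `X_poly` are `[1, 0, 0]`, `[1, 1, 1]` and `[1, 2, 4]`.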

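The method of linear regression is simply to minimize the sum of squared residuals, $\lVert\boldsymbol{y} - \boldsymbol{X}\boldsymbol{\beta}\rVert_2^2$, with respect to the parameters $\boldsymbol{\beta}$.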
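
The minimization can be sketched with NumPy's least-squares solver. The synthetic quadratic data, the noise level and the helper name `ols_fit` are assumptions chosen for illustration; they are not taken from the project data.

```python
import numpy as np

def ols_fit(X, y):
    """Least-squares estimate of beta, i.e. the minimiser of ||y - X beta||^2."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic example: y = 1 - 2x + 0.5x^2 plus a small amount of noise
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
X = np.column_stack([np.ones_like(x), x, x**2])
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.01 * rng.normal(size=x.size)

beta_hat = ols_fit(X, y)
print(beta_hat)  # close to beta_true
```

Using `np.linalg.lstsq` rather than explicitly inverting $\boldsymbol{X}^T\boldsymbol{X}$ is a design choice: it also returns a sensible answer when the columns of $\boldsymbol{X}$ are linearly dependent.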

#### Ridge and Lasso Regression
While the linear regression model is rigorous and simple, it does have a tendency to overfit. To mitigate this problem, so-called regularization techniques have been developed. Two of these are Ridge and Lasso regression.

##### Ridge Regression
With Ridge regression one performs an L2 regularization by adding a penalty term equal to the sum of the squared coefficients to the quantity being minimized. This amounts to performing the original linear regression, but with an added term in the cost function. The estimate is
    $$\hat{\boldsymbol{\beta}} = \underset{\boldsymbol{\beta}}{\mathrm{arg\,min}}\left(\lVert\boldsymbol{y} - \boldsymbol{X}\boldsymbol{\beta}\rVert_2^2 + \alpha\sum_{i=1}^m\beta^2_i\right).$$
The factor $\alpha \geq 0$ scales the penalty and thereby sets the strength of the regularization.
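
A minimal sketch of Ridge regression follows, assuming the L2 penalty is added to the least-squares cost, in which case the minimiser solves $(\boldsymbol{X}^T\boldsymbol{X} + \alpha\boldsymbol{D})\boldsymbol{\beta} = \boldsymbol{X}^T\boldsymbol{y}$ with $\boldsymbol{D}$ the identity matrix whose first diagonal entry is set to zero. The function name `ridge_fit` and the synthetic cubic data are hypothetical, and the first column of the design matrix is assumed to be the intercept.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form Ridge estimate; the first column of X is assumed to be the intercept."""
    penalty = alpha * np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # beta_0 is excluded from the penalty sum
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

# Synthetic cubic data with noise (illustrative only)
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)
X = np.column_stack([np.ones_like(x), x, x**2, x**3])
y = 1.0 + 2.0 * x - x**3 + 0.1 * rng.normal(size=x.size)

beta_ols = ridge_fit(X, y, alpha=0.0)     # alpha = 0 recovers ordinary least squares
beta_ridge = ridge_fit(X, y, alpha=10.0)  # larger alpha shrinks the coefficients
```

Comparing `beta_ols` and `beta_ridge` shows the shrinkage of the penalised coefficients as $\alpha$ grows.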

##### Lasso Regression
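The Lasso regression scheme performs an L1 regularization by instead adding the sum of the absolute values of the coefficients to the quantity being minimized. The estimate is
    $$\hat{\boldsymbol{\beta}} = \underset{\boldsymbol{\beta}}{\mathrm{arg\,min}}\left(\lVert\boldsymbol{y} - \boldsymbol{X}\boldsymbol{\beta}\rVert_2^2 + \alpha\sum_{i=1}^m{\big|}\beta_i{\big|}\right),$$
with $\alpha$ defined as before.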
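
Unlike Ridge, the L1-penalised least-squares problem has no closed-form solution in general. The sketch below uses iterative soft-thresholding (ISTA), one of several possible solvers and not necessarily the one used in this project. It assumes the L1 penalty is added to the least-squares cost and that the first column of the design matrix is an unpenalised intercept. The names `soft_threshold` and `lasso_ista` and the synthetic data are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * |.|: shrink z towards zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, alpha, n_iter=5000):
    """Minimise ||y - X beta||^2 + alpha * sum_{i>=1} |beta_i| by proximal gradient steps."""
    # Step size 1/L, where L = 2 * sigma_max(X)^2 bounds the gradient's Lipschitz constant
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ beta - y)  # gradient of the squared residual
        z = beta - step * grad
        beta[0] = z[0]                     # intercept is not penalised
        beta[1:] = soft_threshold(z[1:], step * alpha)
    return beta

# Synthetic data with a sparse set of true coefficients (illustrative only)
rng = np.random.default_rng(1)
Z = rng.normal(size=(100, 5))
X = np.column_stack([np.ones(100), Z])
beta_true = np.array([0.5, 2.0, 0.0, 0.0, -1.5, 0.0])
y = X @ beta_true + 0.01 * rng.normal(size=100)

beta_l1 = lasso_ista(X, y, alpha=5.0)
```

The L1 penalty tends to drive coefficients of irrelevant regressors to exactly zero, which is the main practical difference from Ridge regression.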

#### Transform the Ising Model to a Linear Regression Problem
In order to use linear regression with the Ising model we assume, without any prior knowledge of the couplings, the all-to-all Ising model
    $$E^{(i)} = -\sum\limits_{k,l=1}^NJ_{kl}s^{(i)}_ks^{(i)}_l,$$
with $J_{kl}$ being the coupling strengths we wish to learn. The index $i$ labels a sample point. This equation can be rewritten as the inner product
    $$E^{(i)} = -\boldsymbol{X}^{(i)} \cdot \boldsymbol{J},$$
with $\boldsymbol{J}$ the vector of all couplings $J_{kl}$ and $\boldsymbol{X}^{(i)}$ the vector of all two-body interactions
    $$\left\{s^{(i)}_k s^{(i)}_l\right\}_{k,l=1}^N.$$
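
The mapping to a regression problem can be sketched as follows. The example assumes, purely for illustration, that the energies come from a one-dimensional nearest-neighbour chain with periodic boundaries and $J=1$, and that the configurations are drawn at random rather than taken from the Monte Carlo data. The helper `chain_energies` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
N = 10            # spins per configuration (assumed 1D chain)
n_samples = 400

# Random spin configurations with s_k = +/-1
states = rng.choice([-1, 1], size=(n_samples, N))

def chain_energies(states, J=1.0):
    """Nearest-neighbour 1D Ising energies with periodic boundaries."""
    return -J * np.sum(states * np.roll(states, -1, axis=1), axis=1)

# Row i of X holds all products s_k s_l of sample i, flattened to length N*N
X = np.einsum("ik,il->ikl", states, states).reshape(n_samples, N * N)
E = chain_energies(states)

# Least-squares fit of E = -X . J (minimum-norm solution, since X is rank deficient)
J_flat, *_ = np.linalg.lstsq(-X, E, rcond=None)
J_learned = J_flat.reshape(N, N)

# s_k s_l = s_l s_k, so only the symmetric combination J_kl + J_lk is determined
k = np.arange(N)
nn_coupling = J_learned[k, (k + 1) % N] + J_learned[(k + 1) % N, k]
```

Because the products $s_ks_l$ and $s_ls_k$ are identical, the fit splits each coupling evenly between $J_{kl}$ and $J_{lk}$; the sum `nn_coupling` recovers the nearest-neighbour coupling of the chain.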

### Logistic Regression
