# Part II: Ising Model
## Task Allocation
+ Visualization (Task B~C): Junye Wang
+ Simulation and Math Proof (Task A~C): Chuyan Zhou
## Description
In certain alloys, particularly those containing Fe, Co or Ni, the electrons have a tendency to align their spins in a common direction. This phenomenon is called **ferromagnetism**
and is characterized by the existence of a finite magnetization even in
the absence of a magnetic field. It has its origin in a genuinely quantum mechanical
effect known as the exchange interaction and related to the overlap between the wave
functions of neighboring electrons.

The spin ordering diminishes as the temperature of the sample increases and, above
a certain temperature $T_c$, the magnetization vanishes entirely. This is a phase transition
and $T_c$ is called the critical temperature or the Curie temperature. The
Curie temperature is named after Pierre Curie, who received the Nobel Prize in Physics
with his wife, Marie Curie. With their win, the Curies became the first ever married
couple to win the Nobel Prize, launching the Curie family legacy of five Nobel Prizes.

The Ising model is a mathematical model of ferromagnetism in statistical physics. The
Ising model was invented by the physicist Wilhelm Lenz in 1920, who gave it as a
problem to his student Ernst Ising. The one-dimensional Ising model was solved by
Ising alone in his 1924 thesis; however, it has no phase transition. The two-dimensional
Ising model is much harder and was only given an analytic description much later,
by Lars Onsager in 1944. The two-dimensional Ising model is one of the simplest
statistical models to show a phase transition.

1. Now we describe a two-dimensional square-lattice Ising model. Consider an undirected
graph $G = (V,E)$, where $V$ is the set of vertices (sites) and $E$ is the set of
edges. If $(v,w) ∈ E$, we say that vertex $v$ is a neighbor of vertex $w$, written $v ∼ w$,
and vice versa. $G$ is an $n × n$ grid graph with $|V| = n^2$. Each vertex $v ∈ V$ is
associated with a discrete variable $σ_v ∈ \{−1,+1\}$, representing the
vertex's spin. An example of a two-dimensional Ising model with $n = 3$ is illustrated
as follows:

![Fig. 2](attachment:image.png)

2. Define the spin configuration as $\boldsymbol \sigma = \{\sigma_v,v\in V\}$. The corresponding energy is
   $$
   H(\boldsymbol\sigma)=-\sum_{(v,w)\in E}\sigma_v\sigma_w.
   $$
   Define the set of all possible spin configurations as $\Omega$, and define the Gibbs distribution over $\Omega$ as follows:
   $$
   \pi_{\boldsymbol\sigma}={e^{-\beta H(\boldsymbol\sigma)}\over Z}, \forall \boldsymbol\sigma\in\Omega,
   $$
   where $\beta$ is a constant representing the inverse temperature, and $Z$ is the normalization constant (also called the partition function).
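
The definitions above can be sketched in code. The following is a minimal illustration, not part of the original task: the helper names `grid_edges`, `energy` and `partition_function`, the `(row, col)` vertex labels, and the values `n = 3`, `beta = 0.4` are all assumptions chosen for the example. It builds the edge set of the $n × n$ grid, evaluates $H(\boldsymbol\sigma)$, and computes $Z$ by enumerating every configuration, which is only feasible for very small $n$.

```python
import itertools
import math

def grid_edges(n):
    """Undirected edges of the n x n grid graph; vertices are (row, col) pairs."""
    edges = []
    for r in range(n):
        for c in range(n):
            if c + 1 < n:                      # edge to the right neighbor
                edges.append(((r, c), (r, c + 1)))
            if r + 1 < n:                      # edge to the lower neighbor
                edges.append(((r, c), (r + 1, c)))
    return edges

def energy(sigma, edges):
    """H(sigma) = -sum over edges (v, w) of sigma_v * sigma_w."""
    return -sum(sigma[v] * sigma[w] for v, w in edges)

def partition_function(n, beta):
    """Z by brute-force enumeration of all 2^(n*n) configurations (small n only)."""
    edges = grid_edges(n)
    vertices = [(r, c) for r in range(n) for c in range(n)]
    z = 0.0
    for spins in itertools.product((-1, +1), repeat=len(vertices)):
        sigma = dict(zip(vertices, spins))
        z += math.exp(-beta * energy(sigma, edges))
    return z

# Assumed example values, not taken from the task description.
n, beta = 3, 0.4
edges = grid_edges(n)
all_up = {(r, c): +1 for r in range(n) for c in range(n)}
z = partition_function(n, beta)
pi_all_up = math.exp(-beta * energy(all_up, edges)) / z  # Gibbs probability of the all-up state
print(len(edges), energy(all_up, edges), pi_all_up)
```

For $n = 3$ the grid has $2n(n-1) = 12$ edges, so the all-up configuration has energy $-12$. Enumeration grows as $2^{n^2}$, which is why larger lattices need sampling rather than an exact sum.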



## Tasks
### Task A
Show that, given the spin values of all other vertices, the conditional distribution of vertex $k$'s spin is
$$
P(\sigma_k=+1|\boldsymbol\sigma_{-k})={1\over 1+e^{-2\beta(\sum_{v\sim k}\sigma_v)}},
\\P(\sigma_k=-1|\boldsymbol\sigma_{-k})={1\over 1+e^{2\beta(\sum_{v\sim k}\sigma_v)}},\tag{0.1}
$$
where
$$
\boldsymbol\sigma_{-k} = (σ_1, \cdots , σ_{k−1}, σ_{k+1},\cdots, σ_{|V |}).\tag{0.2}
$$
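
As a quick illustration of $(0.1)$, take assumed values that are not part of the task: inverse temperature $\beta = 0.5$ and an interior vertex $k$ whose four neighbors carry spins $+1, +1, +1, -1$, so that $\sum_{v\sim k}\sigma_v = 2$. The two conditional probabilities would then be
$$
\begin{aligned}
P(\sigma_k=+1|\boldsymbol\sigma_{-k}) &= {1\over 1+e^{-2\cdot 0.5\cdot 2}} = {1\over 1+e^{-2}} \approx 0.881,
\\ P(\sigma_k=-1|\boldsymbol\sigma_{-k}) &= {1\over 1+e^{2\cdot 0.5\cdot 2}} = {1\over 1+e^{2}} \approx 0.119,
\end{aligned}
$$
which sum to one. In this example the spin is more likely to align with the majority of its neighbors.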

**Answer to Task A.**
$\renewcommand{\bsigma}{\boldsymbol{\sigma}}$

(Remark: the Ising model in the problem description is a simplified version. The complete model on this grid graph has a separate weight for each edge when the graph is viewed as a Markov network (pairwise Markov random field) defined on its maximal cliques, and has both edge weights and vertex weights when the Markov network is defined on all cliques.)

Treat the spin configuration $\bsigma$ as a vector of random variables. By the Gibbs distribution, for every $\bsigma\in\Omega$ we have
$$
P(\boldsymbol \sigma)={\pi}_{\bsigma}={e^{-\beta H(\bsigma)}\over Z}\propto e^{-\beta H(\bsigma)}=e^{\beta \sum_{(v,w)\in E}\sigma_v\sigma_w}.\tag {1.1}
$$
By the definition of conditional probability and the law of total probability (LOTP),
$$
\begin{align*}
P(\sigma_k=+1|\boldsymbol\sigma_{-k}) &= {P(\sigma_k=+1,\boldsymbol\sigma_{-k})\over P(\boldsymbol\sigma_{-k})}
\\ & = {P(\sigma_k=+1,\boldsymbol\sigma_{-k})\over \sum_{\sigma_k\in\{+1,-1\}}P(\sigma_k,\boldsymbol\sigma_{-k})}.
\end{align*}
$$
For the denominator (normalization coefficient)
$$
\sum_{\sigma_k\in\{+1,-1\}}P(\sigma_k,\boldsymbol\sigma_{-k}) = P(\boldsymbol\sigma_{-k}),
$$
we know that it **does not depend** on the value of $\sigma_k$, so we can treat it as a function $\mathbf N(\bsigma_{-k})$ of the remaining spins, which gives
$$
P(\sigma_k=+1|\boldsymbol\sigma_{-k}) = {P(\sigma_k=+1,\boldsymbol\sigma_{-k})\over \mathbf N(\bsigma_{-k})}\propto P(\sigma_k=+1,\boldsymbol\sigma_{-k}).\tag{1.2}
$$

Let $\bsigma_{+1}:=\bsigma_{-k}\cup\{\sigma_k|\sigma_k=+1\}$ be the configuration with $\sigma_k=+1$. Let $E_k$ be the set of edges incident to $k$, and split the sum in equation $(1.1)$ accordingly:
$$
P(\sigma_k=+1,\boldsymbol\sigma_{-k}) = {\pi}_{\bsigma_{+1}}={1\over Z}e^{\beta\sum_{(k,u)\in E_k}\sigma_k\sigma_u}e^{\beta\sum_{(v,w)\in E-E_k}\sigma_v\sigma_w},
$$
where $\sigma_k=+1$. Substituting $\sigma_k=+1$ and dropping the factors that do not depend on $\sigma_k$, we get the proportional form of the joint probability:
$$
P(\sigma_k=+1,\boldsymbol\sigma_{-k}) = {1\over Z}e^{\beta\sum_{(k,u)\in E_k}\sigma_u}e^{\beta\sum_{(v,w)\in E-E_k}\sigma_v\sigma_w}\propto e^{\beta\sum_{(k,u)\in E_k}\sigma_u}.\tag{1.3}
$$

From $(1.2)$ and $(1.3)$, we treat $e^{\beta\sum_{(v,w)\in E-E_k}\sigma_v\sigma_w}$, in which every $\sigma_v$ and $\sigma_w$ is given, as a function $\mathbf {\hat N}(\bsigma_{-k})$. Keeping the factor $1/Z$ from $(1.1)$, the equation can be written as
$$
P(\sigma_k=+1|\boldsymbol\sigma_{-k}) = {P(\sigma_k=+1,\boldsymbol\sigma_{-k})\over \mathbf N(\bsigma_{-k})} = {\mathbf{\hat N}(\bsigma_{-k})\,e^{\beta\sum_{(k,u)\in E_k}\sigma_u}\over Z\,\mathbf N(\bsigma_{-k})}\propto e^{\beta\sum_{(k,u)\in E_k}\sigma_u}.
$$
We denote the normalization coefficient of the conditional probability by $\mathbf{\tilde N}(\bsigma_{-k}):= Z\,\mathbf N(\bsigma_{-k})/\mathbf{\hat N}(\bsigma_{-k})$, which does not depend on $\sigma_k$, and simplify the notation:
$$
e^{\beta\sum_{(k,u)\in E_k}\sigma_u} = e^{\beta\sum_{v\sim k}\sigma_v},
$$
i.e. we have
$$
P(\sigma_k=+1|\boldsymbol\sigma_{-k}) = {e^{\beta\sum_{v\sim k}\sigma_v}\over \mathbf{\tilde N}(\bsigma_{-k})}.\tag{2.1}
$$
Similarly, for $\sigma_k=-1$ the same argument applies with $+1$ replaced by $-1$ throughout. The factors $Z$, $\mathbf N(\bsigma_{-k})$ and $\mathbf{\hat N}(\bsigma_{-k})$ are unchanged because none of them depends on $\sigma_k$, so we have
$$
P(\sigma_k=-1|\boldsymbol\sigma_{-k}) =  {e^{-\beta\sum_{v\sim k}\sigma_v}\over \mathbf{\tilde N}(\bsigma_{-k})}.\tag{2.2}
$$

Because $\sigma_k\in\{-1,+1\}$, and
$$
P(\sigma_k=+1|\boldsymbol\sigma_{-k}) + P(\sigma_k=-1|\boldsymbol\sigma_{-k}) = 1,
$$
we can conclude
$$
\mathbf{\tilde N}(\bsigma_{-k}) = e^{-\beta\sum_{v\sim k}\sigma_v} + e^{\beta\sum_{v\sim k}\sigma_v}.\tag {2.3}
$$
$$

Hence, by $(2.1),(2.2),(2.3)$, we have
$$
P(\sigma_k=+1|\boldsymbol\sigma_{-k}) = {e^{\beta\sum_{v\sim k}\sigma_v}\over e^{-\beta\sum_{v\sim k}\sigma_v} + e^{\beta\sum_{v\sim k}\sigma_v}}
= {1\over 1 + e^{-2\beta\sum_{v\sim k}\sigma_v}},
\\ P(\sigma_k=-1|\boldsymbol\sigma_{-k}) = {e^{-\beta\sum_{v\sim k}\sigma_v}\over e^{-\beta\sum_{v\sim k}\sigma_v} + e^{\beta\sum_{v\sim k}\sigma_v}}
= {1\over 1 + e^{2\beta\sum_{v\sim k}\sigma_v}}.\tag{Q.E.D.}
$$
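
The result can also be checked numerically. The sketch below is an illustration under assumptions and not part of the proof: the helper names, the `(row, col)` vertex labels and the values `n = 3`, `beta = 0.7` are chosen only for the example. For every configuration and every vertex $k$, it computes $P(\sigma_k=+1|\boldsymbol\sigma_{-k})$ from the ratio of the Gibbs weights of the two configurations that differ only at $k$ (so $Z$ cancels), and compares it with the closed form in $(0.1)$.

```python
import itertools
import math

def grid_neighbors(n):
    """Map each vertex (row, col) of the n x n grid to the list of its neighbors."""
    nbrs = {}
    for r in range(n):
        for c in range(n):
            cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbrs[(r, c)] = [(a, b) for a, b in cand if 0 <= a < n and 0 <= b < n]
    return nbrs

def gibbs_weight(sigma, nbrs, beta):
    """Unnormalized Gibbs weight exp(-beta * H(sigma))."""
    # Summing over ordered neighbor pairs counts every edge twice, hence the / 2.
    pair_sum = sum(sigma[v] * sigma[w] for v in nbrs for w in nbrs[v]) / 2
    return math.exp(beta * pair_sum)

def conditional_from_weights(sigma, k, nbrs, beta):
    """P(sigma_k = +1 | all other spins) as a ratio of joint weights; Z cancels."""
    up = dict(sigma)
    up[k] = +1
    down = dict(sigma)
    down[k] = -1
    w_up = gibbs_weight(up, nbrs, beta)
    w_down = gibbs_weight(down, nbrs, beta)
    return w_up / (w_up + w_down)

def conditional_closed_form(sigma, k, nbrs, beta):
    """Closed form from (0.1): 1 / (1 + exp(-2 * beta * sum of neighbor spins))."""
    s = sum(sigma[v] for v in nbrs[k])
    return 1.0 / (1.0 + math.exp(-2.0 * beta * s))

# Assumed example values, not taken from the task description.
n, beta = 3, 0.7
nbrs = grid_neighbors(n)
vertices = list(nbrs)
max_gap = 0.0
for spins in itertools.product((-1, +1), repeat=len(vertices)):
    sigma = dict(zip(vertices, spins))
    for k in vertices:
        gap = abs(conditional_from_weights(sigma, k, nbrs, beta)
                  - conditional_closed_form(sigma, k, nbrs, beta))
        max_gap = max(max_gap, gap)
print(max_gap)  # largest discrepancy; expected to be at floating-point rounding level
```

Because the closed form only needs the spins of the neighbors of $k$, it can be evaluated in constant time per vertex, whereas the weight-ratio version touches every edge of the lattice.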
