## 1. Expression of $ Z(\alpha, \beta)$

The partition function $ Z(\alpha, \beta) $ is the normalizing constant that makes $ p(x) $ sum to one over all configurations, so that $ p(x) $ is a valid probability distribution:

$$
Z(\alpha, \beta) = \sum_{x \in \{0,1\}^n} \exp \left( \alpha \sum_i x_i + \beta \sum_{i \sim j} \mathbf{1}_{x_i = x_j} \right)
$$

where the outer sum runs over all $ 2^n $ possible configurations $ x $ of the system, and $ i \sim j $ ranges over pairs of neighboring sites.

### Why is $ Z(\alpha, \beta)$ difficult to compute?
- The sum has one term per configuration, so its cost grows **exponentially** in $ n $.
- Computing it exactly requires summing over $ 2^n $ configurations, which is infeasible for large $ n $ (already about $ 10^{30} $ terms for $ n = 100 $).
- Instead, we use **sampling methods** (such as Monte Carlo methods) to approximate it.
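
For very small systems the sum can still be evaluated by brute force, which makes the $ 2^n $ cost concrete. The following is a minimal sketch, not a reference implementation: the function name `partition_function`, the `edges` list of neighboring pairs, and the 4-spin chain are all assumptions made for illustration.

```python
import itertools
import math

def partition_function(alpha, beta, n, edges):
    """Exact Z(alpha, beta) by enumerating all 2**n configurations."""
    total = 0.0
    for x in itertools.product((0, 1), repeat=n):
        field = sum(x)                                    # sum_i x_i
        agree = sum(1 for i, j in edges if x[i] == x[j])  # agreeing neighbor pairs
        total += math.exp(alpha * field + beta * agree)
    return total

# Hypothetical example: a chain of 4 spins where consecutive sites are neighbors.
n = 4
edges = [(i, i + 1) for i in range(n - 1)]
z_zero = partition_function(0.0, 0.0, n, edges)  # every term is exp(0) = 1
z = partition_function(0.5, 1.0, n, edges)
```

With $ \alpha = \beta = 0 $ every configuration contributes 1, so `z_zero` equals $ 2^n $. The loop body runs $ 2^n $ times, so each additional spin doubles the running time.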






## Likelihood and Log-Likelihood 

Let $ x^1, x^2, \dots, x^N $ denote $ N $ **observed configurations** of the Ising model, assumed to be independent draws from it. Each $ x^i \in \{0,1\}^n $ is a vector of binary spin values; $ N $ is the number of observations, not to be confused with the number of spins $ n $.

### Likelihood Function

The **likelihood** is the **product of the probabilities** of these observed configurations given the model parameters $ \alpha $ and $ \beta $. For each configuration $ x^i $, the probability is given by the Ising model distribution:

$$
p(x^i \mid \alpha, \beta) = \frac{1}{Z(\alpha, \beta)} \exp \left( \alpha \sum_j x^i_j + \beta \sum_{j \sim k} \mathbf{1}_{x^i_j = x^i_k} \right)
$$

Because the observations are independent, the likelihood of the full sample $ x^1, x^2, \dots, x^N $ is:

$$
L(\alpha, \beta \mid x^1, x^2, \dots, x^N) = \prod_{i=1}^{N} p(x^i \mid \alpha, \beta)
$$

Substituting the probability for each observation:

$$
L(\alpha, \beta \mid x^1, x^2, \dots, x^N) = \prod_{i=1}^{N} \frac{1}{Z(\alpha, \beta)} \exp \left( \alpha \sum_j x^i_j + \beta \sum_{j \sim k} \mathbf{1}_{x^i_j = x^i_k} \right)
$$

### Log-Likelihood Function

Taking the **logarithm** turns the product into a sum and gives the **log-likelihood**. The factor $ 1 / Z(\alpha, \beta) $ appears once per observation, hence the term $ N \log Z(\alpha, \beta) $:

$$
\log L(\alpha, \beta \mid x^1, x^2, \dots, x^N) = \sum_{i=1}^{N} \left( \alpha \sum_j x^i_j + \beta \sum_{j \sim k} \mathbf{1}_{x^i_j = x^i_k} \right) - N \log Z(\alpha, \beta)
$$

where:
- $ x^i_j $ refers to the $ j $-th element (spin) of the $ i $-th configuration vector $ x^i $.
- The sum $ \sum_j x^i_j $ is the number of spins equal to 1 in the $ i $-th configuration, since each spin takes values in $ \{0, 1\} $.
- The sum $ \sum_{j \sim k} \mathbf{1}_{x^i_j = x^i_k} $ counts the number of neighboring pairs of spins that are equal in the $ i $-th configuration.
- $ Z(\alpha, \beta) $ is the partition function, which normalizes the distribution.
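
The log-likelihood can be evaluated directly when the number of spins is small enough to enumerate. The sketch below is an illustration under assumptions, not the method used later in these notes: the helper names (`log_weight`, `log_partition`, `log_likelihood`), the chain-shaped `edges` list, and the three example observations are all hypothetical.

```python
import itertools
import math

def log_weight(x, alpha, beta, edges):
    """Unnormalized log-probability: the exponent of the Ising distribution."""
    return alpha * sum(x) + beta * sum(1 for j, k in edges if x[j] == x[k])

def log_partition(alpha, beta, n_spins, edges):
    """log Z by brute-force enumeration (only feasible for small n_spins)."""
    return math.log(sum(math.exp(log_weight(x, alpha, beta, edges))
                        for x in itertools.product((0, 1), repeat=n_spins)))

def log_likelihood(alpha, beta, observations, edges):
    """Sum of the exponents over observations, minus (count) * log Z."""
    n_spins = len(observations[0])
    exponent_sum = sum(log_weight(x, alpha, beta, edges) for x in observations)
    return exponent_sum - len(observations) * log_partition(alpha, beta, n_spins, edges)

# Hypothetical data: three observed configurations on a chain of 4 spins.
edges = [(0, 1), (1, 2), (2, 3)]
observations = [(0, 1, 1, 0), (1, 1, 1, 1), (0, 0, 0, 0)]
ll = log_likelihood(0.2, 0.8, observations, edges)
```

The only expensive call is `log_partition`, which is evaluated once per parameter pair regardless of how many observations there are.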

---

### Why is MLE difficult?
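- Computing $ Z(\alpha, \beta) $ exactly is **intractable** for large $ n $, so the log-likelihood itself cannot be evaluated.
- Gradient-based optimization is challenging because the gradient of the log-likelihood contains the gradient of $ \log Z(\alpha, \beta) $, which again requires a sum over all $ 2^n $ configurations.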
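
To see where the gradient difficulty comes from, one can differentiate $ \log Z(\alpha, \beta) $ directly. The identities below follow from the definition of $ Z(\alpha, \beta) $ by differentiating under the sum and dividing by $ Z(\alpha, \beta) $; the notation $ \mathbb{E}_{\alpha, \beta} $ for an expectation under $ p(x \mid \alpha, \beta) $ is introduced here for illustration:

$$
\frac{\partial \log Z(\alpha, \beta)}{\partial \alpha} = \mathbb{E}_{\alpha, \beta} \left[ \sum_j x_j \right],
\qquad
\frac{\partial \log Z(\alpha, \beta)}{\partial \beta} = \mathbb{E}_{\alpha, \beta} \left[ \sum_{j \sim k} \mathbf{1}_{x_j = x_k} \right]
$$

Each expectation is taken under the model itself, so in practice it would have to be estimated by sampling from $ p(x \mid \alpha, \beta) $ at every step of the optimizer.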

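👉 **Solution**: Instead of MLE, we use **Approximate Bayesian Computation (ABC)**, which only requires simulating configurations from the model and never evaluates $ Z(\alpha, \beta) $.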
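
As a rough illustration of the idea, here is a minimal ABC rejection sketch. It is not necessarily the scheme used in these notes, and everything specific in it is an assumption: the uniform priors on $ \alpha $ and $ \beta $, the single-site Gibbs sampler used as the simulator, the use of a single observed configuration, the absolute-difference distance on the two summary statistics, the tolerance `eps`, and all function names.

```python
import math
import random

def summary(x, edges):
    """Summary statistics: number of ones and number of agreeing neighbor pairs."""
    return sum(x), sum(1 for j, k in edges if x[j] == x[k])

def gibbs_sample(alpha, beta, n_spins, edges, sweeps, rng):
    """Approximate draw from the model using single-site Gibbs updates."""
    neighbors = [[] for _ in range(n_spins)]
    for j, k in edges:
        neighbors[j].append(k)
        neighbors[k].append(j)
    x = [rng.randint(0, 1) for _ in range(n_spins)]
    for _ in range(sweeps):
        for j in range(n_spins):
            ones = sum(x[k] for k in neighbors[j])
            zeros = len(neighbors[j]) - ones
            # Log-odds of x_j = 1 versus x_j = 0 given the neighbors.
            logit = alpha + beta * (ones - zeros)
            p_one = 1.0 / (1.0 + math.exp(-logit))
            x[j] = 1 if rng.random() < p_one else 0
    return x

def abc_rejection(observed, edges, n_draws, eps, rng, sweeps=20):
    """Keep prior draws whose simulated summaries land within eps of the data."""
    target = summary(observed, edges)
    accepted = []
    for _ in range(n_draws):
        alpha = rng.uniform(-1.0, 1.0)  # assumed prior, for illustration only
        beta = rng.uniform(0.0, 1.0)    # assumed prior, for illustration only
        sim = gibbs_sample(alpha, beta, len(observed), edges, sweeps, rng)
        s = summary(sim, edges)
        if abs(s[0] - target[0]) + abs(s[1] - target[1]) <= eps:
            accepted.append((alpha, beta))
    return accepted

# Hypothetical data: one observed configuration on a chain of 10 spins.
rng = random.Random(0)
edges = [(j, j + 1) for j in range(9)]
observed = [1, 1, 1, 0, 0, 1, 1, 1, 1, 0]
posterior_draws = abc_rejection(observed, edges, n_draws=500, eps=1, rng=rng)
```

The accepted pairs in `posterior_draws` serve as an approximate posterior sample. No step computes $ Z(\alpha, \beta) $: the Gibbs update depends only on a ratio of probabilities, in which the partition function cancels. A smaller `eps` gives a closer approximation at the cost of fewer accepted draws.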