# Simulation Proofs

Simulation proofs are a fundamental technique in cryptographic security proofs. This tutorial introduces the simulation proof paradigm through semantic security and gradually builds up to more complex applications.

:::{note}
**Reference**: This material is based on "How To Simulate It – A Tutorial on the Simulation Proof Technique" by Yehuda Lindell (2016).
:::

## Introduction to Simulation
:label: intro-simulation

The **simulation paradigm** is a method for proving security in cryptography. At its core, the idea is simple: if an execution can be simulated without access to certain secret information, then that execution reveals nothing about that secret information.

### What is Simulation?

In a simulation proof, we construct an algorithm called a **simulator** that generates a view that is computationally indistinguishable from the real execution, but without using the actual secret inputs. If such a simulator exists, it proves that the real execution reveals nothing beyond what can be computed without the secrets.

### The Three Tasks of a Simulator

A simulator in cryptographic proofs typically needs to accomplish three main tasks:

1. **Generate the adversary's view**: The simulator must create messages and protocol transcripts that look like a real execution to the adversary

2. **Extract the adversary's input**: When dealing with malicious adversaries, the simulator must determine what input the adversary is effectively using

3. **Make the view consistent with the output**: The simulated view must be consistent with the output that the adversary receives or can compute

:::{tip} Key Insight
The existence of a simulator that can accomplish these tasks without the honest party's private input proves that the adversary learns nothing from the protocol beyond what is implied by the output alone.
:::

## Preliminaries and Notation
:label: prelim-simulation

### Computational Indistinguishability

The foundation of simulation proofs rests on the concept of computational indistinguishability.

:::{note} Definition: Computational Indistinguishability
Two probability ensembles $X = \{X_n\}_{n\in\mathbb{N}}$ and $Y = \{Y_n\}_{n\in\mathbb{N}}$ are **computationally indistinguishable**, denoted $X \stackrel{c}{\equiv} Y$, if for every probabilistic polynomial-time distinguisher $D$ there exists a negligible function $\mu(\cdot)$ such that for all $n \in \mathbb{N}$:

$$
\left| \Pr[D(X_n) = 1] - \Pr[D(Y_n) = 1] \right| \leq \mu(n)
$$
:::

**Intuition**: Two distributions are computationally indistinguishable if no efficient algorithm can tell them apart with non-negligible advantage. This is a weaker requirement than statistical indistinguishability (statistical closeness implies it), but it is sufficient for cryptographic purposes because real adversaries are computationally bounded.
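To make the quantity in the definition concrete, here is a minimal sketch that estimates a distinguisher's advantage by sampling. The names `estimate_advantage`, `uniform_bits`, `biased_bits`, and `majority_ones` are illustrative and not from the reference. Sampling can only estimate the advantage of one distinguisher at one fixed length, so it cannot certify that an advantage is negligible.

```python
import random

def estimate_advantage(distinguisher, sample_x, sample_y, trials=20_000, seed=0):
    """Monte Carlo estimate of |Pr[D(X)=1] - Pr[D(Y)=1]|."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    ones_x = sum(distinguisher(sample_x(rng)) for _ in range(trials))
    ones_y = sum(distinguisher(sample_y(rng)) for _ in range(trials))
    return abs(ones_x - ones_y) / trials

def uniform_bits(rng, length=8):
    """Sample a uniform bit string."""
    return [rng.randint(0, 1) for _ in range(length)]

def biased_bits(rng, length=8, p_one=0.6):
    """Sample a bit string whose bits are 1 with probability p_one."""
    return [1 if rng.random() < p_one else 0 for _ in range(length)]

def majority_ones(bits):
    """Distinguisher D: output 1 when strictly more than half the bits are 1."""
    return 1 if sum(bits) > len(bits) / 2 else 0

far_apart = estimate_advantage(majority_ones, uniform_bits, biased_bits)
identical = estimate_advantage(majority_ones, uniform_bits, uniform_bits)
print(f"uniform vs biased:  {far_apart:.3f}")
print(f"uniform vs uniform: {identical:.3f}")
```

A constant bias gives this distinguisher a constant advantage, so the two distributions in the first comparison are not computationally indistinguishable. The second comparison only shows sampling noise.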

### Probability Ensembles

A **probability ensemble** is a sequence of random variables indexed by the security parameter. Formally:

- $\{X_n\}_{n\in\mathbb{N}}$ where each $X_n$ is a random variable over $\{0,1\}^{\text{poly}(n)}$
- Each $X_n$ represents the distribution for security parameter $n$
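As an illustration of an ensemble indexed by $n$, the sketch below compares two toy ensembles exactly for small $n$: $X_n$ is uniform over $\{0,1\}^n$, and $Y_n$ is uniform over the $n$-bit strings other than $0^n$. These ensembles and the distinguisher `is_all_zero` are chosen only for this example. For this distinguisher the advantage is exactly $2^{-n}$, which shrinks faster than any inverse polynomial.

```python
from fractions import Fraction
from itertools import product

def uniform_ensemble(n):
    """X_n: uniform over all n-bit strings, as a dict outcome -> probability."""
    outcomes = list(product((0, 1), repeat=n))
    return {s: Fraction(1, len(outcomes)) for s in outcomes}

def nonzero_ensemble(n):
    """Y_n: uniform over the n-bit strings other than the all-zero string."""
    outcomes = [s for s in product((0, 1), repeat=n) if any(s)]
    return {s: Fraction(1, len(outcomes)) for s in outcomes}

def accept_probability(distinguisher, dist):
    """Exact Pr[D(sample) = 1] for a finite distribution."""
    return sum(p for s, p in dist.items() if distinguisher(s) == 1)

def is_all_zero(s):
    """Distinguisher D: output 1 exactly on the all-zero string."""
    return 0 if any(s) else 1

def advantage(n):
    x, y = uniform_ensemble(n), nonzero_ensemble(n)
    return abs(accept_probability(is_all_zero, x) - accept_probability(is_all_zero, y))

advantages = {n: advantage(n) for n in range(1, 9)}
for n, adv in advantages.items():
    print(n, adv)
```

Enumerating all outcomes is only feasible for very small $n$. The point is that the object being compared is the whole sequence of distributions, one per security parameter.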

### Non-Uniformity

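In cryptographic definitions, we often consider **non-uniform** adversaries that receive an auxiliary input $z_n$ for each security parameter $n$. The advice string $z_n$ may depend on $n$ in an arbitrary way, but its length is bounded by a polynomial in $n$; it models information the adversary obtained before the execution, such as the result of preprocessing. The definition extends naturally:

$$
\left| \Pr[D(X_n, z_n) = 1] - \Pr[D(Y_n, z_n) = 1] \right| \leq \mu(n)
$$

for every sequence of polynomial-size advice strings $\{z_n\}_{n\in\mathbb{N}}$.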

## The Basic Paradigm: Semantic Security
:label: semantic-security

We introduce the simulation paradigm through the example of semantic security for encryption.

### The Setting

Consider a public-key encryption scheme $(Gen, Enc, Dec)$:
- $Gen(1^n)$ generates a key pair $(pk, sk)$
- $Enc_{pk}(m)$ encrypts message $m$ under public key $pk$
- $Dec_{sk}(c)$ decrypts ciphertext $c$ using secret key $sk$

An adversary sees the public key $pk$ and a ciphertext $c = Enc_{pk}(m)$, where $m$ is drawn from some distribution $M$.
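The $(Gen, Enc, Dec)$ interface can be sketched with a toy ElGamal-style scheme. This is an assumption made for illustration, not a scheme taken from the reference, and the parameters `P`, `Q`, and `G` are far too small to offer any security. The sketch only shows the shape of the three algorithms and that encryption is randomized.

```python
import secrets

# Toy parameters (insecure sizes): P is a safe prime, P = 2*233 + 1,
# and G = 4 generates the subgroup of prime order Q = 233.
P, Q, G = 467, 233, 4

def gen():
    """Gen: sample a secret exponent and publish the matching group element."""
    sk = secrets.randbelow(Q - 1) + 1          # sk in [1, Q-1]
    pk = pow(G, sk, P)
    return pk, sk

def enc(pk, m):
    """Enc: encrypt an integer m in [1, P-1] with fresh randomness r."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (m * pow(pk, r, P)) % P

def dec(sk, c):
    """Dec: strip the mask pk^r using the secret key."""
    c1, c2 = c
    mask_inverse = pow(c1, P - 1 - sk, P)      # c1^(-sk) mod P, by Fermat's little theorem
    return (c2 * mask_inverse) % P

pk, sk = gen()
ciphertext = enc(pk, 42)
print(ciphertext, dec(sk, ciphertext))
```

Because `enc` draws fresh randomness on every call, encrypting the same message twice usually yields different ciphertexts. Deterministic public-key encryption cannot be semantically secure, since anyone can re-encrypt a candidate message under $pk$ and compare.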

### Semantic Security Definition

:::{note} Definition 3.1: Semantic Security
An encryption scheme $(Gen, Enc, Dec)$ is **semantically secure** if for every probabilistic polynomial-time adversary $\mathcal{A}$ there exists a probabilistic polynomial-time simulator $\mathcal{S}$ such that for every message distribution $M = \{M_n\}_{n\in\mathbb{N}}$:

$$
\{\mathcal{A}(pk, Enc_{pk}(m))\}_{n\in\mathbb{N}} \stackrel{c}{\equiv} \{\mathcal{S}(pk, 1^{|m|})\}_{n\in\mathbb{N}}
$$

where $(pk, sk) \leftarrow Gen(1^n)$ and $m \leftarrow M_n$.
:::

**Key observation**: The simulator $\mathcal{S}$ receives only:
- The public key $pk$ (which is public anyway)
- The message length $|m|$ (often considered public information)

but **NOT** the actual message $m$ or ciphertext $c$.

### Understanding the Definition

The definition says that whatever the adversary $\mathcal{A}$ can compute from $(pk, c)$, the simulator $\mathcal{S}$ can compute from just $(pk, 1^{|m|})$. This means:

1. The ciphertext $c$ reveals **nothing** about $m$ beyond its length
2. Any information the adversary extracts could have been computed without seeing the ciphertext
3. The scheme achieves a computational analogue of perfect secrecy: the guarantee holds against efficient adversaries rather than unbounded ones

### Example: Constructing a Simulator

Given an adversary $\mathcal{A}$, how do we construct the simulator $\mathcal{S}$?

:::{dropdown} **Simple Construction**

The simulator $\mathcal{S}$ works as follows:

**Input**: Public key $pk$ and message length $1^{|m|}$

**Simulation**:
1. Choose a random message $m' \leftarrow \{0,1\}^{|m|}$ of the correct length
2. Compute $c' = Enc_{pk}(m')$
3. Run $\mathcal{A}(pk, c')$ and output whatever $\mathcal{A}$ outputs

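**Why this works**:
- Assume the scheme has indistinguishable encryptions: $(pk, Enc_{pk}(m)) \stackrel{c}{\equiv} (pk, Enc_{pk}(m'))$ for any messages $m, m'$ of the same length
- Computational indistinguishability is preserved when both sides are passed through the same polynomial-time algorithm: a distinguisher $D$ for the outputs of $\mathcal{A}$ would give the distinguisher $D(\mathcal{A}(\cdot))$ for the ciphertexts
- Therefore $\mathcal{A}(pk, Enc_{pk}(m)) \stackrel{c}{\equiv} \mathcal{A}(pk, Enc_{pk}(m'))$, and the right-hand side is exactly the output distribution of $\mathcal{S}(pk, 1^{|m|})$
- The simulator reproduces the adversary's output distribution without knowing the actual message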
:::
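The three simulation steps in the construction above can be sketched as follows. To have bit-string messages, the sketch assumes a toy bitwise scheme in the style of Goldwasser-Micali with insecure parameter sizes; the scheme, the constants, and `example_adversary` are illustrative choices, not part of the reference. The simulator takes the length as an integer `msg_len` rather than the unary string $1^{|m|}$.

```python
import math
import secrets

# Toy parameters (insecure sizes). Both primes are 3 mod 4, so -1 mod N is a
# quadratic non-residue modulo each prime and has Jacobi symbol +1 modulo N.
P_PRIME, Q_PRIME = 499, 547
N = P_PRIME * Q_PRIME
PK = (N, N - 1)     # public key: modulus and the non-residue x = -1 mod N
SK = P_PRIME        # secret key: one prime factor of N

def enc_bit(pk, bit):
    """Encrypt one bit as y^2 * x^bit mod n for a random unit y."""
    n, x = pk
    while True:
        y = secrets.randbelow(n - 1) + 1
        if math.gcd(y, n) == 1:
            return (y * y * pow(x, bit, n)) % n

def enc(pk, bits):
    return [enc_bit(pk, b) for b in bits]

def dec(sk, ciphertext):
    # Euler's criterion: c is a square mod p exactly when c^((p-1)/2) = 1 mod p.
    return [0 if pow(c, (sk - 1) // 2, sk) == 1 else 1 for c in ciphertext]

def simulator(adversary, pk, msg_len):
    m_prime = [secrets.randbelow(2) for _ in range(msg_len)]  # step 1: random m'
    c_prime = enc(pk, m_prime)                                 # step 2: c' = Enc_pk(m')
    return adversary(pk, c_prime)                              # step 3: output A(pk, c')

def example_adversary(pk, ciphertext):
    """Hypothetical adversary: outputs the parity of the first ciphertext element."""
    return ciphertext[0] % 2

print(simulator(example_adversary, PK, msg_len=8))
```

The simulator never touches the real message or the secret key. It uses only `pk` and the length, which is what the definition allows.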

### The Ideal vs. Real Paradigm

Semantic security can be viewed through the **ideal/real paradigm**:

- **Real World**: Adversary sees $(pk, Enc_{pk}(m))$ for the actual message $m$
- **Ideal World**: Simulator only knows $(pk, 1^{|m|})$ but must produce indistinguishable output

If the ideal and real worlds are indistinguishable, then the real execution leaks no information beyond what's available in the ideal world (i.e., just the message length).
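To see what a gap between the two worlds looks like, the sketch below uses a deliberately broken stand-in for encryption that copies the first message bit into the ciphertext. Everything here (`leaky_enc`, the placeholder key `PK`, the message distribution with first bit fixed to 1) is hypothetical and exists only for the comparison. The ideal world runs the random-message simulator described earlier.

```python
import random

PK = "toy-pk"  # placeholder; the stand-in below ignores the key

def leaky_enc(pk, bits, rng):
    """Deliberately broken stand-in for Enc: leaks the first message bit in the clear.
    The rest of the ciphertext is modelled as random bits unrelated to the message."""
    return [bits[0]] + [rng.randint(0, 1) for _ in bits]

def adversary(pk, ciphertext):
    """Adversary A: reads the leaked bit."""
    return ciphertext[0]

def real_world(rng, msg_len):
    m = [1] + [rng.randint(0, 1) for _ in range(msg_len - 1)]  # M_n: first bit is always 1
    return adversary(PK, leaky_enc(PK, m, rng))

def ideal_world(rng, msg_len):
    m_prime = [rng.randint(0, 1) for _ in range(msg_len)]      # simulator never sees m
    return adversary(PK, leaky_enc(PK, m_prime, rng))

def frequency_of_one(world, trials=10_000, seed=1, msg_len=8):
    """Empirical probability that the world's output equals 1."""
    rng = random.Random(seed)
    return sum(world(rng, msg_len) for _ in range(trials)) / trials

real_freq = frequency_of_one(real_world)
ideal_freq = frequency_of_one(ideal_world)
gap = abs(real_freq - ideal_freq)
print(real_freq, ideal_freq, gap)
```

The gap of roughly one half is what a distinguisher would measure against this particular simulator. The definition quantifies over all simulators, so one failing simulator is not by itself a proof of insecurity; it only illustrates the kind of difference that a secure scheme must rule out.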

:::{important} Key Takeaway
The simulation paradigm proves security by showing that everything an adversary can do in a real execution could also be done in an ideal world where the adversary has no access to secret information. This proof technique extends far beyond encryption to secure computation, zero-knowledge proofs, and many other cryptographic protocols.
:::

## Summary

In this section, we learned:

1. **Simulation** is a proof technique where we show security by constructing an algorithm that mimics real executions without secret information
2. Simulators typically must: generate views, extract inputs, and ensure consistency
3. **Computational indistinguishability** is the mathematical foundation that allows simulators to "approximately" mimic real executions
4. **Semantic security** demonstrates the paradigm: if a simulator that knows only the message length can reproduce the adversary's output, then the ciphertext reveals nothing about the message beyond its length
5. The **ideal/real paradigm** compares the real world, where the adversary sees $(pk, Enc_{pk}(m))$, against an ideal world in which only $(pk, 1^{|m|})$ is available

This foundation prepares us for more complex applications in secure computation, zero-knowledge proofs, and other advanced protocols.
