# Historical Foundations of the Change-of-Variables Principle and the Jacobian Determinant

## 1. Joseph-Louis Lagrange: *Théorie des fonctions analytiques* (1797)

**Field:** Classical mathematical analysis  

### Contribution
Lagrange laid some of the earliest conceptual foundations for the change-of-variables principle in multivariate calculus, particularly in the study of multiple integrals. His work implicitly recognized that transforming coordinates in a multidimensional space alters how infinitesimal regions are measured. Although the formal concept of the Jacobian determinant had not yet been introduced, Lagrange already understood that variable transformations require a compensating correction factor to account for changes in local volume.

### Significance
This work represents the conceptual seed of what would later become the Jacobian determinant. The notion that coordinate transformations require a principled adjustment of “volume” existed decades before the formal machinery was developed.

---

## 2. Carl Gustav Jacob Jacobi: *De determinantibus functionalibus* (1841)

**Field:** Differential analysis  

### Core Contribution
Jacobi introduced the formal definition of the Jacobian determinant as a precise mathematical object that quantifies how a differentiable transformation alters local volume. He explicitly linked this determinant to transformations of variables in multivariate calculus.

### Central Statement (Modern Interpretation)
If  
$$
\mathbf{y} = f(\mathbf{x}),
$$
then local volume elements transform according to
$$
d\mathbf{y} = \left| \det \left( \frac{\partial \mathbf{y}}{\partial \mathbf{x}} \right) \right| d\mathbf{x}.
$$
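
As a small illustration of the volume-scaling statement above, the following sketch estimates the Jacobian determinant of the polar-to-Cartesian map by central differences and compares it with the known analytic value $r$. The function names and the step size `h` are illustrative choices, not part of Jacobi's work.

```python
import math

def polar_to_cartesian(r, theta):
    """Map polar coordinates (r, theta) to Cartesian (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_det_2d(f, u, v, h=1e-6):
    """Central-difference estimate of det(d f / d(u, v)) for f: R^2 -> R^2."""
    # Partial derivatives of both outputs with respect to u, then v.
    d_du = [(a - b) / (2 * h) for a, b in zip(f(u + h, v), f(u - h, v))]
    d_dv = [(a - b) / (2 * h) for a, b in zip(f(u, v + h), f(u, v - h))]
    # 2x2 determinant: (dx/du)(dy/dv) - (dx/dv)(dy/du).
    return d_du[0] * d_dv[1] - d_dv[0] * d_du[1]

r, theta = 2.0, 0.7
det_numeric = jacobian_det_2d(polar_to_cartesian, r, theta)
print(det_numeric, r)  # the two values should agree closely
```

The determinant equals $r$, which is why the area element in polar coordinates is $r \, dr \, d\theta$.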

### Significance
This paper is the classical source of the Jacobian determinant, which bears Jacobi’s name. Later uses of Jacobians in probability, statistics, physics, and modern generative modeling build on the formulation he systematized.

---

## 3. Andrey Kolmogorov: *Foundations of the Theory of Probability* (1933)

**Field:** Probability theory  

### Contribution
Kolmogorov established the axiomatic foundation of probability theory, transforming probability into a rigorous mathematical discipline. Within this framework, the change-of-variables formula became essential for defining probability densities under transformations of continuous random variables.

### Conceptual Role
Kolmogorov elevated the Jacobian determinant from a geometric construct to a probabilistic necessity. For an invertible, differentiable transformation
$$
\mathbf{y} = f(\mathbf{x}),
$$
with inverse $\mathbf{x} = f^{-1}(\mathbf{y})$, probability density functions must satisfy
$$
p_{\mathbf{y}}(\mathbf{y}) = p_{\mathbf{x}}(\mathbf{x})
\left| \det \left( \frac{\partial \mathbf{x}}{\partial \mathbf{y}} \right) \right|.
$$
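
The density formula can be checked numerically in one dimension. The sketch below assumes $X \sim \mathcal{N}(0, 1)$ and $Y = e^{X}$ (an illustrative choice, not taken from Kolmogorov), builds the density of $Y$ from the formula, and confirms that the probability of an interval is the same whether it is computed in $x$ or in $y$.

```python
import math

def normal_pdf(x):
    """Standard normal density p_x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def density_of_y(y):
    """Density of Y = exp(X) via p_y(y) = p_x(x) * |dx/dy| with x = log(y)."""
    x = math.log(y)      # inverse map x = f^{-1}(y)
    dx_dy = 1.0 / y      # one-dimensional Jacobian of the inverse map
    return normal_pdf(x) * abs(dx_dy)

# Probability that Y lies in [1, e], by midpoint-rule integration of p_y.
a, b = 1.0, math.e
n = 10_000
step = (b - a) / n
mass_y = sum(density_of_y(a + (i + 0.5) * step) for i in range(n)) * step

# The same event in x-space is log(a) <= X <= log(b).
mass_x = normal_cdf(math.log(b)) - normal_cdf(math.log(a))
print(mass_y, mass_x)  # the two values should agree closely
```

Dropping the `abs(dx_dy)` factor would break this agreement, which is the sense in which the Jacobian is required for consistency.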

### Significance
From this point onward, Jacobians were no longer optional mathematical tools; they became structurally required for consistency in continuous probability theory.

---

## 4. Harold Jeffreys: *Theory of Probability* (1939)

**Field:** Bayesian statistics  

### Contribution
Jeffreys made systematic use of Jacobian determinants in:
- Reparameterization of random variables  
- Transformation of prior distributions  

He demonstrated that changing a model’s parameterization requires a Jacobian correction to preserve probability mass.
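
A minimal sketch of such a reparameterization, assuming the scale prior $p(\sigma) \propto 1/\sigma$ as the example (unnormalized, and chosen here for illustration): rewriting it in terms of $\phi = \log \sigma$ with the Jacobian factor $|d\sigma / d\phi|$ yields a prior that is flat in $\phi$.

```python
import math

def prior_sigma(sigma):
    """Unnormalized scale prior p(sigma) proportional to 1/sigma."""
    return 1.0 / sigma

def prior_log_sigma(phi):
    """The same prior expressed in phi = log(sigma)."""
    sigma = math.exp(phi)         # inverse map sigma(phi)
    dsigma_dphi = math.exp(phi)   # Jacobian of the inverse map
    return prior_sigma(sigma) * abs(dsigma_dphi)

# The transformed prior takes the same value at every phi.
values = [prior_log_sigma(phi) for phi in (-3.0, 0.0, 2.5)]
print(values)
```

Without the Jacobian factor, the prior in $\phi$ would be $e^{-\phi}$ instead of a constant, so the two parameterizations would assign different mass to the same set of models.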

### Special Importance
Jeffreys reframed the Jacobian as a probabilistic correction factor rather than a purely geometric one. This interpretation aligns directly with likelihood-based modeling.

### Direct Link to Flows
The logic of normalizing flows—correcting probability densities after invertible transformations—is identical to the logic Jeffreys used in Bayesian inference.

---

## 5. Athanasios Papoulis: *Probability, Random Variables, and Stochastic Processes* (1965)

**Field:** Statistical signal processing  

### Contribution
Papoulis provided a systematic and pedagogical treatment of variable transformations in multivariate probability distributions. He explicitly described the Jacobian determinant as the factor governing the expansion or contraction of probability density under transformation.

### Special Importance
This book became a canonical reference for generations of engineers and scientists. Many early machine-learning researchers learned the probabilistic change-of-variables principle—and the role of the Jacobian—directly from Papoulis.

---

## 6. Hyvärinen, Karhunen, Oja: *Independent Component Analysis* (2001)

**Field:** Independent Component Analysis (ICA) / Signal Processing  

### Contribution
ICA models observed data as transformations of latent independent sources. The likelihood function explicitly includes the Jacobian determinant of the transformation:
$$
\log p_{\mathbf{x}}(\mathbf{x}) = \log p_{\mathbf{s}}(\mathbf{s}) + \log \left| \det \left( \frac{\partial \mathbf{s}}{\partial \mathbf{x}} \right) \right|.
$$
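
For linear ICA the transformation is $\mathbf{s} = W\mathbf{x}$ with an unmixing matrix $W$, so the Jacobian term reduces to $\log |\det W|$. The sketch below evaluates this log-likelihood for a 2x2 case. The Laplace source density and the function names are assumptions made for illustration, not choices taken from the book.

```python
import math

def log_laplace(s):
    """Log-density of a unit-scale Laplace source (assumed source prior)."""
    return -abs(s) - math.log(2.0)

def ica_log_likelihood(x, W):
    """log p(x) = sum_i log p_i(s_i) + log|det W| for s = W x (2x2 case)."""
    s = [W[0][0] * x[0] + W[0][1] * x[1],
         W[1][0] * x[0] + W[1][1] * x[1]]
    det_w = W[0][0] * W[1][1] - W[0][1] * W[1][0]
    # Independent sources: the joint log-density is a sum over components.
    return sum(log_laplace(si) for si in s) + math.log(abs(det_w))

x = (1.0, 2.0)
W = [[2.0, 0.0], [0.0, 1.0]]
print(ica_log_likelihood(x, W))
```

The `log(abs(det_w))` term keeps the likelihood normalized: without it, scaling `W` down would shrink every $|s_i|$ and raise the source log-densities without any penalty, which is exactly what the Jacobian correction prevents.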

### Key Insight
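The independence assumption enters through a factorized source density, $p_{\mathbf{s}}(\mathbf{s}) = \prod_i p_i(s_i)$. The change of variables carries that assumption over to a likelihood for the observed data, with the Jacobian term keeping the likelihood properly normalized.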

### Direct Connection to Flows
NICE (Non-linear Independent Components Estimation) can be interpreted as a nonlinear, neural generalization of ICA, as its name suggests. The conceptual transition from ICA to flow-based models is historically direct and continuous.

---

## 7. Christopher M. Bishop: *Pattern Recognition and Machine Learning* (2006)

**Field:** Statistical machine learning  

### Contribution
Bishop devoted explicit sections to:
- Change of variables  
- Density transformation  

He presented these topics in a form accessible to machine-learning researchers, bridging classical probability theory and modern ML practice.

### Significance
This textbook served as a primary academic reference for early normalizing flow research. The mathematical framework used in flows appears almost verbatim in Bishop’s treatment.

---

## 8. David J. C. MacKay: *Information Theory, Inference, and Learning Algorithms* (2003)

**Field:** Information theory  

### Contribution
MacKay connected the Jacobian determinant to:
- Entropy  
- Volume preservation  
- Conservation of probability mass  

### Central Principle
Probability mass must be conserved under transformations:
$$
p_{\mathbf{x}}(\mathbf{x}) \, d\mathbf{x} = p_{\mathbf{z}}(\mathbf{z}) \, d\mathbf{z}.
$$

This conservation law necessitates Jacobian corrections for probabilistic consistency.
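
One consequence of this conservation law is that differential entropy shifts by the expected log-Jacobian. For a linear map $y = a x$ the shift is exactly $\log |a|$, and a volume-preserving map ($|a| = 1$) leaves entropy unchanged. The sketch below checks this for a Gaussian, using its closed-form entropy. The specific values of `a` and `sigma_x` are arbitrary illustrations.

```python
import math

def gaussian_entropy(sigma):
    """Differential entropy (in nats) of a Gaussian with std. dev. sigma."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

a, sigma_x = 3.0, 1.5
h_x = gaussian_entropy(sigma_x)
# If X is Gaussian with std. dev. sigma_x, then Y = a X has std. dev. |a| sigma_x.
h_y = gaussian_entropy(abs(a) * sigma_x)

print(h_y - h_x, math.log(abs(a)))  # the two values should agree closely
```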

---

## Historical Synthesis

Flow-based generative models did **not** invent the Jacobian determinant.  
They operationalized a mathematical tool developed over more than two centuries across:
- Calculus  
- Geometry  
- Probability  
- Statistics  
- Signal processing  
- Information theory  

### What Was *Not* New in Flow Models
- The Jacobian determinant  
- The change-of-variables formula  

### What *Was* New
- Neural parameterizations of invertible transformations  
- Architectural designs enabling tractable Jacobian computation in high dimensions  
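
As a sketch of what a tractable Jacobian can look like, the following two-dimensional affine coupling layer leaves one coordinate unchanged and transforms the other with a scale and shift computed from the first. Its Jacobian is triangular, so the log-determinant is just the log-scale, with no determinant computation needed. The functions `scale_net` and `shift_net` are hypothetical stand-ins for neural networks, and the whole block is an illustration under those assumptions rather than the implementation of any particular published model.

```python
import math

def scale_net(x1):
    """Hypothetical stand-in for a network that outputs a log-scale."""
    return math.tanh(0.5 * x1)

def shift_net(x1):
    """Hypothetical stand-in for a network that outputs a shift."""
    return math.sin(x1)

def coupling_forward(x1, x2):
    """y1 = x1, y2 = x2 * exp(s(x1)) + t(x1). Returns (y1, y2, log|det J|)."""
    s = scale_net(x1)
    y2 = x2 * math.exp(s) + shift_net(x1)
    # The Jacobian is lower triangular with diagonal (1, exp(s)),
    # so log|det J| = s.
    return x1, y2, s

def coupling_inverse(y1, y2):
    """Exact inverse: x1 is passed through, so s and t can be recomputed."""
    s = scale_net(y1)
    return y1, (y2 - shift_net(y1)) * math.exp(-s)

y1, y2, log_det = coupling_forward(0.8, -1.3)
print(coupling_inverse(y1, y2), log_det)
```

Neither `scale_net` nor `shift_net` has to be invertible, since the inverse only evaluates them in the forward direction. That freedom is what lets arbitrary networks be used inside an invertible transformation.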

---

## Unifying Academic Statement

Normalizing flows are not a new probabilistic theory,  
but a neural parameterization of classical change-of-variables principles.
