[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/rndsrc/stat2ml/blob/main/stat2ml.ipynb)

# From Statistics to Machine Learning

![meme](stat2ml.png)

## Why AI?

AI is everywhere in the news these days.

People debate whether we are in an AI bubble (there is even a
[Wikipedia page about it](https://en.wikipedia.org/wiki/AI_bubble))
and whether today's massive investments will lead to transformative
breakthroughs or another AI winter.

But regardless of future speculation, one thing is clear:

> **AI has already delivered real, undeniable scientific and
>   technological breakthroughs.**

Students use ChatGPT daily, but more importantly, the last decade has
produced advances that matter directly to science:
* **AlphaGo** (2016):
  A [reinforcement learning system](https://www.nature.com/articles/nature16961)
  that defeated the world champion in Go, a game once believed to
  require human intuition.
  There is even a
  [documentary](https://www.youtube.com/watch?v=WXuK6gekU1Y) on it.
* **The Transformer Architecture** (2017):
  Introduced in
  "[Attention Is All You Need](https://arxiv.org/abs/1706.03762)",
  this model revolutionized sequence learning and laid the foundation
  for today's large language models including ChatGPT.
* **AlphaFold** (~2020):
  Achieved near-experimental accuracy in
  [predicting protein structures](https://www.nature.com/articles/s41586-021-03819-2),
  solving a 50-year grand challenge in biology.
  Its developers shared the
  [Nobel Prize in Chemistry 2024](https://www.nobelprize.org/prizes/chemistry/2024/summary/).
* **Large Language Models** (~2020-today):
  Enabled complex reasoning, coding, symbolic manipulation, and
  scientific workflows at scale.
  Notable startups include
  [OpenAI](https://openai.com/) and
  [Anthropic](https://www.anthropic.com/).
* **Diffusion Models** (~2020-today):
  Generated photorealistic images, molecular structures, and
  simulation surrogates using
  [probabilistic forward-reverse processes](https://arxiv.org/abs/2209.00796).
* **AI-assisted scientific discovery** (today):
  [AI systems](https://www.nature.com/articles/d41586-024-02842-3)
  now help design new materials, discover antibiotics,
  control fusion plasmas, and analyze particle-physics, astrophysics,
  and cosmology datasets.

These are true algorithmic innovations with real scientific value!

As physicists and scientists, it is important to ask:

> **What is AI?  
>   How does AI work?  
>   How can we use AI to accelerate scientific discovery?**

The last question can become an excellent open-ended homework
problem.
For this lab, we will focus primarily on the first two.

## What is AI/ML?

The current wave of AI can be viewed as a continuation of earlier
ideas that appeared under buzzwords like *"big data"* and *"data
science"*.

To understand this evolution, it helps to take a step back and look at
the history of scientific methodology.

### The First Two Paradigms: Experiment & Theory

Before modern computing, science operated through two complementary
paradigms:

1. **Empirical/Experimental Science**

   * Start with observations.
   * Identify patterns and regularities in nature.
   * Build phenomenological descriptions.

2. **Theoretical Science**

   * Start with mathematical principles.
   * Derive predictions about how systems should behave.
   * Compare theory to experiments.

Together, experiment and theory form the foundation of the classical
scientific method.

### The Third Paradigm: Computational Science

As physics, chemistry, and engineering advanced, systems became too
complex for purely pencil-and-paper analysis: turbulence, weather,
plasma physics, galaxy formation, quantum many-body systems, general
relativity, etc.

Numerical algorithms became essential:

> Theoretical science + computing power = computational science.

This gave rise to the **third paradigm: computational science**, i.e.,
using algorithms and simulations to test and extend theory, and make
predictions.

### The Fourth Paradigm: Data Science

In recent decades:

* Experiments produce massive data streams (radio telescopes, climate
  satellites).
* Sensors have become cheap and widespread.
* Digital transactions and interactions have created enormous datasets.

When **the data** becomes too large for traditional statistical
analysis, we need new tools to find structure and correlations and to
make predictions.

This drove the "big data" era and the rise of **data science**:

> Empirical science + computing power = data science.

This is the **fourth paradigm of science**.

This notebook is a **lab-style introduction** that takes you on a
smooth path through:

* basic **statistics**;
* statistical **moments**;
* simple **curve fitting**;
* gradient-based **optimization**;
* automatic differentiation with **JAX**, and
* a first **deep learning** model on MNIST.

To see the connections between the different steps, we try to change
only a few things at a time.
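
As a small preview of how the first and fourth steps connect, here is a minimal sketch in plain Python (not the notebook's own code). It assumes a made-up toy dataset and hypothetical helper names (`mse_grad`, `fit_constant`), and it checks that the sample mean from basic statistics is also what gradient descent finds when fitting a constant to the data under a squared-error loss.

```python
import statistics

# Hypothetical toy data, made up purely for illustration.
data = [2.0, 3.0, 5.0, 7.0, 11.0]

def mse_grad(mu, xs):
    """Gradient of the loss (1/N) * sum((x - mu)**2) with respect to mu."""
    return -2.0 * sum(x - mu for x in xs) / len(xs)

def fit_constant(xs, lr=0.1, steps=200):
    """Fit a constant model mu by plain gradient descent, starting from 0."""
    mu = 0.0
    for _ in range(steps):
        mu -= lr * mse_grad(mu, xs)
    return mu

mu_stat = statistics.mean(data)  # statistics view: the sample mean
mu_opt = fit_constant(data)      # optimization view: minimizer of the loss

print(mu_stat, mu_opt)
```

The learning rate and step count here are arbitrary choices that happen to converge for this toy problem. Only one thing changes between the two views: the closed-form mean is replaced by an iterative minimization of a loss.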