# Statistical Illusions: When Data Misleads Rational Thinking

In quantitative analysis, numbers often appear authoritative. Means, correlations, trends, and statistical significance are routinely interpreted as objective signals of truth. However, many widely used statistical summaries can produce systematically misleading conclusions when structural assumptions are ignored.

This project investigates **statistical illusions**: situations in which correct calculations lead to incorrect interpretations.

Rather than focusing on data collection or prediction, this project centers on one question:

> Under what structural conditions do standard statistical tools distort reality?

We approach this through controlled simulation using synthetic data, where the data-generating process (DGP) is explicitly defined. By constructing scenarios with known ground truth, we can precisely identify when and why statistical summaries diverge from underlying structure.
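
As a minimal sketch of what "known ground truth" means here, the snippet below defines a hypothetical DGP in which a hidden variable `z` drives both `x` and `y`, while `x` has no effect on `y` by construction. The function names, noise scales, and sample size are illustrative assumptions, not part of the project specification.

```python
import numpy as np


def simulate_confounded(n=10_000, seed=0):
    """Hypothetical DGP: z causes both x and y; x has no effect on y."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                  # hidden common cause
    x = z + rng.normal(scale=0.5, size=n)   # x depends only on z
    y = z + rng.normal(scale=0.5, size=n)   # y depends only on z, never on x
    return x, y, z


def residualize(a, z):
    """Remove the linear effect of z from a using a least-squares fit."""
    slope, intercept = np.polyfit(z, a, 1)
    return a - (slope * z + intercept)


x, y, z = simulate_confounded()

# Naive summary: x and y look strongly related.
raw_corr = np.corrcoef(x, y)[0, 1]

# Structure-aware summary: correlate what is left after accounting for z.
partial_corr = np.corrcoef(residualize(x, z), residualize(y, z))[0, 1]

print(f"raw correlation:     {raw_corr:.3f}")
print(f"partial correlation: {partial_corr:.3f}")
```

Under these assumptions the population correlation between `x` and `y` is 1 / 1.25 = 0.8, while the correlation after removing `z` is zero. Because the mechanism is written down, any gap between the two summaries can be attributed to structure rather than to chance.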

---

## Objectives

1. Demonstrate how aggregation can reverse relationships (Simpson’s Paradox).
2. Show how skewness pulls the mean away from the typical outcome (for example, the median).
3. Examine survivorship bias and selection effects.
4. Analyze spurious correlation arising from hidden variables.
5. Clarify the limits of expectation-based reasoning.
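
The first objective can be sketched with a small synthetic example. The group offsets, slopes, and noise levels below are assumptions chosen only so that the effect is easy to see: within every group `y` falls as `x` rises, yet the pooled fit slopes upward because groups with larger `x` also have larger `y`.

```python
import numpy as np

rng = np.random.default_rng(42)

n_per_group = 500
groups = np.repeat(np.arange(3), n_per_group)   # group labels 0, 1, 2

# Within-group deviation of x from its group center.
e = rng.normal(size=groups.size)

# Group centers rise together (4*g for x, 5*g for y),
# but inside each group y decreases with x (slope -1).
x = 4.0 * groups + e
y = 5.0 * groups - 1.0 * e + rng.normal(scale=0.5, size=groups.size)


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    return np.polyfit(x, y, 1)[0]


pooled_slope = ols_slope(x, y)
group_slopes = [ols_slope(x[groups == g], y[groups == g]) for g in range(3)]

print(f"pooled slope: {pooled_slope:+.2f}")
for g, s in enumerate(group_slopes):
    print(f"group {g} slope: {s:+.2f}")
```

Both the pooled slope and the per-group slopes are correct calculations. They answer different questions, and which one is relevant depends on whether the grouping variable belongs in the analysis.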

---

## Methodological Principles

- No external datasets are required in early phases.
- All datasets are generated via explicitly defined probabilistic models.
- Each illusion is studied through:
  - Data-generating mechanism
  - Visual inspection
  - Statistical summary
  - Structural explanation
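
The four steps above might look as follows for the skewness illusion. This is a sketch under assumed parameters (a log-normal distribution with `mu = 0` and `sigma = 1`), and a text histogram stands in for a real plot to keep the example dependency-free.

```python
import numpy as np

# 1. Data-generating mechanism (assumed): log-normal outcomes.
mu, sigma = 0.0, 1.0
rng = np.random.default_rng(7)
sample = rng.lognormal(mean=mu, sigma=sigma, size=100_000)

# 2. Visual inspection: coarse text histogram of the range [0, 10).
counts, edges = np.histogram(sample, bins=10, range=(0.0, 10.0))
for count, left in zip(counts, edges[:-1]):
    bar = "#" * int(50 * count / counts.max())
    print(f"{left:4.1f}-{left + 1.0:4.1f} | {bar}")

# 3. Statistical summary.
mean = sample.mean()
median = np.median(sample)
share_below_mean = np.mean(sample < mean)
print(f"mean={mean:.3f}  median={median:.3f}  "
      f"share below mean={share_below_mean:.3f}")

# 4. Structural explanation: both values follow from the DGP itself.
true_mean = np.exp(mu + sigma**2 / 2)   # pulled upward by the long right tail
true_median = np.exp(mu)                # the outcome half the draws fall below
print(f"true mean={true_mean:.3f}  true median={true_median:.3f}")
```

Because the distribution is right-skewed, well over half of the draws fall below the mean, so the mean is accurate as an average but a poor description of a typical draw.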

This project is not about predictive modeling, but about understanding the **epistemic limits of statistical inference**.

The central thesis is:

> A statistical calculation can be internally correct,
> yet its interpretation depends critically on structural assumptions.
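
A survivorship example illustrates the thesis. In the sketch below, the population of returns has mean zero by construction, and only units above a hypothetical `survival_threshold` remain observable. The distribution, threshold, and variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(123)

# Assumed DGP: every unit draws a return with true mean 0.
n_funds = 50_000
returns = rng.normal(loc=0.0, scale=1.0, size=n_funds)

# Hypothetical selection rule: units below the threshold drop out of the data.
survival_threshold = -0.5
survived = returns > survival_threshold

mean_all = returns.mean()                   # what the DGP actually produces
mean_survivors = returns[survived].mean()   # what an analyst would observe

print(f"share surviving:      {survived.mean():.3f}")
print(f"mean, all units:      {mean_all:+.3f}")
print(f"mean, survivors only: {mean_survivors:+.3f}")
```

The survivor mean is computed correctly, but it estimates the mean of the selected sample, not of the population. Reading it as the population mean relies on the unstated assumption that selection is unrelated to the outcome.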

By the end of this project, we aim to build a structural intuition for when data analysis clarifies reality and when it creates an illusion.