# FIFA World Cup 2014 – Exploratory Analysis

This notebook documents the exploratory steps and basic analysis
performed before and after the Monte Carlo simulation.

The goal is to understand the input data (FIFA rankings),
the derived team strengths, and the resulting title probabilities.

The final simulation itself is implemented in Python scripts (`src/`).


## Data Overview

The simulation is based on the FIFA World Ranking.
The last ranking published before the start of the 2014 World Cup
(2014-06-05) is used as input.

From this ranking, normalized team strength values are derived.
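
The actual derivation lives in the scripts under `src/` and is not reproduced in this notebook. As a minimal sketch, assuming min-max scaling of the ranking points, it could look like the following. The column names `team` and `points` and all sample values are hypothetical.

```python
import pandas as pd

# Hypothetical ranking excerpt; team names and point values are made up.
ranking = pd.DataFrame(
    {"team": ["A", "B", "C"], "points": [1400.0, 1100.0, 800.0]}
)

# Min-max scaling (an assumption, not confirmed by the project scripts):
# the weakest team maps to 0 and the strongest to 1.
p_min = ranking["points"].min()
p_max = ranking["points"].max()
ranking["strength"] = (ranking["points"] - p_min) / (p_max - p_min)

print(ranking)
```

With this kind of scaling the lowest-ranked team in the input gets a strength of exactly 0, which a downstream match model has to handle.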


In [None]:
import pandas as pd

# Load the pre-tournament team strengths and preview the first rows.
teams = pd.read_csv("../data/processed/teams_strengths_pre_wc_2014.csv")
teams.head()


## Team Strengths

The team strength values are normalized to the range from 0 to 1.
Higher values indicate stronger teams according to FIFA ranking points.

These values are later used as parameters in the match simulation.
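
How the strengths enter the match simulation is defined in `src/`. One simple possibility, shown here only as an illustration, is a ratio model in which a team's win probability is its share of the combined strength. The functions `win_probability` and `simulate_match` are hypothetical names, not taken from the project.

```python
import random


def win_probability(strength_a: float, strength_b: float) -> float:
    """Probability that team A beats team B under an assumed ratio model."""
    total = strength_a + strength_b
    if total == 0:
        return 0.5  # two zero-strength teams: treat the match as a coin flip
    return strength_a / total


def simulate_match(strength_a: float, strength_b: float, rng: random.Random) -> str:
    """Return "A" or "B" as the winner of a single match (no draws)."""
    return "A" if rng.random() < win_probability(strength_a, strength_b) else "B"


rng = random.Random(42)  # fixed seed so the sketch is reproducible
print(win_probability(0.9, 0.6))
print(simulate_match(0.9, 0.6, rng))
```

The ratio model ignores draws and goal counts. It is meant to show how a pair of strength values can be turned into one random match outcome.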


In [None]:
# Show the ten strongest teams.
teams.sort_values("strength", ascending=False).head(10)


## Simulation Results

The Monte Carlo simulation yields a title probability
for each team.

Only the 32 teams participating in the 2014 World Cup
are included in the final results.
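
To illustrate how title probabilities come out of repeated simulation, the sketch below runs a hypothetical four-team knockout bracket many times and reports each team's share of titles. The strengths, the bracket, and the ratio-based win probability are all assumptions for illustration. The real tournament logic is in `src/`.

```python
import random
from collections import Counter

# Hypothetical strengths for a four-team knockout bracket.
strengths = {"A": 0.9, "B": 0.7, "C": 0.5, "D": 0.3}


def play(team_a, team_b, rng):
    """Pick a winner; the ratio-based win probability is an assumption."""
    p_a = strengths[team_a] / (strengths[team_a] + strengths[team_b])
    return team_a if rng.random() < p_a else team_b


def simulate_tournament(rng):
    """Two semi-finals followed by a final; returns the champion."""
    finalist_1 = play("A", "D", rng)
    finalist_2 = play("B", "C", rng)
    return play(finalist_1, finalist_2, rng)


def title_probabilities(n_runs, seed=0):
    """Estimate each team's title probability as its share of simulated titles."""
    rng = random.Random(seed)
    wins = Counter(simulate_tournament(rng) for _ in range(n_runs))
    return {team: wins[team] / n_runs for team in strengths}


probs = title_probabilities(10_000)
print(probs)
```

The estimates sum to 1 by construction, and their precision improves as `n_runs` grows.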


In [None]:
# Load the simulated title probabilities and preview the first ten rows.
results = pd.read_csv("../reports/wm2014_title_probabilities.csv")
results.head(10)


## Interpretation

Even strong teams have relatively low title probabilities.
The knockout structure requires a champion to win several
matches in a row, and every single football match carries
inherent randomness, so the chances of failure compound.

The Monte Carlo simulation captures this uncertainty explicitly.
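
A short back-of-the-envelope calculation shows why the probabilities shrink so quickly. The 2014 knockout stage had four rounds (round of 16, quarter-final, semi-final, final). The sketch assumes a constant per-match win probability and ignores the group stage. The probabilities used are hypothetical, not simulation output.

```python
KNOCKOUT_ROUNDS = 4  # round of 16, quarter-final, semi-final, final


def title_chance(p_match: float, rounds: int = KNOCKOUT_ROUNDS) -> float:
    """Chance of winning every knockout round at a constant per-match probability."""
    return p_match ** rounds


# Hypothetical per-match win probabilities for a strong team.
for p in (0.6, 0.7, 0.8):
    print(f"per-match {p:.1f} -> title chance {title_chance(p):.3f}")
```

Even a team that wins 70% of its knockout matches lifts the trophy in only about 24% of cases under this simplification.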
