# Olympic Games Strategy Report: The Inequality of Sport
**Project:** Olympic Games History Analysis (1896-2014)  
**Author:** Eng. Hassan Jameel  

## Executive Summary
This report analyzes nearly 120 years of Olympic medal data (1896-2014) to identify the core drivers of sporting success. Beyond simple medal counts, it investigates "Global Sporting Inequality", the pattern in which a handful of nations dominate the podium, and proposes data-driven strategies that developing nations could use to break through.

<div align="center">
<img src="Olympic Games Strategy.png" width="400" height="400">
</div>

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Set Style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)
pd.set_option('display.max_columns', None)

## 1. Data Loading & Preprocessing

In [2]:
try:
    summer = pd.read_csv('SummerSD.csv')
    winter = pd.read_csv('WinterSD.csv')
    countries = pd.read_csv('CountriesSD.csv')
except FileNotFoundError as err:
    # Stop here: every later cell depends on these three DataFrames.
    raise FileNotFoundError(
        "Datasets not found. Please ensure the CSV files are in the working directory."
    ) from err

# Schema Alignment
winter.rename(columns={'Country': 'Code'}, inplace=True)
summer['Season'] = 'Summer'
winter['Season'] = 'Winter'

# Concatenate
olympics = pd.concat([summer, winter], ignore_index=True)

# Merge with Countries data
olympics_merged = olympics.merge(countries, on='Code', how='left')

# Fill missing Country names
olympics_merged['Country'] = olympics_merged['Country_y'].fillna(olympics_merged['Country_x'])
olympics_merged.drop(columns=['Country_x', 'Country_y'], inplace=True)

print(f'Combined Data Shape: {olympics_merged.shape}')

Combined Data Shape: (36935, 15)
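
A left merge keeps every medal row but silently leaves the country fields empty for any code that has no match in the countries table. The sketch below shows one way this could be audited with the `indicator` option of `DataFrame.merge`. The two small frames are made-up stand-ins for the real CSVs, and the audit step itself is an illustrative assumption, not part of the original pipeline.

```python
import pandas as pd

# Hypothetical stand-ins for the medal table and the countries table.
medals = pd.DataFrame({
    'Year': [1988, 1996, 2000, 2004],
    'Code': ['URS', 'USA', 'JAM', 'USA'],
    'Medal': ['Silver', 'Gold', 'Gold', 'Bronze'],
})
countries_demo = pd.DataFrame({
    'Code': ['USA', 'JAM'],
    'Country': ['United States', 'Jamaica'],
})

# indicator=True adds a `_merge` column that marks rows without a match.
audit = medals.merge(countries_demo, on='Code', how='left', indicator=True)
unmatched_codes = sorted(audit.loc[audit['_merge'] == 'left_only', 'Code'].unique())

print(f'Rows before merge: {len(medals)}, after merge: {len(audit)}')
print(f'Codes without a country record: {unmatched_codes}')
```

Codes that show up as unmatched are the ones that will later drop out of the GDP analysis, so it may be worth listing them before interpreting the results.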


## 2. Comprehensive EDA (Standard & Animated)

### 2.1 Season & Gender Distribution
A high-level view of how medal records are split between the Summer and Winter Games, and how the gender split among medal winners has changed over time.

In [3]:
# Donut chart: share of medal records by season
fig_pie = px.pie(olympics_merged, names='Season', title='Distribution of Medals by Season', hole=0.4)
fig_pie.show()

# Animated bar chart: medal records per gender, one frame per Olympic year
gender_time = olympics_merged.groupby(['Year', 'Gender']).size().reset_index(name='Count')
fig_gender = px.bar(gender_time, x='Gender', y='Count', color='Gender', animation_frame='Year',
             range_y=[0, 1500], title='The Path to Parity: Gender Participation Over Time',
             template='ggplot2')
fig_gender.show()
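
Raw counts grow with the size of the Games, so a share can be easier to compare across years than the bar heights in the animation. The sketch below computes the fraction of medal rows won by women per year. It uses made-up rows and assumes the `Gender` column holds the labels `Men` and `Women`, which should be checked against the real data.

```python
import pandas as pd

# Hypothetical medal rows; the notebook would use olympics_merged instead.
demo = pd.DataFrame({
    'Year':   [1900, 1900, 1900, 1900, 2012, 2012, 2012, 2012],
    'Gender': ['Men', 'Men', 'Men', 'Women', 'Men', 'Men', 'Women', 'Women'],
})

# Count rows per (Year, Gender), then pivot so each gender becomes a column.
counts = demo.groupby(['Year', 'Gender']).size().unstack(fill_value=0)

# Share of medal rows won by women in each edition of the Games.
women_share = counts['Women'] / counts.sum(axis=1)
print(women_share)
```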

### 2.2 The Rise of the Games (Timeline)
Visualizing how the number of medals awarded at each Games, a proxy for the scale of participation, has grown since 1896.

In [4]:
# Aggregating medals by Year and Season
timeline = olympics_merged.groupby(['Year', 'Season'])['Medal'].count().reset_index()

fig = px.bar(timeline, x='Season', y='Medal', color='Season', animation_frame='Year',
             range_y=[0, 2500], title='Animated Evolution of Medal Counts (1896-2014)',
             template='plotly_dark')
fig.show()

### 2.3 Hall of Fame: Top Countries & Athletes
Identifying the historical giants of the Olympic Games.

In [5]:
# Top 10 countries by number of medal records
top_countries = olympics_merged['Country'].value_counts().head(10).reset_index()
top_countries.columns = ['Country', 'Medal_Count']

fig_c = px.bar(top_countries, x='Medal_Count', y='Country', orientation='h',
             title='Top 10 Countries by Total Medals (All Time)',
             color='Medal_Count', color_continuous_scale='Viridis')
fig_c.update_layout(yaxis={'categoryorder':'total ascending'})
fig_c.show()

# Top 10 athletes by medal count (the "super athletes")
top_athletes = olympics_merged['Athlete'].value_counts().head(10).reset_index()
top_athletes.columns = ['Athlete', 'Medals']

fig_a = px.bar(top_athletes, x='Athlete', y='Medals', title='Top 10 Athletes by Medal Count',
             text='Medals', color='Medals')
fig_a.show()
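
Counting rows with `value_counts()` treats every member of a medal-winning team as a separate medal, which can inflate totals for countries that are strong in team events. If the data has one row per medal-winning athlete (an assumption about this dataset), one possible hedge is to count each event medal once per country. The rows and the choice of key columns below are illustrative only.

```python
import pandas as pd

# Hypothetical rows: a four-person relay gold plus one individual gold.
rows = pd.DataFrame({
    'Year':    [2012] * 5,
    'Sport':   ['Athletics'] * 5,
    'Event':   ['4x100m Relay'] * 4 + ['100m'],
    'Gender':  ['Men'] * 5,
    'Country': ['Jamaica'] * 5,
    'Athlete': ['A', 'B', 'C', 'D', 'A'],
    'Medal':   ['Gold'] * 5,
})

# Row-based count: every relay runner adds one to the country total.
per_athlete = rows['Country'].value_counts()

# Event-based count: keep one row per (year, event, gender, medal, country).
event_keys = ['Year', 'Sport', 'Event', 'Gender', 'Medal', 'Country']
per_event = rows.drop_duplicates(subset=event_keys)['Country'].value_counts()

print(per_athlete['Jamaica'], per_event['Jamaica'])
```

This deduplication also collapses the rare case where one country wins the same medal twice in a single event, so the two counts are best read side by side rather than one replacing the other.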

## 3. Statistical Analysis: The Wealth Gap

### 3.1 GDP vs Medals Correlation
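A log-log scatter plot of GDP per capita against total medals shows how closely economic power tracks sporting success; the correlation reported in the chart title is computed on the log-transformed values. This relationship forms the basis of the "Root Cause" analysis in Section 5.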

In [6]:

country_stats = olympics_merged.groupby('Country').agg(
    Total_Medals=('Medal', 'count'),
    GDP_Per_Capita=('GDP per Capita', 'mean'),
    Population=('Population', 'mean')
).reset_index()

# Drop countries without GDP or population; px.scatter cannot size markers from NaN
country_stats = country_stats.dropna(subset=['GDP_Per_Capita', 'Population'])
country_stats['GDP_Log'] = np.log1p(country_stats['GDP_Per_Capita'])
country_stats['Medals_Log'] = np.log1p(country_stats['Total_Medals'])

# Pearson correlation between log GDP per capita and log medal count
correlation = country_stats[['GDP_Log', 'Medals_Log']].corr().iloc[0, 1]

fig = px.scatter(country_stats, x='GDP_Per_Capita', y='Total_Medals', size='Population',
                 hover_name='Country', log_x=True, log_y=True,
                 title=f'The Wealth Effect: GDP vs Medals (Correlation: {correlation:.2f})',
                 template='plotly_white')
fig.show()
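
Because the correlation in the chart title is computed on log-transformed values, it can differ noticeably from a correlation on the raw figures. The sketch below compares three variants on made-up numbers: Pearson on raw values, Pearson on `log1p` values, and a rank (Spearman) correlation computed as Pearson on ranks. The data is hypothetical and only illustrates the mechanics.

```python
import numpy as np
import pandas as pd

# Hypothetical country-level figures, not taken from the dataset.
demo = pd.DataFrame({
    'GDP_Per_Capita': [500, 1200, 4000, 9000, 30000, 55000],
    'Total_Medals':   [1, 3, 40, 12, 900, 300],
})

# Pearson on raw values is dominated by the largest observations.
raw_r = demo['GDP_Per_Capita'].corr(demo['Total_Medals'])

# Pearson on log1p-transformed values, mirroring the cell above.
logged = np.log1p(demo)
log_r = logged['GDP_Per_Capita'].corr(logged['Total_Medals'])

# Spearman rank correlation, computed as Pearson on ranks (no SciPy needed).
ranked = demo.rank()
rank_r = ranked['GDP_Per_Capita'].corr(ranked['Total_Medals'])

# r squared: share of variance in log medals linearly associated with log GDP.
print(f'raw r={raw_r:.2f}, log r={log_r:.2f}, rank r={rank_r:.2f}, log r^2={log_r ** 2:.2f}')
```

Squaring the coefficient is a useful sanity check: a correlation of 0.6 corresponds to roughly 36% of the variance, which leaves considerable room for non-economic factors.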

## 4. Professional Dashboard: Country Profiler
This section serves as an interactive dashboard. In a live environment, clicking a segment of the sunburst chart drills down from season to sport to country for the 20 countries with the most medals.

In [7]:
# Interactive sunburst chart: drill down from Season -> Sport -> Country (top 20 countries)
top_20 = olympics_merged['Country'].value_counts().head(20).index
df_dash = olympics_merged[olympics_merged['Country'].isin(top_20)]

fig = px.sunburst(df_dash, path=['Season', 'Sport', 'Country'], 
                  title='Global Dominance Hierarchy: Season > Sport > Country',
                  height=700)
fig.show()

# 5. Strategic Analysis Report

## 5.1 The Core Issue: Global Variation in Sporting Capacity

The data show that Olympic success is not evenly distributed across nations. A small group of countries consistently secures a large share of medals, a pattern shaped by historical, economic, and structural factors.

### Historical Development

**Early Era (1896–1936):**  
Sport was largely an amateur, recreational pursuit, so participation and investment varied widely between nations.

**Post-War Era (1948–1988):**  
Global investment in sport increased significantly, widening the performance gap between nations with different resource levels.

**Modern Era (1992–2014):**  
The gap persisted because medal-intensive sports require expensive facilities.

---

## 5.2 Problem Map (Cause → Challenge → Outcome)

**Root Cause:**  
Economic differences that influence access to facilities, nutrition, and talent development programs.

**Functional Challenge:**  
Some nations distribute limited resources across many highly competitive sports, reducing their chances of achieving measurable success.

**Outcome:**  
A pattern this report calls the **“Participation Trap”**: long-term participation without medal results because resources are spread too thin.

---

## 5.3 Solution Mapping (Before vs After)

| Feature               | Current Strategy                              | Proposed Strategy                                  |
|----------------------|----------------------------------------------|--------------------------------------------------|
| Resource Allocation  | Broad distribution across many sports       | Focused investment in 2–3 selected sports        |
| Talent Identification| Traditional school-based methods            | Advanced assessment tools for early detection   |
| Sport Selection      | High-popularity, high-competition sports    | Low-barrier sports (Archery, Weightlifting, Judo)|
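
One way to put a number on the "focused investment" row is the share of a country's medals that come from its top few sports. The sketch below uses made-up rows and a hypothetical helper, `top_n_share`; neither comes from the original notebook, and medal concentration is only a rough proxy for how investment is actually allocated.

```python
import pandas as pd

# Hypothetical medal rows: country A is specialized, country B is spread thin.
demo = pd.DataFrame({
    'Country': ['A'] * 6 + ['B'] * 6,
    'Sport': ['Athletics'] * 5 + ['Boxing']
             + ['Athletics', 'Boxing', 'Judo', 'Rowing', 'Sailing', 'Swimming'],
})

def top_n_share(sports: pd.Series, n: int = 2) -> float:
    """Fraction of medal rows that come from the n most frequent sports."""
    counts = sports.value_counts()
    return counts.head(n).sum() / counts.sum()

# One focus score per country; values near 1.0 indicate strong specialization.
focus = demo.groupby('Country')['Sport'].apply(top_n_share)
print(focus)
```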

---

## 5.4 Measurable Value & Real Impact

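**Measurable Value:**  
Potential for up to **300% improvement** in medal efficiency (read here as medals won relative to resources invested) when investment is strategically focused. The analysis above does not compute this figure, so it should be treated as an upper-bound estimate.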

**Real Impact:**  
Enhanced national identity, increased tourism, and stronger international presence through sports.

---

## 5.5 Actionable Use Cases

### Small-Nation Model
- Prioritize sports that require minimal infrastructure.  
- Emphasize specialized coaching.

### Strategic Reallocation Model
- Reassess underperforming sports.  
- Redirect resources toward disciplines with higher medal potential.

### Data-Driven “Moneyball” Approach
- Use analytics to identify sports with fewer competitors per medal.  
- Enable quicker progress through targeted investment.
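
The "Moneyball" idea above can be sketched with medal data alone. Since a medal table does not record how many athletes competed, the example below uses the number of distinct medal-winning countries per medal as a stand-in for how crowded a sport is. The rows are made up, and the `Countries_Per_Medal` metric is an illustrative assumption, not a measure defined in the original analysis.

```python
import pandas as pd

# Hypothetical medal rows; in the notebook these would come from olympics_merged.
demo = pd.DataFrame({
    'Sport':   ['Athletics'] * 6 + ['Archery'] * 6,
    'Event':   ['100m'] * 3 + ['200m'] * 3
               + ['Men Individual'] * 3 + ['Women Individual'] * 3,
    'Medal':   ['Gold', 'Silver', 'Bronze'] * 4,
    'Country': ['JAM', 'USA', 'TTO', 'JAM', 'USA', 'FRA',
                'KOR', 'ITA', 'MEX', 'KOR', 'USA', 'CHN'],
})

by_sport = demo.groupby('Sport').agg(
    Medals=('Medal', 'count'),                 # medal rows awarded in the sport
    Winning_Countries=('Country', 'nunique'),  # distinct countries on the podium
)

# Lower values mean the medals are shared among fewer countries.
by_sport['Countries_Per_Medal'] = by_sport['Winning_Countries'] / by_sport['Medals']
print(by_sport.sort_values('Countries_Per_Medal'))
```

A low ratio can mean a thin field, but it can also mean one or two countries are entrenched at the top, so this works better as a screening signal than as a final ranking.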


## 6. Conclusion
This analysis suggests that Olympic success is strongly associated with economic factors: the correlation between log GDP per capita and log medal count is above 0.6 (`Corr = 0.6+`). However, outliers such as Jamaica (sprints) and Cuba (boxing) indicate that **Strategic Focus** can offset **Economic Deficits**. For developing nations, the most promising path lies in data-driven specialization.

#### Eng. Hassan Jameel
**Date:** Feb-2026  
**Dataset:** Olympic Games History Analysis (1896-2014)  
**Email:** hassan.j.a@hotmail.com  
**LinkedIn:** [LinkedIn](https://www.linkedin.com/in/hassanjameel/)  
**GitHub:** [GitHub](https://github.com/HassanJamel/)  
**Kaggle:** [Kaggle](https://www.kaggle.com/hassanjameelahmed)  
**Portfolio:** [Portfolio](https://hassanjamel.github.io/my_profile/)  
**Mobile no:** 0509684720
