# Space Race Missions Analysis
This notebook explores space mission launch data from 1957 onward. It covers data exploration and cleaning, followed by visualizations that include a choropleth map and a sunburst chart.

## Install Packages
Ensure required packages are installed.

In [None]:
!pip install iso3166 plotly --quiet


## Imports and Settings

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px
from iso3166 import countries

%matplotlib inline


## Load Dataset

In [None]:
df = pd.read_csv('mission_launches.csv')
df['launch_date'] = pd.to_datetime(df['launch_date'], errors='coerce')
df.head()


## Preliminary Data Exploration

In [None]:
# Basic info
df.info()


In [None]:
# Summary statistics
df.describe(include='all')


In [None]:
# Missing values per column
df.isnull().sum()


## Data Cleaning

In [None]:
# Drop duplicates
df = df.drop_duplicates()

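# Clean cost_usd: strip dollar signs and thousands separators before converting to numeric
# (raw string avoids the invalid '\$' escape sequence warning)
df['cost_usd'] = df['cost_usd'].replace({r'\$': '', ',': ''}, regex=True)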
df['cost_usd'] = pd.to_numeric(df['cost_usd'], errors='coerce')
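
To make the effect of the cost cleaning step concrete, here is a minimal sketch on a made-up Series. The sample values are hypothetical and not taken from `mission_launches.csv`; the sketch assumes costs arrive as strings with an optional `$` and comma separators.

```python
import pandas as pd

# Hypothetical raw values, for illustration only
raw = pd.Series(['$1,200.50', '450', 'n/a', None])

# Strip dollar signs and commas, then convert; unparseable entries become NaN
cleaned = pd.to_numeric(raw.str.replace(r'[$,]', '', regex=True), errors='coerce')
print(cleaned.tolist())
```

Entries that cannot be parsed end up as `NaN` rather than raising, so it is worth re-checking the missing-value count for `cost_usd` after this step.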


In [None]:
# Extract year and month
df['year'] = df['launch_date'].dt.year
df['month'] = df['launch_date'].dt.month
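
Because `launch_date` was parsed with `errors='coerce'`, any unparseable date becomes `NaT`, and the derived `year` and `month` columns then hold `NaN` for those rows. The sketch below uses made-up dates to show this, along with one possible way (an assumption, not something the notebook requires) to keep whole-number years by casting to pandas' nullable `Int64` dtype.

```python
import pandas as pd

# Hypothetical dates; the second entry is unparseable on purpose
dates = pd.to_datetime(pd.Series(['1957-10-04', 'unknown']), errors='coerce')

year = dates.dt.year              # missing date -> missing year
year_int = year.astype('Int64')   # nullable integer keeps the gap as <NA>
print(year.tolist())
print(year_int.tolist())
```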


## Descriptive Statistics

In [None]:
# Missions per country
country_counts = df['country'].value_counts().reset_index()
country_counts.columns = ['country', 'missions']
country_counts.head()


## Choropleth Map of Launches by Country

In [None]:
# Map each country to its ISO 3166-1 alpha-3 code; values iso3166 does not recognise become None
country_counts['iso_alpha'] = country_counts['country'].apply(lambda x: countries.get(x).alpha3 if x in countries else None)
fig = px.choropleth(country_counts, locations='iso_alpha', color='missions',
                    hover_name='country', title='Launches by Country')
fig.show()


## Missions per Year

In [None]:
missions_year = df.groupby('year').size().reset_index(name='count')
plt.figure(figsize=(10,5))
plt.plot(missions_year['year'], missions_year['count'])
plt.title('Number of Missions per Year')
plt.xlabel('Year')
plt.ylabel('Count')
plt.grid(True)
plt.show()


## Average Mission Cost Over Time

In [None]:
avg_cost = df.groupby('year')['cost_usd'].mean().reset_index()
plt.figure(figsize=(10,5))
plt.plot(avg_cost['year'], avg_cost['cost_usd'])
plt.title('Average Mission Cost Over Time')
plt.xlabel('Year')
plt.ylabel('Avg Cost (USD)')
plt.grid(True)
plt.show()
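
A yearly mean skips missing costs, so a year with a single priced mission looks as well supported as a year with many. As a hedged sketch on made-up numbers (not values from the dataset), reporting the count of non-missing costs next to the mean makes that visible:

```python
import pandas as pd

# Hypothetical records, for illustration only
toy = pd.DataFrame({
    'year': [1965, 1965, 1966],
    'cost_usd': [10.0, None, None],
})

# 'count' reports how many non-missing costs each yearly mean is based on
summary = toy.groupby('year')['cost_usd'].agg(['mean', 'count'])
print(summary)
```

Years whose count is zero have no cost data at all, and their mean is `NaN`, which shows up as a gap in the line plot.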


## Monthly Launch Distribution

In [None]:
monthly = df['month'].value_counts().sort_index()
plt.figure(figsize=(10,5))
plt.bar(monthly.index, monthly.values)
plt.title('Launches by Month')
plt.xlabel('Month')
plt.ylabel('Count')
plt.show()


## Mission Success Rate Over Time

In [None]:
# Flag successful missions; any other status, including a missing one, counts as not successful
df['success_flag'] = df['status'] == 'Success'
safety = df.groupby('year')['success_flag'].mean().reset_index()
plt.figure(figsize=(10,5))
plt.plot(safety['year'], safety['success_flag'])
plt.title('Mission Success Rate Over Time')
plt.xlabel('Year')
plt.ylabel('Success Rate')
plt.grid(True)
plt.show()
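
The success rate relies on the fact that the mean of a boolean column is the fraction of `True` values. A minimal sketch on made-up records (the years and statuses are hypothetical) shows the calculation in isolation:

```python
import pandas as pd

# Hypothetical records, for illustration only
toy = pd.DataFrame({
    'year': [1960, 1960, 1961, 1961],
    'status': ['Success', 'Failure', 'Success', 'Success'],
})

# True counts as 1 and False as 0, so the mean is the share of successes
toy['success_flag'] = toy['status'] == 'Success'
rate = toy.groupby('year')['success_flag'].mean()
print(rate)
```

Note that the comparison is an exact string match, so a status with different casing or stray whitespace would be counted as a non-success.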


## Sunburst Chart of Launching Agencies Over Years

In [None]:
sun = df.groupby(['year','agency']).size().reset_index(name='count')
fig = px.sunburst(sun, path=['year','agency'], values='count', title='Agencies Launch Distribution')
fig.show()


## Conclusion
This analysis covered launches by country and year, average mission cost, monthly seasonality, and mission success rates since the start of the Space Race. Further exploration could include rocket types, payloads, and budget comparisons between countries.