🚀 International Cooperation and Space Investment: Beyond Citizens’ Daily Lives\
Team: Crazy Thursday\
Team Members: Yanqiu Hu, Yuan Yao, Peiye Li, Qihao Bu


**1. Motivation**

Human space exploration has always carried dual meanings: the dream of science and discovery, but also the reality of national prestige and power politics. As resources are limited, a natural question arises: should governments prioritize people’s living standards, or prestige-driven space competition?

Our analysis began with two perspectives:

1. International cooperation – by constructing astronaut and mission networks, we found that the United States, Russia, and Japan appear most frequently as central hubs of cooperation. These countries not only dominate in launch frequency, but also rely heavily on partnerships with others, especially in the International Space Station era. This highlights the strategic role of collaboration in sustaining modern spaceflight.

2. Citizens’ living quality (CPI) – we then asked whether governments adjust space activity based on economic well-being at home. If inflation soars, do launches decline? If living costs rise, does investment shift away from rockets?

**2. Research Questions**



*   RQ1: Is there a relationship between citizens’ financial well-being (CPI, cost of living) and annual space launches?
*   RQ2: Does international cooperation accelerate space activity, and can cooperation rather than competition drive the future of space exploration?



**3. Data & Methods**

**Astronaut and Mission Data**

* Sources: Wikipedia human spaceflight lists, CSIS Astronaut Database, Social Science Dataset

* Constructed a cooperation network: if astronauts from multiple countries joined one mission, their countries were linked.

* Aggregated by decade to observe structural changes.
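
The linking rule and the decade aggregation described above can be sketched on a toy table. This is a minimal illustration, not the notebook's actual pipeline: the mission IDs, years, and the helper name `decade_pair_counts` are hypothetical, and the column names are simplified.

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Hypothetical toy records: one row per astronaut per mission (not real data).
toy = pd.DataFrame({
    "mission": ["M1", "M1", "M2", "M2", "M2", "M3", "M4", "M4"],
    "year": [1975, 1975, 1995, 1995, 1995, 1996, 1998, 1998],
    "nationality": ["U.S.", "U.S.S.R/Russia", "U.S.", "U.S.S.R/Russia",
                    "U.S.", "U.S.", "U.S.", "Japan"],
})


def decade_pair_counts(records):
    """Count country pairs that shared a mission, keyed by (decade, pair)."""
    counts = Counter()
    for (_, year), group in records.groupby(["mission", "year"]):
        countries = sorted(group["nationality"].unique())
        decade = (int(year) // 10) * 10
        # a single-country mission yields no pairs, so it adds no link
        for pair in combinations(countries, 2):
            counts[(decade, pair)] += 1
    return counts


pair_counts = decade_pair_counts(toy)
for (decade, pair), n in sorted(pair_counts.items()):
    print(decade, pair, n)
```

Sorting the countries before pairing keeps each pair in one canonical order, so `("Japan", "U.S.")` and `("U.S.", "Japan")` are never counted separately.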

**CPI (2010=100)**

* Source: https://archive.ourworldindata.org/20250903-083611/grapher/consumer-price-index.html?tab=table#explore-the-data (processed for United States, Russia, Japan).
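
An index with base 2010 = 100 reports the price level relative to 2010, so a value of 103 means prices are 3% above the 2010 level. Inflation is the year-over-year change of that index. The sketch below shows how an inflation series could be derived from the index; the values are hypothetical, and this derivation is an assumption for illustration rather than a step the notebook performs (the analysis correlates the CPI level itself).

```python
import pandas as pd

# Hypothetical index values for three consecutive years (not from the dataset).
cpi_toy = pd.Series([98.0, 100.0, 103.0], index=[2009, 2010, 2011], name="CPI")

# Year-over-year inflation in percent; the first year has no predecessor, so it is NaN.
inflation = cpi_toy.pct_change() * 100

print(inflation.round(2))
```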


**Analysis**
* Visualized country cooperation network → highlighted U.S., Russia, Japan as central hubs.

* Merged CPI and launch counts by country–year.

* Calculated Pearson & Spearman correlations to test RQ1.
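
The merge and the two correlation measures can be illustrated on a small made-up table. The entity name `"A"` and all numbers are hypothetical. Spearman's ρ is computed here as Pearson's r on ranks, which is its definition; the notebook itself uses `scipy.stats`.

```python
import pandas as pd

# Hypothetical CPI and launch tables for one made-up entity "A".
cpi_toy = pd.DataFrame({
    "Entity": ["A"] * 5,
    "Year": [2001, 2002, 2003, 2004, 2005],
    "CPI": [80.0, 82.0, 85.0, 90.0, 100.0],
})
launch_toy = pd.DataFrame({
    "Entity": ["A"] * 4,
    "Year": [2002, 2003, 2004, 2005],
    "Launches": [1, 2, 3, 10],
})

# An inner merge keeps only the country-years present in both tables.
merged = pd.merge(cpi_toy, launch_toy, on=["Entity", "Year"], how="inner")

# Pearson measures linear association between the raw values.
pearson_r = merged["CPI"].corr(merged["Launches"])

# Spearman measures monotonic association: Pearson's r applied to ranks.
spearman_rho = merged["CPI"].rank().corr(merged["Launches"].rank())

print(f"rows={len(merged)}, Pearson r={pearson_r:.2f}, Spearman rho={spearman_rho:.2f}")
```

In this toy case launches always rise with CPI, so Spearman's ρ is 1, while Pearson's r is lower because the rise is not linear. Reporting both helps when a few extreme years dominate the raw values.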

In [None]:
import pandas as pd
import re
import seaborn as sns

# read csv
df = pd.read_csv('Social_Science.csv')

print(df)



In [None]:
df['Mission.Year_Name'] = df['Mission.Year'].astype(str) + '_' + df['Mission.Name']
df['Mission.Name'] = df['Mission.Name'].str.title()
df['Profile.Gender'] = df['Profile.Gender'].str.lower()
df['Profile.Nationality'] = df['Profile.Nationality'].str.title()


df_sorted = df.sort_values(by='Mission.Year_Name')


df_sorted.to_csv("Social_Science1.csv", index=False)

In [None]:
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
from collections import defaultdict, Counter
import numpy as np

# read clean data
df = pd.read_csv("Social_Science1.csv")

# 1. identify international cooperations
# find every country involved in each mission
mission_countries = df.groupby('Mission.Year_Name')['Profile.Nationality'].unique()

# keep only missions that involve more than one country
international_missions = mission_countries[mission_countries.apply(len) > 1]

print(f"There are {len(international_missions)} international cooperations")

# 2. calculate frequency of cooperations among countries
cooperation_count = Counter()

for mission, countries in international_missions.items():
    # pairing countries in each mission
    countries = sorted(countries)
    for i in range(len(countries)):
        for j in range(i+1, len(countries)):
            pair = (countries[i], countries[j])
            cooperation_count[pair] += 1

# 3. build cooperation network
G = nx.Graph()


for (country1, country2), count in cooperation_count.items():
    G.add_node(country1)
    G.add_node(country2)
    G.add_edge(country1, country2, weight=count)

# 4. visualize network
plt.figure(figsize=(14, 10))


node_sizes = [G.degree(node) * 300 for node in G.nodes()]


edge_widths = [G[u][v]['weight'] * 0.5 for u, v in G.edges()]


pos = nx.spring_layout(G, k=1, iterations=50)
nx.draw_networkx_nodes(G, pos, node_size=node_sizes, node_color='lightblue', alpha=0.9)
nx.draw_networkx_edges(G, pos, width=edge_widths, alpha=0.5, edge_color='gray')
nx.draw_networkx_labels(G, pos, font_size=10, font_weight='bold')


edge_labels = {(u, v): d['weight'] for u, v, d in G.edges(data=True)}
nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, font_size=8)

plt.title('International Space Mission Cooperations', fontsize=16)
plt.axis('off')
plt.tight_layout()
plt.savefig('space_cooperation_network.png', dpi=300, bbox_inches='tight')
plt.show()

# 5. build frequency table
cooperation_df = pd.DataFrame([
    {"Country1": pair[0], "Country2": pair[1], "Number of Cooperations": count}
    for pair, count in cooperation_count.items()
]).sort_values("Number of Cooperations", ascending=False)

print("\nTop 20 country pairs by number of cooperations:")
print(cooperation_df.head(20).to_string(index=False))

# 6. total number of cooperations for each country
country_coop_count = defaultdict(int)
for (country1, country2), count in cooperation_count.items():
    country_coop_count[country1] += count
    country_coop_count[country2] += count

country_coop_df = pd.DataFrame([
    {"Country": country, "Number of International Cooperations": count}
    for country, count in country_coop_count.items()
]).sort_values("Number of International Cooperations", ascending=False)

print("\nCountries ranked by number of international cooperations:")
print(country_coop_df.to_string(index=False))

# 7. save to csv
cooperation_df.to_csv('country_cooperation_frequency.csv', index=False)
country_coop_df.to_csv('country_total_cooperation.csv', index=False)

# 8. heatmap

all_countries = sorted(list(set([c for pair in cooperation_count.keys() for c in pair])))


coop_matrix = pd.DataFrame(0, index=all_countries, columns=all_countries)
for (c1, c2), count in cooperation_count.items():
    coop_matrix.loc[c1, c2] = count
    coop_matrix.loc[c2, c1] = count


plt.figure(figsize=(12, 10))
plt.imshow(coop_matrix, cmap='YlOrRd', interpolation='nearest')
plt.xticks(range(len(all_countries)), all_countries, rotation=90)
plt.yticks(range(len(all_countries)), all_countries)
plt.colorbar(label='Number of Cooperations')
plt.title('Cooperation Heatmap')
plt.tight_layout()
plt.savefig('cooperation_heatmap.png', dpi=300, bbox_inches='tight')
plt.show()




In [None]:
df = pd.read_csv("Social_Science1.csv")

country_counts = df['Profile.Nationality'].value_counts()
top_3_countries = country_counts.head(3).index.tolist()

print("Top 3 countries by total launches:")
for i, country in enumerate(top_3_countries, 1):
    count = country_counts[country]
    print(f"{i}. {country}: {count} launches")


all_years = range(int(df['Mission.Year'].min()), int(df['Mission.Year'].max()) + 1)


combined_table = pd.DataFrame(index=all_years)
combined_table.index.name = 'Year'


for country in top_3_countries:
    country_data = df[df['Profile.Nationality'] == country]
    yearly_counts = country_data['Mission.Year'].value_counts().sort_index()


    combined_table[country] = 0
    for year, count in yearly_counts.items():
        combined_table.loc[year, country] = count


combined_table['Total'] = combined_table.sum(axis=1)


combined_table.to_csv('Top3_Countries_Annual_Launches.csv', encoding='utf-8')

print("\nTop 3 countries annual launches table (first 10 years):")
print(combined_table.head(10))


plt.figure(figsize=(14, 8))

colors = ['#1f77b4', '#ff7f0e', '#2ca02c']  # one distinct color per country

for i, country in enumerate(top_3_countries):
    plt.plot(combined_table.index, combined_table[country],
             label=country,
             color=colors[i],
             linewidth=2.5,
             marker='o',
             markersize=4)


# mark the 2008 financial crisis
plt.axvline(x=2008, color='red', linestyle='--', alpha=0.7, linewidth=2)
plt.text(2008, plt.ylim()[1]*0.95, '2008 Financial Crisis', rotation=90,
        verticalalignment='top', fontsize=10, color='red', fontweight='bold')
# mark the Space Shuttle Challenger disaster (January 1986)
plt.axvline(x=1986, color='red', linestyle='--', alpha=0.7, linewidth=2)
plt.text(1986, plt.ylim()[1]*0.95, 'Space Shuttle Challenger disaster', rotation=90,
        verticalalignment='top', fontsize=10, color='red', fontweight='bold')

plt.xlabel('Year', fontsize=12)
plt.ylabel('Number of Launches', fontsize=12)
plt.title('Annual Space Launches by Top 3 Countries', fontsize=14, fontweight='bold')
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.xticks(rotation=45)

plt.tight_layout()
plt.savefig('Top3_Countries_Annual_Launches.png', dpi=300, bbox_inches='tight')
plt.show()

In [None]:
cpi = pd.read_csv("consumer-price-index.csv")


target_entities = ["United States", "Russia", "USSR", "Russian Federation","Japan"]
cpi_us_russia_japan = cpi[cpi["Entity"].isin(target_entities)].copy()


cpi_us_russia_japan = cpi_us_russia_japan.sort_values(["Entity", "Year"]).reset_index(drop=True)

cpi_us_russia_japan.to_csv("cpi_us_russia_japan.csv", index=False)

In [None]:
df = pd.read_csv("cpi_us_russia_japan.csv")


df = df[(df['Year'] >= 1960) & (df['Year'] <= 2019)]

# pivot to wide format: one CPI column per country
cpi_wide = df.pivot(index='Year', columns='Entity', values='Consumer price index (2010 = 100)')
cpi_wide.reset_index(inplace=True)

# drop the leftover 'Entity' label on the column axis
cpi_wide.columns.name = None

# sort by year
cpi_wide = cpi_wide.sort_values('Year').reset_index(drop=True)

launch_df = pd.read_csv('Top3_Countries_Annual_Launches.csv')

merged_df = pd.merge(launch_df, cpi_wide, on='Year', how='inner')

# U.S. subset: launches vs. CPI
us_data = merged_df[['Year', 'U.S.', 'United States']].dropna()
us_data.columns = ['Year', 'Launches', 'CPI']

# Russia subset: launches vs. CPI
russia_data = merged_df[['Year', 'U.S.S.R/Russia', 'Russia']].dropna()
russia_data.columns = ['Year', 'Launches', 'CPI']

# Japan subset: 'Japan' exists in both tables, so the merge adds suffixes
# (_x = launches, _y = CPI)
japan_data = merged_df[['Year', 'Japan_x','Japan_y']].dropna()
japan_data.columns = ['Year', 'Launches', 'CPI']


# compute the Pearson correlation between CPI and launches
us_corr = us_data['CPI'].corr(us_data['Launches'])
russia_corr = russia_data['CPI'].corr(russia_data['Launches'])


sns.set(style="whitegrid")
plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False  # render minus signs correctly with this font


fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(15, 4))


ax1.scatter(us_data['CPI'], us_data['Launches'], alpha=0.7, color='blue')
ax1.set_xlabel('Consumer Price Index (2010 = 100)')
ax1.set_ylabel('Number of Astronaut Launches')
ax1.set_title(f'U.S.: CPI vs. Astronaut Launches\nCorrelation: {us_corr:.3f}')
ax1.grid(True, linestyle='--', alpha=0.7)


for i, row in us_data.iterrows():
    if row['Year'] in [1969, 1985, 1992, 2009] or row['Launches'] > 40:
        ax1.annotate(str(int(row['Year'])), (row['CPI'], row['Launches']),
                    xytext=(5, 5), textcoords='offset points', fontsize=8)


ax2.scatter(russia_data['CPI'], russia_data['Launches'], alpha=0.7, color='red')
ax2.set_xlabel('Consumer Price Index (2010 = 100)')
ax2.set_ylabel('Number of Astronaut Launches')
ax2.set_title(f'Russia: CPI vs. Astronaut Launches\nCorrelation: {russia_corr:.3f}')
ax2.grid(True, linestyle='--', alpha=0.7)


for i, row in russia_data.iterrows():
    if row['Year'] in [1992, 2000, 2008, 2015] or row['Launches'] > 8:
        ax2.annotate(str(int(row['Year'])), (row['CPI'], row['Launches']),
                    xytext=(5, 5), textcoords='offset points', fontsize=8)

# Scatter plot of Japan, labeled with Japan's own correlation
japan_corr = japan_data['CPI'].corr(japan_data['Launches'])
ax3.scatter(japan_data['CPI'], japan_data['Launches'], alpha=0.7, color='green')
ax3.set_xlabel('Consumer Price Index (2010 = 100)')
ax3.set_ylabel('Number of Astronaut Launches')
ax3.set_title(f'Japan: CPI vs. Astronaut Launches\nCorrelation: {japan_corr:.3f}')
ax3.grid(True, linestyle='--', alpha=0.7)

plt.tight_layout()
plt.show()


In [None]:
from pathlib import Path
import matplotlib.pyplot as plt
from scipy.stats import pearsonr, spearmanr

In [None]:

# ---------- 1) Load data ----------
CPI_FILE = "cpi_us_russia_japan.csv"
LAUNCHES_FILE = "Top3_Countries_Annual_Launches.csv"

cpi = pd.read_csv(CPI_FILE).rename(
    columns={"Consumer price index (2010 = 100)": "CPI"}
)
launch = pd.read_csv(LAUNCHES_FILE)

# ---------- 2) Make launches long & harmonize country names ----------
launch_long = launch.melt(
    id_vars="Year",
    value_vars=["U.S.", "U.S.S.R/Russia", "Japan"],
    var_name="Entity",
    value_name="Launches"
)
launch_long["Entity"] = launch_long["Entity"].replace({
    "U.S.": "United States",
    "U.S.S.R/Russia": "Russia",
    "Japan": "Japan",
})

# Keep only countries present in CPI
valid = set(cpi["Entity"].unique())
launch_long = launch_long[launch_long["Entity"].isin(valid)].copy()

# Ensure numeric
for col in ["Year", "CPI"]:
    cpi[col] = pd.to_numeric(cpi[col], errors="coerce")
for col in ["Year", "Launches"]:
    launch_long[col] = pd.to_numeric(launch_long[col], errors="coerce")

# ---------- 3) Merge ----------
df = pd.merge(cpi, launch_long, on=["Entity", "Year"], how="inner").dropna(subset=["CPI", "Launches"])
df = df.sort_values(["Entity", "Year"]).reset_index(drop=True)

# ---------- 4) Plot per country with correlations ----------
outdir = Path("fig_cpi_vs_launches_corr")
outdir.mkdir(parents=True, exist_ok=True)

for country in df["Entity"].unique():
    sub = df[df["Entity"] == country].copy()

    # correlations (use overlapping years only)
    x = sub["CPI"].astype(float)
    y = sub["Launches"].astype(float)
    if len(sub) >= 3 and x.nunique() > 1 and y.nunique() > 1:
        r_p, r_pval = pearsonr(x, y)
        r_s, s_pval = spearmanr(x, y)
        corr_text = f"Pearson r = {r_p:.2f} (p={r_pval:.3g})\nSpearman ρ = {r_s:.2f} (p={s_pval:.3g})"
    else:
        corr_text = "Not enough variation for correlation"

    fig, ax1 = plt.subplots(figsize=(10, 5))
    ax1.set_title(f"{country}: CPI vs Annual Launches")
    ax1.plot(sub["Year"], sub["CPI"], label="CPI")
    ax1.set_xlabel("Year")
    ax1.set_ylabel("CPI (2010 = 100)")
    ax1.grid(True, linestyle="--", linewidth=0.5, alpha=0.5)

    ax2 = ax1.twinx()
    ax2.bar(sub["Year"], sub["Launches"], alpha=0.5, label="Launches")
    ax2.set_ylabel("Annual Launches")

    # Legend (combine both axes)
    lines1, labels1 = ax1.get_legend_handles_labels()
    lines2, labels2 = ax2.get_legend_handles_labels()
    ax1.legend(lines1 + lines2, labels1 + labels2, loc="upper left", frameon=False)

    # Correlation annotation
    ax1.text(
        0.01, 0.99, corr_text,
        transform=ax1.transAxes,
        ha="left", va="top",
        bbox=dict(boxstyle="round", alpha=0.15)
    )

    fig.tight_layout()
    plt.show()  # ensures display in notebook

    # also save a PNG
    fig.savefig(outdir / f"{country.replace(' ', '_')}_CPI_vs_Launches_corr.png", dpi=200)
    plt.close(fig)

print(f"Saved figures to: {outdir.resolve()}")









In [None]:

CPI_FILE = "cpi_us_russia_japan.csv"
LAUNCHES_FILE = "Top3_Countries_Annual_Launches.csv"

cpi = pd.read_csv(CPI_FILE).rename(columns={"Consumer price index (2010 = 100)": "CPI"})
launch = pd.read_csv(LAUNCHES_FILE)


cpi["Entity"] = (
    cpi["Entity"]
    .replace({"United States of America": "United States", "Russian Federation": "Russia"})
)


launch_long = launch.melt(
    id_vars="Year",
    value_vars=["U.S.", "U.S.S.R/Russia", "Japan"],
    var_name="Series",
    value_name="Launches"
)

launch_long["Year"] = pd.to_numeric(launch_long["Year"], errors="coerce")

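def map_entity(row):
    """Map a launch series name to the matching CPI entity name."""
    if row["Series"] == "U.S.":
        return "United States"
    if row["Series"] == "Japan":
        return "Japan"
    # remaining series is "U.S.S.R/Russia": "USSR" before 1991, "Russia" from 1991 on
    return "USSR" if row["Year"] < 1991 else "Russia"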

launch_long["Entity"] = launch_long.apply(map_entity, axis=1)
launch_long = launch_long.drop(columns=["Series"])



cpi["Year"] = pd.to_numeric(cpi["Year"], errors="coerce")
cpi["CPI"] = pd.to_numeric(cpi["CPI"], errors="coerce")
launch_long["Launches"] = pd.to_numeric(launch_long["Launches"], errors="coerce")


df = pd.merge(cpi, launch_long, on=["Entity", "Year"], how="inner")\
       .dropna(subset=["CPI", "Launches"])\
       .sort_values(["Entity", "Year"])\
       .reset_index(drop=True)

outdir = Path("fig_cpi_vs_launches_fixed")
outdir.mkdir(parents=True, exist_ok=True)

for country in df["Entity"].unique():
    sub = df[df["Entity"] == country].copy()


    if len(sub) >= 3 and sub["CPI"].nunique() > 1 and sub["Launches"].nunique() > 1:
        r_p, p_p = pearsonr(sub["CPI"].astype(float), sub["Launches"].astype(float))
        r_s, p_s = spearmanr(sub["CPI"].astype(float), sub["Launches"].astype(float))
        note = f"Pearson r={r_p:.2f} (p={p_p:.3g})\nSpearman ρ={r_s:.2f} (p={p_s:.3g})"
    else:
        note = "Not enough overlap/variation"

    fig, ax1 = plt.subplots(figsize=(10, 5))
    ax1.set_title(f"{country}: CPI vs Annual Launches (only years with CPI)")
    ax1.scatter(sub["Year"], sub["CPI"], label="CPI")
    ax1.set_xlabel("Year")
    ax1.set_ylabel("CPI (2010 = 100)")
    ax1.grid(True, linestyle="--", linewidth=0.6, alpha=0.5)

    ax2 = ax1.twinx()
    ax2.bar(sub["Year"], sub["Launches"], alpha=0.45, label="Launches")
    ax2.set_ylabel("Annual Launches")


    l1, lb1 = ax1.get_legend_handles_labels()
    l2, lb2 = ax2.get_legend_handles_labels()
    ax1.legend(l1 + l2, lb1 + lb2, loc="upper left", frameon=False)


    ax1.text(0.01, 0.99, note, transform=ax1.transAxes, ha="left", va="top",
             bbox=dict(boxstyle="round", alpha=0.15))

    fig.tight_layout()
    plt.show()
    fig.savefig(outdir / f"{country.replace(' ', '_')}_CPI_vs_Launches.png", dpi=200)
    plt.close(fig)

print(f"Saved figures to: {outdir.resolve()}")

**4. Results**

**International Cooperation**

1. The cooperation network shows that the U.S., Russia, and Japan are the three most connected countries and also the three most active in launches.

2. Their centrality reflects the shift from competition (Cold War) to cooperation (ISS era).

3. Smaller countries (Canada, European states, etc.) often appear only through cooperation, reinforcing its importance for access to space.
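
"Most connected" can be read in more than one way. The sketch below, on a toy graph with hypothetical edge weights, separates three readings: the number of distinct partners, the total number of shared missions, and normalized degree centrality. It uses the same `networkx` library as the notebook, but the numbers are made up and do not reproduce the results above.

```python
import networkx as nx

# Hypothetical edge weights (number of shared missions), for illustration only.
toy_edges = [
    ("U.S.", "U.S.S.R/Russia", 5),
    ("U.S.", "Japan", 3),
    ("U.S.", "Canada", 2),
    ("U.S.S.R/Russia", "Japan", 1),
]

G_toy = nx.Graph()
G_toy.add_weighted_edges_from(toy_edges)

partners = dict(G_toy.degree())                  # distinct partner countries
strength = dict(G_toy.degree(weight="weight"))   # total shared missions
centrality = nx.degree_centrality(G_toy)         # partners / (number of nodes - 1)

for country in sorted(G_toy.nodes(), key=strength.get, reverse=True):
    print(f"{country}: partners={partners[country]}, "
          f"shared missions={strength[country]}, centrality={centrality[country]:.2f}")
```

A country that appears only through one partner, like Canada in this toy graph, has low values on all three measures, which matches the pattern described for smaller countries.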

**CPI vs Launches (RQ1)**

1. United States: CPI rises steadily, while launches fluctuate with political and technological eras. The correlation is weak (Pearson r=0.27).

2. Russia: Launch activity continued despite the post-Soviet inflation crises. The correlation is weak and insignificant (r≈−0.18).

3. Japan: The correlation is moderately positive (r≈0.45), so launches did not decline as living costs rose.

👉 Across the three cases, rising CPI does not predict lower launch activity. Governments launch regardless of domestic CPI trends.

**5. Discussion**

These findings suggest two critical points:

1. Space programs are politically and strategically motivated – not directly responsive to citizens’ financial struggles. Launches persisted even in times of economic hardship (e.g., Russia in the 1990s, U.S. d

2. Cooperation, not competition, sustains space progress – our cooperation network shows that the most successful spacefaring nations are also those most engaged in international partnerships. Without U.S.–Russia–Japan collaboration, the ISS and many missions would not exist.

**6. Conclusion**

1. RQ1 Answer: People’s daily living quality (measured by CPI) shows no clear relationship with how many launches governments carry out. Space is treated as a domain of prestige, power, and long-term strategy.

2. RQ2 Answer: International cooperation strongly shapes the development of space programs. Countries that collaborate more—such as the U.S., Russia, and Japan—also sustain higher levels of space activity.

👉 Our Point: Since space investment is not adjusted for domestic economic well-being, the best way to ensure benefits for humanity is through cooperation rather than competition. By pooling resources, nations can reduce costs, expand access, and create shared scientific outcomes that eventually improve life on Earth.