## SIPRI data exploration & preprocessing

In [1]:
import pandas as pd
from pathlib import Path

In [4]:
PROJECT_ROOT = Path.cwd().parents[1]
csv_path = PROJECT_ROOT / "data" / "raw" / "military" / "sirpi-trade-register.csv"

sipri_df = pd.read_csv(csv_path, skiprows=11)  # skip metadata block

In [7]:
sipri_df.columns.tolist()

['supplier', 'recipient', 'year', 'tiv']

In [9]:
# Key columns for your network:
# - Supplier
# - Recipient  
# - Delivery year (the actual year arms were delivered)
# - TIV delivery values (the value to aggregate)

# Preprocessing steps:
# 1. Filter to 1990-2024 delivery years
# 2. Aggregate to supplier-recipient-year level (sum TIV)
# 3. Save dyadic flows

# Keep only the columns needed for the network (names already match, so no rename is needed)
sipri_df = sipri_df[['supplier', 'recipient', 'year', 'tiv']].copy()

# Filter year range
sipri_df = sipri_df[(sipri_df['year'] >= 1990) & (sipri_df['year'] <= 2024)]

# Aggregate to dyadic flows per year
arms_flows = (sipri_df
    .groupby(['supplier', 'recipient', 'year'])['tiv']
    .sum()
    .reset_index())

# Save (PROJECT_ROOT is defined above; assumes the notebook runs from notebooks/target-exploration)
out_path = PROJECT_ROOT / "data" / "processed" / "target" / "sirpi-trade-register.csv"
out_path.parent.mkdir(parents=True, exist_ok=True)  # safe even if the directory already exists

arms_flows.to_csv(out_path, index=False)

print(arms_flows.shape)
arms_flows.head()

(13169, 4)


    supplier       recipient  year   tiv
0    Albania    Burkina Faso  2011   1.2
1    Algeria  Western Sahara  2016   0.3
2     Angola   Cote d'Ivoire  2002  1.72
3  Argentina         Bolivia  2006  2.22
4  Argentina        Colombia  1990   6.0


TIV (trend-indicator value) is SIPRI's standardized measure of arms transfer volume. It is not the actual price paid; instead it is based on the production cost of the weapon relative to a baseline (a core set of weapons), which lets you compare transfers across countries and over time on a consistent basis.


So a TIV of 10 for a fighter jet transfer means the same "military capability value" whether it's sold to India or Brazil, regardless of the actual contract price (which varies based on negotiations, discounts, aid packages, etc.).


**Higher TIV = larger volume of arms transferred, read here as a proxy for greater arms dependency of the recipient on the supplier.**
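
One way to make the dependency reading concrete is to normalize each flow by the recipient's total imports in that year, so a large TIV from one supplier is distinguished from a large TIV spread across many. The sketch below is only an illustration under assumptions: the `supplier_share` column, the `add_supplier_share` helper, and the toy values are hypothetical and not part of the SIPRI data or the pipeline above. It assumes a frame shaped like `arms_flows` (`supplier`, `recipient`, `year`, `tiv`).

```python
import pandas as pd

# Toy dyadic flows in the same shape as arms_flows.
# The values are made up for illustration only.
toy_flows = pd.DataFrame({
    "supplier": ["A", "B", "A", "B"],
    "recipient": ["X", "X", "Y", "Y"],
    "year": [2000, 2000, 2000, 2000],
    "tiv": [30.0, 10.0, 5.0, 5.0],
})


def add_supplier_share(flows: pd.DataFrame) -> pd.DataFrame:
    """Add each supplier's share of the recipient's total TIV in that year.

    Hypothetical helper: a share near 1 means the recipient received
    almost all of its TIV from a single supplier in that year.
    """
    out = flows.copy()
    # Total TIV received by each recipient in each year, aligned row by row
    total = out.groupby(["recipient", "year"])["tiv"].transform("sum")
    out["supplier_share"] = out["tiv"] / total
    return out


shares = add_supplier_share(toy_flows)
print(shares)
```

In the toy data, recipient X receives three quarters of its TIV from supplier A, while recipient Y splits evenly between A and B, even though the raw TIV values alone would not show that difference in concentration.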