# **The Master Joiner: Orbital Risk Synthesis**

**Datasets:**
* **The Body:** `satcat_cleaned.csv` (Global Physical Registry - 67k+ Objects)
* **The Brain:** `ucs_cleaned.csv` (Active Intelligence Layer - 7.5k+ Payloads)

**Objective:** Fuse physical tracking data with operational intelligence to create a unified "Kinetic Master" registry for the 2026 Kessler Syndrome simulation.

### **The Engineering Challenge**
We currently possess two distinct realities: the **Physical Reality** (where objects are located) and the **Operational Reality** (what objects are doing). Merging them exposes a significant "Visibility Gap": we track ~67,000 objects but hold deep intelligence on only ~11% of them (the ~7,500 payloads in the UCS registry).

To build a valid risk model, we must implement a synthesis pipeline:
1.  **Intelligence Coupling:** Perform a prioritized **Left Join** to enrich active assets without discarding the critical debris population.
2.  **Zombie Identification:** Algorithmically identify "The Living Dead"—payloads that are physically intact but operationally defunct (e.g., Age > Design Life).
3.  **Kinetic Engineering:** Calculate orbital velocity ($v$) and kinetic energy ($E_k = \frac{1}{2}mv^2$) for every object using derived mass and fuel-fraction logic.
4.  **Risk Synthesis:** Generate `kinetic_master.csv`, the single source of truth for geopolitical and collisional risk analysis.

In [1]:
import pandas as pd
import numpy as np
from IPython.display import Markdown, display

### **Stage 1: Intelligence Coupling (The Master Merge)**
**The Problem:** We possess two disconnected datasets. The SATCAT contains the global population (including 40,000+ debris shards) but lacks mission context. The UCS registry contains deep mission intelligence but only for active payloads. Merging them blindly would duplicate mass columns and potentially discard the debris population if an inner join were used.

**The Solution:** 
* **Lean Selection:** We isolate only the "Intelligence Packet" from the UCS (metadata, mission, and lifecycle columns) to prevent column duplication.
* **Left Join Topology:** We utilize the SATCAT as the immutable backbone. This ensures that 100% of the debris and rocket bodies are retained in the final model.
* **Active Purge:** Post-merge, we immediately filter for `in_orbit == 1`, discarding decayed objects to focus the model strictly on the "Live Fire" environment of 2026.
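
As a hedged illustration of the join topology described above, the sketch below runs the same kind of left join on two tiny stand-in frames. The frames `satcat_demo` and `ucs_demo` and all their values are invented for the example; `validate` and `indicator` are standard `DataFrame.merge` options that the notebook itself does not use, shown here as one possible safeguard against duplicated keys in the right-hand table.

```python
import pandas as pd

# Toy stand-ins for the two registries (illustrative values, not catalog data).
satcat_demo = pd.DataFrame({
    "norad_id": ["00005", "25544", "90001"],
    "object_type": ["PAYLOAD", "PAYLOAD", "DEBRIS"],
    "in_orbit": [1, 1, 0],
})
ucs_demo = pd.DataFrame({
    "norad_id": ["25544"],
    "satellite_name": ["Demo Station"],
    "lifetime_years": [15.0],
})

# validate="many_to_one" raises MergeError if a key is duplicated on the right,
# which would otherwise silently multiply backbone rows.
# indicator=True adds a `_merge` column recording which rows found a match.
merged = satcat_demo.merge(
    ucs_demo, on="norad_id", how="left", validate="many_to_one", indicator=True
)

# With a unique right-hand key, a left join keeps exactly the backbone rows.
assert len(merged) == len(satcat_demo)

matched = int((merged["_merge"] == "both").sum())

# The purge step: keep only objects still in orbit.
active = merged[merged["in_orbit"] == 1].drop(columns="_merge")
print(f"rows kept: {len(merged)}, matched: {matched}, in orbit: {len(active)}")
```

Unmatched rows (the debris record here) keep their SATCAT columns and simply carry `NaN` in the UCS columns, which is why the match count can be read from `satellite_name.notna()`.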

In [2]:
# 1. Load the registries
# IDs are forced to string dtype immediately so mismatched key types cannot cause "Invisible Bug" merge failures
print("Loading Gold Standard Registries...")
satcat = pd.read_csv('../data/clean/satcat_cleaned.csv', dtype={'norad_id': str, 'cospar_id': str})
ucs = pd.read_csv('../data/clean/ucs_cleaned.csv', dtype={'norad_id': str, 'cospar_id': str})

# 2. Define the "Intelligence Packet"
# We strictly select only the metadata columns from UCS to avoid duplication.
# NOTE: 'object_type' is NOT in this list because we rely on the SATCAT version.
ucs_intelligence_cols = [
    'norad_id',              # The Key
    'satellite_name',        # Human name
    'official_name',         # Full name
    'country_operator',      # Readable Country Name
    'users',                 # Sector string
    'primary_purpose',       # Standardized Mission (e.g., "Communications")
    'detailed_purpose',      # Granular Mission
    'orbit_type',            # Geometry label (e.g., "Polar")
    'is_commercial', 'is_government', 'is_military', 'is_civil', # Sector Flags
    'lifetime_years',        # Critical for Zombie Algorithm
    'power_watts',           # Extra Physics
    'contractor', 'contractor_country'
]

# Filter UCS down to just the intelligence packet
ucs_lean = ucs[ucs_intelligence_cols].copy()

# 3. Execute The Master Merge (Left Join)
# SATCAT is the backbone (Left) so we keep all debris/rocket bodies
master = satcat.merge(ucs_lean, on='norad_id', how='left')

# 4. The "Active Orbit" Purge
# We drop decayed objects to create the kinetic baseline
pre_purge_count = len(master)
master = master[master['in_orbit'] == 1].copy()
post_purge_count = len(master)

print(f"\n{'--- MERGE & PURGE AUDIT ---':^40}")
print(f"Global Registry (Raw):    {pre_purge_count:,}")
print(f"Active Kinetic Master:    {post_purge_count:,}")
print("-" * 40)
print(f"Intelligence Matches:     {master['satellite_name'].notna().sum():,} (Active Payloads)")

Loading Gold Standard Registries...

      --- MERGE & PURGE AUDIT ---       
Global Registry (Raw):    67,264
Active Kinetic Master:    32,687
----------------------------------------
Intelligence Matches:     5,603 (Active Payloads)


### **Stage 2: Zombie Identification & Schema Enforcement**
**The Problem:**
1.  **The "Living Dead":** A satellite labeled `OPERATIONAL` in the SATCAT might have launched in 1990 with a 5-year design life. This object is a "Zombie"—a high-mass threat that appears active in simple queries but is actually a piece of debris that cannot maneuver.
2.  **Visualization Blindness:** Our raw `object_type` lumps all payloads together, failing to distinguish between active assets and dead hulks for our risk dashboards.
3.  **Schema Chaos:** Merging two massive datasets has left our columns in a random order, making the dataset difficult for scientists to scan and audit.

**The Solution:**
* **Hybrid Health Algorithm:** We identify Zombies by cross-referencing `ops_status` (SATCAT) with `lifetime_years` (UCS). Any payload exceeding its design life by >10% is flagged.
* **Category Engineering:** We derive a high-level `category` column to satisfy downstream visualization needs (`Active Satellite`, `Inactive Satellite`, `Rocket Body`, `Debris`).
* **Scientific Reordering:** We reorganize the dataframe into strict engineering domains: **Identity**, **Kinetic Profile**, **Orbital State**, **Mission Intelligence**, and **Lifecycle**.
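
The two zombie rules and the category mapping can be sketched on a toy frame as follows. This is a minimal vectorized variant that uses `np.select` in place of a row-wise function; the frame `demo` and its values are invented, while the status list and the 1.1 multiplier are taken from the notebook code.

```python
import numpy as np
import pandas as pd

# Six made-up objects covering each branch of the logic.
demo = pd.DataFrame({
    "object_type": ["PAYLOAD", "PAYLOAD", "PAYLOAD", "PAYLOAD", "DEBRIS", "ROCKET BODY"],
    "ops_status": ["OPERATIONAL", "OPERATIONAL", "NON-OPERATIONAL",
                   "OPERATIONAL", "UNKNOWN", "UNKNOWN"],
    "sat_age_years": [36, 3, 10, 20, 40, 12],
    "lifetime_years": [5.0, 7.0, 15.0, np.nan, np.nan, np.nan],
})

DEAD_STATUSES = ["NON-OPERATIONAL", "PARTIAL", "STANDBY", "DECAYED", "UNKNOWN"]
is_payload = demo["object_type"] == "PAYLOAD"

# Rule 1: the catalog already reports the payload as not fully operational.
dead_by_status = is_payload & demo["ops_status"].isin(DEAD_STATUSES)

# Rule 2: reported operational, but older than 110% of design life.
# Comparisons against NaN are False, so payloads with no known design life
# are never flagged by this rule.
dead_by_age = (
    is_payload
    & (demo["ops_status"] == "OPERATIONAL")
    & (demo["sat_age_years"] > demo["lifetime_years"] * 1.1)
)
demo["is_zombie"] = (dead_by_status | dead_by_age).astype(int)

# np.select takes the first matching condition, so order matters.
demo["category"] = np.select(
    [
        demo["object_type"] == "DEBRIS",
        demo["object_type"] == "ROCKET BODY",
        is_payload & (demo["is_zombie"] == 1),
        is_payload,
    ],
    ["Debris", "Rocket Body", "Inactive Satellite", "Active Satellite"],
    default="Unknown",
)
print(demo[["object_type", "ops_status", "is_zombie", "category"]])
```

One consequence worth keeping in mind: a payload with no UCS match has no `lifetime_years`, so it can only be flagged through its `ops_status`.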

In [3]:
# Initialize the Zombie Flag (Default to 0)
master['is_zombie'] = 0

# Define Zombie Logic
# CRITICAL NOTE: This algorithm ONLY applies to PAYLOADS (satellites).
# Debris and rocket bodies are excluded from zombie classification by design.
# The 'payload_mask' filter ensures we only flag defunct satellites, not all dead objects.
# Expected result: ~5,200-5,300 zombie payloads (verified as 5,278 in validation)
payload_mask = master['object_type'] == 'PAYLOAD'
status_zombie = payload_mask & master['ops_status'].isin(['NON-OPERATIONAL', 'PARTIAL', 'STANDBY', 'DECAYED', 'UNKNOWN'])
lifecycle_zombie = (
    payload_mask & 
    (master['ops_status'] == 'OPERATIONAL') & 
    (master['sat_age_years'] > (master['lifetime_years'] * 1.1))
)
master.loc[status_zombie | lifecycle_zombie, 'is_zombie'] = 1

# Engineer 'Category' (Visualization Fix)
def derive_category(row):
    if row['object_type'] == 'DEBRIS': return 'Debris'
    elif row['object_type'] == 'ROCKET BODY': return 'Rocket Body'
    elif row['object_type'] == 'PAYLOAD':
        return 'Inactive Satellite' if row['is_zombie'] == 1 else 'Active Satellite'
    return 'Unknown'

master['category'] = master.apply(derive_category, axis=1)

master['owner'] = master['owner_code']

# SCIENTIFIC COLUMN REORDERING (The Final Schema)
logical_order = [
    # --- IDENTITY ---
    'norad_id', 'cospar_id', 
    'object_name', 
    'satellite_name', 'official_name', 'category', 'object_type',
    
    # --- KINETIC PROFILE ---
    'launch_mass_kg', 
    'proxy_mass_kg', 
    'dry_mass_kg', 'power_watts', 'rcs', 'rcs_class',
    
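    # --- ORBITAL STATE ---
    # NOTE: the first four columns on the next line are derived in Stage 3;
    # the `if c in master.columns` filter below skips them until then.
    'velocity_kms', 'kinetic_joules', 'semi_major_axis_km', 'proxy_power_watts',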
    'orbit_class', 'orbit_type', 'period_minutes', 
    'perigee_km', 'apogee_km', 'inclination_degrees', 'eccentricity',
    
    # --- MISSION INTELLIGENCE ---
    'primary_purpose', 'detailed_purpose', 'users', 'country_operator',
    'is_commercial', 'is_government', 'is_military', 'is_civil',
    
    # --- LIFECYCLE & STATUS ---
    'launch_date', 'launch_year', 'sat_age_years', 'lifetime_years',
    'ops_status', 
    'data_status', 
    'in_orbit', 'is_zombie',
    
    # --- SUPPLY CHAIN & METADATA ---
    'owner', 'owner_code', 'contractor', 'contractor_country'
]

final_cols = [c for c in logical_order if c in master.columns]
master = master[final_cols]

# Audit and interim export (Stage 5 overwrites this file once the physics columns exist)
print(f"{'--- COMPATIBILITY AUDIT ---':^40}")
print(f"Object Name Density:      {master['object_name'].notna().mean():.1%} (Should be ~100%)")
print(f"Proxy Mass Present:       {'proxy_mass_kg' in master.columns}")
print(f"Final Schema Shape:       {master.shape[1]} columns")

output_path = '../data/clean/kinetic_master.csv'
master.to_csv(output_path, index=False)
print(f"\n✅ EXPORT SUCCESS: {len(master):,} active records saved to {output_path}")

      --- COMPATIBILITY AUDIT ---       
Object Name Density:      100.0% (Should be ~100%)
Proxy Mass Present:       True
Final Schema Shape:       40 columns

✅ EXPORT SUCCESS: 32,687 active records saved to ../data/clean/kinetic_master.csv


### **Stage 3: Kinetic & Power Engineering (The Physics Engine)**
**The Objective:** Transform static orbital elements into dynamic kinetic risk metrics and impute missing satellite bus capabilities.

**The Physics & Engineering Models:**
To quantify the "Stopping Power" and "Technological Density" of the debris field, we derive three critical variables:
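1.  **Orbital Velocity ($v$):** The mean (circular-equivalent) orbital speed, computed from the standard gravitational parameter ($\mu$) and the semi-major axis ($a$), where $a$ is recovered from the orbital period $T$ via Kepler's third law, $a = \sqrt[3]{\mu T^2 / 4\pi^2}$.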
    * $$v = \sqrt{\frac{\mu}{a}}$$
2.  **Kinetic Energy ($E_k$):** The raw impact energy, measured in Joules.
    * $$E_k = \frac{1}{2} m v^2$$
3.  **Power Capacity ($P_{proxy}$):** A mass-derived estimate for power generation (Watts), using **Regime-Specific** densities (LEO vs. GEO) to respect the distinct physics of different orbital classes.
    * $$P_{proxy} \approx m \times \left( \frac{\text{W}}{\text{kg}} \right)_{\text{orbit class}}$$

**The Result:** A dataset that describes not just *where* objects are, but *how hard* they strike and *how capable* they were at launch.
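
To make the first two formulas concrete, here is a small worked sketch using only the standard library. The function name `circular_orbit_metrics` and the sample inputs (a 90-minute period and a 1,000 kg object) are hypothetical and chosen only for illustration; the value of $\mu$ is the one used in the notebook.

```python
import math

MU_EARTH = 398600.4418  # standard gravitational parameter, km^3/s^2


def circular_orbit_metrics(period_minutes: float, mass_kg: float) -> dict:
    """Derive semi-major axis, mean speed and kinetic energy from period and mass."""
    period_s = period_minutes * 60.0
    mean_motion = 2.0 * math.pi / period_s  # rad/s

    # Kepler's third law: a^3 = mu / n^2
    a_km = (MU_EARTH / mean_motion**2) ** (1.0 / 3.0)

    # Vis-viva evaluated at r = a, i.e. the circular-orbit speed
    v_kms = math.sqrt(MU_EARTH / a_km)

    # Convert km/s to m/s so the energy comes out in joules
    e_joules = 0.5 * mass_kg * (v_kms * 1000.0) ** 2

    return {
        "semi_major_axis_km": a_km,
        "velocity_kms": v_kms,
        "kinetic_joules": e_joules,
    }


# Hypothetical example: a 1,000 kg object on a 90-minute orbit.
leo = circular_orbit_metrics(period_minutes=90.0, mass_kg=1000.0)
for name, value in leo.items():
    print(f"{name:>20}: {value:,.3f}")
```

For these inputs the semi-major axis comes out at roughly 6,650 km and the speed at roughly 7.7 km/s. Because $v = \sqrt{\mu/a}$ ignores eccentricity, it is a mean value: an object on an elliptical orbit moves faster than this at perigee and slower at apogee.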

In [4]:
# Define Astrodynamic Constants
MU = 398600.4418  # Standard Gravitational Parameter (km^3/s^2)

# Derive Semi-Major Axis (a) from Mean Motion
period_seconds = master['period_minutes'] * 60
mean_motion = (2 * np.pi) / period_seconds
master['semi_major_axis_km'] = np.cbrt(MU / mean_motion**2)

# Calculate Mean Orbital Velocity (km/s)
master['velocity_kms'] = np.sqrt(MU / master['semi_major_axis_km'])

# Calculate Kinetic Energy (Joules)
v_ms = master['velocity_kms'] * 1000
master['kinetic_joules'] = 0.5 * master['proxy_mass_kg'] * (v_ms ** 2)

# Initialize Proxy with Raw Data (Source of Truth)
master['proxy_power_watts'] = master['power_watts']

# Calculate "Tech Density" (Watts per Kg) for the KNOWN population
known_set = master.dropna(subset=['power_watts', 'proxy_mass_kg']).copy()
known_set['power_density'] = known_set['power_watts'] / known_set['proxy_mass_kg']

# Build the "Smart Lookup" Table (Median Density by Orbit Class)
# This captures the fact that GEO buses are fundamentally different from LEO cubesats.
orbit_density_map = known_set.groupby('orbit_class')['power_density'].median()
global_density = known_set['power_density'].median() # Fallback

print("--- SMART POWER MODEL PARAMETERS ---")
print(orbit_density_map)
print(f"Global Fallback: {global_density:.4f} W/kg")

# The Smart Imputation Function
def impute_power_smart(row):
    # If we already have data (Real or already filled), keep it.
    if pd.notna(row['proxy_power_watts']):
        return row['proxy_power_watts']
    
    # If it's a PAYLOAD, Model the Capacity
    if row['object_type'] == 'PAYLOAD':
        mass = row['proxy_mass_kg']
        orbit = row['orbit_class']
        
        # Look up the specific ratio for this orbit (e.g., GEO ratio)
        # If orbit is unknown/weird, use global median
        ratio = orbit_density_map.get(orbit, global_density)
        
        return mass * ratio
        
    # Debris and rocket bodies have no power capacity
    return 0.0

# Apply the Smart Logic
master['proxy_power_watts'] = master.apply(impute_power_smart, axis=1)

# Final Schema Update
new_cols = ['velocity_kms', 'kinetic_joules', 'semi_major_axis_km', 'proxy_power_watts']
# Move the derived columns to the end of the Kinetic Profile block (right after 'rcs_class').
# The index is computed on the filtered list so it stays correct regardless of column order.
final_schema = [c for c in logical_order if c not in new_cols]
if 'rcs_class' in final_schema:
    insert_idx = final_schema.index('rcs_class') + 1
    final_schema = final_schema[:insert_idx] + new_cols + final_schema[insert_idx:]
else:
    final_schema = master.columns.tolist()

master = master[[c for c in final_schema if c in master.columns]]

print(f"\n{'--- PHYSICS & ENGINEERING AUDIT ---':^50}")
print(f"Velocity Calculated:      {master['velocity_kms'].notna().mean():.1%}")
print(f"Kinetic Energy Calculated:{master['kinetic_joules'].notna().mean():.1%}")
print(f"Power Proxy Density:      {master['proxy_power_watts'].notna().mean():.1%} (Fully Imputed)")

--- SMART POWER MODEL PARAMETERS ---
orbit_class
ELLIPTICAL    0.400000
GEO           2.203390
LEO           0.461538
MEO           1.081081
Name: power_density, dtype: float64
Global Fallback: 0.4615 W/kg

       --- PHYSICS & ENGINEERING AUDIT ---        
Velocity Calculated:      100.0%
Kinetic Energy Calculated:100.0%
Power Proxy Density:      100.0% (Fully Imputed)
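
The regime-specific power imputation can also be expressed without a row-wise `apply`. The sketch below is one possible vectorized equivalent on a five-row toy frame; every value in `demo` is made up, and the variable names are hypothetical. It follows the same three-way rule as `impute_power_smart`: keep observed power, model payloads from mass and orbit-class density, and assign zero to everything else.

```python
import numpy as np
import pandas as pd

# Invented sample: two payloads with known power, two without, one debris shard.
demo = pd.DataFrame({
    "object_type": ["PAYLOAD", "PAYLOAD", "PAYLOAD", "PAYLOAD", "DEBRIS"],
    "orbit_class": ["LEO", "LEO", "GEO", "MEO", "LEO"],
    "proxy_mass_kg": [100.0, 200.0, 1000.0, 500.0, 5.0],
    "power_watts": [50.0, np.nan, 2000.0, np.nan, np.nan],
})

# Power density (W/kg) for the rows where both inputs are known.
known = demo.dropna(subset=["power_watts", "proxy_mass_kg"])
density = known["power_watts"] / known["proxy_mass_kg"]

# Median density per orbit class, plus a global fallback for unseen classes.
by_orbit = density.groupby(known["orbit_class"]).median()
fallback = density.median()

# Map each row to its class ratio; classes missing from the lookup get the fallback.
ratio = demo["orbit_class"].map(by_orbit).fillna(fallback)
modelled = demo["proxy_mass_kg"] * ratio

is_payload = demo["object_type"] == "PAYLOAD"
demo["proxy_power_watts"] = demo["power_watts"].where(
    demo["power_watts"].notna(),     # keep observed values
    modelled.where(is_payload, 0.0)  # model payloads, zero for everything else
)
print(demo[["object_type", "orbit_class", "power_watts", "proxy_power_watts"]])
```

In this toy data the MEO payload has no observed peers, so it falls back to the global median density, which mirrors the `orbit_density_map.get(orbit, global_density)` lookup in the cell above.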


### **Stage 4: The Kinetic Master Dictionary**
**Objective:** Define the "Gold Standard" schema for downstream analysts.

**Scope Legend:**
* **Global:** Mandatory data for every object (Debris, Rocket Bodies, Payloads).
* **Payload:** Contextual data relevant only for Satellites (Assets).

#### **1. Identity & Classification**
The "Who" and "What" of the orbital environment.
| Feature Name | Type | Scope | Description |
| :--- | :--- | :--- | :--- |
| `norad_id` | `str` | **Global** | **Primary Key.** Unique USSPACECOM catalog number. |
| `cospar_id` | `str` | **Global** | International Designator (Launch Year + Launch Number). |
| `object_name` | `str` | **Global** | **Universal Label.** Defaults to UCS name for Payloads; SATCAT for Debris. |
| `satellite_name` | `str` | Payload | Commercial/Common Name (Payloads only). |
| `official_name` | `str` | Payload | Official Registry Name (Payloads only). |
| `category` | `str` | **Global** | **Engineered Risk Class:** `Active Satellite`, `Inactive Satellite`, `Rocket Body`, `Debris` (`Unknown` for any other `object_type`). |
| `object_type` | `str` | **Global** | Raw Source Class: `PAYLOAD`, `ROCKET BODY`, `DEBRIS`. |

#### **2. Kinetic Profile (The Physics)**
The "Engine" of the risk model.
| Feature Name | Type | Scope | Description |
| :--- | :--- | :--- | :--- |
| `launch_mass_kg` | `float` | Payload | Mass at time of launch (Raw Data). |
| `proxy_mass_kg` | `float` | **Global** | **Mass Model.** Uses High-Fidelity UCS data or ESA Tier 1 Proxies. |
| `dry_mass_kg` | `float` | **Global** | Structural Mass (w/o Fuel). Derived via fuel-fraction constants. |
| `velocity_kms` | `float` | **Global** | **Mean Velocity.** Circular-orbit speed, i.e. the vis-viva equation evaluated at $r = a$ ($v=\sqrt{\mu/a}$). |
| `kinetic_joules` | `float` | **Global** | **Impact Energy.** ($E_k = \frac{1}{2}mv^2$). The raw destructive potential. |
| `power_watts` | `float` | Payload | Electrical power generation capacity (Raw Data). |
| `proxy_power_watts` | `float` | Payload* | **Power Model.** Imputed Capacity (Watts). *Note: Set to 0 for Debris and Rocket Bodies.* |
| `rcs` | `float` | **Global** | Radar Cross Section ($m^2$). Raw radar return size. |
| `rcs_class` | `str` | **Global** | Standardized Size Category: `LARGE`, `MEDIUM`, `SMALL`. |

#### **3. Orbital State & Geometry**
Where the object is located in 3D space.
| Feature Name | Type | Scope | Description |
| :--- | :--- | :--- | :--- |
| `orbit_class` | `str` | **Global** | Standardized Regime: `LEO`, `MEO`, `GEO`, `ELLIPTICAL`. |
| `orbit_type` | `str` | Payload | Geometric Classification (e.g., `Polar`, `Sun-Synchronous`). |
| `perigee_km` | `float` | **Global** | Closest approach to Earth surface. |
| `apogee_km` | `float` | **Global** | Farthest distance from Earth surface. |
| `semi_major_axis_km` | `float` | **Global** | Half the long axis of the orbital ellipse (the radius, for a circular orbit). Base variable for velocity calc. |
| `inclination_degrees` | `float` | **Global** | Angle relative to the equator (Critical for Polar congestion). |
| `eccentricity` | `float` | **Global** | Deviation from a perfect circle (0 = Circular). |
| `period_minutes` | `float` | **Global** | Time to complete one full orbit. |

#### **4. Mission Intelligence**
Contextual data describing the function and users of the asset.
| Feature Name | Type | Scope | Description |
| :--- | :--- | :--- | :--- |
| `primary_purpose` | `str` | Payload | Mission Type (e.g., `Communications`, `Earth Observation`). |
| `detailed_purpose` | `str` | Payload | Granular Mission Detail. |
| `users` | `str` | Payload | Consolidated User String (e.g., "Commercial/Military"). |
| `country_operator` | `str` | Payload | The nation operating the specific payload. |
| `is_commercial` | `int` | Payload | Flag: Mission has commercial utility. |
| `is_government` | `int` | Payload | Flag: Mission has government utility. |
| `is_military` | `int` | Payload | Flag: Mission has military utility. |
| `is_civil` | `int` | Payload | Flag: Mission has civil utility. |

#### **5. Lifecycle & Status**
The temporal health of the object.
| Feature Name | Type | Scope | Description |
| :--- | :--- | :--- | :--- |
| `launch_date` | `date` | **Global** | Precise date of orbital insertion. |
| `launch_year` | `int` | **Global** | Year of Launch. |
| `sat_age_years` | `int` | **Global** | Object Age ($2026 - \text{launch year}$). |
| `lifetime_years` | `float` | Payload | Design Life expectancy. |
| `ops_status` | `str` | **Global** | Operational Status: `OPERATIONAL`, `DECAYED`, etc. |
| `data_status` | `str` | **Global** | Tracking Health: `Lost` = Issue; `NaN` = Healthy. |
| `in_orbit` | `int` | **Global** | Binary Flag: `1` = Currently in orbit. |
| `is_zombie` | `int` | Payload | **Risk Flag:** `1` if the Payload has a non-operational `ops_status`, or is `OPERATIONAL` but older than 1.1 × Design Life. |

#### **6. Supply Chain & Metadata**
Geopolitical ownership and manufacturing lineage.
| Feature Name | Type | Scope | Description |
| :--- | :--- | :--- | :--- |
| `owner` | `str` | **Global** | **Operator Code.** Currently a direct copy of `owner_code` (e.g., `US`, `PRC`). |
| `owner_code` | `str` | **Global** | Raw Source Code from SATCAT. |
| `contractor` | `str` | Payload | Prime Manufacturer of the bus/payload. |
| `contractor_country` | `str` | Payload | Nation of manufacture. |

### **Stage 5: Executive Completion Report**
**Objective:** Generate a dynamic "State of the Registry" executive summary to validate the final export.

**The "Vital Signs" Audit:**
Before exporting the `kinetic_master.csv`, we programmatically calculate the critical metrics that will define the 2026 simulation:
1.  **Kinetic Load:** Total mass (Kilotons) and energy (Terajoules) in the active environment.
2.  **The Zombie Index:** The share of payloads flagged `is_zombie` (non-operational status, or operating beyond 110% of design life).
3.  **Geopolitical Footprint:** A breakdown of the top 3 nations by on-orbit mass.
4.  **Schema Density:** A final "Pass/Fail" integrity check on critical engineering columns.
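
The first three vital signs reduce to a few aggregations and unit conversions. The sketch below shows them on a four-row toy frame with invented values; the constant names are hypothetical. It uses `nlargest`, which is an assumption about style rather than the notebook's own approach, because it does not fail when fewer than three owners are present.

```python
import pandas as pd

# Invented sample of four objects.
demo = pd.DataFrame({
    "owner": ["US", "US", "CIS", "PRC"],
    "object_type": ["PAYLOAD", "DEBRIS", "PAYLOAD", "PAYLOAD"],
    "is_zombie": [0, 0, 1, 0],
    "proxy_mass_kg": [2000.0, 500.0, 1500.0, 1000.0],
    "kinetic_joules": [6.0e10, 1.5e10, 4.5e10, 3.0e10],
})

KG_PER_KILOTON = 1_000_000     # 1 kt = 10^6 kg
KG_PER_TONNE = 1_000           # 1 t  = 10^3 kg
JOULES_PER_TERAJOULE = 1e12    # 1 TJ = 10^12 J

# 1. Kinetic load
total_mass_kt = demo["proxy_mass_kg"].sum() / KG_PER_KILOTON
total_energy_tj = demo["kinetic_joules"].sum() / JOULES_PER_TERAJOULE

# 2. Zombie index: zombies as a share of payloads only
payloads = demo[demo["object_type"] == "PAYLOAD"]
zombie_rate = payloads["is_zombie"].mean() if len(payloads) else 0.0

# 3. Geopolitical footprint: up to three owners, ranked by mass in metric tons
top_owners = demo.groupby("owner")["proxy_mass_kg"].sum().nlargest(3) / KG_PER_TONNE

print(f"mass: {total_mass_kt:.4f} kt, energy: {total_energy_tj:.2f} TJ, "
      f"zombie rate: {zombie_rate:.1%}")
for rank, (owner, tonnes) in enumerate(top_owners.items(), start=1):
    print(f"{rank}. {owner}: {tonnes:,.1f} t")
```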

In [5]:
output_path = '../data/clean/kinetic_master.csv'

# Calculate Vital Signs
total_objects = len(master)
total_mass_kt = master['proxy_mass_kg'].sum() / 1_000_000  # Kilotons
total_energy_tj = master['kinetic_joules'].sum() / 1_000_000_000_000 # Terajoules

# Breakdown by Category
counts = master['category'].value_counts()
debris_count = counts.get('Debris', 0)
rbody_count = counts.get('Rocket Body', 0)
active_count = counts.get('Active Satellite', 0)
inactive_count = counts.get('Inactive Satellite', 0)

# Zombie Analysis
payloads = master[master['object_type'] == 'PAYLOAD']
zombie_total = payloads['is_zombie'].sum()
zombie_rate = (zombie_total / len(payloads)) if len(payloads) > 0 else 0

# Geopolitical Footprint
mass_owners = master.groupby('owner')['proxy_mass_kg'].sum().sort_values(ascending=False).head(3)
o1_n, o1_m = mass_owners.index[0], mass_owners.values[0] / 1000 # Metric Tons
o2_n, o2_m = mass_owners.index[1], mass_owners.values[1] / 1000
o3_n, o3_m = mass_owners.index[2], mass_owners.values[2] / 1000

# Define Column Configurations
column_specs = [
    # IDENTITY
    ('IDENTITY', 'norad_id', 'Global'), ('IDENTITY', 'cospar_id', 'Global'),
    ('IDENTITY', 'object_name', 'Global'), ('IDENTITY', 'satellite_name', 'Payload'),
    ('IDENTITY', 'official_name', 'Payload'), ('IDENTITY', 'category', 'Global'),
    ('IDENTITY', 'object_type', 'Global'),

    # KINETIC (Physics & Mass)
    ('KINETIC', 'proxy_mass_kg', 'Global'), ('KINETIC', 'dry_mass_kg', 'Global'),
    ('KINETIC', 'launch_mass_kg', 'Payload'), 
    ('KINETIC', 'velocity_kms', 'Global'), ('KINETIC', 'kinetic_joules', 'Global'),
    ('KINETIC', 'power_watts', 'Payload'), ('KINETIC', 'rcs', 'Global'),
    ('KINETIC', 'proxy_power_watts', 'Payload'),('KINETIC', 'rcs_class', 'Global'),

    # ORBITAL STATE
    ('ORBIT', 'orbit_class', 'Global'), ('ORBIT', 'orbit_type', 'Payload'),
    ('ORBIT', 'period_minutes', 'Global'), ('ORBIT', 'perigee_km', 'Global'),
    ('ORBIT', 'apogee_km', 'Global'), ('ORBIT', 'inclination_degrees', 'Global'),
    ('ORBIT', 'eccentricity', 'Global'), ('ORBIT', 'semi_major_axis_km', 'Global'),

    # MISSION INTELLIGENCE
    ('MISSION', 'primary_purpose', 'Payload'), ('MISSION', 'detailed_purpose', 'Payload'),
    ('MISSION', 'users', 'Payload'), ('MISSION', 'is_commercial', 'Payload'),
    ('MISSION', 'is_government', 'Payload'), ('MISSION', 'is_military', 'Payload'),
    ('MISSION', 'is_civil', 'Payload'),

    # LIFECYCLE
    ('LIFECYCLE', 'launch_date', 'Global'), ('LIFECYCLE', 'launch_year', 'Global'),
    ('LIFECYCLE', 'sat_age_years', 'Global'), ('LIFECYCLE', 'lifetime_years', 'Payload'),
    ('LIFECYCLE', 'ops_status', 'Global'), ('LIFECYCLE', 'data_status', 'Global'),
    ('LIFECYCLE', 'in_orbit', 'Global'), ('LIFECYCLE', 'is_zombie', 'Payload'),

    # SUPPLY CHAIN
    ('SUPPLY CHAIN', 'owner', 'Global'), ('SUPPLY CHAIN', 'owner_code', 'Global'),
    ('SUPPLY CHAIN', 'country_operator', 'Payload'), 
    ('SUPPLY CHAIN', 'contractor', 'Payload'), ('SUPPLY CHAIN', 'contractor_country', 'Payload')
]

# Calculate Density & Build Sortable List
audit_data = []
for domain, col, scope in column_specs:
    if col not in master.columns:
        continue

    # Calculate Density
    if scope == 'Payload':
        subset = master[master['object_type'] == 'PAYLOAD']
        density = subset[col].notna().mean()
        label = f"{density:.1%} (Payloads)"
        status = "✅ OK" if density > 0.9 else "⚠️ LOW"
    else:
        density = master[col].notna().mean()
        label = f"{density:.1%} (Global)"
        status = "✅ OK" if density > 0.99 else "⚠️ LOW"
        if col == 'data_status': status = "ℹ️ INFO" # Exception

    audit_data.append({
        'Domain': domain,
        'Feature': col,
        'Density': density, 
        'Label': label,
        'Scope': scope,
        'Status': status
    })

# Sort Data
domain_order = {
    'IDENTITY': 1, 'KINETIC': 2, 'ORBIT': 3, 
    'MISSION': 4, 'LIFECYCLE': 5, 'SUPPLY CHAIN': 6
}
audit_data.sort(key=lambda x: (domain_order[x['Domain']], -x['Density']))

report = f"""
### **Kinetic Master Pipeline: Completion Report**
**Global Registry Status:** {total_objects:,} Tracked Objects (In-Orbit)

#### **The Kinetic Environment (2026 Simulation)**
| Metric | Value | Unit |
| :--- | :--- | :--- |
| **Total Orbital Mass** | **{total_mass_kt:.2f}** | **Kilotons** |
| **Total Kinetic Energy** | **{total_energy_tj:.2f}** | **Terajoules** |
| **Zombie Satellites** | {zombie_total:,} | {zombie_rate:.1%} of Payload Fleet |

#### **Population Composition**
| Category | Count | Share |
| :--- | :--- | :--- |
| **Debris** | {debris_count:,} | {debris_count/total_objects:.1%} |
| **Rocket Bodies** | {rbody_count:,} | {rbody_count/total_objects:.1%} |
| **Active Satellites** | {active_count:,} | {active_count/total_objects:.1%} |
| **Inactive Satellites** | {inactive_count:,} | {inactive_count/total_objects:.1%} |

#### **Top 3 Mass Owners**
| Rank | Nation/Entity | Total Mass (Metric Tons) |
| :--- | :--- | :--- |
| 1 | **{o1_n}** | **{o1_m:,.0f} t** |
| 2 | **{o2_n}** | {o2_m:,.0f} t |
| 3 | **{o3_n}** | {o3_m:,.0f} t |

#### **Full Schema Integrity Audit (Sorted by Density)**
| Domain | Feature | Density | Scope | Status |
| :--- | :--- | :--- | :--- | :--- |
"""

for row in audit_data:
    report += f"| {row['Domain']} | `{row['Feature']}` | {row['Label']} | {row['Scope']} | {row['Status']} |\n"

display(Markdown(report))

master.to_csv(output_path, index=False)
print(f"✅ KINETIC MASTER EXPORTED: {len(master):,} records saved to {output_path}")


### **Kinetic Master Pipeline: Completion Report**
**Global Registry Status:** 32,687 Tracked Objects (In-Orbit)

#### **The Kinetic Environment (2026 Simulation)**
| Metric | Value | Unit |
| :--- | :--- | :--- |
| **Total Orbital Mass** | **16.28** | **Kilotons** |
| **Total Kinetic Energy** | **333.37** | **Terajoules** |
| **Zombie Satellites** | 5,278 | 30.1% of Payload Fleet |

#### **Population Composition**
| Category | Count | Share |
| :--- | :--- | :--- |
| **Debris** | 12,672 | 38.8% |
| **Rocket Bodies** | 2,401 | 7.3% |
| **Active Satellites** | 12,284 | 37.6% |
| **Inactive Satellites** | 5,278 | 16.1% |

#### **Top 3 Mass Owners**
| Rank | Nation/Entity | Total Mass (Metric Tons) |
| :--- | :--- | :--- |
| 1 | **US** | **7,774 t** |
| 2 | **CIS** | 3,315 t |
| 3 | **PRC** | 1,652 t |

#### **Full Schema Integrity Audit (Sorted by Density)**
| Domain | Feature | Density | Scope | Status |
| :--- | :--- | :--- | :--- | :--- |
| IDENTITY | `norad_id` | 100.0% (Global) | Global | ✅ OK |
| IDENTITY | `cospar_id` | 100.0% (Global) | Global | ✅ OK |
| IDENTITY | `object_name` | 100.0% (Global) | Global | ✅ OK |
| IDENTITY | `category` | 100.0% (Global) | Global | ✅ OK |
| IDENTITY | `object_type` | 100.0% (Global) | Global | ✅ OK |
| IDENTITY | `satellite_name` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| IDENTITY | `official_name` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| KINETIC | `proxy_mass_kg` | 100.0% (Global) | Global | ✅ OK |
| KINETIC | `dry_mass_kg` | 100.0% (Global) | Global | ✅ OK |
| KINETIC | `velocity_kms` | 100.0% (Global) | Global | ✅ OK |
| KINETIC | `kinetic_joules` | 100.0% (Global) | Global | ✅ OK |
| KINETIC | `rcs` | 100.0% (Global) | Global | ✅ OK |
| KINETIC | `proxy_power_watts` | 100.0% (Payloads) | Payload | ✅ OK |
| KINETIC | `rcs_class` | 100.0% (Global) | Global | ✅ OK |
| KINETIC | `launch_mass_kg` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| KINETIC | `power_watts` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| ORBIT | `orbit_class` | 100.0% (Global) | Global | ✅ OK |
| ORBIT | `period_minutes` | 100.0% (Global) | Global | ✅ OK |
| ORBIT | `perigee_km` | 100.0% (Global) | Global | ✅ OK |
| ORBIT | `apogee_km` | 100.0% (Global) | Global | ✅ OK |
| ORBIT | `inclination_degrees` | 100.0% (Global) | Global | ✅ OK |
| ORBIT | `eccentricity` | 100.0% (Global) | Global | ✅ OK |
| ORBIT | `semi_major_axis_km` | 100.0% (Global) | Global | ✅ OK |
| ORBIT | `orbit_type` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| MISSION | `primary_purpose` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| MISSION | `detailed_purpose` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| MISSION | `users` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| MISSION | `is_commercial` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| MISSION | `is_government` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| MISSION | `is_military` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| MISSION | `is_civil` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| LIFECYCLE | `launch_date` | 100.0% (Global) | Global | ✅ OK |
| LIFECYCLE | `launch_year` | 100.0% (Global) | Global | ✅ OK |
| LIFECYCLE | `sat_age_years` | 100.0% (Global) | Global | ✅ OK |
| LIFECYCLE | `ops_status` | 100.0% (Global) | Global | ✅ OK |
| LIFECYCLE | `in_orbit` | 100.0% (Global) | Global | ✅ OK |
| LIFECYCLE | `is_zombie` | 100.0% (Payloads) | Payload | ✅ OK |
| LIFECYCLE | `lifetime_years` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| LIFECYCLE | `data_status` | 2.9% (Global) | Global | ℹ️ INFO |
| SUPPLY CHAIN | `owner` | 100.0% (Global) | Global | ✅ OK |
| SUPPLY CHAIN | `owner_code` | 100.0% (Global) | Global | ✅ OK |
| SUPPLY CHAIN | `country_operator` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| SUPPLY CHAIN | `contractor` | 31.8% (Payloads) | Payload | ⚠️ LOW |
| SUPPLY CHAIN | `contractor_country` | 31.8% (Payloads) | Payload | ⚠️ LOW |


✅ KINETIC MASTER EXPORTED: 32,687 records saved to ../data/clean/kinetic_master.csv


In [6]:
# Objective: Verify the exported CSV reloads with 100% physical and structural integrity.

output_path = '../data/clean/kinetic_master.csv'

try:
    print(f"--- Cold-Load Technical Audit: {output_path} ---")
    
    # 1. Load the file (Simulating a fresh start)
    # NOTE: dtypes are deliberately left unspecified, so pandas infers them from the file.
    # All-numeric IDs are typically read back as integers; downstream users who need
    # string IDs should pass dtype={'norad_id': str}.
    df_verify = pd.read_csv(output_path)
    
    # 2. Primary Key Integrity
    # NORAD ID must be unique
    is_unique = df_verify['norad_id'].is_unique
    print(f"{'NORAD ID Uniqueness':<30} | {'✅ VERIFIED' if is_unique else '❌ DUPLICATES FOUND'}")
    
    # 3. Kinetic Physics Validation (The Engine Check)
    # Mass, velocity, and kinetic energy must be strictly positive for ALL objects.
    # We count rows that are NOT > 0, so NaN values are caught as well
    # (a plain `<= 0` comparison is False for NaN and would let them pass).
    phys_errors = (
        (~(df_verify['proxy_mass_kg'] > 0)).sum() + 
        (~(df_verify['velocity_kms'] > 0)).sum() +
        (~(df_verify['kinetic_joules'] > 0)).sum()
    )
    print(f"{'Kinetic Physics (Mass/Vel/E_k)':<30} | {'✅ VERIFIED' if phys_errors == 0 else f'❌ {phys_errors} INVALID VALUES'}")
    
    # 4. Critical Scope Density (The "Global" Check)
    # These columns MUST be 100% full (no NaNs).
    global_scope = ['norad_id', 'object_name', 'category', 'proxy_mass_kg', 'velocity_kms', 'orbit_class']
    null_counts = df_verify[global_scope].isnull().sum().sum()
    print(f"{'Global Scope Density':<30} | {'✅ 100.0%' if null_counts == 0 else f'❌ {null_counts} UNEXPECTED NULLS'}")
    
    # 5. Data Type check
    # 'is_zombie' should be an integer (0/1), not a float (0.0/1.0)
    zombie_is_int = pd.api.types.is_integer_dtype(df_verify['is_zombie'])
    print(f"{'Zombie Flag Type (Int)':<30} | {'✅ VERIFIED' if zombie_is_int else '⚠️ TYPE WARNING (Float)'}")

    # 6. Final Count
    print("-" * 60)
    print(f"✅ READY FOR ANALYSIS: {len(df_verify):,} verified objects locked in {output_path}")

except FileNotFoundError:
    print(f"❌ Error: File not found at {output_path}")
except Exception as e:
    print(f"❌ CRITICAL FAILURE: {e}")

--- Cold-Load Technical Audit: ../data/clean/kinetic_master.csv ---
NORAD ID Uniqueness            | ✅ VERIFIED
Kinetic Physics (Mass/Vel/E_k) | ✅ VERIFIED
Global Scope Density           | ✅ 100.0%
Zombie Flag Type (Int)         | ✅ VERIFIED
------------------------------------------------------------
✅ READY FOR ANALYSIS: 32,687 verified objects locked in ../data/clean/kinetic_master.csv
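
One caveat of the cold load is that ID columns written as strings do not necessarily come back as strings. The short sketch below demonstrates this with an in-memory CSV; the two sample rows are invented, and it assumes that some IDs are stored zero-padded, which may or may not hold for the real file.

```python
import io

import pandas as pd

# A two-row stand-in for the exported file (invented values).
csv_text = "norad_id,is_zombie\n00005,0\n25544,1\n"

# Default inference: all-numeric IDs are parsed as integers, dropping leading zeros.
inferred = pd.read_csv(io.StringIO(csv_text))

# Explicit dtype: the IDs survive the round trip exactly as written.
typed = pd.read_csv(io.StringIO(csv_text), dtype={"norad_id": str})

print("inferred:", inferred["norad_id"].tolist())
print("typed:   ", typed["norad_id"].tolist())
```

If downstream notebooks join on `norad_id`, reloading with the same `dtype` mapping used in Stage 1 keeps the key type consistent across files.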
