# Space Mission Log Analysis

## Challenge Objective
Find the security code of the **longest successful Mars mission** in the database.

## Problem Requirements
- Analyze the `space_missions.log` file
- Fields are separated by '|' characters
- Filter for:
  - Destination: Mars
  - Status: Completed
- Find the mission with the maximum Duration (days)
- Extract Security Code (format: ABC-123-XYZ)
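
The stated code format can be checked mechanically before an answer is reported. The snippet below is a minimal sketch, assuming that `ABC-123-XYZ` means three uppercase ASCII letters, three digits, and three uppercase ASCII letters joined by hyphens; the names `SECURITY_CODE_RE` and `is_valid_security_code` are hypothetical helpers, not part of the original notebook.

```python
import re

# Assumed reading of "ABC-123-XYZ": 3 uppercase letters, 3 digits,
# 3 uppercase letters, separated by hyphens.
SECURITY_CODE_RE = re.compile(r"[A-Z]{3}-[0-9]{3}-[A-Z]{3}")


def is_valid_security_code(code: str) -> bool:
    """Return True if the whole (whitespace-trimmed) string matches the assumed format."""
    return SECURITY_CODE_RE.fullmatch(code.strip()) is not None


print(is_valid_security_code(" ALP-780-GBT "))  # surrounding spaces are ignored
print(is_valid_security_code("ALP-78-GBT"))     # too few digits
```

Using `fullmatch` rather than `search` rejects values that merely contain a valid-looking code inside a longer string.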

## Data Format
- Field order: Date | Mission ID | Destination | Status | Crew Size | Duration (days) | Success Rate | Security Code
- File contains commented lines (starting with #)
- Field separators may have inconsistent spacing


In [1]:
# Import necessary libraries (only pandas is used in this notebook)
import pandas as pd


## Step 1: Read and Parse the Log File

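We need to:
1. Skip blank lines and commented lines (starting with #)
2. Skip non-mission lines (prefixed with `SYSTEM:`, `CONFIG:`, or `CHECKSUM:`)
3. Split on '|' and strip the inconsistent spacing around each field
4. Keep only lines with all 8 fields, converting Crew Size, Duration, and Success Rate to numbers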


In [2]:
# Read the log file and parse it
log_file = 'space_missions.log'

# List to store parsed mission data
missions = []

# Helper function to safely convert to int
def safe_int(value):
    try:
        return int(value.strip())
    except (ValueError, AttributeError):
        return None

# Helper function to safely convert to float
def safe_float(value):
    try:
        return float(value.strip())
    except (ValueError, AttributeError):
        return None

# Read file line by line
with open(log_file, 'r', encoding='utf-8') as f:
    for line in f:
        # Skip commented lines and empty lines
        stripped_line = line.strip()
        if not stripped_line or stripped_line.startswith('#'):
            continue
        
        # Skip system/config lines that don't contain mission data
        if stripped_line.startswith(('SYSTEM:', 'CONFIG:', 'CHECKSUM:')):
            continue
        
        # Split by '|' and strip whitespace from each field
        fields = [field.strip() for field in stripped_line.split('|')]
        
        # Keep only lines with the expected number of fields (8).
        # safe_int/safe_float return None for values that cannot be converted.
        if len(fields) == 8:
            missions.append({
                'Date': fields[0],
                'Mission_ID': fields[1],
                'Destination': fields[2],
                'Status': fields[3],
                'Crew_Size': safe_int(fields[4]),
                'Duration': safe_int(fields[5]),
                'Success_Rate': safe_float(fields[6]),
                'Security_Code': fields[7]
            })

print(f"Total missions parsed: {len(missions)}")
print(f"\nFirst few missions:")
for i, mission in enumerate(missions[:3]):
    print(f"\nMission {i+1}:")
    for key, value in mission.items():
        print(f"  {key}: {value}")


Total missions parsed: 100000

First few missions:

Mission 1:
  Date: 2048-09-07
  Mission_ID: WXI-0590
  Destination: Io
  Status: In Progress
  Crew_Size: 6
  Duration: 380
  Success_Rate: 69.57
  Security_Code: ALP-780-GBT

Mission 2:
  Date: 2041-09-13
  Mission_ID: EYO-5723
  Destination: Moon
  Status: Partial Success
  Crew_Size: 6
  Duration: 48
  Success_Rate: 48.53
  Security_Code: HRV-950-OIS

Mission 3:
  Date: 2048-07-27
  Mission_ID: HQI-6628
  Destination: Mercury
  Status: Failed
  Crew_Size: 5
  Duration: 81
  Success_Rate: 18.46
  Security_Code: ZKP-703-FKM


## Step 2: Convert to DataFrame for Easier Analysis
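The cell in this step builds the DataFrame from the parsed `missions` list. A possible shortcut, sketched here under assumptions, is to let `pd.read_csv` do the parsing: a regular-expression separator absorbs the inconsistent spacing, and `comment='#'` drops the commented lines. The in-memory sample, the `SYSTEM:` line content, and the names `COLUMNS`, `sample_log`, and `df_alt` are hypothetical; the real call would pass `'space_missions.log'` instead of `sample_log`.

```python
import io

import pandas as pd

# Column names follow the field order given in the Data Format section.
COLUMNS = ["Date", "Mission_ID", "Destination", "Status",
           "Crew_Size", "Duration", "Success_Rate", "Security_Code"]

# Hypothetical sample standing in for space_missions.log; the spacing
# and the SYSTEM line are made up for illustration.
sample_log = io.StringIO(
    "# mission log export\n"
    "2042-11-22 |LLG-5460 |  Mars|Completed | 2 | 476 | 14.32 | AAJ-640-DZG\n"
    "SYSTEM: log rotated\n"
    "2048-09-07|WXI-0590 | Io | In Progress | 6|380 | 69.57 |ALP-780-GBT\n"
)

df_alt = pd.read_csv(
    sample_log,
    sep=r"\s*\|\s*",   # a pipe with any amount of surrounding whitespace
    engine="python",   # regex separators require the Python parsing engine
    comment="#",       # ignore commented lines
    header=None,
    names=COLUMNS,
)

# Lines without pipes (such as the SYSTEM line) are padded with missing
# values, so dropping incomplete rows removes them.
df_alt = df_alt.dropna(subset=COLUMNS)
print(df_alt[["Mission_ID", "Destination", "Duration"]])
```

One trade-off of this approach: any column that held a missing value before `dropna` is read as float, so `Crew_Size` and `Duration` may need `.astype(int)` afterwards. The explicit loop in Step 1 avoids that and gives finer control over which lines are rejected.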


In [3]:
# Convert to pandas DataFrame
df = pd.DataFrame(missions)

print(f"DataFrame shape: {df.shape}")
print(f"\nColumn names: {list(df.columns)}")
print(f"\nData types:")
print(df.dtypes)
print(f"\nFirst few rows:")
df.head(10)


DataFrame shape: (100000, 8)

Column names: ['Date', 'Mission_ID', 'Destination', 'Status', 'Crew_Size', 'Duration', 'Success_Rate', 'Security_Code']

Data types:
Date              object
Mission_ID        object
Destination       object
Status            object
Crew_Size          int64
Duration           int64
Success_Rate     float64
Security_Code     object
dtype: object

First few rows:


| | Date | Mission_ID | Destination | Status | Crew_Size | Duration | Success_Rate | Security_Code |
|---|---|---|---|---|---|---|---|---|
| 0 | 2048-09-07 | WXI-0590 | Io | In Progress | 6 | 380 | 69.57 | ALP-780-GBT |
| 1 | 2041-09-13 | EYO-5723 | Moon | Partial Success | 6 | 48 | 48.53 | HRV-950-OIS |
| 2 | 2048-07-27 | HQI-6628 | Mercury | Failed | 5 | 81 | 18.46 | ZKP-703-FKM |
| 3 | 2054-09-23 | RZY-2558 | Jupiter | In Progress | 8 | 265 | 22.96 | WDS-964-RZI |
| 4 | 2064-08-02 | GDD-2143 | Callisto | Completed | 4 | 312 | 23.65 | EFD-081-AQZ |
| 5 | 2049-04-18 | AWZ-3469 | Mercury | Failed | 3 | 1075 | 59.53 | JFR-200-AVJ |
| 6 | 2070-09-09 | EVD-6446 | Callisto | Failed | 8 | 346 | 82.35 | ZTE-415-IJL |
| 7 | 2042-11-22 | LLG-5460 | Mars | Completed | 2 | 476 | 14.32 | AAJ-640-DZG |
| 8 | 2034-05-03 | JID-3218 | Pluto | In Progress | 8 | 408 | 91.85 | WTB-100-RIY |
| 9 | 2036-07-10 | EVR-2754 | Pluto | Failed | 1 | 124 | 71.63 | LOS-177-HTZ |


## Step 3: Filter for Mars Missions with Completed Status


In [4]:
# Filter for Mars missions with Completed status
mars_completed = df[
    (df['Destination'].str.strip().str.lower() == 'mars') & 
    (df['Status'].str.strip().str.lower() == 'completed')
].copy()

print(f"Total Mars missions with Completed status: {len(mars_completed)}")
print(f"\nSample of filtered missions:")
mars_completed.head(10)


Total Mars missions with Completed status: 975

Sample of filtered missions:


| | Date | Mission_ID | Destination | Status | Crew_Size | Duration | Success_Rate | Security_Code |
|---|---|---|---|---|---|---|---|---|
| 7 | 2042-11-22 | LLG-5460 | Mars | Completed | 2 | 476 | 14.32 | AAJ-640-DZG |
| 62 | 2064-05-04 | XDT-2062 | Mars | Completed | 4 | 273 | 45.28 | OOQ-852-YGM |
| 214 | 2043-10-10 | TUL-0204 | Mars | Completed | 0 | 334 | 56.42 | LPJ-569-CDN |
| 257 | 2039-01-07 | RBJ-3598 | Mars | Completed | 4 | 439 | 69.34 | ZCX-076-TTL |
| 395 | 2048-07-28 | DAE-6980 | Mars | Completed | 0 | 19 | 96.95 | CFP-608-VPG |
| 413 | 2032-06-21 | KKZ-1882 | Mars | Completed | 4 | 457 | 27.25 | WWK-637-IIA |
| 529 | 2042-12-29 | IGU-4709 | Mars | Completed | 7 | 382 | 83.94 | TKY-005-NKL |
| 550 | 2068-02-18 | BWR-1331 | Mars | Completed | 8 | 285 | 73.75 | AVT-921-ATR |
| 794 | 2066-06-04 | NYJ-3398 | Mars | Completed | 24 | 824 | 82.97 | PQA-505-ZIK |
| 949 | 2031-04-14 | KQR-4973 | Mars | Completed | 7 | 368 | 18.61 | GPX-474-CZB |


## Step 4: Find the Mission with Maximum Duration


In [5]:
# Check for any missing Duration values
print(f"Missions with missing Duration: {mars_completed['Duration'].isna().sum()}")

# Remove any rows with missing Duration (if any)
mars_completed_clean = mars_completed.dropna(subset=['Duration'])

print(f"\nMissions after removing missing Duration: {len(mars_completed_clean)}")

# Find the mission with maximum duration
max_duration = mars_completed_clean['Duration'].max()
longest_mission = mars_completed_clean[mars_completed_clean['Duration'] == max_duration]

print(f"\nMaximum Duration: {max_duration} days")
print(f"\nNumber of missions with this duration: {len(longest_mission)}")
print(f"\nLongest Mars Mission Details:")
longest_mission


Missions with missing Duration: 0

Missions after removing missing Duration: 975

Maximum Duration: 1629 days

Number of missions with this duration: 1

Longest Mars Mission Details:


| | Date | Mission_ID | Destination | Status | Crew_Size | Duration | Success_Rate | Security_Code |
|---|---|---|---|---|---|---|---|---|
| 5199 | 2065-06-05 | WGU-0200 | Mars | Completed | 4 | 1629 | 98.82 | XRT-421-ZQP |
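
When only a single row is needed, `Series.idxmax` is a more compact way to select it. The sketch below uses a small illustrative subset of the filtered rows shown earlier (not the full data set), and the names `mars_sample` and `longest` are hypothetical. Note that `idxmax` returns only the first maximum, so the boolean-mask comparison used in the cell above remains the safer choice when ties need to be detected.

```python
import pandas as pd

# Illustrative subset of the filtered Mars rows; the index labels mirror
# the original row positions.
mars_sample = pd.DataFrame(
    {
        "Mission_ID": ["LLG-5460", "XDT-2062", "NYJ-3398"],
        "Duration": [476, 273, 824],
        "Security_Code": ["AAJ-640-DZG", "OOQ-852-YGM", "PQA-505-ZIK"],
    },
    index=[7, 62, 794],
)

# idxmax gives the index label of the first maximum; .loc fetches that row.
longest = mars_sample.loc[mars_sample["Duration"].idxmax()]
print(longest["Mission_ID"], longest["Security_Code"])
```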


## Step 5: Extract the Security Code


In [6]:
# Extract the security code
if len(longest_mission) > 0:
    security_code = longest_mission.iloc[0]['Security_Code'].strip()
    mission_id = longest_mission.iloc[0]['Mission_ID']
    date = longest_mission.iloc[0]['Date']
    duration = longest_mission.iloc[0]['Duration']
    
    print("=" * 60)
    print("RESULT: Longest Successful Mars Mission")
    print("=" * 60)
    print(f"Date: {date}")
    print(f"Mission ID: {mission_id}")
    print(f"Duration: {duration} days")
    print(f"Security Code: {security_code}")
    print("=" * 60)
    print(f"\n🎯 ANSWER: {security_code}")
else:
    print("No matching mission found!")


RESULT: Longest Successful Mars Mission
Date: 2065-06-05
Mission ID: WGU-0200
Duration: 1629 days
Security Code: XRT-421-ZQP

🎯 ANSWER: XRT-421-ZQP


## Additional Analysis: Summary Statistics


In [7]:
# Additional statistics for Mars completed missions
if len(mars_completed_clean) > 0:
    print("Summary Statistics for Completed Mars Missions:")
    print("-" * 60)
    print(f"Total missions: {len(mars_completed_clean)}")
    print(f"Average duration: {mars_completed_clean['Duration'].mean():.2f} days")
    print(f"Minimum duration: {mars_completed_clean['Duration'].min()} days")
    print(f"Maximum duration: {mars_completed_clean['Duration'].max()} days")
    print(f"Median duration: {mars_completed_clean['Duration'].median():.2f} days")
    
    print("\nTop 5 Longest Completed Mars Missions:")
    print("-" * 60)
    top_5 = mars_completed_clean.nlargest(5, 'Duration')[['Date', 'Mission_ID', 'Duration', 'Security_Code']]
    print(top_5.to_string(index=False))


Summary Statistics for Completed Mars Missions:
------------------------------------------------------------
Total missions: 975
Average duration: 272.66 days
Minimum duration: 10 days
Maximum duration: 1629 days
Median duration: 258.00 days

Top 5 Longest Completed Mars Missions:
------------------------------------------------------------
      Date Mission_ID  Duration Security_Code
2065-06-05   WGU-0200      1629   XRT-421-ZQP
2047-07-31   AJV-3533      1482   WCN-103-DVD
2041-01-14   LBS-1848      1479   ZCA-027-KCP
2045-12-13   LTZ-4413      1422   DHA-730-NYP
2043-01-15   PGQ-7628      1417   NQT-363-IFR
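
Most of the figures printed above can also be obtained in one call with `Series.describe`. The sketch below is only an illustration: it feeds in the five durations from the top-5 table rather than the full data set, and `durations` and `stats` are hypothetical names. On the real data the call would be `mars_completed_clean['Duration'].describe()`.

```python
import pandas as pd

# Durations of the five longest missions listed above, used purely as
# illustrative input (not the full set of 975 missions).
durations = pd.Series([1629, 1482, 1479, 1422, 1417], name="Duration")

# describe() returns count, mean, std, min, quartiles, and max in one Series;
# the "50%" entry is the median.
stats = durations.describe()
print(stats[["count", "mean", "min", "50%", "max"]])
```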
