# Interactive Visual Analytics with Folium

## SpaceX Launch Sites Analysis

This notebook performs interactive visual analytics on SpaceX launch data using Folium, a powerful Python library for creating interactive maps. 

### Objectives:
- Visualize SpaceX launch sites on an interactive map
- Analyze the geographic distribution of launch sites
- Identify patterns in successful vs. failed launches
- Calculate distances between launch sites and key proximities (coastline, cities, etc.)
- Understand the geographic factors that may influence launch success

## 1. Import Required Libraries

Import essential libraries for data manipulation and map visualization.

In [31]:
# Import Folium for interactive map visualization
import folium
# Import Pandas for data manipulation and analysis
import pandas as pd

In [32]:
# Import Folium plugins for enhanced map features
# MarkerCluster: Groups nearby markers for better visualization
from folium.plugins import MarkerCluster
# MousePosition: Displays cursor coordinates on the map
from folium.plugins import MousePosition
# DivIcon: Creates custom HTML-based icons for markers
from folium.features import DivIcon

## 2. Load and Prepare SpaceX Launch Data

Load the cleaned SpaceX launch data and prepare it for analysis by:
- Removing rows with missing class values
- Converting data types for consistency
- Extracting year information from dates

In [33]:
# Load SpaceX launch data from cleaned CSV file
spacex_df = pd.read_csv("spacex_launch_data_clean.csv")

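# Data cleaning: Remove rows where Class (success/failure) is not available
# .copy() makes the filtered frame independent, so the column assignments below
# do not trigger pandas' SettingWithCopyWarning
spacex_df = spacex_df[spacex_df['Class'].notna()].copy()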

# Convert data types for consistency and proper analysis
spacex_df['Class'] = spacex_df['Class'].astype(int)  # 0 = Failed, 1 = Successful
spacex_df['LaunchSite'] = spacex_df['LaunchSite'].astype(str)
spacex_df['Latitude'] = spacex_df['Latitude'].astype(float)
spacex_df['Longitude'] = spacex_df['Longitude'].astype(float)
spacex_df['BoosterVersion'] = spacex_df['BoosterVersion'].astype(str)
spacex_df['PayloadMass'] = spacex_df['PayloadMass'].astype(float)
spacex_df['Orbit'] = spacex_df['Orbit'].astype(str)

# Extract year from Date column for temporal analysis
spacex_df['Date'] = pd.to_datetime(spacex_df['Date'])
spacex_df['Year'] = spacex_df['Date'].dt.year

# Display basic information about the dataset
print(f"Total launches: {len(spacex_df)}")
print(f"Launch sites: {spacex_df['LaunchSite'].nunique()}")
print(f"Date range: {spacex_df['Year'].min()} - {spacex_df['Year'].max()}")

Total launches: 90
Launch sites: 3
Date range: 2010 - 2020


## 3. Create Launch Sites Summary

Extract unique launch site information by selecting relevant columns and grouping by launch site.

In [34]:
# Select relevant columns for geographic analysis
# .copy() keeps later column additions (e.g. marker_color) from modifying a slice
spacex_df = spacex_df[['LaunchSite', 'Latitude', 'Longitude', 'Class']].copy()

# Create a summary dataframe with unique launch sites and their coordinates
# Group by LaunchSite and take the first occurrence to get unique site coordinates
launch_sites_df = spacex_df.groupby(['LaunchSite'], as_index=False).first()
launch_sites_df = launch_sites_df[['LaunchSite', 'Latitude', 'Longitude']]

# Display the launch sites
print(f"Number of unique launch sites: {len(launch_sites_df)}")
launch_sites_df

Number of unique launch sites: 3


|   | LaunchSite | Latitude | Longitude |
|---|---|---|---|
| 0 | CCSFS SLC 40 | 28.561857 | -80.577366 |
| 1 | KSC LC 39A | 28.608058 | -80.603956 |
| 2 | VAFB SLC 4E | 34.632093 | -120.610829 |
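Because `groupby(...).first()` keeps whichever row comes first for each site, it would silently hide a site that was recorded with more than one coordinate pair. The sketch below shows one possible way to check that assumption with `drop_duplicates`. The small `sample_df` frame is hypothetical illustration data (coordinates copied from the table above), not the notebook's full dataset.

```python
import pandas as pd

# Hypothetical sample rows standing in for spacex_df
sample_df = pd.DataFrame({
    'LaunchSite': ['CCSFS SLC 40', 'CCSFS SLC 40', 'KSC LC 39A', 'VAFB SLC 4E'],
    'Latitude': [28.561857, 28.561857, 28.608058, 34.632093],
    'Longitude': [-80.577366, -80.577366, -80.603956, -120.610829],
    'Class': [0, 1, 1, 0],
})

# One row per distinct (site, latitude, longitude) combination
unique_sites = (
    sample_df[['LaunchSite', 'Latitude', 'Longitude']]
    .drop_duplicates()
    .reset_index(drop=True)
)

# A site recorded with two different coordinate pairs would appear twice here
sites_are_consistent = unique_sites['LaunchSite'].is_unique

print(unique_sites)
print(f"Each site has a single coordinate pair: {sites_are_consistent}")
```

If the check fails, the per-site coordinates would need to be reconciled before plotting, since each site is drawn at a single location.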


## 4. Basic Map Creation

### Task 1: Create a basic Folium map centered at NASA Johnson Space Center

NASA Johnson Space Center is located in Houston, Texas, and is home to NASA's Mission Control Center for human spaceflight. It is not a launch site; it simply serves as the initial map center.

In [35]:
# Define NASA Johnson Space Center coordinates [Latitude, Longitude]
# Location: Houston, Texas
nasa_coordinate = [29.559684888503615, -95.0830971930759]

# Create a Folium map centered at NASA JSC with appropriate zoom level
site_map = folium.Map(location=nasa_coordinate, zoom_start=10)

site_map

### Task 2: Add a marker and circle to NASA Johnson Space Center

Add visual elements to highlight the NASA JSC location with both a circle and a labeled marker.

In [36]:
# Create a circle around NASA Johnson Space Center
# - radius: 1000 meters (1 km)
# - color: Orange (#d35400) to make it stand out
# - fill: True to fill the circle area
circle = folium.Circle(
    nasa_coordinate, 
    radius=1000, 
    color='#d35400', 
    fill=True
).add_child(folium.Popup('NASA Johnson Space Center'))

# Create a custom text label marker for NASA JSC
# Using DivIcon to create an HTML-based label instead of a standard icon
marker = folium.map.Marker(
    nasa_coordinate,
    icon=DivIcon(
        icon_size=(20, 20),
        icon_anchor=(0, 0),
        html='<div style="font-size: 12pt; color:#d35400;"><b>%s</b></div>' % 'NASA JSC',
    )
)

# Add both circle and marker to the map
site_map.add_child(circle)
site_map.add_child(marker)

site_map

## 5. Visualize All SpaceX Launch Sites

### Task 3: Mark all launch sites on the map

Create a comprehensive view of all SpaceX launch sites with circles and labels.

In [37]:
# Initialize a new map with a wider zoom to show all launch sites
# Using NASA JSC as the center point with zoom_start=5 for broader view
site_map = folium.Map(location=nasa_coordinate, zoom_start=5)

# Iterate through each launch site and add visual markers
for index, row in launch_sites_df.iterrows():
    # Create a circle for each launch site
    circle = folium.Circle(
        [row['Latitude'], row['Longitude']], 
        radius=1000,  # 1 km radius
        color='#d35400',  # Orange color for visibility
        fill=True
    ).add_child(folium.Popup(row['LaunchSite']))
    
    # Create a text label marker for the launch site name
    marker = folium.map.Marker(
        [row['Latitude'], row['Longitude']],
        icon=DivIcon(
            icon_size=(20, 20),
            icon_anchor=(0, 0),
            html='<div style="font-size: 12pt; color:#d35400;"><b>%s</b></div>' % row['LaunchSite'],
        )
    )
    
    # Add circle and marker to the map
    site_map.add_child(circle)
    site_map.add_child(marker)

site_map

## 6. Visualize Launch Success/Failure

### Task 4: Mark successful and failed launches with different colors

Use MarkerCluster to group launches and color-code them by first-stage landing outcome:
- **Green markers**: Successful landings (Class = 1)
- **Red markers**: Failed landings (Class = 0)

In [38]:
# Create a MarkerCluster to group nearby launch markers
# This improves visualization when multiple launches occur at the same site
marker_cluster = MarkerCluster().add_to(site_map)

# Add a marker for each launch, color-coded by success/failure
for index, row in spacex_df.iterrows():
    # Determine marker color based on launch outcome
    # Class = 1: Successful landing (green)
    # Class = 0: Failed landing (red)
    if row['Class'] == 1:
        color = 'green'
    else:
        color = 'red'
    
    # Create and add marker to the cluster
    folium.Marker(
        location=[row['Latitude'], row['Longitude']],
        icon=folium.Icon(color=color),
    ).add_to(marker_cluster)

site_map

### Data Exploration

Let's examine the recent launch data to understand the Class distribution.

In [39]:
# Display the last 10 launches to check the data
spacex_df.tail(10)

|   | LaunchSite | Latitude | Longitude | Class |
|---|---|---|---|---|
| 80 | CCSFS SLC 40 | 28.561857 | -80.577366 | 1 |
| 81 | CCSFS SLC 40 | 28.561857 | -80.577366 | 1 |
| 82 | CCSFS SLC 40 | 28.561857 | -80.577366 | 1 |
| 83 | CCSFS SLC 40 | 28.561857 | -80.577366 | 1 |
| 84 | CCSFS SLC 40 | 28.561857 | -80.577366 | 1 |
| 85 | KSC LC 39A | 28.608058 | -80.603956 | 1 |
| 86 | KSC LC 39A | 28.608058 | -80.603956 | 1 |
| 87 | KSC LC 39A | 28.608058 | -80.603956 | 1 |
| 88 | CCSFS SLC 40 | 28.561857 | -80.577366 | 1 |
| 89 | CCSFS SLC 40 | 28.561857 | -80.577366 | 1 |
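The last ten rows above are all successes, so on their own they say little about the overall split. A minimal sketch of counting the outcomes directly with `value_counts` follows. The `classes` series is hypothetical sample data, not the notebook's `Class` column.

```python
import pandas as pd

# Hypothetical outcome column: 1 = successful landing, 0 = failed landing
classes = pd.Series([0, 0, 1, 0, 1, 1, 1, 1, 1, 1], name='Class')

# Absolute number of launches per outcome
counts = classes.value_counts()

# Share of launches per outcome (fractions that sum to 1)
shares = classes.value_counts(normalize=True)

print(counts)
print(shares)
```

Applied to the real data, the same two calls on `spacex_df['Class']` would summarize every launch rather than only the most recent ones.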


### Alternative Approach: Using Color Column

Create a dedicated color column for cleaner code when adding markers.

In [40]:
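# Create a second MarkerCluster for the color-column approach
# Note: the cluster added in the previous step is still attached to site_map,
# so every launch appears twice until the map is rebuilt
marker_cluster = MarkerCluster()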

# Create a new column 'marker_color' in spacex_df to store marker colors
# This makes the code cleaner by pre-computing colors
# Green = Successful landing, Red = Failed landing
spacex_df['marker_color'] = spacex_df['Class'].apply(lambda x: 'green' if x == 1 else 'red') 

# Add markers using the pre-computed color column
for index, row in spacex_df.iterrows():
    folium.Marker(
        location=[row['Latitude'], row['Longitude']],
        icon=folium.Icon(color=row['marker_color']),
    ).add_to(marker_cluster)

# Add the marker cluster to the map
site_map.add_child(marker_cluster)

site_map
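As a possible alternative to `apply` with a lambda, the color lookup can also be expressed as a dictionary passed to `Series.map`. This is only a sketch; the `classes` series and the `color_by_class` name are hypothetical stand-ins for illustration.

```python
import pandas as pd

# Hypothetical outcome column standing in for spacex_df['Class']
classes = pd.Series([1, 0, 1, 1, 0], name='Class')

# Dictionary lookup: 1 -> green (successful landing), 0 -> red (failed landing)
color_by_class = {1: 'green', 0: 'red'}
marker_colors = classes.map(color_by_class)

print(marker_colors.tolist())
```

One behavioral difference is worth keeping in mind: values missing from the dictionary become `NaN` with `map`, whereas the lambda version falls back to `'red'` for anything other than 1.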

### Custom Marker Implementation

This section rebuilds the map from scratch, redraws the launch site circles and labels, and adds each launch as a white marker whose inner icon is colored by landing outcome.

In [41]:
# Create a new fresh map for custom marker demonstration
site_map = folium.Map(location=nasa_coordinate, zoom_start=5)

# Add launch site circles and labels
for index, row in launch_sites_df.iterrows():
    circle = folium.Circle(
        [row['Latitude'], row['Longitude']], 
        radius=1000, 
        color='#d35400', 
        fill=True
    ).add_child(folium.Popup(row['LaunchSite']))
    
    marker = folium.map.Marker(
        [row['Latitude'], row['Longitude']],
        icon=DivIcon(
            icon_size=(20, 20),
            icon_anchor=(0, 0),
            html='<div style="font-size: 12pt; color:#d35400;"><b>%s</b></div>' % row['LaunchSite'],
        )
    )
    site_map.add_child(circle)
    site_map.add_child(marker)

# Add marker cluster for launches
marker_cluster = MarkerCluster()

# Create markers for each launch with custom styling
for index, record in spacex_df.iterrows():
    # White marker body; the inner icon glyph is colored by landing outcome
    marker = folium.Marker(
        location=[record['Latitude'], record['Longitude']],
        icon=folium.Icon(color='white', icon_color=record['marker_color'])
    )
    marker_cluster.add_child(marker)

# Add the completed marker cluster to the map
site_map.add_child(marker_cluster)

site_map

## 7. Proximity Analysis

### Task 5: Calculate Distances to Geographic Features

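Analyze the proximity of launch sites to important geographic features like coastlines, which can impact launch operations and safety. Distances are measured to the nearest of a small set of hand-picked coastline reference points, so they approximate the true distance to the shore rather than measure it exactly.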

In [42]:
# Create coastline reference data for proximity calculations
# These coordinates represent approximate coastline points near the launch sites

# CCSFS SLC 40 is in Cape Canaveral, Florida (Atlantic coast)
# VAFB SLC 4E is in Vandenberg, California (Pacific coast)
coastline_df = pd.DataFrame({
    'Name': ['Cape Canaveral Coast', 'Vandenberg Coast', 'Florida East Coast', 'California Central Coast'],
    'Latitude': [28.56342, 34.6324, 28.6, 34.7],
    'Longitude': [-80.567, -120.6, -80.5, -120.5]
})

print("Coastline reference points:")
coastline_df

Coastline reference points:


|   | Name | Latitude | Longitude |
|---|---|---|---|
| 0 | Cape Canaveral Coast | 28.56342 | -80.567 |
| 1 | Vandenberg Coast | 34.6324 | -120.6 |
| 2 | Florida East Coast | 28.6 | -80.5 |
| 3 | California Central Coast | 34.7 | -120.5 |


### Distance Calculation Function

Implement the Haversine formula to calculate the great-circle distance between two points on Earth's surface.
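For reference, the formula implemented in the following cell can be written as shown here, where $\varphi$ denotes latitude and $\lambda$ longitude (both in radians), $R$ is the Earth radius, and $d$ is the resulting distance. The Greek symbols are chosen here for illustration; the code uses `lat`, `lon`, `a`, `c`, and `R`.

```latex
\begin{aligned}
a &= \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
   + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right) \\
c &= 2\,\operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1 - a}\right) \\
d &= R\,c
\end{aligned}
```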

In [43]:
# Import mathematical functions for distance calculation
from math import sin, cos, sqrt, atan2, radians 

def calculate_distance(lat1, lon1, lat2, lon2):
    """
    Calculate the great-circle distance between two points on Earth using the Haversine formula.
    
    Parameters:
    -----------
    lat1, lon1 : float
        Latitude and longitude of point 1 in degrees
    lat2, lon2 : float
        Latitude and longitude of point 2 in degrees
    
    Returns:
    --------
    float
        Distance in kilometers
    """
    # Approximate radius of Earth in kilometers
    R = 6373.0

    # Convert coordinates from degrees to radians
    lat1 = radians(lat1)
    lon1 = radians(lon1)
    lat2 = radians(lat2)
    lon2 = radians(lon2)

    # Calculate differences in coordinates
    dlon = lon2 - lon1
    dlat = lat2 - lat1

    # Haversine formula
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))

    # Calculate distance
    distance = R * c

    return distance

# Calculate and store the distance to the closest coastline for each launch site
launch_sites_df['DistanceToCoastline'] = 0.0    

for index, launch_site in launch_sites_df.iterrows():
    # Initialize variables to track the closest coastline point
    closest_coastline = None
    min_distance = float('inf')
    
    # Find the closest coastline point by checking all coastline coordinates
    for _, coastline in coastline_df.iterrows():
        distance = calculate_distance(
            launch_site['Latitude'], 
            launch_site['Longitude'], 
            coastline['Latitude'], 
            coastline['Longitude']
        )
        
        if distance < min_distance:
            min_distance = distance
            closest_coastline = coastline
    
    # Store the minimum distance to the closest coastline
    launch_sites_df.at[index, 'DistanceToCoastline'] = min_distance

# Display results
print("Launch Sites with Distance to Coastline:")
launch_sites_df

Launch Sites with Distance to Coastline:


|   | LaunchSite | Latitude | Longitude | DistanceToCoastline |
|---|---|---|---|---|
| 0 | CCSFS SLC 40 | 28.561857 | -80.577366 | 1.027494 |
| 1 | KSC LC 39A | 28.608058 | -80.603956 | 6.138497 |
| 2 | VAFB SLC 4E | 34.632093 | -120.610829 | 0.991677 |
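A few quick sanity checks can help build confidence in a distance function like the one above. The sketch below re-implements the same formula under the hypothetical name `haversine_km` so that it runs on its own, using the same approximate Earth radius (6373.0 km) as the notebook. The three checks are illustrative choices, not an exhaustive test.

```python
from math import sin, cos, sqrt, atan2, radians, pi

EARTH_RADIUS_KM = 6373.0  # same approximate radius the notebook uses


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return EARTH_RADIUS_KM * 2 * atan2(sqrt(a), sqrt(1 - a))


# Check 1: identical points are zero kilometers apart
same_point = haversine_km(28.561857, -80.577366, 28.561857, -80.577366)

# Check 2: one degree of latitude along a meridian equals R * pi / 180
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)
expected_one_degree = EARTH_RADIUS_KM * pi / 180

# Check 3: CCSFS SLC 40 to the 'Cape Canaveral Coast' reference point,
# which should agree with the DistanceToCoastline value in the table above
site_to_coast = haversine_km(28.561857, -80.577366, 28.56342, -80.567)

print(f"{same_point:.3f} km, {one_degree:.3f} km, {site_to_coast:.3f} km")
```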


### Visualize Closest Coastline Points

Mark the closest coastline point for each launch site and display the distance.

In [44]:
# For each launch site, find and mark the closest coastline point
for index, launch_site in launch_sites_df.iterrows():
    # Find the closest coastline point
    closest_coastline = None
    min_distance = float('inf')
    
    for _, coastline in coastline_df.iterrows():
        distance = calculate_distance(
            launch_site['Latitude'], 
            launch_site['Longitude'], 
            coastline['Latitude'], 
            coastline['Longitude']
        )
        
        if distance < min_distance:
            min_distance = distance
            closest_coastline = coastline
    
    # Create a marker at the closest coastline point showing the distance
    marker = folium.Marker(
        location=[closest_coastline['Latitude'], closest_coastline['Longitude']],
        icon=DivIcon(
            icon_size=(200, 36),
            icon_anchor=(0, 0),
            html='<div style="font-size: 10pt; color:#d35400;"><b>Distance: %.2f km</b></div>' % min_distance,
        )
    )
    site_map.add_child(marker)

site_map

### Draw Lines to Coastline

Draw lines connecting each launch site to its closest coastline point for better visualization.

In [45]:
# Draw lines connecting launch sites to their closest coastline points
for index, launch_site in launch_sites_df.iterrows():
    # Find the closest coastline point
    closest_coastline = None
    min_distance = float('inf')
    
    for _, coastline in coastline_df.iterrows():
        distance = calculate_distance(
            launch_site['Latitude'], 
            launch_site['Longitude'], 
            coastline['Latitude'], 
            coastline['Longitude']
        )
        
        if distance < min_distance:
            min_distance = distance
            closest_coastline = coastline
    
    # Create a line (PolyLine) connecting the launch site to the coastline
    coordinates = [
        [launch_site['Latitude'], launch_site['Longitude']], 
        [closest_coastline['Latitude'], closest_coastline['Longitude']]
    ]
    lines = folium.PolyLine(locations=coordinates, weight=2, color='blue', opacity=0.6)
    site_map.add_child(lines)

site_map
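The closest-point search is repeated in each of the three cells above. One way to avoid the duplication is to move it into a small helper, sketched below under the hypothetical names `haversine_km` and `find_closest_point`. The helper works on plain dictionaries so that it runs without pandas; DataFrame rows can be converted to that shape with `DataFrame.to_dict('records')`.

```python
from math import sin, cos, sqrt, atan2, radians


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6373.0):
    """Great-circle distance in kilometers between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return radius_km * 2 * atan2(sqrt(a), sqrt(1 - a))


def find_closest_point(site, candidates):
    """Return (closest_candidate, distance_km) for a site among candidate points.

    Both `site` and each candidate are mappings with 'Latitude' and 'Longitude' keys.
    """
    scored = (
        (haversine_km(site['Latitude'], site['Longitude'],
                      point['Latitude'], point['Longitude']), point)
        for point in candidates
    )
    # Pick the (distance, point) pair with the smallest distance
    distance_km, closest = min(scored, key=lambda pair: pair[0])
    return closest, distance_km


# Coastline reference points and one launch site copied from the tables above
coastline_points = [
    {'Name': 'Cape Canaveral Coast', 'Latitude': 28.56342, 'Longitude': -80.567},
    {'Name': 'Vandenberg Coast', 'Latitude': 34.6324, 'Longitude': -120.6},
    {'Name': 'Florida East Coast', 'Latitude': 28.6, 'Longitude': -80.5},
    {'Name': 'California Central Coast', 'Latitude': 34.7, 'Longitude': -120.5},
]
ksc = {'LaunchSite': 'KSC LC 39A', 'Latitude': 28.608058, 'Longitude': -80.603956}

closest, distance_km = find_closest_point(ksc, coastline_points)
print(f"{ksc['LaunchSite']} -> {closest['Name']}: {distance_km:.2f} km")
```

With a helper like this, each cell would only need one call per launch site, followed by the marker or line it wants to draw.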

## 8. Interactive Tools

### Task 6: Add MousePosition Plugin

Add a MousePosition tool to help identify coordinates of any point on the map. This is useful for finding coordinates of cities, highways, railways, etc.

In [46]:
# Add a MousePosition plugin to display coordinates
# This allows you to hover over any point on the map and see its coordinates
# Useful for identifying locations of cities, railways, highways, etc.
mouse_position = MousePosition(
    position='topright',         # Position on the map
    separator=' Long: ',         # Separator between lat and long
    empty_string='NaN',          # Display when no position
    lng_first=False,             # Show latitude first
    num_digits=20,               # Number of decimal places
    prefix='Lat:',               # Prefix for coordinates
)

site_map.add_child(mouse_position)

site_map

### Task 7: Mark and Measure Distance to Cities

Identify and mark the nearest cities to launch sites. This helps understand the relationship between launch sites and populated areas.

In [47]:
# Define nearby cities for each launch site
# You can use the MousePosition tool to find accurate coordinates
# Cape Canaveral: Nearby city is Cape Canaveral/Cocoa Beach
# Vandenberg: Nearby city is Lompoc, CA

cities_df = pd.DataFrame({
    'Name': ['Cape Canaveral', 'Lompoc'],
    'Latitude': [28.485833, 34.6391],
    'Longitude': [-80.544444, -120.4579]
})

# For each launch site, mark the nearest city and draw a line
for index, launch_site in launch_sites_df.iterrows():
    # Determine the closest city by computing the distance to every candidate
    city_distances = cities_df.apply(
        lambda city: calculate_distance(
            launch_site['Latitude'],
            launch_site['Longitude'],
            city['Latitude'],
            city['Longitude']
        ),
        axis=1
    )
    closest_city = cities_df.loc[city_distances.idxmin()]

    # Distance to the closest city
    city_distance = city_distances.min()
    
    # Create a marker at the city location
    marker = folium.Marker(
        location=[closest_city['Latitude'], closest_city['Longitude']],
        icon=DivIcon(
            icon_size=(150, 36),
            icon_anchor=(0, 0),
            html='<div style="font-size: 10pt; color:#e74c3c;"><b>%s (%.2f km)</b></div>' % (closest_city['Name'], city_distance),
        )
    )
    site_map.add_child(marker)
    
    # Draw a line between launch site and city
    coordinates = [
        [launch_site['Latitude'], launch_site['Longitude']], 
        [closest_city['Latitude'], closest_city['Longitude']]
    ]
    lines = folium.PolyLine(locations=coordinates, weight=2, color='red', opacity=0.5)
    site_map.add_child(lines)

site_map

## 9. Analysis and Conclusions

### Key Questions to Investigate:

Based on the visualizations above, analyze the following aspects:

- **Proximity to Railways**: Are launch sites in close proximity to railways for transporting rocket components?
- **Proximity to Highways**: Are launch sites accessible via major highways for logistics and personnel?
- **Proximity to Coastline**: Are launch sites strategically located near coastlines for safety (failed launches fall into water)?
- **Distance from Cities**: Do launch sites maintain a safe distance from populated areas?

### Expected Findings:

1. **Coastline Proximity**: Launch sites are typically very close to coastlines (within a few km) to ensure that failed launches or spent stages fall into the ocean rather than populated areas.

2. **City Distance**: Launch sites maintain a reasonable distance from major cities (typically 10-50 km) for safety while remaining accessible for personnel.

3. **Transportation Access**: Sites need good highway access for transporting large rocket components and supporting infrastructure.

### Success Rate Analysis:

From the color-coded markers:
- **Green markers** indicate successful first-stage landings (Class = 1)
- **Red markers** indicate failed landings (Class = 0)

You can use the clustering feature to zoom in and analyze success rates at specific launch sites.
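Per-site success rates can also be computed numerically rather than read off the map. The sketch below assumes a frame with `LaunchSite` and `Class` columns. To stay self-contained it rebuilds such a frame from the per-site counts reported in this notebook's summary output, and the `site_counts` and `launches` names are hypothetical.

```python
import pandas as pd

# (successes, failures) per site, derived from the totals in the summary output
site_counts = {
    'CCSFS SLC 40': (33, 22),
    'KSC LC 39A': (17, 5),
    'VAFB SLC 4E': (10, 3),
}

# Expand the counts into one row per launch
rows = []
for site, (successes, failures) in site_counts.items():
    rows += [{'LaunchSite': site, 'Class': 1}] * successes
    rows += [{'LaunchSite': site, 'Class': 0}] * failures
launches = pd.DataFrame(rows)

# Because Class is 0/1, its mean per site is the landing success rate
site_summary = (
    launches.groupby('LaunchSite')['Class']
    .agg(total='count', successes='sum', success_rate='mean')
)
site_summary['success_rate'] = (site_summary['success_rate'] * 100).round(2)

print(site_summary)
```

Run against `spacex_df` directly, the same `groupby` expression would produce the per-site figures without an explicit loop.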

## 10. Summary Statistics

Calculate and display summary statistics about the launch sites and their geographic characteristics.

In [48]:
# Calculate summary statistics for the launches
print("="*60)
print("SPACEX LAUNCH ANALYSIS SUMMARY")
print("="*60)

# Overall statistics
total_launches = len(spacex_df)
successful_launches = len(spacex_df[spacex_df['Class'] == 1])
failed_launches = len(spacex_df[spacex_df['Class'] == 0])
success_rate = (successful_launches / total_launches) * 100

print(f"\nOverall Statistics:")
print(f"  Total Launches: {total_launches}")
print(f"  Successful Landings: {successful_launches}")
print(f"  Failed Landings: {failed_launches}")
print(f"  Success Rate: {success_rate:.2f}%")

# Statistics by launch site
print(f"\nLaunch Site Statistics:")
print("-"*60)
for site in spacex_df['LaunchSite'].unique():
    site_data = spacex_df[spacex_df['LaunchSite'] == site]
    site_total = len(site_data)
    site_success = len(site_data[site_data['Class'] == 1])
    site_rate = (site_success / site_total) * 100 if site_total > 0 else 0
    
    print(f"\n{site}:")
    print(f"  Total Launches: {site_total}")
    print(f"  Successful: {site_success}")
    print(f"  Success Rate: {site_rate:.2f}%")

# Geographic information
print(f"\nGeographic Information:")
print("-"*60)
for index, site in launch_sites_df.iterrows():
    print(f"\n{site['LaunchSite']}:")
    print(f"  Latitude: {site['Latitude']:.4f}")
    print(f"  Longitude: {site['Longitude']:.4f}")
    if 'DistanceToCoastline' in site:
        print(f"  Distance to Coastline: {site['DistanceToCoastline']:.2f} km")

print("\n" + "="*60)

SPACEX LAUNCH ANALYSIS SUMMARY

Overall Statistics:
  Total Launches: 90
  Successful Landings: 60
  Failed Landings: 30
  Success Rate: 66.67%

Launch Site Statistics:
------------------------------------------------------------

CCSFS SLC 40:
  Total Launches: 55
  Successful: 33
  Success Rate: 60.00%

VAFB SLC 4E:
  Total Launches: 13
  Successful: 10
  Success Rate: 76.92%

KSC LC 39A:
  Total Launches: 22
  Successful: 17
  Success Rate: 77.27%

Geographic Information:
------------------------------------------------------------

CCSFS SLC 40:
  Latitude: 28.5619
  Longitude: -80.5774
  Distance to Coastline: 1.03 km

KSC LC 39A:
  Latitude: 28.6081
  Longitude: -80.6040
  Distance to Coastline: 6.14 km

VAFB SLC 4E:
  Latitude: 34.6321
  Longitude: -120.6108
  Distance to Coastline: 0.99 km

