# Celestial Conspiracies: UFOs, SpaceX, and Military Bases 🚀👽🏰

In this notebook we will:
- **Load** and examine our datasets (the UFO and launch files come from the processed folder, the military bases file from the raw folder):
  - `ufo_processed.csv`
  - `spacedevs_launches_processed.json`
  - `military_bases.csv`
- **Flag** UFO sightings:
  - **is_near_spacex_launch:** UFO sighting occurred within 24 hours after a launch and within 50 km of the launch location.
  - **is_near_military:** UFO sighting occurred within 50 km of a military base.
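
The two flags above combine a time window with a great-circle distance threshold. As a rough sketch of that logic (not necessarily how the later cells implement it), the snippet below uses the haversine formula with a mean Earth radius of 6371 km. The function names `haversine_km` and `is_near_launch`, their parameters, and the coordinates in the usage lines are assumptions made for illustration only.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant for this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_near_launch(sighting_time, sighting_lat, sighting_lon,
                   launch_time, launch_lat, launch_lon,
                   max_hours=24, max_km=50):
    """True if the sighting falls within max_hours AFTER the launch
    and within max_km of the launch location."""
    delta = sighting_time - launch_time
    in_window = timedelta(0) <= delta <= timedelta(hours=max_hours)
    in_range = haversine_km(sighting_lat, sighting_lon,
                            launch_lat, launch_lon) <= max_km
    return in_window and in_range


# Hypothetical launch and sighting, for illustration only.
launch_time = datetime(2020, 5, 30, 19, 22)
sighting_time = launch_time + timedelta(hours=3)
print(is_near_launch(sighting_time, 28.6, -80.6, launch_time, 28.5, -80.6))
```

The `is_near_military` flag would reuse `haversine_km` alone, since it has no time component. For large datasets a vectorized or spatially indexed approach would likely be preferable to calling this function row by row.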

## Data Loading 📥

In this cell, we load:

- **Launches Data:** Loaded from a JSON file.
- **UFO Data:** Loaded from a CSV file.
- **U.S. Military Bases Data:** Loaded from a CSV file.

In [32]:
import os
import pandas as pd
import json

# ============================================================
# SETUP: Define data directories relative to this script
# ============================================================
# Get the absolute path of the directory where this script is located
# In a notebook, __file__ is not defined so we use os.getcwd() as a fallback.
try:
    BASE_DIR = os.path.dirname(os.path.abspath(__file__))
except NameError:
    BASE_DIR = os.getcwd()


# Define the folder where raw data is stored (assumed to be "../data/raw")
RAW_DIR = os.path.join(BASE_DIR, "..", "data", "raw")

# Define the folder where processed data is stored (assumed to be "../data/processed")
PROCESSED_DIR = os.path.join(BASE_DIR, "..", "data", "processed")

# Load our datasets
ufo_path = os.path.join(PROCESSED_DIR, "ufo_processed.csv")
launches_path = os.path.join(PROCESSED_DIR, "spacedevs_launches_processed.json")
military_path = os.path.join(RAW_DIR, "military_bases.csv")

# Load UFO data
ufo_df = pd.read_csv(ufo_path)
# Convert the UFO datetime string into a datetime object for easier comparison
ufo_df['datetime'] = pd.to_datetime(ufo_df['datetime'], format="%m/%d/%Y %H:%M")

# Load SpaceX launches data
with open(launches_path, "r") as f:
    launches = json.load(f)
launches_df = pd.DataFrame(launches)
# Convert the launch "net" (no-earlier-than time) string into a datetime object.
launches_df['net'] = pd.to_datetime(launches_df['net'], format="%m/%d/%Y %H:%M")

# Load military bases data
# Note: The military bases CSV is delimited by semicolons.
military_df = pd.read_csv(military_path, delimiter=";")

print("✅ All datasets loaded!")

✅ All datasets loaded!
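
The load cell parses dates with a strict `format`, so a single malformed value would raise an error. As an optional check, the sketch below shows how `errors="coerce"` in `pd.to_datetime` turns unparseable strings into `NaT` so they can be counted and inspected. The sample rows are made up for illustration and do not come from `ufo_processed.csv`.

```python
import pandas as pd

# Hypothetical sample rows; the real values live in ufo_processed.csv.
sample = pd.DataFrame(
    {"datetime": ["10/10/1949 20:30", "not a date", "01/02/2014 03:15"]}
)

# Same format string as the load cell, but invalid values become NaT
# instead of raising an exception.
parsed = pd.to_datetime(
    sample["datetime"], format="%m/%d/%Y %H:%M", errors="coerce"
)

# Rows whose datetime could not be parsed.
bad_rows = sample[parsed.isna()]
print(f"{len(bad_rows)} of {len(sample)} rows failed to parse")
```

If the count is zero on the real data, the strict parse in the load cell is safe to keep as is.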


## Filter Data By Time and Location ⌚📌