# Exoplanet Database Integration and Visualization

This notebook demonstrates how to fetch, integrate, and visualize data from multiple exoplanet databases.

## Databases Integrated:
1. **NASA Exoplanet Archive** - Comprehensive database maintained by NASA
2. **EU Exoplanet Catalogue** - European database at exoplanet.eu
3. **Open Exoplanet Catalogue** - Community-maintained open database
4. **Exoplanet Orbit Database** - Specialized orbital parameters
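5. **TEPCat** - Transiting exoplanet parameters

Only the first two sources (NASA Exoplanet Archive and EU Exoplanet Catalogue) are fetched explicitly in the cells below.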

## Setup and Imports

In [None]:
# Import required modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
# Silence all warnings to keep notebook output readable (note: this can also hide real problems)
warnings.filterwarnings('ignore')

# Import our custom modules
from exoplanet_data_sources import ExoplanetDataCollector
from exoplanet_visualizations import ExoplanetVisualizer

# Setup plotting
%matplotlib inline
sns.set_style('darkgrid')
plt.rcParams['figure.figsize'] = (12, 8)

print("Setup complete!")

## Step 1: Collect Data from Multiple Sources

In [None]:
# Initialize the data collector
collector = ExoplanetDataCollector()

# Fetch from NASA Exoplanet Archive
nasa_data = collector.fetch_nasa_exoplanet_archive()
print(f"\nNASA data shape: {nasa_data.shape if nasa_data is not None else 'Failed'}")

In [None]:
# Fetch from EU Exoplanet Catalogue
eu_data = collector.fetch_eu_exoplanet_catalogue()
print(f"\nEU data shape: {eu_data.shape if eu_data is not None else 'Failed'}")
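The fetch methods above hide how the data is actually requested. As a rough illustration only, the sketch below builds a CSV query URL for the NASA Exoplanet Archive TAP service. The endpoint, the `pscomppars` table, and the column names are assumptions based on the archive's public interface, and `build_tap_csv_url` is a hypothetical helper; `ExoplanetDataCollector` may work quite differently.

```python
from urllib.parse import urlencode

# Assumed TAP endpoint; verify against the archive documentation before use.
TAP_SYNC_URL = "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"


def build_tap_csv_url(columns, table="pscomppars"):
    """Return a TAP sync URL requesting `columns` from `table` as CSV."""
    query = f"select {','.join(columns)} from {table}"
    return f"{TAP_SYNC_URL}?{urlencode({'query': query, 'format': 'csv'})}"


# Column names here are assumptions for illustration.
url = build_tap_csv_url(["pl_name", "hostname", "disc_year"])
print(url)
```

The resulting URL could be passed to `pd.read_csv(url)`, which requires network access and is therefore left out of the sketch.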

## Step 2: Create Unified Dataset

In [None]:
# Merge data into unified schema
combined_data = collector.create_unified_schema()
print(f"\nCombined dataset shape: {combined_data.shape}")
print(f"\nColumns: {list(combined_data.columns)}")

In [None]:
# Preview the data
combined_data.head(10)
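What "unified schema" might mean in practice: each catalogue uses its own column names and units, so they have to be renamed and converted before the rows can be stacked. The sketch below shows one possible approach on toy data. The source column names, the `to_unified` helper, and the exact-name deduplication are assumptions for illustration, not the behaviour of `create_unified_schema()`.

```python
import pandas as pd

# Hypothetical source-to-unified column maps (assumptions for illustration).
NASA_MAP = {"pl_name": "planet_name", "hostname": "host_star",
            "pl_rade": "planet_radius_earth", "disc_year": "discovery_year"}
EU_MAP = {"name": "planet_name", "star_name": "host_star",
          "radius": "planet_radius_jup", "discovered": "discovery_year"}
R_JUP_IN_R_EARTH = 11.209  # Jupiter equatorial radius in Earth radii (approx.)


def to_unified(df, column_map, source):
    """Rename source columns, convert radii to Earth units, tag the source."""
    out = df.rename(columns=column_map)[list(column_map.values())].copy()
    if "planet_radius_jup" in out.columns:
        out["planet_radius_earth"] = out.pop("planet_radius_jup") * R_JUP_IN_R_EARTH
    out["data_source"] = source
    return out


# Toy rows with made-up planets, not real catalogue entries.
nasa = pd.DataFrame({"pl_name": ["Toy-1 b"], "hostname": ["Toy-1"],
                     "pl_rade": [2.0], "disc_year": [2011]})
eu = pd.DataFrame({"name": ["Toy-1 b", "Toy-2 b"], "star_name": ["Toy-1", "Toy-2"],
                   "radius": [0.18, 1.0], "discovered": [2011, 1995]})

# Stack both sources; when a planet appears twice, keep the first (NASA) row.
unified = (
    pd.concat([to_unified(nasa, NASA_MAP, "nasa"), to_unified(eu, EU_MAP, "eu")],
              ignore_index=True)
    .drop_duplicates(subset="planet_name", keep="first")
    .reset_index(drop=True)
)
print(unified)
```

Matching on the exact planet name is the simplest possible rule; real catalogues often spell the same planet differently, so a production merge would likely need name normalization or coordinate matching.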

## Step 3: Enrich Data with Derived Values

In [None]:
# Add derived columns
enriched_data = collector.enrich_data()
print(f"\nEnriched dataset shape: {enriched_data.shape}")
print(f"\nNew columns added: {[col for col in enriched_data.columns if col not in combined_data.columns]}")
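To give a feel for what derived columns such as `equilibrium_temp_k` and `planet_type` could contain, here is a minimal sketch. It uses the standard blackbody equilibrium temperature, T_eq = T_eff * sqrt(R_star / (2a)) * (1 - A)^(1/4), and a simple radius-based classification. The input column names, the albedo of 0.3, the radius thresholds, and the two labels other than 'Rocky (Earth-like)' and 'Super-Earth' are all assumptions; `enrich_data()` may use different definitions.

```python
import numpy as np
import pandas as pd

R_SUN_IN_AU = 0.00465047  # solar radius expressed in astronomical units


def equilibrium_temp_k(teff_k, rstar_rsun, a_au, albedo=0.3):
    """Blackbody equilibrium temperature assuming full heat redistribution."""
    ratio = rstar_rsun * R_SUN_IN_AU / (2.0 * a_au)
    return teff_k * np.sqrt(ratio) * (1.0 - albedo) ** 0.25


# Toy rows around a Sun-like star; the three stellar/orbital column names are hypothetical.
toy = pd.DataFrame({
    "planet_radius_earth": [1.0, 1.6, 11.2],
    "stellar_teff_k": [5772.0, 5772.0, 5772.0],
    "stellar_radius_sun": [1.0, 1.0, 1.0],
    "semi_major_axis_au": [1.0, 0.5, 5.2],
})

toy["equilibrium_temp_k"] = equilibrium_temp_k(
    toy["stellar_teff_k"], toy["stellar_radius_sun"], toy["semi_major_axis_au"]
)

# Radius thresholds (in Earth radii) are illustrative assumptions.
toy["planet_type"] = pd.cut(
    toy["planet_radius_earth"],
    bins=[0, 1.25, 2.0, 6.0, np.inf],
    labels=["Rocky (Earth-like)", "Super-Earth", "Neptune-like", "Gas giant"],
)
print(toy[["planet_radius_earth", "equilibrium_temp_k", "planet_type"]])
```

For the first row (an Earth-sized planet at 1 AU from a Sun-like star) the formula gives roughly 255 K, which matches the commonly quoted equilibrium temperature for Earth.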

## Step 4: Explore the Data

In [None]:
# Get summary statistics
collector.get_statistics()

In [None]:
# Save the combined data
collector.save_data('exoplanet_combined_data.csv')

## Step 5: Create Visualizations

In [None]:
# Initialize visualizer
viz = ExoplanetVisualizer(enriched_data)

### 5.1 3D Galaxy View

In [None]:
# Create interactive 3D view
fig_3d = viz.plot_3d_galaxy_view(save_html=True)
fig_3d.show()

### 5.2 Mass-Radius Diagram

In [None]:
# Mass vs Radius plot
fig_mr = viz.plot_mass_radius_diagram(save_html=True)
fig_mr.show()

### 5.3 Discovery Timeline

In [None]:
# Timeline of discoveries
fig_timeline = viz.plot_discovery_timeline(save_html=True)
fig_timeline.show()

### 5.4 Detection Methods

In [None]:
# Detection method analysis
fig_methods = viz.plot_detection_methods()
plt.show()

### 5.5 Habitable Zone Analysis

In [None]:
# Habitable zone visualization
fig_hz = viz.plot_habitable_zone_analysis(save_html=True)
fig_hz.show()

### 5.6 Stellar Properties

In [None]:
# Host star analysis
fig_stellar = viz.plot_stellar_properties()
plt.show()

### 5.7 Comprehensive Dashboard

In [None]:
# Create interactive dashboard
fig_dashboard = viz.create_dashboard(save_html=True)
fig_dashboard.show()

## Step 6: Custom Analysis

In [None]:
# Find potentially habitable planets
habitable = enriched_data[
    (enriched_data['in_habitable_zone'] == True) &
    (enriched_data['planet_type'].isin(['Rocky (Earth-like)', 'Super-Earth']))
].sort_values('stellar_distance_pc')

print(f"\nFound {len(habitable)} potentially habitable planets!\n")
print("Closest potentially habitable planets:")
print(habitable[['planet_name', 'host_star', 'stellar_distance_pc', 
                 'planet_radius_earth', 'equilibrium_temp_k']].head(10))

In [None]:
# Compare detection methods: planet counts and mean properties per method
method_stats = enriched_data.groupby('discovery_method').agg({
    'planet_name': 'count',
    'planet_mass_earth': 'mean',
    'planet_radius_earth': 'mean',
    'stellar_distance_pc': 'mean'
}).round(2)

method_stats.columns = ['Count', 'Avg Mass (Earth)', 'Avg Radius (Earth)', 'Avg Distance (pc)']
print("\nDetection method statistics:")
print(method_stats.sort_values('Count', ascending=False))

In [None]:
# Analyze discovery trends
yearly_discoveries = enriched_data.groupby('discovery_year').agg({
    'planet_name': 'count',
    'discovery_method': lambda x: x.mode()[0] if len(x.mode()) > 0 else 'Unknown'
})
yearly_discoveries.columns = ['Discoveries', 'Primary Method']

print("\nRecent discovery trends:")
print(yearly_discoveries.tail(10))
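The table above keeps only the most common method per year. If the full breakdown is of interest, a year-by-method count table is one alternative. The sketch below uses `pd.crosstab` on toy data with the same two column names; the rows are made up for illustration.

```python
import pandas as pd

# Toy data using the notebook's column names; values are made up.
toy = pd.DataFrame({
    "discovery_year": [2014, 2014, 2014, 2016, 2016, 2016],
    "discovery_method": ["Transit", "Transit", "Radial Velocity",
                         "Transit", "Transit", "Imaging"],
})

# Rows = years, columns = methods, cells = number of discoveries.
counts = pd.crosstab(toy["discovery_year"], toy["discovery_method"])

# Most frequent method per year (ties resolve to the first column in order).
primary = counts.idxmax(axis=1)

print(counts)
print(primary)
```

`pd.crosstab` drops rows where either value is missing, so planets without a recorded year or method do not appear in the counts.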

## Step 7: Export Results

In [None]:
# Export habitable planets to separate file
if len(habitable) > 0:
    habitable.to_csv('potentially_habitable_exoplanets.csv', index=False)
    print("Saved potentially habitable planets to potentially_habitable_exoplanets.csv")

# Export summary statistics
summary = pd.DataFrame({
    'Metric': [
        'Total Planets',
        'Data Sources',
        'Discovery Methods',
        'Potentially Habitable',
        'Closest Planet (pc)',
        'Farthest Planet (pc)',
        'Date Range'
    ],
    'Value': [
        len(enriched_data),
        enriched_data['data_source'].nunique(),
        enriched_data['discovery_method'].nunique(),
        len(habitable),
        f"{enriched_data['stellar_distance_pc'].min():.2f}",
        f"{enriched_data['stellar_distance_pc'].max():.2f}",
        f"{enriched_data['discovery_year'].min():.0f} - {enriched_data['discovery_year'].max():.0f}"
    ]
})

summary.to_csv('exoplanet_summary.csv', index=False)
print("\nSaved summary to exoplanet_summary.csv")
print(summary)

## Conclusion

This notebook has demonstrated:
1. ✅ Integration of multiple exoplanet databases
2. ✅ Creation of a unified data schema
3. ✅ Data enrichment with derived values
4. ✅ Comprehensive visualizations including:
   - 3D spatial distribution
   - Mass-radius relationships
   - Discovery timeline
   - Detection method analysis
   - Habitable zone identification
   - Stellar properties
5. ✅ Custom analysis capabilities

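The visualizations created with `save_html=True` are saved as interactive HTML files that can be opened in any web browser. The detection method and stellar properties plots are displayed inline with `plt.show()` and are not saved by this notebook.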