# Lunar Phase Calendar App
__[GitHub Repository](https://github.com/BobbyKostko/Astron1221-project2)__
## Project Motivation
- The moon's size and brightness in the sky often interfere with astronomical observation
- We wanted to create a tool designed for amateur astronomers who might not have the coding, data wrangling, or astronomy skills needed to accurately track the moon for good viewing conditions
- This puts a heavy emphasis on user-friendly implementation and accessibility
- We will generate data on lunar phase, rise/set times, and other important lunar events such as eclipses, packaging this data for easy use by others

## Coding Methods
- Pull data on earth/moon positions and rise/set times from the DE421 ephemeris, which is loaded through Skyfield
- Calculate illumination percentages and lunar events with this data, arranging our calculations in our own PANDAS table
- Store our data on GitHub and push it to a Streamlit app to generate lunar reports on nightly phase and other notable lunar events
- Streamlit implementation means nobody needs to install our code's dependencies or know how to run it. They don't even need to run it at all, since the generated data is stored in our repository.

## Installations and Requirements
### As listed in our repository's requirements.txt file, to run the data generation code, you need to install:
- Pandas
- Skyfield
- Numpy
  
### Additionally, our web app has separate requirements (also in requirements.txt):
- Streamlit
- Plotly (for data visualizations)
- Pillow (for calendar generation)

## Laying the Foundations
We start with making all of our imports, loading the proper ephemeris, and establishing a few objects and planetary constants used throughout the data generation process.

In [None]:
from datetime import datetime, timezone, timedelta
from skyfield.api import load, Topos
from skyfield import almanac
import numpy as np
import pandas as pd
from zoneinfo import ZoneInfo
import os

# Load timescale and ephemeris
ts = load.timescale()
eph = load('de421.bsp')

earth = eph['earth']
moon = eph['moon']
sun = eph['sun']

# Eclipse constants (angular sizes in degrees)
EARTH_ANGULAR_RADIUS_AT_MOON = 1.9   # Earth's angular size as seen from Moon (~1.9 deg is the full diameter; the radius is ~0.95 deg)
SUN_ANGULAR_RADIUS_AT_MOON = 0.27   # Sun's angular radius as seen from Moon (~1 AU)

## Calculating Lunar Phase
The first pieces of data we want to get are the moon's phase and illumination percentage at a given time. This is relatively easy using Skyfield's almanac module, which computes the phase angle for us, but we need to convert its output into the format we want.
### Retrieving phase and illumination percentage from our ephemeris

In [None]:
def get_lunar_phase(date):
    """
    Get the lunar phase for a given UTC datetime.
    
    Args:
        date: UTC-aware datetime object
        
    Returns:
        phase_name: String name of the phase (e.g., "Full Moon")
        illumination: Percentage of moon illuminated (0-100%)
    """
    t = ts.utc(date)
    
    # Calculate elongation (angle between sun and moon as seen from Earth)
    # using Skyfield's built-in moon_phase function
    phase_angle = almanac.moon_phase(eph, t)
    elongation = phase_angle.degrees
    
    # Calculate illumination percentage
    # elongation: 0° (New) -> 180° (Full) -> 360° (New)
    illumination = (1 - abs(elongation - 180) / 180) * 100
    illumination = round(illumination, 1)

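Skyfield's almanac gives us the moon's phase angle: the difference in ecliptic longitude between the moon and the sun as seen from earth, running from 0° to 360°. Our code maps that angle linearly to an illumination percentage, which is a simple approximation of how much of the moon's illuminated side we can see.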
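For comparison, the geometrically exact illuminated fraction of a sphere is not linear in the angle. Below is a minimal sketch that puts the two mappings side by side, assuming the elongation is a good stand-in for the phase angle (reasonable because the sun is so much farther away than the moon). The function names are hypothetical and are not part of our repository.

```python
import math

def linear_illumination(elongation_deg):
    """Linear mapping used in get_lunar_phase: 0% at 0/360 deg, 100% at 180 deg."""
    return (1 - abs(elongation_deg - 180) / 180) * 100

def cosine_illumination(elongation_deg):
    """Illuminated fraction of a sphere's disk, (1 - cos e) / 2, as a percentage."""
    return (1 - math.cos(math.radians(elongation_deg))) / 2 * 100

# Compare the two mappings at a few representative angles
for e in (0, 22.5, 90, 157.5, 180):
    print(e, round(linear_illumination(e), 1), round(cosine_illumination(e), 1))
```

The two agree at new, quarter, and full moon. In between, the linear mapping reads lower than the cosine form near full moon and higher near new moon, which is worth keeping in mind when interpreting the illumination column.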
### Classifying phase names for our retrieved angles

In [None]:
    # Determine phase name
    if elongation < 22.5 or elongation >= 337.5:
        phase_name = "New Moon"
    elif elongation < 67.5:
        phase_name = "Waxing Crescent"
    elif elongation < 112.5:
        phase_name = "First Quarter"
    elif elongation < 157.5:
        phase_name = "Waxing Gibbous"
    elif elongation < 202.5:
        phase_name = "Full Moon"
    elif elongation < 247.5:
        phase_name = "Waning Gibbous"
    elif elongation < 292.5:
        phase_name = "Last Quarter"
    else:
        phase_name = "Waning Crescent"
    
    return phase_name, illumination

Now we convert our phase angles into something more easily understandable: phase names. It is worth noting here that a "Full Moon" by our definition only needs about 87.5% illumination (the value our formula gives at the edges of the Full Moon bin) rather than 100%. We chose to do this for two reasons:
- For an amateur astronomer looking either at the moon or trying to gauge how much the moon affects sky visibility on a given night, a moon this close to full is almost indistinguishable from 100%.
- Our code only calculates lunar phase once each day for efficiency. With illumination percentages changing slightly over a 24-hour period, we will almost never calculate exactly 100% illumination.
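The eight bins above are each 45° wide and centered on a named phase, so the same classification can be written as a table lookup. This is only a sketch of an equivalent approach; `EDGES`, `NAMES`, and `phase_name` are hypothetical names, not code from our repository.

```python
from bisect import bisect_right

# Upper edges (degrees) of the 45-degree bins, in increasing order
EDGES = [22.5, 67.5, 112.5, 157.5, 202.5, 247.5, 292.5, 337.5]
NAMES = ["New Moon", "Waxing Crescent", "First Quarter", "Waxing Gibbous",
         "Full Moon", "Waning Gibbous", "Last Quarter", "Waning Crescent",
         "New Moon"]  # angles at or above 337.5 wrap back to New Moon

def phase_name(elongation_deg):
    """Map a 0-360 degree phase angle to its named 45-degree bin."""
    return NAMES[bisect_right(EDGES, elongation_deg % 360)]

print(phase_name(180.0))  # Full Moon
print(phase_name(350.0))  # New Moon
```

Using `bisect_right` keeps the same edge behavior as the `if`/`elif` chain: an angle exactly on an edge (for example 157.5°) falls into the later bin.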

## Finding Rise/Set Times
The next pieces of data we want to retrieve are the moon's rise and set times each day. This is again fairly simple with Skyfield's almanac, which already has a function for finding the rise and set times of any object as viewed from any location on earth.
### Note: This is locally centered data
We are choosing our coordinates to be for Columbus, Ohio specifically. Ultimately we will convert this data from UTC to Eastern time in our final app.
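As a rough illustration of that conversion step, here is a minimal sketch using the standard-library `zoneinfo` module. The helper name `to_eastern` is hypothetical; how the app actually performs the conversion may differ.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")

def to_eastern(dt_utc):
    """Convert a UTC-aware datetime to Eastern time (EST or EDT as appropriate)."""
    return dt_utc.astimezone(EASTERN)

winter = to_eastern(datetime(2025, 1, 15, 4, 0, tzinfo=timezone.utc))
summer = to_eastern(datetime(2025, 7, 15, 4, 0, tzinfo=timezone.utc))
print(winter.strftime("%Y-%m-%d %H:%M %Z"))  # 2025-01-14 23:00 EST
print(summer.strftime("%Y-%m-%d %H:%M %Z"))  # 2025-07-15 00:00 EDT
```

Note that the same UTC clock time lands on a different local hour, and here even a different calendar date, depending on whether daylight saving time is in effect.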

In [None]:
def get_moon_rise_set(date, latitude=39.9612, longitude=-82.9988, elevation_m=275.0):
    """
    Get moon rise and set times for a given calendar day using Skyfield almanac.
    
    Args:
        date: UTC-aware datetime object
        latitude: Observer's latitude in degrees (default: 39.9612, Columbus, OH)
        longitude: Observer's longitude in degrees (default: -82.9988, Columbus, OH)
        elevation_m: Observer's elevation in meters (default: 275.0, Columbus, OH)
    
    Returns:
        rise_time: datetime (UTC) of first moonrise in the day, or None
        set_time: datetime (UTC) of first moonset in the day, or None
    """
    # Observer location
    site_topos = Topos(latitude_degrees=latitude,
                       longitude_degrees=longitude,
                       elevation_m=elevation_m)

    # Start/end of the UTC day
    start_of_day = date.replace(hour=0, minute=0, second=0, microsecond=0)
    end_of_day = start_of_day + timedelta(days=1)

    t0 = ts.utc(start_of_day)
    t1 = ts.utc(end_of_day)

    # Build above-horizon function and find discrete transitions
    above_horizon_fn = almanac.risings_and_settings(eph, moon, site_topos)
    times, events = almanac.find_discrete(t0, t1, above_horizon_fn)

    # Extract first rise and set times within the day
    rise_time = None
    set_time = None
    for t, ev in zip(times, events):
        if ev == 1 and rise_time is None:  # rising edge
            rise_time = t.utc_datetime()
        elif ev == 0 and set_time is None:  # setting edge
            set_time = t.utc_datetime()

        if rise_time is not None and set_time is not None:
            break

    return rise_time, set_time

## Checking for Eclipses
The next event we want to check for is a lunar eclipse. This requires some slight geometry to determine if the moon appears within the earth's shadow at a given time and if so, by how much.

In [None]:
def check_lunar_eclipse(date):
    """
    Check for lunar eclipse at given time.
    Returns: (eclipse_type, shadow_depth, elongation) where:
        eclipse_type: "Total", "Partial", "Penumbral", or None
        shadow_depth: 0-100 indicating shadow coverage
        elongation: angle from opposition
    """
    if isinstance(date, str):
        date = datetime.strptime(date, '%Y-%m-%d')
    if date.tzinfo is None:
        date = date.replace(tzinfo=timezone.utc)
    
    t = ts.utc(date)
    sun_vec = earth.at(t).observe(sun).apparent()
    moon_vec = earth.at(t).observe(moon).apparent()
    
    # Elongation check: must be near opposition (full moon)
    elongation = sun_vec.separation_from(moon_vec).degrees
    if abs(elongation - 180) > 5:  # Not close enough to opposition
        return None, 0, abs(elongation - 180)

We start by getting the sun and moon positions as seen from earth to find their elongation angle again. A lunar eclipse can only happen when the moon is roughly behind the earth (since that's where earth's shadow falls), so we only run the eclipse calculation for elongation angles within 5° of 180° to save some computing time.
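To make the opposition check concrete without needing an ephemeris, here is a minimal sketch with plain 3D direction vectors standing in for the Skyfield positions. `separation_deg` and `near_opposition` are hypothetical helpers written for this illustration.

```python
import math

def separation_deg(v1, v2):
    """Angle between two 3D direction vectors, in degrees (always 0-180)."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against tiny floating-point overshoot before acos
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def near_opposition(sun_vec, moon_vec, tolerance_deg=5.0):
    """Return (is_near, offset), where offset is the distance from 180 degrees."""
    offset = abs(separation_deg(sun_vec, moon_vec) - 180)
    return offset <= tolerance_deg, offset

# Sun along +x, moon 3 degrees away from the exactly opposite direction
moon = (-math.cos(math.radians(3)), math.sin(math.radians(3)), 0.0)
print(near_opposition((1.0, 0.0, 0.0), moon))
```

Because a separation angle never exceeds 180°, it cannot distinguish waxing from waning, but that is fine here since only the distance from opposition matters.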

In [None]:
    # Shadow cone angles (in degrees) - based on angular sizes
    # Penumbra: Earth + Sun angular radii
    # Umbra: Earth - Sun angular radii  
    penumbra_radius = EARTH_ANGULAR_RADIUS_AT_MOON + SUN_ANGULAR_RADIUS_AT_MOON
    umbra_radius = EARTH_ANGULAR_RADIUS_AT_MOON - SUN_ANGULAR_RADIUS_AT_MOON
    
    # Offset from perfect opposition
    offset = abs(elongation - 180)
    
    # Classify eclipse
    if offset < umbra_radius * 0.5:
        return "Total", int(100 * (1 - offset / umbra_radius)), offset
    elif offset < umbra_radius:
        return "Partial", int(100 * (1 - offset / umbra_radius)), offset
    elif offset < penumbra_radius:
        return "Penumbral", int(50 * (1 - offset / penumbra_radius)), offset
    return None, 0, offset

Now we need the radii of earth's shadow, using the angular radius constants defined at the start of our code. Since the center of earth's shadow sits at exactly 180° from the sun, we only need to check whether the moon's elongation falls within the shadow's radius of 180°. This tells us whether the moon's center appears within the umbral or penumbral part of earth's shadow at any given time.
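For reference, the textbook geometry uses somewhat different numbers from our thresholds: the shadow radii are built from the moon's horizontal parallax (which equals earth's angular radius as seen from the moon) and the sun's angular radius, and the moon's own angular radius decides between total, partial, and penumbral. The sketch below is a different classification rule from the one in `check_lunar_eclipse`, written with approximate mean values that are assumptions for illustration. All names are hypothetical.

```python
# Approximate mean values in degrees (assumptions for illustration)
MOON_PARALLAX = 0.9507      # Moon's horizontal parallax = Earth's angular radius from the Moon
SUN_PARALLAX = 0.0024       # Sun's horizontal parallax
SUN_SEMIDIAMETER = 0.2666   # Sun's angular radius
MOON_SEMIDIAMETER = 0.2591  # Moon's angular radius
ATMOSPHERE_FACTOR = 1.02    # conventional ~2% enlargement of Earth's shadow

UMBRA_RADIUS = ATMOSPHERE_FACTOR * (MOON_PARALLAX + SUN_PARALLAX - SUN_SEMIDIAMETER)
PENUMBRA_RADIUS = ATMOSPHERE_FACTOR * (MOON_PARALLAX + SUN_PARALLAX + SUN_SEMIDIAMETER)

def classify_eclipse(offset_deg):
    """Classify by the Moon-center distance from the shadow axis, in degrees."""
    if offset_deg < UMBRA_RADIUS - MOON_SEMIDIAMETER:
        return "Total"       # whole disk inside the umbra
    if offset_deg < UMBRA_RADIUS + MOON_SEMIDIAMETER:
        return "Partial"     # disk overlaps the umbra
    if offset_deg < PENUMBRA_RADIUS + MOON_SEMIDIAMETER:
        return "Penumbral"   # disk overlaps only the penumbra
    return None

print(round(UMBRA_RADIUS, 2), round(PENUMBRA_RADIUS, 2))
print(classify_eclipse(0.2), classify_eclipse(0.8), classify_eclipse(1.3))
```

With these mean values the umbra comes out near 0.7° and the penumbra near 1.24°, noticeably tighter than the radii our constants produce, so our thresholds err on the side of flagging more nights.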

## Sampling Each Night for an Eclipse
A lunar eclipse only lasts a few hours, so if we only check for a lunar eclipse once a night, we might miss every single one. Instead, we need to conduct regular checks throughout every night while the moon is up.

In [None]:
def sample_night_for_eclipse(date_utc, rise_time, set_time):
    """
    Sample throughout a single night (hourly) to find maximum eclipse.
    Returns: (eclipse_type, shadow_depth, max_eclipse_time_utc)
    """
    if not rise_time or not set_time:
        return None, 0, None
    
    best_eclipse_type = None
    best_depth = 0
    best_time = None
    
    # Get the sampling window from rise/set times
    night_start = rise_time
    night_end = set_time
    # If set time is before rise time on the calendar, moon sets next day
    if night_end < night_start:
        night_end = night_end + timedelta(days=1)
    
    # Sample every hour throughout the visible period
    current_time = night_start
    max_samples = 48  # Safety: max 2 days of hourly samples
    sample_count = 0

We check for a lunar eclipse every hour within each rise/set window. We want a simple, easily digestible lunar report, so we only keep track of the eclipse's peak as the night goes on. We keep running variables for the peak eclipse depth, the kind of eclipse, and the time at which that peak occurs.
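The peak-tracking idea can be sketched on its own with a made-up depth function in place of the real eclipse check. Everything here (`peak_over_window`, `fake_depth`, the dates) is hypothetical and only meant to show how the sampling interval affects the peak we record.

```python
from datetime import datetime, timedelta, timezone

def peak_over_window(start, end, depth_fn, step=timedelta(hours=1)):
    """Evaluate depth_fn at evenly spaced times in [start, end]; return (best_depth, best_time)."""
    best_depth, best_time = 0, None
    current = start
    while current <= end:
        depth = depth_fn(current)
        if depth > best_depth:  # strict '>' keeps the earliest of equal peaks
            best_depth, best_time = depth, current
        current += step
    return best_depth, best_time

# Hypothetical stand-in for an eclipse check: depth peaks at 03:00 UTC
true_peak = datetime(2025, 1, 10, 3, 0, tzinfo=timezone.utc)

def fake_depth(t):
    hours_from_peak = abs((t - true_peak).total_seconds()) / 3600
    return max(0, 100 - 40 * hours_from_peak)

rise = datetime(2025, 1, 10, 0, 30, tzinfo=timezone.utc)
moonset = datetime(2025, 1, 10, 10, 30, tzinfo=timezone.utc)
print(peak_over_window(rise, moonset, fake_depth))                          # hourly samples
print(peak_over_window(rise, moonset, fake_depth, timedelta(minutes=10)))   # finer samples
```

In this toy case the hourly samples fall at half past each hour, so they straddle the 03:00 peak and record a depth of 80 instead of 100, while ten-minute samples land on it exactly. Finer sampling costs proportionally more eclipse checks per night.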

In [None]:
    while current_time <= night_end and sample_count < max_samples:
        eclipse_type, depth, _ = check_lunar_eclipse(current_time)
        
        # Track best eclipse found
        if eclipse_type and depth > best_depth:
            best_eclipse_type = eclipse_type
            best_depth = depth
            best_time = current_time
        
        current_time += timedelta(hours=1)
        sample_count += 1
        
        # Safety check: don't sample beyond 48 hours
        if (current_time - night_start).total_seconds() > 86400 * 2:
            break
    
    return best_eclipse_type, best_depth, best_time

## Creating a PANDAS Table with our Data
With all of our Skyfield data-gathering functions laid out, we can start putting them together to generate our actual data file. The goal here is to generate all the data our app will need just one time. To make this possible, we need to set our range of dates to be quite large.

In [None]:
def main():
    print("=" * 60)
    print("Moon Phase Tracker - Data Generator (1900-2035)")
    print("=" * 60)
    
    # Set date range: January 1, 1900 to December 31, 2035 at 11PM Eastern time
    eastern = ZoneInfo('America/New_York')
    start_date = datetime(1900, 1, 1, 23, 0, 0, tzinfo=eastern)
    end_date = datetime(2035, 12, 31, 23, 0, 0, tzinfo=eastern)
    # Convert to UTC
    start_date_utc = start_date.astimezone(timezone.utc)
    end_date_utc = end_date.astimezone(timezone.utc)
    
    # Calculate number of days
    total_days = (end_date_utc - start_date_utc).days + 1

    print(f"\nStarting from: {start_date.strftime('%Y-%m-%d %H:%M:%S')} Eastern Time")
    print(f"                     ({start_date_utc.strftime('%Y-%m-%d %H:%M:%S')} UTC)")
    print(f"\nEnding at: {end_date.strftime('%Y-%m-%d %H:%M:%S')} Eastern Time")
    print(f"                 ({end_date_utc.strftime('%Y-%m-%d %H:%M:%S')} UTC)")
    print(f"\nGenerating data for {total_days} days (1900-2035)...")

The range of dates we use spans 1900 to 2035. This might seem like too many dates, with far more past data than future data, but it follows from the two main features we want our app to have:
### Calendar report for future lunar data
- A thirty-day report showing rise/set times, moon phase, and other notable events
- Our target demographic (backyard astronomers) won't want to make super distant predictions
- Lunar data for the next ten years is more than enough

### Individual reports for lunar data
- A slightly simpler report (no time information included) for specific days
- Designed for people who want to know what the moon was doing when a certain event happened
- Mainly this has to do with people checking their own birthdays
- Data going back to 1900 makes sure every living person can use this feature

### Now we define our data columns for the PANDAS table

In [None]:
    # Initialize lists to store data
    dates = []
    phases = []
    illuminations = []
    rise_times = []
    set_times = []
    eclipse_types = []
    eclipse_depths = []
    eclipse_times = []
    supermoon_flags = []
    # Dictionary to store eclipses by their calendar date (Eastern time)
    # Key: date string "YYYY-MM-DD", Value: (eclipse_type, depth, time_str)
    eclipse_dict = {}

Each of the above lists will become its own column in our PANDAS table. The columns and their purposes are as follows:
- __Dates__ for each day ranging between 01/01/1900 and 12/31/2035
- __Lunar phase__ measured at 11:00 PM EST (roughly the middle of peak stargazing hours)
- __Illumination %__ also measured at 11:00 PM
- __Rise/set times__ of the moon from a viewer in Columbus
- __Eclipse type__ at the peak measured in our sampling (total, partial, or penumbral)
- __Eclipse depth__ (percentage of the moon covered in shadow at the peak time)
- __Eclipse time__ when it reaches peak depth
- __Is there a supermoon__ (a simple enough function using the distance between earth and the moon to just be defined in our main function)
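The supermoon flag in the last bullet boils down to a two-part test. Here it is as a standalone sketch. The helper name `is_supermoon` is hypothetical, and the 360,000 km cutoff is the one used in our main loop (there is no single official supermoon definition).

```python
SUPERMOON_MAX_DISTANCE_KM = 360_000  # threshold used in our main loop

def is_supermoon(phase_name, distance_km, max_km=SUPERMOON_MAX_DISTANCE_KM):
    """True when the Moon is in the Full Moon bin and within max_km of Earth's center."""
    return phase_name == "Full Moon" and distance_km <= max_km

print(is_supermoon("Full Moon", 357_000))  # True
print(is_supermoon("Full Moon", 400_000))  # False
print(is_supermoon("New Moon", 357_000))   # False
```

Because the Full Moon bin covers 45° of phase angle, the moon stays in it for several days, so a few consecutive rows can all be flagged around one close full moon.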


We also include a dictionary of eclipse dates because our sampling method uses the entire window where the moon is above the horizon, which can span more than one calendar day. Without it, our sampling logic could find an eclipse's peak on one day while filling in a different day's row. Keeping a separate dictionary keyed by date lets us retroactively assign each eclipse's information to the correct day.
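A small sketch of that bookkeeping, using made-up eclipse values: each entry is stored under its local calendar date, and a later, deeper measurement for the same date replaces a shallower one. `record_eclipse` is a hypothetical helper, not a function from our repository.

```python
def record_eclipse(eclipse_by_date, local_date, eclipse_type, depth, time_str):
    """Store an eclipse under its local calendar date, keeping the deepest entry."""
    existing = eclipse_by_date.get(local_date)
    if existing is None or depth > existing[1]:
        eclipse_by_date[local_date] = (eclipse_type, depth, time_str)

eclipses = {}
# Made-up samples: the same date reached while processing different rows
record_eclipse(eclipses, "2025-01-10", "Partial", 40, "2025-01-10 21:30 ET")
record_eclipse(eclipses, "2025-01-10", "Total", 75, "2025-01-10 22:30 ET")
record_eclipse(eclipses, "2025-01-10", "Partial", 55, "2025-01-10 23:30 ET")

# Rows without an eclipse fall back to placeholder values
print(eclipses.get("2025-01-10", ("None", 0, "None")))
print(eclipses.get("2025-01-11", ("None", 0, "None")))
```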
### Calling our data functions

In [None]:
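    # Generate data for all days from 1900 to 2035
    for day in range(total_days):
        # Step in local wall-clock time so every sample stays at 11PM Eastern,
        # even across daylight saving changes (a fixed UTC time would drift to
        # midnight local time in summer and shift the calendar date)
        date_local = start_date + timedelta(days=day)
        date_utc = date_local.astimezone(timezone.utc)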
        # Get lunar phase at 11PM Eastern
        phase, illumination = get_lunar_phase(date_utc)
        # --- SUPERMOON CHECK ---
        # Compute geocentric distance at 11PM Eastern (UTC)
        t = ts.utc(date_utc)
        moon_apparent = earth.at(t).observe(moon).apparent()
        distance_km = moon_apparent.distance().km  # Skyfield distance is in AU by default; .km gets km
        # Supermoon definition: Full Moon and ≤360,000 km
        is_supermoon = phase == "Full Moon" and distance_km <= 360000
        supermoon_flags.append(is_supermoon)
        # --- END SUPERMOON CHECK ---
        # Get moon rise/set times for the calendar day
        rise_time, set_time = get_moon_rise_set(date_utc)
        # Check for lunar eclipse - sample hourly throughout the night if near full moon
        eclipse_type, eclipse_depth, eclipse_time_utc = None, 0, None
        if illumination > 85:  # Only check during near full moons
            eclipse_type, eclipse_depth, eclipse_time_utc = sample_night_for_eclipse(
                date_utc, rise_time, set_time
            )
            # Store eclipse info keyed by its actual calendar date (in Eastern time)
            if eclipse_time_utc:
                eclipse_local = eclipse_time_utc.astimezone(eastern)
                eclipse_date = eclipse_local.strftime('%Y-%m-%d')
                eclipse_time_str = eclipse_local.strftime('%Y-%m-%d %H:%M ET')
                # Store in dictionary by actual date (keep the one with highest depth if multiple)
                if eclipse_date not in eclipse_dict or eclipse_depth > eclipse_dict[eclipse_date][1]:
                    eclipse_dict[eclipse_date] = (eclipse_type, eclipse_depth, eclipse_time_str)
        # Format times for display
        rise_str = "No rise" if rise_time is None else rise_time.strftime('%H:%M:%S UTC')
        set_str = "No set" if set_time is None else set_time.strftime('%H:%M:%S UTC')
        current_date = date_local.strftime('%Y-%m-%d')
        # Store data
        dates.append(current_date)
        phases.append(phase)
        illuminations.append(illumination)
        rise_times.append(rise_str)
        set_times.append(set_str)
    # After all days, map eclipse info for each calendar date
    for i, current_date in enumerate(dates):
        if current_date in eclipse_dict:
            eclipse_type, eclipse_depth, eclipse_time_str = eclipse_dict[current_date]
            eclipse_types.append(eclipse_type)
            eclipse_depths.append(eclipse_depth)
            eclipse_times.append(eclipse_time_str)
        else:
            eclipse_types.append("None")
            eclipse_depths.append(0)
            eclipse_times.append("None")
        # Progress indicator
        if (i + 1) % 500 == 0:
            print(f"  Generated data for {i + 1}/{total_days} days...")

This updates our lists with the correct data. Each list has almost 50,000 individual entries, one for each day in the span of 136 years. Generating all of this data took just over 40 minutes, which is why we really want nobody to have to do this themselves.
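The row count is easy to double-check with the standard library. The per-day cost below is only a rough estimate that assumes a total runtime of 40 minutes.

```python
from datetime import date

start, end = date(1900, 1, 1), date(2035, 12, 31)
total_days = (end - start).days + 1    # inclusive of both endpoints
years = end.year - start.year + 1

# Rough cost per day of data, assuming a 40-minute total runtime
ms_per_day = 40 * 60 * 1000 / total_days

print(total_days, years)   # 49673 136
print(round(ms_per_day))   # 48
```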

### Turning our lists into a PANDAS table and (not actually) running the code

In [None]:
    # Create pandas DataFrame
    df = pd.DataFrame({
        'Date': dates,
        'Phase': phases,
        'Illumination_%': illuminations,
        'Moon_Rise': rise_times,
        'Moon_Set': set_times,
        'Eclipse_Type': eclipse_types,
        'Eclipse_Depth_%': eclipse_depths,
        'Eclipse_Time': eclipse_times,
        'Supermoon': supermoon_flags
    })
    print("\nData generation complete!")
    print("\nFirst 10 rows:")
    print(df.head(10))
    print(f"\nTotal rows: {len(df)}")
    print("=" * 60)
    # Save to CSV (will overwrite if file exists)
    csv_filename = 'lunar_data_1900_2035.csv'
    if os.path.exists(csv_filename):
        os.remove(csv_filename)
        print(f"\nRemoved existing file: {csv_filename}")
    df.to_csv(csv_filename, index=False)
    print(f"Data saved to: {csv_filename}")
    print("=" * 60)

if __name__ == "__main__":
    main()

Our table comes out looking like this:
![CSV_Data_sample](Sample_Data.PNG)

## Streamlit implementation
With our data stored in one big CSV file, Streamlit can quickly and easily pull from this file to do whatever we want. By writing our Streamlit code in the same GitHub repository as the rest of our project (including this data file) and deploying our app from GitHub, we let Streamlit access any of the files within that repository. What this means is __nobody else actually needs to run our code.__ This is hugely important for the people we are designing this app for.
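To give a feel for what "pulling from the file" involves, here is a dependency-free sketch that reads rows from a CSV and keeps a 30-day window after a chosen date. It uses the standard-library `csv` module rather than PANDAS so it runs anywhere; `rows_in_window` and the sample rows are hypothetical, and the app's actual loading code may look quite different.

```python
import csv
import io
from datetime import date, timedelta

def rows_in_window(csv_file, start, days=30):
    """Return rows whose Date falls in [start, start + days)."""
    end = start + timedelta(days=days)
    return [row for row in csv.DictReader(csv_file)
            if start <= date.fromisoformat(row["Date"]) < end]

# Tiny in-memory stand-in for lunar_data_1900_2035.csv (values are made up)
sample = io.StringIO(
    "Date,Phase,Illumination_%\n"
    "2025-01-01,Waxing Crescent,10.0\n"
    "2025-01-15,Full Moon,95.0\n"
    "2025-02-20,Last Quarter,48.0\n"
)
window = rows_in_window(sample, date(2025, 1, 1))
print([row["Date"] for row in window])  # ['2025-01-01', '2025-01-15']
```

In PANDAS the same filter would be a boolean mask on a parsed `Date` column.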

__[App link](https://thecoolestlunarcalendarmakerofalltime.streamlit.app/)__
### What's included in our app
- Allows calendar input for any day within our generated data range
- Shows visualizations for moon phase, illumination percentage, and rise/set times in the 30 days following the input date
- Generates a calendar image for the 30-day window with
    - Lunar phase, illumination percentage, rise/set times, eclipse type (if any), and whether there is a supermoon
    - Day-of-week awareness and actual month/day shown
    - Fun color coding for easily digestible information
    - An option to save the calendar to your computer for later use
- Gives general report for a specific date with
    - Lunar phase
    - Illumination percentage
    - Other important lunar events (if any)

## Important Caveats to our Data
Naturally, our data has its limitations. We chose a broad timespan to give our app a much broader use, but that means we had to sacrifice some precision. Running our calculations for eclipses and lunar phases only a few times (or even once) per night means we almost never capture the true time of peak eclipse depth, and we cannot tell you the specific illumination percentage at an arbitrary time of day. We could fix these imprecise calculations by changing only a few lines of code, but our final code appears this way for one main reason. __I don't want my laptop to explode!__

Given how long it took to generate our current CSV file, running these same checks several more times each night would demand far more computing time. It wouldn't be impossible to leave a laptop running overnight to generate more accurate or time-sensitive data, but with the time constraints of this project and the number of iterations the CSV file has gone through, this level of data quality is simply all we are willing to do.

This level of data precision is also good enough for most of what we are trying to deliver. It contains very precise rise/set times, and while it may not have the exact start time and max depth of an eclipse, it should catch any major lunar eclipse that happens while the moon is above the horizon in Columbus, along with a time when it is happening. For backyard stargazers who aren't trying to take precise measurements or make astronomical predictions, having this general info is often good enough.

## Possible Additions
- __Better eclipse data__ obtained by taking more samples throughout the night
- __Charting the moon's location__ in the sky, letting us create a visualization of the moon's path through the sky
- __Tracking obscured celestial bodies__ using the moon's position in the sky and its brightness to measure stargazing conditions. This would allow us to list important objects in the sky such as planets or bright stars that might be blocked by the moon or its light at a certain point in the night.

## AI Limitations
Our project relied on using Cursor AI to help write code. However, there were certain points where this AI struggled.
- __Using Skyfield__ was difficult at times.
    - Cursor specifically struggled to understand Skyfield's almanac feature. It did not use the almanac on its own, and even struggled to do so when directly told to.
    - It also didn't understand the output of certain functions like .separation_from() which will only return an angle between 0-180. The lunar phase angle ranges from 0-360 to account for the difference between waxing and waning. This is easily fixable by using Skyfield's almanac. Too bad it didn't know how to do that.
- __Understanding basic astronomy concepts__ such as how the moon must be above the horizon at least once a day. This led to the creation of several useless variables and lines of code that had to be manually deleted.

## Conclusion
Our project successfully retrieves and manipulates ephemeris data to get information about the moon on a given day. We were able to package this data and deliver it in a user-friendly way to anyone via a web app. Most of this project's difficulty lay not in hard astronomical calculations but in efficiently finding and storing data in PANDAS format, so that it could be manipulated and displayed in a more readable form than a BSP or CSV file.