<a href="https://colab.research.google.com/github/l-87hjl/3i-atlas-public-data/blob/main/Copy_of_Horizons_Batch_Fetcher_Fixed.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## ⚠️ Pipeline Context

This notebook implements **Pipeline B**: API direct-fetch against the **current JPL solution**.

**Key limitation**: API queries always use the current orbital solution. You cannot query historical solutions like Sol44.

---

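**Rotated Error Ellipse Methodology**: This notebook fetches the SMAA, SMIA, and Theta parameters of the plane-of-sky error ellipse. Do NOT treat the RA and DEC uncertainties as independent: the ellipse is generally rotated, so the two are correlated. See `documentation/SIGMA_UNCERTAINTY_INTERPRETATION_JPL_HORIZONS.txt`.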
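
To make the warning above concrete, here is a minimal sketch of how a rotated ellipse maps to a covariance matrix with a non-zero RA/DEC cross term. It assumes SMAA and SMIA are in arcseconds, that Theta is in degrees and is measured from the direction of increasing RA toward increasing DEC, and that the x axis is RA·cos(DEC); check these conventions against the Horizons documentation before relying on them. The function names `ellipse_to_covariance` and `normalized_offset` are hypothetical, and the numbers are placeholders, not Horizons output.

```python
import math


def ellipse_to_covariance(smaa, smia, theta_deg):
    """Return (c_xx, c_xy, c_yy) for x = RA*cos(DEC), y = DEC.

    Assumption: theta_deg is the angle from +x (increasing RA) toward
    +y (increasing DEC) to the semi-major axis. Inputs are 3-sigma
    values, so the result is 9 times the 1-sigma covariance.
    """
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    a2, b2 = smaa ** 2, smia ** 2
    c_xx = a2 * c * c + b2 * s * s
    c_yy = a2 * s * s + b2 * c * c
    c_xy = (a2 - b2) * s * c  # zero only if the ellipse is axis-aligned or circular
    return c_xx, c_xy, c_yy


def normalized_offset(dx, dy, smaa, smia, theta_deg):
    """Express an offset (dx, dy) in ellipse units: 1.0 lies on the boundary."""
    t = math.radians(theta_deg)
    along = dx * math.cos(t) + dy * math.sin(t)    # component along the major axis
    across = -dx * math.sin(t) + dy * math.cos(t)  # component along the minor axis
    return math.hypot(along / smaa, across / smia)


# Placeholder ellipse: 2.0" x 0.5" rotated by 30 degrees
c_xx, c_xy, c_yy = ellipse_to_covariance(2.0, 0.5, 30.0)
print(f"c_xx={c_xx:.4f}  c_xy={c_xy:.4f}  c_yy={c_yy:.4f}")
```

A residual compared against RA_3sigma and DEC_3sigma separately ignores the `c_xy` term; `normalized_offset` tests it against the rotated ellipse instead.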

# 🌌 Horizons API Batch Fetcher for Comet 3I/ATLAS (FIXED)

This notebook automatically fetches ephemeris data for observations of comet 3I/ATLAS from JPL Horizons.

**FIXED:** Now handles variable data formats including optional flags and different observatory code formats.

## 📋 Instructions

1. **Upload your CSV file**: Click the folder icon on the left → Upload → Select `observations-timestamp-observatory-only.csv`
2. **Run all cells**: Click `Runtime` → `Run all` (or press Ctrl+F9)
3. **Download results**: The final cell will let you download your results CSV

## 📊 Data Extracted

For each observation, we extract:
- UTC time
- RA (ICRF)
- DEC (ICRF)
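- dRA*cosD (RA angular rate, arcsec/hour)
- d(DEC)/dt (DEC angular rate, arcsec/hour)
- RA_3sigma (3-sigma RA uncertainty, arcsec)
- DEC_3sigma (3-sigma DEC uncertainty, arcsec)
- SMAA_3sig (3-sigma error ellipse semi-major axis, arcsec)
- SMIA_3sig (3-sigma error ellipse semi-minor axis, arcsec)
- Theta (error ellipse orientation angle, degrees)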
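
RA and DEC come back as space-separated sexagesimal strings (for example `10 47 13.069808` and `+07 00 12.85815`), which are awkward for arithmetic. The sketch below converts them to decimal degrees. The helper name `sexagesimal_to_degrees` is hypothetical, and it assumes RA is given in hours, minutes, seconds and DEC in degrees, arcminutes, arcseconds.

```python
def sexagesimal_to_degrees(text, is_ra=False):
    """Convert 'HH MM SS.s' (RA) or '+DD MM SS.s' (DEC) to decimal degrees."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError(f"expected three fields, got {text!r}")
    # Read the sign from the string so that '-00 30 00' stays negative
    sign = -1.0 if fields[0].startswith("-") else 1.0
    value = abs(float(fields[0])) + float(fields[1]) / 60.0 + float(fields[2]) / 3600.0
    if is_ra:
        value *= 15.0  # one hour of RA is 15 degrees
    return sign * value


ra_deg = sexagesimal_to_degrees("10 47 13.069808", is_ra=True)
dec_deg = sexagesimal_to_degrees("+07 00 12.85815")
print(f"RA = {ra_deg:.6f} deg, DEC = {dec_deg:.6f} deg")
```

Taking the sign from the leading character rather than from the parsed number matters for declinations between 0 and -1 degrees, where the degrees field parses to zero.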

---

## Step 1: Install Required Libraries

Google Colab already has `requests` installed, but we'll make sure.

In [None]:
# Install/verify requests library (usually already available in Colab)
!pip install -q requests
print("✓ Libraries ready!")

✓ Libraries ready!


## Step 2: Import Libraries and Define Functions

In [None]:
import csv
import re
import time
from typing import Dict, Optional
from urllib.parse import urlencode

import requests
from google.colab import files

print("✓ Imports successful!")

✓ Imports successful!


In [None]:
def build_horizons_url(timestamp: str, observatory: str) -> str:
    """
    Build Horizons API URL for given timestamp and observatory
    """
    params = {
        'format': 'text',
        'COMMAND': "'DES=1004083;'",
        'MAKE_EPHEM': "'YES'",
        'EPHEM_TYPE': "'OBSERVER'",
        'CENTER': f"'{observatory}'",
        'TLIST': f"'{timestamp}'",
        'QUANTITIES': "'1,3,36,37'",
        'EXTRA_PREC': "'YES'",
        'TIME_DIGITS': "'SECONDS'",
        'CSV_FORMAT': "'NO'"
    }

    base_url = "https://ssd.jpl.nasa.gov/api/horizons.api"
    return f"{base_url}?{urlencode(params)}"


def parse_horizons_response(response_text: str, timestamp: str, observatory: str) -> Optional[Dict[str, str]]:
    """
    Parse Horizons API response to extract required fields.
    FIXED: Handles optional 'A' flag and variable data formats.
    """
    # Find the data line between $$SOE and $$EOE markers
    soe_pattern = r'\$\$SOE\s+(.*?)\s+\$\$EOE'
    match = re.search(soe_pattern, response_text, re.DOTALL)

    if not match:
        print(f"    ⚠️ No data found for {timestamp} at {observatory}")
        return None

    data_line = match.group(1).strip()
    parts = data_line.split()

    if len(parts) < 15:
        print(f"    ⚠️ Insufficient data ({len(parts)} parts) for {timestamp} at {observatory}")
        print(f"    Debug: {data_line[:100]}...")
        return None

    try:
        # Horizons can print a solar/lunar presence marker between the time and RA
        # (e.g. '*', 'C', 'N', 'A' for daylight or twilight, 'm' for Moon up).
        # RA hours are always numeric, so any non-numeric token here is a marker.
        offset = 0
        if not parts[2].isdigit():
            # Found a marker, shift indices by 1
            offset = 1
            print(f"    📍 Detected flag '{parts[2]}' - adjusting parse indices")

        # Extract UTC time (first 2 parts: date and time)
        utc_time = f"{parts[0]} {parts[1]}"

        # Extract RA (parts offset+2 to offset+4: HH MM SS.ffffff)
        ra_icrf = f"{parts[2+offset]} {parts[3+offset]} {parts[4+offset]}"

        # Extract DEC (parts offset+5 to offset+7: +/-DD MM SS.fffff)
        dec_icrf = f"{parts[5+offset]} {parts[6+offset]} {parts[7+offset]}"

        # Extract motion and uncertainty values
        result = {
            'timestamp': timestamp,
            'observatory': observatory,
            'utc_time': utc_time,
            'ra_icrf': ra_icrf,
            'dec_icrf': dec_icrf,
            'dra_cosd': parts[8+offset],
            'ddec_dt': parts[9+offset],
            'ra_3sigma': parts[10+offset],
            'dec_3sigma': parts[11+offset],
            'smaa_3sig': parts[12+offset],
            'smia_3sig': parts[13+offset],
        }

        # Theta may not always be present (especially with flags)
        if len(parts) > 14+offset:
            result['theta'] = parts[14+offset]
        else:
            result['theta'] = 'N/A'
            print("    ⚠️ Theta value not found (using N/A)")

        return result

    except (IndexError, ValueError) as e:
        print(f"    ❌ Error parsing data for {timestamp} at {observatory}: {e}")
        print(f"    Debug: parts={len(parts)}, first 20 parts: {parts[:20]}")
        return None


def fetch_horizons_data(url: str, max_retries: int = 3) -> Optional[str]:
    """
    Fetch data from the Horizons API, retrying on network errors and non-200 responses.
    """
    for attempt in range(max_retries):
        try:
            response = requests.get(url, timeout=30)
            if response.status_code == 200:
                return response.text
            print(f"      Attempt {attempt + 1}/{max_retries}: HTTP {response.status_code}")
        except requests.RequestException as e:
            print(f"      Attempt {attempt + 1}/{max_retries} failed: {e}")

        # Pause before the next attempt, whether the failure was an exception or a bad status
        if attempt < max_retries - 1:
            time.sleep(2)

    return None


print("✓ Functions defined!")

✓ Functions defined!
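
The parsing step can be tried offline before any request is sent. The sketch below is a compact illustration of the same idea, not the notebook's `parse_horizons_response` itself: it pulls the row between `$$SOE` and `$$EOE` and separates an optional marker token from the numeric columns. The sample text is synthetic. Its position and rate values are copied from the run output shown later in this notebook, while the marker `Am`, the date layout, and the six trailing uncertainty numbers are placeholders. The name `split_data_row` is hypothetical.

```python
import re

# Synthetic response fragment; the trailing uncertainty columns are placeholders
SAMPLE = """\
$$SOE
 2025-Dec-19 00:10:29 Am  10 47 13.069808 +07 00 12.85815  -173.640  59.34382  0.1 0.1 0.1 0.05 45.0 0.01
$$EOE
"""


def split_data_row(response_text):
    """Return (marker_or_None, columns) for the first row between $$SOE and $$EOE."""
    match = re.search(r"\$\$SOE\s+(.*?)\s+\$\$EOE", response_text, re.DOTALL)
    if match is None:
        return None
    parts = match.group(1).split()
    marker = None
    # Assumption: RA hours are purely numeric, so a non-numeric token in
    # position 2 is a presence marker rather than data
    if len(parts) > 2 and not parts[2].isdigit():
        marker = parts.pop(2)
    return marker, parts


marker, columns = split_data_row(SAMPLE)
print("marker:", marker)
print("RA:", " ".join(columns[2:5]), "| DEC:", " ".join(columns[5:8]))
```

Removing the marker up front keeps the column indices the same for every row, so no offset has to be carried through the rest of the parser.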


## Step 3: Upload Your CSV File

Run this cell and select your `observations-timestamp-observatory-only.csv` file.

In [None]:
# Upload the CSV file
print("📁 Please select your CSV file...")
uploaded = files.upload()

# Get the filename
csv_filename = list(uploaded.keys())[0]
print(f"\n✓ Uploaded: {csv_filename}")

📁 Please select your CSV file...


Saving observations_timestamp_observatory_index_clean.csv to observations_timestamp_observatory_index_clean.csv

✓ Uploaded: observations_timestamp_observatory_index_clean.csv


## Step 4: Load and Preview Observations

In [None]:
# Read observations from CSV (expects 'timestamp' and 'observatory' columns)
with open(csv_filename, 'r', newline='') as f:
    reader = csv.DictReader(f)
    observations = list(reader)

print(f"📊 Loaded {len(observations)} observations\n")
print("First 5 observations:")
for i, obs in enumerate(observations[:5], 1):
    print(f"  {i}. {obs['timestamp']} at observatory {obs['observatory']}")

if len(observations) > 5:
    print(f"  ... and {len(observations) - 5} more")

üìä Loaded 114 observations

First 5 observations:
  1. 2025-12-19 00:10:29 at observatory B67
  2. 2025-12-19 00:12:51 at observatory B67
  3. 2025-12-19 00:15:12 at observatory B67
  4. 2025-12-19 00:37:56 at observatory D69
  5. 2025-12-19 01:43:00 at observatory D69
  ... and 109 more
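
Because every later step indexes rows by `timestamp` and `observatory`, it can help to check the file before starting a batch of requests. The following is a minimal sketch under the assumption that timestamps follow the `YYYY-MM-DD HH:MM:SS` layout seen in the preview above. The function name `validate_observations` is hypothetical, and the malformed row in the sample is invented to show the failure path.

```python
import csv
import io
from datetime import datetime

REQUIRED_COLUMNS = ("timestamp", "observatory")


def validate_observations(csv_text):
    """Split CSV rows into usable rows and (line_number, row) problem entries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    good, bad = [], []
    # The header is line 1, so data rows start at line 2
    for line_no, row in enumerate(reader, start=2):
        try:
            datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        except (TypeError, ValueError):
            bad.append((line_no, row))
            continue
        if not (row["observatory"] or "").strip():
            bad.append((line_no, row))
            continue
        good.append(row)
    return good, bad


sample = (
    "timestamp,observatory\n"
    "2025-12-19 00:10:29,B67\n"
    "2025-12-19 00:37:56,D69\n"
    "19/12/2025 01:43,D69\n"  # deliberately malformed timestamp
)
good, bad = validate_observations(sample)
print(len(good), "usable rows; problem lines:", [n for n, _ in bad])
```

In the notebook, the text of the uploaded file could be passed in place of `sample`, and only the `good` rows sent on to the fetch loop.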


## Step 5: Fetch and Parse All Data

This will take a few minutes. Progress will be shown below.

In [None]:
results = []
errors = []
total = len(observations)

print("🚀 Starting data collection...\n")
print("=" * 80)

for i, obs in enumerate(observations, 1):
    timestamp = obs['timestamp']
    observatory = obs['observatory']

    print(f"\n[{i}/{total}] {timestamp} at {observatory}")

    # Build URL
    url = build_horizons_url(timestamp, observatory)

    # Fetch data
    print("    Fetching...", end=' ')
    response = fetch_horizons_data(url)

    if response is None:
        print("❌ FAILED - Could not fetch data")
        errors.append({
            'timestamp': timestamp,
            'observatory': observatory,
            'error': 'Failed to fetch after retries'
        })
        continue

    # Parse data
    parsed = parse_horizons_response(response, timestamp, observatory)

    if parsed:
        results.append(parsed)
        print("✅ SUCCESS")
        print(f"    RA: {parsed['ra_icrf']}, DEC: {parsed['dec_icrf']}")
    else:
        errors.append({
            'timestamp': timestamp,
            'observatory': observatory,
            'error': 'Failed to parse response'
        })

    # Be nice to the API - small delay between requests
    if i < total:
        time.sleep(0.5)

print("\n" + "=" * 80)
print("\n✨ Processing Complete!")
print(f"   Successful: {len(results)}/{total}")
print(f"   Failed: {len(errors)}/{total}")

🚀 Starting data collection...


[1/114] 2025-12-19 00:10:29 at B67
    Fetching... ✅ SUCCESS
    RA: 10 47 13.069808, DEC: +07 00 12.85815

[2/114] 2025-12-19 00:12:51 at B67
    Fetching... ✅ SUCCESS
    RA: 10 47 12.609622, DEC: +07 00 15.19324

[3/114] 2025-12-19 00:15:12 at B67
    Fetching... ✅ SUCCESS
    RA: 10 47 12.152648, DEC: +07 00 17.51188

[4/114] 2025-12-19 00:37:56 at D69
    Fetching... ✅ SUCCESS
    RA: 10 47 07.718487, DEC: +07 00 39.76476

[5/114] 2025-12-19 01:43:00 at D69
    Fetching... ✅ SUCCESS
    RA: 10 46 55.050561, DEC: +07 01 43.94405

[6/114] 2025-12-19 01:54:35 at 213
    Fetching... ✅ SUCCESS
    RA: 10 46 52.838453, DEC: +07 01 55.86883

[7/114] 2025-12-19 02:09:33 at B74
    Fetching... ✅ SUCCESS
    RA: 10 46 49.924774, DEC: +07 02 10.61298

[8/114] 2025-12-19 02:12:34 at B74
    Fetching... ✅ SUCCESS
    RA: 10 46 49.336683, DEC: +07 02 13.58878

[9/114] 2025-12-19 02:30:46 at B74
    Fetching... ✅ SUCCESS
    RA: 10 46 45.78784
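
The batch above depends on `fetch_horizons_data`, which pauses a fixed 2 seconds before a retry. For longer runs, a growing delay between attempts is a common alternative. The sketch below shows that pattern in isolation. It is not part of the notebook: `fetch_with_backoff`, `fetch_once`, and `base_delay` are hypothetical names, and the fetch is simulated so the example runs without network access.

```python
import time


def fetch_with_backoff(fetch_once, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch_once() until it returns non-None, doubling the wait each time.

    fetch_once: zero-argument callable returning response text or None.
    sleep: injectable so the waiting can be replaced in tests.
    """
    for attempt in range(max_retries):
        result = fetch_once()
        if result is not None:
            return result
        if attempt < max_retries - 1:
            sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x, ... the base delay
    return None


# Simulated fetch: fails twice, then succeeds
responses = iter([None, None, "ephemeris text"])
waits = []
text = fetch_with_backoff(lambda: next(responses), sleep=waits.append)
print(text, waits)
```

In the notebook, `fetch_once` could wrap a single `requests.get` call for one URL; passing `sleep` as a parameter keeps the retry logic testable without real delays.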

## Step 6: Preview Results

In [None]:
if results:
    print("📋 First 5 results:\n")
    for i, result in enumerate(results[:5], 1):
        print(f"{i}. {result['timestamp']} at {result['observatory']}")
        print(f"   RA: {result['ra_icrf']}")
        print(f"   DEC: {result['dec_icrf']}")
        print(f"   dRA*cosD: {result['dra_cosd']}, d(DEC)/dt: {result['ddec_dt']}")
        print()
else:
    print("⚠️ No results to display")

📋 First 5 results:

1. 2025-12-19 00:10:29 at B67
   RA: 10 47 13.069808
   DEC: +07 00 12.85815
   dRA*cosD: -173.640, d(DEC)/dt: 59.34382

2. 2025-12-19 00:12:51 at B67
   RA: 10 47 12.609622
   DEC: +07 00 15.19324
   dRA*cosD: -173.650, d(DEC)/dt: 59.34347

3. 2025-12-19 00:15:12 at B67
   RA: 10 47 12.152648
   DEC: +07 00 17.51188
   dRA*cosD: -173.661, d(DEC)/dt: 59.34311

4. 2025-12-19 00:37:56 at D69
   RA: 10 47 07.718487
   DEC: +07 00 39.76476
   dRA*cosD: -173.738, d(DEC)/dt: 59.33344

5. 2025-12-19 01:43:00 at D69
   RA: 10 46 55.050561
   DEC: +07 01 43.94405
   dRA*cosD: -173.973, d(DEC)/dt: 59.31812
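
The two rate columns can be combined into a total plane-of-sky speed and a direction of motion. This is a minimal sketch, assuming both rates share the same units (arcseconds per hour is assumed here) and that dRA*cosD already includes the cos(DEC) factor, as its name suggests. The function name `sky_motion` is hypothetical; the input values are the first row of the preview above.

```python
import math


def sky_motion(dra_cosd, ddec_dt):
    """Return (total rate, position angle in degrees).

    The position angle is measured from north (+DEC) through east (+RA),
    in the range [0, 360).
    """
    rate = math.hypot(dra_cosd, ddec_dt)
    position_angle = math.degrees(math.atan2(dra_cosd, ddec_dt)) % 360.0
    return rate, position_angle


rate, pa = sky_motion(-173.640, 59.34382)
print(f"{rate:.2f} arcsec/h at position angle {pa:.1f} deg")
```

A negative dRA*cosD with a positive d(DEC)/dt gives a position angle between 270 and 360 degrees, which corresponds to motion toward the north-west.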



## Step 7: Save Results to CSV

In [None]:
if results:
    output_filename = 'horizons_results_atlas_3i_fixed.csv'

    fieldnames = [
        'timestamp', 'observatory', 'utc_time',
        'ra_icrf', 'dec_icrf',
        'dra_cosd', 'ddec_dt',
        'ra_3sigma', 'dec_3sigma',
        'smaa_3sig', 'smia_3sig', 'theta'
    ]

    # Write to CSV
    with open(output_filename, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(results)

    print(f"✓ Results saved to {output_filename}")
    print("\n📥 Downloading...")
    files.download(output_filename)
    print("✓ Download complete!")
else:
    print("⚠️ No results to save")

✓ Results saved to horizons_results_atlas_3i_fixed.csv

📥 Downloading...


✓ Download complete!


## Step 8: Save Error Log (if any errors occurred)

In [None]:
if errors:
    error_filename = 'horizons_error_log.txt'

    with open(error_filename, 'w') as f:
        f.write("=" * 70 + "\n")
        f.write("HORIZONS BATCH FETCHER ERROR LOG\n")
        f.write("=" * 70 + "\n\n")
        f.write(f"Total Observations: {total}\n")
        f.write(f"Successful: {len(results)}\n")
        f.write(f"Failed: {len(errors)}\n\n")
        f.write("=" * 70 + "\n")
        f.write("DETAILED ERROR LIST\n")
        f.write("=" * 70 + "\n\n")

        for i, error in enumerate(errors, 1):
            f.write(f"Error #{i}\n")
            f.write(f"  Timestamp: {error['timestamp']}\n")
            f.write(f"  Observatory: {error['observatory']}\n")
            f.write(f"  Error: {error['error']}\n")
            f.write("-" * 70 + "\n\n")

    print(f"⚠️ {len(errors)} errors occurred")
    print(f"✓ Error log saved to {error_filename}")
    print("\n📥 Downloading error log...")
    files.download(error_filename)
    print("✓ Download complete!")
else:
    print("✓ No errors - all observations processed successfully!")

✓ No errors - all observations processed successfully!


## 🎉 All Done!

Your results have been downloaded:
- `horizons_results_atlas_3i_fixed.csv` - Your compiled ephemeris data
- `horizons_error_log.txt` - Error log (if any failures occurred)

You can now use this data for your orbital refinement analysis!