# Binary Star Fitting: Curve Fit

This tutorial demonstrates the challenges of fitting incomplete astronomical data using binary star radial velocity measurements as an example. We'll explore:

1. **Theoretical Background**: Understanding binary star systems and radial velocity variations
2. **Curve Fitting**: Using `scipy.optimize.curve_fit` 
3. **Incomplete Data Challenge**: How missing observations can lead to different solutions


## 1. Theoretical Background: Binary Star Systems

Binary star systems consist of two stars orbiting their common center of mass. When we observe one star (the primary), its radial velocity (line-of-sight velocity) varies periodically due to the gravitational influence of its companion.

### Key Parameters:
- **Period (P)**: Orbital period in days
- **Eccentricity (e)**: Shape of the orbit (between 0 and 1, 0 = circular, 1 = parabolic)
- **Argument of periastron (ω)**: Orientation of the orbit
- **Time of periastron passage (T₀)**: When the star is closest to the companion
- **Velocity amplitude (K)**: Maximum radial velocity variation
- **Systemic velocity (γ)**: Average velocity of the system

### Radial Velocity Equation:
The radial velocity of the primary star is given by:

$$v_r(t) = \gamma + K[\cos(\omega + f(t)) + e\cos(\omega)]$$

where $f(t)$ is the true anomaly, which depends on the orbital parameters and time.

This model has six free parameters!
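
For reference, the dependence of $f(t)$ on time follows from three standard relations of Keplerian motion, written with the same symbols as above. Here $M$ (mean anomaly) and $E$ (eccentric anomaly) are intermediate quantities that do not appear in the parameter list:

$$M(t) = \frac{2\pi}{P}\,(t - T_0), \qquad M = E - e\sin E, \qquad \tan\frac{f}{2} = \sqrt{\frac{1+e}{1-e}}\,\tan\frac{E}{2}$$

The middle relation (Kepler's equation) has no closed-form solution for $E$ when $e > 0$, so it has to be solved numerically at every time step.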


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
# import emcee
# import corner
from astropy import units as u
from astropy.constants import G
import warnings
warnings.filterwarnings('ignore')

# Set up plotting
plt.style.use('default')
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['font.size'] = 12

# Set random seed for reproducibility
np.random.seed(42)


## 2. Radial Velocity Model Implementation

Let's implement the radial velocity model for a binary star system. You are not required, but are encouraged, to work out what each function does. The function `radial_velocity_model()` returns the radial velocities at the supplied times for the six model parameters. In principle, this is the same as the linear model case, except that there are four more free parameters.


In [None]:
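def solve_kepler_equation(E, e):
    """Evaluate Kepler's equation M = E - e*sin(E): return M for a given E.

    Note: despite its name, this does not invert the equation;
    radial_velocity_model() solves for E numerically.
    """
    M = E - e * np.sin(E)
    return M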

def get_true_anomaly(E, e):
    """Calculate true anomaly from eccentric anomaly"""
    f = 2 * np.arctan(np.sqrt((1 + e) / (1 - e)) * np.tan(E / 2))
    return f

def radial_velocity_model(t, P, e, omega, T0, K, gamma):
    """
    Calculate radial velocity for a binary star system
    
    Parameters:
    -----------
    t : array_like
        Time array
    P : float
        Orbital period in days
    e : float
        Eccentricity (0-1)
    omega : float
        Argument of periastron in radians
    T0 : float
        Time of periastron passage
    K : float
        Velocity amplitude in km/s
    gamma : float
        Systemic velocity in km/s
    
    Returns:
    --------
    v_r : array_like
        Radial velocity in km/s
    """
    # Mean anomaly
    M = 2 * np.pi * (t - T0) / P
    
    # Solve for eccentric anomaly using Newton-Raphson
    E = M.copy()  # Initial guess
    for _ in range(10):  # Max iterations
        E_new = E - (E - e * np.sin(E) - M) / (1 - e * np.cos(E))
        if np.allclose(E, E_new, rtol=1e-10):
            break
        E = E_new
    
    # True anomaly
    f = get_true_anomaly(E, e)
    
    # Radial velocity
    v_r = gamma + K * (np.cos(omega + f) + e * np.cos(omega))
    
    return v_r

# Test the model with some example parameters
t_test = np.linspace(0, 500, 50)
# Feel free to play around with these parameters to see how they affect the radial velocity
P_test, e_test, omega_test, T0_test, K_test, gamma_test = 150.0, 0.3, np.pi/3, 30.0, 20.0, 50.0

v_r_test = radial_velocity_model(t_test, P_test, e_test, omega_test, T0_test, K_test, gamma_test)

In [None]:
# Plot the results
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(t_test, v_r_test, c='b', ls='None', marker='o', linewidth=2)
plt.xlabel('Time (days)')
plt.ylabel('Radial Velocity (km/s)')
plt.title('Example Binary Star Radial Velocity Curve')
plt.grid(True, alpha=0.3)

plt.subplot(1, 2, 2)
plt.plot(t_test % P_test, v_r_test, c='b', ls='None', marker='o', linewidth=2)
plt.xlabel('Time modulo period (days)')
plt.ylabel('Radial Velocity (km/s)')
plt.title('Folded Radial Velocity Curve')
plt.grid(True, alpha=0.3)
# plt.tight_layout()
plt.show()

print("Model parameters:")
print(f"Period: {P_test} days")
print(f"Eccentricity: {e_test}")
print(f"Omega: {omega_test:.3f} rad ({np.degrees(omega_test):.1f}°)")
print(f"T0: {T0_test} days")
print(f"K: {K_test} km/s")
print(f"Gamma: {gamma_test} km/s")
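
The Newton-Raphson step used inside `radial_velocity_model()` can be checked in isolation. The sketch below is a stand-alone helper, not part of the tutorial's code: the name `solve_eccentric_anomaly` and its tolerance and iteration settings are assumptions made for illustration. It solves Kepler's equation on a grid of mean anomalies and reports how well the solution satisfies $M = E - e\sin E$.

```python
import numpy as np

def solve_eccentric_anomaly(M, e, tol=1e-12, max_iter=50):
    """Solve M = E - e*sin(E) for E with Newton-Raphson (illustrative helper)."""
    M = np.asarray(M, dtype=float)
    E = M.copy()  # same starting guess as radial_velocity_model()
    for _ in range(max_iter):
        step = (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
        E = E - step
        if np.max(np.abs(step)) < tol:
            break  # every element has converged
    return E

# Residual of Kepler's equation over one full orbit for a few eccentricities
M_grid = np.linspace(0.0, 2.0 * np.pi, 200)
for e in (0.0, 0.3, 0.6):
    E = solve_eccentric_anomaly(M_grid, e)
    residual = np.max(np.abs(E - e * np.sin(E) - M_grid))
    print(f"e = {e:.1f}: max |E - e sin E - M| = {residual:.2e}")
```

The model above caps the loop at 10 iterations; for orbits with eccentricity close to 1 the solver can need more iterations or a better starting guess, so a residual check like this one is a cheap safeguard.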


## 3. Import and fit a set of observed radial velocities

Now let's try to fit some data where we do not know the true model parameters. Using what you learned with the linear model, perform a fit with curve_fit.


In [None]:
# Import the mock data
binary_data = pd.read_csv('binary_measurements_complete.csv')
# The data contains three columns: time, vlos, and error
binary_data

In [None]:
# Take a look at the data to understand what we are working with
plt.figure(figsize=(14, 5))

plt.errorbar(binary_data['time'], binary_data['vlos'], yerr=binary_data['error'], fmt='ko', markersize=3, 
             alpha=0.4)
plt.xlabel('Time (days)')
plt.ylabel('Radial Velocity (km/s)')
plt.grid(True, alpha=0.3)

# plt.tight_layout()
plt.show()

Looking at the data, you may notice that some model parameters (e.g. K, the velocity amplitude, and gamma, the systemic velocity) can already be estimated by eye. But we can do better with `curve_fit`!
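
The "eyeball" estimates can be made explicit. Since $\cos(\omega + f)$ sweeps the full range $[-1, 1]$ over one orbit, the peak-to-peak range of the curve is $2K$, and the midpoint of the range is $\gamma + Ke\cos\omega$, which equals $\gamma$ only for a circular orbit. The sketch below demonstrates this on a noise-free toy curve; the helper name `eyeball_K_gamma` and the toy parameter values are assumptions for illustration.

```python
import numpy as np

def eyeball_K_gamma(v):
    """Rough K and gamma from the extremes of a velocity series (illustrative)."""
    v = np.asarray(v, dtype=float)
    K_est = 0.5 * (v.max() - v.min())      # half the peak-to-peak range
    gamma_est = 0.5 * (v.max() + v.min())  # midpoint of the range
    return K_est, gamma_est

# Noise-free toy curve sampled densely in true anomaly over one orbit
K_true, gamma_true, e_true, omega_true = 20.0, 50.0, 0.3, np.pi / 3
f = np.linspace(0.0, 2.0 * np.pi, 2001)
v = gamma_true + K_true * (np.cos(omega_true + f) + e_true * np.cos(omega_true))

K_est, gamma_est = eyeball_K_gamma(v)
print(f"K estimate: {K_est:.3f} km/s")
print(f"gamma estimate minus true gamma: {gamma_est - gamma_true:.3f} km/s")
print(f"K * e * cos(omega): {K_true * e_true * np.cos(omega_true):.3f} km/s")
```

On real measurements, e.g. `eyeball_K_gamma(binary_data['vlos'])`, noise tends to inflate the range and gaps in coverage can hide the extremes, so these numbers are only starting values for the fit.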

In [None]:
# Pick some initial guess (close to true parameters)
p0_1 = [
    200,      # Period: eyeball it
    0.5,       # Eccentricity: pick between 0-1
    np.pi,   # Omega: pick between 0-2pi
    20,      # T0: random guess
    20,       # K: eyeball it
    50    # Gamma: eyeball it
]

# Parameter bounds for curve_fit; pick something sensible
bounds = (
    [10, 0, 0, 0, 5, -100],      # Lower bounds
    [500, 0.99, 2*np.pi, 50, 50, 100]  # Upper bounds
)

print("Initial guess:")
param_names = ['P', 'e', 'omega', 'T0', 'K', 'gamma']
for i, name in enumerate(param_names):
    if name == 'omega':
        print(f"{name}: {p0_1[i]:.3f} rad ({np.degrees(p0_1[i]):.1f}°)")
    else:
        print(f"{name}: {p0_1[i]:.3f}")

# Perform the fit
popt_1, pcov_1 = curve_fit(
    radial_velocity_model, binary_data['time'], binary_data['vlos'], 
    p0=p0_1, sigma=binary_data['error'], 
    bounds=bounds, maxfev=10000
)

# Calculate uncertainties
perr_1 = np.sqrt(np.diag(pcov_1))

print("\nFit results:")
print("Parameter | Best Fit | Uncertainty")
print("-" * 60)

for i, name in enumerate(param_names):
    fit_val = popt_1[i]
    err_val = perr_1[i]
    
    if name == 'omega':
        # Also report omega in degrees, as in the initial-guess printout
        print(f"{name:8s} | {fit_val:8.3f} | {err_val:8.3f}  (rad; {np.degrees(fit_val):.1f}°)")
    else:
        print(f"{name:8s} | {fit_val:8.3f} | {err_val:8.3f}")
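
A note on the uncertainties from `np.sqrt(np.diag(pcov))`: by default (`absolute_sigma=False`), `curve_fit` treats `sigma` as relative weights and rescales the covariance matrix by the reduced chi-square of the fit. Passing `absolute_sigma=True` uses the error bars as given. The sketch below shows the difference on a simple linear model with deliberately overestimated error bars; the model `line`, the data, and the factor of two are made up for illustration. Which setting is appropriate depends on how much the quoted errors are trusted.

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, m, b):
    return m * x + b

# Toy data: true scatter 0.5, but error bars quoted twice as large
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 40)
true_err = 0.5
y = line(x, 2.0, 1.0) + rng.normal(0.0, true_err, x.size)
quoted_err = np.full(x.size, 2.0 * true_err)

# Default behaviour: sigma acts as relative weights, pcov is rescaled
popt_rel, pcov_rel = curve_fit(line, x, y, sigma=quoted_err)
# Error bars taken at face value
popt_abs, pcov_abs = curve_fit(line, x, y, sigma=quoted_err, absolute_sigma=True)

chi2_red = np.sum(((y - line(x, *popt_rel)) / quoted_err) ** 2) / (x.size - 2)
print("reduced chi-square:", chi2_red)
print("perr (relative sigma):", np.sqrt(np.diag(pcov_rel)))
print("perr (absolute sigma):", np.sqrt(np.diag(pcov_abs)))
```

The best-fit parameters are the same in both cases; only the reported uncertainties differ, by a factor of the square root of the reduced chi-square.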


Let's examine in the figure how well the best-fit model from `curve_fit` describes the data.

In [None]:
# Plot the results
plt.figure(figsize=(14, 5))

t_plot = np.linspace(0, 500, 1000)
v_r_fit_plot = radial_velocity_model(t_plot, *popt_1)
plt.errorbar(binary_data['time'], binary_data['vlos'], yerr=binary_data['error'], fmt='ko', markersize=3, 
             alpha=0.4,label='Data')
plt.plot(t_plot, v_r_fit_plot, 'r--', linewidth=2, label='Best fit')
plt.xlabel('Time (days)')
plt.ylabel('Radial Velocity (km/s)')
plt.grid(True, alpha=0.3)
plt.legend()

# plt.tight_layout()
plt.show()

Looks great!
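
A visual check can be backed by a number. One common choice is the reduced chi-square: the sum of squared, error-normalized residuals divided by the number of degrees of freedom (data points minus fitted parameters). Values near 1 suggest the model matches the data to within the quoted errors. The helper name `reduced_chi_square` and the toy data below are assumptions for illustration.

```python
import numpy as np

def reduced_chi_square(v_obs, v_model, v_err, n_params):
    """Chi-square per degree of freedom for a fitted model (illustrative helper)."""
    v_obs, v_model, v_err = (np.asarray(a, dtype=float) for a in (v_obs, v_model, v_err))
    chi2 = np.sum(((v_obs - v_model) / v_err) ** 2)
    dof = v_obs.size - n_params  # degrees of freedom
    return chi2 / dof

# Toy check: data scattered around a known model with the stated errors
rng = np.random.default_rng(1)
v_model = np.full(2000, 50.0)
v_err = np.full(2000, 2.0)
v_obs = v_model + rng.normal(0.0, 2.0, 2000)
print(f"reduced chi-square: {reduced_chi_square(v_obs, v_model, v_err, n_params=6):.3f}")
```

In this notebook the same helper could be called as `reduced_chi_square(binary_data['vlos'], radial_velocity_model(binary_data['time'], *popt_1), binary_data['error'], n_params=6)`.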

## 4. Challenges in Incomplete Data Scenario

Unfortunately, real astronomical observations rarely come at such uniformly sampled time steps. Situations such as the following leave gaps in the data:
- Weather prevents observations
- Telescope time is limited
- Seasonal constraints exist
- Equipment failures occur

Let's now consider the same dataset, but with some of the observations missing.
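
To get a feel for how such gaps arise, the sketch below thins an evenly sampled time series with a seasonal visibility window and random weather losses. All numbers (season length, loss fraction, sampling) are hypothetical and are not taken from the tutorial's data files, whose construction is not described here.

```python
import numpy as np

rng = np.random.default_rng(2)
t_full = np.linspace(0.0, 500.0, 101)  # evenly spaced epochs, 5 days apart

# Hypothetical observing constraints
in_season = (t_full % 250.0) < 120.0            # target visible 120 of every 250 days
good_weather = rng.random(t_full.size) > 0.3    # lose roughly 30% of epochs at random
keep = in_season & good_weather

t_obs = t_full[keep]
print(f"kept {t_obs.size} of {t_full.size} epochs")
print(f"largest gap: {np.diff(t_obs).max():.0f} days")
```

Seasonal gaps like these are especially harmful when their length is comparable to the orbital period, because whole sections of the orbit are then never observed.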

In [None]:
# Import the mock data with missing observations; the data structure is the same
binary_data_inc = pd.read_csv('binary_measurements_observed.csv')

In [None]:
# Take a look at the data to understand what we are working with
plt.figure(figsize=(14, 5))

plt.errorbar(binary_data['time'], binary_data['vlos'], yerr=binary_data['error'], fmt='ko', markersize=3, 
             alpha=0.4, label='Complete data')
plt.errorbar(binary_data_inc['time'], binary_data_inc['vlos'], yerr=binary_data_inc['error'], fmt='bo', markersize=5, 
             alpha=0.4, label='Observed (incomplete) data')
plt.legend()
plt.xlabel('Time (days)')
plt.ylabel('Radial Velocity (km/s)')
plt.grid(True, alpha=0.3)

# plt.tight_layout()
plt.show()

Repeat the exercise above; what do you find? Try changing your initial guess and see how that affects the results.

In [None]:
# Pick some initial guess (close to true parameters)
p0_2 = [
    300,      # Period: eyeball it
    0.5,       # Eccentricity: pick between 0-1
    np.pi,   # Omega: pick between 0-2pi
    20,      # T0: random guess
    20,       # K: eyeball it
    50    # Gamma: eyeball it
]

# Parameter bounds for curve_fit; pick something sensible
bounds = (
    [10, 0, 0, 0, 5, -100],      # Lower bounds
    [500, 0.99, 2*np.pi, 50, 50, 100]  # Upper bounds
)

print("Initial guess:")
param_names = ['P', 'e', 'omega', 'T0', 'K', 'gamma']
for i, name in enumerate(param_names):
    if name == 'omega':
        print(f"{name}: {p0_2[i]:.3f} rad ({np.degrees(p0_2[i]):.1f}°)")
    else:
        print(f"{name}: {p0_2[i]:.3f}")

# Perform the fit
popt_2, pcov_2 = curve_fit(
    radial_velocity_model, binary_data_inc['time'], binary_data_inc['vlos'], 
    p0=p0_2, sigma=binary_data_inc['error'], 
    bounds=bounds, maxfev=10000
)

# Calculate uncertainties
perr_2 = np.sqrt(np.diag(pcov_2))

print("\nFit results:")
print("Parameter | Best Fit | Uncertainty")
print("-" * 60)

for i, name in enumerate(param_names):
    fit_val = popt_2[i]
    err_val = perr_2[i]
    
    if name == 'omega':
        # Also report omega in degrees, as in the initial-guess printout
        print(f"{name:8s} | {fit_val:8.3f} | {err_val:8.3f}  (rad; {np.degrees(fit_val):.1f}°)")
    else:
        print(f"{name:8s} | {fit_val:8.3f} | {err_val:8.3f}")


In [None]:
# Plot the results
plt.figure(figsize=(14, 5))

t_plot = np.linspace(0, 500, 1000)
v_r_fit_plot = radial_velocity_model(t_plot, *popt_2)
plt.errorbar(binary_data_inc['time'], binary_data_inc['vlos'], yerr=binary_data_inc['error'], fmt='ko', markersize=3, 
             alpha=0.4,label='Data')
plt.plot(t_plot, v_r_fit_plot, 'r--', linewidth=2, label='Best fit')
plt.xlabel('Time (days)')
plt.ylabel('Radial Velocity (km/s)')
plt.grid(True, alpha=0.3)
plt.legend()

# plt.tight_layout()
plt.show()

#### The curve_fit function can give completely different results with different initial guesses, especially when the data are incomplete and you do not have a good idea of the true parameters. How do astronomers deal with that?
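
One simple way to expose this sensitivity is a multi-start fit: rerun `curve_fit` from several initial periods and compare the resulting chi-square values. This is a sketch of a diagnostic, not necessarily the approach this tutorial continues with. To keep it self-contained, `rv_model` is a compact stand-in for `radial_velocity_model()`, and the "true" parameters, sampling pattern, and noise level are made up rather than read from the tutorial's files.

```python
import numpy as np
from scipy.optimize import curve_fit

def rv_model(t, P, e, omega, T0, K, gamma):
    """Compact stand-in for radial_velocity_model() so this sketch runs on its own."""
    M = 2.0 * np.pi * (np.asarray(t, dtype=float) - T0) / P
    E = M + e * np.sin(M)  # starting guess for Newton-Raphson
    for _ in range(30):    # fixed number of Newton-Raphson steps
        E = E - (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
    f = 2.0 * np.arctan2(np.sqrt(1 + e) * np.sin(E / 2), np.sqrt(1 - e) * np.cos(E / 2))
    return gamma + K * (np.cos(omega + f) + e * np.cos(omega))

# Synthetic data in two short observing windows (hypothetical values)
rng = np.random.default_rng(3)
truth = (150.0, 0.3, np.pi / 3, 30.0, 20.0, 50.0)
t = np.sort(np.concatenate([rng.uniform(0, 60, 12), rng.uniform(300, 360, 12)]))
err = np.full(t.size, 1.0)
v = rv_model(t, *truth) + rng.normal(0.0, err)

bounds = ([10, 0, 0, 0, 5, -100], [500, 0.9, 2 * np.pi, 50, 50, 100])

results = []
for P0 in (60.0, 100.0, 150.0, 200.0, 300.0, 400.0):
    p0 = [P0, 0.5, np.pi, 20.0, 20.0, 50.0]
    try:
        popt, _ = curve_fit(rv_model, t, v, p0=p0, sigma=err, bounds=bounds, maxfev=10000)
    except (RuntimeError, ValueError):
        continue  # skip starts where the optimizer gives up
    chi2 = np.sum(((v - rv_model(t, *popt)) / err) ** 2)
    results.append((chi2, P0, popt))

# List the solutions from best to worst chi-square
for chi2, P0, popt in sorted(results, key=lambda r: r[0]):
    print(f"start P = {P0:5.0f} d -> fitted P = {popt[0]:7.2f} d, chi2 = {chi2:8.2f}")
```

If several starts land on clearly different periods with comparable chi-square, the data alone do not single out one solution, and the uncertainties reported by any single fit should not be taken at face value.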