# Nonlinear Fitting: Handling Errors in Both x and y

This notebook shows you how to fit data when you have uncertainties in both your x and y measurements.

## The Problem

Typical fitting routines such as `scipy.optimize.curve_fit` accept uncertainties only on the dependent (y) variable, through the `sigma` argument, and treat the x values as exact. In many experiments, however, both x and y carry measurement uncertainty.

One routine that allows you to handle errors on both x-axis and y-axis variables is **Orthogonal Distance Regression (ODR)** from `scipy.odr`.

This notebook compares both approaches so you can see the difference.
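To make the difference concrete, the two objective functions can be written side by side. This is a sketch in notation that the notebook itself does not use: $f(x;\beta)$ stands for the fit function with parameters $\beta$, $\sigma_{x,i}$ and $\sigma_{y,i}$ are the per-point uncertainties, and $\delta_i$, $\varepsilon_i$ are the fitted corrections to each x and y value. It assumes the uncertainties are passed as `sigma` to `curve_fit` and as `sx`, `sy` to `odr.RealData`.

```latex
% curve_fit with sigma: only vertical distances are penalised
\chi^2_{\text{curve\_fit}}(\beta) = \sum_i \left( \frac{y_i - f(x_i;\beta)}{\sigma_{y,i}} \right)^2

% ODR with sx, sy: each point may also shift in x by \delta_i
S_{\text{ODR}}(\beta,\delta) = \sum_i \left[ \left( \frac{\varepsilon_i}{\sigma_{y,i}} \right)^2
  + \left( \frac{\delta_i}{\sigma_{x,i}} \right)^2 \right],
\qquad \varepsilon_i = f(x_i + \delta_i;\beta) - y_i
```

When every $\sigma_{x,i}$ is very small, any nonzero $\delta_i$ is heavily penalised and the second expression reduces to the first.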

## Setup

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy import odr

## Step 1: Enter Your Data

Replace these arrays with your own experimental data:

In [None]:
# Your measurements
x_data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y_data = np.array([2.1, 6.8, 12.3, 17.5, 23.2, 28.6, 33.9, 39.8])

# Uncertainties (errors) on your measurements
# Note: x errors are significant compared to y errors
x_err = np.array([0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5])
y_err = np.array([0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3])

## Step 2: Define Your Fit Function

Replace the function in the cell below with whatever model you expect to describe your data, and give `initial_guess` one entry per parameter. The example uses a straight line.

**Common alternatives:**
- Exponential: `def fit_func(x, a, b): return a * np.exp(b * x)`
- Power law: `def fit_func(x, a, b): return a * x**b`
- Quadratic: `def fit_func(x, a, b, c): return a + b*x + c*x**2`

In [None]:
# Define your fit function: y = a + b*x
def fit_func(x, a, b):
    return a + b * x

# Initial guesses for the parameters [a, b]
initial_guess = [1.0, 2.0]

## Method 1: Standard Fitting with `curve_fit` (y errors only)

This is the standard approach that only considers uncertainties in y:

In [None]:
popt, pcov = curve_fit(fit_func, x_data, y_data, 
                       p0=initial_guess, 
                       sigma=y_err, 
                       absolute_sigma=True)

# Extract parameter uncertainties
perr = np.sqrt(np.diag(pcov))

print("CURVE_FIT RESULTS (y errors only)")
print("=" * 50)
print("Parameter 0 (a):", popt[0], "±", perr[0])
print("Parameter 1 (b):", popt[1], "±", perr[1])
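A quick goodness-of-fit check for this method is the reduced chi-square, which should be close to 1 if the model is adequate and the y errors are realistic. The sketch below is an addition, not part of the original notebook; it repeats the example data so it runs on its own, and the names `chi2`, `dof` and `chi2_red` are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

# Example data from Step 1, repeated so this cell is self-contained
x_data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y_data = np.array([2.1, 6.8, 12.3, 17.5, 23.2, 28.6, 33.9, 39.8])
y_err = np.full_like(y_data, 0.3)

def fit_func(x, a, b):
    return a + b * x

popt, pcov = curve_fit(fit_func, x_data, y_data, p0=[1.0, 2.0],
                       sigma=y_err, absolute_sigma=True)

# Weighted sum of squared vertical residuals
chi2 = np.sum(((y_data - fit_func(x_data, *popt)) / y_err) ** 2)
# Degrees of freedom: number of points minus number of fitted parameters
dof = len(x_data) - len(popt)
chi2_red = chi2 / dof

print("chi-square:", chi2)
print("degrees of freedom:", dof)
print("reduced chi-square:", chi2_red)
```

Note that this statistic ignores `x_err` entirely, which is exactly the limitation Method 2 addresses.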

## Method 2: ODR Fitting (x and y errors)

ODR accounts for uncertainties in both x and y, which are passed to `odr.RealData` as `sx` and `sy`:

In [None]:
# scipy.odr calls the model as f(params, x): parameter vector first, then x.
# Wrapping fit_func keeps both methods on the same model definition.
def odr_func(params, x):
    return fit_func(x, *params)

# Set up the ODR fit
model = odr.Model(odr_func)
data = odr.RealData(x_data, y_data, sx=x_err, sy=y_err)
odr_obj = odr.ODR(data, model, beta0=initial_guess)
output = odr_obj.run()

print("\nODR RESULTS (x and y errors)")
print("=" * 50)
print("Parameter 0 (a):", output.beta[0], "±", output.sd_beta[0])
print("Parameter 1 (b):", output.beta[1], "±", output.sd_beta[1])
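One caveat when comparing these uncertainties with the `curve_fit` ones: to my knowledge, `output.sd_beta` is scaled by the residual variance `output.res_var`, whereas `output.cov_beta` is not. That makes `sd_beta` behave like `curve_fit` with `absolute_sigma=False`, not like the `absolute_sigma=True` call in Method 1. Treat this as an assumption about SciPy's behaviour and verify it on your installation; the sketch below prints both versions so you can check. The names `sd_unscaled` and `sd_scaled` are illustrative.

```python
import numpy as np
from scipy import odr

# Example data from Step 1, repeated so this cell is self-contained
x_data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y_data = np.array([2.1, 6.8, 12.3, 17.5, 23.2, 28.6, 33.9, 39.8])
x_err = np.full_like(x_data, 0.5)
y_err = np.full_like(y_data, 0.3)

def odr_func(params, x):
    return params[0] + params[1] * x

data = odr.RealData(x_data, y_data, sx=x_err, sy=y_err)
output = odr.ODR(data, odr.Model(odr_func), beta0=[1.0, 2.0]).run()

# Uncertainties taken directly from the covariance matrix
# (comparable to curve_fit with absolute_sigma=True)
sd_unscaled = np.sqrt(np.diag(output.cov_beta))
# The same values multiplied by the residual variance
sd_scaled = np.sqrt(np.diag(output.cov_beta) * output.res_var)

print("residual variance:", output.res_var)
print("sqrt(diag(cov_beta)):", sd_unscaled)
print("sqrt(diag(cov_beta) * res_var):", sd_scaled)
print("output.sd_beta:", output.sd_beta)
```

If the residual variance is far from 1, the two sets of uncertainties differ noticeably, and which one is appropriate depends on whether you trust the quoted `x_err` and `y_err` as absolute values.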

## Visualise the Fits

Compare how well each method fits your data:

In [None]:
# Generate smooth curve for plotting
x_smooth = np.linspace(x_data.min(), x_data.max(), 100)
# Unpack the parameter arrays so this also works for models with more than two parameters
y_fit_curve = fit_func(x_smooth, *popt)
y_fit_odr = fit_func(x_smooth, *output.beta)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

# Plot 1: curve_fit result
ax1.errorbar(x_data, y_data, xerr=x_err, yerr=y_err, 
             fmt='o', label='Data', capsize=3)
ax1.plot(x_smooth, y_fit_curve, 'r-', label='curve_fit', linewidth=2)
ax1.set_xlabel('x')
ax1.set_ylabel('y')
ax1.set_title('Standard Fit (y errors only)')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Plot 2: ODR result
ax2.errorbar(x_data, y_data, xerr=x_err, yerr=y_err, 
             fmt='o', label='Data', capsize=3)
ax2.plot(x_smooth, y_fit_odr, 'b-', label='ODR fit', linewidth=2)
ax2.set_xlabel('x')
ax2.set_ylabel('y')
ax2.set_title('ODR Fit (x and y errors)')
ax2.legend()
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## Check the Residuals

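Residuals show how far each data point lies from the fitted curve, measured vertically (in y). A good fit has residuals scattered randomly around zero with no trend in x. The error bars below show `y_err` only, so for the ODR fit they understate the total uncertainty on each point: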

In [None]:
# Vertical residuals for each fit; parameters are unpacked so any number of them works
residuals_curve = y_data - fit_func(x_data, *popt)
residuals_odr = y_data - fit_func(x_data, *output.beta)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

ax1.errorbar(x_data, residuals_curve, yerr=y_err, fmt='o', capsize=3)
ax1.axhline(0, color='r', linestyle='--', alpha=0.5)
ax1.set_xlabel('x')
ax1.set_ylabel('Residuals')
ax1.set_title('curve_fit Residuals')
ax1.grid(True, alpha=0.3)

ax2.errorbar(x_data, residuals_odr, yerr=y_err, fmt='o', capsize=3)
ax2.axhline(0, color='b', linestyle='--', alpha=0.5)
ax2.set_xlabel('x')
ax2.set_ylabel('Residuals')
ax2.set_title('ODR Residuals')
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
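The vertical residuals above are only part of the picture for ODR, which also adjusts each point in x. The ODR output carries its own estimates of those adjustments in `output.delta` (x) and `output.eps` (y). The sketch below, which is an addition to the notebook, divides them by the quoted uncertainties to see whether any point needed an implausibly large shift. It repeats the example data so it runs on its own, and the names `pull_x` and `pull_y` are illustrative.

```python
import numpy as np
from scipy import odr

# Example data from Step 1, repeated so this cell is self-contained
x_data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y_data = np.array([2.1, 6.8, 12.3, 17.5, 23.2, 28.6, 33.9, 39.8])
x_err = np.full_like(x_data, 0.5)
y_err = np.full_like(y_data, 0.3)

def odr_func(params, x):
    return params[0] + params[1] * x

data = odr.RealData(x_data, y_data, sx=x_err, sy=y_err)
output = odr.ODR(data, odr.Model(odr_func), beta0=[1.0, 2.0]).run()

# Estimated shifts in x and y, expressed in units of the quoted uncertainty
pull_x = output.delta / x_err
pull_y = output.eps / y_err

for xi, px, py in zip(x_data, pull_x, pull_y):
    print(f"x = {xi:4.1f}   x shift / x_err = {px:+.3f}   y shift / y_err = {py:+.3f}")
```

Values with magnitude well above 1 would suggest an outlier or underestimated errors; values that are all much smaller than 1 would suggest the quoted errors are larger than the actual scatter.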

## When Does It Matter?

The difference between `curve_fit` and ODR becomes significant when:
- Your x errors are comparable to or larger than your y errors
- Your fitted line/curve has a steep slope, since an x error of σx shifts y by roughly |dy/dx|·σx
- You need accurate parameter uncertainties for further analysis

For data with negligible x errors, both methods give similar results.
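A rough way to judge whether the x errors matter for your data is the "effective variance" approximation, which folds each x error into an equivalent y error using the local slope: σ_eff² = σ_y² + (dy/dx)²·σ_x². This is a different technique from ODR and is not part of the original notebook; it is sketched here only as a quick diagnostic. It assumes the straight-line model, so the slope is simply the fitted `b`, and the names `sigma_eff`, `popt_eff` and `perr_eff` are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

# Example data from Step 1, repeated so this cell is self-contained
x_data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y_data = np.array([2.1, 6.8, 12.3, 17.5, 23.2, 28.6, 33.9, 39.8])
x_err = np.full_like(x_data, 0.5)
y_err = np.full_like(y_data, 0.3)

def fit_func(x, a, b):
    return a + b * x

# First pass: y errors only, to get an estimate of the slope
popt, pcov = curve_fit(fit_func, x_data, y_data, p0=[1.0, 2.0],
                       sigma=y_err, absolute_sigma=True)
perr = np.sqrt(np.diag(pcov))
slope = popt[1]  # dy/dx for a straight line; use a derivative for other models

# Fold the x errors into an effective y error
sigma_eff = np.sqrt(y_err**2 + (slope * x_err)**2)

# Second pass: refit with the effective errors
popt_eff, pcov_eff = curve_fit(fit_func, x_data, y_data, p0=popt,
                               sigma=sigma_eff, absolute_sigma=True)
perr_eff = np.sqrt(np.diag(pcov_eff))

print("y_err:            ", y_err)
print("effective sigma:  ", sigma_eff)
print("perr (y only):    ", perr)
print("perr (effective): ", perr_eff)
```

If `sigma_eff` is close to `y_err`, the x errors are negligible and `curve_fit` alone is adequate; if it is much larger, as with the example data, the y-only uncertainties are too optimistic and ODR is the better choice.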