 # Notebook 03 — Branch Screening & DTW Alignment



 Screens all branches for the best linear Mg/Sr–temperature relationship, generates the synthetic Mg/Sr master series, sweeps Sakoe–Chiba window sizes, and performs the final DTW alignment.



 Workflow:

 1. Full pipeline setup (auth + load)

 2. Screen branches → pick winner

 3. Generate synthetic Mg/Sr master

 4. Window sweep to compare alignment quality

 5. Final alignment with chosen window

 6. Inspect DTW diagnostics

 ## Setup

In [1]:
from google.colab import drive
drive.mount('/content/drive')

!pip install dtaidistance

import sys

# Edit REPO_PATH if the repo lives somewhere else (for example, a clone inside /content/drive/MyDrive)
REPO_PATH = '/content/rhodopipeline'
if REPO_PATH not in sys.path:
    sys.path.insert(0, REPO_PATH)

import rhodopipeline
from rhodopipeline import RhodolithPipeline, CONFIG

pipeline = RhodolithPipeline(CONFIG)
pipeline.authenticate()
pipeline.load_temperature_data()
pipeline.load_curve6_curve7()

print('\nPipeline ready.')


Mounted at /content/drive
Collecting dtaidistance
  Downloading dtaidistance-2.4.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (14 kB)
Downloading dtaidistance-2.4.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (4.4 MB)
Installing collected packages: dtaidistance
Successfully installed dtaidistance-2.4.0
STEP 1: AUTHENTICATION
Mounted at /content/drive
✓ Authenticated.

STEP 2: LOAD TEMPERATURE
✓ Temp loaded: 130 days.
  Range: 26.07–32.72 °C

LOAD CURVE6 & CURVE7 (Rhodo25_data)
✓ Curve6 loaded (78 increments)
✓ Curve7 loaded (81 increments)


Pipeline ready.


 ## Screen all branches

In [2]:
pipeline.screen_best_linear_branch()

print('\nBest branch metadata:')
for k, v in pipeline.best_branch_meta.items():
    print(f'  {k}: {v}')


STEP 3: SCREENING (Linear + Curve6/7 for AFE5-1)
  afe5-1: Linear R² = 0.4106
      curve6 time R² = 0.4899
      curve7 time R² = 0.4771
  afe5-2: Linear R² = 0.5011
  afe5-3: Linear R² = 0.4610
  afe5-4: Linear R² = 0.6092
  afe5-5: Linear R² = 0.0707
  afe5-6: Linear R² = 0.0764
  afe5-7: Linear R² = 0.3628

✓ WINNER: afe5-4 using linear time axis
  Mg/Sr = 2.0577 * Temp + 13.4355
  Best R² = 0.6092


Best branch metadata:
  slope: 2.0576670584480476
  intercept: 13.435467839999816
  name: afe5-4
  model_type: linear
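
The internals of `screen_best_linear_branch()` are not shown in this notebook. As a rough sketch of what screening by linear R² can look like, the snippet below fits Mg/Sr against temperature for each branch with `numpy.polyfit` and keeps the branch with the highest R². The helper names (`linear_r2`, `pick_best_branch`) and the toy data are hypothetical and are not part of `rhodopipeline`.

```python
import numpy as np

def linear_r2(temp, mgsr):
    """Least-squares fit mgsr = slope * temp + intercept; return (slope, intercept, R²)."""
    temp = np.asarray(temp, dtype=float)
    mgsr = np.asarray(mgsr, dtype=float)
    slope, intercept = np.polyfit(temp, mgsr, 1)
    resid = mgsr - (slope * temp + intercept)
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((mgsr - mgsr.mean()) ** 2))
    return float(slope), float(intercept), 1.0 - ss_res / ss_tot

def pick_best_branch(branches):
    """branches maps name -> (temp, mgsr); return (name, slope, intercept, R²) of the best fit."""
    scored = {name: linear_r2(t, m) for name, (t, m) in branches.items()}
    best = max(scored, key=lambda name: scored[name][2])
    return (best, *scored[best])

# Toy data, made up for illustration (NOT the notebook's measurements)
idx = np.arange(40)
temp = np.linspace(26.0, 33.0, idx.size)
toy = {
    "clean": (temp, 2.0 * temp + 13.0 + 0.3 * np.sin(3.0 * idx)),  # small scatter
    "noisy": (temp, 2.0 * temp + 13.0 + 4.0 * np.cos(1.7 * idx)),  # large scatter
}

name, slope, intercept, r2 = pick_best_branch(toy)
print(f"winner: {name}  Mg/Sr = {slope:.3f} * Temp + {intercept:.3f}  R² = {r2:.3f}")
```

In the real screening step, afe5-1 is additionally scored on the curve6 and curve7 time axes; the sketch covers only the plain linear case.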


 ## Generate synthetic Mg/Sr master

In [3]:
pipeline.generate_synthetic_master()

print('\nSynthetic master (first 5 rows):')
print(pipeline.synthetic_master.head())


STEP 4: GENERATE SYNTHETIC Mg/Sr MASTER
✓ Synthetic Master Created (130 days)
  Mg/Sr range: 67.08–80.75 mmol/mol


Synthetic master (first 5 rows):
                 Date  MgSr_Target
Index                             
2024-03-16 2024-03-16    68.300073
2024-03-17 2024-03-17    68.384609
2024-03-18 2024-03-18    67.486352
2024-03-19 2024-03-19    67.586577
2024-03-20 2024-03-20    67.503156
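
The printed Mg/Sr range is consistent with applying the winning branch equation to the daily temperature series. A minimal check, assuming the master is simply `slope * Temp + intercept` (the function name `synthetic_mgsr` is illustrative and not part of `rhodopipeline`):

```python
# Coefficients copied from the "Best branch metadata" output above
SLOPE = 2.0576670584480476
INTERCEPT = 13.435467839999816

def synthetic_mgsr(temp_c):
    """Map a temperature in °C to a synthetic Mg/Sr target in mmol/mol."""
    return SLOPE * temp_c + INTERCEPT

# Temperature extremes as printed by the load step (rounded to two decimals)
t_min, t_max = 26.07, 32.72
mgsr_min = synthetic_mgsr(t_min)
mgsr_max = synthetic_mgsr(t_max)
print(f"Mg/Sr at {t_min} °C: {mgsr_min:.2f} mmol/mol")
print(f"Mg/Sr at {t_max} °C: {mgsr_max:.2f} mmol/mol")
```

Both values land within about 0.02 mmol/mol of the printed range (67.08–80.75). A small gap is expected because the temperature extremes in the log are themselves rounded.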


 ## Window sweep



 Compare calibration quality (R² and RMSE) across Sakoe–Chiba window sizes.

 Use the table below to choose `WINDOW_DAYS` in the next cell.
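
 For readers unfamiliar with the Sakoe–Chiba constraint, the sketch below shows the idea in plain NumPy: the warping path may only visit cells with `|i - j| <= window`, so a small window keeps the alignment close to the diagonal and a large one allows more stretching. This is a teaching sketch, not the pipeline's implementation (which installs `dtaidistance`); the function name `dtw_path`, the squared-difference cost, and the widening of the band to the length difference are all assumptions. How the pipeline maps `window_days` onto series indices is not shown here.

```python
import numpy as np

def dtw_path(query, master, window):
    """DTW restricted to a Sakoe–Chiba band; returns (total cost, warping path)."""
    q = np.asarray(query, dtype=float)
    m = np.asarray(master, dtype=float)
    n, k = len(q), len(m)
    # Assumption: widen the band to the length difference so the end corner stays reachable
    w = max(int(window), abs(n - k))
    D = np.full((n + 1, k + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(k, i + w) + 1):
            cost = (q[i - 1] - m[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    # Backtrack from the end corner to the start along the cheapest predecessors
    path, i, j = [], n, k
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return float(D[n, k]), path

# Toy series: same shape, shifted in phase (illustrative only)
t = np.linspace(0.0, 2.0 * np.pi, 60)
master = np.sin(t)
query = np.sin(t - 0.4)

costs = {}
for w in (1, 5, 15):
    costs[w], path = dtw_path(query, master, w)
    print(f"window={w:2d}  cost={costs[w]:.4f}  path_len={len(path)}")
```

 Because every path allowed by a narrow band is also allowed by a wider one, the DTW cost can only stay the same or decrease as the window grows. That is why the sweep compares downstream calibration quality rather than the raw alignment cost.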

In [4]:
import pandas as pd

sweep_results = []

for w in [10, 15, 20, 30]:
    print(f'\n{"─"*50}')
    print(f'Window = {w} days')
    pipeline.perform_dtw_alignment(window_days=w)
    pipeline.build_composite_and_calibrate()

    if 'Mg/Sr' in pipeline.final_equations:
        s = pipeline.final_equations['Mg/Sr']['stats']
        sweep_results.append({
            'window_days': w,
            'R2':          round(s['r2'],   3),
            'RMSE_degC':   round(s['rmse'], 3),
            'n':           s['n'],
        })

sweep_df = pd.DataFrame(sweep_results)
print('\n=== WINDOW SWEEP RESULTS (Mg/Sr) ===')
print(sweep_df.to_string(index=False))



──────────────────────────────────────────────────
Window = 10 days
STEP 5: UNIFIED DTW ALIGNMENT (window=10 days)
  ✓ Aligned afe5-1: 151 steps, stretch=1.16
  ✓ Aligned afe5-2: 143 steps, stretch=1.10
  ✓ Aligned afe5-3: 158 steps, stretch=1.22
  ✓ Aligned afe5-4: 135 steps, stretch=1.04
  ✓ Aligned afe5-5: 131 steps, stretch=1.01
  ✓ Aligned afe5-6: 134 steps, stretch=1.03
  ✓ Aligned afe5-7: 137 steps, stretch=1.05

DTW stretch factors:
branch  stretch_factor
afe5-1        1.161538
afe5-2        1.100000
afe5-3        1.215385
afe5-4        1.038462
afe5-5        1.007692
afe5-6        1.030769
afe5-7        1.053846

Mean stretch factor: 1.09

STEP 6: COMPOSITE & FINAL CALIBRATION
  Mg/Sr Final Model: Temp = 0.4743*Mg/Sr + -5.8158
    R²=0.892, RMSE=0.690, n=130

  Mg/Ca Final Model: Temp = 0.1494*Mg/Ca + -11.4754
    R²=0.463, RMSE=1.537, n=130

  Sr/Ca Final Model: Temp = -12.2111*Sr/Ca + 74.5765
    R²=0.622, RMSE=1.289, n=130


────────────────────────────────────────────────
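
One way to read the sweep table is to pick the window with the lowest RMSE and use R² as a tie-breaker. The sketch below does that with pandas. It contains only the two Mg/Sr results that appear in this notebook's captured output (window 10 from the sweep log above, window 30 from the final alignment run); the 15- and 20-day rows are not visible in the saved output and are deliberately left out. The variable names are illustrative.

```python
import pandas as pd

# Mg/Sr calibration stats copied from the notebook output (only the rows that are shown)
known_results = pd.DataFrame([
    {"window_days": 10, "R2": 0.892, "RMSE_degC": 0.690, "n": 130},
    {"window_days": 30, "R2": 0.909, "RMSE_degC": 0.633, "n": 130},
])

# Lowest RMSE first; higher R² breaks ties
ranked = known_results.sort_values(["RMSE_degC", "R2"], ascending=[True, False])
best_window = int(ranked.iloc[0]["window_days"])

print(ranked.to_string(index=False))
print(f"\nSuggested WINDOW_DAYS = {best_window}")
```

A wider window also gives the alignment more freedom, so part of any improvement in R² may come from that extra flexibility rather than from a genuinely better match. It is worth checking the stretch factors alongside the fit statistics before settling on a value.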

 ## Final alignment with chosen window

In [5]:
# Change WINDOW_DAYS if the sweep table suggests a better value
WINDOW_DAYS = 30

print(f'\nRunning final alignment with window_days = {WINDOW_DAYS}')
pipeline.perform_dtw_alignment(window_days=WINDOW_DAYS)
pipeline.build_composite_and_calibrate()

print('\nCalibration equations:')
for proxy, eq in pipeline.final_equations.items():
    s = eq['stats']
    print(f'  {proxy}: Temp = {eq["slope"]:.4f} * proxy + {eq["intercept"]:.4f}  '
          f'(R²={s["r2"]:.3f}, RMSE={s["rmse"]:.3f} °C, n={s["n"]})')



Running final alignment with window_days = 30
STEP 5: UNIFIED DTW ALIGNMENT (window=30 days)
  ✓ Aligned afe5-1: 156 steps, stretch=1.20
  ✓ Aligned afe5-2: 147 steps, stretch=1.13
  ✓ Aligned afe5-3: 158 steps, stretch=1.22
  ✓ Aligned afe5-4: 135 steps, stretch=1.04
  ✓ Aligned afe5-5: 131 steps, stretch=1.01
  ✓ Aligned afe5-6: 134 steps, stretch=1.03
  ✓ Aligned afe5-7: 137 steps, stretch=1.05

DTW stretch factors:
branch  stretch_factor
afe5-1        1.200000
afe5-2        1.130769
afe5-3        1.215385
afe5-4        1.038462
afe5-5        1.007692
afe5-6        1.030769
afe5-7        1.053846

Mean stretch factor: 1.10

STEP 6: COMPOSITE & FINAL CALIBRATION
  Mg/Sr Final Model: Temp = 0.5013*Mg/Sr + -7.7808
    R²=0.909, RMSE=0.633, n=130

  Mg/Ca Final Model: Temp = 0.1618*Mg/Ca + -14.8693
    R²=0.496, RMSE=1.489, n=130

  Sr/Ca Final Model: Temp = -12.6565*Sr/Ca + 76.2725
    R²=0.614, RMSE=1.303, n=130


Calibration equations:
  Mg/Sr: Temp = 0.5013 * proxy + -7.7808  (R²=0

 ## DTW diagnostics table

In [6]:
diag_df = pd.DataFrame(pipeline.dtw_diagnostics)
print('\nDTW Diagnostics:')
print(diag_df.to_string(index=False, float_format='%.3f'))
print(f'\nMean stretch factor: {diag_df["stretch_factor"].mean():.3f}')



DTW Diagnostics:
branch  n_query  n_master  path_len  stretch_factor
afe5-1       43       130       156           1.200
afe5-2       34       130       147           1.131
afe5-3       55       130       158           1.215
afe5-4       43       130       135           1.038
afe5-5       16       130       131           1.008
afe5-6       31       130       134           1.031
afe5-7       39       130       137           1.054

Mean stretch factor: 1.097
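
The stretch factors in the table are numerically consistent with `path_len / n_master`. That relationship is inferred from the printed numbers rather than confirmed from the `rhodopipeline` source, so treat the following as a sanity check and not as the pipeline's definition:

```python
# Values copied from the DTW diagnostics table above
n_master = 130
path_len = {
    "afe5-1": 156, "afe5-2": 147, "afe5-3": 158, "afe5-4": 135,
    "afe5-5": 131, "afe5-6": 134, "afe5-7": 137,
}

# Assumed definition: warping-path length relative to the master length
stretch = {branch: steps / n_master for branch, steps in path_len.items()}
mean_stretch = sum(stretch.values()) / len(stretch)

for branch, value in stretch.items():
    print(f"{branch}  {value:.3f}")
print(f"Mean stretch factor: {mean_stretch:.3f}")
```

Under this reading, a stretch factor of 1.0 would mean a one-to-one path along the master, and larger values mean more steps where one series is held while the other advances. The winning branch afe5-4 (1.038) needs comparatively little warping, while afe5-1 and afe5-3 need the most.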
