# Demo: Streamlined Clock Offset Analysis

**Note: This is a refactored version by Claude. See `demo_check_clock.ipynb` for the original user file.**

This notebook provides a streamlined version of clock offset analysis for oceanographic instruments.
It uses the new `oceanarray.clock_offset` module for cleaner, more maintainable code.

## Purpose

This notebook helps determine whether instrument timestamps are incorrect by:
1. Analyzing deployment timing based on temperature profiles
2. Performing lag correlation analysis between instruments
3. Calculating recommended clock offset corrections

**Note:** This notebook does not modify data files. It only analyzes the data and suggests `clock_offset` values for the YAML configuration.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from oceanarray import clock_offset

## Configuration

In [None]:
# Configuration
mooring_name = 'dsE_1_2018'
base_dir = '/Users/eddifying/Dropbox/data/ifmro_mixsed/ds_data_eleanor/'
output_path = base_dir + 'moor/proc/'

# Choose file type: '_raw' for original data, '_use' for processed data
file_suffix = '_raw'
# file_suffix = '_use'

print(f"Analyzing mooring: {mooring_name}")
print(f"Using files with suffix: {file_suffix}")

## Load and Process Data

In [None]:
# Load instrument data
datasets, moor_yaml_data = clock_offset.load_mooring_instruments(
    mooring_name, base_dir, output_path, file_suffix
)

print(f"Loaded {len(datasets)} instruments")

In [None]:
# Create common time grid and interpolate
time_grid = clock_offset.create_common_time_grid(datasets)
datasets_interp = clock_offset.interpolate_datasets_to_grid(datasets, time_grid)

# Combine into single multi-level dataset
combined_ds = clock_offset.combine_interpolated_datasets(datasets_interp)

print(f"Combined dataset dimensions: {dict(combined_ds.sizes)}")
print(f"Time grid length: {len(time_grid)}")
print(f"Time range: {time_grid[0]} to {time_grid[-1]}")
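
The gridding step above can be illustrated with a minimal, self-contained sketch. It does not reproduce the internals of `clock_offset.create_common_time_grid` or `clock_offset.interpolate_datasets_to_grid`; the synthetic records, the 10-second grid spacing, and the helper `to_grid` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Two hypothetical instruments with different sampling rates and start times
t_a = pd.date_range("2018-08-01 00:00:00", periods=7, freq="10s")
t_b = pd.date_range("2018-08-01 00:00:05", periods=4, freq="15s")
temp_a = pd.Series(np.linspace(5.0, 4.4, 7), index=t_a)
temp_b = pd.Series(np.linspace(5.2, 4.9, 4), index=t_b)

# Common grid: span the union of both records at the finer sampling interval
start = min(t_a[0], t_b[0])
end = max(t_a[-1], t_b[-1])
grid = pd.date_range(start, end, freq="10s")


def to_grid(series, grid):
    """Linearly interpolate a series onto grid; NaN outside the series' own record."""
    origin = grid[0]
    x = (grid - origin).total_seconds().to_numpy()
    xp = (series.index - origin).total_seconds().to_numpy()
    return np.interp(x, xp, series.to_numpy(), left=np.nan, right=np.nan)


# One column per instrument, analogous to a (time, N_LEVELS) temperature array
temp_grid = np.column_stack([to_grid(temp_a, grid), to_grid(temp_b, grid)])
print(temp_grid.shape)
```

Filling with NaN outside each instrument's own record keeps extrapolated values out of any later comparison between instruments.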

## Deployment Timing Analysis

In [None]:
# Analyze deployment timing using temperature profiles
combined_ds = clock_offset.analyze_deployment_timing(combined_ds)

print("Deployment timing analysis completed")
print(f"Dataset now includes: {list(combined_ds.data_vars)}")
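
As a rough illustration of threshold-based deployment timing, the sketch below finds the first and last samples colder than a split temperature in a synthetic record. The midpoint rule for `split_value` and the variable names are assumptions; `clock_offset.analyze_deployment_timing` may choose its split value and bounds differently.

```python
import numpy as np
import pandas as pd

time = pd.date_range("2018-08-01", periods=12, freq="1h")
# Synthetic record: warm on deck, cold at depth, warm again after recovery
temp = np.array([15.0, 15.2, 14.8, 3.1, 2.9, 3.0, 3.2, 2.8, 3.0, 14.5, 15.1, 15.0])

# Assumed split: midpoint between the coldest and warmest readings
split_value = 0.5 * (np.nanmin(temp) + np.nanmax(temp))

# Samples colder than the split are treated as "deep" water
deep_idx = np.flatnonzero(temp < split_value)

# First and last deep samples bound the deployment; NaT if none were found
start_time = time[deep_idx[0]] if deep_idx.size else pd.NaT
end_time = time[deep_idx[-1]] if deep_idx.size else pd.NaT

print(f"split={split_value:.2f}, start={start_time}, end={end_time}")
```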

## Visualize Temperature Profiles

In [None]:
# Plot temperature time series with deployment bounds
time = combined_ds["time"].values
temp = combined_ds["temperature"].values
split_vals = combined_ds["split_value"].values
instruments = combined_ds["instrument"].values
start_times = combined_ds["start_time"].values
end_times = combined_ds["end_time"].values

for i in range(combined_ds.sizes["N_LEVELS"]):
    fig, ax = plt.subplots(figsize=(12, 4))

    ax.plot(time, temp[:, i], label=f"{instruments[i]}", alpha=0.7)
    ax.axhline(split_vals[i], color="red", linestyle="--",
               label=f"Split={split_vals[i]:.2f}")

    # Plot deployment bounds if available (a missing bound is NaT)
    if pd.notna(start_times[i]):
        ax.axvline(start_times[i], color="green", linestyle="--", lw=2,
                   label="Deployment Start")
    if pd.notna(end_times[i]):
        ax.axvline(end_times[i], color="blue", linestyle="--", lw=2,
                   label="Deployment End")

    ax.set_title(f"Instrument {i}: {instruments[i]} at {combined_ds['nominal_depth'][i].values:.0f}m")
    ax.set_xlabel("Time")
    ax.set_ylabel("Temperature (°C)")
    ax.legend()
    ax.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.show()

## Calculate Timing Offsets

In [None]:
# Calculate timing offsets based on deployment bounds
offset_results = clock_offset.calculate_timing_offsets(combined_ds)

# Print summary table
clock_offset.print_timing_offset_summary(combined_ds, offset_results)

## Detailed Deployment Boundary Visualization

Examine the exact transition points with individual measurements around predicted deployment boundaries.
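
One way to pull out the individual measurements around a boundary is to locate the boundary in the time axis and slice a fixed number of samples on either side. The sketch below uses synthetic data and hypothetical variable names; it is not the implementation of `clock_offset.plot_deployment_boundaries`.

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute record that drops from 15 °C to 3 °C at sample 40
time = pd.date_range("2018-08-01", periods=100, freq="1min")
temp = np.where(np.arange(100) < 40, 15.0, 3.0)

boundary = pd.Timestamp("2018-08-01 00:40:00")  # hypothetical predicted boundary
n_samples = 10

# Index of the first sample at or after the boundary
i = time.searchsorted(boundary)

# Clip the window so it never runs past either end of the record
lo = max(i - n_samples, 0)
hi = min(i + n_samples, len(time))
window_time, window_temp = time[lo:hi], temp[lo:hi]

print(len(window_time), window_temp[:n_samples].mean(), window_temp[n_samples:].mean())
```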

In [None]:
# Plot detailed deployment boundaries showing individual measurements
# This shows 10 samples before/after predicted boundaries with red circles and blue connecting lines
clock_offset.plot_deployment_boundaries(datasets, combined_ds, n_samples=10)

## Lag Correlation Analysis

In [None]:
# Get suggestion for best reference instrument (but you can override manually)
ref_suggestion = clock_offset.suggest_reference_instrument(combined_ds, offset_results)

# Use the suggested reference (or manually set ref_index to any value you prefer)
ref_index = ref_suggestion['suggested_index']  # You can change this manually
sub_sample = 5  # Subsampling factor for speed

print(f"Using reference instrument: Index {ref_index}")
print(f"Note: You can manually set ref_index to any instrument index (0-{combined_ds.sizes['N_LEVELS']-1})")
print()

correlation_results = clock_offset.perform_lag_correlation_analysis(
    combined_ds, ref_index=ref_index, sub_sample=sub_sample
)

# Print correlation summary
clock_offset.print_correlation_summary(combined_ds, correlation_results)
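
The idea behind lag correlation can be sketched with two synthetic series, one a delayed copy of the other: correlate them at a range of lags and take the lag with the highest correlation. The helper `lag_correlation`, the 10-second sampling interval, and the sign convention (positive lag means the second series sees features later) are assumptions for illustration and may not match `clock_offset.perform_lag_correlation_analysis`.

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 10.0  # assumed sampling interval in seconds
n = 2000

# Smoothed noise as a stand-in for a reference temperature record
ref = np.convolve(rng.standard_normal(n + 50), np.ones(20) / 20, mode="valid")[:n]

# `other` sees every feature 6 samples later than `ref`
true_shift = 6
other = np.roll(ref, true_shift)


def lag_correlation(a, b, max_lag):
    """Pearson correlation of a[t] with b[t + lag] for each lag in [-max_lag, max_lag]."""
    lags = np.arange(-max_lag, max_lag + 1)
    corrs = np.empty(lags.size)
    for k, lag in enumerate(lags):
        if lag >= 0:
            x, y = a[: a.size - lag], b[lag:]
        else:
            x, y = a[-lag:], b[: b.size + lag]
        corrs[k] = np.corrcoef(x, y)[0, 1]
    return lags, corrs


lags, corrs = lag_correlation(ref, other, max_lag=30)
best_lag = lags[np.argmax(corrs)]
print(f"Peak correlation at lag {best_lag} samples = {best_lag * dt:.0f} s")
```

If the series are subsampled before correlating, each lag step corresponds to `sub_sample * time_interval` seconds, which is the scaling used for the x-axis in the plot that follows.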

In [None]:
# Plot correlation results
time_interval = correlation_results['time_interval']
sub_sample = correlation_results['sub_sample']
depths = combined_ds['nominal_depth'].values

plt.figure(figsize=(12, 6))

max_lag_sub = len(correlation_results['correlations'][0]) // 2
lags_sub = np.arange(-max_lag_sub, max_lag_sub + 1)

for i, corrs in enumerate(correlation_results['correlations']):
    dt_sub = sub_sample * time_interval
    plt.plot(lags_sub * dt_sub, corrs,
             label=f'Level {i+1} ({depths[i]:.0f}m)', alpha=0.7)

plt.xlabel('Lag (seconds)')
plt.ylabel('Correlation')
plt.title(f'Lag Correlation Analysis (Reference: Level {ref_index+1})')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## Summary: All Temperature Time Series

In [None]:
# Plot all temperature time series together
plt.figure(figsize=(14, 8))

for i in range(combined_ds.sizes['N_LEVELS']):
    depth = combined_ds['nominal_depth'][i].values
    instrument = combined_ds['instrument'][i].values
    serial = combined_ds['serial_number'][i].values

    plt.plot(combined_ds['time'], combined_ds['temperature'][:, i],
             label=f'{instrument} #{serial} ({depth:.0f}m)', alpha=0.8)

plt.xlabel('Time')
plt.ylabel('Temperature (°C)')
plt.title(f'Temperature Time Series - {mooring_name}')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## Recommendations

Based on the analysis above:

1. **Deployment Timing Analysis**: Estimates offsets from the times at which each instrument first and last records "deep" water temperatures
2. **Lag Correlation Analysis**: Estimates offsets from the cross-correlation of the temperature time series

Use the **negative** of the calculated offset as the `clock_offset` value in the YAML file.
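
As a small worked example of the sign flip, the sketch below converts a calculated offset into the value to enter in the YAML file. The helper name, the assumption that offsets are in seconds, and the rounding to whole seconds are all illustrative choices, not part of `oceanarray`.

```python
def yaml_clock_offset(calculated_offset_s):
    """Hypothetical helper: negate the estimate and round to whole seconds."""
    return -round(calculated_offset_s)


calculated_offset_s = 120.0  # hypothetical estimate, assumed to be in seconds
print(f"clock_offset: {yaml_clock_offset(calculated_offset_s)}")
```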

After updating the YAML file, run stage2 processing to apply the corrections, then re-run this analysis with `file_suffix = '_use'` to verify the corrections.