# Transit Finding with `monofind`

This notebook demonstrates how to detect transit-like events in light curves using the `monofind` algorithm.

The aim is to search for transits of circumbinary planets in eclipsing binary light curves. First, the code removes unwanted trends from the light curve (see `utils/detrending.py` for details). It then uses the Median Absolute Deviation (MAD) of the flux to set a detection threshold across the light curve, and flags a Threshold Crossing Event (TCE) wherever 3 or more consecutive points lie below this threshold.
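
The flagging rule described above can be pictured with a minimal sketch. This is not the `monofind` implementation: the function name `find_threshold_crossings`, the single global threshold (median minus a multiple of the unscaled MAD), and the toy flux are all assumptions made for illustration, and the package may compute its threshold differently.

```python
import numpy as np

def find_threshold_crossings(flux, mad_threshold=3.0, min_points=3):
    """Return (start, stop) index pairs of runs of at least `min_points`
    consecutive samples below median - mad_threshold * MAD.

    Hypothetical helper for illustration only; `stop` is exclusive.
    """
    flux = np.asarray(flux, dtype=float)
    median = np.nanmedian(flux)
    mad = np.nanmedian(np.abs(flux - median))
    below = flux < median - mad_threshold * mad

    # Pad with False so that every run has both a start and an end edge.
    padded = np.concatenate(([False], below, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return [(int(s), int(e)) for s, e in zip(starts, stops) if e - s >= min_points]

# Toy flux: alternating +/-1e-4 scatter, one 5-point dip and one isolated outlier.
flux = 1.0 + 1e-4 * np.where(np.arange(200) % 2 == 0, 1.0, -1.0)
flux[50:55] -= 0.01   # transit-like dip: 5 consecutive points
flux[120] -= 0.01     # single-point outlier: too short to be flagged
toy_tces = find_threshold_crossings(flux)
print(toy_tces)  # [(50, 55)]
```

Requiring several consecutive points is what rejects the isolated outlier at index 120 while keeping the 5-point dip.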

## Setup

First, import the necessary modules and configure logging.

In [None]:
from mono_cbp import TransitFinder
from mono_cbp.utils import setup_logging
import os
import pandas as pd
%matplotlib inline

setup_logging(log_file=None)

## Configuration

Define the `transit_finding` configuration parameters.

In [2]:
config = {
    'transit_finding': {
        'edge_cutoff': 0.0,                # How much data to cut from edges (days)
        'mad_threshold': 3,                # Multiplier for MAD detection threshold
        'detrending_method': 'cb',         # 'cb' = cosine + biweight, 'cp' = cosine + pspline
        'generate_vetting_plots': True,    # Create diagnostic plots
        'generate_event_snippets': True,   # Generate event data for model comparison
        'save_event_snippets': True,       # Save event snippets to disk (False = in-memory only)
        'generate_skye_plots': True,       # Create Skye metric histograms
        'cadence_minutes': 30,             # Cadence of data (minutes)
        'filters': {
            'min_snr': 5.0,                # Minimum signal-to-noise ratio
            'max_duration_days': 1.0,      # Maximum transit duration (days)
            'det_detection_threshold': 18  # Detrending-dependence threshold
        },
        'cosine': {
            'win_len_max': 12.0,           # Maximum window length (days)
            'win_len_min': 1.0,            # Minimum window length (days)
            'fap_threshold': 0.01,         # False alarm probability threshold
            'poly_order': 2                # Polynomial order to remove from flux before periodogram
        },
        'biweight': {
            'win_len_max': 3.0,            # Maximum biweight window length (days)
            'win_len_min': 1.0             # Minimum biweight window length (days)
        },
        'pspline': {
            'max_splines': 25              # Maximum number of splines
        }
    }
}
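
If you want to try several configurations (for example, switching `detrending_method` from `'cb'` to `'cp'`), it can help to derive each variant from a base dictionary rather than editing it in place. The sketch below shows one way to do that; `with_overrides` is a hypothetical helper written for this example and is not part of `mono_cbp`, and `base_config` is a trimmed-down stand-in for the full dictionary above.

```python
import copy

def with_overrides(base, overrides):
    """Return a deep copy of `base` with `overrides` merged in recursively."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections instead of replacing them wholesale.
            merged[key] = with_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

# Trimmed-down stand-in for the full configuration dictionary.
base_config = {
    'transit_finding': {
        'mad_threshold': 3,
        'detrending_method': 'cb',
        'filters': {'min_snr': 5.0, 'max_duration_days': 1.0},
    }
}

# Variant using cosine + pspline detrending and a stricter SNR cut.
cp_config = with_overrides(base_config, {
    'transit_finding': {'detrending_method': 'cp', 'filters': {'min_snr': 7.0}}
})

print(cp_config['transit_finding']['detrending_method'])   # cp
print(base_config['transit_finding']['detrending_method'])  # cb
```

Because the merge works on a deep copy, the base configuration is left untouched and untouched keys such as `max_duration_days` carry over to the variant.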

## Initialise `TransitFinder`

Create a `TransitFinder` object using the sub-sample of the TESS Eclipsing Binary Catalogue (TEBC) used in Davies *et al.* (submitted).

To load `catalogue` and `sector_times`, you can provide `TransitFinder` with:

- The filepath to the catalogue CSV file
- A DataFrame of the catalogue, either with `catalogue = pd.read_csv(...)` or `catalogue = load_catalogue(...)` (using `from mono_cbp.utils.data import load_catalogue`)

In [3]:
catalogue = pd.read_csv('../catalogues/TEBC_morph_05_P_7_ADJUSTED.csv')
sector_times = pd.read_csv('../catalogues/sector_times.csv')

finder = TransitFinder(
    catalogue=catalogue,
    sector_times=sector_times,
    config=config,
    TEBC=True
)

14:32:56 | INFO     | mono_cbp.utils.data | Processed TEBC catalogue: eclipse parameters selected from 2g columns


## Process a Single Light Curve

You can choose a single light curve file to search for TCEs using `finder.process_file()`:

In [4]:
# Example: Process a single light curve file (TOI-1338 Sector 6, contains a transit of TOI-1338b)
# Replace with an actual file path from your masked light curves
sample_file = '../data/TIC_260128333_06.npz'

print(f"Processing {sample_file}...\n")

# Create output directory for plots
plot_dir = 'results/vetting_plots'
os.makedirs(plot_dir, exist_ok=True)

events = finder.process_file(sample_file, plot_output_dir=plot_dir)

print(f"\nFound {len(events)} TCEs.")

Processing ../data/TIC_260128333_06.npz...



14:33:00 | INFO     | mono_cbp.utils.detrending | TIC_260128333_06 cos + biweight
14:33:00 | INFO     | mono_cbp.utils.detrending | Cosine window length: 2.1 days
14:33:01 | INFO     | mono_cbp.utils.plotting | Saved event plot to results/vetting_plots/TIC_260128333_06_1.png
14:33:01 | INFO     | mono_cbp.utils.plotting | Saved event plot to results/vetting_plots/TIC_260128333_06_2.png



Found 2 TCEs.
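
The file names in this example appear to follow a `TIC_<TIC ID>_<sector>.npz` pattern (here, TIC 260128333 in Sector 6). If your own files use the same convention, a small helper like the one below can recover the target and sector from a path, which is handy when grouping results. The pattern and the helper name `parse_lightcurve_name` are assumptions inferred from the file names shown here, not something `mono_cbp` is known to expose.

```python
import re
from pathlib import Path

# Assumed naming convention: TIC_<digits>_<digits>, e.g. TIC_260128333_06
FILENAME_PATTERN = re.compile(r'^TIC_(\d+)_(\d+)$')

def parse_lightcurve_name(path):
    """Split a name like 'TIC_260128333_06.npz' into (tic_id, sector) integers."""
    match = FILENAME_PATTERN.match(Path(path).stem)
    if match is None:
        raise ValueError(f"Unexpected file name: {path}")
    return int(match.group(1)), int(match.group(2))

tic_id, sector = parse_lightcurve_name('../data/TIC_260128333_06.npz')
print(tic_id, sector)  # 260128333 6
```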


## Display Event Details

To investigate the results of our search, we can examine the properties of detected events using the keys of the event dictionaries:

In [5]:
if len(events) > 0:
    print("Detected Events:")
    print("=" * 70)
    for i, event in enumerate(events, 1):
        print(f"Event {i}:")
        print(f"  Time (BTJD): {event['time']:.4f}")
        print(f"  Phase: {event['phase']:.4f}" if event['phase'] is not None else "  Phase: N/A")
        print(f"  Depth: {event['depth']*100:.3f}%")
        print(f"  Duration: {event['duration']:.4f} days ({event['duration']*24:.2f} hours)")
        print(f"  SNR: {event['snr']:.2f}")
        print()
else:
    print("No events detected in this light curve.")

Detected Events:
Event 1:
  Time (BTJD): 1471.5741
  Phase: 0.2243
  Depth: 0.107%
  Duration: 0.2292 days (5.50 hours)
  SNR: 3.67

Event 2:
  Time (BTJD): 1483.9075
  Phase: 0.0685
  Depth: 0.311%
  Duration: 0.2708 days (6.50 hours)
  SNR: 14.62



The first event has a low signal-to-noise ratio (SNR = 3.67, below the `min_snr` of 5.0 set in the configuration), so it is likely to be a systematic (more on that later).

## Process Multiple Light Curves

To process a whole sample of light curves, use the `process_directory()` method.

Note that event snippets are only saved when using `process_directory()`, not when using `process_file()`.

In [6]:
data_dir = '../data'
output_dir = 'results'
plot_dir = 'results/vetting_plots'

# Create output directories
os.makedirs(output_dir, exist_ok=True)
os.makedirs(plot_dir, exist_ok=True)

# This may take a while depending on the number of files
results_df = finder.process_directory(
    data_dir=data_dir,
    output_file='detected_events.txt',
    output_dir=output_dir,
    plot_output_dir=plot_dir
)

print("\nProcessing complete!")

14:33:20 | INFO     | mono_cbp.transit_finding | Processing files in ../data
14:33:20 | INFO     | mono_cbp.transit_finding | Found 4 files to process
14:33:20 | INFO     | mono_cbp.transit_finding | Progress: 0/4
14:33:20 | INFO     | mono_cbp.utils.detrending | TIC_7695666_05 cos + biweight
14:33:20 | INFO     | mono_cbp.utils.detrending | Cosine window length: 1.5 days
14:33:21 | INFO     | mono_cbp.utils.plotting | Saved no-events plot to results/vetting_plots/TIC_7695666_05.png
14:33:21 | INFO     | mono_cbp.utils.detrending | TIC_260128333_06 cos + biweight
14:33:21 | INFO     | mono_cbp.utils.detrending | Cosine window length: 2.1 days
14:33:22 | INFO     | mono_cbp.utils.plotting | Saved event plot to results/vetting_plots/TIC_260128333_06_1.png
14:33:22 | INFO     | mono_cbp.utils.plotting | Saved event plot to results/vetting_plots/TIC_260128333_06_2.png
14:33:23 | INFO     | mono_cbp.utils.detrending | TIC_260128333_07 cos + biweight
14:33:23 | INFO     | mono_cbp.utils.detr


Processing complete!


The results of this search are saved automatically to a text file (`detected_events.txt`).

## Summary Statistics

Display summary statistics for the transit search:

In [None]:
print("Transit Finding Statistics:")
print("=" * 60)
print(f"Total files processed: {finder.stats['total_files']}")
print(f"Total events detected: {finder.stats['total_events']}")
print(f"Cosine detrending successes: {finder.stats['cosine_successes']}")

## Examine `results_df`

View the detected events in tabular format:

In [None]:
print(f"Total events in results: {len(results_df)}")
print("\nFirst 10 events:")
results_df.head(10)

## Filter Events

Now that we have our list of TCEs from our search, how do we filter out false positives?

One way is the `filter_events()` method, which filters events by SNR, duration, Skye flag, and detrending dependence.

In [None]:
# Filter results for high-quality TCEs only
filtered_results = finder.filter_events(
    results_df,
    min_snr=config['transit_finding']['filters']['min_snr'],
    max_duration_days=config['transit_finding']['filters']['max_duration_days'],
    det_dependence_flag=0,
    skye_flag=0
)

filtered_results
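
Conceptually, this kind of filter is a boolean mask over the results table. The sketch below reproduces the idea with plain pandas on a toy table. It is not how `filter_events()` is implemented: the column names (`snr`, `duration`, `det_dependence_flag`, `skye_flag`) and the use of inclusive comparisons are assumptions, the SNR and duration values are copied from the two events printed earlier, and the flag values are placeholders.

```python
import pandas as pd

# Toy table: SNR and duration come from the two events printed earlier;
# the column names and the flag values are hypothetical placeholders.
toy_df = pd.DataFrame({
    'snr': [3.67, 14.62],
    'duration': [0.2292, 0.2708],      # days
    'det_dependence_flag': [0, 0],
    'skye_flag': [0, 0],
})

# Keep rows that pass every criterion at once.
mask = (
    (toy_df['snr'] >= 5.0)
    & (toy_df['duration'] <= 1.0)
    & (toy_df['det_dependence_flag'] == 0)
    & (toy_df['skye_flag'] == 0)
)
toy_filtered = toy_df[mask].reset_index(drop=True)
print(toy_filtered)
```

With these thresholds only the second event survives, because the first falls below the SNR cut.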

Now that we have a filtered list of candidates, we can pass them on to the model comparison vetting (see `04_model_comparison.ipynb`) to classify them.