# Behavioral Data Processing Pipeline

**Author:** Mir Qi  
**Last Updated:** November 2024  

## 📋 Pipeline Overview

This notebook walks through the complete preprocessing pipeline for multi-camera behavioral recordings:

1. **Scan & Log** - Index all recording sessions
2. **Load & Filter**  - Load metadata and select sessions
3. **Calibration Generation** - Generate calibration (raw data only)
4. **Camera Sync**  - Align multi-camera timestamps
5. **COM Prediction**  - Coarse animal localization
6. **COM Validation** - Quality control visualization
7. **sDANNCE Prediction**  - Full 3D pose estimation
8. **sDANNCE Validation** - Pose quality metrics

---

## ⚙️ Prerequisites

**Required:**
- Access to the Duke cluster or a local GPU
- Conda environment: `sdannce`
- Calibration files in recording directories
- Multi-camera video recordings with synchronized timestamps

**Python Environment:**
```bash
conda activate bbop
```

**Data Structure:**
```
base_folder/
├── YYYY_MM_DD/
│   ├── session_name/
│   │   ├── videos/
│   │   │   ├── Camera1/
│   │   │   │   ├── frametimes.mat
│   │   │   │   └── 0.mp4
│   │   │   ├── Camera2/
│   │   │   └── ...
│   │   ├── *label3d_dannce.mat  # calibration file
│   │   └── folder_log.parquet  # generated by pipeline
```
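
To make the layout concrete, here is a minimal sketch that enumerates sessions under a base folder. It is not the pipeline's scan engine: `list_sessions` is a hypothetical helper, and it assumes that a session is any directory under a `YYYY_MM_DD` folder that contains a `videos/` subfolder.

```python
import re
import tempfile
from pathlib import Path

DATE_DIR = re.compile(r"^\d{4}_\d{2}_\d{2}$")  # matches YYYY_MM_DD


def list_sessions(base_folder):
    """Return sorted (date_folder, session_name) pairs found under base_folder."""
    sessions = []
    for date_dir in sorted(Path(base_folder).iterdir()):
        if not (date_dir.is_dir() and DATE_DIR.match(date_dir.name)):
            continue
        for session_dir in sorted(date_dir.iterdir()):
            # Assumption: a session is any directory with a videos/ subfolder.
            if (session_dir / "videos").is_dir():
                sessions.append((date_dir.name, session_dir.name))
    return sessions


# Demo on a throwaway directory tree that mimics the layout above.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "2025_10_31" / "ZIcI1_0mW" / "videos" / "Camera1").mkdir(parents=True)
    (Path(tmp) / "2025_10_31" / "calib_before").mkdir()  # no videos/, so not a session
    found = list_sessions(tmp)

print(found)
```

Running this kind of check before the real scan can help catch misplaced or misnamed folders early.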

---
## 1. Scan Folders & Generate Logs

### What this does:
- Scans the base folder for all recording sessions
- Generates/updates `folder_log.parquet` files with processing status
- Creates a database of session metadata for filtering



Note: you may need to manually create a parent folder in the base path; this will be revised in the future.


In [28]:
import sys
import os
sys.path.append(os.path.abspath('../'))

from status_fields.status_fields_config_oct3v1_brws_250525 import STATUS_FIELDS_CONFIG
from utlis.scan_engine_utlis.scan_eng_big_utlis import log_folder_to_parquet_sep

# Configuration
base_folder = "/data/big_rim/rsync_dcc_sum/bbop_demo/25Nov"
failed_paths_file = None  # Optional: path to txt file with failed sessions
force_rescan_rec_files = []  # Optional: [('date', 'session_name')] to force rescan
rescan_threshold_days = 0.000000001  # Rescan if modified within this many days

log_folder_to_parquet_sep(base_folder, failed_paths_file, STATUS_FIELDS_CONFIG,
                            force_rescan_rec_files=force_rescan_rec_files,
                            rescan_threshold_days=rescan_threshold_days)

print("\n✓ Scanning complete!")

Log for ZIcI1_0mW saved at /data/big_rim/rsync_dcc_sum/bbop_demo/25Nov/2025_10_31/ZIcI1_0mW/folder_log.parquet

✓ Scanning complete!


---
## 2. Load Data

### What this does:
- Reads all `folder_log.parquet` files from scanned sessions
- Combines them into a single PyArrow table for efficient filtering

### Data structure:
Each row represents one recording session with columns:
- `rec_file`: Session directory name
- `date_folder`: Date folder (YYYY_MM_DD)
- `rec_path`: Full path to session
- `mir_generate_param`: First step (calibration parameter generation) done? (0/1)
- `sync`: Camera synchronization done? (0/1)
- `com`: COM prediction done? (0/1)
- `com_vis`: COM validation done? (0/1)
- `dannce`: DANNCE prediction done? (0/1)
- `dannce_vis`: DANNCE validation done? (0/1)
- `social`: Social interaction session? (0/1)
- `scan_time`: Last scan timestamp
- `calib_files`: Available calibration folders


In [29]:
sys.path.append(os.path.abspath('../'))
from utlis.scan_engine_utlis.scan_engine_utlis import read_all_parquet_files

# Load all session metadata
all_df = read_all_parquet_files(base_folder)

print(f"✓ Loaded {len(all_df)} sessions")

✓ Loaded 1 sessions


### View the data:

In [30]:
# Convert to pandas for viewing (PyArrow table underneath for filtering)
print(all_df.to_pandas())

  mir_generate_param sync mini_6cam_map dropf_handle com com_vis social  \
0                  0    0             0            0   0       0      0   

  miniscope test after_oxytocin before_oxytocin dannce dannce_vis  \
0         0    0              0               1      0          0   

  mini_rec_sync mini_rec_sync_com   rec_file                   scan_time  \
0             0                 0  ZIcI1_0mW  2025-11-04T22:40:07.663238   

                                            rec_path date_folder  \
0  /data/big_rim/rsync_dcc_sum/bbop_demo/25Nov/20...  2025_10_31   

      calib_files  
0  [calib_before]  


In [31]:
# View all available columns
all_df.to_pandas().columns.tolist()

['mir_generate_param',
 'sync',
 'mini_6cam_map',
 'dropf_handle',
 'com',
 'com_vis',
 'social',
 'miniscope',
 'test',
 'after_oxytocin',
 'before_oxytocin',
 'dannce',
 'dannce_vis',
 'mini_rec_sync',
 'mini_rec_sync_com',
 'rec_file',
 'scan_time',
 'rec_path',
 'date_folder',
 'calib_files']

---
## 3. Filter Sessions

### What this does:
- Filters sessions based on processing status flags
- Selects only sessions that need the next processing step
- Uses PyArrow compute for efficient filtering on large datasets

### Important notes:
- **Values are strings**: Use `'0'` and `'1'`, not integers `0` and `1`
- **Multiple conditions**: All conditions in the list are combined with a logical AND (`reduce(pc.and_, conditions)`) to select sessions
- **Common filters** listed at end of notebook for reference

### Example workflow:
1. Filter for `mir_generate_param=0` → run mir_generate_param
2. Filter for `sync=0` → run sync
3. Filter for `com=0` → run COM prediction
4. Filter for `com=1, com_vis=0` → run COM validation
5. Continue through pipeline...
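
The ordering of the workflow can be sketched as a lookup that returns the first step whose flag is not yet `'1'`. This is only an illustration: `PIPELINE_ORDER` and `next_step` are hypothetical names, not part of the pipeline code, and the flag order is taken from the list above.

```python
# Ordered (flag, step name) pairs; the order mirrors the workflow above.
PIPELINE_ORDER = [
    ("mir_generate_param", "calibration generation"),
    ("sync", "camera sync"),
    ("com", "COM prediction"),
    ("com_vis", "COM validation"),
    ("dannce", "sDANNCE prediction"),
    ("dannce_vis", "sDANNCE validation"),
]


def next_step(status):
    """Return the first step whose flag is not '1', or None when all are done."""
    for flag, step in PIPELINE_ORDER:
        # Flags are strings ('0'/'1'); a missing flag is treated as not done.
        if status.get(flag, "0") != "1":
            return step
    return None


raw = {"mir_generate_param": "0", "sync": "0", "com": "0"}
partly = {"mir_generate_param": "1", "sync": "1", "com": "1", "com_vis": "0"}
done = {flag: "1" for flag, _ in PIPELINE_ORDER}

print(next_step(raw))
print(next_step(partly))
print(next_step(done))
```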


In [32]:
import pyarrow.compute as pc
from functools import reduce

table = all_df

# Example: find single-animal sessions where no processing step has run yet
conditions = [
    pc.equal(table['mir_generate_param'], '0'),
    pc.equal(table['social'], '0'),
    pc.equal(table['sync'], '0'),
    pc.equal(table['com'], '0'),
    pc.equal(table['com_vis'], '0'),
]

filter_mask = reduce(pc.and_, conditions)
filtered_table = table.filter(filter_mask)

print(f"✓ Filtered: {len(filtered_table)} sessions match criteria\n")


✓ Filtered: 1 sessions match criteria



### View filtered session paths:

**Why this is useful:**
- Click paths to navigate to session directories
- Verify correct sessions are selected before processing
- Quick visual check of what will be processed

In [20]:
rec_paths = filtered_table["rec_path"].to_pylist()

print("Sessions to process:\n")
for i, path in enumerate(rec_paths, 1):
    path_str = path[0] if isinstance(path, list) else path
    print(f"{i:2d}. {path_str}")

print(f"\nTotal: {len(rec_paths)} sessions")

Sessions to process:

 1. /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/ZIcI1_1mW
 2. /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/ZIcI1_00mW
 3. /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/ZIcI1_0mW
 4. /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/ZIcI1_10mW
 5. /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/ZIcI1_5mW

Total: 5 sessions


---
## 4. Generate Calibration File

### When to run:
**Only for raw data** - Filter: `mir_generate_param=0`

### What this does:
- Generates mirror calibration parameters for multi-camera setup
- Uses specified calibration folder (checkerboard-based calibration)
- Creates configuration files needed for downstream pose estimation
- Updates status flag to `mir_generate_param=1` upon completion

### Calibration options:
Common calibration folder names:
- `calib_before_newintrinsics` - Standard calibration
- `calib_after` - Post-adjustment calibration
- `calib_extrinsics_fixed` - Fixed extrinsics calibration

### Troubleshooting:
- **"Calibration folder not found"**: Check that the specified calibration folder exists in the session directory. You can use your own custom subdirectories; the default is `/calib_before`.

In [None]:
from utlis.exe_engine_utlis.comb_all_exe import sequential_process_and_update_mirgenparam

# Specify which calibration to use, default is "calib_before"
calib_folder_name = "calib_before"
# calib_folder_name = "calib_before_newintrinsics"

print(f"Using calibration: {calib_folder_name}\n")
sequential_process_and_update_mirgenparam(filtered_table, base_folder, calib_folder_name)

print("\n✓ Mir parameter generation complete!")

Using calibration: calib_before

Found 6 calibration files.
Frame count: 18000
Processed /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/calib_before/hires_cam1_params.mat
Processed /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/calib_before/hires_cam2_params.mat
Processed /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/calib_before/hires_cam3_params.mat
Processed /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/calib_before/hires_cam4_params.mat
Processed /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/calib_before/hires_cam5_params.mat
Processed /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/calib_before/hires_cam6_params.mat
Data saved to /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/ZIcI1_1mW/2025_10_31_ZIcI1_1mW_calib_before_label3d_dannce.mat
mir_generate_param ran successfully.
Updated Parquet file at /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/ZIcI1_1mW/folder_log.parquet with new status.
Found 6 calibration files.
Frame count: 18000
Processed /data/big_rim/rsync_dcc_sum/25Nov/2025_10_31/calib_

---
## 5. Camera Synchronization

### When to run:
Filter: `sync=0`

### What this does:
- Detects LED brightness transitions in video streams
- Creates synchronized frame indices for multi-camera analysis
- Generates validation plots showing brightness profiles


### Synchronization protocol:
The recording protocol includes **3 brightness drops** (LED switches)


### Critical: Manual supervision required!

**Why supervision matters:**
- Initial frames may have fluctuating brightness, which can cause false synchronization
- Must visually verify sync looks correct

**What to check in output plots:**
- All cameras show clear brightness drops
- First drop is clean (not during fluctuation period)
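
To illustrate the idea behind drop-based alignment, here is a minimal sketch that finds sharp brightness drops in a per-frame brightness trace and derives a frame offset between two cameras. It is not the pipeline's detection code: `find_brightness_drops`, the threshold, and the synthetic traces are all assumptions made for the example.

```python
def find_brightness_drops(brightness, min_drop):
    """Return frame indices i where brightness falls by at least min_drop from frame i-1 to i."""
    return [
        i for i in range(1, len(brightness))
        if brightness[i - 1] - brightness[i] >= min_drop
    ]


# Synthetic per-frame mean brightness: LED on (200) then off (50), three times.
cam1 = [200] * 5 + [50] * 3 + [200] * 5 + [50] * 3 + [200] * 5 + [50] * 3
# A second camera that started recording two frames earlier.
cam2 = [200] * 2 + cam1

drops1 = find_brightness_drops(cam1, min_drop=100)
drops2 = find_brightness_drops(cam2, min_drop=100)

# Frame offset between the cameras, taken from the first detected drop.
offset = drops2[0] - drops1[0]

print(drops1, drops2, offset)
```

If the first drop were detected inside a fluctuating period, the offset would be wrong for the whole session, which is why the plots need a visual check.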

### Expected output:

**Generated files:**
- `videos/6cam_sync.png` - Brightness profile plot (CHECK THIS; it is also displayed when the cell runs)
- `df*label3d_dannce.mat` - Calibration file updated with synchronized frame indices

In [None]:
from utlis.exe_engine_utlis.comb_all_exe import sequential_process_and_update_sync

print("Starting camera synchronization...\n")
print("⚠️  IMPORTANT: Check the generated sync plots for each session!\n")

sequential_process_and_update_sync(filtered_table, base_folder, max_frames=800)

print("\n✓ Synchronization complete!")
print("\n📊 For later validations: Review 6cam_sync.png plots in each session's videos/ folder")

---
## 6. COM Prediction & Validation

COM (Center of Mass) prediction provides coarse animal localization needed for full pose estimation.
Note: the validation script is adapted from https://github.com/Sooophy/dannce/tree/stroke_analysis/trace_protocol. For better validation and further improvements, you can adapt scripts from Anshuman Sabath.

### Pipeline split:
- **Single animal**: Standard COM prediction
- **Social**: Modified pipeline for multi-animal tracking

### 6A. Single Animal COM Prediction

### When to run:
Filter: `social=0, com=0`

### What this does:
- Predicts 3D center-of-mass position for each frame
- Uses sDANNCE network with `--predict_com` flag


### Expected output:
```
Submitting job for /path/to/session1...
  Job ID: 12345678
Submitting job for /path/to/session2...
  Job ID: 12345679
...
✓ All jobs submitted
```

**Generated files:**
- `COM/predict00/com3d.mat` - 3D COM predictions
- Slurm output logs in session directory
- Monitor jobs: `squeue -u $USER`


In [None]:
from utlis.exe_engine_utlis.comb_all_exe import dispatch_slurm_jobs

print("Submitting COM prediction jobs to HPC cluster...\n")

dispatch_slurm_jobs(
    base_path=base_folder,
    table=filtered_table,
    slurm_launch_file="/hpc/group/tdunn/lq53/tianqing_pytorch_dannce/dannce_/slurm_launch_predict.py",
    predict_flag="--predict_com",
    conda_env="sdannce",
    partition="scavenger-gpu",
    dry_run=False,
    max_workers=6,
)

print("\n✓ Jobs submitted!")
print("\n📊 Monitor progress: squeue -u $USER")

### 6B. Single Animal COM Validation

### When to run:
Filter: `social=0, com=1, com_vis=0`

### What this does:
- Generates trajectory plots and validation visualizations
- Plots COM positions across frames
- Identifies potential tracking errors (jumps, occlusions) (optional)
- Creates frame-by-frame overlay videos (optional)
- Updates status flag to `com_vis=1` upon completion

### What to check:
- **Trajectory smoothness**: Should show natural movement patterns, not squares
- **No large jumps**: Sudden position changes indicate tracking errors
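
A simple way to think about the "no large jumps" check is a frame-to-frame displacement threshold. The sketch below is an illustration only, not the logic inside `plot_com_all`; `find_jumps`, the threshold, and the coordinates are made up, and the threshold is in whatever units the COM coordinates use.

```python
import math


def find_jumps(com_xyz, max_step):
    """Return frame indices whose distance from the previous frame exceeds max_step."""
    return [
        i for i in range(1, len(com_xyz))
        if math.dist(com_xyz[i - 1], com_xyz[i]) > max_step
    ]


# Hypothetical COM trace with one outlier frame (index 3).
com = [(0, 0, 10), (1, 0, 10), (2, 1, 10), (80, 60, 10), (3, 1, 10), (4, 2, 10)]
jumps = find_jumps(com, max_step=20)

print(jumps)
```

A single outlier frame produces two flagged indices: the jump away from the trajectory and the jump back.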

### If COM looks bad:
1. Add session path to exclusion list (e.g., `bad_com.txt`)
2. This prevents bad COM from propagating to DANNCE prediction
3. Optionally retrain or fine-tune the COM network if many sessions fail

`plot_com_all` takes additional optional arguments, e.g. `com_folder_name='COM/predict00', perform_jump_indices=True, perform_video_generation=False, perform_generate_com_video=False`.

In [None]:
from utlis.vis_valid_utlis.com_trag_updated import plot_com_all

for_com_vis = filtered_table

records = [
    {
        'date_folder': date_folder.as_py(),
        'rec_file': rec_file.as_py()
    }
    for date_folder, rec_file in zip(for_com_vis['date_folder'], for_com_vis['rec_file'])
]

print(f"Validating {len(records)} COM predictions...\n")

for i, record in enumerate(records, 1):
    base_path = f"{base_folder}/{record['date_folder']}/{record['rec_file']}"
    print(f"[{i}/{len(records)}] {base_path}")
    
    try:
        plot_com_all(base_path)
        print("  ✓ Complete")
    except Exception as e:
        print(f"  ✗ Error: {e}")
        continue

print("\n✓ COM validation complete!")

### 6C. Social COM Prediction [NOTE: the social pipeline may be updated later based on Tianqing Li's pipeline]

### When to run:
Filter: `social=1, com=0`

### What this does:
- Predicts COM for multiple animals in social interaction sessions
- Same computational setup as single animal

### Differences from single animal:
- Uses `slurm_launch_predict_social.py` instead


In [None]:
from utlis.exe_engine_utlis.comb_all_exe import dispatch_slurm_jobs

print("Submitting social COM prediction jobs...\n")

dispatch_slurm_jobs(
    base_path=base_folder,
    table=filtered_table,
    slurm_launch_file="/hpc/group/tdunn/lq53/tianqing_pytorch_dannce/dannce_/slurm_launch_predict_social.py",
    predict_flag="--predict_com",
    conda_env="sdannce",
    partition="scavenger-gpu",
    dry_run=False,
    max_workers=6,
)

print("\n✓ Social COM jobs submitted!")

### 6D. Social COM Validation

### When to run:
Filter: `social=1, com=1, com_vis=0`

### What this does:
- Validates multi-animal COM tracking
- Generates trajectory plots for each animal
- Creates interaction distance plots
- Optionally generates overlay videos

### What to check:
- **Identity maintenance**: Each animal's trajectory should be consistent
- **No identity switches**: Animals shouldn't swap IDs mid-session
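
One heuristic for spotting identity switches between two animals is to ask, frame by frame, whether swapping the two labels would give a shorter total displacement than keeping them. The sketch below illustrates that heuristic under simplifying assumptions (exactly two tracks, no missing frames); `find_identity_swaps` is a hypothetical helper and is not part of `plot_com_all_social`.

```python
import math


def find_identity_swaps(track_a, track_b):
    """Flag frames where swapping the two labels gives a shorter total displacement."""
    swaps = []
    for i in range(1, len(track_a)):
        keep = (math.dist(track_a[i - 1], track_a[i])
                + math.dist(track_b[i - 1], track_b[i]))
        swap = (math.dist(track_a[i - 1], track_b[i])
                + math.dist(track_b[i - 1], track_a[i]))
        if swap < keep:
            swaps.append(i)
    return swaps


# Hypothetical 2D tracks: the labels trade places at frame 2.
a = [(0, 0), (1, 0), (100, 100), (101, 100)]
b = [(100, 100), (100, 101), (2, 0), (3, 0)]
swap_frames = find_identity_swaps(a, b)

print(swap_frames)
```

Only the frame where the switch happens is flagged; after it, the swapped labels are self-consistent again, so a flagged frame means every later frame needs review too.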

In [None]:
from utlis.vis_valid_utlis.scom_traga_utlis import plot_com_all_social

for_com_vis = filtered_table

records = [
    {
        'date_folder': date_folder.as_py(),
        'rec_file': rec_file.as_py()
    }
    for date_folder, rec_file in zip(for_com_vis['date_folder'], for_com_vis['rec_file'])
]

print(f"Validating {len(records)} social COM predictions...\n")

for i, record in enumerate(records, 1):
    base_path = f"{base_folder}/{record['date_folder']}/{record['rec_file']}"
    print(f"[{i}/{len(records)}] {base_path}")
    
    try:
        plot_com_all_social(base_path, perform_generate_com_video=True)
        print("  ✓ Complete")
    except Exception as e:
        print(f"  ✗ Error: {e}")
        continue

print("\n✓ Social COM validation complete!")

---
## 7. sDANNCE Prediction & Validation

sDANNCE performs full 3D skeletal pose estimation.

### Prerequisites:
- ✓ COM prediction completed and validated
- ✓ Bad COM sessions added to exclusion list (optional)

### 7A. Single Animal DANNCE Prediction

### When to run:
Filter: `social=0, dannce=0`

### What this does:
- Predicts 3D coordinates for all anatomical keypoints
- Optionally skips sessions with bad COM (from exclusion file)


### Expected output:
```
Executing command: conda run -n sdannce python .../slurm_launch_predict.py ...
Executing command: conda run -n sdannce python .../slurm_launch_predict.py ...
Skipping: /path/to/bad_session is in the skip list
...
```

**Generated files:**
- `DANNCE/predict00/save_data_AVG0.mat` - 3D pose predictions



### Exclusion list (optional):
Create a text file with paths to skip (one per line):
```
2025_10_03/0single5_group2
2025_10_10/0single4_group3
```
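
A minimal sketch of reading such a file and matching sessions against it is shown below. It assumes the file lists relative `date_folder/rec_file` paths as in the example above; `load_skip_list` and `should_skip` are hypothetical helpers written for illustration.

```python
import os
import tempfile


def load_skip_list(txt_file):
    """Read one relative path per line, ignoring blank lines and trailing slashes."""
    with open(txt_file) as f:
        return {line.strip().rstrip("/") for line in f if line.strip()}


def should_skip(date_folder, rec_file, skip_list):
    """Compare the relative 'date_folder/rec_file' key against the skip list."""
    return f"{date_folder}/{rec_file}" in skip_list


# Demo with a temporary exclusion file.
with tempfile.TemporaryDirectory() as tmp:
    txt = os.path.join(tmp, "bad_com.txt")
    with open(txt, "w") as f:
        f.write("2025_10_03/0single5_group2\n\n2025_10_10/0single4_group3/\n")
    skip = load_skip_list(txt)

skipped = should_skip("2025_10_03", "0single5_group2", skip)
kept = should_skip("2025_10_31", "ZIcI1_0mW", skip)

print(sorted(skip), skipped, kept)
```

Matching on the relative key keeps the exclusion file portable between the cluster and a local copy of the data.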

In [None]:
from concurrent.futures import ThreadPoolExecutor
import os

for_dannce = filtered_table
slurm_launch_file = "/hpc/group/tdunn/lq53/251017_new_dannce_files/slurm_launch_predict.py"

def check_expdir(expdir):
    """Check if experiment directory exists."""
    if not os.path.exists(expdir):
        print(f"Skipping: Experiment directory {expdir} does not exist")
        return None
    return expdir

def run_command(base_path, date_folder, rec_file, partition='scavenger-gpu', dry_run=True):
    """Submit DANNCE prediction job via Slurm."""
    expdir_path = os.path.join(base_path, date_folder, rec_file)
    
    if check_expdir(expdir_path) is None:
        return
    
    command = f"conda run -n sdannce python {slurm_launch_file} --expdir {expdir_path} --predict_dannce --partition {partition}"
    
    if dry_run:
        print(f"[DRY-RUN] Command: {command}")
    else:
        print(f"Executing command: {command}")
        os.system(command)

# Optional: Load exclusion list
txt_file = "/hpc/group/tdunn/Bryan_Rigs/BigOpenField/lumi_novel_object_recog/bad_com.txt"
rel_paths_to_skip = set()

if os.path.exists(txt_file):
    print(f"Loading exclusion list from: {txt_file}\n")
    with open(txt_file, 'r') as f:
        for line in f:
            rel_path = line.strip()
            if rel_path:
                rel_paths_to_skip.add(rel_path)
    print(f"Will skip {len(rel_paths_to_skip)} sessions\n")

# Prepare records
records = [
    {
        'date_folder': date_folder.as_py(),
        'rec_file': rec_file.as_py()
    }
    for date_folder, rec_file in zip(for_dannce['date_folder'], for_dannce['rec_file'])
]

max_concurrent_jobs = 4  # Parallel jobs
dry_run = False  # Set to True for testing

print(f"Submitting {len(records)} DANNCE prediction jobs...\n")

with ThreadPoolExecutor(max_workers=max_concurrent_jobs) as executor:
    futures = []
    
    for record in records:
        rel_path = os.path.join(record['date_folder'], record['rec_file'])
        expdir_path = os.path.join(base_folder, rel_path)
        
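        # The exclusion file may list relative or absolute paths; accept either.
        if rel_path in rel_paths_to_skip or expdir_path in rel_paths_to_skip: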
            print(f"Skipping: {rel_path} is in the skip list")
            continue
        
        futures.append(
            executor.submit(run_command, base_folder, record['date_folder'], record['rec_file'], 'scavenger-gpu', dry_run)
        )

print("\n✓ DANNCE jobs submitted!")
print("\n📊 Monitor: squeue -u $USER")
print("⏱️  Expected: ~30-60 min per session")

### 7B. Single Animal DANNCE Validation

### When to run:
Filter: `social=0, dannce=1, dannce_vis=0`

### What this does:
- Generates 3D pose validation visualizations
- Checks anatomical constraints (bone lengths, angles)
- Creates skeleton overlay videos
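
As an example of what a bone-length constraint check can look like, the sketch below computes the coefficient of variation of the distance between two keypoints across frames; a rigid bone should give a value near zero. This is an assumption-laden illustration, not the implementation of `dannce_valid`: the helper name and the keypoint data are hypothetical.

```python
import math
import statistics


def bone_length_cv(joint_a, joint_b):
    """Coefficient of variation (std / mean) of the joint-to-joint distance over frames."""
    lengths = [math.dist(p, q) for p, q in zip(joint_a, joint_b)]
    return statistics.pstdev(lengths) / statistics.mean(lengths)


# Hypothetical keypoints over four frames, as (x, y, z) tuples.
hip = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
knee = [(0, 10, 0), (1, 10, 0), (2, 10, 0), (3, 10, 0)]   # always 10 units from the hip
ankle = [(0, 20, 0), (1, 25, 0), (2, 15, 0), (3, 40, 0)]  # unstable prediction

stable_cv = bone_length_cv(hip, knee)
unstable_cv = bone_length_cv(knee, ankle)

print(stable_cv, unstable_cv)
```

Sessions where many bones show a high value are good candidates for a closer look at the overlay videos.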

In [None]:
from useful_files.sophie_check_dannce_mir_modif import dannce_valid
from concurrent.futures import ProcessPoolExecutor, as_completed

for_dannce_vis = filtered_table

records = [
    {
        'date_folder': date_folder.as_py(),
        'rec_file': rec_file.as_py()
    }
    for date_folder, rec_file in zip(for_dannce_vis['date_folder'], for_dannce_vis['rec_file'])
]

print(f"Validating {len(records)} DANNCE predictions...\n")

def process_record(record):
    """Process a single DANNCE validation."""
    base_path = f"{base_folder}/{record['date_folder']}/{record['rec_file']}"
    print(base_path)
    try:
        dannce_valid(base_path)
        return f"✓ {base_path}"
    except Exception as e:
        return f"✗ {base_path}: {e}"

with ProcessPoolExecutor() as executor:
    futures = [executor.submit(process_record, record) for record in records]
    for future in as_completed(futures):
        result = future.result()
        print(result)

print("\n✓ DANNCE validation complete!")
print("\n📊 Review validation plots in DANNCE/predict00/vis/ folders")

---
## 8. Social DANNCE (Coming Soon)

### Current status:
Social DANNCE prediction and validation workflows will be updated soon. 

### Available visualization:
For social sessions with 3D poses, you can use:

```python
from utlis.vis_valid_utlis.social_dannce_vis import visualize_frames

visualize_frames(incident_all_six_cam, config=C, out_name="all_incidents")
```

**Requirements:**
- Config object (`C`) with session parameters
- Incident frames identified from behavioral analysis
- Multi-camera video frames extracted

### Coming updates:
- Streamlined social DANNCE prediction pipeline
- Automated identity assignment validation
- Social interaction metrics (distances, orientations)
- Multi-animal skeleton overlays

**Timeline:** Will be updated once workflow is finalized and tested.

---
## 📚 Reference: Common Filter Combinations

Copy-paste these filter blocks as needed. Note that if you start from raw data, extra filters are unnecessary, because you have to rescan and reload after each step to get the updated status:

### Raw Data Processing
```python
# Step 1: Generate mirror parameters
conditions = [pc.equal(table['mir_generate_param'], '0')]

# Step 2: Sync cameras
conditions = [pc.equal(table['sync'], '0')]
```

### Single Animal Pipeline
```python
# COM prediction
conditions = [
    pc.equal(table['sync'], '1'),
    pc.equal(table['social'], '0'),
    pc.equal(table['com'], '0'),
]

# COM validation
conditions = [
    pc.equal(table['social'], '0'),
    pc.equal(table['com'], '1'),
    pc.equal(table['com_vis'], '0'),
]

# DANNCE prediction
conditions = [
    pc.equal(table['social'], '0'),
    pc.equal(table['dannce'], '0'),
    pc.equal(table['com'], '1'),
    pc.equal(table['com_vis'], '1'),
]

# DANNCE validation
conditions = [
    pc.equal(table['social'], '0'),
    pc.equal(table['dannce'], '1'),
    pc.equal(table['dannce_vis'], '0'),
    pc.equal(table['com'], '1'),
    pc.equal(table['com_vis'], '1'),
]
```

### Social Animal Pipeline
```python
# COM prediction
conditions = [
    pc.equal(table['sync'], '1'),
    pc.equal(table['social'], '1'),
    pc.equal(table['com'], '0'),
]

# COM validation
conditions = [
    pc.equal(table['social'], '1'),
    pc.equal(table['com'], '1'),
    pc.equal(table['com_vis'], '0'),
]
```

### Utility Filters
```python
# Specific date
conditions = [pc.equal(table['date_folder'], '2025_11_03')]

# Date range (requires string matching)
conditions = [pc.match_substring(table['date_folder'], '2025_10')]

# Multiple conditions
conditions = [
    pc.equal(table['social'], '1'),
    pc.equal(table['com'], '1'),
    pc.match_substring(table['date_folder'], '2025_10'),
]
```
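
Substring matching on `'2025_10'` selects a whole month. For an arbitrary date range, one option is to parse the folder names into dates first. The sketch below is a plain-Python illustration with hypothetical helper names; it assumes every folder name follows the `YYYY_MM_DD` pattern.

```python
from datetime import date, datetime


def parse_date_folder(name):
    """Convert a 'YYYY_MM_DD' folder name into a datetime.date."""
    return datetime.strptime(name, "%Y_%m_%d").date()


def in_range(name, start, end):
    """True if the folder's date lies in the closed interval [start, end]."""
    return start <= parse_date_folder(name) <= end


folders = ["2025_10_03", "2025_10_31", "2025_11_03"]
picked = [f for f in folders if in_range(f, date(2025, 10, 15), date(2025, 11, 5))]

print(picked)
```

The selected folder names could then be turned into a table mask, for example with `pc.is_in(table['date_folder'], value_set=pa.array(picked))`.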

---
## 🔧 Troubleshooting Guide

### Common Issues

**PyArrow filtering errors:**
- Always use strings for status values: `'0'`, `'1'` not `0`, `1`
- Check column names match exactly (case-sensitive)
- Use `all_df.to_pandas().columns.tolist()` to verify columns

**Poor prediction quality:**
- Review calibration - try different calib folder
- Check synchronization plots for issues
- Consider retraining networks on your data

### Getting Help

- Review generated plots/videos for clues
- Compare with successfully processed sessions
- Check Slurm output files for error messages
- Post an issue on GitHub

---
## 📊 Pipeline Summary

| Step | Filter | Output Files |
|------|--------|-------------|
| **Scan** | N/A | `folder_log.parquet` |
| **Mirror Gen** | `mir_generate_param=0` | calib file |
| **Sync** | `sync=0` | `6cam_sync.png`, updated calib file |
| **COM Pred** | `com=0` |  `com3d.mat` |
| **COM Val** | `com=1, com_vis=0` | Trajectory plots |
| **DANNCE Pred** | `dannce=0` |  `save_data_AVG0.mat` |
| **DANNCE Val** | `dannce=1, dannce_vis=0` |  Validation plots |

---

## ✅ Next Steps

After completing this preprocessing pipeline:

1. **Data Quality Check**: Review all validation plots systematically
2. **Behavioral Analysis**: Extract kinematic features from 3D poses (you can try neuroposelib)
3. **Statistical Analysis**: Analyze behavioral metrics across conditions
4. **Neural Alignment**: Correlate behavior with calcium imaging data (see other tutorials with general loader)

---

**Questions?** Check function docstrings or review generated outputs for debugging clues.

**Happy pre-processing! 🎉**