```


               +-------------------+
               |   Input Video or  |
               |   Image Sequence  |
               +--------+----------+
                        |
                        v
             +----------+-----------+
             |   Resize & Normalize |
             |      (to 512x512)    |
             +----------+-----------+
                        |
                        v
                +-------+--------+
                |   TSP-SAM      |<---+
                | Probabilistic  |    |
                |   Segmentation |    |
                +-------+--------+    |
                        |             |
                        v             |
               +--------+----------+  |
               | Adaptive Threshold|  |
               | + Morph + Filter  |  |
               +--------+----------+  |
                        |             |
               +--------v----------+  |
               |  TSP Mask (binary)|  |
               +--------+----------+  |
                        |             |
                        |             |
         +--------------+-------------+--------------+
         |                            |              |
         v                            v              v
 +--------------+        +--------------------+   +--------------------+
 |  SAM Box     |        |  OpenPose Keypoints|   | Temporal Memory    |
  | Extraction   |        | + SAM2 Segmentation|   |(Optional Smoothing)|
 +------+-------+        +----------+---------+   +---------+----------+
        |                           |                       |
        v                           v                       |
 +-------------+         +-------------------+              |
 | SAM Mask    |         | Pose-Guided Mask  |              |
  +------+------+         +---------+---------+              |
        |                          |                        |
        +-----------+  +----------+                         |
                    v  v                                    |
           +--------+--+----------+                         |
           |  Fusion Logic (YAML) |  <----------------------+
           |  (union, tsp+pose...)|
           +--------+-------------+
                    |
                    v
     +--------------+---------------+
     | Post-process Fused Mask      |
     | (MorphOpen → Close → Filter) |
     +--------------+---------------+
                    |
                    v
         +----------+----------+
         | Save Binary Mask    |
         | Save Overlay Image  |
         | Save Composite View |
         +----------+----------+
                    |
                    v
        +-----------+-------------+
        |  Write Debug Statistics |
        |  (mask areas, thresholds|
        |   region count, etc.)   |
        +-------------------------+


```


# MaskAnyone–Temporal Pipeline with Code Mapping


---

## 1. **Input & Resize**

```
+-------------------+
|   Input Video or  |
|   Image Sequence  |
+--------+----------+
         |
         v
+--------+-----------+
| Resize & Normalize |
+--------------------+
```

**Code:**

```python
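# DAVIS mode reads each frame from an image file; otherwise frames come from a video capture.
if dataset_mode == "davis":
    frame = cv2.imread(str(frame_data))
else:
    ret, frame = cap.read()
# Note: cv2.imread returns None on failure and ret is False at end of stream,
# so frame should be validated before it is resized.
frame_resized = resize_frame(frame, infer_cfg)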
```
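The body of `resize_frame()` is not shown here. As a rough illustration of what a "resize and normalize" step does, the sketch below uses plain NumPy nearest-neighbour indexing and scales pixel values to `[0, 1]`. The helper names `resize_nearest` and `normalize`, the 512x512 default (taken from the diagram), and the `[0, 1]` range are assumptions; the real pipeline most likely uses `cv2.resize` and its own normalization.

```python
import numpy as np

def resize_nearest(frame, size=(512, 512)):
    """Nearest-neighbour resize to (height, width) using NumPy indexing only."""
    out_h, out_w = size
    in_h, in_w = frame.shape[:2]
    # Map every output row/column back to its source row/column.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return frame[rows[:, None], cols]

def normalize(frame):
    """Scale uint8 pixel values to float32 in [0, 1] (assumed convention)."""
    return frame.astype(np.float32) / 255.0

# Hypothetical 640x480 frame filled with mid-gray.
frame = np.full((480, 640, 3), 128, dtype=np.uint8)
resized = resize_nearest(frame)
normalized = normalize(resized)
```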

---

## 2. **TSP-SAM Segmentation**

```
+------------------+
|   TSP-SAM Model  |
+------------------+
```

**Code:**

```python
tsp_mask, stats = model_infer_real(
    model, frame_resized, infer_cfg,
    debug_save_dir=output_path / "tsp_thresh", frame_idx=idx
)
```

---

## 3. **Adaptive Thresholding + Post-processing**

```
+---------------------------+
| Adaptive Thresholding +   |
| Morphological Filtering   |
+---------------------------+
```

**Code (inside `model_infer_real()`):**

```python
adaptive_thresh = get_adaptive_threshold(prob, percentile=percentile)
raw_mask = (prob > adaptive_thresh).astype(np.uint8)
...
mask_cleaned = cv2.morphologyEx(mask_open, cv2.MORPH_CLOSE, kernel)
...
final_mask = mask_cleaned * 255 if num_pixels >= min_area else ...
```
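The implementation of `get_adaptive_threshold()` is not included in this excerpt. A percentile-based threshold could look roughly like the sketch below, which assumes the threshold is simply the given percentile of the probability map. The name `percentile_threshold` and the example values are illustrative only.

```python
import numpy as np

def percentile_threshold(prob, percentile=90.0):
    """Return the value below which `percentile` percent of the probabilities fall."""
    return float(np.percentile(prob, percentile))

# Hypothetical probability map with 100 evenly spaced values in [0, 0.99].
prob = (np.arange(100) / 100.0).reshape(10, 10)
thresh = percentile_threshold(prob, percentile=90.0)

# Pixels strictly above the threshold become foreground (1), the rest background (0).
raw_mask = (prob > thresh).astype(np.uint8)
```

With a 90th-percentile threshold, about the top 10% of pixels survive regardless of the absolute probability scale, which is what makes the threshold adapt to each frame.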

---

## 4. **SAM Mask Extraction via Bounding Box**

```
+--------------+
|  SAM Mask    |
| (via BBox)   |
+--------------+
```

**Code:**

```python
if use_sam:
    bbox = extract_bbox_from_mask(tsp_mask)
    if bbox:
        sam_raw = sam_wrapper.segment_with_box(rgb, str(bbox))
        sam_mask = cv2.resize(sam_raw, tsp_mask.shape[::-1], interpolation=cv2.INTER_NEAREST)
```
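`extract_bbox_from_mask()` is not shown above. One plausible way to derive a box prompt from a binary mask is to take the extent of its nonzero pixels, as in this sketch. The function name `bbox_from_mask` and the `(x_min, y_min, x_max, y_max)` ordering are assumptions, and the format expected by `sam_wrapper.segment_with_box` may differ.

```python
import numpy as np

def bbox_from_mask(mask):
    """Return (x_min, y_min, x_max, y_max) of the nonzero pixels, or None if the mask is empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Hypothetical 8x10 mask with a filled rectangle.
mask = np.zeros((8, 10), dtype=np.uint8)
mask[2:5, 3:7] = 255

bbox = bbox_from_mask(mask)
empty_bbox = bbox_from_mask(np.zeros((8, 10), dtype=np.uint8))
```

Returning `None` for an empty mask fits the `if bbox:` guard in the snippet above. Note also that `tsp_mask.shape[::-1]` yields `(width, height)`, which is the order `cv2.resize` expects for its size argument.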

---

## 5. **Pose Mask via OpenPose + SAM2**

```
+---------------------+
| Pose Mask (SAM2 +   |
|   OpenPose Keypts)  |
+---------------------+
```

**Code:**

```python
if use_pose:
    keypoints = extract_pose_keypoints(rgb)
    ...
    masks = sam2_client.predict_points(Image.fromarray(rgb), scaled, [1]*len(scaled))
    if masks and masks[0] is not None:
        pose_mask = cv2.resize(masks[0], tsp_mask.shape[::-1], interpolation=cv2.INTER_NEAREST)
```

---

## 6. **Fusion Logic**

```
+------------------------+
|  Fusion Strategy YAML  |
+------------------------+
```

**Code:**

```python
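# fused_mask is presumably initialized from tsp_mask before this block (not shown in this excerpt).
if fusion_method == "union":
    # Union of every available mask: TSP, SAM, and (if enabled) pose.
    fused_mask = cv2.bitwise_or(fused_mask, sam_mask)
    if use_pose:
        fused_mask = cv2.bitwise_or(fused_mask, pose_mask)
elif fusion_method == "sam_only":
    # Use the SAM mask alone.
    fused_mask = sam_mask
elif fusion_method == "tsp+pose":
    # Combine the TSP and pose masks, ignoring SAM.
    fused_mask = cv2.bitwise_or(tsp_mask, pose_mask)
...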
```
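The branching above can be packaged as a small dispatch function. The sketch below is an assumption-laden illustration, not the pipeline's actual code: `fuse_masks` is a hypothetical name, `np.bitwise_or` stands in for `cv2.bitwise_or`, and only the three strategies visible in the excerpt are handled.

```python
import numpy as np

def fuse_masks(method, tsp_mask, sam_mask=None, pose_mask=None):
    """Combine binary (0/255) masks according to a fusion strategy name."""
    if method == "union":
        fused = tsp_mask.copy()
        # OR in every optional mask that is actually available.
        for extra in (sam_mask, pose_mask):
            if extra is not None:
                fused = np.bitwise_or(fused, extra)
        return fused
    if method == "sam_only":
        if sam_mask is None:
            raise ValueError("sam_only requires a SAM mask")
        return sam_mask
    if method == "tsp+pose":
        if pose_mask is None:
            raise ValueError("tsp+pose requires a pose mask")
        return np.bitwise_or(tsp_mask, pose_mask)
    raise ValueError(f"unknown fusion method: {method}")

# Tiny hypothetical masks, each with a single foreground pixel.
tsp = np.array([[255, 0], [0, 0]], dtype=np.uint8)
sam = np.array([[0, 255], [0, 0]], dtype=np.uint8)
pose = np.array([[0, 0], [255, 0]], dtype=np.uint8)

union = fuse_masks("union", tsp, sam, pose)
tsp_pose = fuse_masks("tsp+pose", tsp, sam, pose)
```

Raising on an unknown strategy name makes a typo in the YAML config fail loudly instead of silently falling through.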

---

## 7. **Temporal Smoothing (Optional)**

```
+--------------------------+
| Rolling Mask Memory Avg  |
+--------------------------+
```

**Code:**

```python
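if temporal_smoothing:
    # Add the current mask to the rolling memory and average over the stored frames.
    mask_memory.append(fused_mask.astype(np.float32))
    smoothed_mask = np.mean(mask_memory, axis=0)
    # Re-binarize: with 0/255 masks, a pixel stays foreground if it is set
    # in at least half of the stored masks.
    fused_mask = (smoothed_mask > 127).astype(np.uint8) * 255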
```
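The snippet does not show how `mask_memory` is created. For the average to be "rolling", it needs to be bounded; a `collections.deque` with `maxlen` is one common choice. The sketch below assumes a window of 3 frames and a hypothetical helper named `smooth_mask`, and tracks a single pixel across four frames.

```python
from collections import deque

import numpy as np

def smooth_mask(mask_memory, mask):
    """Append `mask` to the rolling memory and return the re-binarized average."""
    mask_memory.append(mask.astype(np.float32))
    mean = np.mean(np.stack(list(mask_memory)), axis=0)
    return (mean > 127).astype(np.uint8) * 255

# Assumed window size of 3; the oldest mask is dropped automatically.
memory = deque(maxlen=3)

on = np.array([[255]], dtype=np.uint8)
off = np.array([[0]], dtype=np.uint8)

# The pixel is on for two frames, then off for two frames.
outputs = [int(smooth_mask(memory, m)[0, 0]) for m in (on, on, off, off)]
```

In this run the pixel stays on for one extra frame after it disappears from the input, which shows both the benefit (single-frame dropouts are suppressed) and the cost (a short lag) of this kind of smoothing.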

---

## 8. **Post-process Fused Mask**

```
+-----------------------------+
| MorphOpen/Close + CC Filter |
+-----------------------------+
```

**Code:**

```python
fused_mask = post_process_fused_mask(fused_mask, min_area=min_area)
```

`post_process_fused_mask()`:

```python
cleaned = cv2.morphologyEx(fused_mask, cv2.MORPH_OPEN, kernel)
closed = cv2.morphologyEx(cleaned, cv2.MORPH_CLOSE, kernel)
# connected components + area filtering
```
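The connected-components step is only hinted at by the comment above. To make the idea concrete, here is a dependency-light sketch that removes regions smaller than `min_area` with a flood fill. It assumes 4-connectivity and uses a hypothetical helper `filter_small_components`; the actual code may instead rely on OpenCV's `cv2.connectedComponentsWithStats`, which is considerably faster.

```python
import numpy as np

def filter_small_components(mask, min_area):
    """Keep only 4-connected foreground regions with at least `min_area` pixels."""
    h, w = mask.shape
    fg = mask > 0
    visited = np.zeros((h, w), dtype=bool)
    out = np.zeros((h, w), dtype=np.uint8)
    for sy, sx in zip(*np.nonzero(fg)):
        if visited[sy, sx]:
            continue
        # Flood-fill one region starting from this unvisited foreground pixel.
        stack = [(int(sy), int(sx))]
        visited[sy, sx] = True
        region = []
        while stack:
            y, x = stack.pop()
            region.append((y, x))
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and fg[ny, nx] and not visited[ny, nx]:
                    visited[ny, nx] = True
                    stack.append((ny, nx))
        # Copy the region to the output only if it is large enough.
        if len(region) >= min_area:
            ys, xs = zip(*region)
            out[list(ys), list(xs)] = 255
    return out

# Hypothetical mask: one 3x3 blob plus one isolated noise pixel.
mask = np.zeros((6, 6), dtype=np.uint8)
mask[0:3, 0:3] = 255
mask[5, 5] = 255

cleaned = filter_small_components(mask, min_area=4)
```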

---

## 9. **Output Visualizations**

```
+-----------------------------+
| Save Mask + Overlay + CSV   |
+-----------------------------+
```

**Code:**

```python
save_mask_and_frame(frame, mask_resized, str(output_path), idx,
    save_overlay=output_cfg.get("save_overlay", True),
    overlay_alpha=output_cfg.get("overlay_alpha", 0.5),
    save_frames=output_cfg.get("save_frames", False),
    save_composite=True)
```
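The internals of `save_mask_and_frame()` are not shown. The `overlay_alpha` parameter suggests standard alpha blending of a highlight color over the masked pixels, which can be sketched as follows. The function name `overlay_mask`, the red highlight color (in BGR order), and the truncating cast are all assumptions.

```python
import numpy as np

def overlay_mask(frame, mask, alpha=0.5, color=(0, 0, 255)):
    """Blend `color` into `frame` wherever `mask` is nonzero."""
    out = frame.astype(np.float32)
    selected = mask > 0
    # Standard alpha blend: (1 - alpha) * original + alpha * highlight color.
    out[selected] = (1.0 - alpha) * out[selected] + alpha * np.array(color, dtype=np.float32)
    return np.clip(out, 0, 255).astype(np.uint8)

# Hypothetical 2x2 gray frame with one masked pixel.
frame = np.full((2, 2, 3), 100, dtype=np.uint8)
mask = np.array([[255, 0], [0, 0]], dtype=np.uint8)

overlay = overlay_mask(frame, mask, alpha=0.5)
```

Pixels outside the mask are left untouched, so the overlay can be compared side by side with the original frame in a composite view.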

---

## 10. **Debug CSV Logging**

```
+-------------------+
|  Write debug_stats|
+-------------------+
```

**Code:**

```python
writer.writerow([
    idx, stats['mean'], stats['max'], stats['min'], stats['adaptive_thresh'],
    tsp_area, sam_area, pose_area, fused_area, max_region
])
```
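A header row makes the debug CSV much easier to load later. The sketch below writes a header followed by one row using the standard `csv` module. The column names simply mirror the variables in the `writer.writerow` call above and are assumptions, as are the sample values and the `write_stats` helper; an in-memory buffer stands in for the real output file.

```python
import csv
import io

# Assumed column names, one per value written in the row above.
HEADER = [
    "frame_idx", "mean", "max", "min", "adaptive_thresh",
    "tsp_area", "sam_area", "pose_area", "fused_area", "max_region",
]

def write_stats(stream, rows):
    """Write the header and all stat rows to an open text stream."""
    writer = csv.writer(stream)
    writer.writerow(HEADER)
    for row in rows:
        writer.writerow(row)

# Hypothetical stats for frame 0.
buffer = io.StringIO()
write_stats(buffer, [[0, 0.12, 0.98, 0.0, 0.75, 1200, 1100, 900, 1500, 1400]])
lines = buffer.getvalue().splitlines()
```

When writing to a real file, opening it with `newline=""` avoids blank lines between rows on Windows, as recommended by the `csv` module documentation.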
