# QP2 Computational Lab Week 2 Template

In Part 1, we loaded our ultrasound images and did some pre-processing of an image to make the follicles more distinct, and therefore easier to recognize by AI models. We quantified the improvements to signal-to-noise ratio provided by a median filter and CLAHE method.

This week, we will work to create 'masks' of the follicles on pre-processed images, which can be used for training AI systems to recognize our features of interest. 

We will explore three methods: 
* The Gold Standard: Manual annotation of images
* A Silver (?) Standard: Using Cellpose, a generalist algorithm for cellular segmentation, released in 2020
* The 'Traditional' Method: Using OpenCV (cv2), an open-source computer vision library, available since 2011

First, let's pre-process all of the files in your Dominant Follicle folder. (No need to do the ROI or CNR checks anymore; we already validated this method in Part 1.) Here is some sample code for that. If you changed any of the processing parameters for median blur or CLAHE, make the same modifications here.

In [1]:
import cv2
import os
import glob
from tqdm import tqdm # Useful for showing a progress bar

def bulk_process_ultrasound(input_dir, output_dir):
    # 1. Create output directory if it doesn't exist
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
        print(f"Created directory: {output_dir}")

    # 2. Get list of all .jpg files
    image_paths = glob.glob(os.path.join(input_dir, "*.jpg"))
    print(f"Found {len(image_paths)} images to process.")

    # 3. Setup CLAHE
    clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(8,8))

    # 4. Loop through images
    for path in tqdm(image_paths):
        # Read image
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        if img is None: continue

        # --- WEEK 1 PIPELINE ---
        # Step A: Median Filter (Speckle removal)
        denoised = cv2.medianBlur(img, 5)
        
        # Step B: CLAHE (Contrast stretch)
        processed = clahe.apply(denoised)

        # 5. Save the result
        file_name = os.path.basename(path)
        save_path = os.path.join(output_dir, file_name)
        cv2.imwrite(save_path, processed)

    print(f"\nDone! Processed images are in: {output_dir}")


In [6]:
input_directory = r'C:\Users\zacha\qp2Clab1\archive\Ovarian_US\dominant_follicle'
bulk_process_ultrasound(input_directory, "output_images")

Found 1296 images to process.


100%|██████████| 1296/1296 [00:02<00:00, 445.64it/s]


Done! Processed images are in: output_images





Once this is done, you will set up LabelMe to generate high-quality "Gold Standard" datasets for medical segmentation. LabelMe is a Python-based annotation tool that allows us to draw precise polygons around follicles.

Since LabelMe is a Python package, the safest way to install it is within a Virtual Environment to avoid conflicts with your other AI libraries.



# Manual Annotation

## Create a Dedicated Environment

Open your terminal (or Anaconda Prompt) and run:


conda create --name labelme python=3.9

Then activate it:

conda activate labelme

## Install LabelMe

Once the environment is active, install the package via pip:

pip install labelme


## Launch the App

Simply type the name of the tool in your terminal:

labelme

A window should pop up. You can now go to File > Open Dir and select your folder of processed ultrasound .jpg files. Be sure to use the processed JPEGs, not the raw ones. LabelMe will create .json files in the same folder.

## How to Annotate for Folliculometry
To get the best results for your future U-Net, follow these "Clinical Annotation" rules:

Use Polygons: Click "Create Polygons" (or press Ctrl+N). Do not use rectangles; follicles are organic, fluid-filled sacs and require precise boundary tracing.

The "Inner Wall" Rule: In ultrasound, the follicle wall has some thickness. Aim for the interface where the dark fluid meets the gray tissue.

Naming Conventions: When prompted for a label, use a consistent name like follicle. If you are tracking multiple follicles, you can name them follicle_1, follicle_2, etc.

Save Often: LabelMe creates one .json file for every .jpg image. These files contain the coordinate points you will later convert into .png masks.
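
To see what those .json files hold, here is a minimal sketch that summarises the shapes in one annotation file. The sample dictionary is hand-written for illustration and shows only the keys used here (`shapes`, `label`, `points`, `shape_type`, `imagePath`, `imageHeight`, `imageWidth`); real LabelMe files contain additional fields. The helper name `summarize_labelme` is hypothetical.

```python
import json
import os
import tempfile

def summarize_labelme(json_path):
    """Return (label, shape_type, number_of_points) for every shape in the file."""
    with open(json_path, "r") as f:
        data = json.load(f)
    return [
        (s["label"], s.get("shape_type", "polygon"), len(s["points"]))
        for s in data["shapes"]
    ]

# Hand-written stand-in for a LabelMe file (illustrative values only)
example = {
    "shapes": [
        {
            "label": "follicle",
            "shape_type": "polygon",
            "points": [[100.0, 80.0], [140.0, 85.0], [150.0, 120.0], [105.0, 125.0]],
        }
    ],
    "imagePath": "dominant_follicle_0001.jpg",
    "imageHeight": 256,
    "imageWidth": 256,
}

# Write the stand-in to a temporary folder, then read it back
tmp_dir = tempfile.mkdtemp()
example_path = os.path.join(tmp_dir, "dominant_follicle_0001.json")
with open(example_path, "w") as f:
    json.dump(example, f)

summary = summarize_labelme(example_path)
print(summary)
```

Running the same helper on your own five files is a quick way to confirm that every image has exactly one `follicle` polygon before you convert anything to a mask.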

Your task: Annotate the dominant follicle in 5 images, using LabelMe.  When that is done, return here.

Now that you have 5 .json files from LabelMe, let's do a Mask Audit. This makes sure our manual work is ready for AI. 

First, convert your json files to binary masks to create .png files. Here is sample code to do that one at a time.

In [7]:
import json
import numpy as np
import cv2

def json_to_binary_mask(json_path, image_shape):
    with open(json_path, 'r') as f:
        data = json.load(f)
    
    # Create blank canvas
    mask = np.zeros(image_shape, dtype=np.uint8)
    
    for shape in data['shapes']:
        # Extract polygon points
        points = np.array(shape['points'], dtype=np.int32)
        # Fill polygon with 255 (Follicle)
        cv2.fillPoly(mask, [points], 255)
        
    return mask

# Save as PNG (NOT JPG!).  Example:
# cv2.imwrite('mask_01.png', mask)

In [19]:
# if you're not sure what your image_shape is, check it here: 
import cv2

# Load one of your processed images
temp_img = cv2.imread('output_images/dominant_follicle_0001.jpg', cv2.IMREAD_GRAYSCALE)

# This will give you a tuple like (480, 640)
my_shape = temp_img.shape 

print(f"The image shape is: {my_shape}")

The image shape is: (256, 256)


In [21]:
for i in range(1, 6):
    json_path = f"output_images/dominant_follicle_{i:04d}.json"
    mask = json_to_binary_mask(json_path, my_shape)
    cv2.imwrite(f"mask_{i:04d}.png", mask)

In [2]:
import cv2

def check_mask_alignment(img, mask):
    # Convert the grayscale image to 3 channels (OpenCV uses BGR channel order)
    overlay = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    # Paint the mask region red: in BGR order, red is [0, 0, 255]
    # (convert with cv2.COLOR_BGR2RGB before showing the result in matplotlib)
    overlay[mask > 0] = [0, 0, 255]
    
    # Blend the red overlay (30%) with the original image (70%)
    alpha = 0.3
    combined = cv2.addWeighted(overlay, alpha, cv2.cvtColor(img, cv2.COLOR_GRAY2BGR), 1 - alpha, 0)
    
    return combined

# Just process the first image without displaying
img_path = "output_images/dominant_follicle_0001.jpg"
mask_path = "mask_0001.png"

img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)

if img is not None and mask is not None:
    result = check_mask_alignment(img, mask)
    print("Alignment check completed successfully")
else:
    print("Could not load image or mask")

Alignment check completed successfully


Now do an overlay check: Use the code below to overlay your mask onto the original image. If the red "tint" doesn't perfectly match the follicle boundary, you need to refine your trace in LabelMe.

In [3]:
import cv2

for i in range(1, 6):
    img_path = f"output_images/dominant_follicle_{i:04d}.jpg"
    mask_path = f"mask_{i:04d}.png"
    
    img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    
    if img is not None and mask is not None:
        # Create a red-tinted overlay (OpenCV uses BGR order, so red is [0, 0, 255])
        overlay = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
        overlay[mask > 0] = [0, 0, 255]
        
        # Blend the red overlay (30%) with the original image (70%)
        alpha = 0.3
        combined = cv2.addWeighted(overlay, alpha, cv2.cvtColor(img, cv2.COLOR_GRAY2BGR), 1 - alpha, 0)
        
        # Save instead of display
        cv2.imwrite(f"alignment_check_{i:04d}.png", combined)
        print(f"Saved alignment_check_{i:04d}.png")
    else:
        print(f"Could not load image or mask for follicle_{i:04d}")

Saved alignment_check_0001.png
Saved alignment_check_0002.png
Saved alignment_check_0003.png
Saved alignment_check_0004.png
Saved alignment_check_0005.png


# Cellpose Annotation

Unless you would like to manually annotate all of the files in the folder, let's try some ways to automate the rest of the dataset. Cellpose was first made publicly available as a preprint on bioRxiv on February 3, 2020. The peer-reviewed paper was subsequently published online in the journal Nature Methods on December 14, 2020, and appeared in the January 2021 issue.

It was developed by Carsen Stringer, Tim Wang, Michalis Michaelos, and Marius Pachitariu of the Janelia Research Campus as a "generalist algorithm for cellular segmentation" designed to work on a wide variety of biological images. It's pretty impressive!

NOTE: I do not expect Cellpose to do very well - don't be alarmed if it does a poor job.  But you might uncover a great way to use it!

Take a few minutes to go to cellpose.org and play around with the sample images there. This will give you a much better idea of the capabilities of the system and of what we will be trying to do when we call Cellpose from our script.

In [4]:
# First install Cellpose (the % prefix runs pip from inside a notebook cell)
%pip install cellpose

Cellpose takes a lot of processing power. Before trying to annotate the entire folder, create a working folder of a subset; use just the files you manually annotated in LabelMe. Use the working folder for the rest of this section.
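
One way to build that working folder is sketched below. It assumes the processed images are in `output_images` and that you annotated `dominant_follicle_0001.jpg` through `dominant_follicle_0005.jpg`; adjust the names if yours differ. The helper `copy_subset` is a hypothetical name, and `cellposeTest` is simply the folder name used later in this notebook.

```python
import os
import shutil

def copy_subset(src_dir, dst_dir, file_names):
    """Copy the named files from src_dir into dst_dir; return the names actually copied."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for name in file_names:
        src = os.path.join(src_dir, name)
        if os.path.exists(src):
            shutil.copy2(src, os.path.join(dst_dir, name))
            copied.append(name)
        else:
            print(f"Skipping missing file: {src}")
    return copied

# Assumed file names: the five images annotated in LabelMe
subset_names = [f"dominant_follicle_{i:04d}.jpg" for i in range(1, 6)]

# In the lab you would then call:
# copy_subset("output_images", "cellposeTest", subset_names)
```

Copying (rather than moving) keeps `output_images` intact, so the full dataset is still available once you settle on a method.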

In [9]:
# Apply Cellpose to annotate the small subset of your processed images, corresponding to those you manually annotated 
import os
import cv2
import numpy as np
from cellpose import models, io
from tqdm import tqdm

def run_cellpose_v4_final(input_dir, output_dir):
    # 1. Create directory
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    # 2. Setup Device (uses the GPU if one is available, otherwise the CPU)
    device, gpu = models.assign_device(use_torch=True, gpu=True)
    
    # 3. Initialize the CellposeModel
    # pretrained_model='cyto2' requests the cyto2 weights; depending on your
    # Cellpose version, it may fall back to its default model instead
    # (watch for a "using default model" message in the output)
    model = models.CellposeModel(gpu=gpu, pretrained_model='cyto2', device=device)

    # 4. Filter for images
    files = [f for f in os.listdir(input_dir) if f.endswith('.jpg')]
    print(f"Processing {len(files)} images on {device}...")

    for f in tqdm(files):
        img_path = os.path.join(input_dir, f)
        img = io.imread(img_path)

        # 5. The eval call
        # The commented-out arguments are the knobs to experiment with
        masks, flows, styles = model.eval(
            img,
            diameter=100,              # Expected follicle diameter in pixels
            channels=[0,0],            # Single-channel grayscale; deprecated in v4.0.1+ (triggers a warning)
            resample=True,
            #invert=True,              # Try this for ultrasound: makes dark follicles bright, like 'cells'
            #net_avg=True,             # Only accepted by older Cellpose releases
            #cellprob_threshold=-2.0,  # Lower values = higher sensitivity for faint follicles
            #flow_threshold=0.4        # Default; raise to keep more masks, lower to reject ill-shaped ones
        )

        # 6. Convert to binary mask (0 and 255)
        binary_mask = (masks > 0).astype(np.uint8) * 255

        # 7. Save as PNG
        mask_name = f.replace('.jpg', '_mask.png')
        cv2.imwrite(os.path.join(output_dir, mask_name), binary_mask)




In [10]:
run_cellpose_v4_final("cellposeTest", "cellpose_output")

pretrained model C:\Users\zacha\.cellpose\models\cpsam not found, using default model


Processing 5 images on cpu...


  0%|          | 0/5 [00:00<?, ?it/s]channels deprecated in v4.0.1+. If data contain more than 3 channels, only the first 3 channels will be used
 20%|██        | 1/5 [00:15<01:03, 15.85s/it]channels deprecated in v4.0.1+. If data contain more than 3 channels, only the first 3 channels will be used
 40%|████      | 2/5 [00:31<00:47, 15.91s/it]channels deprecated in v4.0.1+. If data contain more than 3 channels, only the first 3 channels will be used
 60%|██████    | 3/5 [00:47<00:32, 16.04s/it]channels deprecated in v4.0.1+. If data contain more than 3 channels, only the first 3 channels will be used
 80%|████████  | 4/5 [01:03<00:15, 15.74s/it]channels deprecated in v4.0.1+. If data contain more than 3 channels, only the first 3 channels will be used
100%|██████████| 5/5 [01:18<00:00, 15.79s/it]


Compare a Cellpose-created mask to the manual mask of the same follicle. 

How did it do? If you did not get a good match, try adjusting the diameter variable and run again. Also try adjusting the cellprob_threshold and the flow_threshold. Keep track of the variables you adjust.

Try (at least) three different parameter settings. Save the image corresponding to one of your manual masks for each parameter setting.
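
A simple way to keep track of what you tried is to record each run in a small CSV file. The sketch below is one possible layout, not a required format: the helper `save_run_log`, the file name, and the parameter values are all placeholders (0.0 and 0.4 are the Cellpose defaults for `cellprob_threshold` and `flow_threshold`), and the `mean_dice` column is left empty until you compute it.

```python
import csv

FIELDS = ["run", "diameter", "cellprob_threshold", "flow_threshold", "mean_dice"]

def save_run_log(csv_path, runs):
    """Write one row per parameter setting so the runs can be compared later."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(runs)

# Placeholder settings: replace with the values you actually ran
runs = [
    {"run": 1, "diameter": 100, "cellprob_threshold": 0.0, "flow_threshold": 0.4, "mean_dice": ""},
    {"run": 2, "diameter": 60, "cellprob_threshold": 0.0, "flow_threshold": 0.4, "mean_dice": ""},
    {"run": 3, "diameter": 60, "cellprob_threshold": -2.0, "flow_threshold": 0.6, "mean_dice": ""},
]

# In the lab you would then call, for example:
# save_run_log("cellpose_runs.csv", runs)
```

Changing one parameter at a time between runs makes it easier to tell which change caused a difference in the masks.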


## Evaluating the effectiveness of the Cellpose model

To see how well Cellpose did, we need to calculate the Dice Coefficient (also known as the F1-Score). This metric is the industry standard for validating medical segmentations. It measures how many pixels the AI (Cellpose) and the Human (LabelMe) agreed upon, relative to the total number of "follicle" pixels.

Dice Coefficient Evaluation Script

This script will loop through your "Gold Standard" (manual) folder and find the corresponding Cellpose masks to calculate the overlap score.

In medical imaging, a Dice score > 0.85 is generally considered excellent. If your Cellpose masks score significantly lower, it’s a sign that you need to adjust parameters in the model.eval function or do more "Human-in-the-Loop" corrections, meaning that you manually adjust poorly annotated images in LabelMe and return them to the folder.

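Dice $= \frac{2|A \cap B|}{|A| + |B|}$, where $A$ is the set of follicle pixels in the manual mask and $B$ is the set of follicle pixels in the Cellpose mask.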
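
As a small worked example, the sketch below computes Dice (twice the number of pixels marked in both masks, divided by the total number of pixels marked in each) for two toy 6x6 masks. The arrays and the helper name `dice` are made up for illustration; each mask marks a 4x4 square (16 pixels) and the squares overlap in a 3x3 region (9 pixels).

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice coefficient of two masks; any pixel > 0 counts as follicle."""
    a = mask_a > 0
    b = mask_b > 0
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # Both masks empty: treated as a perfect match
    return 2.0 * np.logical_and(a, b).sum() / total

# Toy "manual" mask: 4x4 square in rows 1-4, columns 1-4
manual = np.zeros((6, 6), dtype=np.uint8)
manual[1:5, 1:5] = 255

# Toy "automatic" mask: same size square, shifted by one pixel down and right
auto = np.zeros((6, 6), dtype=np.uint8)
auto[2:6, 2:6] = 255

score = dice(manual, auto)
print(f"Dice = {score:.4f}")  # 2 * 9 / (16 + 16) = 0.5625
```

A one-pixel shift of a small mask already drops Dice to about 0.56, which is why small follicles are penalised more heavily than large ones for the same tracing error.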


How to Interpret Your Results


|Dice Score|Interpretation|Action Required|
| --- | --- | --- |
|0.90 - 1.00|Excellent|The AI is ready to be used as ground truth for Part 3.|
|0.75 - 0.89|Good|Standard for ultrasound; minor "Human-in-the-Loop" cleaning needed.|
|Below 0.75|Poor|Your diameter parameter in Cellpose is likely wrong, or your CLAHE contrast is too low, or this method just isn't suitable.|

In [13]:
import cv2
import numpy as np
import os
import glob

def calculate_dice(manual_path, ai_path):
    # Load masks as binary (0 or 255)
    m_mask = cv2.imread(manual_path, cv2.IMREAD_GRAYSCALE)
    a_mask = cv2.imread(ai_path, cv2.IMREAD_GRAYSCALE)

    if m_mask is None or a_mask is None:
        return None

    # Threshold to ensure they are strictly boolean (True/False)
    m_bool = m_mask > 0
    a_bool = a_mask > 0

    # Calculate Intersection and Sum
    intersection = np.logical_and(m_bool, a_bool).sum()
    total_pixels = m_bool.sum() + a_bool.sum()

    if total_pixels == 0:
        return 1.0 # Both are empty, technically a perfect match

    dice = (2. * intersection) / total_pixels
    return dice

def run_evaluation(manual_dir, ai_dir):
    manual_files = glob.glob(os.path.join(manual_dir, "mask_*.png"))
    scores = []

    print(f"{'Image Name':<30} | {'Dice Score':<10}")
    print("-" * 45)

    for m_path in manual_files:
        # Extract the number from mask_0001.png -> 0001
        base_name = os.path.basename(m_path).replace('mask_', '').replace('.png', '')
        # Match to dominant_follicle_0001_mask.png
        a_path = os.path.join(ai_dir, f"dominant_follicle_{base_name}_mask.png")

        if os.path.exists(a_path):
            score = calculate_dice(m_path, a_path)
            scores.append(score)
            print(f"follicle_{base_name:<24} | {score:.4f}")
        else:
            print(f"Warning: AI mask not found for dominant_follicle_{base_name}_mask.png")

    if scores:
        print("-" * 45)
        print(f"Mean Dice Score for Dataset: {np.mean(scores):.4f}")

run_evaluation('manualMasks', 'cellpose_output')

Image Name                     | Dice Score
---------------------------------------------
follicle_0001                     | 0.0007
follicle_0002                     | 0.0000
follicle_0003                     | 0.0000
follicle_0004                     | 0.0010
follicle_0005                     | 0.0113
---------------------------------------------
Mean Dice Score for Dataset: 0.0026


Manual masks:
  mask_0001.png
  mask_0002.png
  mask_0003.png
  mask_0004.png
  mask_0005.png

AI masks:
  dominant_follicle_0001_mask.png
  dominant_follicle_0002_mask.png
  dominant_follicle_0003_mask.png
  dominant_follicle_0004_mask.png
  dominant_follicle_0005_mask.png


# Traditional Computer Vision

Cellpose was built primarily for fluorescence microscopy (sharp, glowing edges). Ultrasound is the opposite: it's "speckle-heavy," noisy, and the boundaries are often fuzzy textures rather than clean lines.

Let's try Traditional Computer Vision (Watershed Segmentation) or Active Contours. 

## Watershed Segmentation

Watershed is the classic "topographic" approach. It treats your image like a map: dark follicles are "valleys" and bright stroma are "peaks." We "flood" the valleys with water (labels) until they hit the peaks (walls). This uses OpenCV and doesn't require any pretrained models. It relies entirely on the contrast you created in Part 1.

Test this on your subset folder.

In [37]:
import cv2
import numpy as np
import os
from tqdm import tqdm

def refined_watershed(input_dir, output_dir):
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    files = [f for f in os.listdir(input_dir) if f.endswith('.jpg')]

    for f in tqdm(files):
        # 1. Load and Blur
        img = cv2.imread(os.path.join(input_dir, f), cv2.IMREAD_GRAYSCALE)
        # A slight Gaussian blur helps merge speckle noise before thresholding
        blurred = cv2.GaussianBlur(img, (5, 5), 0)
        
        # 2. Adaptive Thresholding
        # Instead of one fixed number, each pixel is compared to the weighted mean
        # of its own 17x17 neighborhood (block size 17) minus a constant (C = 7)
        # Follicles are dark, so we use THRESH_BINARY_INV to make them white
        thresh = cv2.adaptiveThreshold(blurred, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, 
                                      cv2.THRESH_BINARY_INV, 17, 7)

        # 3. Morphological Cleaning
        kernel = np.ones((3,3), np.uint8)
        # 'Opening' removes white noise (speckles) from the background
        opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=2)
        # 'Closing' fills small holes inside the follicles
        closing = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, kernel, iterations=2)

        # 4. Distance Transform (The "Seed" Creator)
        # This calculates the distance from every white pixel to the nearest black pixel
        dist_transform = cv2.distanceTransform(closing, cv2.DIST_L2, 5)
        
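        # 5. Refining the Foreground (Sure FG)
        # Keep only pixels whose distance value exceeds 0.4 * the maximum distance
        # Increase the 0.4 factor (e.g., to 0.5) if you get too many small detections
        # Decrease it (e.g., to 0.2) if follicles are being ignored
        ret, sure_fg = cv2.threshold(dist_transform, 0.4 * dist_transform.max(), 255, cv2.THRESH_BINARY)
        sure_fg = np.uint8(sure_fg)

        # 6. Save as Mask
        # Note: this saves the sure-foreground seeds; cv2.watershed itself is not called here
        mask_name = f.replace('.jpg', '_mask.png')
        cv2.imwrite(os.path.join(output_dir, mask_name), sure_fg)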


refined_watershed('cellposeTest', 'watershed_masks')

  0%|          | 0/5 [00:00<?, ?it/s]

100%|██████████| 5/5 [00:00<00:00, 267.97it/s]


## Key Parameter Tuning

If the output isn't perfect, these are the "knobs" you should turn (and document in your discussion):

The Neighborhood Size (17, 7): In adaptiveThreshold, the 17 is the block size (it must be odd) and the 7 is the constant subtracted from the local mean. If your follicles are very large, increase the block size to 21 or 31. If it's catching too much "texture" as follicles, increase the constant.

The Distance Multiplier (0.4): This is the most sensitive part.

If your follicles look like tiny dots: The multiplier is too high. Lower it (e.g., to 0.2 or 0.1).

If multiple follicles are merged into one blob: The multiplier is too low. Raise it (e.g., to 0.5 or 0.6).

Without the distance transform, two follicles touching each other would be seen as one giant "mega-follicle." The distance transform finds the "peak" (the center) of each follicle, allowing the algorithm to realize there are two distinct centers.

Try (and document) three parameter adjustments.

In [None]:
(Find the original code and compare its parameters to the ones you used above.)

Calculate a Dice score for your best Watershed image.

In [38]:
run_evaluation('manualMasks', 'watershed_masks')

Image Name                     | Dice Score
---------------------------------------------
follicle_0001                     | 0.0181
follicle_0002                     | 0.0502
follicle_0003                     | 0.1289
follicle_0004                     | 0.0299
follicle_0005                     | 0.0150
---------------------------------------------
Mean Dice Score for Dataset: 0.0484
