# 🌸 Pollen Viability Deep Learning Pipeline (YOLOv8)

**Project:** Automated counting of viable (stained) vs. non-viable (pale) pollen grains.  
**Author:** Jakub Štenc  
**Model:** YOLOv8 (Medium/Large)  
**Compute Environment:** CESNET MetaCentrum (HPC)

This pipeline performs three main functions:
1.  **Data Management:** Pulls versioned datasets from **CESNET S3 Object Storage** to the high-speed Scratch SSD.
2.  **Training:** Retrains the model on the cluster GPUs using specific biological augmentations.
3.  **Inference:** Detects pollen in new microscope images using dynamic resolution switching.

---
### 🛠️ Step 1: Environment Setup
Run the cells below to configure the headless environment and download the dataset.
* **Installs:** `ultralytics`, `boto3`, `opencv-python-headless` (Server-side CV)
* **Storage:** Downloads data from `s3://pollen-handrkov` to `$SCRATCHDIR`
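
The storage bullet names `$SCRATCHDIR` as the destination, while the download cell writes to a relative `./Pollen_viability/` path. A minimal sketch of deriving the local path from that environment variable follows. The helper name `resolve_local_path` and the fallback to the current directory are assumptions for illustration, not part of the original pipeline.

```python
import os


def resolve_local_path(subdir="Pollen_viability", env=None):
    """Return a download directory under $SCRATCHDIR, or under '.' if it is unset.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    # Hypothetical fallback: use the working directory when no scratch space is assigned
    base = env.get("SCRATCHDIR") or "."
    # The empty last component adds a trailing separator, marking a directory
    return os.path.join(base, subdir, "")


LOCAL_PATH = resolve_local_path()
print(f"Datasets will be stored in: {LOCAL_PATH}")
```

Outside a scheduled job, where `SCRATCHDIR` is typically not set, this resolves to the same relative directory the download cell uses.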

In [1]:
# Install the necessary libraries (uncomment the pip line below on first run).
# The '!' tells Jupyter to run the line as a shell command.
# OpenCV is pinned to 4.8.x and NumPy to 1.24.x so the two stay compatible.
#!pip install "numpy==1.24.4" "opencv-python-headless==4.8.0.74" boto3 ultralytics


import os
import boto3
from getpass import getpass
from ultralytics import YOLO
import numpy

# 1. VERIFY THE ENVIRONMENT
print(f"✅ NumPy version: {numpy.__version__}")
# (Should be 1.24.4)

# 2. CONFIGURATION
# Set BUCKET_NAME to the name of your S3 bucket
BUCKET_NAME = 'bucket'
REMOTE_PATH = 'Ostatni/Pollen_viability/'      
LOCAL_PATH = './Pollen_viability/'          

# 3. AUTHENTICATION
print("Enter CESNET S3 Access Key:")
access_key = getpass()
print("Enter CESNET S3 Secret Key:")
secret_key = getpass()

# 4. CONNECT
s3 = boto3.resource('s3',
                    endpoint_url='https://s3.cl4.du.cesnet.cz',
                    aws_access_key_id=access_key,
                    aws_secret_access_key=secret_key)

bucket = s3.Bucket(BUCKET_NAME)

# 5. DOWNLOAD
print(f"⬇️ Downloading '{REMOTE_PATH}' from S3 to '{LOCAL_PATH}'...")

try:
    # Check if bucket exists
    s3.meta.client.head_bucket(Bucket=BUCKET_NAME)
    
    count = 0
    for obj in bucket.objects.filter(Prefix=REMOTE_PATH):
        # Skip "directory" placeholder keys; they carry no file data
        if obj.key.endswith('/'):
            continue
        target = os.path.join(LOCAL_PATH, os.path.relpath(obj.key, REMOTE_PATH))
        # Create the parent directory if it does not exist yet
        os.makedirs(os.path.dirname(target), exist_ok=True)
        bucket.download_file(obj.key, target)
        count += 1
        if count % 10 == 0:
            print(f"Downloaded {count} files...", end='\r')

    print(f"\n✅ Success! Downloaded {count} files to {LOCAL_PATH}")
    print("🚀 You are ready to train YOLO!")

except Exception as e:
    print(f"\n❌ Error: {e}")
    print(f"Tip: If the error says '404 Not Found', check that bucket '{BUCKET_NAME}' exists and that REMOTE_PATH ('{REMOTE_PATH}') is correct.")
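
The key-to-path mapping used in the download loop can be checked without contacting S3. The sketch below isolates that step in a hypothetical helper, `local_target`, which is not part of the original cell. It uses `posixpath` for the key because S3 keys always use `/` as the separator. The file names in the dry run (`data.yaml`, `img_001.jpg`) are made up for illustration.

```python
import os
import posixpath


def local_target(key, remote_prefix, local_root):
    """Map an S3 object key to a local file path.

    Returns None for "directory" placeholder keys (those ending in '/'),
    which have no file content to download.
    """
    if key.endswith("/"):
        return None
    # Path of the object relative to the prefix, e.g. 'train/images/img_001.jpg'
    rel = posixpath.relpath(key, remote_prefix)
    # Rebuild the path with the local OS separator
    return os.path.join(local_root, *rel.split("/"))


# Dry run with made-up keys: prints where each object would be written
example_keys = [
    "Ostatni/Pollen_viability/",
    "Ostatni/Pollen_viability/data.yaml",
    "Ostatni/Pollen_viability/train/images/img_001.jpg",
]
for key in example_keys:
    print(key, "->", local_target(key, "Ostatni/Pollen_viability/", "./Pollen_viability/"))
```

Keeping the mapping in a pure function makes it possible to confirm the resulting directory layout before starting a long transfer.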

