# PPE Analysis using Ollama

## Introduction
This notebook explores two models, LLaVA-13B (`llava:13b`) and Llama 3.2 Vision (`llama3.2-vision`), for personal protective equipment (PPE) analysis. Follow the steps below.

_Note: If Ollama is not installed on your system, follow the steps given in the [README.md](README.md) file._

### Step 1: Run the Ollama Model
Open a new terminal/command prompt and run one of the following commands, depending on the model you want to use:

```
ollama run llava:13b
```

OR

```
ollama run llava:7b
```

OR

```
ollama run llama3.2-vision
```

Once the model is running, keep the terminal/command prompt open for the remaining steps.

### Step 2: Run this Notebook 
Run the cells in this notebook. Make sure the `model_name` variable is set to the model you started in Step 1.
`model_name` can be `llava:13b`, `llava:7b`, or `llama3.2-vision`.
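
Before running the cells, it can help to confirm that the local Ollama server is reachable. The sketch below is one way to check this using only the standard library. It assumes Ollama is listening on its default address, `http://localhost:11434`; `is_ollama_running` is a hypothetical helper name and is not part of the `ollama` package.

```python
import urllib.error
import urllib.request


def is_ollama_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers at base_url with status 200.

    Assumption: Ollama uses its default port (11434); pass another
    base_url if your setup differs.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status
        return False
```

Calling `is_ollama_running()` in a cell before the analysis gives a quick yes/no answer. It only checks that the server responds, not that a particular model has been downloaded.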

## The notebook is structured as follows:  

- **Data Preparation**: We start by loading our custom PPE dataset, which contains images from the medical, construction, and mining domains. The notebook demonstrates how to preprocess the images and prepare them for input into the vision-language models (VLMs). 
- **Model Loading and Initialization**: Next, we use Ollama to load the LLaVA-13B and Llama 3.2 Vision models. We'll explore the capabilities of these VLMs and discuss their suitability for the PPE analysis task. 
- **VLM-based PPE Detection**: The core of the notebook focuses on using the loaded VLM models to perform PPE detection on the input images. We'll showcase the code that leverages the models' computer vision and natural language processing abilities to identify the presence and types of PPE in the images. 
- **Results Evaluation and Analysis**: After the detection phase, we'll evaluate the performance of the VLM models on our PPE dataset. We'll analyze the accuracy, precision, recall, and other relevant metrics to understand the strengths and limitations of this approach. 
- **Visualizations and Insights**: To better interpret the model outputs, the notebook includes visualizations and techniques for highlighting the key PPE-related elements in the analyzed images. This helps provide deeper insights into the VLMs' decision-making process.  

By working through this notebook, you'll gain a solid understanding of how to use the Ollama library and the LLaVA-13B and Llama 3.2 Vision VLMs for PPE analysis. You'll also learn about the performance characteristics, advantages, and potential areas for improvement in this approach. 
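
As a rough illustration of the evaluation step, the sketch below computes precision and recall for a single image. It rests on two assumptions that the notebook does not confirm: each image has a manually labelled set of PPE items, and the PPE items mentioned in the model's free-text answer have already been extracted into a set. The function name `ppe_precision_recall` and the example items are hypothetical.

```python
def ppe_precision_recall(predicted, expected):
    """Compute precision and recall for one image's PPE item sets."""
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    # Guard against empty sets to avoid division by zero
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical example: items parsed from a model response vs. a manual label
predicted_items = {"hard hat", "safety vest", "gloves"}
expected_items = {"hard hat", "safety vest", "goggles"}
precision, recall = ppe_precision_recall(predicted_items, expected_items)
print(f"precision={precision:.2f}, recall={recall:.2f}")
# prints: precision=0.67, recall=0.67
```

Averaging these per-image values over the dataset would give one possible summary score. Other aggregation choices, such as pooling the counts across images first, are equally valid.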

  

In [1]:
# Imports
import os
from pathlib import Path
import ollama
from PIL import Image
import base64
from io import BytesIO
import time
from IPython.display import HTML, display

In [None]:
def encode_image(image_path):
    """Convert image to base64"""
    with Image.open(image_path) as img:
        # Convert to RGB if needed
        if img.mode != 'RGB':
            img = img.convert('RGB')
        buffered = BytesIO()
        img.save(buffered, format="JPEG", quality=95)
        return base64.b64encode(buffered.getvalue()).decode()

def analyze_image(image_path, model_name):
    """Analyze single image with VLM"""
    try:
        # Encode image
        image_base64 = encode_image(image_path)
        
        # Create prompt following Ollama's format
        messages = [{
            "role": "user",
            "content": "Please describe in particular, the protective equipment being worn, if present.",
            "images": [image_base64]
        }]
        
        # Get VLM response
        response = ollama.chat(
            model=model_name,
            messages=messages
        )
        
        return response['message']['content']
        
    except Exception as e:
        print(f"Error analyzing {image_path}: {e}")
        return f"Error: {str(e)}"

def process_folder(folder_path, model_name, limit=None):
    """Analyze images under folder_path (searched recursively) and return the results.

    If limit is given, only the first `limit` images are processed.
    """
    results = []
    
    # Support multiple image extensions
    image_extensions = {'.jpg', '.jpeg', '.png', '.bmp'}

    # Get all image files, sorted so the processing order is reproducible
    image_files = sorted(
        f for f in Path(folder_path).rglob('*')
        if f.suffix.lower() in image_extensions
    )
    
    # Optionally cap the number of images (useful for quick test runs)
    if limit is not None:
        image_files = image_files[:limit]
    
    print(f"Found {len(image_files)} images to process")
    
    # Process each image
    for i, image_path in enumerate(image_files, 1):
        print(f"Processing image {i}/{len(image_files)}: {image_path.name}")
        
        analysis = analyze_image(image_path, model_name)
        
        results.append({
            'image_path': str(image_path),
            'analysis': analysis
        })
        
        # Small delay to avoid overwhelming the API
        time.sleep(0.5)
    
    return results
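
Processing a large folder can take a long time, so it may be worth persisting the raw results before rendering them. The sketch below shows one way to do that with JSON. `save_results_json` and `load_results_json` are hypothetical helpers, not part of the original notebook, and they assume `results` is the list of `{'image_path', 'analysis'}` dictionaries returned by `process_folder`.

```python
import json
from pathlib import Path


def save_results_json(results, output_path):
    """Write the list of result dictionaries to a JSON file."""
    output_path = Path(output_path)
    # Create the target folder if it does not exist yet
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2, ensure_ascii=False)
    return output_path


def load_results_json(input_path):
    """Read results previously written by save_results_json."""
    with open(input_path, "r", encoding="utf-8") as f:
        return json.load(f)
```

With the results on disk, the HTML report can be regenerated later without querying the model again.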


In [None]:
# Visualizing and Saving the result as .html file.
def generate_html(results):
    """Generate HTML display of results"""
    html = """
    <style>
        .result-container {
            display: flex;
            margin-bottom: 20px;
            border: 1px solid #ddd;
            padding: 10px;
            border-radius: 5px;
        }
        .image-container {
            flex: 0 0 400px;
            margin-right: 20px;
        }
        .image-container img {
            max-width: 100%;
            height: auto;
        }
        .analysis-container {
            flex: 1;
            padding: 10px;
        }
        .analysis-text {
            white-space: pre-wrap;
            font-family: Arial, sans-serif;
        }
    </style>
    """
    
    for result in results:
        image_path = Path(result['image_path'])
        # as_uri() produces a valid file:// URL (also for Windows paths with spaces)
        image_uri = image_path.resolve().as_uri()
        html += f"""
        <div class="result-container">
            <div class="image-container">
                <img src="{image_uri}" alt="Image">
                <p><small>{image_path.name}</small></p>
            </div>
            <div class="analysis-container">
                <div class="analysis-text">{result['analysis']}</div>
            </div>
        </div>
        """
    
    return html

def save_html(html_content, model_name, output_path):
    """Save HTML to file"""
    with open(output_path, 'w', encoding='utf-8') as f:
        f.write(f"""
        <html>
        <head>
            <title>{model_name} Image Analysis Results</title>
        </head>
        <body>
            <h1>{model_name} Image Analysis Results</h1>
        """ + html_content + """
        </body>
        </html>
        """)

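The model's answer is inserted into the HTML as-is, so characters such as `<` or `&` in a response could break the page layout. A minimal sketch of escaping the text first is shown below, using the standard library's `html.escape`. `render_analysis` is a hypothetical helper and is not part of the notebook's code.

```python
from html import escape


def render_analysis(analysis_text):
    """Wrap escaped analysis text in the report's analysis-text div."""
    return f'<div class="analysis-text">{escape(analysis_text)}</div>'


print(render_analysis("Worker wears gloves <no helmet> & a vest"))
# prints: <div class="analysis-text">Worker wears gloves &lt;no helmet&gt; &amp; a vest</div>
```

The same call could be applied to the file name shown under each image.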
In [3]:
# Set the folder path containing images
image_folder = r'C:\Users\shrey\OneDrive - Process Point Technologies\vlm-ppe-analysis-toolkit\data\ppe-custom-data'

# Define Model to be used
model_name = "llava:7b"  # or "llava:13b" / "llama3.2-vision"

# Process images
results = process_folder(image_folder, model_name) # Process all images
# results = process_folder(image_folder, model_name, limit=10) # Process only the first 10 images

# Generate and display HTML in notebook
html_content = generate_html(results)
display(HTML(html_content))

# Save HTML file
save_html(html_content, model_name, r"C:\Users\shrey\OneDrive - Process Point Technologies\vlm-ppe-analysis-toolkit\results\llava7b_analysis_results.html")

Found 185 images to process
Processing image 1/185: 0o18v965b30d1.jpeg
Processing image 2/185: 13_CONSTRUCTION_SITE_WORKERS_FMT_09112021.jpg
Processing image 3/185: 15454402121_60953d8737_b.jpg
Processing image 4/185: 2+people+wearing+respirator+masks+-unsplashsml.jpg
Processing image 5/185: 200519_qilei_song_ppe_covid_027-JPG--tojpeg_1589892792905_x2.jpg
Processing image 6/185: 2021-01-06-PAPR-Kit-AMMACHI-Labs-Amrita-Hospital.jpg
Processing image 7/185: 20210423_euromarc_NEW_WEB_CTA_S_TECMEN_11.jpg
Processing image 8/185: 27015938380_f1cae4b9d4_b.jpg
Processing image 9/185: 29984032326_0d8702fa5a_b.jpg
Processing image 10/185: 3381804-man-wearing-respirator.jpg
Processing image 11/185: 360_F_904532789_CO7JBxJNlkTk3LJULiNXrw7D36nB7EnM.jpg
Processing image 12/185: 420631685.jpg
Processing image 13/185: 47185000342_89b6e44653_b.jpg
Processing image 14/185: 47185000382_bd00c6716c_b.jpg
Processing image 15/185: 4957521446_a672f926a4_b.jpg
Processing image 16/185: 5_CONSTRUCTION_WORKERS_FMT
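
The run above writes to a hard-coded file name, so switching `model_name` without also changing the output path would overwrite or mislabel the report. One possible approach, sketched below, is to derive the file name from the model tag. `results_filename` and the `results` default folder are assumptions made for illustration. Stripping the colon matters because `:` is not allowed in Windows file names.

```python
import re
from pathlib import Path


def results_filename(model_name, results_dir="results"):
    """Build a report path such as results/llava7b_analysis_results.html."""
    # Keep only characters that are safe in file names on common platforms
    safe_name = re.sub(r"[^A-Za-z0-9._-]", "", model_name)
    return Path(results_dir) / f"{safe_name}_analysis_results.html"


print(results_filename("llava:7b").name)
# prints: llava7b_analysis_results.html
```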

In [21]:
# import torch

# if torch.cuda.is_available():
#     print("GPU is active")
# else:
#     print("GPU is not active")