# Multimodal Medical Emergency Detection Agent 

## 1. Introduction to AI Agents

An **AI Agent** is a system that perceives its environment through sensors, processes the data, and takes actions autonomously to achieve specific goals. In this project, the AI agent aims to detect medical emergencies by analyzing multiple types of data:

- **Text**: Patient statements or medical notes.
- **Images**: Facial expressions indicating pain or distress.
- **Audio**: Speech content that may suggest an emergency.
- **Video**: Movements indicating falls or accidents.
- **Physiological Data**: Vital signs like heart rate and blood pressure.

**Key Concepts:**

- **Autonomous Agent**: Operates independently without continuous human guidance.
- **Multimodal Agent**: Processes and integrates multiple types of data.
- **Intelligent Agent**: Makes decisions based on AI algorithms and models.



## 2. Overview of the Code

The code is structured to perform the following tasks:

1. **Initialize AI Models**: Sets up models for text, speech, image, and video analysis.
2. **Data Acquisition**: Simulates or accepts input data from various modalities.
3. **Data Processing**: Processes each data type using appropriate AI models.
4. **Data Fusion**: Combines insights from all modalities to make an informed decision.
5. **Decision-Making**: Determines if a medical emergency is occurring.
6. **User Interface**: Provides a Streamlit-based GUI for user interaction.
7. **Alert Mechanism**: Triggers visual and auditory alerts if an emergency is detected.
8. **Interactive Q&A**: Allows users to ask questions, with answers generated by the LLaMA model.
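
The fusion and decision steps (items 4 and 5) can be pictured with a small rule-based sketch. Everything in it is an assumption made for illustration: the function name `fuse_signals`, the keyword list, the emotion labels treated as distress, and the numeric thresholds are hypothetical and are not clinical guidance. The project's actual fusion logic may differ.

```python
from typing import Dict, Optional, Tuple

# Illustrative values only; not clinically validated.
DISTRESS_KEYWORDS = ("help", "pain", "can't breathe")
DISTRESS_EMOTIONS = ("sad", "angry")


def fuse_signals(
    fall_detected: bool,
    emotion: Optional[str],
    transcript: str,
    vitals: Dict[str, float],
    min_flags: int = 2,
) -> Tuple[bool, Dict[str, bool]]:
    """Hypothetical fusion rule: report an emergency when enough modalities agree."""
    text = transcript.lower()
    flags = {
        "video": bool(fall_detected),
        "image": emotion in DISTRESS_EMOTIONS,
        "audio": any(keyword in text for keyword in DISTRESS_KEYWORDS),
        "vitals": vitals.get("heart_rate", 0) > 120
        or vitals.get("oxygen_saturation", 100) < 90,
    }
    # Each True flag counts as one vote toward an emergency
    is_emergency = sum(flags.values()) >= min_flags
    return is_emergency, flags
```

Returning the per-modality flags alongside the decision keeps the result explainable: the interface can show which inputs triggered the alert.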

## Importing Libraries

Import the libraries the application needs: standard-library modules, the machine learning stack, and Streamlit for the web interface.


In [None]:
# Standard library
import base64          # For encoding binary data to base64 (used to embed media in HTML)
import json            # To build and parse JSON payloads for the Ollama API
import os              # For interacting with the operating system (e.g., file handling)
import random          # For generating random synthetic data and simulating processes
import time            # For time-related functions, if needed in the future
import warnings        # To manage warning messages

# Third-party libraries
import cv2             # OpenCV library for video processing
import librosa         # For audio processing and feature extraction
import numpy as np     # For numerical operations, especially with arrays
import requests        # To make HTTP requests to the Ollama API
import streamlit as st # For creating the web-based GUI
import torch           # PyTorch library for deep learning models
from PIL import Image  # For image processing
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor  # Pre-trained models for speech-to-text

# Suppress all warnings to keep the output clean (this also hides deprecation notices)
warnings.filterwarnings("ignore")


## Custom CSS for UI Enhancement

Define custom CSS styles to enhance the appearance of the Streamlit web interface, including font styles, button colors, and emergency alert animations.


In [None]:
def add_custom_css():
    """
    Adds custom CSS styles to the Streamlit app to enhance the UI appearance.
    """
    st.markdown(
        """
        <style>
        /* Set the default font for the body */
        body {
            font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
        }
        /* Set the background color for the main content area */
        .main {
            background-color: #f5f5f5;
        }
        /* Style for Streamlit buttons */
        .stButton > button {
            background-color: #4CAF50; /* Green background */
            color: white;              /* White text */
        }
        /* Style for the emergency alert banner */
        .emergency-alert {
            background-color: red;     /* Red background to indicate urgency */
            color: white;              /* White text for contrast */
            font-size: 24px;           /* Larger font size */
            text-align: center;        /* Centered text */
            padding: 20px;             /* Padding around the content */
            border-radius: 10px;       /* Rounded corners */
            animation: blink 1s infinite; /* Blinking animation to attract attention */
        }
        /* Keyframes for the blinking animation */
        @keyframes blink {
            0% {opacity: 1;}
            50% {opacity: 0.5;}
            100% {opacity: 1;}
        }
        </style>
        """,
        unsafe_allow_html=True  # Allow raw HTML for custom styling
    )
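
As a hedged usage sketch, the `emergency-alert` class defined above could be applied by rendering a `<div>` through `st.markdown` with `unsafe_allow_html=True`. The helper name `emergency_banner_html` is hypothetical. It only builds the HTML string, and it escapes the message so that user-supplied text cannot inject markup.

```python
import html


def emergency_banner_html(message: str) -> str:
    """Build the HTML for an alert banner styled by the .emergency-alert class."""
    safe_message = html.escape(message)  # Neutralize <, >, & and quotes
    return f'<div class="emergency-alert">{safe_message}</div>'


# In the Streamlit app this string would be rendered with, for example:
#   st.markdown(emergency_banner_html("Emergency detected!"), unsafe_allow_html=True)
```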


### Encoding Audio Files

Utility function to read an audio file and encode it to base64 format. This is useful for embedding audio directly into HTML for playback.


In [None]:
def get_audio_base64(file_path):
    """
    Reads an audio file from the given file path and encodes it to base64.
    This is useful for embedding audio directly into HTML.

    Parameters:
    - file_path (str): The path to the audio file to be encoded.

    Returns:
    - str or None: The base64-encoded string of the audio file, or None if an error occurs.
    """
    try:
        with open(file_path, 'rb') as f:
            data = f.read()  # Read the binary data of the audio file
        data_base64 = base64.b64encode(data).decode('utf-8')  # Encode to base64 and decode to string
        return data_base64
    except Exception as e:
        print(f"Error encoding audio file: {e}")  # Log the error
        return None  # Return None if encoding fails
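
One way the returned string might be used is as a `data:` URI inside an HTML `<audio>` element. The helper name `audio_tag_from_base64`, the default MIME type `audio/mpeg`, and the `autoplay` attribute are assumptions for illustration. Browsers may block autoplay, and the MIME type has to match the actual file format.

```python
import base64


def audio_tag_from_base64(data_base64: str, mime_type: str = "audio/mpeg") -> str:
    """Wrap base64-encoded audio in an HTML <audio> element using a data URI."""
    return (
        "<audio autoplay controls>"
        f'<source src="data:{mime_type};base64,{data_base64}" type="{mime_type}">'
        "</audio>"
    )


# Example with placeholder bytes standing in for real audio data
encoded = base64.b64encode(b"not real audio").decode("utf-8")
tag = audio_tag_from_base64(encoded, mime_type="audio/wav")
```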


### Initializing LLaMA via Ollama

Function that prepares access to the LLaMA model through the Ollama API. It returns a closure that sends a prompt to a locally running Ollama server and returns the generated response.


In [None]:
def initialize_llama_via_ollama():
    """
    Initializes the LLaMA model via the Ollama API.
    This function sets up a closure that can be used to send prompts to the LLaMA model
    and receive generated responses.

    Returns:
    - function: A function that takes a prompt string and returns the model's response.
    """
    base_url = "http://localhost:11434"  # Base URL for the Ollama API, typically running locally

    def llama_model(prompt):
        """
        Sends a prompt to the LLaMA model via Ollama and retrieves the generated response.

        Parameters:
        - prompt (str): The input text prompt to send to the model.

        Returns:
        - str: The generated response from the LLaMA model.
        """
        try:
            headers = {"Content-Type": "application/json"}  # Set the content type for the request
            data = {
                "model": "llama3.2",  # Model name as registered in Ollama (Llama 3.2)
                "prompt": prompt      # The prompt to send to the model
            }
            # Send a POST request to the Ollama API's generate endpoint with streaming enabled
            response = requests.post(
                f"{base_url}/api/generate",
                headers=headers,
                data=json.dumps(data),
                stream=True  # Enable streaming to receive the response incrementally
            )
            if response.status_code == 200:
                output = ""  # Initialize an empty string to accumulate the response
                for line in response.iter_lines():
                    if line:
                        decoded_line = line.decode('utf-8')  # Decode the byte stream to string
                        try:
                            json_data = json.loads(decoded_line)  # Parse the JSON data
                            output += json_data.get('response', '')  # Append the response part
                        except json.JSONDecodeError:
                            continue  # If JSON is invalid, skip to the next line
                return output.strip()  # Return the accumulated response without leading/trailing whitespace
            else:
                # Log errors if the response status is not OK
                print(f"Error communicating with Ollama: {response.status_code}")
                print(f"Response: {response.text}")
                return ""  # Return an empty string on error
        except requests.exceptions.RequestException as e:
            # Handle exceptions related to the HTTP request
            print(f"Error communicating with Ollama: {e}")
            return ""
        except Exception as e:
            # Handle any other unexpected exceptions
            print(f"Exception in llama_model: {e}")
            return ""

    return llama_model  # Return the closure function
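
The streaming loop above relies on Ollama returning one JSON object per line, each carrying a fragment of text in its `response` field, with a `done` flag set on the final object. A minimal sketch of that accumulation step, separated from the HTTP call so it can be exercised without a running server, might look like this. The helper name `collect_streamed_response` and the sample lines are hypothetical.

```python
import json
from typing import Iterable


def collect_streamed_response(lines: Iterable[bytes]) -> str:
    """Concatenate the 'response' fragments from newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if not line:
            continue  # Skip empty keep-alive lines
        try:
            chunk = json.loads(line.decode("utf-8"))
        except json.JSONDecodeError:
            continue  # Ignore malformed lines, as the original loop does
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # The final chunk marks the end of the stream
    return "".join(parts).strip()


# Hand-written sample in the shape of a streamed reply
sample = [
    b'{"response": "Call ", "done": false}',
    b"",
    b"not json",
    b'{"response": "for help.", "done": true}',
]
answer = collect_streamed_response(sample)
```

With `requests`, the same function could be fed `response.iter_lines()` directly, which keeps the network code and the parsing logic independently testable.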


### Generating Synthetic Data

Function to generate synthetic physiological data that mimics real-world data from devices like an Apple Watch. This includes metrics such as heart rate, oxygen saturation, blood pressure, steps, calories burned, and sleep hours.


In [None]:
def generate_synthetic_data():
    """
    Generates synthetic physiological data to mimic data that might be collected from a device like an Apple Watch.
    This data includes heart rate, oxygen saturation, blood pressure, steps, calories burned, and sleep hours.

    Returns:
    - dict: A dictionary containing the synthetic physiological data.
    """
    data = {
        'heart_rate': random.randint(50, 150),                # Heart rate in beats per minute (bpm)
        'oxygen_saturation': random.uniform(85, 100),         # Oxygen saturation in percentage (%)
        'blood_pressure_systolic': random.randint(90, 160),    # Systolic blood pressure in mmHg
        'blood_pressure_diastolic': random.randint(60, 100),   # Diastolic blood pressure in mmHg
        'steps': random.randint(0, 10000),                     # Number of steps taken
        'calories_burned': random.uniform(0, 500),             # Calories burned in kilocalories (kcal)
        'sleep_hours': random.uniform(0, 12),                  # Sleep duration in hours
    }
    return data  # Return the generated data
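
Because the generator draws from the global `random` state, each call yields different values, which makes downstream logic hard to test. A possible variant, sketched here under the assumption that reproducibility is wanted, uses a dedicated `random.Random` instance so that a fixed seed yields the same readings every time. The name `generate_seeded_data` and the reduced field set are illustrative.

```python
import random
from typing import Dict, Optional


def generate_seeded_data(seed: Optional[int] = None) -> Dict[str, float]:
    """Generate a subset of the synthetic vitals from a private, seedable RNG."""
    rng = random.Random(seed)  # Independent of the global random state
    return {
        "heart_rate": rng.randint(50, 150),                    # bpm
        "oxygen_saturation": round(rng.uniform(85, 100), 1),   # %
        "blood_pressure_systolic": rng.randint(90, 160),       # mmHg
        "blood_pressure_diastolic": rng.randint(60, 100),      # mmHg
    }


# The same seed produces identical readings on every call
first = generate_seeded_data(seed=42)
second = generate_seeded_data(seed=42)
```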


### Processing Video

Function to process a video file and detect features relevant to medical emergencies, such as falls. Currently, this function simulates video processing by randomly determining if a fall is detected.


In [None]:
def process_video(video_path):
    """
    Processes a video file to extract features relevant to medical emergency detection, such as fall detection.
    Currently, this function simulates video processing by randomly determining if a fall is detected.

    Parameters:
    - video_path (str): The file path to the video to be processed.

    Returns:
    - bool: True if a fall is detected, False otherwise.
    """
    try:
        # Check if the video file exists
        if not os.path.exists(video_path):
            print(f"Error: Video file {video_path} does not exist.")
            return False  # Return False if the video file is missing

        # For simplicity, simulate video processing with random fall detection
        fall_detected = random.choice([True, False])
        return fall_detected  # Return the simulated result
    except Exception as e:
        # Handle any exceptions during video processing
        print(f"Error in video processing: {e}")
        return False  # Default to False on error


### Processing Image

Function to analyze an image and detect facial expressions, simulating emotion detection.


In [None]:
def process_image(image_path):
    """
    Analyzes an image to detect facial expressions, simulating emotion detection.

    Parameters:
    - image_path (str): The file path to the image to be processed.

    Returns:
    - str or None: The detected dominant emotion or None if an error occurs.
    """
    try:
        # Check if the image file exists
        if not os.path.exists(image_path):
            print(f"Error: Image file {image_path} does not exist.")
            return None  # Return None if the image file is missing

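        # Open the image and verify its integrity to ensure it is readable
        # (Image.open alone is lazy and does not check the pixel data)
        with Image.open(image_path) as img:
            img.verify()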

        # Simulate emotion detection by randomly selecting an emotion
        emotions = ['happy', 'sad', 'angry', 'surprised', 'neutral']
        dominant_emotion = random.choice(emotions)
        return dominant_emotion  # Return the simulated emotion
    except Exception as e:
        # Handle any exceptions during image processing
        print(f"Error in image processing: {e}")
        return None  # Return None on error



### Processing Speech

Function to convert speech in an audio file to text using a pre-trained Wav2Vec2 model.
