# AI Medical Assistant - Example Notebook

This notebook demonstrates the basic setup and usage of the AI Medical Assistant, which is built on retrieval-augmented generation (RAG).

## Setup Instructions

If you're running this in Google Colab, run the installation cell below. For local development, install the dependencies from `requirements.txt` first.

## Installation (Google Colab)

In [None]:
# Safe to run anywhere: installation only happens inside Google Colab
# Check if running in Colab
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
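    # Clone and enter the repository unless we are already inside it
    import os
    if os.path.basename(os.getcwd()) != "AI-Medical-Assistant-Using-RAG":
        if not os.path.isdir("AI-Medical-Assistant-Using-RAG"):
            !git clone https://github.com/BasithMrasak/AI-Medical-Assistant-Using-RAG.git
        %cd AI-Medical-Assistant-Using-RAG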
    
    # Install requirements
    !pip install -q -r requirements.txt
    
    print("✓ Setup complete!")
else:
    print("Running locally - make sure you've installed requirements.txt")

## Import Required Libraries

In [None]:
import torch
import numpy as np
import pandas as pd
from transformers import AutoTokenizer, AutoModel
from sentence_transformers import SentenceTransformer
import faiss

print("Libraries imported successfully!")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

## Configuration

In [None]:
# Configuration parameters
config = {
    'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2',  # Fast and efficient
    'embedding_dim': 384,
    'top_k': 5,  # Number of documents to retrieve
}

print("Configuration:")
for key, value in config.items():
    print(f"  {key}: {value}")

## Sample Medical Documents

In a production system, these would be loaded from a medical database or document collection.

In [None]:
# Sample medical knowledge base
sample_documents = [
    "Diabetes mellitus is a metabolic disorder characterized by high blood sugar levels. Type 1 diabetes is caused by autoimmune destruction of pancreatic beta cells, while Type 2 diabetes is primarily due to insulin resistance.",
    "Hypertension, or high blood pressure, is a condition where the force of blood against artery walls is consistently too high. Normal blood pressure is below 120/80 mmHg. Hypertension is diagnosed when readings are consistently 130/80 mmHg or higher.",
    "Asthma is a chronic respiratory condition characterized by inflammation and narrowing of the airways. Common symptoms include wheezing, shortness of breath, chest tightness, and coughing, especially at night or early morning.",
    "Coronary artery disease (CAD) occurs when the coronary arteries become narrowed or blocked due to plaque buildup. This can lead to chest pain (angina), heart attacks, and heart failure if left untreated.",
    "Alzheimer's disease is a progressive neurodegenerative disorder that causes memory loss, cognitive decline, and behavioral changes. It is the most common cause of dementia in older adults.",
]

print(f"Loaded {len(sample_documents)} sample medical documents")

## Initialize Embedding Model

In [None]:
# Load the sentence transformer model for creating embeddings
print("Loading embedding model...")
embedding_model = SentenceTransformer(config['embedding_model'])
print("✓ Embedding model loaded successfully!")

## Create Document Embeddings

In [None]:
# Generate embeddings for all documents
print("Generating document embeddings...")
document_embeddings = embedding_model.encode(sample_documents)
print(f"✓ Created embeddings with shape: {document_embeddings.shape}")

## Build FAISS Index

In [None]:
# Create FAISS index for fast similarity search
print("Building FAISS index...")
dimension = document_embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)  # Exact search; reports squared L2 distances
index.add(document_embeddings.astype('float32'))
print(f"✓ FAISS index built with {index.ntotal} vectors")
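
For intuition, `IndexFlatL2` performs an exhaustive search: it compares the query against every stored vector and ranks the results by squared Euclidean distance. The pure-Python sketch below illustrates that idea on toy data. The helper names `squared_l2` and `brute_force_search` are made up for this example and are not part of FAISS, which does the same computation far more efficiently.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(vectors, query, top_k):
    """Return (distances, indices) of the top_k vectors nearest to query."""
    scored = sorted(
        (squared_l2(vec, query), i) for i, vec in enumerate(vectors)
    )[:top_k]
    return [d for d, _ in scored], [i for _, i in scored]


# Toy 2-D "embeddings" standing in for real 384-dimensional ones
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
toy_distances, toy_indices = brute_force_search(toy_vectors, [0.9, 0.1], top_k=2)
print(toy_indices)  # nearest vector first
```

Because every vector is scored, the cost grows linearly with the number of documents. That is fine for a handful of samples, but a larger collection may call for an approximate index.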

## Query Function

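This function implements the retrieval step of RAG: it embeds the query with the same model used for the documents, then returns the nearest documents from the FAISS index.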

In [None]:
def retrieve_relevant_documents(query, top_k=3):
    """
    Retrieve the most relevant documents for a given query.
    
    Args:
        query: The user's question or query
        top_k: Number of documents to retrieve
        
    Returns:
        List of dicts with keys 'document', 'distance', and 'similarity',
        ordered from nearest to farthest
    """
    # Encode the query
    query_embedding = embedding_model.encode([query])
    
    # Search the index
    distances, indices = index.search(query_embedding.astype('float32'), top_k)
    
    # Return the documents with their distances
    results = []
    for idx, dist in zip(indices[0], distances[0]):
        if idx < 0:
            # FAISS pads results with -1 when fewer than top_k vectors exist
            continue
        results.append({
            'document': sample_documents[idx],
            'distance': float(dist),
            'similarity': float(1 / (1 + dist))  # Map distance to a (0, 1] score
        })
    
    return results

print("✓ Query function defined")
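
The `1 / (1 + distance)` conversion used in the function maps any non-negative distance to a score in (0, 1], but it is a monotonic rescaling, not a cosine similarity. If the embeddings are unit-normalized (an assumption worth verifying for the chosen model), squared L2 distance and cosine similarity are related by d² = 2 − 2·cos θ, so the cosine can be recovered directly. The sketch below compares the two conversions. The helper names are hypothetical, and the distance is assumed to be squared L2, as `IndexFlatL2` reports.

```python
def inverse_distance_score(squared_distance):
    """The heuristic used in the query function: bounded in (0, 1]."""
    return 1.0 / (1.0 + squared_distance)


def cosine_from_squared_l2(squared_distance):
    """Cosine similarity implied by a squared L2 distance between unit vectors."""
    return 1.0 - squared_distance / 2.0


# Two orthogonal unit vectors: their cosine similarity is 0
u = (1.0, 0.0)
v = (0.0, 1.0)
d2 = sum((a - b) ** 2 for a, b in zip(u, v))

print(f"squared L2 distance:    {d2:.4f}")
print(f"inverse-distance score: {inverse_distance_score(d2):.4f}")
print(f"cosine similarity:      {cosine_from_squared_l2(d2):.4f}")
```

Both scores rank documents in the same order, so the choice only matters when the number itself is interpreted, for example when applying a relevance threshold.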

## Example Queries

In [None]:
# Test with different medical queries
queries = [
    "What causes high blood sugar?",
    "Tell me about heart disease",
    "What are the symptoms of asthma?",
]

for query in queries:
    print(f"\n{'='*80}")
    print(f"Query: {query}")
    print(f"{'='*80}")
    
    results = retrieve_relevant_documents(query, top_k=2)
    
    for i, result in enumerate(results, 1):
        print(f"\nResult {i}:")
        print(f"Similarity Score: {result['similarity']:.4f}")
        print(f"Document: {result['document'][:200]}...")

## Next Steps

This notebook demonstrates only the basic retrieval component. A complete RAG system would include:

1. **Document Processing**: Load and process medical documents
2. **Retrieval**: Find relevant documents (implemented above)
3. **Generation**: Use an LLM to generate responses based on retrieved documents
4. **Post-processing**: Format and validate the response
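
As a rough illustration of how the generation step could consume the retriever's output, the sketch below assembles retrieved passages into a prompt. The function name `build_prompt`, the `max_chars_per_doc` parameter, and the prompt wording are assumptions made for this example, not part of the repository. The resulting string would be passed to whichever LLM the project adopts.

```python
def build_prompt(question, retrieved, max_chars_per_doc=500):
    """Format retrieved passages and a question into a single prompt string.

    `retrieved` is a list of dicts with a 'document' key, matching the
    structure returned by retrieve_relevant_documents in this notebook.
    """
    context_lines = [
        f"[{i}] {item['document'][:max_chars_per_doc]}"
        for i, item in enumerate(retrieved, 1)
    ]
    context = "\n".join(context_lines)
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Stand-in for real retrieval results
example_results = [
    {"document": "Asthma is a chronic respiratory condition.", "distance": 0.4},
]
example_prompt = build_prompt("What is asthma?", example_results)
print(example_prompt)
```

Numbering the passages makes it possible to ask the model to cite which passage supports each statement, which can help with the post-processing and validation step.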

To contribute to this project:
- See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines
- Implement additional features
- Improve the document processing pipeline
- Add evaluation metrics
- Create a user interface
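
For the evaluation-metrics item, one common starting point for retrieval is recall@k: the fraction of known-relevant documents that appear among the top k results. The sketch below uses hypothetical document indices and relevance labels. A real evaluation would need a labeled query set, which this notebook does not provide.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant documents found among the first k retrieved."""
    if not relevant_ids:
        return 0.0
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


# Hypothetical example: indices into sample_documents
ranked = [3, 1, 0]    # retriever output, best first
relevant = {3, 0}     # hand-labeled as relevant to the query
print(recall_at_k(ranked, relevant, k=2))
```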

## Save Your Work

If you're in Google Colab and want to push your changes:

```python
# Configure git
!git config --global user.email "your-email@example.com"
!git config --global user.name "Your Name"

# Create a new branch
!git checkout -b feature/your-feature-name

# Add and commit your changes
!git add .
!git commit -m "Your commit message"

# Push to GitHub (you'll need authentication)
!git push origin feature/your-feature-name
```

See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed instructions on authentication and pushing code.