<a href="https://colab.research.google.com/github/RithuLoki/Crisis-Response-Assistant-NLP---RAG/blob/main/Crisis%20Response%20Assistant%20-%20NLP%20%26%20RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Crisis Response Assistant using NLP & RAG (Non-Clinical)

This notebook presents the development of a **Crisis Response Assistant** built using
Natural Language Processing (NLP) and **Retrieval-Augmented Generation (RAG)** techniques.

The goal of this project is to provide **general, educational, and supportive information**
to users who express stress or concerns through text input.

This system is **non-clinical, non-diagnostic, and non-therapeutic**.  
It does **not** replace professional medical, psychological, or emergency services.


In [1]:
import os
import sys

print("Initial setup complete: os and sys modules imported.")

Initial setup complete: os and sys modules imported.


## Phase 1: Problem Definition & Ethical Considerations

In many situations, people experiencing stress or uncertainty may benefit from
immediate access to **general information and trusted resources**.

This project explores how NLP-based systems can assist by:
- Understanding user concerns at a high level
- Retrieving relevant educational content
- Responding in a **safe, responsible, and ethical manner**

### Important Scope Disclaimer
- No diagnosis
- No therapy
- No medical advice
- Educational guidance only


## Phase 2: User Input & Categorization

The assistant accepts short text input describing a user's concern.
To keep the system **safe and non-diagnostic**, inputs are categorized into
**broad, predefined themes** such as academic stress or financial concern.

A **rule-based keyword approach** is used instead of predictive models to:
- Maintain transparency
- Avoid misclassification risks
- Keep full human interpretability


In [2]:
import re

def get_user_input():
    """Simulates accepting a short text input from the user."""
    print("\n--- User Input Simulation ---")
    user_text = input("Please describe your concern briefly: ")
    return user_text

# Define predefined, non-diagnostic categories
PREDEFINED_CATEGORIES = {
    'academic stress': ['study', 'studying', 'exam', 'exams', 'test', 'tests', 'grades', 'school', 'college', 'university', 'assignment', 'deadline'],
    'financial concern': ['money', 'debt', 'bill', 'rent', 'job', 'finance', 'cost', 'loan', 'income', 'budget'],
    'relationship issues': ['friend', 'family', 'partner', 'relationship', 'conflict', 'breakup', 'lonely', 'social'],
    'general anxiety': ['anxious', 'stress', 'overwhelmed', 'worry', 'nervous', 'panic', 'fear', 'pressure'],
    'work stress': ['work', 'job', 'colleague', 'boss', 'project', 'career', 'burnout', 'overtime'],
    'health concerns': ['sick', 'pain', 'doctor', 'hospital', 'health', 'illness', 'medical', 'symptom'],
    'future uncertainty': ['future', 'uncertain', 'direction', 'career path', 'life goals'],
    'other': [] # Fallback category
}

def categorize_input(text):
    text_lower = text.lower()
    scores = {}

    for category, keywords in PREDEFINED_CATEGORIES.items():
        scores[category] = sum(
            1 for keyword in keywords
            if re.search(r'\b' + re.escape(keyword) + r'\b', text_lower)
        )

def validate_category(category):
    return category if category in PREDEFINED_CATEGORIES else 'other'

    best_category = max(scores, key=scores.get)
    return best_category if scores[best_category] > 0 else 'other'

# Demonstrate categorization with example inputs
print("\n--- Categorization Examples ---")
example_inputs = [
    "I'm really worried about my upcoming exams and assignments.",
    "I'm struggling to pay my rent this month, finances are tight.",
    "Had a big argument with my friend, feeling really down about it.",
    "Just generally feeling very anxious and overwhelmed lately.",
    "My boss gave me a huge project with a tight deadline, a lot of work stress.",
    "Been feeling a bit under the weather, worried about my health.",
    "I don't know what to do with my life, feeling uncertain about the future.",
    "The weather is nice today."
]

for i, user_concern in enumerate(example_inputs):
    category = categorize_input(user_concern);
    print(f"Example {i+1}:\n  Input: '{user_concern}'\n  Category: '{category}'\n")

print("Input and categorization mechanism defined and demonstrated.")


--- Categorization Examples ---
Example 1:
  Input: 'I'm really worried about my upcoming exams and assignments.'
  Category: 'None'

Example 2:
  Input: 'I'm struggling to pay my rent this month, finances are tight.'
  Category: 'None'

Example 3:
  Input: 'Had a big argument with my friend, feeling really down about it.'
  Category: 'None'

Example 4:
  Input: 'Just generally feeling very anxious and overwhelmed lately.'
  Category: 'None'

Example 5:
  Input: 'My boss gave me a huge project with a tight deadline, a lot of work stress.'
  Category: 'None'

Example 6:
  Input: 'Been feeling a bit under the weather, worried about my health.'
  Category: 'None'

Example 7:
  Input: 'I don't know what to do with my life, feeling uncertain about the future.'
  Category: 'None'

Example 8:
  Input: 'The weather is nice today.'
  Category: 'None'

Input and categorization mechanism defined and demonstrated.


## Phase 3: Text Processing & Semantic Embeddings

To enable meaningful retrieval, text data must be cleaned and converted into
numerical representations.

This phase includes:
- Text cleaning (lowercasing, punctuation removal)
- Semantic embedding generation using a pre-trained transformer model

Embeddings allow the system to retrieve content based on **meaning**, not just keywords,
which is a key component of Retrieval-Augmented Generation (RAG).


In [3]:
import string
from sentence_transformers import SentenceTransformer
import numpy as np

def clean_text(text):
    """Performs basic text cleaning: lowercasing and punctuation removal."""
    text = text.lower()
    text = text.translate(str.maketrans('', '', string.punctuation))
    text = ' '.join(text.split()) # Remove extra whitespaces
    return text

# Initialize a pre-trained Sentence Transformer model
# Using a smaller model for faster execution and demonstration
print("Loading Sentence Transformer model 'all-MiniLM-L6-v2'...")
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')
print("Sentence Transformer model loaded.")

def get_embeddings(texts):
    """Generates embeddings for a given text or list of texts after cleaning."""
    if isinstance(texts, str):
        texts = [texts]

    cleaned_texts = [clean_text(text) for text in texts]
    embeddings = embedding_model.encode(cleaned_texts, show_progress_bar=False)
    return embeddings

# Demonstrate the functions
print("\n--- Text Cleaning and Embedding Demonstration ---")
example_sentences = [
    "This is an example sentence for cleaning!",
    "Worried about my grades; need to study more.",
    "Financial concerns are stressing me out..."
]

for i, sentence in enumerate(example_sentences):
    cleaned_text = clean_text(sentence)
    embeddings = get_embeddings(sentence)
    print(f"Example {i+1}:")
    print(f"  Original: '{sentence}'")
    print(f"  Cleaned:  '{cleaned_text}'")
    print(f"  Embedding shape: {embeddings.shape}")
    print(f"  Sample embedding (first 5 values): {embeddings[0,:5]}\n")

print("Text cleaning and embedding generation functions defined and demonstrated.")



Loading Sentence Transformer model 'all-MiniLM-L6-v2'...


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Sentence Transformer model loaded.

--- Text Cleaning and Embedding Demonstration ---
Example 1:
  Original: 'This is an example sentence for cleaning!'
  Cleaned:  'this is an example sentence for cleaning'
  Embedding shape: (1, 384)
  Sample embedding (first 5 values): [0.01005777 0.08359879 0.11868805 0.0573467  0.03426904]

Example 2:
  Original: 'Worried about my grades; need to study more.'
  Cleaned:  'worried about my grades need to study more'
  Embedding shape: (1, 384)
  Sample embedding (first 5 values): [ 0.06077906  0.01636016 -0.05213698  0.01559409  0.02600767]

Example 3:
  Original: 'Financial concerns are stressing me out...'
  Cleaned:  'financial concerns are stressing me out'
  Embedding shape: (1, 384)
  Sample embedding (first 5 values): [ 0.06423068  0.04167271 -0.05163161  0.10924383  0.06885557]

Text cleaning and embedding generation functions defined and demonstrated.


## Phase 4: Knowledge Base Creation

This phase defines a structured **Knowledge Base (KB)** containing
educational and supportive information for broad, non-diagnostic concern categories.

Each entry includes:
- A unique **ID**
- A **category**
- A short **title**
- **Non-clinical educational content**

This knowledge base is **informational only** and does not provide
medical, psychological, or therapeutic advice.

The KB serves as the **retrieval source** for the RAG pipeline implemented in the next phase.


In [4]:
KNOWLEDGE_BASE = [
    # Academic Stress
    {
        'id': 'KB_AS_01',
        'category': 'academic stress',
        'title': 'Managing Study Pressure',
        'content': (
            'Break down large academic tasks into smaller, manageable steps. '
            'Create a realistic study schedule and allocate time for regular breaks '
            'to avoid burnout. Remember that your well-being is just as important '
            'as your academic performance.'
        )
    },
    {
        'id': 'KB_AS_02',
        'category': 'academic stress',
        'title': 'Effective Learning Strategies',
        'content': (
            'Experiment with different learning methods such as active recall or '
            'spaced repetition. Study groups can provide peer support and diverse '
            'perspectives. Seeking clarification from instructors or tutors is '
            'a positive and proactive step.'
        )
    },
    {
        'id': 'KB_AS_03',
        'category': 'academic stress',
        'title': 'Dealing with Exam Anxiety',
        'content': (
            'Practice relaxation techniques like deep breathing before and during exams. '
            'Ensure you are well-rested and nourished. Focus on what you know and trust '
            'your preparation instead of worrying about outcomes.'
        )
    },

    # Financial Concern
    {
        'id': 'KB_FC_01',
        'category': 'financial concern',
        'title': 'Budgeting Basics',
        'content': (
            'Creating a monthly budget helps track income and expenses clearly. '
            'Identify areas where you can save while prioritizing essential needs. '
            'Many free online tools can support effective financial planning.'
        )
    },
    {
        'id': 'KB_FC_02',
        'category': 'financial concern',
        'title': 'Seeking Financial Guidance',
        'content': (
            'Non-profit financial counseling services offer educational and unbiased '
            'guidance on managing debt, saving, and planning ahead. You are not alone '
            'in facing financial challenges.'
        )
    },
    {
        'id': 'KB_FC_03',
        'category': 'financial concern',
        'title': 'Understanding Student Loans and Aid Options',
        'content': (
            'Research scholarships, grants, and student loan options carefully. '
            'Understand the terms of financial aid programs. Institutional financial '
            'aid offices can often provide additional support.'
        )
    },

    # Relationship Issues
    {
        'id': 'KB_RI_01',
        'category': 'relationship issues',
        'title': 'Communicating Effectively',
        'content': (
            'Open and honest communication builds stronger relationships. '
            'Practice active listening and express your thoughts calmly. '
            'Trying to understand the other person’s perspective can reduce conflict.'
        )
    },
    {
        'id': 'KB_RI_02',
        'category': 'relationship issues',
        'title': 'Building Healthy Boundaries',
        'content': (
            'Healthy boundaries help maintain mutual respect. Communicate your needs '
            'clearly and respect the boundaries of others. Boundaries support emotional '
            'well-being in relationships.'
        )
    },

    # General Anxiety
    {
        'id': 'KB_GA_01',
        'category': 'general anxiety',
        'title': 'Mindfulness and Relaxation',
        'content': (
            'Mindfulness practices such as meditation or focused breathing can help '
            'calm the mind. Even a few minutes of daily practice can make a difference.'
        )
    },
    {
        'id': 'KB_GA_02',
        'category': 'general anxiety',
        'title': 'Coping with Overwhelm',
        'content': (
            'When feeling overwhelmed, identify specific stressors and break challenges '
            'into smaller, manageable steps. Engaging in activities you enjoy can offer '
            'temporary mental relief.'
        )
    },

    # Work Stress
    {
        'id': 'KB_WS_01',
        'category': 'work stress',
        'title': 'Time Management Techniques',
        'content': (
            'Prioritize tasks using techniques such as the Pomodoro method. '
            'Effective time management reduces feelings of overload and improves focus.'
        )
    },
    {
        'id': 'KB_WS_02',
        'category': 'work stress',
        'title': 'Work-Life Balance',
        'content': (
            'Maintaining boundaries between work and personal life is essential. '
            'Dedicate time to rest, hobbies, and personal relationships to prevent burnout.'
        )
    },

    # Health Concerns
    {
        'id': 'KB_HC_01',
        'category': 'health concerns',
        'title': 'Promoting General Wellness',
        'content': (
            'Balanced nutrition, regular physical activity, and adequate sleep '
            'are fundamental to overall well-being and resilience.'
        )
    },
    {
        'id': 'KB_HC_02',
        'category': 'health concerns',
        'title': 'Understanding Self-Care',
        'content': (
            'Self-care involves proactive habits that support physical and mental health, '
            'including rest, relaxation, and engaging in enjoyable activities.'
        )
    },

    # Future Uncertainty
    {
        'id': 'KB_FU_01',
        'category': 'future uncertainty',
        'title': 'Embracing Change',
        'content': (
            'Uncertainty is a natural part of life. Focus on what you can control '
            'in the present and adapt as circumstances change. Building resilience '
            'helps navigate uncertainty.'
        )
    },
    {
        'id': 'KB_FU_02',
        'category': 'future uncertainty',
        'title': 'Goal Setting and Planning',
        'content': (
            'Setting realistic goals can provide clarity and direction. Plans may evolve, '
            'and flexibility is a strength. Discussing aspirations with trusted people '
            'can offer perspective.'
        )
    },

    # Other (Fallback)
    {
        'id': 'KB_OT_01',
        'category': 'other',
        'title': 'General Information and Resources',
        'content': (
            'Community groups and online resources provide helpful educational information. '
            'Exploring different perspectives can sometimes offer new insights.'
        )
    },
    {
        'id': 'KB_OT_02',
        'category': 'other',
        'title': 'Seeking Further Support',
        'content': (
            'If concerns feel persistent, reaching out to trusted mentors or advisors '
            'can be a helpful step. Support networks play an important role in well-being.'
        )
    }
]

print(f"Knowledge Base successfully created with {len(KNOWLEDGE_BASE)} entries.")
print("Sample entry:", KNOWLEDGE_BASE[0])


Knowledge Base successfully created with 18 entries.
Sample entry: {'id': 'KB_AS_01', 'category': 'academic stress', 'title': 'Managing Study Pressure', 'content': 'Break down large academic tasks into smaller, manageable steps. Create a realistic study schedule and allocate time for regular breaks to avoid burnout. Remember that your well-being is just as important as your academic performance.'}


## Phase 5: Retrieval-Augmented Generation (RAG)

This phase implements the retrieval component of the RAG pipeline.

Steps involved:
1. Convert knowledge base content into semantic embeddings
2. Store embeddings alongside original content
3. Retrieve the most relevant information using cosine similarity

This enables meaning-based retrieval instead of keyword matching.


In [5]:
print("Generating embeddings for the Knowledge Base...")
kb_contents = [entry['content'] for entry in KNOWLEDGE_BASE]
kb_embeddings = get_embeddings(kb_contents)

print(f"Generated {len(kb_embeddings)} embeddings for the Knowledge Base.")
print(f"Shape of knowledge base embeddings: {kb_embeddings.shape}")

# Combine KB entries with their embeddings for easier retrieval later
# This creates a new structure where each original KB entry also has its embedding.
# Alternatively, we could keep kb_embeddings as a separate list and use indices.
# For simplicity, let's create a list of dictionaries with content, category, title, and embedding.

embedded_knowledge_base = []
for i, entry in enumerate(KNOWLEDGE_BASE):
    embedded_entry = entry.copy()
    embedded_entry['embedding'] = kb_embeddings[i]
    embedded_knowledge_base.append(embedded_entry)

print("Knowledge Base entries combined with their embeddings.")
print("Example embedded knowledge base entry (first 5 embedding values):")
print(embedded_knowledge_base[0]['embedding'][:5])

Generating embeddings for the Knowledge Base...
Generated 18 embeddings for the Knowledge Base.
Shape of knowledge base embeddings: (18, 384)
Knowledge Base entries combined with their embeddings.
Example embedded knowledge base entry (first 5 embedding values):
[0.01812179 0.12024394 0.04709259 0.06457696 0.05823918]


## Demonstration: Semantic Information Retrieval

The following example demonstrates how the system retrieves
the most relevant educational content for a sample user query
based on semantic similarity.


In [6]:
from sklearn.metrics.pairwise import cosine_similarity

def retrieve_info(user_query_embedding, top_k=3):
    """Retrieves the top_k most semantically similar knowledge base entries based on a user query embedding."""
    similarities = []
    for entry in embedded_knowledge_base:
        # Reshape for cosine_similarity: (1, n_features) for each vector
        sim = cosine_similarity(user_query_embedding.reshape(1, -1), entry['embedding'].reshape(1, -1))[0][0]
        similarities.append({'entry': entry, 'similarity': sim})

    # Sort by similarity in descending order
    sorted_similarities = sorted(similarities, key=lambda x: x['similarity'], reverse=True)

    # Extract top_k entries
    top_entries = []
    for item in sorted_similarities[:top_k]:
        top_entries.append({
            'category': item['entry']['category'],
            'title': item['entry']['title'],
            'content': item['entry']['content'],
            'similarity_score': item['similarity']
        })
    return top_entries

# Demonstrate the retrieve_info() function
print("\n--- Retrieval Demonstration ---")
sample_user_query = "I'm worried about my upcoming exams and grades."
print(f"Sample User Query: '{sample_user_query}'")

# Generate embedding for the sample query
user_query_embedding = get_embeddings(sample_user_query)[0]

# Retrieve top_k relevant information
retrieved_results = retrieve_info(user_query_embedding, top_k=3)

print("\nTop 3 Retrieved Information:")
for i, result in enumerate(retrieved_results):
    print(f"\nResult {i+1} (Category: {result['category']}, Similarity: {result['similarity_score']:.4f}):")
    print(f"  Title: {result['title']}")
    print(f"  Content: {result['content']}")

print("Retrieval component implemented and demonstrated.")



--- Retrieval Demonstration ---
Sample User Query: 'I'm worried about my upcoming exams and grades.'

Top 3 Retrieved Information:

Result 1 (Category: academic stress, Similarity: 0.4932):
  Title: Dealing with Exam Anxiety
  Content: Practice relaxation techniques like deep breathing before and during exams. Ensure you are well-rested and nourished. Focus on what you know and trust your preparation instead of worrying about outcomes.

Result 2 (Category: academic stress, Similarity: 0.3937):
  Title: Managing Study Pressure
  Content: Break down large academic tasks into smaller, manageable steps. Create a realistic study schedule and allocate time for regular breaks to avoid burnout. Remember that your well-being is just as important as your academic performance.

Result 3 (Category: general anxiety, Similarity: 0.2201):
  Title: Coping with Overwhelm
  Content: When feeling overwhelmed, identify specific stressors and break challenges into smaller, manageable steps. Engaging in ac

## Phase 6: Response Generation

This phase generates a supportive and educational response using
information retrieved from the knowledge base.

Key principles followed:
- Responses are **grounded only in retrieved content**
- Tone is **supportive and non-clinical**
- No diagnosis, therapy, or professional advice is provided
- Clear disclaimers guide users toward professional help when needed


In [7]:
def generate_response(user_input, category, retrieved_info):
    """Generates a comprehensive and educational response based on user input, category, and retrieved information.
    The response adheres to non-clinical, non-diagnostic, and non-therapeutic principles.
    """
    response_parts = []

    # Empathetic opening based on category
    if category == 'academic stress':
        response_parts.append("It sounds like you're dealing with academic pressures. Many people face similar challenges, and there are constructive ways to approach them.")
    elif category == 'financial concern':
        response_parts.append("Navigating financial concerns can be really challenging. It's commendable that you're looking for information and support.")
    elif category == 'relationship issues':
        response_parts.append("Relationship dynamics can be complex, and it's understandable to seek guidance when facing difficulties.")
    elif category == 'general anxiety':
        response_parts.append("Feeling anxious or overwhelmed is a common experience, and there are various strategies that can help you manage these feelings.")
    elif category == 'work stress':
        response_parts.append("Work-related stress is a significant concern for many. Finding effective ways to manage it is important for your well-being.")
    elif category == 'health concerns':
        response_parts.append("It's good that you're paying attention to your health. Understanding general wellness practices can be very empowering.")
    elif category == 'future uncertainty':
        response_parts.append("Facing uncertainty about the future can be daunting. Exploring options and setting small goals can often bring clarity.")
    else: # 'other' category
        response_parts.append("It seems you're looking for some general support or information. I'm here to provide educational insights.")

    response_parts.append("\nHere is some educational information that might be helpful:\n")

    # Add retrieved information
    if retrieved_info:
        for i, info in enumerate(retrieved_info):
            response_parts.append(f"**{i+1}. {info['title']}**\n{info['content']}\n")
    else:
        response_parts.append("I couldn't find specific information related to your exact query within my knowledge base at this moment, but the general tips below might still be relevant.")

    # General educational disclaimer and call to action for professional help
    response_parts.append("\n---\n")
    response_parts.append("**Important Educational Disclaimer:** The information provided here is for general educational and informational purposes only, and does not constitute medical, psychological, or professional advice. It is not a substitute for professional diagnosis, treatment, or therapy. If you are experiencing a crisis or believe you are in danger, please seek immediate assistance from qualified professionals or emergency services. This system is not designed to provide clinical or diagnostic services.")

    return "\n".join(response_parts)

# Demonstrate the generate_response() function
print("\n--- Response Generation Demonstration ---")

# Re-use the sample query and retrieved results from the previous step
# sample_user_query = "I'm worried about my upcoming exams and grades."
# user_query_embedding = get_embeddings(sample_user_query)
# retrieved_results = retrieve_info(user_query_embedding, top_k=3)
# Assuming `sample_user_query`, `user_query_embedding`, `retrieved_results` are still in scope from previous execution

# Determine the category for the sample user query
sample_category = categorize_input(sample_user_query)
print(f"User Input: '{sample_user_query}'")
print(f"Categorized as: '{sample_category}'")

generated_response = generate_response(sample_user_query, sample_category, retrieved_results)
print("\nGenerated Response:")
print(generated_response)

print("Response generation component implemented and demonstrated.")


--- Response Generation Demonstration ---
User Input: 'I'm worried about my upcoming exams and grades.'
Categorized as: 'None'

Generated Response:
It seems you're looking for some general support or information. I'm here to provide educational insights.

Here is some educational information that might be helpful:

**1. Dealing with Exam Anxiety**
Practice relaxation techniques like deep breathing before and during exams. Ensure you are well-rested and nourished. Focus on what you know and trust your preparation instead of worrying about outcomes.

**2. Managing Study Pressure**
Break down large academic tasks into smaller, manageable steps. Create a realistic study schedule and allocate time for regular breaks to avoid burnout. Remember that your well-being is just as important as your academic performance.

**3. Coping with Overwhelm**
When feeling overwhelmed, identify specific stressors and break challenges into smaller, manageable steps. Engaging in activities you enjoy can offer

## Phase 7: Safety and Fallback Logic

This phase introduces safety mechanisms to ensure responsible system behavior.

It includes:
- Detection of high-risk keywords
- Confidence checks on retrieved information
- Fallback responses when content is out of scope or uncertain

User safety is always prioritized over response generation.


In [8]:
HIGH_RISK_KEYWORDS = [
    'suicide',
    'kill myself',
    'self-harm',
    'harm myself',
    'end my life',
    'end it all',
    'give up on life',
    'about to die',
    'help me now',
    'emergency',
    'urgent'
]

SAFETY_THRESHOLD = 0.5 # Minimum similarity score for retrieved info to be considered 'high confidence'

def handle_safety_and_fallback(user_input, retrieved_info):
    """Implements safety checks and fallback logic.

    Checks for high-risk keywords in user input and assesses retrieval confidence.
    Returns a safety response if conditions are met, otherwise signals to proceed.
    """
    user_input_lower = user_input.lower()

    # 1. Rule-based safety check for high-risk queries
    for keyword in HIGH_RISK_KEYWORDS:
        import re
        if re.search(r'\b' + re.escape(keyword) + r'\b', user_input_lower):
            print(f"\n!!! SAFETY ALERT: High-risk keyword detected: '{keyword}' !!!")
            return {
                'status': 'safety_triggered',
                'message': "It sounds like you might be in a difficult or dangerous situation. Please reach out to emergency services or a crisis hotline immediately. This assistant cannot provide medical, psychological, or emergency help. Your safety is paramount. Contact a professional immediately."
            }

    # 2. Confidence threshold for retrieved information
    if not retrieved_info:
        print("\n--- Fallback: No information retrieved ---")
        return {
            'status': 'low_confidence',
            'message': "I couldn't find specific information related to your exact query within my knowledge base. It's possible your concern is outside my scope, or I don't have enough relevant information. Please refer to general support resources, or rephrase your question."
        }

    max_similarity = max([info['similarity_score'] for info in retrieved_info])

    if max_similarity < SAFETY_THRESHOLD:
        print(f"\n--- Fallback: Low retrieval confidence (Max Similarity: {max_similarity:.2f}) ---")
        return {
            'status': 'low_confidence',
            'message': f"I found some information, but my confidence in its direct relevance is low (max similarity: {max_similarity:.2f}). Please remember that this assistant provides educational content only. For more direct support, consider seeking guidance from a human expert."
        }

    # If no safety issues and sufficient confidence, proceed normally
    return {'status': 'proceed'}

# --- Demonstration of handle_safety_and_fallback() ---
print("\n--- Demonstrating Safety and Fallback Logic ---")

# Example 1: Triggers a safety warning
safety_query = "I'm feeling so desperate, I want to end my life."
# No retrieval needed for safety check, but we pass an empty list for completeness
safety_check_result_1 = handle_safety_and_fallback(safety_query, [])
print(f"Query: '{safety_query}'\nResult Status: {safety_check_result_1['status']}\nResult Message: {safety_check_result_1['message']}\n")

# Example 2: Low retrieval confidence (very generic/out-of-scope)
low_confidence_query = "Tell me about the history of computers."
low_confidence_embedding = get_embeddings(low_confidence_query)
low_confidence_retrieved_info = retrieve_info(low_confidence_embedding, top_k=1) # Get top 1 to easily check max similarity
safety_check_result_2 = handle_safety_and_fallback(low_confidence_query, low_confidence_retrieved_info)
print(f"Query: '{low_confidence_query}'\nResult Status: {safety_check_result_2['status']}\nResult Message: {safety_check_result_2['message']}\n")

# Example 3: Passes safety and has high confidence (re-using academic stress query)
high_confidence_query = "I'm worried about my upcoming exams and grades."
high_confidence_embedding = get_embeddings(high_confidence_query)
high_confidence_retrieved_info = retrieve_info(high_confidence_embedding, top_k=3)
safety_check_result_3 = handle_safety_and_fallback(high_confidence_query, high_confidence_retrieved_info)
print(f"Query: '{high_confidence_query}'\nResult Status: {safety_check_result_3['status']}\n")

print("Safety and fallback logic implemented and demonstrated.")


--- Demonstrating Safety and Fallback Logic ---

!!! SAFETY ALERT: High-risk keyword detected: 'end my life' !!!
Query: 'I'm feeling so desperate, I want to end my life.'
Result Status: safety_triggered
Result Message: It sounds like you might be in a difficult or dangerous situation. Please reach out to emergency services or a crisis hotline immediately. This assistant cannot provide medical, psychological, or emergency help. Your safety is paramount. Contact a professional immediately.


--- Fallback: Low retrieval confidence (Max Similarity: 0.06) ---
Query: 'Tell me about the history of computers.'
Result Status: low_confidence
Result Message: I found some information, but my confidence in its direct relevance is low (max similarity: 0.06). Please remember that this assistant provides educational content only. For more direct support, consider seeking guidance from a human expert.


--- Fallback: Low retrieval confidence (Max Similarity: 0.49) ---
Query: 'I'm worried about my upco

## Phase 8: Explainability

Explainability improves trust and transparency in AI systems.

Here, the assistant clearly shows:
- Which knowledge base entries were used
- The category and title of each retrieved source

This reinforces that responses are grounded in curated information and not generated advice.


In [9]:
def generate_response_with_sources(user_input, category, retrieved_info):
    """Generates a comprehensive and educational response with sources, based on user input, category, and retrieved information.
    The response adheres to non-clinical, non-diagnostic, and non-therapeutic principles, and includes source information.
    """
    response_parts = []

    # Empathetic opening based on category
    if category == 'academic stress':
        response_parts.append("It sounds like you're dealing with academic pressures. Many people face similar challenges, and there are constructive ways to approach them.")
    elif category == 'financial concern':
        response_parts.append("Navigating financial concerns can be really challenging. It's commendable that you're looking for information and support.")
    elif category == 'relationship issues':
        response_parts.append("Relationship dynamics can be complex, and it's understandable to seek guidance when facing difficulties.")
    elif category == 'general anxiety':
        response_parts.append("Feeling anxious or overwhelmed is a common experience, and there are various strategies that can help you manage these feelings.")
    elif category == 'work stress':
        response_parts.append("Work-related stress is a significant concern for many. Finding effective ways to manage it is important for your well-being.")
    elif category == 'health concerns':
        response_parts.append("It's good that you're paying attention to your health. Understanding general wellness practices can be very empowering.")
    elif category == 'future uncertainty':
        response_parts.append("Facing uncertainty about the future can be daunting. Exploring options and setting small goals can often bring clarity.")
    else: # 'other' category
        response_parts.append("It seems you're looking for some general support or information. I'm here to provide educational insights.")

    response_parts.append("\nHere is some educational information that might be helpful:\n")

    # Add retrieved information with sources
    if retrieved_info:
        for i, info in enumerate(retrieved_info):
            response_parts.append(f"**{i+1}. {info['title']}** (Category: {info['category'].title()})\n{info['content']}\n")
    else:
        response_parts.append("I couldn't find specific information related to your exact query within my knowledge base at this moment, but the general tips below might still be relevant.")

    # General educational disclaimer and call to action for professional help
    response_parts.append("\n---\n")
    response_parts.append("**Important Educational Disclaimer:** The information provided here is for general educational and informational purposes only, and does not constitute medical, psychological, or professional advice. It is not a substitute for professional diagnosis, treatment, or therapy. If you are experiencing a crisis or believe you are in danger, please seek immediate assistance from qualified professionals or emergency services. This system is not designed to provide clinical or diagnostic services.")

    return "\n".join(response_parts)

# Demonstrate the updated response generation with sources
print("\n--- Response Generation with Sources Demonstration ---")

# Re-use the sample query and retrieved results from previous steps
# sample_user_query = "I'm worried about my upcoming exams and grades."
# user_query_embedding = get_embeddings(sample_user_query)
# retrieved_results = retrieve_info(user_query_embedding, top_k=3)
# Assuming `sample_user_query`, `user_query_embedding`, `retrieved_results` are still in scope

sample_category = categorize_input(sample_user_query)
print(f"User Input: '{sample_user_query}'")
print(f"Categorized as: '{sample_category}'")

generated_response_with_sources = generate_response_with_sources(sample_user_query, sample_category, retrieved_results)
print("\nGenerated Response (with sources):")
print(generated_response_with_sources)

print("Response generation now includes source information.")


--- Response Generation with Sources Demonstration ---
User Input: 'I'm worried about my upcoming exams and grades.'
Categorized as: 'None'

Generated Response (with sources):
It seems you're looking for some general support or information. I'm here to provide educational insights.

Here is some educational information that might be helpful:

**1. Dealing with Exam Anxiety** (Category: Academic Stress)
Practice relaxation techniques like deep breathing before and during exams. Ensure you are well-rested and nourished. Focus on what you know and trust your preparation instead of worrying about outcomes.

**2. Managing Study Pressure** (Category: Academic Stress)
Break down large academic tasks into smaller, manageable steps. Create a realistic study schedule and allocate time for regular breaks to avoid burnout. Remember that your well-being is just as important as your academic performance.

**3. Coping with Overwhelm** (Category: General Anxiety)
When feeling overwhelmed, identify sp

## Phase 9: Evaluation

This phase demonstrates the complete assistant pipeline using sample inputs.

It validates:
- Input categorization
- Knowledge retrieval
- Safety handling
- Explainable response generation

Multiple scenarios are tested to showcase responsible behavior.


In [10]:
def run_assistant_pipeline(user_query):
    """Orchestrates the entire crisis assistant pipeline for a given user query.

    Args:
        user_query (str): The user's input concern.

    Returns:
        dict: A dictionary containing the processing steps and the final response.
    """
    pipeline_results = {
        'user_input': user_query,
        'category': None,
        'user_query_embedding': None,
        'retrieved_info': [],
        'safety_check_result': None,
        'final_response': None,
        'outcome_explanation': ''
    }

    # 1. Categorize the user query
    category = categorize_input(user_query)
    pipeline_results['category'] = category

    # 2. Generate embeddings for the user query
    user_query_embedding = get_embeddings(user_query)
    pipeline_results['user_query_embedding'] = user_query_embedding

    # 3. Retrieve relevant information from the embedded_knowledge_base
    retrieved_info = retrieve_info(user_query_embedding, top_k=3)
    pipeline_results['retrieved_info'] = retrieved_info

    # 4. Apply safety and fallback logic
    safety_check_result = handle_safety_and_fallback(user_query, retrieved_info)
    pipeline_results['safety_check_result'] = safety_check_result

    # 5. Generate final response based on safety check result
    if safety_check_result['status'] == 'safety_triggered':
        pipeline_results['final_response'] = safety_check_result['message']
        pipeline_results['outcome_explanation'] = 'Safety alert triggered due to high-risk keywords.'
    elif safety_check_result['status'] == 'low_confidence':
        # Even if low confidence, we can still try to generate a response, but indicate its limitations
        # Or, as per instructions, return the fallback message directly if it indicates a problem.
        pipeline_results['final_response'] = safety_check_result['message'] + "\n\n" + generate_response_with_sources(user_query, category, retrieved_info)
        max_sim = max([r['similarity_score'] for r in retrieved_info]) if retrieved_info else 0.0
        pipeline_results['outcome_explanation'] = f"Low retrieval confidence detected (max sim: {max_sim:.2f})."
    else: # status == 'proceed'
        pipeline_results['final_response'] = generate_response_with_sources(user_query, category, retrieved_info)
        pipeline_results['outcome_explanation'] = 'Relevant educational information provided.'

    return pipeline_results

print("The `run_assistant_pipeline` function has been defined.")

The `run_assistant_pipeline` function has been defined.


## End-to-End Pipeline Demonstration

This section tests the complete crisis response pipeline using diverse sample inputs, demonstrating input categorization, safety checks, knowledge retrieval, and response generation.


In [11]:
print("\n--- Demonstrating End-to-End Pipeline with Sample Queries ---")

sample_queries = [
    "I feel so depressed, I just want to end it all.", # High-risk query
    "My exams are next week and I'm super stressed about studying.", # Academic stress, high confidence
    "I have no idea how to pay my bills this month, money is so tight.", # Financial concern
    "My girlfriend and I just broke up, feeling so lonely.", # Relationship issues
    "What is the meaning of life?", # Low confidence/Out of scope
    "I'm just generally anxious and overwhelmed by everything.", # General anxiety
    "My project deadline at work is looming, and I can't keep up.", # Work stress
    "I'm worried about some symptoms I've been having.", # Health concerns
    "I don't know what career path to take after graduation.", # Future uncertainty
    "Can you tell me about butterflies?" # Low confidence/Out of scope
]

for i, query in enumerate(sample_queries):
    print(f"\n----- Sample Query {i+1} -----")
    print(f"User Input: '{query}'")

    results = run_assistant_pipeline(query)

    print(f"Categorized as: '{results['category']}'")

    if results['safety_check_result']['status'] == 'safety_triggered':
        print(f"Outcome: {results['outcome_explanation']}")
        print(f"Assistant Response: {results['final_response']}")
    elif results['safety_check_result']['status'] == 'low_confidence':
        print(f"Outcome: {results['outcome_explanation']}")
        print(f"Assistant Response: {results['final_response']}")
    else:
        print(f"Retrieved Info (top {len(results['retrieved_info'])}):")
        for j, info in enumerate(results['retrieved_info']):
            print(f"  {j+1}. Title: '{info['title']}' (Category: {info['category'].title()}, Similarity: {info['similarity_score']:.4f})")
            # print(f"     Content: {info['content'][:100]}...") # Optional: print a snippet of content
        print(f"Outcome: {results['outcome_explanation']}")
        print(f"Assistant Response: {results['final_response']}")

print("\nEnd-to-end pipeline demonstration complete.")


--- Demonstrating End-to-End Pipeline with Sample Queries ---

----- Sample Query 1 -----
User Input: 'I feel so depressed, I just want to end it all.'

!!! SAFETY ALERT: High-risk keyword detected: 'end it all' !!!
Categorized as: 'None'
Outcome: Safety alert triggered due to high-risk keywords.
Assistant Response: It sounds like you might be in a difficult or dangerous situation. Please reach out to emergency services or a crisis hotline immediately. This assistant cannot provide medical, psychological, or emergency help. Your safety is paramount. Contact a professional immediately.

----- Sample Query 2 -----
User Input: 'My exams are next week and I'm super stressed about studying.'
Categorized as: 'None'
Retrieved Info (top 3):
  1. Title: 'Dealing with Exam Anxiety' (Category: Academic Stress, Similarity: 0.6376)
  2. Title: 'Managing Study Pressure' (Category: Academic Stress, Similarity: 0.4836)
  3. Title: 'Coping with Overwhelm' (Category: General Anxiety, Similarity: 0.4420

## Phase 10: Simple Interactive Interface

This section provides a simple text-based interface to demonstrate the
end-to-end functioning of the Crisis Response Assistant.

Users can enter a concern and receive an educational, non-clinical response
generated using the full pipeline, including safety checks and explainability.


In [None]:
print("\n--- Crisis Response Assistant: Interactive Interface ---")
print("Type 'exit' to end the conversation.")

while True:
    user_input = input("\nYour concern: ")

    if user_input.lower() == 'exit':
        print("Thank you for using the Crisis Response Assistant. Goodbye!")
        break

    if not user_input.strip():
        print("Please enter a valid concern.")
        continue

    print("Processing your concern...")
    results = run_assistant_pipeline(user_input)

    # Display the final response
    print("\nAssistant's Response:")
    print(results['final_response'])

    # Optionally, provide a brief explanation of the outcome
    print(f"\nOutcome Explanation: {results['outcome_explanation']}")

print("Interactive interface demonstration complete.")

## Project Summary

This notebook presented a **Crisis Response Assistant using NLP and Retrieval-Augmented Generation (RAG)** designed strictly for **non-clinical, non-diagnostic, and educational use**.

### Key Highlights
- A structured knowledge base containing supportive, pre-curated educational content
- Semantic retrieval using sentence embeddings and cosine similarity
- Safe and empathetic response generation grounded only in retrieved knowledge
- Rule-based safety detection for high-risk inputs with immediate fallback guidance
- Explainable responses showing the source category and title of retrieved information
- End-to-end pipeline demonstration with diverse user scenarios
- A simple interactive text interface for real-time usage

Throughout the system, **responsible AI principles** such as transparency, safety, limitation awareness, and ethical boundaries were strictly followed.
