<a href="https://colab.research.google.com/github/AmitKPandey11/AI-agent-project/blob/main/Email_Agent_Colab_Friendly_(1).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Email Agent Demo — Colab Friendly

This notebook is designed to run on Google Colab with modest CPU/GPU use.
It demonstrates a lightweight **Retrieval-Augmented Generation (RAG)** style pipeline
combined with **sentiment analysis** to draft context-aware, template-based email replies.

✅ **Highlights:**
- Minimal installs
- FAISS + Sentence-Transformer (MiniLM) embeddings
- TF-IDF fallback if the MiniLM model cannot be loaded (for example, offline)
- HuggingFace sentiment pipeline with heuristic fallback


In [1]:
# Step 1: Check environment and setup
import os, sys
print('Python version:', sys.version)
try:
    import torch
    print('CUDA available:', torch.cuda.is_available())
except ImportError:
    print('Torch not installed')
os.environ.setdefault('HF_HOME', '/content/hf_cache')
os.environ.setdefault('TRANSFORMERS_OFFLINE', '0')

Python version: 3.12.12 (main, Oct 10 2025, 08:52:57) [GCC 11.4.0]
CUDA available: True


'0'

In [2]:
# Step 2: Install minimal dependencies
!pip install -q sentence-transformers faiss-cpu transformers scikit-learn

## Step 3: Create sample documents & embeddings

In [3]:
from pathlib import Path
import numpy as np, json
OUT = Path('email_agent_demo'); OUT.mkdir(exist_ok=True)

docs = {
  'product_policy.txt': 'Refunds are available within 30 days of purchase.',
  'campaign_report.txt': 'July campaign saw a 20% drop in conversions due to overspend.',
  'safety_guidelines.txt': 'Escalate any issues to the on-call manager promptly.'
}

for k,v in docs.items():
    (OUT/k).write_text(v, encoding='utf-8')

# Try to build embeddings using MiniLM
from sentence_transformers import SentenceTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
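try:
    model = SentenceTransformer('all-MiniLM-L6-v2')
    X = model.encode(list(docs.values()), show_progress_bar=False)
    print('Sentence-Transformer embeddings created:', X.shape)
except Exception as e:
    # Fallback: TF-IDF vectors. Their dimensionality differs from MiniLM's 384,
    # so queries must then be encoded with `vec`, not `model`.
    print('SentenceTransformer failed, falling back to TF-IDF:', e)
    vec = TfidfVectorizer()
    X = vec.fit_transform(list(docs.values())).toarray()
np.save(OUT/'embeddings.npy', X)

# Store source names and texts so search hits can be mapped back to documents
with open(OUT/'meta.json', 'w', encoding='utf-8') as f:
    json.dump([{'source': k, 'text': v} for k, v in docs.items()], f)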

Sentence-Transformer embeddings created: (3, 384)
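
If the MiniLM model fails to load, the cell above saves TF-IDF vectors instead, and retrieval then has to embed queries with the same fitted vectorizer. A minimal sketch of such a fallback retriever, assuming scikit-learn is installed; `retrieve_tfidf`, `tfidf`, and `doc_matrix` are hypothetical names that do not appear in the notebook:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = {
    'product_policy.txt': 'Refunds are available within 30 days of purchase.',
    'campaign_report.txt': 'July campaign saw a 20% drop in conversions due to overspend.',
    'safety_guidelines.txt': 'Escalate any issues to the on-call manager promptly.',
}
sources = list(docs)

# Keep the fitted vectorizer: queries must be transformed with the same vocabulary.
tfidf = TfidfVectorizer()
doc_matrix = tfidf.fit_transform(list(docs.values()))

def retrieve_tfidf(query, k=3):
    """Hypothetical fallback retriever: rank documents by TF-IDF cosine similarity."""
    k = min(k, len(sources))  # never ask for more hits than there are documents
    scores = cosine_similarity(tfidf.transform([query]), doc_matrix)[0]
    top = scores.argsort()[::-1][:k]  # indices sorted by descending score
    return [(sources[i], float(scores[i])) for i in top]

results = retrieve_tfidf('Are refunds available within 30 days?', k=1)
print(results)
```

This path skips FAISS entirely, which seems reasonable for a handful of documents; for a larger corpus the same vectors could be indexed instead.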


## Step 4: Build FAISS index (CPU)

In [4]:
import faiss
emb = np.load('email_agent_demo/embeddings.npy').astype('float32')
# L2-normalize each row so that inner product equals cosine similarity
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
index = faiss.IndexFlatIP(emb.shape[1])  # exact (brute-force) inner-product index
index.add(emb)
faiss.write_index(index, 'email_agent_demo/faiss.index')
print('FAISS index built:', index.ntotal, 'documents')

FAISS index built: 3 documents
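
The index uses inner product on L2-normalized vectors, which is why its scores can be read as cosine similarities. A small pure-Python check of that identity, with made-up example vectors (`a` and `b` are illustrative and unrelated to the notebook's embeddings):

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity computed directly from the definition."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]

# Inner product after normalization should match the cosine similarity.
ip_of_normalized = dot(l2_normalize(a), l2_normalize(b))
print(ip_of_normalized, cosine(a, b))
```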


## Step 5: Define retrieval and sentiment functions

In [5]:
import json, faiss
import numpy as np
from sentence_transformers import SentenceTransformer

with open('email_agent_demo/meta.json', encoding='utf-8') as f:
    meta = json.load(f)
model = SentenceTransformer('all-MiniLM-L6-v2')
index = faiss.read_index('email_agent_demo/faiss.index')

def retrieve(query, k=3):
    """Return the top-k documents and their cosine-similarity scores."""
    k = min(k, index.ntotal)  # FAISS pads results with -1 when k exceeds the index size
    q = model.encode([query]).astype('float32')
    q /= np.linalg.norm(q, axis=1, keepdims=True)
    D, I = index.search(q, k)
    return [meta[i] for i in I[0]], D[0].tolist()

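_sentiment_pipeline = None  # created once and reused across calls

def analyze_sentiment(text):
    """Classify text with the HuggingFace pipeline; fall back to a keyword heuristic."""
    global _sentiment_pipeline
    try:
        if _sentiment_pipeline is None:
            from transformers import pipeline
            _sentiment_pipeline = pipeline('sentiment-analysis')
        return _sentiment_pipeline(text[:512])[0]
    except Exception:
        # Heuristic fallback: substring match on a few negative keywords
        low = text.lower()
        if any(w in low for w in ['angry','issue','problem','bad']):
            return {'label':'NEGATIVE','score':0.9}
        return {'label':'POSITIVE','score':0.8}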
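
The keyword fallback in `analyze_sentiment` matches substrings, so a word such as "badge" would trigger on "bad". A possible variant that matches whole words instead, using the same four keywords and the same return values; `heuristic_sentiment` and `NEGATIVE_WORDS` are hypothetical names introduced for this sketch:

```python
import re

# Same keywords as the notebook's fallback heuristic.
NEGATIVE_WORDS = {'angry', 'issue', 'problem', 'bad'}

def heuristic_sentiment(text):
    """Hypothetical whole-word variant of the keyword fallback."""
    # Split into lowercase word tokens so 'badge' does not match 'bad'.
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    if tokens & NEGATIVE_WORDS:
        return {'label': 'NEGATIVE', 'score': 0.9}
    return {'label': 'POSITIVE', 'score': 0.8}

neutral = heuristic_sentiment('Please send me a new badge.')
negative = heuristic_sentiment('This was a bad experience.')
print(neutral['label'], negative['label'])
```

The trade-off is that whole-word matching misses inflected forms such as "issues" or "problems", which the substring version does catch; the keyword set would need to list those explicitly.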

## Step 6: Define EmailAgent class

In [6]:
class EmailAgent:
    def __init__(self, retriever, sentiment):
        self.retriever = retriever  # callable: query -> (docs, scores)
        self.sentiment = sentiment  # callable: text -> {'label': ..., 'score': ...}

    def respond(self, email_text):
        sent = self.sentiment(email_text)
        docs, scores = self.retriever(email_text)
        tone = 'empathetic' if sent['label'].lower().startswith('neg') else 'positive'
        body = f'Tone detected: {tone}\n\nBased on retrieved documents:'
        for d, s in zip(docs, scores):
            body += f"\n- {d['text']} (relevance: {round(float(s),3)})"
        body += '\n\nNext steps: I will investigate and respond shortly.'
        return body

## Step 7: Run demo

In [7]:
agent = EmailAgent(retrieve, analyze_sentiment)
incoming = "Hi team, the July campaign performed poorly, and I am upset about the drop in conversions."
print(agent.respond(incoming))

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cuda:0


Tone detected: empathetic

Based on retrieved documents:
- July campaign saw a 20% drop in conversions due to overspend. (relevance: 0.776)
- Escalate any issues to the on-call manager promptly. (relevance: 0.103)
- Refunds are available within 30 days of purchase. (relevance: 0.08)

Next steps: I will investigate and respond shortly.
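
In the output above, only the campaign report is clearly relevant (0.776); the other two documents score near 0.1 but are still quoted in the reply. One way to handle this, sketched here as an assumption rather than part of the notebook, is to drop hits below a cutoff before composing the reply. The helper name `filter_by_relevance` and the `0.3` threshold are hypothetical and would need tuning on real data:

```python
def filter_by_relevance(docs, scores, min_score=0.3):
    """Hypothetical helper: keep only (doc, score) pairs scoring at least min_score."""
    return [(d, s) for d, s in zip(docs, scores) if s >= min_score]

# Documents and scores as printed by the demo run above.
docs = [
    {'source': 'campaign_report.txt',
     'text': 'July campaign saw a 20% drop in conversions due to overspend.'},
    {'source': 'safety_guidelines.txt',
     'text': 'Escalate any issues to the on-call manager promptly.'},
    {'source': 'product_policy.txt',
     'text': 'Refunds are available within 30 days of purchase.'},
]
scores = [0.776, 0.103, 0.08]

kept = filter_by_relevance(docs, scores)
for d, s in kept:
    print(f"- {d['text']} (relevance: {round(s, 3)})")
```

Inside `EmailAgent.respond`, the loop over `zip(docs, scores)` could iterate over the filtered pairs instead.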
