#  **Cold emailing**

### **Introduction**

This code implements an **Automated Cold Emailing System** designed for businesses to streamline their outreach process using AI-driven content generation and efficient email management workflows. The system integrates various components to handle the complete email lifecycle, from crafting personalized messages to managing responses and logging communication for future reference.

### **Key Features:**
1. **Automated Email Generation & Sending:**
   - Leverages **Meta LLaMA-3.2-3B-Instruct** for generating customized email content tailored to specific company needs.
   - Extracts relevant product information from a **PDF catalog** to enhance email personalization using **Retrieval-Augmented Generation (RAG)**.
   - Sends emails via **SMTP (Gmail)** with attachments and stores sent emails in an **SQLite database** for easy tracking.

2. **Response Management & Classification:**
   - Periodically checks for incoming responses using **IMAP**, categorizing replies as:
     - **Not Interested** (ignored)
     - **Need More Details** (handled by querying the AI with product data)
     - **Meeting Request** (forwarded to HR automatically)
   - Uses Facebook's **BART-large-mnli** zero-shot classifier to assign each reply to one of these categories.

3. **Product Recommendation System:**
   - Utilizes **FAISS (Facebook AI Similarity Search)** and **Sentence Transformers** for efficient retrieval of relevant product suggestions based on customer needs.
   - Embeds product descriptions to find the most relevant offerings in real time.

4. **Database & Event-Driven Architecture:**
   - **SQLite database** manages email logs (sent/received) with timestamp tracking.
   - Event-driven logic ensures that responses are processed automatically without manual intervention.

5. **Technologies & Libraries Used:**
   - **Python Libraries:** `pandas`, `transformers`, `smtplib`, `PyPDF2`, `faiss`, `sqlite3`, `sentence-transformers`
   - **AI Models:** Meta LLaMA for email generation, BART-large-mnli for zero-shot classification
   - **Email Protocols:** SMTP (sending) & IMAP (receiving)
   - **Database:** SQLite for email logging and tracking

### **Usage Workflow:**
1. Reads company information from a CSV file and product details from a PDF catalog.
2. Generates personalized emails using AI, attaches relevant product details, and sends them via Gmail SMTP.
3. Stores all sent and received emails in an SQLite database for future reference.
4. Continuously monitors email responses, classifies them, and triggers the next action accordingly.
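
As a rough illustration of step 1, the snippet below parses a small in-memory CSV with the four columns the sending loop reads (`Email`, `Company Name`, `Industry`, `Need`). The sample row and the `load_companies` helper are made up for this sketch; the real file may contain additional columns.

```python
import csv
import io

# Hypothetical sample data; only the column names come from the notebook's code.
SAMPLE_CSV = """Company Name,Email,Industry,Need
Example Corp,contact@example.com,Healthcare,VR-based medical training simulators
"""

REQUIRED_COLUMNS = ("Email", "Company Name", "Industry", "Need")

def load_companies(text):
    """Parse CSV text and check that every required column is present."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"CSV is missing columns: {missing}")
    return list(reader)

companies = load_companies(SAMPLE_CSV)
print(companies[0]["Company Name"], "needs:", companies[0]["Need"])
```

Validating the header up front means a misnamed column fails immediately, before any model is loaded or any email is sent.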


# Install the necessary Python libraries
This cell installs the Python libraries that the cold emailing automation system needs. These packages include:

* pandas: For data manipulation and analysis, especially with CSV files.

* transformers: To work with pre-trained language models for generating email content.

* smtplib: To send emails using the Simple Mail Transfer Protocol (SMTP). It ships with the Python standard library, so it does not need to be installed with pip.

* PyPDF2: For extracting product information from PDF documents.

* bitsandbytes: Optimizes model loading and quantization for efficient processing.

* faiss-cpu: A library for efficient similarity search, useful for finding relevant products based on company needs.




In [None]:
!pip install pandas transformers sentence-transformers
!pip install PyPDF2 bitsandbytes
!pip install faiss-cpu

Collecting PyPDF2
  Downloading pypdf2-3.0.1-py3-none-any.whl.metadata (6.8 kB)
Collecting bitsandbytes
  Downloading bitsandbytes-0.45.2-py3-none-manylinux_2_24_x86_64.whl.metadata (5.8 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch<3,>=2.0->bitsandbytes)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch<3,>=2.0->bitsandbytes)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch<3,>=2.0->bitsandbytes)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch<3,>=2.0->bitsandbytes)
  Downloading nvidia_

# Log in to Hugging Face
This cell logs in to Hugging Face so the notebook can download pre-trained models from the Hugging Face Model Hub. Authentication is required here because the Llama 3.2 model that generates the email content is gated and can only be downloaded by accounts that have been granted access.
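
For unattended runs, a non-interactive login can replace the prompt. The sketch below assumes the token is stored in an environment variable named `HF_TOKEN` (an assumption, not something this notebook sets) and uses `huggingface_hub.login`; the helper names are hypothetical.

```python
import os

def get_hf_token(env_var="HF_TOKEN"):
    """Read the Hugging Face token from the environment, failing loudly if absent."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return token

def login_from_env():
    """Log in without an interactive prompt."""
    # Imported lazily so get_hf_token() works even without huggingface_hub installed.
    from huggingface_hub import login
    login(token=get_hf_token())
```

Calling `login_from_env()` then has the same effect as pasting the token into the prompt, without the token ever appearing in the notebook.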

In [None]:
!huggingface-cli login


    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    To log in, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible): 
Add token as git credential? (Y/n) y
Token is valid (permission: fineGrained).
The token `Cosmos` has been saved to /root/.cache/huggingface/stored_tokens
[1m[31mCannot authenticate through git-credential as no helper is defined on your machine.
You might have to re-authe

# Script to send the first round of emails
This cell initializes the core functionalities of the cold emailing system, including:

* Library Imports: Imports necessary modules for email handling, database management, NLP, and PDF processing.

* SMTP Configuration: Configures the Gmail SMTP server to send emails securely.

* Credentials Setup: Uses Gmail credentials (with an app password) for authentication.

* Data Handling: Reads company details from a CSV file and extracts product information from a PDF catalog.

* Database Initialization: Sets up an SQLite database to log sent and received emails.

* Product Extraction: Extracts and organizes product details from the PDF.

* Embedding & Similarity Search: Creates FAISS indexes to find relevant products based on company needs.

* Email Generation: Uses an LLM to craft personalized emails with compelling subjects and content.

* Email Sending: Sends customized emails to all companies listed in the CSV and logs them in the database.

* Database Interaction: Includes functions to log, retrieve, and categorize emails for efficient tracking.
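
To make the similarity-search step concrete, here is a dependency-free sketch of what a flat L2 index computes: it compares the query vector against every stored vector and returns the closest ones by squared Euclidean distance. It uses tiny hand-written vectors in place of real sentence embeddings, and `flat_l2_search` is an illustrative helper, not part of FAISS.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def flat_l2_search(vectors, query, top_k=3):
    """Return (distance, index) pairs for the top_k vectors closest to query."""
    scored = sorted((squared_l2(v, query), i) for i, v in enumerate(vectors))
    return scored[:top_k]

# Toy 2-D "embeddings" standing in for product descriptions.
product_vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
results = flat_l2_search(product_vectors, [0.9, 1.2], top_k=2)
print([index for _, index in results])  # indices of the two nearest vectors
```

The exhaustive scan is exact but grows linearly with the catalog size, which is acceptable for a product list of this scale.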

In [None]:
import smtplib
import os
import pandas as pd
import PyPDF2
import torch
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from transformers import pipeline
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np
import re
import sqlite3

# Gmail SMTP Configuration
SMTP_SERVER = "smtp.gmail.com"
SMTP_PORT = 465

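# Your Gmail credentials (use an App Password, NOT your Gmail password).
# The password is read from an environment variable so it never sits in the notebook;
# the variable name GMAIL_APP_PASSWORD is an assumption, set it before running.
sender_email = "mahmouds3d1.1@gmail.com"
app_password = os.environ["GMAIL_APP_PASSWORD"]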

# File paths
csv_file = "/content/VR_and_Software_Companies_2.csv"
pdf_file = "/content/detailed_product_catalog.pdf"

model = pipeline("text-generation", model="meta-llama/Llama-3.2-3B-Instruct", torch_dtype=torch.bfloat16, device_map="auto")

sender_name = "Mahmoud Saad"
sender_company = "M.S"

# Read CSV containing company details
companies_df = pd.read_csv(csv_file)

# Connect to SQLite database (or create it if it doesn't exist)
conn = sqlite3.connect('emails.db')
cursor = conn.cursor()

# Create a table to store emails
cursor.execute('''
CREATE TABLE IF NOT EXISTS emails (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    company_name TEXT NOT NULL,
    email TEXT NOT NULL,
    subject TEXT NOT NULL,
    message TEXT NOT NULL,
    category TEXT NOT NULL,  -- 'sent' or 'received'
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
)
''')

conn.commit()

def extract_subject_and_body(conversation):
    # Find the assistant's response
    assistant_response = next((item for item in conversation if item.get('role') == 'assistant'), None)

    if assistant_response:
        response_text = assistant_response.get('content', '')
    else:
        return "No Subject", "No Content"

    # Extract subject
    subject_match = re.search(r"Subject:\s*(.*)", response_text, re.IGNORECASE)
    subject = subject_match.group(1).strip() if subject_match else "No Subject"

    # Extract body: captures everything after the first "Subject" line
    body_match = re.search(r"Subject:.*?\n+(.*)", response_text, re.DOTALL)
    email_body = body_match.group(1).strip() if body_match else response_text.strip()

    return subject, email_body

# Extract product information from PDF
def extract_products_from_pdf(pdf_path):
    products = []
    with open(pdf_path, "rb") as file:
        reader = PyPDF2.PdfReader(file)
        text = "\n".join([page.extract_text() for page in reader.pages if page.extract_text()])

        lines = text.split("\n")
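        for i in range(len(lines)):
            # Each product is expected to span six consecutive lines; skip a
            # trailing partial entry instead of raising IndexError.
            if "Product:" in lines[i] and i + 5 < len(lines):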
                product_name = lines[i].replace("Product: ", "").strip()
                industry = lines[i + 1].replace("Industry: ", "").strip()
                description = lines[i + 2].replace("Description: ", "").strip()
                specifications = lines[i + 3].replace("Specifications: ", "").strip()
                case_study = lines[i + 4].replace("Case Study: ", "").strip()
                compliance = lines[i + 5].replace("Compliance: ", "").strip()
                products.append({
                    "Product Name": product_name,
                    "Industry": industry,
                    "Description": description,
                    "Specifications": specifications,
                    "Case Study": case_study,
                    "Compliance": compliance
                })
    return products

product_list = extract_products_from_pdf(pdf_file)

# Load Sentence Transformer model for embeddings
embedder = SentenceTransformer('all-MiniLM-L6-v2')

# Create FAISS index for product descriptions
def create_faiss_index(products):
    descriptions = [p['Description'] for p in products]
    embeddings = embedder.encode(descriptions)
    index = faiss.IndexFlatL2(embeddings.shape[1])
    index.add(np.array(embeddings))
    return index, descriptions

faiss_index, product_descriptions = create_faiss_index(product_list)

# Function to find relevant products using embeddings
def find_relevant_products(need, top_k=3):
    need_embedding = embedder.encode([need])
    distances, indices = faiss_index.search(np.array(need_embedding), top_k)
    return [product_list[i] for i in indices[0]]

# Function to generate customized email and subject using LLM
def generate_email_and_subject(company_name, industry, need):
    relevant_products = find_relevant_products(need)
    product_suggestions = "\n".join([f"- {p['Product Name']}: {p['Description']}" for p in relevant_products])

    messages = [
        {"role": "system", "content": "You are a professional email assistant specializing in crafting compelling and structured business emails."},
        {"role": "user", "content": f"Generate a professional email with a compelling subject for the team at {company_name}, a company in the {industry} industry.\n\nThe company has expressed a need for {need}, and we offer the following relevant products:\n\n{product_suggestions}\n\nThe email should include:\n- A relevant subject line\n- A personalized greeting\n- A concise, engaging introduction\n- A brief mention of the relevant products\n- A clear call to action for scheduling a discussion or demo\n- A professional closing with the sender's name ({sender_name}) and company ({sender_company})\n\nRespond with only the email subject and body."}
    ]

    output = model(messages, max_new_tokens=512)[0]['generated_text']
    print(output)
    subject, email_body = extract_subject_and_body(output)
    print(subject)
    return subject, email_body

# Function to send email and log it
def send_email(to_email, company_name, industry, need):
    try:
        subject, email_body = generate_email_and_subject(company_name, industry, need)

        message = MIMEMultipart()
        message["From"] = sender_email
        message["To"] = to_email
        message["Subject"] = subject
        message.attach(MIMEText(email_body, "plain"))

        with open(pdf_file, "rb") as attachment:
            part = MIMEBase("application", "octet-stream")
            part.set_payload(attachment.read())
        encoders.encode_base64(part)
        part.add_header("Content-Disposition", f"attachment; filename={os.path.basename(pdf_file)}")
        message.attach(part)

        with smtplib.SMTP_SSL(SMTP_SERVER, SMTP_PORT) as server:
            server.login(sender_email, app_password)
            server.sendmail(sender_email, to_email, message.as_string())

        # Log the sent email in the SQLite database
        cursor.execute('''
        INSERT INTO emails (company_name, email, subject, message, category)
        VALUES (?, ?, ?, ?, ?)
        ''', (company_name, to_email, subject, email_body, 'sent'))

        conn.commit()

    except Exception as e:
        print(f"❌ Error sending email to {company_name} ({to_email}): {e}")

# Function to store received emails
def store_received_email(company_name, email, subject, message):
    cursor.execute('''
    INSERT INTO emails (company_name, email, subject, message, category)
    VALUES (?, ?, ?, ?, ?)
    ''', (company_name, email, subject, message, 'received'))

    conn.commit()

# Function to retrieve emails
def get_emails(category=None):
    if category:
        cursor.execute('SELECT * FROM emails WHERE category = ?', (category,))
    else:
        cursor.execute('SELECT * FROM emails')

    return cursor.fetchall()

# Send emails to all companies
for _, row in companies_df.iterrows():
    send_email(row["Email"], row["Company Name"], row["Industry"], row["Need"])

# Example usage:
sent_emails = get_emails(category='sent')
received_emails = get_emails(category='received')

print("Sent Emails:")
for email in sent_emails:
    print(email)

print("\nReceived Emails:")
for email in received_emails:
    print(email)

# Close the database connection
conn.close()

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Device set to use cuda:0
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


[{'role': 'system', 'content': 'You are a professional email assistant specializing in crafting compelling and structured business emails.'}, {'role': 'user', 'content': "Generate a professional email with a compelling subject for GreenBuild Solutions team in a company in GreenBuild Solutions, a company in the Architecture & Construction industry.\n\nThe company has expressed a need for VR-based architectural visualization tools, and we offer the following relevant products:\n\n- VR Architect Pro: A high-end VR software for immersive architectural visualization.\n- ImmersiBuild VR: An advanced VR platform for architects to visualize and modify structures in real-time.\n- MediVR Sim: A VR-based medical training simulator for healthcare professionals.\n\nThe email should include:\n- A relevant subject line\n- A personalized greeting\n- A concise, engaging introduction\n- A brief mention of the relevant products\n- A clear call to action for scheduling a discussion or demo\n- A profession

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


[{'role': 'system', 'content': 'You are a professional email assistant specializing in crafting compelling and structured business emails.'}, {'role': 'user', 'content': "Generate a professional email with a compelling subject for MediTech Innovations team in a company in MediTech Innovations, a company in the Healthcare industry.\n\nThe company has expressed a need for VR-based medical training simulators, and we offer the following relevant products:\n\n- MediVR Sim: A VR-based medical training simulator for healthcare professionals.\n- MediSim XR: A next-generation extended reality (XR) medical training simulator for surgery and\n- EduVR Class: A VR-based virtual classroom solution for interactive online education.\n\nThe email should include:\n- A relevant subject line\n- A personalized greeting\n- A concise, engaging introduction\n- A brief mention of the relevant products\n- A clear call to action for scheduling a discussion or demo\n- A professional closing with the sender's nam
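
The printed output above is the chat history as a list of role/content dictionaries; the generated email is the entry whose role is `assistant`. The sketch below shows the parsing step on a made-up reply. The sample text is invented for illustration, and `split_subject_and_body` is a simplified stand-in for the notebook's `extract_subject_and_body`.

```python
import re

def split_subject_and_body(messages):
    """Pull the subject line and the body out of the assistant's message."""
    reply = next((m["content"] for m in messages if m.get("role") == "assistant"), None)
    if reply is None:
        return "No Subject", "No Content"
    match = re.search(r"^\s*Subject:\s*(.*)$", reply, re.IGNORECASE | re.MULTILINE)
    if not match:
        return "No Subject", reply.strip()
    # Everything after the subject line is treated as the body.
    return match.group(1).strip(), reply[match.end():].strip()

sample = [
    {"role": "user", "content": "Generate a professional email ..."},
    {"role": "assistant",
     "content": "Subject: Quick demo of VR tools\n\nHi team,\nWould you be open to a short call?"},
]
subject, body = split_subject_and_body(sample)
print(subject)
print(body)
```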


### 📧 **Script 1: `event_driven_emails.py`**  
**Purpose:**  
This script continuously monitors your Gmail inbox for unread emails. It handles the queuing and processing of these emails using an SQLite database and external scripts.  

#### 🗂️ **Key Components:**  
1. **Imports & Setup:**  
   - Libraries like `imaplib`, `email`, `sqlite3`, `subprocess`, and `psutil` are used for email handling, database interaction, and process management.  
   - Email credentials and paths for the database, queue, and process scripts are defined.  

2. **Database Initialization:**  
   - Creates an `emails` table in SQLite if it doesn't exist to store email metadata (company, subject, message, etc.).  

3. **Email Monitoring Functions:**  
   - **`fetch_unread_emails()`**: Connects to Gmail, fetches unread emails, extracts relevant information (sender, subject, body), and stores them in the database if the sender exists.  
   - **`mark_email_as_read()`**: Marks processed emails as read.  

4. **Queue Handling:**  
   - **`add_to_queue()`**: Adds new emails to a JSON-based queue if the processing script is busy.  
   - **`process_queue()`**: Processes queued emails once the system is free.  

5. **Integration with External Script:**  
   - **`send_emails_to_script()`**: Forwards emails to `response.py` for classification and response generation.  

6. **Continuous Loop:**  
   - Runs every 60 seconds to check for new emails, manage the queue, and trigger processing tasks.  
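
The decision made on each pass of the loop can be summarized as a small pure function. This is a sketch of the branching described above, with hypothetical action names, not the script's actual code:

```python
def next_action(has_new_emails, queue_size, processor_busy):
    """Decide what the monitor should do on one polling pass."""
    if not has_new_emails and queue_size == 0:
        return "idle"                      # nothing to do until the next poll
    if processor_busy:
        return "enqueue"                   # response.py is running: park new emails
    if queue_size > 0:
        # Older queued emails go first; new ones join them if present.
        return "process_queue_and_new" if has_new_emails else "process_queue"
    return "process_new"                   # free processor, empty queue

print(next_action(has_new_emails=True, queue_size=0, processor_busy=False))
```

Keeping the decision separate from the IMAP and subprocess calls makes the branching easy to check without a mailbox.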


In [None]:
%%writefile event_driven_emails.py
import imaplib
import email
import time
import sqlite3
import subprocess
import json
import os
from email.header import decode_header
import psutil  # To check if the external script is running


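# Email credentials. The app password is read from an environment variable
# (the name GMAIL_APP_PASSWORD is an assumption) instead of being hard-coded.
EMAIL_USER = "mahmouds3d1.1@gmail.com"
EMAIL_PASSWORD = os.environ["GMAIL_APP_PASSWORD"]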
IMAP_SERVER = "imap.gmail.com"

# SQLite database path
SQLITE_DB = "/content/emails.db"
TEMP_FILE = "/content/temp_emails.json"  # Temporary file for email list
QUEUE_FILE = "/content/email_queue.json"   # ✅ Queue file for pending emails
PROCESS_SCRIPT = "/content/response.py"  # External script to handle emails

# Connect to SQLite database
conn = sqlite3.connect(SQLITE_DB)
cursor = conn.cursor()

# Create emails table if it doesn't exist
cursor.execute('''
CREATE TABLE IF NOT EXISTS emails (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    company_name TEXT NOT NULL,
    email TEXT NOT NULL,
    subject TEXT NOT NULL,
    message TEXT NOT NULL,
    category TEXT NOT NULL,  -- 'sent' or 'received'
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
)
''')
conn.commit()

# ✅ Check if the response.py script is running
def is_process_running(script_name):
    for proc in psutil.process_iter(['pid', 'name', 'cmdline']):
        # cmdline can be None for processes we are not allowed to inspect
        cmdline = proc.info.get('cmdline') or []
        if script_name in cmdline:
            return True
    return False

# ✅ Add emails to the queue
def add_to_queue(email_list):
    with open(QUEUE_FILE, "r") as file:
        queue = json.load(file)

    queue.extend(email_list)

    with open(QUEUE_FILE, "w") as file:
        json.dump(queue, file)

    print(f"Queued {len(email_list)} emails.")

# ✅ Process emails from the queue
def process_queue():
    with open(QUEUE_FILE, "r") as file:
        queued_emails = json.load(file)

    if queued_emails:
        print(f"Processing {len(queued_emails)} queued emails...")
        send_emails_to_script(queued_emails)

        # Clear the queue after processing
        with open(QUEUE_FILE, "w") as file:
            json.dump([], file)


def mark_email_as_read(mail, email_id):
    """Mark the email as read."""
    mail.store(email_id, '+FLAGS', '\\Seen')

def fetch_unread_emails():
    """Fetch unread emails from the inbox."""
    try:
        mail = imaplib.IMAP4_SSL(IMAP_SERVER)
        mail.login(EMAIL_USER, EMAIL_PASSWORD)
        mail.select("inbox")  # Select inbox

        # Search for unread emails
        status, messages = mail.search(None, "UNSEEN")
        email_ids = messages[0].split()

        if not email_ids:
            print("No new emails found.")
            mail.logout()
            return []

        EMAIL_LIST = []  # Store extracted email addresses
        for num in reversed(email_ids):  # Process from latest to oldest
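            # BODY.PEEK[] returns the full message without implicitly setting \Seen,
            # so an email stays unread unless it is explicitly marked as read below
            status, data = mail.fetch(num, "(BODY.PEEK[])")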
            for response_part in data:
                if isinstance(response_part, tuple):
                    msg = email.message_from_bytes(response_part[1])

                    # Decode email subject
                    subject, encoding = decode_header(msg["Subject"])[0]
                    if isinstance(subject, bytes):
                        subject = subject.decode(encoding if encoding else 'utf-8')

                    # Get sender email
                    from_email = msg.get("From")
                    from_email = from_email.split("<")[-1].strip(">")  # Extract clean email

                    # Get email body
                    body = ""
                    for part in msg.walk():
                        content_type = part.get_content_type()
                        content_disposition = str(part.get("Content-Disposition") or "")

                        if content_type == "text/plain" and "attachment" not in content_disposition:
                            body = part.get_payload(decode=True).decode(errors="ignore")
                            break  # Stop after getting the plain text part

                    if not body:
                        continue  # Skip if no body found
                    print(f"\nNew Email from {from_email}: {subject}")

                    # ✅ Check if the sender already appears in the emails table
                    cursor.execute('SELECT * FROM emails WHERE email = ?', (from_email,))
                    contact = cursor.fetchone()

                    if contact:  # ✅ Insert only if the sender exists in the database
                        company_name = contact[1] if len(contact) > 1 else "Unknown Company"

                        # ✅ Insert the received email into the SQLite database
                        cursor.execute('''
                            INSERT INTO emails (company_name, email, subject, message, category)
                            VALUES (?, ?, ?, ?, ?)
                        ''', (company_name, from_email, subject, body, 'received'))
                        conn.commit()

                        # ✅ Hand the sender to response.py only once the reply is stored,
                        # so unknown senders and empty messages are never processed
                        EMAIL_LIST.append(from_email)

                        # ✅ Mark email as read only for known senders
                        mark_email_as_read(mail, num)
                    else:
                        print(f"Skipped email from unknown sender: {from_email}")

        mail.logout()
        return EMAIL_LIST  # Return the list of emails

    except Exception as e:
        print("Error:", e)
        return []

def send_emails_to_script(email_list):
    """Pass the list of emails to the external script (response.py)."""
    if email_list:
        print(f"Passing {len(email_list)} emails to {PROCESS_SCRIPT}...")

        # Save email list to a temporary file
        with open(TEMP_FILE, "w") as file:
            json.dump(email_list, file)

        # Run the external script with the file path as an argument
        subprocess.Popen(["python", PROCESS_SCRIPT, TEMP_FILE])

# Ensure the queue file exists
if not os.path.exists(QUEUE_FILE):
    with open(QUEUE_FILE, "w") as file:
        json.dump([], file)  # Initialize with an empty list


# Run the script continuously every 60 seconds
while True:
    emails = fetch_unread_emails()

    # Check the size of the email queue
    with open(QUEUE_FILE, "r") as file:
        queue_size = len(json.load(file))

    if emails or queue_size != 0:
        if is_process_running(PROCESS_SCRIPT):
            # Add new emails to the queue if the process is running
            add_to_queue(emails)
        elif emails and queue_size != 0:
            # Merge new emails into the queue and process everything in one batch,
            # so two runs of the external script never share the same temp file
            add_to_queue(emails)
            process_queue()
        elif emails:
            # Send new emails directly if there's no queue
            send_emails_to_script(emails)
        elif queue_size != 0:
            # Process queued emails if no new emails are found
            process_queue()
    time.sleep(60)

Overwriting event_driven_emails.py



### 🤖 **Script 2: `response.py`**  
**Purpose:**  
This script processes the emails passed from `event_driven_emails.py`. It classifies responses using a zero-shot classification model and generates replies when needed.  

#### 🗂️ **Key Components:**  
1. **Imports & Setup:**  
   - Uses libraries like `transformers`, `sqlite3`, `faiss`, and `PyPDF2` for NLP tasks, database access, and working with product catalogs.  
   - Connects to the same SQLite database and loads the pending emails from a temporary JSON file.  

2. **Model Initialization:**  
   - **Zero-Shot Classification:** Uses Facebook's `BART-large-mnli` to classify email responses into categories:  
     - **"Not interested"**  
     - **"Need more details"**  
     - **"Want to make a meeting"**  
   - **LLM for Response Generation:** Uses Meta’s LLaMA model to generate detailed responses when needed.  

3. **Email Classification & Response:**  
   - **`classify_email()`**: Determines the intent of the latest email response.  
   - **`get_latest_response()` & `get_full_conversation()`**: Retrieve conversation history from the database to maintain context in replies.  
   - **`extract_subject_and_body()`**: Parses the generated response to extract the subject and body for the reply.  

4. **Sending Replies:**  
   - Composes and sends emails when the classification result is **"Need more details"**.  
   - For **"Want to make a meeting"**, it forwards the request to HR.  
   - **"Not interested"** responses are ignored.  
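
The routing described above can be sketched as a lookup from label to action. The classifier is passed in as a plain callable so the example runs without downloading a model; `keyword_classifier` is a deliberately crude stand-in invented for this sketch, and the action names are hypothetical.

```python
ACTIONS = {
    "Need more details": "send_details",
    "Want to make a meeting": "forward_to_hr",
    "Not interested": "ignore",
}

def handle_reply(text, classify):
    """Map a reply to an action using the label returned by `classify`."""
    label = classify(text)
    # Unknown labels (for example "No Response") fall back to ignoring the email.
    return ACTIONS.get(label, "ignore")

def keyword_classifier(text):
    """Stand-in for the zero-shot model: picks a label from simple keywords."""
    lowered = text.lower()
    if "meeting" in lowered or "call" in lowered:
        return "Want to make a meeting"
    if "more" in lowered or "details" in lowered:
        return "Need more details"
    return "Not interested"

print(handle_reply("Could you send more details on pricing?", keyword_classifier))
```

In the real script the callable would wrap the zero-shot pipeline and return its top-ranked label.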

---

### 🔄 **How They Work Together:**  
1. **Email Received →** `event_driven_emails.py` detects it.  
2. **Processing →** If busy, the email is queued; otherwise, it's sent to `response.py`.  
3. **Classification →** `response.py` determines the email type.  
4. **Action →** Sends an automated reply, forwards to HR, or ignores based on the classification.  
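
The handoff in step 2 goes through a temporary JSON file whose path the monitor passes as a command-line argument. One way a worker script could pick that path up is sketched below; `write_batch` and `read_batch` are hypothetical helper names, and the default path mirrors the one used in this notebook.

```python
import json
import os
import sys
import tempfile

def write_batch(addresses, path):
    """Monitor side: save the list of sender addresses for the worker."""
    with open(path, "w") as f:
        json.dump(addresses, f)

def read_batch(argv, default_path="/content/temp_emails.json"):
    """Worker side: prefer the path given on the command line, else the default."""
    path = argv[1] if len(argv) > 1 else default_path
    with open(path) as f:
        return json.load(f)

# Round trip through a throwaway directory to show the two sides together.
with tempfile.TemporaryDirectory() as tmp:
    batch_path = os.path.join(tmp, "temp_emails.json")
    write_batch(["contact@example.com"], batch_path)
    batch = read_batch(["response.py", batch_path])
print(batch)
```

Inside the worker the call would be `read_batch(sys.argv)`, which keeps the two scripts in agreement even if the temporary path changes.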

In [None]:
%%writefile response.py
import smtplib
import os
import json
import sqlite3
import PyPDF2
import numpy as np
import faiss
import re
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
import torch
from transformers import pipeline
from sentence_transformers import SentenceTransformer
import psutil  # To check if the external script is running
import time
import random
print("Hi")
# File paths and email details
SQLITE_DB = "/content/emails.db"
pdf_file = "/content/detailed_product_catalog.pdf"
json_file = "/content/temp_emails.json"
sender_email = "mahmouds3d1.1@gmail.com"
app_password = os.environ["GMAIL_APP_PASSWORD"]  # Assumed variable name; avoids hard-coding the app password
sender_name = "Mahmoud Saad"
sender_company = "M.S"

# Load email list from JSON file
with open(json_file, "r") as file:
    email_list = json.load(file)
print(email_list)
# Connect to SQLite database
conn = sqlite3.connect(SQLITE_DB)
cursor = conn.cursor()

# Load BART MNLI zero-shot classification pipeline
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Load Llama model
model_id = "meta-llama/Llama-3.2-3B-Instruct"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Classification categories
categories = ["Not interested", "Need more details", "Want to make a meeting"]

def classify_email(response):
    if not response or response.lower() == "no response":
        return "No Response"
    result = classifier(response, candidate_labels=categories)
    return result['labels'][0]

def get_latest_response(email):
    """Retrieve the latest response for a given email."""
    cursor.execute('''
    SELECT message FROM emails
    WHERE email = ? AND category = 'received'
    ORDER BY timestamp DESC
    LIMIT 1
    ''', (email,))
    result = cursor.fetchone()
    return result[0] if result else None

def get_full_conversation(email):
    """Retrieve the full conversation history for a given email."""
    cursor.execute('''
    SELECT message, category FROM emails
    WHERE email = ?
    ORDER BY timestamp ASC
    ''', (email,))
    results = cursor.fetchall()
    conversation = []
    for message, category in results:
        conversation.append(f"{category}: {message}")
    return "\n".join(conversation)

def extract_subject_and_body(conversation):
    # Find the assistant's response
    assistant_response = next((item for item in conversation if item.get('role') == 'assistant'), None)

    if assistant_response:
        response_text = assistant_response.get('content', '')
    else:
        return "No Subject", "No Content"

    # Extract subject
    subject_match = re.search(r"Subject:\s*(.*)", response_text, re.IGNORECASE)
    subject = subject_match.group(1).strip() if subject_match else "No Subject"

    # Extract body: captures everything after the first "Subject" line
    body_match = re.search(r"Subject:.*?\n+(.*)", response_text, re.DOTALL)
    email_body = body_match.group(1).strip() if body_match else response_text.strip()

    return subject, email_body

def extract_products_from_pdf(pdf_path):
    products = []
    with open(pdf_path, "rb") as file:
        reader = PyPDF2.PdfReader(file)
        text = "\n".join([page.extract_text() for page in reader.pages if page.extract_text()])

        lines = text.split("\n")
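        for i in range(len(lines)):
            # Each product is expected to span six consecutive lines; skip a
            # trailing partial entry instead of raising IndexError.
            if "Product:" in lines[i] and i + 5 < len(lines):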
                product_name = lines[i].replace("Product: ", "").strip()
                industry = lines[i + 1].replace("Industry: ", "").strip()
                description = lines[i + 2].replace("Description: ", "").strip()
                specifications = lines[i + 3].replace("Specifications: ", "").strip()
                case_study = lines[i + 4].replace("Case Study: ", "").strip()
                compliance = lines[i + 5].replace("Compliance: ", "").strip()
                products.append({
                    "Product Name": product_name,
                    "Industry": industry,
                    "Description": description,
                    "Specifications": specifications,
                    "Case Study": case_study,
                    "Compliance": compliance
                })
    return products

product_list = extract_products_from_pdf(pdf_file)
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

def create_embeddings(products):
    product_embeddings = []
    for product in products:
        text = f"Product: {product['Product Name']}, Description: {product['Description']}, Specifications: {product['Specifications']}, Case Study: {product['Case Study']}, Compliance: {product['Compliance']}"
        embedding = embedding_model.encode(text)
        product_embeddings.append(embedding)
    return np.array(product_embeddings)

product_embeddings = create_embeddings(product_list)
index = faiss.IndexFlatL2(product_embeddings.shape[1])
index.add(product_embeddings)

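def get_relevant_product(query):
    """Return the catalog entry whose embedding is closest to the query text."""
    query_embedding = embedding_model.encode([query])
    query_embedding = np.array(query_embedding, dtype="float32")
    # k=1: only the single nearest product is used
    _, indices = index.search(query_embedding, k=1)
    return product_list[indices[0][0]]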

def generate_email_and_subject(company_name, industry, conversation):
    relevant_product = get_relevant_product(conversation)
    messages = [
        {"role": "system", "content": "You are a professional email assistant who writes structured responses."},
        {"role": "user", "content": f"Given the conversation history for {company_name} in the {industry} industry, generate a well-structured response.\n\nConversation:\n{conversation}\n\nProduct Details:\n- {relevant_product}"}
    ]
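    outputs = pipe(messages, max_new_tokens=512)
    generated = outputs[0]["generated_text"]
    # Chat-style input makes the pipeline return the whole message list;
    # the assistant's reply is the last entry. Plain-string output is kept as-is.
    response_text = generated[-1]["content"] if isinstance(generated, list) else generated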
    print(response_text)
    subject, email_body = extract_subject_and_body(response_text)
    return subject, email_body

# Process each email in the list
for email in email_list:
    print(f"Processing email from {email}...")
    latest_response = get_latest_response(email)
    category = classify_email(latest_response)

    if category == "Need more details":
        conversation = get_full_conversation(email)
        cursor.execute('''
        SELECT company_name, subject FROM emails
        WHERE email = ? AND category = 'received'
        ORDER BY timestamp DESC
        LIMIT 1
        ''', (email,))
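        result = cursor.fetchone()
        if result:
            # The query returns (company_name, subject); the subject of the latest
            # received message stands in for the industry context in the prompt.
            company_name, industry = result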
            subject, response = generate_email_and_subject(company_name, industry, conversation)

            # Insert the sent email into the SQLite database
            cursor.execute('''
            INSERT INTO emails (company_name, email, subject, message, category)
            VALUES (?, ?, ?, ?, ?)
            ''', (company_name, email, subject, response, 'sent'))
            conn.commit()

            # Send the email
            message = MIMEMultipart()
            message["From"], message["To"], message["Subject"] = sender_email, email, subject
            message.attach(MIMEText(response, "plain"))

            with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
                server.login(sender_email, app_password)
                server.sendmail(sender_email, email, message.as_string())

            print(f"Sent response to {email} for {company_name}")

conn.close()
print("✅ Emails categorized, responses sent, and database updated.")
print("waiting")
time.sleep(random.randint(180, 200))
print("finished")

Overwriting response.py
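
`extract_products_from_pdf` in `response.py` reads catalog fields by fixed line offsets, which depends on every product occupying exactly six lines in a fixed order. As a possible alternative, the sketch below groups lines by their label prefix instead. The helper name `parse_products`, the `FIELD_LABELS` list, and the sample text are illustrative assumptions, not part of the original script.

```python
# Labels taken from the catalog layout that response.py expects.
FIELD_LABELS = ["Product", "Industry", "Description",
                "Specifications", "Case Study", "Compliance"]

def parse_products(text):
    """Group 'Label: value' lines into one dict per product (hypothetical helper)."""
    products = []
    current = None
    for raw_line in text.split("\n"):
        line = raw_line.strip()
        for label in FIELD_LABELS:
            prefix = label + ":"
            if line.startswith(prefix):
                value = line[len(prefix):].strip()
                if label == "Product":
                    # A "Product:" line starts a new record.
                    current = {"Product Name": value}
                    products.append(current)
                elif current is not None:
                    current[label] = value
                break
    return products

# Made-up sample text; the second entry is deliberately incomplete.
sample = "Product: Example A\nIndustry: Demo\nDescription: Toy entry\nProduct: Example B"
for product in parse_products(sample):
    print(product)
```

With this approach a missing field is simply absent from the dictionary, so downstream code would need `product.get("Compliance", "")` rather than direct indexing.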


Run the event-driven script, which passes new emails to `response.py`:

In [None]:
!python event_driven_emails.py

No new emails found.

New Email from mahmoud.saad.mahmoud.11@gmail.com: Re: VR-Based Architectural Visualization Tools
Passing 1 emails to /content/response.py...
2025-02-09 08:33:20.852205: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1739090000.914314    7229 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1739090000.933716    7229 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
Hi
['mahmoud.saad.mahmoud.11@gmail.com']
Device set to use cuda:0
Loading checkpoint shards: 100% 2/2 [00:01<00:00,  1.85it/s]
Device set to use cpu
Processing email from mahmoud.saad.mahmoud.11@gmail.com...
Setting `pad_token_id` to `eos_token_id`:128001 for open-end gener
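
The product lookup in `response.py` uses `faiss.IndexFlatL2`, which performs an exhaustive search ranked by squared Euclidean distance. To make that step concrete, here is a dependency-free sketch of the same ranking on toy vectors. The function names `squared_l2` and `nearest` and the vectors are illustrative assumptions; the real script uses FAISS with Sentence Transformers embeddings.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(query, vectors, k=1):
    """Return (distances, indices) of the k closest vectors, smallest distance first."""
    ranked = sorted((squared_l2(query, v), i) for i, v in enumerate(vectors))
    top = ranked[:k]
    return [d for d, _ in top], [i for _, i in top]

# Toy 2-D "embeddings" standing in for product vectors.
toy_vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
distances, indices = nearest([0.9, 1.2], toy_vectors, k=2)
print(indices)
```

In the script, `k=1` means only the single closest product is passed to the prompt; raising `k` would allow several candidates to be offered at the cost of a longer prompt.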

### Explanation for Cell 10
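Cell 10 is a commented-out debugging helper. It connects to the SQLite database at `/content/emails.db`, defines `get_emails` to list rows from the `emails` table (optionally filtered by `category`), and includes a further-disabled `delete_received_emails` function that clears both sent and received rows.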
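
The category lookup used by this cell can be tried without touching the real database. The sketch below runs the same `get_emails` query against an in-memory SQLite database. The table definition is an assumption inferred from the `INSERT` statement in `response.py` and the printed rows, and the inserted row is placeholder data.

```python
import sqlite3

# In-memory database so the real /content/emails.db is not modified.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Assumed schema: column names follow the INSERT in response.py.
cursor.execute("""
    CREATE TABLE emails (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        company_name TEXT,
        email TEXT,
        subject TEXT,
        message TEXT,
        category TEXT,
        timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
    )
""")

def get_emails(category=None):
    """Return all rows, or only those matching the given category."""
    if category:
        cursor.execute("SELECT * FROM emails WHERE category = ?", (category,))
    else:
        cursor.execute("SELECT * FROM emails")
    return cursor.fetchall()

# Placeholder row for illustration only.
cursor.execute(
    "INSERT INTO emails (company_name, email, subject, message, category) "
    "VALUES (?, ?, ?, ?, ?)",
    ("Example Co", "contact@example.com", "Hello", "Placeholder body", "sent"),
)
conn.commit()

print(get_emails(category="sent"))
print(get_emails(category="received"))
```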


In [None]:
# # Function to retrieve emails
# # Connect to SQLite database
# SQLITE_DB = "/content/emails.db"
# conn = sqlite3.connect(SQLITE_DB)
# cursor = conn.cursor()

# def get_emails(category=None):
#     if category:
#         cursor.execute('SELECT * FROM emails WHERE category = ?', (category,))
#     else:
#         cursor.execute('SELECT * FROM emails')

#     return cursor.fetchall()

# # Example usage:
# sent_emails = get_emails(category='sent')
# received_emails = get_emails(category='received')

# print("Sent Emails:")
# for email in sent_emails:
#     print(email)

# print("\nReceived Emails:")
# for email in received_emails:
#     print(email)

# # # ✅ Function to delete all sent and received emails
# # def delete_received_emails():
# #     cursor.execute('DELETE FROM emails WHERE category = ?', ('received',))
# #     cursor.execute('DELETE FROM emails WHERE category = ?', ('sent',))
# #     conn.commit()
# #     print("All sent and received emails have been deleted.")

# # # ✅ Example usage:
# # delete_received_emails()

# # # ✅ To verify deletion
# # received_emails = get_emails(category='received')
# # print("\nReceived Emails After Deletion:")
# # for email in received_emails:
# #     print(email)


Sent Emails:
(1, 'GreenBuild Solutions', 'mahmoud.saad.mahmoud.11@gmail.com', 'Revolutionize Architectural Visualization with GreenBuild Solutions', "Dear GreenBuild Solutions Team,\n\nI wanted to personally reach out to you regarding your expressed interest in VR-based architectural visualization tools. Our team at GreenBuild Solutions has been at the forefront of innovation in this field, and I'd like to introduce you to our cutting-edge solutions that can elevate your architectural visualization capabilities.\n\nWe offer a range of products, including VR Architect Pro, ImmersiBuild VR, and MediVR Sim, each designed to provide unparalleled immersive experiences for architects and designers. Whether you're looking to enhance your design workflow, improve collaboration, or create stunning visualizations, our products can help.\n\nI'd love to schedule a discussion or demo to explore how our solutions can meet your specific needs. Please let me know a convenient time, and I'll ensure tha

### Explanation for Cell 11
This cell initializes or processes a specific part of the cold emailing system.
