predaaaasaaaaaaa/ai-gmail-agent

🤖 AI Email Agent with MCP - V4 (Voice Conversational Bot)


An intelligent AI email assistant powered by Groq's LLaMA 3.3 70B and Model Context Protocol (MCP). Manage Gmail and iCloud emails via natural voice conversation on Telegram.


🎉 What's New in V4 Final

🗣️ Fully Conversational AI - Bot doesn't just transcribe, it talks back!

New Features

  • 🔊 Text-to-Speech (TTS) - Bot responds with voice using Groq Orpheus
  • 🎯 Smart Voice System - Only speaks conversation, not data
  • 💬 Natural Q&A - Ask "What can you do?" or "Is this secure?"
  • 🎲 Dynamic Variations - Never repeats the same response twice
  • 🧠 Off-Topic Detection - Gracefully handles non-email questions
  • 👂 Human-Like Interaction - Feels like talking to a real assistant

V4 Interaction Example

🎤 You: "Check my Gmail"
📝 Bot: [Lists 10 emails in text]
🔊 Bot: "Here are your latest 10 emails. Which one would you like me to read?"

🎤 You: "Read email number 2"
📝 Bot: [Shows full email content]
🔊 Bot: "Here's email number 2. Would you like me to draft a reply?"

🎤 You: "Draft a reply"
📝 Bot: [Shows AI-generated draft]
🔊 Bot: "I've drafted a reply for email 2. Would you like me to send it?"

🎤 You: "Send it"
📝 Bot: "✅ Reply sent!"
🔊 Bot: "Done! Reply sent. Anything else I can help you with?"

✨ All Features

V4 Voice Intelligence

  • ✅ Groq Whisper (STT) - Understands your voice commands
  • ✅ Groq Orpheus TTS - Responds with natural voice (diana voice)
  • ✅ Smart Voice Triggers - Only speaks when needed (conversation, not data)
  • ✅ Dynamic Variations - 3-5 response variations per action (never repeats)
  • ✅ Natural Q&A - Answers "What can you do?" and "Is this secure?"
  • ✅ Off-Topic Handling - Redirects gracefully to email tasks
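
One way to get the "Dynamic Variations" behaviour is to keep a few phrasings per action and exclude whichever one was used last. The sketch below is an assumption about how this could work, not the code in bot.py; `VARIATIONS`, `VariationPicker`, and the sample phrasings are hypothetical.

```python
import random

# Hypothetical phrasings; the real bot's wording may differ.
VARIATIONS = {
    "list": [
        "Here are your latest emails. Which one would you like me to read?",
        "I've listed your inbox. Which email should I open?",
        "Your emails are up. Want me to read one?",
    ],
    "sent": [
        "Done! Reply sent. Anything else I can help you with?",
        "Your reply is on its way. What next?",
        "Sent! Need anything else?",
    ],
}


class VariationPicker:
    """Pick a phrasing per action, avoiding an immediate repeat."""

    def __init__(self, variations, rng=None):
        self.variations = variations
        self.rng = rng or random.Random()
        self.last = {}  # action -> phrasing used most recently

    def pick(self, action):
        options = self.variations[action]
        # Exclude the previous choice whenever an alternative exists.
        candidates = [o for o in options if o != self.last.get(action)] or options
        choice = self.rng.choice(candidates)
        self.last[action] = choice
        return choice
```

With only 3-5 phrasings per action, a phrasing will eventually come back; this sketch only guarantees that two consecutive responses for the same action differ.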

Core Email Capabilities

  • ✅ Read Emails - Gmail (primary inbox) + iCloud
  • ✅ Send Emails - Compose via natural language
  • ✅ Advanced Search - Find by sender, subject, date, keywords
  • ✅ Draft Replies - AI-generated with approval workflow
  • ✅ Voice + Text Input - Works both ways
  • ✅ Context Awareness - Remembers conversation state
  • ✅ MCP Architecture - Modular, reusable tools
  • ✅ Dockerized - One-command deployment

V4 Telegram Bot Commands

Setup Commands:

/start - Initialize bot
/help - See all commands
/status - View bot memory (emails, drafts, context)
/clear - Reset session memory
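
The memory that /status reports and /clear resets could be as small as one record per chat. This is a minimal sketch under that assumption; the `Session` fields and `get_session` helper are hypothetical names, not taken from bot.py.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Session:
    """Per-chat memory: listed emails, the last email read, a pending draft."""
    emails: list = field(default_factory=list)
    last_read: Optional[int] = None       # 1-based index into `emails`
    pending_draft: Optional[str] = None   # draft text awaiting "send" or "cancel"

    def status(self) -> str:
        """Text a /status handler might send back."""
        draft = "yes" if self.pending_draft else "no"
        return f"emails: {len(self.emails)}, last read: {self.last_read}, pending draft: {draft}"

    def clear(self) -> None:
        """Reset everything, as /clear is described to do."""
        self.emails.clear()
        self.last_read = None
        self.pending_draft = None


sessions: dict[int, Session] = {}  # keyed by Telegram chat id


def get_session(chat_id: int) -> Session:
    """Return the session for a chat, creating it on first use."""
    return sessions.setdefault(chat_id, Session())
```

Keeping `last_read` and `pending_draft` here is what lets "Draft a reply" and "Send reply" work without repeating the email number.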

Voice/Text Commands:

"Check my Gmail" - List primary inbox emails
"Check my iCloud" - List iCloud emails
"Read email number 2" - Read specific email
"Read it" - Auto-read when only 1 result
"Draft a reply" - Generate reply for last read email
"Draft a reply for email 3" - Generate reply for specific email
"Draft a reply saying I will attend" - Custom reply hint
"Send reply" - Send pending draft
"Cancel" - Cancel pending draft
"Find emails from Nike" - Search by sender
"What can you do?" - See capabilities
"Is this secure?" - Security explanation

🏗️ Architecture

┌─────────────────┐
│  Telegram User  │ ◄── Voice Input
└────────┬────────┘
         │ Voice Message
         ▼
┌─────────────────────┐
│  Groq Whisper API   │ ◄── Speech-to-Text
└────────┬────────────┘
         │ Transcribed Text
         ▼
┌─────────────────────┐
│  Telegram Bot       │ ◄── Context Management
└────────┬────────────┘
         │ Command
         ▼
┌─────────────────────┐
│  Groq LLaMA 3.3 70B │ ◄── AI Decision Making
└────────┬────────────┘
         │ Tool Calls
         ▼
┌─────────────────────┐
│   MCP Client        │ ◄── Protocol Layer
└────────┬────────────┘
         │ JSON-RPC
         ▼
┌─────────────────────┐
│   MCP Server        │ ◄── 10+ Email Tools
└────────┬────────────┘
         │
         ▼
┌─────────────────────┐
│  Email Handlers     │ ◄── Gmail API + iCloud IMAP/SMTP
└────────┬────────────┘
         │
         ▼
┌─────────────────────┐
│  Groq Orpheus TTS   │ ◄── Text-to-Speech (Voice Response)
└────────┬────────────┘
         │
         ▼
┌─────────────────┐
│  Telegram User  │ ◄── Voice + Text Response
└─────────────────┘
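
The JSON-RPC hop between the MCP client and the MCP server can be pictured as follows. MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`, and results come back as a list of content parts. The helper names and the `max_results` argument are hypothetical, and the project's mcp_client.py may well use an SDK rather than build messages by hand.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids


def build_tool_call(name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def parse_tool_result(raw):
    """Return the text parts of a `tools/call` response, or raise on error."""
    message = json.loads(raw)
    if "error" in message:
        raise RuntimeError(message["error"].get("message", "unknown error"))
    content = message["result"].get("content", [])
    return [part["text"] for part in content if part.get("type") == "text"]


# `max_results` is an assumed argument name, used only for illustration.
request = build_tool_call("list_gmail_emails", {"max_results": 10})
print(json.dumps(request, indent=2))
```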

📁 Project Structure

ai-gmail-agent/
├── telegram_bot/
│   └── bot.py              # V4 Telegram bot (voice conversation)
├── agent/
│   ├── client.py           # V2 CLI agent
│   └── mcp_client.py       # MCP client wrapper
├── mcp_server/
│   ├── server.py           # MCP server
│   └── email_tools.py      # Gmail & iCloud handlers
├── .env                    # API keys (not committed)
├── credentials.json        # Gmail OAuth (not committed)
├── Dockerfile
├── docker-compose.yml
└── README.md

🚀 Quick Start - V4 Telegram Bot

Prerequisites

You'll need:

  1. Telegram Account - To use the bot
  2. Telegram Bot Token - From @BotFather
  3. Gmail OAuth Credentials - One-time setup (~15 min)
  4. Groq API Key - Free at console.groq.com
  5. iCloud App Password - Optional, for iCloud emails

Step 1: Create Telegram Bot

  1. Open Telegram and search for @BotFather
  2. Send /newbot command
  3. Follow prompts:
    • Bot name: My Email Assistant
    • Username: my_email_assistant_bot (must end in bot)
  4. Copy the bot token (looks like 1234567890:ABCdef...)
  5. Save it - you'll need it in .env

Step 2: Get Gmail API Credentials

Option A: Quick Video Tutorial

Option B: Step-by-Step

  1. Go to Google Cloud Console
  2. Create new project → name it "Email Agent"
  3. Enable Gmail API:
    • APIs & Services → Library
    • Search "Gmail API" → Enable
  4. Create OAuth credentials:
    • APIs & Services → Credentials
    • Create Credentials → OAuth Client ID
    • Application type: Desktop app
    • Download JSON → rename to credentials.json
  5. Configure OAuth consent:
    • OAuth consent screen
    • User type: External
    • Add yourself as test user
    • Scopes: Add Gmail scopes

Full Guide


Step 3: Get Groq API Key

  1. Go to console.groq.com
  2. Sign up (free - no credit card)
  3. API Keys → Create new key
  4. Copy it

Note: One Groq API key gives you access to:

  • Whisper (Speech-to-Text)
  • LLaMA 3.3 70B (AI reasoning)
  • Orpheus TTS (Text-to-Speech)

Step 4: iCloud Setup (Optional)

If you want iCloud email support:

  1. Go to appleid.apple.com
  2. Sign in → Security section
  3. App-Specific Passwords → Generate
  4. Label: "Email Agent Bot"
  5. Copy the password (format: xxxx-xxxx-xxxx-xxxx)

Step 5: Create .env File

Create .env in project root:

# Telegram
TELEGRAM_BOT_TOKEN=1234567890:ABCdefGHIjklMNOpqrsTUVwxyz

# Groq API (one key for everything!)
GROQ_API_KEY=gsk_your_groq_api_key_here

# iCloud (optional)
ICLOUD_EMAIL=your-email@icloud.com
ICLOUD_PASSWORD=xxxx-xxxx-xxxx-xxxx
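
Before starting the bot it can help to check that .env has what it needs. The sketch below parses KEY=VALUE lines with the standard library and reports missing keys; the key names come from the example above, while `parse_env` and `check_env` are hypothetical helpers (the project itself may load the file differently, for example with python-dotenv).

```python
REQUIRED = ("TELEGRAM_BOT_TOKEN", "GROQ_API_KEY")
OPTIONAL_PAIR = ("ICLOUD_EMAIL", "ICLOUD_PASSWORD")  # set both or neither


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_env(values):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"missing {key}" for key in REQUIRED if not values.get(key)]
    present = [key for key in OPTIONAL_PAIR if values.get(key)]
    if len(present) == 1:
        problems.append(f"{present[0]} is set but its partner is not")
    return problems
```

Usage: `check_env(parse_env(open(".env").read()))` returns an empty list when both required keys are present and the iCloud pair is either complete or absent.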

Step 6: Run the Bot

Option 1: Python (Local)

# Clone repo
git clone https://github.com/predaaaasaaaaaaa/ai-gmail-agent
cd ai-gmail-agent

# Create virtual environment
python -m venv .venv

# Activate (Windows)
.venv\Scripts\activate

# Activate (Linux/Mac)
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run Telegram bot
python telegram_bot/bot.py

First run: a browser window opens for Gmail OAuth → sign in → allow access

Option 2: Docker (Recommended)

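# Pull latest image
docker pull samymetref/ai-email-agent:v4

# Run bot
# Note: token.pickle must already exist on the host (for example, from one
# local run that completed Gmail OAuth). If it is missing, Docker creates a
# directory with that name instead of a file.
docker run -d \
  --name email-bot \
  -v "$(pwd)/credentials.json:/app/credentials.json:ro" \
  -v "$(pwd)/token.pickle:/app/token.pickle" \
  -v "$(pwd)/.env:/app/.env:ro" \
  samymetref/ai-email-agent:v4 \
  python telegram_bot/bot.py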

Step 7: Experience Voice Conversation!

  1. Open Telegram
  2. Search for your bot (@my_email_assistant_bot)
  3. Send /start
  4. Have a voice conversation:
    • 🎤 "Check my Gmail"
    • 🔊 Bot speaks: "Here are your latest 10 emails..."
    • 🎤 "Read email number 1"
    • 🔊 Bot speaks: "Here's email number 1. Would you like me to draft a reply?"
    • 🎤 "Yes, draft a reply"
    • 🔊 Bot speaks: "I've drafted a reply. Would you like me to send it?"

Or type the same commands - works both ways!
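
Because voice transcripts and typed messages both arrive as text, one interpreter can serve both. The bot uses LLaMA for command interpretation; the sketch below is a rule-based stand-in that only illustrates the idea of normalising a transcript and extracting an intent. The pattern table and `parse_command` are hypothetical.

```python
import re

# Hypothetical intent patterns, checked in order; the first match wins.
PATTERNS = [
    ("list_gmail", re.compile(r"\bcheck (?:my )?gmail\b")),
    ("list_icloud", re.compile(r"\bcheck (?:my )?icloud\b")),
    ("read", re.compile(r"\bread (?:email )?(?:number )?(?P<n>\d+)\b")),
    ("draft", re.compile(r"\bdraft a reply(?: for email (?P<n>\d+))?(?: saying (?P<hint>.+))?")),
    ("send", re.compile(r"\bsend (?:reply|it)\b")),
    ("cancel", re.compile(r"\bcancel\b")),
    ("search", re.compile(r"\bfind emails from (?P<sender>.+)")),
]


def parse_command(text):
    """Return (intent, params) for a typed message or a voice transcript."""
    # Transcripts may carry capitals and trailing punctuation; normalise first.
    cleaned = text.strip().lower().rstrip(".!?")
    for intent, pattern in PATTERNS:
        match = pattern.search(cleaned)
        if match:
            params = {k: v for k, v in match.groupdict().items() if v is not None}
            return intent, params
    return "unknown", {}


print(parse_command("Read email number 2"))
print(parse_command("Draft a reply saying I will attend"))
```

Anything that returns "unknown" here is the kind of input the off-topic handling is meant to catch.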


💬 Usage Examples

V4 Conversational Flow

🎤 You (Voice): "Check my Gmail"

📝 Bot (Text):
Found 10 emails. Showing top 10:
1. From: john@company.com
   Subject: Meeting tomorrow
2. From: sarah@startup.io
   Subject: Project update
...

🔊 Bot (Voice): "Here are your latest 10 emails. Which one would you like me to read?"

───────────────────────────────────

🎤 You (Voice): "Read email number 1"

📝 Bot (Text):
From: john@company.com
Subject: Meeting tomorrow

Hi, can we meet tomorrow at 3pm?
Let me know!

🔊 Bot (Voice): "Here's email number 1. Would you like me to draft a reply, or read another email?"

───────────────────────────────────

🎤 You (Voice): "Draft a reply saying I'll be there"

📝 Bot (Text):
📧 DRAFT REPLY (email #1):

To: john@company.com
Subject: Re: Meeting tomorrow

Hi John,

Thank you for reaching out. I'll be there
tomorrow at 3pm. Looking forward to it!

Best regards
---
Say 'send reply' to send or 'cancel' to cancel.

🔊 Bot (Voice): "I've drafted a reply for email number 1. Would you like me to send it?"

───────────────────────────────────

🎤 You (Voice): "Send it"

📝 Bot (Text): ✅ Reply sent to john@company.com!

🔊 Bot (Voice): "Done! Reply sent. Anything else I can help you with?"

Natural Q&A

🎤 You: "What can you do?"

📝 Bot: [Full capabilities list with emojis]

🔊 Bot: "I've listed everything I can do for you. Feel free to ask me anything!"

───────────────────────────────────

🎤 You: "Is this secure?"

📝 Bot: [Complete security explanation]

🔊 Bot: "Your data is completely safe with me. Everything stays on your device!"

Off-Topic Handling

🎤 You: "What's the weather today?"

📝 Bot: "I'm an email assistant, so I focus on managing your inbox!
I can check Gmail and iCloud, read emails, draft replies, and
search messages. Try saying 'check my Gmail'!"

🔊 Bot: "I'm focused on emails! Want me to check your inbox?"

Advanced Search

🎤 "Find emails from Nike"

📝 Found 2 emails:
1. From: Nike <updates@nike.com>
   Subject: New collection
2. From: Nike <promo@nike.com>
   Subject: 20% off sale

🔊 "I found 2 emails. Which one would you like to read?"

🎤 "Read email 1"
🔊 "Here's email number 1. Would you like me to draft a reply?"
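
Searches by sender, subject, date, and keywords map naturally onto Gmail's search operators (`from:`, `subject:`, `after:`, `before:`), which the Gmail API accepts as a query string. The sketch below shows one way to assemble such a query; `build_gmail_query` is a hypothetical helper, and whether search_gmail builds its query this way is an assumption.

```python
from datetime import date


def build_gmail_query(sender=None, subject=None, after=None, before=None, keywords=()):
    """Compose a Gmail search string from optional filters."""
    parts = []
    if sender:
        parts.append(f"from:{sender}")
    if subject:
        # Parentheses group a multi-word subject into one term.
        parts.append(f"subject:({subject})")
    if after:
        parts.append(f"after:{after:%Y/%m/%d}")
    if before:
        parts.append(f"before:{before:%Y/%m/%d}")
    parts.extend(keywords)
    return " ".join(parts)


print(build_gmail_query(sender="nike"))
print(build_gmail_query(sender="nike", after=date(2026, 2, 1), keywords=["sale"]))
```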

🛠️ Tech Stack

Component        Technology
Bot Framework    python-telegram-bot 20.0+
Speech-to-Text   Groq Whisper Large V3
Text-to-Speech   Groq Orpheus (diana voice)
AI Reasoning     Groq LLaMA 3.3 70B
Protocol         Model Context Protocol (MCP)
Gmail            Gmail API (OAuth 2.0)
iCloud           IMAP/SMTP
Language         Python 3.11+
Deployment       Docker

🔧 Development

Run Tests

# Test Telegram bot
python telegram_bot/bot.py

# Test MCP client
python test_mcp_client.py

# Test email handlers
python test_email.py

Project Commands

# Run bot locally
python telegram_bot/bot.py

# Run with Docker
docker-compose up telegram-bot

# Rebuild Docker
docker-compose build

# View logs
docker logs -f email-bot

🐛 Troubleshooting

Bot doesn't respond

Problem: Bot shows "online" but doesn't reply

Solution:

# Check logs
python telegram_bot/bot.py

# Look for: "✅ Bot ready with off-topic detection!"
# If not, check .env file has TELEGRAM_BOT_TOKEN

Voice not transcribing

Problem: Bot says "Couldn't transcribe"

Solution:

  • Check Groq API key in .env
  • Verify API quota: console.groq.com
  • Try a shorter voice message (under 10 seconds)

TTS not working

Problem: Bot sends text but no voice response

Solution:

  • Check Groq API daily limits (3600 tokens/day for TTS)
  • The bot caps each voice message at 200 characters to conserve tokens
  • Wait 24 hours if rate limited
  • Check logs for "✅ Voice sent"
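
The 200-character cap on voice messages could be enforced with a small helper that prefers to cut at a sentence boundary. This is a sketch under that assumption; `fit_for_voice` is a hypothetical name and the bot may shorten its messages differently.

```python
MAX_VOICE_CHARS = 200  # limit quoted in this README


def fit_for_voice(text, limit=MAX_VOICE_CHARS):
    """Shorten text to at most `limit` characters, preferring a sentence boundary."""
    text = " ".join(text.split())  # collapse runs of whitespace
    if len(text) <= limit:
        return text
    window = text[:limit]
    # Cut after the last sentence-ending mark inside the window, if any.
    cut = max(window.rfind(". "), window.rfind("! "), window.rfind("? "))
    if cut != -1:
        return window[: cut + 1]
    # Otherwise cut at the last space so no word is split.
    space = window.rfind(" ")
    return window[:space] if space != -1 else window
```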

Gmail OAuth error

Problem: "Access denied" when signing in

Solution:

  1. Go to Google Cloud Console
  2. OAuth consent screen → Add test users
  3. Add your Gmail address
  4. Delete token.pickle and try again

iCloud authentication failed

Problem: "Authentication failed"

Solution:

  • Use app-specific password, not regular password
  • Verify .env has correct format: xxxx-xxxx-xxxx-xxxx
  • Check Apple ID security settings

Docker won't start

Problem: Container exits immediately

Solution:

# Check files exist
ls credentials.json .env

# Check .env format
cat .env

# View container logs
docker logs email-bot

# Rebuild without cache
docker-compose build --no-cache

📦 Available MCP Tools

The bot uses these email tools via MCP:

Tool                 Description               Example
list_gmail_emails    Fetch Gmail inbox         "Check my Gmail"
list_icloud_emails   Fetch iCloud inbox        "Check my iCloud"
read_gmail_email     Read Gmail content        "Read email number 1"
read_icloud_email    Read iCloud content       "Read email 2"
send_gmail_email     Send via Gmail            Used after "Send reply"
send_icloud_email    Send via iCloud           Auto-detected
search_gmail         Advanced Gmail search     "Find emails from Nike"
search_icloud        Search iCloud by sender   "Find iCloud from John"
draft_gmail_reply    Draft Gmail reply         Auto-detected
draft_icloud_reply   Draft iCloud reply        Auto-detected
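
Since each action exists once per mailbox, "auto-detected" tools can be chosen from the action plus whichever mailbox the conversation is currently on. The lookup below is a sketch of that idea; the tool names come from the table above, while `TOOLS` and `resolve_tool` are hypothetical.

```python
# (provider, action) -> MCP tool name, taken from the table above.
TOOLS = {
    ("gmail", "list"): "list_gmail_emails",
    ("icloud", "list"): "list_icloud_emails",
    ("gmail", "read"): "read_gmail_email",
    ("icloud", "read"): "read_icloud_email",
    ("gmail", "send"): "send_gmail_email",
    ("icloud", "send"): "send_icloud_email",
    ("gmail", "search"): "search_gmail",
    ("icloud", "search"): "search_icloud",
    ("gmail", "draft"): "draft_gmail_reply",
    ("icloud", "draft"): "draft_icloud_reply",
}


def resolve_tool(action, provider="gmail"):
    """Map an action plus the active mailbox to an MCP tool name."""
    try:
        return TOOLS[(provider, action)]
    except KeyError:
        raise ValueError(f"no tool for {action!r} on {provider!r}") from None
```

For example, after "Check my iCloud" the active provider would be "icloud", so a later "Draft a reply" resolves to draft_icloud_reply.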

🔐 Security & Privacy

What's Safe

✅ All credentials stored locally (never uploaded)
✅ OAuth tokens encrypted by Google
✅ API keys in .env (gitignored)
✅ Draft approval required (no auto-send)
✅ Voice processed via Groq (encrypted HTTPS)
✅ No data stored on Telegram servers
✅ Open source - audit the code yourself!

What's Never Committed

🚫 credentials.json - Gmail OAuth
🚫 token.pickle - Gmail access token
🚫 .env - All API keys

What Groq Processes

✅ Voice transcription (Whisper) - No email content
✅ AI reasoning (LLaMA) - Command interpretation only
✅ Voice generation (Orpheus) - Conversational responses only

Groq NEVER sees your email content!

Best Practices

  1. Revoke access anytime: Google Account
  2. Delete bot anytime: Send /deletebot to @BotFather
  3. Use test account for development
  4. Keep .env private - never share

📊 Version History

V4 (Current) - Voice Conversational AI

Released: February 2026

New:

  • 🔊 Text-to-Speech voice responses (Groq Orpheus)
  • 🎯 Smart voice system (only speaks conversation)
  • 💬 Natural Q&A ("What can you do?", "Is this secure?")
  • 🎲 Dynamic response variations (never repeats)
  • 🧠 Off-topic detection & graceful redirect
  • 👂 Human-like conversation flow

Features:

  • Voice input AND output (full conversation)
  • Context-aware TTS messages
  • Token-optimized (200 chars max per voice)
  • 3-5 response variations per action
  • Capabilities & security explanations
  • Friendly off-topic handling

V3 - Telegram Voice Bot

Released: February 2026

New:

  • 🎤 Telegram bot with voice + text support
  • 🧠 Context-aware conversation memory
  • 🤖 AI-powered reply generation
  • 📱 Mobile-friendly (Telegram app)
  • 🔄 Whisper transcription normalization

V2 - MCP Email Agent

Released: January 2026

New:

  • Model Context Protocol architecture
  • 10 email tools (list, read, send, search, draft)
  • Gmail advanced search

V1 - Basic Email Bot

Released: January 2026

Features:

  • Simple Gmail read/send
  • Basic CLI commands

🎯 Roadmap

V5 (Planned)

  • 📎 Attachment Support - Send/receive files
  • 🗓️ Calendar Integration - Schedule from emails
  • 🔔 Push Notifications - Real-time email alerts
  • 🌐 Multi-language - Support more languages
  • 🎨 Voice Style Selection - Choose TTS voice
  • 👥 Multi-user - Family/team bot access
  • 📊 Analytics Dashboard - Email insights

📄 License

MIT License - Free to use, modify, and distribute


🙏 Acknowledgments

  • Groq - Fast LLaMA inference, Whisper API & Orpheus TTS
  • Telegram - Bot platform
  • Google - Gmail API
  • Apple - iCloud IMAP/SMTP
  • Anthropic - MCP protocol inspiration

📧 Links


🤝 Contributing

Contributions welcome! Please:

  1. Fork the repo
  2. Create feature branch (git checkout -b feature/amazing)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push branch (git push origin feature/amazing)
  5. Open Pull Request

❓ FAQ

Q: Is this free?
A: Yes! Groq API is free (with limits). Gmail API is free. Telegram is free.

Q: Does the bot store my emails?
A: No. Emails are fetched and processed in real time; anything the bot keeps (credentials, tokens, session memory) stays on the machine running it.

Q: Can other people use my bot?
A: No. Your bot is private. Only you can access it (unless you share the link).

Q: What if I run out of Groq credits?
A: Groq free tier: 3600 TTS tokens/day. Each voice message ~50-200 tokens. Wait 24h if you hit limits.
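
As a rough worked example of what that budget buys, using only the figures quoted in the answer above (the function name is hypothetical):

```python
DAILY_TTS_TOKENS = 3600         # free-tier figure quoted above
TOKENS_PER_MESSAGE = (50, 200)  # low and high estimates quoted above


def messages_per_day(budget=DAILY_TTS_TOKENS, cost_range=TOKENS_PER_MESSAGE):
    """Return (fewest, most) voice replies the daily budget covers."""
    low_cost, high_cost = cost_range
    return budget // high_cost, budget // low_cost


print(messages_per_day())  # (18, 72)
```

So the daily budget covers somewhere between 18 and 72 voice replies, depending on how long they are.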

Q: Can I choose a different voice?
A: Yes! Edit bot.py line 363: voice="diana" → Options: diana, hannah, autumn (feminine) or troy, austin, daniel (masculine)

Q: Can I host this on a server?
A: Yes! Use Docker on any VPS (AWS, DigitalOcean, etc.). Keep .env secure.

Q: Why does the bot speak some responses but not others?
A: By design! Bot sends TEXT for data (email lists, content) and VOICE for conversation (questions, confirmations). This saves tokens and is faster.
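
The text-for-data, voice-for-conversation split can be sketched as a reply object with two channels. This is an illustration of the design described in the answer above, not the bot's code; `Reply` and `build_list_reply` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reply:
    text: Optional[str] = None   # data: lists, email bodies, drafts
    voice: Optional[str] = None  # short conversational line for TTS


def build_list_reply(emails):
    """Put the inbox listing in text and only the follow-up question in voice."""
    if not emails:
        return Reply(text="No emails found.", voice="Your inbox is empty.")
    noun = "email" if len(emails) == 1 else "emails"
    lines = [f"Found {len(emails)} {noun}:"]
    for i, mail in enumerate(emails, start=1):
        lines.append(f"{i}. From: {mail['from']}\n   Subject: {mail['subject']}")
    return Reply(
        text="\n".join(lines),
        voice="Here are your latest emails. Which one would you like me to read?",
    )
```

Only `voice` would be sent to TTS, so senders, subjects, and bodies never count against the voice token budget.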


Built with ❤️ by Samy Metref

⭐ Star this repo if you find it useful!
💡 Questions? Open an issue!
🐳 Docker Hub: Check latest version!


Ready to have voice conversations with your email assistant? Get started in 10 minutes! 🚀
