A production-quality Python tool that automatically fetches order confirmation emails from Zomato, Swiggy, Swiggy Instamart, Blinkit, and Zepto via Gmail API, parses them into structured data, and generates comprehensive Excel reports with charts and CSV exports.
| Feature | Details |
|---|---|
| Platforms | Zomato, Swiggy, Swiggy Instamart, Blinkit, Zepto |
| Email parsing | Regex + BeautifulSoup HTML parsing + MIME decoding |
| Data extracted | Order ID, Payment ID, Amount, Date, Store, Status |
| Deduplication | SQLite cache prevents re-parsing emails |
| Cancellation detection | Detects cancelled/refunded orders |
| Excel reports | 6-sheet workbook with charts, filters, currency formatting |
| CSV exports | Per-platform, per-month, and full-dataset CSVs |
| Scheduling | APScheduler cron-based auto-runs |
| CLI | Rich Click-based commands |
| Unit tests | 35 passing pytest tests |
gmail_order_parser/
├── main.py # CLI entry point
├── gmail_auth.py # OAuth2 Gmail authentication
├── gmail_fetcher.py # Email fetching + routing
├── database.py # SQLite caching layer
├── scheduler.py # APScheduler daemon
├── requirements.txt
├── .env.example # Sample environment config
│
├── config/
│ ├── settings.py # Typed config from .env
│ └── logger.py # Colored console + file logging
│
├── parsers/
│ ├── __init__.py # Parser registry
│ ├── base_parser.py # Abstract base with shared utilities
│ ├── zomato_parser.py
│ ├── swiggy_parser.py
│ ├── instamart_parser.py
│ ├── blinkit_parser.py
│ └── zepto_parser.py
│
├── reports/
│ ├── excel_generator.py # Full Excel workbook generator
│ └── csv_exporter.py # CSV export module
│
├── tests/
│ └── test_parsers.py # 35 unit tests
│
├── scripts/
│ └── generate_sample_report.py # Demo without Gmail access
│
├── data/ # SQLite database (auto-created)
├── reports/ # Generated Excel files
├── exports/ # Generated CSV files
└── logs/ # Log files
git clone <repo-url>
cd gmail_order_parser
pip install -r requirements.txtcp .env.example .env
# Edit .env as needed (see Configuration section)- Go to Google Cloud Console
- Create a new project (or select existing)
- Navigate to APIs & Services → Library
- Search for Gmail API and click Enable
- Go to APIs & Services → Credentials
- Click Create Credentials → OAuth 2.0 Client IDs
- Application type: Desktop app
- Name it anything (e.g., "Gmail Order Parser")
- Click Create → Download JSON
- Rename the downloaded file to
credentials.json - Place it in the
config/directory
gmail_order_parser/
└── config/
└── credentials.json ← put it here
python main.py fetch-emailsA browser window will open asking you to authorize the app. Sign in with the Gmail account containing your order emails. The token is saved to config/token.json for future runs.
# Fetch and parse new emails from Gmail
python main.py fetch-emails
# Fetch with custom options
python main.py fetch-emails --max 200 --from-date 2024-01-01
# Generate Excel report
python main.py generate-report
# Generate report for one platform only
python main.py generate-report --platform Zomato
# Export all orders to CSV
python main.py export-csv
# Print monthly summary to console
python main.py monthly-summary
# Filter by year
python main.py monthly-summary --year 2024
# Run everything in one command
python main.py run-all
# Start the scheduler daemon
python main.py schedule
# Show database statistics
python main.py statsThe generated .xlsx workbook contains 6 sheets:
| Sheet | Contents |
|---|---|
| Dashboard | KPI cards, platform breakdown |
| Raw Orders | All orders with filters, colored rows for cancellations |
| Monthly Summary | Month-wise spend totals with grand total row |
| Platform Summary | Per-platform orders, spend, cancelled/refunded counts |
| Payment Methods | Payment method breakdown with % of total |
| Charts | Bar chart (monthly spend) + Pie chart (platform split) |
- ✅ Frozen header rows
- ✅ Auto-filters on all columns
- ✅ Currency formatting (₹#,##0.00)
- ✅ Alternating row colors
- ✅ Platform-colored rows
- ✅ Cancelled orders highlighted in pink
- ✅ Refunded orders highlighted in orange
# Path to OAuth credentials
GOOGLE_CREDENTIALS_PATH=config/credentials.json
GOOGLE_TOKEN_PATH=config/token.json
# Database location
DB_PATH=data/orders.db
# Output directories
REPORTS_DIR=reports
EXPORTS_DIR=exports
LOGS_DIR=logs
# Logging: DEBUG, INFO, WARNING, ERROR
LOG_LEVEL=INFO
# Cron for auto-schedule (default: 8 AM daily)
SCHEDULE_CRON=0 8 * * *
# Max emails per run (0 = unlimited)
MAX_EMAILS=500
# Only fetch emails after this date
FETCH_FROM_DATE=2024-01-01
# Platforms to scan
ENABLED_PLATFORMS=zomato,swiggy,instamart,blinkit,zeptopython -m pytest tests/ -vAll 35 tests cover:
- Parser
can_parse()matching for each platform - Order data extraction (amount, payment method, type)
- Cancellation/refund detection
- Base parser utilities (amount extraction, date parsing, HTML stripping)
Generate a sample report with 80 realistic synthetic orders:
python scripts/generate_sample_report.pyOutput: reports/sample_report.xlsx + 18 CSV files in exports/
- Create
parsers/newplatform_parser.pyinheritingBaseParser - Implement
can_parse()andparse_order() - Register in
parsers/__init__.py:from parsers.newplatform_parser import NewPlatformParser PARSER_REGISTRY.append(NewPlatformParser())
- Add sender domains to
config/settings.py→PLATFORM_SENDERS - Add unit tests in
tests/test_parsers.py
- Credentials and token files are in
config/— add to.gitignore - The tool uses
gmail.readonlyscope — it cannot modify or delete emails - All data is stored locally in SQLite — nothing is sent to external servers
google-auth, google-auth-oauthlib, google-api-python-client → Gmail API
pandas → Data processing
openpyxl → Excel generation
beautifulsoup4, lxml → HTML parsing
python-dotenv → Config management
APScheduler → Scheduling
click → CLI
colorlog → Colored logging
tqdm → Progress bars