A Python tool that analyzes your Gmail inbox to identify newsletters and automated emails, detecting unsubscribe opportunities and providing insights into your email subscriptions.
- Automated Gmail Analysis: Connects to your Gmail account securely using OAuth2
- Newsletter Detection: Identifies newsletter emails using multiple detection methods:
- List-Unsubscribe headers
- HTML content analysis
- Contextual link detection
- Smart Unsubscribe Detection:
- Finds unsubscribe links in email bodies
- Detects unsubscribe headers
- Provides confidence scoring for each detection
- Preserves contextual information around unsubscribe links
- Domain Analysis:
- Groups emails by sender domain
- Tracks unique senders per domain
- Calculates newsletter probability scores
- Detailed Reporting:
- CSV export of findings
- Domain-level statistics
- Unsubscribe link aggregation
- Python 3.7+
- Gmail account
- Google Cloud Project with Gmail API enabled
- Clone the repository:
git clone https://github.com/yourusername/gmail-newsletter-analyzer.git
cd gmail-newsletter-analyzer
- Install required packages:
pip install -r requirements.txt
- Set up Google Cloud Project:
- Go to Google Cloud Console
- Create a new project
- Enable the Gmail API
- Create OAuth 2.0 credentials
- Download the credentials file as
credentials.json
- Place your
credentials.json
file in the project root directory - First run will open a browser window for Gmail authentication
- After authentication, the app will save the token for future use
Run the main script:
python gmail_analyzer.py
This will:
- Connect to your Gmail account
- Analyze your recent emails
- Generate reports in CSV format
- Provide summary statistics in the console
The script generates several output files:
unsubscribe_analysis.csv
: Domain-level analysis with unsubscribe informationunsubscribe_detection.log
: Detailed logging of the analysis process
Example of the CSV output:
domain,unique_senders,header_link_ratio,html_link_ratio,avg_confidence,unsubscribe_url
newsletter.example.com,5,1.0,0.8,0.95,https://example.com/unsubscribe
The tool uses multiple methods to identify newsletters and unsubscribe options:
-
Header Detection:
- Looks for standard
List-Unsubscribe
headers - High confidence score for standard header implementations
- Looks for standard
-
HTML Content Analysis:
- Searches for common unsubscribe link patterns
- Analyzes link text and surrounding context
- Uses BeautifulSoup for reliable HTML parsing
-
Confidence Scoring:
- Scores each detection based on multiple factors:
- Link text relevance
- URL patterns
- Surrounding context
- Header presence
- HTML structure
- Scores each detection based on multiple factors:
- Domain Aggregation: Groups emails by sender domain
- Sender Analysis: Tracks unique senders per domain
- Link Detection: Finds and validates unsubscribe links
- Confidence Scoring: Provides reliability metrics for detections
- Context Preservation: Saves surrounding text for verification
- Uses OAuth 2.0 for secure Gmail access
- Stores credentials securely
- Read-only access to your Gmail account
- No email content is stored permanently
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (
git checkout -b feature/AmazingFeature
) - Commit your changes (
git commit -m 'Add some AmazingFeature'
) - Push to the branch (
git push origin feature/AmazingFeature
) - Open a Pull Request
Create a requirements.txt
file containing:
google-auth-oauthlib>=0.4.6
google-auth-httplib2>=0.1.0
google-api-python-client>=2.0.0
pandas>=1.3.0
pydantic>=1.8.0
beautifulsoup4>=4.9.3
python-dateutil>=2.8.2
- Only analyzes emails in your inbox (not other folders)
- Limited to recent emails (default: last 100)
- Requires manual verification of unsubscribe links
- May not detect all newsletter types
- Some unsubscribe links may require login
This project is licensed under the MIT License - see the LICENSE.md file for details
- Google Gmail API Documentation
- BeautifulSoup4 Documentation
- Pandas Documentation
- Python Type Hints Documentation
- 1.0.0 (2024-03)
- Initial release
- Basic newsletter detection
- Domain analysis
- Unsubscribe link detection
Your Name - @yourusername
Project Link: https://github.com/yourusername/gmail-newsletter-analyzer-