An agentic AI system for analyzing Google Play Store review trends with high-recall topic detection, semantic consolidation, and dynamic topic evolution. The system processes daily batches of reviews and generates rolling 30-day trend reports.
Company: Pulsegen Technologies
Tech Stack: Python, OpenAI/Anthropic API, SQLite, Pandas
This system uses LLM-based AI agents instead of traditional topic modeling techniques (LDA, BERTopic, etc.). The architecture consists of:
- Review Ingestion Agent: Collects and batches Play Store reviews by date
- Topic Detection Agent: Uses LLM reasoning to identify topics from reviews
- Semantic Consolidation Agent: Merges semantically similar topics into a unified taxonomy
- Trend Analysis Agent: Generates rolling 30-day trend reports
play-store-trend-agent/
├── src/
│ ├── __init__.py
│ ├── scraper.py # Play Store review scraper
│ ├── batch_processor.py # Daily batch ingestion pipeline
│ ├── topic_agent.py # AI agent for topic detection
│ ├── consolidation_agent.py # Semantic consolidation agent
│ ├── trend_storage.py # Daily topic frequency storage
│ ├── report_generator.py # 30-day rolling report generator
│ └── main.py # Main pipeline orchestrator
├── output/ # Generated trend tables
├── video/ # Video demo script
├── data/ # SQLite database for trend storage
├── requirements.txt
└── README.md
The system uses an agent-based semantic consolidation approach:
- Embedding Similarity: Reviews are embedded using sentence transformers
- LLM Reasoning: Agent analyzes semantic similarity and merges topics
- Taxonomy Maintenance: Unified topics are stored in a hierarchical taxonomy
- Dynamic Evolution: New topics are automatically added when patterns emerge
Input Reviews:
- "Delivery guy was rude"
- "Delivery partner behaved badly"
- "Delivery person was impolite"
Consolidated Topic: Delivery partner rude
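The merge step in the example above can be sketched as a greedy grouping over embedding vectors. This is a minimal illustration and not the project's implementation: the 2-D vectors are made-up stand-ins for sentence-transformer embeddings, and `cosine`, `consolidate`, and the 0.8 threshold are hypothetical names and values chosen for the sketch.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def consolidate(phrases, vectors, threshold=0.8):
    """Greedy grouping: attach each phrase to the first group whose
    representative vector is similar enough, otherwise start a new group."""
    groups = []  # list of (representative_vector, [member phrases])
    for phrase in phrases:
        vec = vectors[phrase]
        for rep, members in groups:
            if cosine(rep, vec) >= threshold:
                members.append(phrase)
                break
        else:
            groups.append((vec, [phrase]))
    return [members for _, members in groups]


# Toy vectors (assumption): real embeddings would come from an embedding model.
TOY_VECTORS = {
    "Delivery guy was rude": (0.9, 0.1),
    "Delivery partner behaved badly": (0.85, 0.2),
    "Delivery person was impolite": (0.95, 0.05),
    "Food stale": (0.1, 0.9),
}

groups = consolidate(list(TOY_VECTORS), TOY_VECTORS)
```

With these toy vectors the three delivery phrases fall into one group and "Food stale" stays separate. In the described system an LLM agent would then review each group and pick the consolidated label, a step this sketch does not attempt.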
- Delivery issue
- Food stale
- Delivery partner rude
- Maps not working properly
- Instamart should be open all night
- Bring back 10 minute bolt delivery
The system automatically generates new topics when new patterns are detected.
- Python 3.9+
- OpenAI API key (or Anthropic API key)
- Internet connection for Play Store scraping
- Clone the repository:
git clone <repository-url>
cd play-store-trend-agent
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables:
# Create .env file
OPENAI_API_KEY=your_api_key_here
# OR
ANTHROPIC_API_KEY=your_api_key_here
- Initialize the database:
python src/main.py --init-db
Process reviews for a specific app:
python src/main.py --app-id com.swiggy.android --start-date 2024-06-01
Generate trend report for a specific date:
python src/main.py --generate-report --target-date 2024-06-30
Process all batches from start date to today:
python src/main.py --app-id com.swiggy.android --start-date 2024-06-01 --process-all
Edit config.py to customize:
- API provider (OpenAI/Anthropic)
- Model selection
- Similarity thresholds
- Batch sizes
- Review limits per day
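As a rough idea of how these options might be grouped, here is a hypothetical configuration object. The actual contents of config.py are not shown in this README, so every field name, default value, and the `PipelineConfig` class itself are assumptions made for illustration.

```python
from dataclasses import dataclass

ALLOWED_PROVIDERS = ("openai", "anthropic")


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical settings container; names and defaults are illustrative."""
    api_provider: str = "openai"
    model: str = "your-model-name"        # placeholder, not a real model id
    similarity_threshold: float = 0.8     # cutoff for merging similar topics
    batch_size: int = 50                  # reviews sent to the agent per call
    max_reviews_per_day: int = 500        # cap on reviews ingested per date

    def __post_init__(self):
        # Fail early on values the pipeline could not use.
        if self.api_provider not in ALLOWED_PROVIDERS:
            raise ValueError(f"api_provider must be one of {ALLOWED_PROVIDERS}")
        if not 0.0 <= self.similarity_threshold <= 1.0:
            raise ValueError("similarity_threshold must be between 0 and 1")
        if self.batch_size <= 0 or self.max_reviews_per_day <= 0:
            raise ValueError("batch_size and max_reviews_per_day must be positive")


config = PipelineConfig(api_provider="anthropic", batch_size=25)
```

Validating in `__post_init__` means a misconfigured provider or threshold fails at startup rather than midway through a batch run.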
The system generates CSV files with the following structure:
| Topic | Jun 1 | Jun 2 | Jun 3 | ... | Jun 30 |
|---|---|---|---|---|---|
| Delivery issue | 12 | 8 | 15 | ... | 23 |
| Food stale | 5 | 7 | 3 | ... | 11 |
| Delivery partner rude | 8 | 12 | 6 | ... | 9 |
- Rows: Topics (consolidated taxonomy)
- Columns: Dates (T-30 to T)
- Cells: Frequency count of topic mentions
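The table layout above can be reproduced from a list of (topic, date) mentions. The following is a minimal sketch using only the standard library so that it stays self-contained; the project lists Pandas in its tech stack, and a pivot there would be a natural alternative. The function names are hypothetical, and the window is assumed to be 30 columns ending at the target date, matching the Jun 1 to Jun 30 sample.

```python
import csv
import io
from collections import Counter
from datetime import date, timedelta


def window_dates(target, window_days=30):
    """Return the dates in the window, oldest first, ending at target."""
    return [target - timedelta(days=n) for n in range(window_days - 1, -1, -1)]


def build_trend_table(mentions, target, window_days=30):
    """mentions: iterable of (topic, date) pairs, one per topic mention."""
    dates = window_dates(target, window_days)
    in_window = set(dates)
    counts = Counter((topic, day) for topic, day in mentions if day in in_window)
    topics = sorted({topic for topic, _ in counts})
    header = ["Topic"] + [f"{d:%b} {d.day}" for d in dates]
    rows = [[topic] + [counts[(topic, d)] for d in dates] for topic in topics]
    return header, rows


def to_csv(header, rows):
    """Serialize the table to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


mentions = [
    ("Delivery issue", date(2024, 6, 1)),
    ("Delivery issue", date(2024, 6, 1)),
    ("Food stale", date(2024, 6, 30)),
    ("Food stale", date(2024, 5, 31)),  # outside the window, so ignored
]
header, rows = build_trend_table(mentions, target=date(2024, 6, 30))
csv_text = to_csv(header, rows)
```

Dates with no mentions for a topic are filled with 0, so every row has one cell per day in the window.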
- Multi-pass Analysis: Agent reviews each batch multiple times
- Context Window: Uses full review text, not just summaries
- Semantic Expansion: Identifies related concepts and variations
- Edge Case Handling: Captures rare but important topics
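One way to read the multi-pass idea is that each pass may surface topics the others miss, and the union of all passes is kept. The sketch below assumes that reading. `multi_pass_detect` is a hypothetical helper, and `fake_detector` is a toy keyword matcher standing in for the LLM call, which this README does not specify.

```python
def multi_pass_detect(reviews, detect_topics, passes=3):
    """Run the detector several times and keep the union of all topics found."""
    found = set()
    for pass_index in range(passes):
        found |= set(detect_topics(reviews, pass_index))
    return found


def fake_detector(reviews, pass_index):
    """Toy stand-in for an LLM call: each pass looks for a different keyword."""
    keyword_sets = [
        {"rude": "Delivery partner rude"},
        {"stale": "Food stale"},
        {"maps": "Maps not working properly"},
    ]
    keywords = keyword_sets[pass_index % len(keyword_sets)]
    return {
        topic
        for review in reviews
        for word, topic in keywords.items()
        if word in review.lower()
    }


reviews = ["Delivery guy was rude", "The food was stale", "Maps keeps crashing"]
topics = multi_pass_detect(reviews, fake_detector)
single_pass = multi_pass_detect(reviews, fake_detector, passes=1)
```

Here a single pass finds only one topic while three passes find all three, which is the recall gain the multi-pass approach is aiming for. The cost is one detector call per pass.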
The system automatically:
- Detects new topic patterns
- Creates new taxonomy categories
- Merges similar new topics
- Maintains topic hierarchy
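A simple way to picture new-topic detection is a taxonomy that starts from seed topics and promotes an unseen label once it has been mentioned often enough. This sketch covers only that promotion step, not merging or hierarchy. The `Taxonomy` class and the `min_mentions` rule are assumptions made for illustration, since the README does not say how the system decides that a pattern has emerged.

```python
from collections import Counter


class Taxonomy:
    """Hypothetical topic registry with threshold-based promotion."""

    def __init__(self, seed_topics, min_mentions=3):
        self.topics = set(seed_topics)
        self.min_mentions = min_mentions
        self._candidates = Counter()  # mentions of labels not yet promoted

    def observe(self, label):
        """Record one mention; return True if the label is in the taxonomy."""
        if label in self.topics:
            return True
        self._candidates[label] += 1
        if self._candidates[label] >= self.min_mentions:
            # Enough evidence: promote the candidate to a full topic.
            self.topics.add(label)
            del self._candidates[label]
            return True
        return False


taxonomy = Taxonomy({"Delivery issue", "Food stale"}, min_mentions=2)
results = [taxonomy.observe("Instamart should be open all night") for _ in range(2)]
known = taxonomy.observe("Food stale")
```

A mention threshold keeps one-off phrasings out of the taxonomy, at the risk of delaying rare but important topics, so the value would need tuning against the recall goal.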
- Recall: Percentage of relevant issues captured
- Topic Consolidation Accuracy: Semantic merging correctness
- Topic Duplication Rate: Should be near zero
- Trend Window Accuracy: Correct 30-day rolling window
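The first and third metrics can be made concrete with a hand-labeled reference set. The definitions below are one plausible reading rather than the project's stated formulas: recall is the share of relevant topics that were detected, and duplication rate is the share of final topics that are redundant once each is mapped to a hand-assigned canonical label. Both function names are hypothetical.

```python
def recall(detected, relevant):
    """Fraction of relevant topics that appear in the detected set."""
    relevant = set(relevant)
    if not relevant:
        return 1.0  # nothing to find, so nothing was missed
    return len(set(detected) & relevant) / len(relevant)


def duplication_rate(topics, canonical):
    """Fraction of topics that are redundant.

    canonical maps a label to its hand-assigned canonical label;
    labels missing from the map are treated as already canonical.
    """
    if not topics:
        return 0.0
    unique = {canonical.get(topic, topic) for topic in topics}
    return 1 - len(unique) / len(topics)


relevant = {
    "Delivery issue",
    "Food stale",
    "Delivery partner rude",
    "Maps not working properly",
}
detected = {"Delivery issue", "Food stale", "Delivery partner rude"}
recall_value = recall(detected, relevant)

final_topics = [
    "Delivery partner rude",
    "Delivery guy impolite",  # duplicate that consolidation failed to merge
    "Food stale",
    "Delivery issue",
]
canonical = {"Delivery guy impolite": "Delivery partner rude"}
dup_value = duplication_rate(final_topics, canonical)
```

In this toy data recall is 0.75 (three of four relevant topics found) and the duplication rate is 0.25 (one of four final topics is redundant).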
See video/demo_script.md for the complete video demonstration script.
Sample trend reports are available in the output/ directory:
- trend_report_2024-06-30.csv
- trend_report_2024-07-15.csv
- trend_report_2024-07-30.csv
Proprietary - Pulsegen Technologies
For questions or issues, please contact the development team.