Keerthan410/Pulsegen

Play Store Review Trend Analysis AI Agent

Project Overview

An agentic AI system that analyzes Google Play Store review trends using high-recall topic detection, semantic consolidation, and dynamic topic evolution. It processes daily batches of reviews and generates rolling 30-day trend reports.

Company: Pulsegen Technologies
Tech Stack: Python, OpenAI/Anthropic API, SQLite, Pandas

Architecture

Agentic AI Approach

This system uses LLM-based AI agents instead of traditional topic modeling techniques (LDA, BERTopic, etc.). The architecture consists of:

  1. Review Ingestion Agent: Collects and batches Play Store reviews by date
  2. Topic Detection Agent: Uses LLM reasoning to identify topics from reviews
  3. Semantic Consolidation Agent: Merges semantically similar topics into a unified taxonomy
  4. Trend Analysis Agent: Generates rolling 30-day trend reports
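As an illustration of the first stage, the sketch below groups reviews into one batch per calendar day. The `Review` record and the `batch_by_date` helper are hypothetical names chosen for this example. The fields returned by the actual scraper may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Review:
    """Hypothetical review record; the real scraper's fields may differ."""
    review_id: str
    posted_on: date
    text: str


def batch_by_date(reviews: Iterable[Review]) -> Dict[date, List[Review]]:
    """Group reviews into one batch per calendar day, oldest day first."""
    batches: Dict[date, List[Review]] = defaultdict(list)
    for review in reviews:
        batches[review.posted_on].append(review)
    # Sort by date so downstream stages see days in chronological order.
    return dict(sorted(batches.items()))


reviews = [
    Review("r1", date(2024, 6, 2), "Delivery guy was rude"),
    Review("r2", date(2024, 6, 1), "Food was stale"),
    Review("r3", date(2024, 6, 2), "Maps not working properly"),
]
daily_batches = batch_by_date(reviews)
for day, batch in daily_batches.items():
    print(day.isoformat(), [r.review_id for r in batch])
```

Each daily batch can then be handed to the topic detection stage independently, which keeps per-day counts easy to attribute.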

Key Components

play-store-trend-agent/
├── src/
│   ├── __init__.py
│   ├── scraper.py              # Play Store review scraper
│   ├── batch_processor.py      # Daily batch ingestion pipeline
│   ├── topic_agent.py          # AI agent for topic detection
│   ├── consolidation_agent.py  # Semantic consolidation agent
│   ├── trend_storage.py        # Daily topic frequency storage
│   ├── report_generator.py     # 30-day rolling report generator
│   └── main.py                 # Main pipeline orchestrator
├── output/                     # Generated trend tables
├── video/                      # Video demo script
├── data/                       # SQLite database for trend storage
├── requirements.txt
└── README.md
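The layout above mentions a SQLite database for daily topic frequencies. A minimal sketch of what such storage could look like follows. The table name, column names, and the `add_mentions` helper are assumptions made for illustration, not the schema used by `trend_storage.py`.

```python
import sqlite3

# Hypothetical schema: one row per (app, date, topic) with a mention count.
SCHEMA = """
CREATE TABLE IF NOT EXISTS topic_daily_counts (
    app_id      TEXT    NOT NULL,
    review_date TEXT    NOT NULL,  -- ISO date, e.g. 2024-06-01
    topic       TEXT    NOT NULL,
    mentions    INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (app_id, review_date, topic)
);
"""


def add_mentions(conn, app_id, review_date, topic, mentions):
    """Add mentions to an (app, date, topic) row, creating it if needed.

    Uses SQLite's upsert syntax, which requires SQLite 3.24 or newer.
    """
    conn.execute(
        """
        INSERT INTO topic_daily_counts (app_id, review_date, topic, mentions)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(app_id, review_date, topic)
        DO UPDATE SET mentions = mentions + excluded.mentions
        """,
        (app_id, review_date, topic, mentions),
    )


conn = sqlite3.connect(":memory:")  # in-memory database for the demo only
conn.executescript(SCHEMA)
add_mentions(conn, "com.swiggy.android", "2024-06-01", "Delivery issue", 7)
add_mentions(conn, "com.swiggy.android", "2024-06-01", "Delivery issue", 5)
row = conn.execute(
    "SELECT mentions FROM topic_daily_counts WHERE topic = ?",
    ("Delivery issue",),
).fetchone()
print(row[0])
```

Keying rows on app, date, and topic means re-processing a batch in several chunks accumulates counts instead of creating duplicate rows.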

Topic Taxonomy Consolidation Strategy

Semantic Merging Rules

The system uses an agent-based semantic consolidation approach:

  1. Embedding Similarity: Reviews are embedded using sentence transformers
  2. LLM Reasoning: Agent analyzes semantic similarity and merges topics
  3. Taxonomy Maintenance: Unified topics are stored in a hierarchical taxonomy
  4. Dynamic Evolution: New topics are automatically added when patterns emerge

Example Consolidation

Input Reviews:

  • "Delivery guy was rude"
  • "Delivery partner behaved badly"
  • "Delivery person was impolite"

Consolidated Topic: Delivery partner rude
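The embedding step of this consolidation can be sketched as a nearest-topic lookup with a similarity threshold. Everything in the block is an assumption made for illustration: the `consolidate` function, the `0.8` threshold, and the hand-written three-dimensional vectors, which stand in for real sentence-transformer embeddings. The LLM reasoning step is not modeled here.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def consolidate(
    phrases: List[str],
    taxonomy: List[str],
    embed: Callable[[str], Vector],
    threshold: float = 0.8,  # hypothetical cutoff, tune per dataset
) -> Tuple[Dict[str, str], List[str]]:
    """Map each phrase to its closest topic, or register it as a new topic."""
    topics = list(taxonomy)
    mapping: Dict[str, str] = {}
    for phrase in phrases:
        vec = embed(phrase)
        best, best_score = None, -1.0
        for topic in topics:
            score = cosine(vec, embed(topic))
            if score > best_score:
                best, best_score = topic, score
        if best is not None and best_score >= threshold:
            mapping[phrase] = best
        else:
            topics.append(phrase)  # no close match: the taxonomy grows
            mapping[phrase] = phrase
    return mapping, topics


# Toy vectors standing in for real embeddings (illustrative values only).
TOY_EMBEDDINGS = {
    "Delivery partner rude": (1.0, 0.0, 0.0),
    "Food stale": (0.0, 1.0, 0.0),
    "Delivery guy was rude": (0.9, 0.1, 0.0),
    "Delivery partner behaved badly": (0.95, 0.05, 0.0),
    "Delivery person was impolite": (0.85, 0.15, 0.0),
    "App crashes on login": (0.0, 0.0, 1.0),
}

mapping, topics = consolidate(
    [
        "Delivery guy was rude",
        "Delivery partner behaved badly",
        "Delivery person was impolite",
        "App crashes on login",
    ],
    taxonomy=["Delivery partner rude", "Food stale"],
    embed=TOY_EMBEDDINGS.__getitem__,
)
for phrase, topic in mapping.items():
    print(f"{phrase!r} -> {topic!r}")
```

With these toy vectors the three delivery phrases fold into the existing topic, while the unrelated phrase becomes a new taxonomy entry. In practice an LLM pass could rename such a raw phrase into a cleaner topic label.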

Seed Topics (Food Delivery Apps)

  • Delivery issue
  • Food stale
  • Delivery partner rude
  • Maps not working properly
  • Instamart should be open all night
  • Bring back 10 minute bolt delivery

The system automatically generates new topics when new patterns are detected.

Setup & Installation

Prerequisites

  • Python 3.9+
  • OpenAI API key (or Anthropic API key)
  • Internet connection for Play Store scraping

Installation Steps

  1. Clone the repository:
git clone <repository-url>
cd play-store-trend-agent
  2. Install dependencies:
pip install -r requirements.txt
  3. Set up environment variables:
# Create .env file
OPENAI_API_KEY=your_api_key_here
# OR
ANTHROPIC_API_KEY=your_api_key_here
  4. Initialize the database:
python src/main.py --init-db
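Since either key may be present, the pipeline has to decide which provider to use at startup. One way this could be done is sketched below. The `pick_provider` helper and its OpenAI-first precedence are assumptions, and the sketch reads the process environment only, leaving the loading of the `.env` file to whatever mechanism the project uses.

```python
import os
from typing import Mapping, Optional


def pick_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "openai" or "anthropic" depending on which API key is set.

    Hypothetical helper: prefers OpenAI when both keys are present.
    """
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    raise RuntimeError(
        "Set OPENAI_API_KEY or ANTHROPIC_API_KEY before running the pipeline."
    )


# Passing a mapping explicitly keeps the demo independent of the real environment.
print(pick_provider({"ANTHROPIC_API_KEY": "your_api_key_here"}))
```

Failing early with a clear message avoids discovering a missing key only after reviews have already been scraped.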

Usage

Running the Pipeline

Process reviews for a specific app:

python src/main.py --app-id com.swiggy.android --start-date 2024-06-01

Generate trend report for a specific date:

python src/main.py --generate-report --target-date 2024-06-30

Process all batches from start date to today:

python src/main.py --app-id com.swiggy.android --start-date 2024-06-01 --process-all
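For reference, the flags used in the commands above could be declared with `argparse` roughly as follows. This is a sketch of a compatible parser, not the contents of `src/main.py`; the help strings and the date parsing are assumptions.

```python
import argparse
from datetime import date


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line flags shown in the usage examples."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Play Store review trend pipeline (illustrative parser).",
    )
    parser.add_argument("--init-db", action="store_true",
                        help="create the SQLite tables and exit")
    parser.add_argument("--app-id",
                        help="Play Store package name, e.g. com.swiggy.android")
    parser.add_argument("--start-date", type=date.fromisoformat,
                        help="first review date to process (YYYY-MM-DD)")
    parser.add_argument("--target-date", type=date.fromisoformat,
                        help="last day of the report window (YYYY-MM-DD)")
    parser.add_argument("--generate-report", action="store_true",
                        help="write the trend report for --target-date")
    parser.add_argument("--process-all", action="store_true",
                        help="process every daily batch from --start-date to today")
    return parser


# Parse an explicit argument list so the demo does not depend on sys.argv.
args = build_parser().parse_args(
    ["--app-id", "com.swiggy.android", "--start-date", "2024-06-01", "--process-all"]
)
print(args.app_id, args.start_date, args.process_all)
```

Parsing dates with `date.fromisoformat` rejects malformed values at the command line instead of deep inside the pipeline.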

Configuration

Edit config.py to customize:

  • API provider (OpenAI/Anthropic)
  • Model selection
  • Similarity thresholds
  • Batch sizes
  • Review limits per day
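A configuration module covering these settings might look like the sketch below. The class name, field names, and all default values are placeholders invented for this example. The real `config.py` may be organized differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineConfig:
    """Illustrative settings; every default here is a placeholder."""
    provider: str = "openai"            # "openai" or "anthropic"
    model: str = "your-model-name"      # placeholder, not a real model id
    similarity_threshold: float = 0.8   # cosine cutoff for merging topics
    batch_size: int = 50                # reviews sent to the agent per call
    max_reviews_per_day: int = 500      # cap on reviews ingested per date

    def __post_init__(self) -> None:
        # Validate eagerly so a bad value fails at startup, not mid-run.
        if self.provider not in ("openai", "anthropic"):
            raise ValueError(f"unknown provider: {self.provider!r}")
        if not 0.0 <= self.similarity_threshold <= 1.0:
            raise ValueError("similarity_threshold must be between 0 and 1")
        if self.batch_size <= 0 or self.max_reviews_per_day <= 0:
            raise ValueError("batch_size and max_reviews_per_day must be positive")


config = PipelineConfig(provider="anthropic", similarity_threshold=0.85)
print(config)
```

A lower similarity threshold merges topics more aggressively, while a higher one keeps more distinct topics at the risk of near-duplicates.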

Output Format

Trend Analysis Table

The system generates CSV files with the following structure:

Topic                  Jun 1  Jun 2  Jun 3  ...  Jun 30
Delivery issue            12      8     15  ...      23
Food stale                 5      7      3  ...      11
Delivery partner rude      8     12      6  ...       9

  • Rows: Topics (consolidated taxonomy)
  • Columns: Dates (T-30 to T)
  • Cells: Frequency count of topic mentions
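The sketch below shows how such a table could be assembled from per-day counts. It uses the standard `csv` module to stay self-contained, although the project lists Pandas in its stack. The function names are hypothetical, and the window is taken literally as T-30 through T inclusive, which yields 31 date columns; set `days_back=29` if exactly 30 columns are wanted.

```python
import csv
import io
from datetime import date, timedelta
from typing import Dict, List, Tuple


def window_dates(target: date, days_back: int = 30) -> List[date]:
    """Dates from target - days_back through target, inclusive."""
    return [target - timedelta(days=offset) for offset in range(days_back, -1, -1)]


def build_trend_rows(
    counts: Dict[Tuple[str, date], int],
    target: date,
    days_back: int = 30,
) -> List[List[str]]:
    """Build a header row plus one row per topic seen inside the window."""
    dates = window_dates(target, days_back)
    topics = sorted({t for (t, d) in counts if dates[0] <= d <= target})
    header = ["Topic"] + [f"{d:%b} {d.day}" for d in dates]
    rows = [header]
    for topic in topics:
        # Days with no mentions are written as 0 rather than left blank.
        rows.append([topic] + [str(counts.get((topic, d), 0)) for d in dates])
    return rows


# Made-up counts for the demo; a short window keeps the output readable.
counts = {
    ("Delivery issue", date(2024, 6, 29)): 4,
    ("Delivery issue", date(2024, 6, 30)): 23,
    ("Food stale", date(2024, 6, 30)): 11,
}
rows = build_trend_rows(counts, target=date(2024, 6, 30), days_back=2)

buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

Filling missing days with zeros keeps every row the same length, so the CSV loads cleanly into spreadsheet tools.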

High Recall Strategy

  1. Multi-pass Analysis: Agent reviews each batch multiple times
  2. Context Window: Uses full review text, not just summaries
  3. Semantic Expansion: Identifies related concepts and variations
  4. Edge Case Handling: Captures rare but important topics
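The multi-pass idea can be sketched as taking the union of topics found across several detection passes. The `multi_pass_topics` function and the keyword-based `fake_detect` stand-in are hypothetical. In the real system each pass would be an LLM call, possibly with a different prompt or focus.

```python
from typing import Callable, Iterable, List, Set


def multi_pass_topics(
    reviews: List[str],
    detect: Callable[[List[str], int], Iterable[str]],
    passes: int = 3,
) -> Set[str]:
    """Union the topics found across several detection passes."""
    found: Set[str] = set()
    for pass_index in range(passes):
        found.update(detect(reviews, pass_index))
    return found


def fake_detect(reviews: List[str], pass_index: int) -> List[str]:
    """Stand-in for an LLM call: each pass 'notices' one keyword only."""
    keywords = [
        ("rude", "Delivery partner rude"),
        ("stale", "Food stale"),
        ("map", "Maps not working properly"),
    ]
    word, topic = keywords[pass_index % len(keywords)]
    return [topic] if any(word in r.lower() for r in reviews) else []


reviews = ["Delivery guy was rude", "The map keeps freezing"]
topics = multi_pass_topics(reviews, fake_detect, passes=3)
print(sorted(topics))
```

Taking the union favors recall: a topic missed in one pass can still be caught in another, at the cost of extra API calls and a greater need for consolidation afterwards.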

Dynamic Topic Evolution

The system automatically:

  • Detects new topic patterns
  • Creates new taxonomy categories
  • Merges similar new topics
  • Maintains topic hierarchy

Evaluation Metrics

  • Recall: Percentage of relevant issues captured
  • Topic Consolidation Accuracy: Semantic merging correctness
  • Topic Duplication Rate: Should be near zero
  • Trend Window Accuracy: Correct 30-day rolling window
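Two of these metrics can be computed directly once a hand-labeled reference is available. The sketch below gives one plausible definition of each. The function names and the exact duplication formula are assumptions, since the README does not define them precisely.

```python
from typing import Dict, Set


def recall(relevant: Set[str], detected: Set[str]) -> float:
    """Fraction of the relevant topics that the system detected."""
    if not relevant:
        return 1.0  # nothing to find, so nothing was missed
    return len(relevant & detected) / len(relevant)


def duplication_rate(topic_to_canonical: Dict[str, str]) -> float:
    """Share of emitted topics that are redundant with another emitted topic.

    `topic_to_canonical` maps each emitted topic to its hand-labeled
    canonical topic; two emitted topics with the same label are duplicates.
    """
    emitted = len(topic_to_canonical)
    if emitted == 0:
        return 0.0
    distinct = len(set(topic_to_canonical.values()))
    return (emitted - distinct) / emitted


# Hand-labeled reference data, made up for the demo.
relevant = {"Delivery issue", "Food stale", "Delivery partner rude",
            "Maps not working properly"}
detected = {"Delivery issue", "Food stale", "Delivery partner rude"}
labels = {
    "Delivery partner rude": "Delivery partner rude",
    "Rude delivery guy": "Delivery partner rude",
    "Food stale": "Food stale",
}

print(recall(relevant, detected))
print(duplication_rate(labels))
```

Tracking both numbers together is useful because aggressive merging lowers duplication but can also hide distinct issues and reduce recall.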

Video Demo Script

See video/demo_script.md for the complete video demonstration script.

Sample Outputs

Sample trend reports are available in the /output/ directory:

  • trend_report_2024-06-30.csv
  • trend_report_2024-07-15.csv
  • trend_report_2024-07-30.csv

License

Proprietary - Pulsegen Technologies

Contact

For questions or issues, please contact the development team.

About

Project structure for a trend analysis system. src contains core Python modules: scraping data, batch processing, topic and consolidation agents, trend storage, report generation, and the main entry point. Other folders store outputs, videos, and raw data, with dependencies listed in requirements.txt and project details in README.md.
