🤖 Reddit Persona Generator using OpenRouter API 🚀

Welcome to the Reddit Persona Generator – a powerful tool that transforms Reddit user activity into deep, emotionally enriched, AI-generated user personas.

Personas are saved as text files in the output/ directory.

This project combines:

📄 Reddit data scraping
🧠 Sentiment & emotion analysis
🪄 Advanced prompt engineering
🔮 LLM-powered personality inference using OpenRouter

Project Directory Structure

reddit_persona_generator/
├── config.py
├── reddit_scraper.py
├── openrouter_credit_report.py
├── llm_inference.py
├── main.py
├── preprocessing.py
├── prompt_builder.py
├── persona_writer.py
├── requirements.txt
├── test_openrouter_key.py
├── README.md
├── data/
│   ├── [username]_raw.json
│   └── ...
├── enriched_data/
│   ├── [username]_enriched.json
│   └── ...
└── output/
    ├── [username]_persona.txt
    └── ...

data/ — Raw scraped Reddit data JSON files
enriched_data/ — Enriched JSON files with sentiment, emotion, NER, archetype, toxicity
output/ — Final generated persona text files
Root folder contains main Python scripts and configuration files

🎯 Project Overview

The Reddit Persona Generator takes a Reddit username and analyzes their public posts and comments to infer a psychographic and behavioral profile. This profile includes:

🔍 Name, age, location (inferred)
🧬 Personality traits (MBTI-style)
🧠 Motivations and emotional triggers
🧭 Decision-making patterns
💳 Spending habits and brand loyalties

The analysis pipeline leverages HuggingFace Transformers, VADER sentiment analysis, and OpenRouter API to create LLM-driven personas in plain English.

⚙️ Setup Instructions

🔁 1. Clone the Repository

git clone https://github.com/yourusername/reddit_persona_generator.git
cd reddit_persona_generator

🧪 2. (Optional) Set Up Virtual Environment

python -m venv venv
# Mac/Linux
source venv/bin/activate
# Windows
venv\Scripts\activate

📦 3. Install Dependencies

pip install -r requirements.txt

🔐 4. Set up Reddit & OpenRouter Credentials

Create an OpenRouter API Key

Go to OpenRouter
Sign up and generate your API key
While selecting the model to use it is preferred to use "google/gemini-2.5-pro" if you have credits in openrouter account other wise use "nvidia/llama-3.1-nemotron-ultra-253b-v1:free"

Create a Reddit App

Visit Reddit Apps
Create a new "script" app and fill in:
- Name: anything
- Redirect URI: http://localhost
Save and note the Client ID and Client Secret

Create a `.env` file

OPENROUTER_API_KEY=your_openrouter_api_key
LLM_MODEL_NAME=google/gemini-2.5-pro
REDDIT_CLIENT_ID=your_client_id
REDDIT_CLIENT_SECRET=your_client_secret
REDDIT_USER_AGENT=your_unique_user_agent

🗂️ Detailed Code Modules

📥 reddit_scraper.py — Collects Reddit Activity

Uses PRAW to fetch Reddit posts & comments

Saves raw data to JSON for downstream processing

🧪 preprocessing.py — Adds Sentiment, Emotion, NER, Archetype & Toxicity

Applies VADER for sentiment scoring

Uses HuggingFace model j-hartmann/emotion-english-distilroberta-base for emotion labeling

Uses spaCy for Named Entity Recognition (NER)

Maps subreddits to behavioral archetypes

Uses Detoxify for toxicity and civility scoring

Stores enriched data in JSON

📊 openrouter_credit_report.py — Check OpenRouter API Key & Credits

Sends a GET request to OpenRouter API to retrieve API key details and credit status.

Usage:

export OPENROUTER_API_KEY=your_api_key
python openrouter_credit_report.py

🧪 test_openrouter_key.py — Test OpenRouter API Key with Sample Request

Sends a sample chat completion request to OpenRouter API to verify API key and model.

Usage:

export OPENROUTER_API_KEY=your_api_key
export LLM_MODEL_NAME=google/gemini-2.5-pro
python test_openrouter_key.py

✍️ prompt_builder.py — Crafts AI Prompts

Extracts and formats behavioral data

Builds input prompts that are LLM-friendly

🤖 llm_inference.py — Connects to OpenRouter

Sends prompts to OpenRouter models like GPT-4

Handles retries, chunking, rate limits, and timeouts

📄 persona_writer.py — Saves Persona

Writes final persona to output/_persona.txt

🎬 main.py — Runs Everything

Orchestrates full pipeline from scraping to persona output

🧠 NLP Techniques & Their Benefits

Technique	Tool Used	Benefits
Sentiment Analysis	VADER	Lightweight, real-time, social-media optimized
Emotion Detection	distilroberta-base fine-tuned on emotion data	Identifies nuanced emotions: joy, anger, fear, etc.
Named Entity Recognition (NER)	spaCy	Extracts entities like people, organizations, locations for richer context
Subreddit Archetype Mapping	Custom dictionary mapping	Maps subreddit interests to behavioral archetypes
Toxicity & Civility Scoring	Detoxify	Detects toxic or uncivil language for content moderation insights
Prompt Chunking	Custom batching	Handles large input while respecting token limits
LLM Inference	Gemini 2.5/ Nvidia nemotron via OpenRouter	Enables deep personality generation
Prompt Engineering	Context-rich template generation	Guides LLMs to output structured and relevant persona
Modular Pipeline	Python modules per task	Easy to maintain, debug, and extend

🔄 Workflow Diagram

graph TD
    A[Input Reddit Username] --> B[Scrape Posts & Comments]
    B --> C[Save Raw JSON]
    C --> D[Preprocess: Sentiment, Emotion, NER, Archetype, Toxicity]
    D --> E[Build Prompt from Enriched Data]
    E --> F[Call OpenRouter LLM]
    F --> G[Generate Persona]
    G --> H[Write Persona to File]
    H --> I[Done ✅]

🚀 Running the Generator

Use this command to generate a persona:

python main.py https://www.reddit.com/user/your_target_user/

📂 Output Files

The project organizes files into the following directories:

data/ — Contains raw scraped Reddit data JSON files.
enriched_data/ — Contains enriched JSON files with added sentiment, emotion, NER, archetype, and toxicity features.
output/ — Contains the final generated persona text files.

Example output files for a user:

data/your_target_user_raw.json — Scraped raw data
enriched_data/your_target_user_enriched.json — Enriched data with sentiment, emotion, NER, archetype, toxicity
output/your_target_user_persona.txt — Final persona ✨

💬 Common Errors & Fixes

Error Code	Error Message	Cause	Fix
429	RateLimitError	Too many requests	Add delays or use retry
401	AuthenticationError	Bad/missing API key	Recheck `.env` file
408	TimeoutError	Server/network slow	Use retry logic (already built-in)
404	ModelNotFoundError	Wrong model name	Verify OpenRouter model ID
N/A	Token Overflow	Text too long	Chunking is already handled automatically

🤝 Contributions & Feedback

I welcome contributions:

🌟 Fork the repo
🛠 Fix bugs or add new models
✨ Extend the prompt builder
📣 Suggest new prompt templates or psychographic layers

Open an issue or PR anytime!

📌 Future Features (Roadmap)

🧠 Add OCEAN / Big Five personality scores
📊 Subreddit-based sentiment heatmaps
✍️ Stylometry and writing fingerprint detection
📄 Export to PDF / Notion integration
🕵️‍♂️ Optional sarcasm detection (via better model)

📣 Final Thoughts

This project blends behavioral science, NLP, and LLMs to create a mirror of how users present themselves on Reddit. Use it for research, digital marketing, audience profiling, or just for fun!

Turn Reddit data into rich, human-readable insights — instantly. 🚀

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

🤖 Reddit Persona Generator using OpenRouter API 🚀

Project Directory Structure

📌 Table of Contents

🎯 Project Overview

⚙️ Setup Instructions

🔁 1. Clone the Repository

🧪 2. (Optional) Set Up Virtual Environment

📦 3. Install Dependencies

🔐 4. Set up Reddit & OpenRouter Credentials

Create an OpenRouter API Key

Create a Reddit App

Create a `.env` file

🗂️ Detailed Code Modules

🧠 NLP Techniques & Their Benefits

🔄 Workflow Diagram

🚀 Running the Generator

📂 Output Files

💬 Common Errors & Fixes

🤝 Contributions & Feedback

📌 Future Features (Roadmap)

📣 Final Thoughts

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
__pycache__		__pycache__
data		data
enriched_data		enriched_data
output		output
.env		.env
README.md		README.md
config.py		config.py
llm_inference.py		llm_inference.py
main.py		main.py
openrouter_credit_report.py		openrouter_credit_report.py
persona_writer.py		persona_writer.py
preprocessing.py		preprocessing.py
prompt_builder.py		prompt_builder.py
reddit_scraper.py		reddit_scraper.py
requirements.txt		requirements.txt
test_openrouter_key.py		test_openrouter_key.py

manishreddy123/Reddit-Persona-Generator-using-OpenRouter-API

Folders and files

Latest commit

History

Repository files navigation

🤖 Reddit Persona Generator using OpenRouter API 🚀

Project Directory Structure

📌 Table of Contents

🎯 Project Overview

⚙️ Setup Instructions

🔁 1. Clone the Repository

🧪 2. (Optional) Set Up Virtual Environment

📦 3. Install Dependencies

🔐 4. Set up Reddit & OpenRouter Credentials

Create an OpenRouter API Key

Create a Reddit App

Create a .env file

🗂️ Detailed Code Modules

🧠 NLP Techniques & Their Benefits

🔄 Workflow Diagram

🚀 Running the Generator

📂 Output Files

💬 Common Errors & Fixes

🤝 Contributions & Feedback

📌 Future Features (Roadmap)

📣 Final Thoughts

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Create a `.env` file

Packages