# **# 🚀 Reddit Bot Demo**

### 🔍 Description
This notebook demonstrates a Reddit bot developed as part of the Pelton Technology Inc. assessment.  
It features keyword filtering, topic relevance scoring using transformer models, and comment generation logic aligned with a specific personality profile (e.g., Indian politics and pop culture).  
The bot uses semantic similarity techniques to evaluate post titles before deciding to engage.

In [None]:

#Install necessary packages in Colab
!pip install -U pip  # Upgrade pip first
!pip install -U sentence-transformers transformers praw
!pip install tqdm
!pip install google-generativeai


In [None]:
# important since there is a dependency mismatch from providers of module
!pip uninstall -y transformers
!pip install transformers==4.40.2
!pip install sentence-transformers

In [12]:
#important imports
import praw
import json
import os
import random
import time
import threading
from datetime import datetime
from sentence_transformers import SentenceTransformer, util
import google.generativeai as genai
import sqlite3

Add reddit configuration here

In [6]:
botConfig = [{
          "botID": "1",
          "client_id": "xxxxxxxxxx", # Client Id for reddit
          "username": "xxxxxxxxxx", # Bot name
          "mail": "xxxxxxxx", # Put your mail here for reddit bot !
          "interest": "Pc Gaming", # Or any other interesting genere for bot
          "password": "xxxxxxxxx", # password of reddit bot account
          "secret": "xxxxxxx", # client secret of app for reddit
          "bot-personality": " Gamer , technology , scriptKiddie , nerdyhelper", # or any other bot personality traits to define pot's personality to avoit homogeneous bot responses and humanise as much as possible
}]

Reddit Filter configurations . You can tweek these according to your needs .

In [7]:
# Configuration to filter posts based on what-so every you would like
filterConf = {
    "nsfw": 0,
    "stickied": 0,
    "locked": 0,
    "archived": 0,
    "is_self": 1,
    "has_media": 1,
    "score_min": 5,
    "comment_count_min": 10,
    "flair_text": "Discussion",
    "title_contains": [
        "AMA",
        "Ask",
        "TIL"
    ],
    "ignore_flair": [
        "Meme",
        "Shitpost"
    ]
}

Add Sub-Reddits to scan content in

In [8]:
# Put the subreddits you would like to scrape and comment on for your own narrative
subReddits = ['India', 'AskReddit']

Gemini Api Key and Secrets to generate comment

In [19]:
# Demo uses Gemini-API
gemini_api = {
  "api_key": "xxxxxxxx" # Please put your gemini-api key here
}

KeyWords and BuzzWords dictionary

In [9]:
# Keywords for your agenda . This example revolves around Indian politics , pop culture and trends
bot_keywords = [
        "Narendra Modi", "Rahul Gandhi", "Lok Sabha", "BJP", "Congress", "Aam Aadmi Party",
        "UP elections", "India politics", "Nehru", "RSS", "CAA", "Kashmir", "Ram Mandir", "Uniform Civil Code",

        # Pop culture and media
        "Bollywood", "Shah Rukh Khan", "Deepika Padukone", "Ranbir Kapoor", "Kangana Ranaut",
        "Koffee with Karan", "Bigg Boss", "Indian Idol", "OTT", "Netflix India", "JioCinema",

        # Trends and youth culture
        "influencers", "reels", "Instagram", "YouTube India", "Standup comedy", "Ashneer Grover", "Shark Tank India",
        "Elvish Yadav", "CarryMinati", "Tech Burner", "BB Ki Vines", "desi memes",

        # Social + cultural
        "religion", "caste", "beef ban", "love jihad", "hijab ban", "farmers protest", "gender politics",
        "women's safety", "online trolling", "boycott culture", "nationalism", "freedom of speech"
    ]

# **BotDemo class , Executing this snippet will auto-comment on your selected sub-reddits based on bot_keywords !**

***How does it work ? ***



1.   We extract a batch of posts from reddit
2.   We measure digree of alignment of topic with respect to our keywords
3.   To measure the digree of alignment , we process topic and do sementic analysis using some models from huggingface  designed for sementic analysis
4.   We compare the calculated score with the threshold score
5.   If the calculated score is greater than threshold score , we generate a comment from Gemini using a template instructing the beheaviour and personality and tone of comment
6.   We post comment
7.   We store reference to comment and post to a local sqlite database






In [24]:
#
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

threshold_alignment = 0.4
bot_keywords = ["Modi", "BJP", "Congress", "Elections", "Bollywood", "Kangana", "Adani", "Rahul Gandhi", "UPSC", "JNU", "India"]

# --- filtering logic ---
def postPassesFilter(post):
    return not post.stickied and not post.over_18 and len(post.title) > 10

# --- Semantic Relevance Check ---
def isRelevantToBot(title):
    title_embed = model.encode(title, convert_to_tensor=True)
    keywords_embed = model.encode(bot_keywords, convert_to_tensor=True)
    similarities = util.cos_sim(title_embed, keywords_embed)[0]
    max_score = float(similarities.max())
    return (max_score > threshold_alignment), max_score

# --- Gemini comment generation ---
def generateBotComment(title, content):
    genai.configure(api_key=gemini_api["api_key"])
    model = genai.GenerativeModel("gemini-1.5-flash")

    rag_prompt = f"""
      You're an informed Reddit user from India, active in discussions around politics, pop culture, education, and current events.

      You tend to:
      - Share thoughtful, nuanced takes.
      - Speak like a real person (not a bot).
      - Keep things casual but insightful.
      - Avoid echoing propaganda or taking extreme sides.
      - Show awareness of both cultural and factual context.

      Now, here’s a post you might want to respond to:

      **Title**:
      {title}

      **Content**:
      {content}

      Craft a natural-sounding Reddit comment that:
      - Feels human and opinionated, not robotic.
      - Shows that you’ve understood the topic.
      - Might ask a question, add a take, or start a conversation.
      - Doesn’t exceed 4–5 sentences.

      Keep it real, like something you'd actually post.
    """

    try:
        response = model.generate_content(rag_prompt)
        return response.text.strip()
    except Exception as e:
        print(f"[Gemini Error] {e}")
        return None

# --- sqlite DB initialize ---
def initDB():
    conn = sqlite3.connect("bot_results.db")
    cur = conn.cursor()
    cur.execute("""CREATE TABLE IF NOT EXISTS bot_comments (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        timestamp TEXT,
        subreddit TEXT,
        post_title TEXT,
        post_url TEXT,
        comment_url TEXT
    )""")
    conn.commit()
    conn.close()

# --- store to sqlite ---
def saveToDB(subreddit, post_title, post_url, comment_url):
    conn = sqlite3.connect("bot_results.db")
    cur = conn.cursor()
    cur.execute("INSERT INTO bot_comments (timestamp, subreddit, post_title, post_url, comment_url) VALUES (?, ?, ?, ?, ?)",
                (datetime.utcnow().isoformat(), subreddit, post_title, post_url, comment_url))
    conn.commit()
    conn.close()

# --- Bot Class ---
class BotDemo:
    def __init__(self):
        self.botConfig = botConfig
        self.multiBot = len(self.botConfig) > 1

    # initialize praw
    def getPrawInstance(self, bot):
        return praw.Reddit(
            client_id=bot["client_id"],
            client_secret=bot["secret"],
            password=bot["password"],
            user_agent="botDemoUserAgent",
            username=bot["username"]
        )

    # scan relevant posts from sub-reddits
    def scanSubreddit(self, bot_info, sub_list):
        reddit = self.getPrawInstance(bot_info)
        username = bot_info["username"]

        for sub in sub_list:
            print(f"[{username}] Scanning /r/{sub}")
            # mechanism to keep searching 5 batches to get at least 3 relevant posts to comment on
            relevant_posts = []
            attempts = 0
            max_attempts = 5

            while len(relevant_posts) < 3 and attempts < max_attempts:
                try:
                    for post in reddit.subreddit(sub).new(limit=10):
                        if not postPassesFilter(post):
                            continue

                        is_relevant, score = isRelevantToBot(post.title)
                        if is_relevant:
                            print(f"  ✅ [{username}] Relevant: {post.title} (score: {score:.2f})")

                            # Generate comment
                            comment_text = generateBotComment(post.title, post.selftext or "[No Content]")
                            if comment_text:
                                try:
                                    comment = post.reply(comment_text)
                                    print(f"    💬 Commented: {comment.permalink}")
                                    saveToDB(sub, post.title, post.permalink, comment.permalink)
                                except Exception as e:
                                    print(f"    ❌ Failed to comment: {e}")
                            else:
                                print(f"    ⚠️ Skipped: Gemini failed to generate")

                            relevant_posts.append(post)
                        else:
                            print(f"  ❌ [{username}] Not relevant: {post.title}")
                        if len(relevant_posts) >= 3:
                            break
                    attempts += 1
                    time.sleep(2)
                except Exception as e:
                    print(f"[{username}] Error scanning /r/{sub}: {e}")
                    break

    def _startScan(self):
        if not self.multiBot:
            self.scanSubreddit(self.botConfig[0], subReddits)
            return

        sub_copy = subReddits.copy()
        random.shuffle(sub_copy)
        bot_count = len(self.botConfig)
        jobs = [[] for _ in range(bot_count)]

        for sub in sub_copy:
            rand_idx = random.randint(0, bot_count - 1)
            jobs[rand_idx].append(sub)

        threads = []
        for bot, sub_list in zip(self.botConfig, jobs):
            if sub_list:
                t = threading.Thread(target=self.scanSubreddit, args=(bot, sub_list))
                t.start()
                threads.append(t)

        for t in threads:
            t.join()

# --- Init DB and Run ---
initDB()
demo = BotDemo()
demo._startScan()


It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



[Ok_Succotash2381] Scanning /r/India
  ❌ [Ok_Succotash2381] Not relevant: Thar Driver Deliberately Rams Into Elderly Man After Hitting His Scooter
  ❌ [Ok_Succotash2381] Not relevant: Worst day of my life – never imagined this over a rent dispute
  ❌ [Ok_Succotash2381] Not relevant: Indian news channels should expose companies exploiting job seekers through fake "assignments" in hiring processes
  ❌ [Ok_Succotash2381] Not relevant: A Jammu family’s fight to get their mother back from Pakistan | In India on a long-term visa since 1989, Rakshanda Rashid was deported after the Pahalgam attack. The Centre has challenged a HC order to repatriate her.
  ❌ [Ok_Succotash2381] Not relevant: Woman migrant worker, minor son ‘beaten up’ by Delhi police
  ❌ [Ok_Succotash2381] Not relevant: General Manager of HR manhandled and obstructed my movement - what are my legal options?
  ❌ [Ok_Succotash2381] Not relevant: Infosys is utterly bad
  ❌ [Ok_Succotash2381] Not relevant: Frequent delivery issues o

It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



    💬 Commented: /r/india/comments/1mbhg8c/doubleengine_govt_has_betrayed_odisha_naveen/n5msxjp/
  ❌ [Ok_Succotash2381] Not relevant: My uncle is battling for his life, please read


It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



  ❌ [Ok_Succotash2381] Not relevant: Thar Driver Deliberately Rams Into Elderly Man After Hitting His Scooter
  ❌ [Ok_Succotash2381] Not relevant: Worst day of my life – never imagined this over a rent dispute
  ❌ [Ok_Succotash2381] Not relevant: Indian news channels should expose companies exploiting job seekers through fake "assignments" in hiring processes
  ❌ [Ok_Succotash2381] Not relevant: A Jammu family’s fight to get their mother back from Pakistan | In India on a long-term visa since 1989, Rakshanda Rashid was deported after the Pahalgam attack. The Centre has challenged a HC order to repatriate her.
  ❌ [Ok_Succotash2381] Not relevant: Woman migrant worker, minor son ‘beaten up’ by Delhi police
  ❌ [Ok_Succotash2381] Not relevant: General Manager of HR manhandled and obstructed my movement - what are my legal options?
  ❌ [Ok_Succotash2381] Not relevant: Infosys is utterly bad
  ❌ [Ok_Succotash2381] Not relevant: Frequent delivery issues on amazon
  ✅ [Ok_Succotash2381] Relev

It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



    ❌ Failed to comment: RATELIMIT: "Looks like you've been doing that a lot. Take a break for 9 minutes before trying again." on field 'ratelimit'
  ❌ [Ok_Succotash2381] Not relevant: My uncle is battling for his life, please read


It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



  ❌ [Ok_Succotash2381] Not relevant: Thar Driver Deliberately Rams Into Elderly Man After Hitting His Scooter
  ❌ [Ok_Succotash2381] Not relevant: Worst day of my life – never imagined this over a rent dispute
  ❌ [Ok_Succotash2381] Not relevant: Indian news channels should expose companies exploiting job seekers through fake "assignments" in hiring processes
  ❌ [Ok_Succotash2381] Not relevant: A Jammu family’s fight to get their mother back from Pakistan | In India on a long-term visa since 1989, Rakshanda Rashid was deported after the Pahalgam attack. The Centre has challenged a HC order to repatriate her.
  ❌ [Ok_Succotash2381] Not relevant: Woman migrant worker, minor son ‘beaten up’ by Delhi police
  ❌ [Ok_Succotash2381] Not relevant: General Manager of HR manhandled and obstructed my movement - what are my legal options?
  ❌ [Ok_Succotash2381] Not relevant: Infosys is utterly bad
  ❌ [Ok_Succotash2381] Not relevant: Frequent delivery issues on amazon
  ✅ [Ok_Succotash2381] Relev

It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.

It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



    ❌ Failed to comment: RATELIMIT: "Looks like you've been doing that a lot. Take a break for 9 minutes before trying again." on field 'ratelimit'


It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



[Ok_Succotash2381] Scanning /r/AskReddit
  ❌ [Ok_Succotash2381] Not relevant: What are you doing on your phone when walking around?
  ❌ [Ok_Succotash2381] Not relevant: If you found out you had a serious, hard‑to‑treat illness and were told you only had a few months left to live — with a 14‑year‑old daughter, no father in the picture, but a sister by your side — how would you spend the time you have left? What advice would you give to someone in that situation?
  ❌ [Ok_Succotash2381] Not relevant: What do you do to study for longer periods of time without getting distracted ?
  ❌ [Ok_Succotash2381] Not relevant: What fashion trends do you think should make a comeback, and which ones should stay gone forever?
  ❌ [Ok_Succotash2381] Not relevant: What do you have in your pockets RIGHT NOW?
  ❌ [Ok_Succotash2381] Not relevant: What’s something you do everyday you wish you could get an allowance for?
  ❌ [Ok_Succotash2381] Not relevant: If they held another LiveAid charity concert in 2025,

It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



  ❌ [Ok_Succotash2381] Not relevant: What are you doing on your phone when walking around?
  ❌ [Ok_Succotash2381] Not relevant: If you found out you had a serious, hard‑to‑treat illness and were told you only had a few months left to live — with a 14‑year‑old daughter, no father in the picture, but a sister by your side — how would you spend the time you have left? What advice would you give to someone in that situation?
  ❌ [Ok_Succotash2381] Not relevant: What do you do to study for longer periods of time without getting distracted ?
  ❌ [Ok_Succotash2381] Not relevant: What fashion trends do you think should make a comeback, and which ones should stay gone forever?
  ❌ [Ok_Succotash2381] Not relevant: What do you have in your pockets RIGHT NOW?
  ❌ [Ok_Succotash2381] Not relevant: What’s something you do everyday you wish you could get an allowance for?
  ❌ [Ok_Succotash2381] Not relevant: If they held another LiveAid charity concert in 2025, who would perform?
  ❌ [Ok_Succotash2381

It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



  ❌ [Ok_Succotash2381] Not relevant: Do you collect or play Pokemon & when did you start ?
  ❌ [Ok_Succotash2381] Not relevant: What are you doing on your phone when walking around?
  ❌ [Ok_Succotash2381] Not relevant: If you found out you had a serious, hard‑to‑treat illness and were told you only had a few months left to live — with a 14‑year‑old daughter, no father in the picture, but a sister by your side — how would you spend the time you have left? What advice would you give to someone in that situation?
  ❌ [Ok_Succotash2381] Not relevant: What do you do to study for longer periods of time without getting distracted ?
  ❌ [Ok_Succotash2381] Not relevant: What fashion trends do you think should make a comeback, and which ones should stay gone forever?
  ❌ [Ok_Succotash2381] Not relevant: What do you have in your pockets RIGHT NOW?
  ❌ [Ok_Succotash2381] Not relevant: What’s something you do everyday you wish you could get an allowance for?
  ❌ [Ok_Succotash2381] Not relevant: If 

It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



  ❌ [Ok_Succotash2381] Not relevant: Do you collect or play Pokemon & when did you start ?
  ❌ [Ok_Succotash2381] Not relevant: What are you doing on your phone when walking around?
  ❌ [Ok_Succotash2381] Not relevant: If you found out you had a serious, hard‑to‑treat illness and were told you only had a few months left to live — with a 14‑year‑old daughter, no father in the picture, but a sister by your side — how would you spend the time you have left? What advice would you give to someone in that situation?
  ❌ [Ok_Succotash2381] Not relevant: What do you do to study for longer periods of time without getting distracted ?
  ❌ [Ok_Succotash2381] Not relevant: What fashion trends do you think should make a comeback, and which ones should stay gone forever?
  ❌ [Ok_Succotash2381] Not relevant: What do you have in your pockets RIGHT NOW?
  ❌ [Ok_Succotash2381] Not relevant: What’s something you do everyday you wish you could get an allowance for?
  ❌ [Ok_Succotash2381] Not relevant: If 

It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



  ❌ [Ok_Succotash2381] Not relevant: If you could be a actor in the next part of a movie franchise that you love . Which movie franchise would you pick and why?
  ❌ [Ok_Succotash2381] Not relevant: Do you collect or play Pokemon & when did you start ?
  ❌ [Ok_Succotash2381] Not relevant: What are you doing on your phone when walking around?
  ❌ [Ok_Succotash2381] Not relevant: If you found out you had a serious, hard‑to‑treat illness and were told you only had a few months left to live — with a 14‑year‑old daughter, no father in the picture, but a sister by your side — how would you spend the time you have left? What advice would you give to someone in that situation?
  ❌ [Ok_Succotash2381] Not relevant: What do you do to study for longer periods of time without getting distracted ?
  ❌ [Ok_Succotash2381] Not relevant: What fashion trends do you think should make a comeback, and which ones should stay gone forever?
  ❌ [Ok_Succotash2381] Not relevant: What do you have in your pockets R