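# Reddit Post Analysis Using Open-Source Models (Llama 3.2, DeepSeek R1, Mistral 7B)

This notebook scrapes a Reddit post and its comments with PRAW, then asks three models served locally by Ollama to analyze the discussion. The system prompt does the following: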

1. **Sets the Role and Tone**  
   Instructs the AI to act as an **expert analyst** specializing in extracting insights from online forums like Reddit.

2. **Guides Sentiment Analysis**  
   Asks the AI to evaluate overall sentiment (e.g., positive, neutral, negative), and to present it as approximate percentages with a brief rationale.

3. **Groups and Labels Themes**  
   Instructs the AI to identify and cluster **key discussion themes**, perspectives, and emotional tones. Each theme should be explained and illustrated with **example comments**.

4. **Creates an Insights Table**  
   Requests a structured table with fields like *Perspectives, Frustrations, Tools, Suggestions* to concisely summarize the discussion’s core insights.

5. **Describes Community Dynamics**  
   Asks the AI to assess the **interaction style** (e.g., supportive, sarcastic, argumentative) and note any social patterns (e.g., consensus or conflict).

#### Imports

In [0]:
import praw
import os
from dotenv import load_dotenv
from IPython.display import Markdown, display
from openai import OpenAI
import ollama

#### Load Credentials

In [0]:
# Read the Reddit API credentials from a local .env file
load_dotenv(override=True)
reddit = praw.Reddit(
    client_id=os.getenv("REDDIT_CLIENT_ID"),
    client_secret=os.getenv("REDDIT_CLIENT_SECRET"),
    user_agent=os.getenv("REDDIT_USER_AGENT"),
    username=os.getenv("REDDIT_USERNAME"),
    password=os.getenv("REDDIT_PASSWORD"),
)

# reddit.user.me() returns the logged-in account, which confirms the credentials work
print("Authenticated as:", reddit.user.me())

In [0]:
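# Every model below is served locally by Ollama through its OpenAI-compatible endpoint,
# so no OpenAI API key is needed; the api_key value is only a required placeholder.
ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')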

#### Reddit Post Scraper

In [0]:
class RedditPostScraper:
    """Fetches a Reddit submission and formats its comments for prompting."""

    def __init__(self, url):
        self.submission = reddit.submission(url=url)
        # Expand every "load more comments" placeholder; this can be slow on large threads
        self.submission.comments.replace_more(limit=None)
        self._title = self.submission.title
        self._text = self.submission.selftext
        self._formatted_comments = []  # filled lazily by _generate_comments()

    def _generate_comments(self):
        """Format top-level comments and their direct replies; deeper replies are skipped."""
        comments_list = []
        for top_level in self.submission.comments:
            top_author = top_level.author.name if top_level.author else "[deleted]"
            comments_list.append(f"{top_author}: {top_level.body}")

            for reply in top_level.replies:
                reply_author = reply.author.name if reply.author else "[deleted]"
                comments_list.append(
                    f"{reply_author} replied to {top_author}'s comment: {reply.body}"
                )
        self._formatted_comments = comments_list

    def title(self):
        """Return the post title followed by its body text."""
        return f"Title:\n{self._title}\n{self._text}"

    def comments(self, max_words=None):
        """Return the formatted comments, stopping before max_words would be exceeded."""
        if not self._formatted_comments:
            self._generate_comments()

        output_comments = []
        total_words = 0

        for comment in self._formatted_comments:
            word_count = len(comment.split())
            # Keep whole comments only; stop at the first one that would overflow the budget
            if max_words is not None and total_words + word_count > max_words:
                break
            output_comments.append(comment)
            total_words += word_count

        return "Text:\n" + "\n\n".join(output_comments)
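
The word budget in `comments()` can be tried without Reddit credentials. The sketch below copies the same loop into a standalone function; `truncate_by_words` and the sample comments are hypothetical names and data made up for illustration, not part of the notebook.

```python
def truncate_by_words(comments, max_words=None):
    """Keep whole comments, in order, until the next one would exceed max_words."""
    kept = []
    total_words = 0
    for comment in comments:
        word_count = len(comment.split())
        # Stop at the first comment that would push the total over the budget
        if max_words is not None and total_words + word_count > max_words:
            break
        kept.append(comment)
        total_words += word_count
    return kept


# Invented sample data in the same "author: body" format the scraper produces
sample = [
    "alice: slow down before you stop",        # 6 words
    "bob replied to alice's comment: agreed",  # 6 words
    "carol: walking breaks are fine",          # 5 words
]

print(truncate_by_words(sample, max_words=12))  # keeps the first two comments
print(truncate_by_words(sample, max_words=10))  # keeps only the first comment
```

Because the loop breaks at the first comment that does not fit, a long early comment can cut off shorter ones that come after it. That keeps the thread order intact, at the cost of sometimes leaving part of the budget unused.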

#### System and User Prompt

In [0]:
system_prompt = '''You are an expert analyst specializing in extracting insights from online discussion forums. You will be given the title of a Reddit post and a list of comments (some with replies). Your task is to analyze the sentiment of the discussion and extract structured insights that reflect the collective responses.
Your response **must be in well-formatted Markdown**. Use clear section headers (`##`, `###`), bullet points, and tables where appropriate.
Perform the following tasks:
---
## 1. Overall Sentiment Breakdown
- Determine the overall sentiment of the responses (e.g., positive, negative, neutral, mixed).
- Express the sentiment as approximate percentages (e.g., 60% positive, 25% neutral, 15% negative).
- Provide a short explanation for why the sentiment skews this way, referring to tone, topic sensitivity, controversy, humor, or supportiveness.
---
## 2. Thematic Grouping of Comments
- Identify key recurring **themes, perspectives, or discussion threads** in the comments.
- For each theme, create a subheading.
- Under each:
  - Briefly describe the focus or tone of that cluster (e.g., personal stories, criticism, questions, jokes).
  - Include 1–2 **example comments** using quote formatting (`>`), preferably ones with replies or high engagement.
---
## 3. Insights Table
If applicable, extract and structure insights into the following table. Leave any column empty if it’s not relevant to the post type:
| Perspectives/ Motivations     | Pains/ Concerns/ Frustrations    | Tools / References / Resources       | Suggestions / Solutions            |
|-------------------------------|----------------------------------|--------------------------------------|------------------------------------|
| - ...                         | - ...                            | - ...                                | - ...                              |
- Populate this table with concise bullet points.
- Adapt categories to match the discussion type (e.g., switch "Suggestions" to "Reactions" if it's a news thread).
---
## 4. Tone and Community Dynamics
- Comment on the **style and culture** of interaction: humor, sarcasm, empathy, trolling, intellectual debate, etc.
- Mention any noticeable social dynamics: agreement/disagreement, echo chambers, respectful debate, or hostility.
- Include casual or emotional comments if they illustrate community personality.
---
**Respond only in well-formatted Markdown.** Structure your output for clarity and insight, suitable for rendering in documentation, reports, or dashboards. Do not summarize every comment — focus on patterns, perspectives, and collective signals.

'''

In [0]:
def user_prompt_for(post):
    user_prompt = f"You are looking at a Reddit discussion titled:\n\n{post.title()}\n\n"
    user_prompt += "Below are the responses from various users. Analyze them according to the system prompt provided.\n"
    user_prompt += "Make sure your response is structured in Markdown with headers, lists, and tables as instructed.\n\n"
    # Cap the comment text at about 1,000 words to keep the prompt short
    user_prompt += post.comments(max_words=1000)
    return user_prompt


In [0]:
# post = RedditPostScraper("https://www.reddit.com/r/running/comments/1l77osa/pushing_through_a_run/")
# print(post.title())
# print(post.comments())

#### Generating messages

In [0]:
def messages_for(post):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(post)}
    ]
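
To inspect the assembled messages before sending them to a model, a stand-in object with the same `title()` and `comments()` interface is enough. The sketch below is self-contained; `StubPost`, `build_messages`, the shortened prompt strings, and the sample comments are all hypothetical and exist only for illustration.

```python
class StubPost:
    """Hypothetical stand-in that mimics the title()/comments() interface of the scraper."""

    def __init__(self, title, body, comments):
        self._title = title
        self._body = body
        self._comments = comments

    def title(self):
        return f"Title:\n{self._title}\n{self._body}"

    def comments(self, max_words=None):
        # The stub ignores max_words; it only needs to return formatted text
        return "Text:\n" + "\n\n".join(self._comments)


def build_messages(system_prompt, post, max_words=1000):
    """Assemble the system and user messages in the chat-completions format."""
    user_prompt = (
        f"You are looking at a Reddit discussion titled:\n\n{post.title()}\n\n"
        "Below are the responses from various users.\n\n"
        + post.comments(max_words)
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


# Invented post content, used only to preview the prompt layout
stub = StubPost(
    "Pushing through a run",
    "How do you keep going?",
    ["alice: slow down", "bob replied to alice's comment: agreed"],
)
messages = build_messages("You are an expert analyst.", stub)
for message in messages:
    print(f"--- {message['role']} ---")
    print(message["content"])
```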

#### Llama 3.2

In [0]:
# Ollama exposes an OpenAI-compatible API on port 11434; the api_key is a placeholder
ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')


def summarizellama(url):
    post = RedditPostScraper(url)
    response = ollama_via_openai.chat.completions.create(
        model="llama3.2",
        messages=messages_for(post)
    )
    return response.choices[0].message.content

In [0]:
def display_summaryllama(url):
    summary = summarizellama(url)
    display(Markdown(summary))

In [0]:
display_summaryllama("https://www.reddit.com/r/running/comments/1l77osa/pushing_through_a_run/")

#### DeepSeek R1

In [0]:
# Ollama exposes an OpenAI-compatible API on port 11434; the api_key is a placeholder
ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')


def summarizedeepseek(url):
    post = RedditPostScraper(url)
    response = ollama_via_openai.chat.completions.create(
        model="deepseek-r1",
        messages=messages_for(post)
    )
    return response.choices[0].message.content

In [0]:
def display_summarydeepseek(url):
    summary = summarizedeepseek(url)
    display(Markdown(summary))

In [0]:
display_summarydeepseek("https://www.reddit.com/r/running/comments/1l77osa/pushing_through_a_run/")
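
Reasoning models such as DeepSeek R1 may return their chain of thought wrapped in `<think>...</think>` tags ahead of the actual answer, depending on the Ollama version and settings. If that text shows up in the rendered Markdown, it can be removed before display. The sketch below assumes that tag format; `strip_reasoning` is a hypothetical helper and the sample string is invented.

```python
import re

# Matches one <think>...</think> block, including any newlines inside it
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_reasoning(text):
    """Remove <think>...</think> blocks and trim the surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()


# Invented sample resembling a reasoning model's raw reply
raw = (
    "<think>\nMostly supportive replies...\n</think>\n\n"
    "## 1. Overall Sentiment Breakdown\n- 70% positive"
)
cleaned = strip_reasoning(raw)
print(cleaned)
```

One way to apply it would be `display(Markdown(strip_reasoning(summary)))` inside `display_summarydeepseek`. Text without the tags passes through unchanged.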

#### Mistral 7B

In [0]:
!ollama pull mistral:7b

In [0]:
# Ollama exposes an OpenAI-compatible API on port 11434; the api_key is a placeholder
ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')


def summarizeMistral(url):
    post = RedditPostScraper(url)
    response = ollama_via_openai.chat.completions.create(
        model="mistral:7b",
        messages=messages_for(post)
    )
    return response.choices[0].message.content

In [0]:
def display_summaryMistral(url):
    summary = summarizeMistral(url)
    display(Markdown(summary))

In [0]:
display_summaryMistral("https://www.reddit.com/r/running/comments/1l77osa/pushing_through_a_run/")
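
The three `summarize...` functions differ only in the model name, so they could be folded into a single helper that takes the client and model as arguments. The sketch below is one possible shape, not the notebook's own code; `summarize_with` and `FakeCompletions` are hypothetical names, and the fake client stands in for the Ollama endpoint so the example runs without a server.

```python
from types import SimpleNamespace


def summarize_with(client, model, messages):
    """Send the same messages to any model behind a chat-completions-style client."""
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content


class FakeCompletions:
    """Hypothetical stand-in that mimics the shape of a chat completion response."""

    def create(self, model, messages):
        reply = SimpleNamespace(content=f"[{model}] received {len(messages)} messages")
        return SimpleNamespace(choices=[SimpleNamespace(message=reply)])


# Mirrors the client.chat.completions.create(...) attribute path used above
fake_client = SimpleNamespace(chat=SimpleNamespace(completions=FakeCompletions()))

messages = [
    {"role": "system", "content": "You are an expert analyst."},
    {"role": "user", "content": "Title:\nExample post"},
]
results = {
    model: summarize_with(fake_client, model, messages)
    for model in ("llama3.2", "deepseek-r1", "mistral:7b")
}
for model, summary in results.items():
    print(model, "->", summary)
```

In the notebook, `ollama_via_openai` would take the place of `fake_client` and `messages_for(post)` would supply the messages, so a post is scraped once and reused across all three models.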