<a href="https://colab.research.google.com/github/wsybok/CrewAI/blob/main/crewAI_Reddit.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install praw
!pip install langchain
!pip install crewai
!pip install asyncpraw



In [None]:
import praw
import time
import os
import asyncio
import asyncpraw

In [None]:
from langchain.tools import tool
from langchain.llms import Ollama
from crewai import Agent, Task, Process, Crew
from google.colab import userdata
from langchain.agents import load_tools

In [None]:
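api = userdata.get('OPENAI_API_KEY')
os.environ["OPENAI_API_KEY"] = api  # Export the Colab secret so libraries that read OPENAI_API_KEY from the environment can find it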

In [None]:
human_tools = load_tools(["human"])

In [None]:
# Read the key from the environment (None if OPENAI_API_KEY has not been set)
api = os.environ.get("OPENAI_API_KEY")
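
The same environment lookup could also supply the Reddit credentials that the scraping tool needs, so they do not have to live in the notebook. The helper below is a minimal sketch: the function name and the variable names `REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, and `REDDIT_USER_AGENT` are assumptions for illustration, not something the notebook defines.

```python
import os


def load_reddit_credentials(env=os.environ):
    """Build keyword arguments for praw.Reddit from environment variables.

    The variable names used here are hypothetical; substitute the names
    under which your own secrets are stored.
    """
    required = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of a cryptic auth error later.
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "client_id": env["REDDIT_CLIENT_ID"],
        "client_secret": env["REDDIT_CLIENT_SECRET"],
        # Fall back to the user agent string the notebook already uses.
        "user_agent": env.get("REDDIT_USER_AGENT", "user-agent"),
    }
```

With the variables set, the client could then be created as `praw.Reddit(**load_reddit_credentials())`.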

In [None]:
class BrowserTool:
    @tool("Scrape reddit content")
    def scrape_reddit(max_comments_per_post=7):
        """Useful for scraping hot posts and their comments from the LocalLLaMA subreddit"""
        reddit = praw.Reddit(
            client_id="hrwfUNc2xubrhwrlHDStNA",
            client_secret="QzqidqUIkYuwN3g4MoNlm2rLARgw_A",
            user_agent="user-agent",
        )
        subreddit = reddit.subreddit("LocalLLaMA")
        scraped_data = []

        for post in subreddit.hot(limit=12):
            post_data = {"title": post.title, "url": post.url, "comments": []}

            try:
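                post.comments.replace_more(limit=0)  # Drop "load more comments" placeholders without fetching them
                comments = post.comments.list()  # Flattened list of the loaded comments, replies included
                if max_comments_per_post is not None:
                    comments = comments[:int(max_comments_per_post)]  # Honor the argument instead of a fixed 7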

                for comment in comments:
                    post_data["comments"].append(comment.body)

                scraped_data.append(post_data)

            except praw.exceptions.APIException as e:
                print(f"API Exception: {e}")
                time.sleep(60)  # Back off for 1 minute, then continue with the next post (the failed post is not retried)

        return scraped_data
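
The tool above sleeps after an API error but then moves on to the next post. If an actual retry is wanted, a small wrapper along these lines could be used. This is a sketch under assumptions: `call_with_retry` and its parameters are hypothetical names, and the `sleep` argument exists only so the waiting can be replaced in tests.

```python
import time


def call_with_retry(func, retries=3, delay=60.0, exceptions=(Exception,), sleep=time.sleep):
    """Call func() and retry on the given exceptions, waiting `delay` seconds between attempts."""
    for attempt in range(1, retries + 1):
        try:
            return func()
        except exceptions:
            if attempt == retries:
                # Out of attempts: let the caller see the original error.
                raise
            sleep(delay)
```

Inside the loop, the comment fetch could be wrapped as `call_with_retry(lambda: post.comments.replace_more(limit=0), exceptions=(praw.exceptions.APIException,))`.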

In [None]:
explorer = Agent(
    role="Senior Researcher",
    goal="Find and explore the most exciting projects and companies on LocalLLama subreddit in 2024",
    backstory="""You are an expert strategist who knows how to spot emerging trends and companies in AI, tech, and machine learning.
    You're great at finding interesting, exciting projects on the LocalLLama subreddit. You turn scraped data into detailed reports with names
    of the most exciting projects and companies in the AI/ML world. ONLY use scraped data from the LocalLLama subreddit for the report.
    """,
    verbose=True,
    allow_delegation=False,
    tools=[BrowserTool().scrape_reddit] + human_tools
)


In [None]:
writer = Agent(
    role="Senior Technical Writer",
    goal="Write engaging and interesting blog posts about the latest AI projects using simple, layman vocabulary",
    backstory="""You are an expert writer on technical innovation, especially in the field of AI and machine learning. You know how to write in
    an engaging, interesting, but simple, straightforward, and concise manner. You know how to present complicated technical terms to the general audience in a
    fun way by using layman words. ONLY use scraped data from the LocalLLama subreddit for the blog.""",
    verbose=True,
    allow_delegation=True
)

In [None]:
critic = Agent(
    role="Expert Writing Critic",
    goal="Provide feedback and criticize blog post drafts. Ensure that the tone and writing style are compelling, simple, and concise",
    backstory="""You are an expert at providing feedback to technical writers. You can identify when a blog text isn't concise,
    simple, or engaging enough. You know how to provide helpful feedback that can improve any text. You ensure that the text
    remains technical and insightful while using layman terms.""",
    verbose=True,
    allow_delegation=True
)

In [None]:
task_report = Task(
    description="""Use and summarize scraped data from subreddit LocalLLama to make a detailed report on the latest rising projects in AI. Use ONLY scraped data from LocalLLama to generate the report. Your final answer MUST be a full analysis report, text only, ignore any code or anything that isn't text. The report has to have bullet points and with 5-10 exciting new AI projects and tools. Write names of every tool and project. Each bullet point MUST contain 3 sentences that refer to one specific AI company, product, model or anything you found on subreddit LocalLLama.""",
    agent=explorer,
    expected_output="""A text-based analysis report with the following structure:
    - Title of the report
    - Introduction
    - Bullet points listing 5-10 new AI projects and tools
      - Each bullet point must include:
        - The name of the AI company, product, or model
        - 3 sentences describing the project or tool
    - Conclusion
    The report should be detailed and concise, summarizing the most exciting new AI projects and tools found on the LocalLLama subreddit."""
)

In [None]:
task_blog = Task(
    description="""Write a blog article with text only and with a short but impactful headline and at least 10 paragraphs. The blog should summarize
    the report on the latest AI tools found on the LocalLLama subreddit. The style and tone should be compelling and concise, fun, technical but also use
    layman words for the general public. Name specific new, exciting projects, apps, and companies in the AI world. Don't
    write "**Paragraph [number of the paragraph]:**", instead start the new paragraph in a new line. Write the names of projects and tools in BOLD.
    ALWAYS include links to the post page. ONLY include information from LocalLLaMA.""",

    agent=writer,
    expected_output="""A text-only blog article with the following structure:
    - A short but impactful headline
    - At least 10 paragraphs summarizing the report on AI tools from the LocalLLama subreddit
    - The article should:
      - Be compelling and concise
      - Use layman terms while being fun and technical
      - Highlight new and exciting AI projects, apps, and companies
      - Include the names of projects and tools in bold
      - Include links to projects, tools, and research papers
      - ONLY use information from LocalLLaMA
    The format of the blog content should follow this markdown structure:
    ```
    ## [Title of post](link to post)
    - Interesting facts
    - Own thoughts on how it connects to the overall theme of the newsletter
    ## [Title of second post](link to post)
    - Interesting facts
    - Own thoughts on how it connects to the overall theme of the newsletter
    ```"""
)


In [None]:
task_critique = Task(
    description="""The task entails summarizing and critiquing AI projects from the LocalLLama subreddit. The output must be formatted in Markdown and include interesting facts and personal thoughts on how each project connects to the overall theme of the newsletter.""",
    agent=critic,
    output_file='output.md',
    expected_output="""The Output MUST have the following markdown format:
    ```
    ## [Title of post](link to post)
    - Interesting facts
    - Own thoughts on how it connects to the overall theme of the newsletter
    ## [Title of second post](link to post)
    - Interesting facts
    - Own thoughts on how it connects to the overall theme of the newsletter
    ```
    The critique should be compelling and concise, with explanations in layman terms, summarizing AI projects from the LocalLLama subreddit, with sections for interesting facts, key features, and potential impacts."""
)
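
Because the critic's expected output is a fixed Markdown layout, the written `output.md` can be checked mechanically after the run. The following is a minimal sketch that only looks for `## [Title](link)` headings followed by bullet points; the function names are hypothetical and the check is deliberately loose.

```python
import re

# Matches a heading such as: ## [Title of post](https://example.com/post)
HEADING = re.compile(r"^## \[(?P<title>[^\]]+)\]\((?P<url>https?://[^)\s]+)\)$")


def parse_sections(markdown):
    """Split the text into sections, one per '## [Title](link)' heading."""
    sections = []
    for raw_line in markdown.splitlines():
        line = raw_line.strip()
        match = HEADING.match(line)
        if match:
            sections.append({"title": match["title"], "url": match["url"], "bullets": []})
        elif line.startswith("- ") and sections:
            # Attach the bullet to the most recent heading.
            sections[-1]["bullets"].append(line[2:])
    return sections


def format_problems(markdown):
    """Return a list of human-readable problems; an empty list means the layout looks right."""
    sections = parse_sections(markdown)
    if not sections:
        return ["no '## [Title](link)' headings found"]
    return [
        f"section '{section['title']}' has no bullet points"
        for section in sections
        if not section["bullets"]
    ]
```

After `crew.kickoff()` finishes, `format_problems(open('output.md').read())` would list any sections that drifted from the requested structure.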


In [None]:
crew = Crew(
    agents=[explorer, writer, critic],
    tasks=[task_report, task_blog, task_critique],
    verbose=2,
    process=Process.sequential,
)



In [None]:
result = crew.kickoff()

print("######################")
print(result)

[2024-07-27 09:23:45][DEBUG]: == Working Agent: Senior Researcher
[2024-07-27 09:23:45][INFO]: == Starting Task: Use and summarize scraped data from subreddit LocalLLama to make a detailed report on the latest rising projects in AI. Use ONLY scraped data from LocalLLama to generate the report. Your final answer MUST be a full analysis report, text only, ignore any code or anything that isn't text. The report has to have bullet points and with 5-10 exciting new AI projects and tools. Write names of every tool and project. Each bullet point MUST contain 3 sentences that refer to one specific AI company, product, model or anything you found on subreddit LocalLLama.


> Entering new CrewAgentExecutor chain...


It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



To create a detailed report on the latest rising projects in AI from the LocalLLama subreddit, I need to first scrape data from the subreddit. This will help me identify the most exciting projects and companies in AI, tech, and machine learning.

Action: Scrape reddit content
Action Input: {"max_comments_per_post": 7}


[{'title': 'Llama 3.1 Discussion and Questions Megathread', 'url': 'https://www.reddit.com/r/LocalLLaMA/comments/1eagjwg/llama_31_discussion_and_questions_megathread/', 'comments': ['Until they implement the new ROPE scaling algorithm, results of llama.cpp and exllamav2 inference will be similar or slightly inferior than LLama3, at least in all my benchmarks it shows that.', "I will add my experience with Llama-3.1-70b:\n\nI use the following quant:\n\n[https://huggingface.co/turboderp/Llama-3.1-70B-Instruct-exl2/tree/6.0bpw](https://huggingface.co/turboderp/Llama-3.1-70B-Instruct-exl2/tree/6.0bpw)\n\nSettings (text-generation-webui/exllamav2 dev branch): 64000 tokens window, auto-split, no cache quantization\n\nI have  4x3090 setup\n\nVram usage: 24x3 + 6gb = 78gb\n\nMy testing involves providing multiple chapters of a novel to the LLM. I then ask challenging questions, such as: asking it to list all characters in order of appearance.\n\nInitial impression: Very impressed by t
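
The tool returns a list of dictionaries with `title`, `url`, and `comments` keys, and individual comments can be long. One way to keep the text handed to the agent compact is to flatten and truncate it first. This is a sketch only: `summarize_posts` and its `max_chars` parameter are hypothetical and not part of the notebook.

```python
def summarize_posts(posts, max_chars=200):
    """Render scraped posts as indented text, truncating long comments."""
    lines = []
    for post in posts:
        lines.append(f"{post['title']} ({post['url']})")
        for comment in post["comments"]:
            # Collapse newlines and repeated spaces into single spaces.
            text = " ".join(comment.split())
            if len(text) > max_chars:
                text = text[:max_chars].rstrip() + "..."
            lines.append(f"  - {text}")
    return "\n".join(lines)
```

The tool could return `summarize_posts(scraped_data)` instead of the raw list if shorter observations are preferred.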

In [None]:
from google.colab import files
files.download('output.md')
