<a href="https://colab.research.google.com/github/barbaroja2000/agents/blob/main/crew_ai_exa_anthropic_claude3_haiku_confluence.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Autonomous Research Agent

* Research agent with CrewAI as the orchestrator
* Search via [EXA.ai](https://exa.ai). Anecdotally, it seems to produce better-quality search results than Tavily, but YMMV.
* Super-fast [Anthropic Claude 3 Haiku model](https://www.anthropic.com/news/claude-3-haiku)
* Prompts (agent backstory and task) refined using the Anthropic [meta-prompt](https://colab.research.google.com/drive/1SoAajN8CBYTl79VyTwxtxncfCWlHlyy9)
* Publishes to Confluence

In [23]:
!pip install -Uq 'crewai[tools]' langchain-exa langchain-anthropic atlassian-python-api markdown2 bs4

In [24]:
#@title Imports

import json
import os

import markdown2
from atlassian import Confluence
from bs4 import BeautifulSoup
from crewai import Agent
from crewai_tools import tool
from exa_py import Exa

In [25]:
#@title Passwords etc

from google.colab import userdata
exa_api_key=userdata.get('exa_api_key')
anthropic_api_key=userdata.get('anthropic_api_key')

os.environ["CONFLUENCE_API_KEY"]=userdata.get('confluence_api_key')
os.environ["CONFLUENCE_USERNAME"]=userdata.get('confluence_username')
os.environ["CONFLUENCE_URI"]=userdata.get('confluence_uri')

In [26]:
#@title Tools

exa = Exa(api_key=exa_api_key)

@tool("EXA search tool")
def search(query: str):
    """Search for a webpage based on the query."""
    return exa.search(query, use_autoprompt=True, num_results=5)

@tool("EXA similar pages tool")
def find_similar(url: str):
    """Search for webpages similar to a given URL.
    The url passed in should be a URL returned from `search`.
    """
    return exa.find_similar(url, num_results=5)

@tool("EXA get page contents tool")
def get_contents(ids: list[str]):
    """Get the contents of a webpage.
    The ids passed in should be a list of ids returned from `search`.
    """
    return exa.get_contents(ids)

@tool("Confluence Publisher")
def confluence_publisher_tool(content: str) -> str:
    """Use this tool to publish to Confluence"""
    confluence_uri = os.environ["CONFLUENCE_URI"]
    username = os.environ["CONFLUENCE_USERNAME"]
    password = os.environ["CONFLUENCE_API_KEY"]
    space_key = 'AI'

    # Convert Markdown content of the Article to HTML
    html_content = markdown2.markdown(content)

    soup = BeautifulSoup(html_content, 'html.parser')

    # Extract the first <h1> tag content for the title
    h1_tag = soup.find('h1')
    page_title = h1_tag.text if h1_tag else 'No title found'

    # Remove the <h1> tag to get the rest of the HTML content without the title
    if h1_tag:
        h1_tag.decompose()

    html_content = str(soup)

    # Initialize Confluence client
    confluence = Confluence(
        url=confluence_uri,
        username=username,
        password=password
    )

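    # Look up the parent page ("Research") in the target space
    parent_info = confluence.get_page_by_title(space_key, "Research")
    if not parent_info:
        # Report the problem back to the agent instead of raising a TypeError
        return f"Parent page 'Research' not found in space '{space_key}'"

    # Extract the parent page ID
    parent_id = parent_info['id']

    # Create the page with HTML content (this does not update an existing page)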
    response = confluence.create_page(
        space_key,
        page_title,
        html_content,
        parent_id=parent_id,
        type='page',
        representation='storage',
        editor='v2',
        full_width=False
    )

    return json.dumps(response)
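
The publisher tool above uses the first top-level heading as the Confluence page title and publishes the remainder as the body. As a minimal, dependency-free sketch of that same split, the snippet below works on the Markdown text directly rather than on rendered HTML. The helper name `split_title` and the regex approach are assumptions for illustration, not what the notebook runs (the notebook uses `markdown2` and `BeautifulSoup`).

```python
import re


def split_title(markdown_text: str, default_title: str = "No title found"):
    """Return (title, body) by pulling out the first level-1 '# ' heading.

    Hypothetical helper for illustration only; the notebook's tool performs
    the equivalent step on rendered HTML.
    """
    # Match the first line that starts with a single '#' followed by whitespace.
    match = re.search(r"^#[ \t]+(.+?)[ \t]*$", markdown_text, flags=re.MULTILINE)
    if match is None:
        # No level-1 heading: fall back to the default title, body unchanged.
        return default_title, markdown_text
    title = match.group(1)
    # Drop the heading line itself and keep everything else as the body.
    body = markdown_text[:match.start()] + markdown_text[match.end():]
    return title, body.strip()


title, body = split_title("# Sample Article\n\nFirst paragraph.")
print(title)
print(body)
```

This sketch would also match a `# ` line inside a fenced code block, which is one reason parsing the rendered HTML, as the tool does, is the more robust choice.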

In [27]:
#@title Model

from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model='claude-3-haiku-20240307', anthropic_api_key=anthropic_api_key)

In [28]:
#@title Agents

max_iter=10

# Creating a senior researcher agent with memory and verbose mode
researcher = Agent(
  role='Senior Researcher',
  goal='Efficiently gather, synthesize, and present the most relevant and accurate information about {topic}',
  verbose=True,
  memory=True,
  max_iter=max_iter,
  backstory=("""I am an advanced AI research assistant developed by leading experts in the field of artificial intelligence. My primary function is to support users in their research endeavors by providing quick access to accurate and up-to-date information from a vast knowledge base spanning various academic and professional disciplines.
<traits>
- Highly knowledgeable and well-versed in various academic and professional disciplines
- Utilizes advanced AI techniques for natural language processing, information retrieval, and knowledge representation
- Efficient and precise in searching, synthesizing, and presenting complex information
- Skilled at breaking down complex topics and identifying cross-disciplinary connections
- Focused on providing reliable and accurate research assistance to users
- Aims to empower users to make informed decisions and facilitate new discoveries
- Adaptable to a wide range of research needs and user requirements
- Clear, organized, and detailed in its communication style
</traits>"""
  ),
  tools=[search],
  allow_delegation=False,
  llm=model

)

# Creating a writer agent (no tools, delegation disabled)
writer = Agent(
  role='Writer',
  goal='Narrate compelling tech stories about {topic}',
  verbose=True,
  memory=True,
  max_iter=max_iter,
  backstory=("""Expert in tech content strategy with a deep understanding of the tech industry.
Skills:
Converts complex technical concepts into engaging, easy-to-understand content.
Expert in structuring articles for readability and engagement.
Uses subheadings, bullet points, and lists for clarity.
Applies analogies and real-world examples to demystify technical topics.
Incorporates visuals like diagrams and infographics to support text.
Varies sentence structure for dynamic reading.
Signature Style:
Strong openings and closings that resonate with readers.
Ability to refine drafts into polished pieces.
Meticulous attention to detail, aiming to inform, engage, and inspire.
Impact: Transforms technical, dry subjects into compelling reads, making technology accessible to a broader audience.
    """
  ),
  allow_delegation=False,
  llm=model
)


# Creating a publisher agent
publisher = Agent(
  role='Publisher',
  goal='Publishes research to Confluence',
  verbose=True,
  memory=True,
  max_iter=max_iter,
  backstory=("""
    I am an AI Publisher. My sole function is to publish finished markdown articles to Confluence.
    """
  ),
  tools=[confluence_publisher_tool],
  allow_delegation=False,
  llm=model
)

In [29]:
#@title Tasks

from crewai import Task

# Research task
research_task = Task(
  description=("""
        To conduct your research:
Identify the key concepts, terminology, and areas of study related to {topic}. Break the topic down into its core components and areas of focus.
Search for and gather information from authoritative sources including academic journals, subject matter experts, and other credible information repositories. Prioritize primary sources where possible.
Critically evaluate the information you find - check for credibility, potential biases, and the strength of evidence presented. Synthesize information from multiple reliable sources.
Organize the information in a structured way, connecting related concepts and ideas. Identify key theories, concepts, frameworks, debates, open questions and areas of consensus vs. disagreement within the field.
Summarize your findings, capturing the current state of knowledge on {topic}. Note key takeaways, implications, limitations in the current research, and potential areas for further investigation.
Identify a set of authoritative references on {topic} that could be used for further reading and to support the information presented.
Present your research findings in a format that is clear, comprehensive, well-structured and fully referenced. The output should demonstrate an expert-level understanding of {topic} suitable for an informed layperson audience.
        """

  ),
  expected_output='A comprehensive 12 paragraph long report on {topic}.',
  tools=[search],
  agent=researcher,
)

# Writing task; the article is also saved to a Markdown file
write_task = Task(
  description=("""
  Develop a concise, informative article that highlights the most significant information and advancements from the research specialist's report. Synthesize the key facts into an accessible narrative aimed at a tech-savvy audience, capturing the essence of these breakthroughs and their implications for both research and industry. Maintain a factual, clear tone that informs readers without excessive hype.

The article should include:

Start with a concise hook that immediately draws attention and establishes the significance of the topic
Use clear, descriptive subheadings for each section to guide readers through the narrative
8 focused paragraphs that distill the most important developments and their potential impact
Accessible explanations of complex concepts for a tech-literate but non-specialist audience
A forward-looking angle on how these advancements may shape the future of the field
A bulleted summary at the end with 4-5 key takeaways for readers
Aim for an article length of approximately 8 paragraphs. Cite specific information from the research report while crafting an original narrative arc. The final piece should be a polished, professional article ready for publication.
  """
  ),
  expected_output="""A comprehensive 8 paragraph article, with 4-5 key points bulleted at the end. All formatted in markdown""",
  agent=writer,
  async_execution=False,
  output_file='new-blog-post.md'  # Example of output customization
)


# Publishing task
publish_task = Task(
  description=(
"Publish the finished article to Confluence"
  ),
  expected_output='An article published to Confluence.',
  tools=[confluence_publisher_tool],
  agent=publisher,
  async_execution=False
)
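
The `{topic}` placeholders in the agent goals and in the task descriptions and expected outputs are filled in from the `inputs` dict passed when the crew is kicked off. As a rough illustration of the effect, the sketch below uses plain `str.format` as a stand-in; it is not CrewAI's internal code, and the variable names are made up for the example.

```python
# Illustration only: plain str.format stands in for CrewAI's interpolation.
goal_template = (
    "Efficiently gather, synthesize, and present the most relevant "
    "and accurate information about {topic}"
)
expected_output_template = "A comprehensive 12 paragraph long report on {topic}."

# Same shape as the dict handed to crew.kickoff(inputs=...)
inputs = {"topic": "Microsoft AutoDev"}

goal = goal_template.format(**inputs)
expected_output = expected_output_template.format(**inputs)

print(goal)
print(expected_output)
```

Because the same templates are reused for every run, changing the research subject only requires changing the `topic` value.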


In [30]:
#@title Crew

from crewai import Crew, Process

# Forming the tech-focused crew with enhanced configurations
crew = Crew(
  agents=[researcher, writer, publisher],
  tasks=[research_task, write_task, publish_task],
  process=Process.sequential  # Optional: Sequential task execution is default
)
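
With a sequential process, the tasks run in the order listed and each task's result becomes context for the next one. As a rough mental model only, the sketch below chains three hypothetical stand-in functions; CrewAI's real executor involves LLM calls, tool use, and richer context handling that this omits.

```python
from typing import Callable, List


def run_sequential(steps: List[Callable[[str], str]], initial_input: str) -> str:
    """Run each step in order, feeding the previous result into the next."""
    result = initial_input
    for step in steps:
        result = step(result)
    return result


# Hypothetical stand-ins for the work done by the three agents.
def research(topic: str) -> str:
    return f"report on {topic}"


def write(report: str) -> str:
    return f"article based on {report}"


def publish(article: str) -> str:
    return f"published: {article}"


final = run_sequential([research, write, publish], "Microsoft AutoDev")
print(final)
```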



In [31]:
#@title Kick off

# Starting the task execution process with enhanced feedback
result = crew.kickoff(inputs={'topic': 'Microsoft AutoDev'})
print(result)



> Entering new CrewAgentExecutor chain...
Thought: To efficiently gather information about Microsoft AutoDev, I will use the EXA search tool to find relevant and authoritative sources.

Action: [EXA search tool]
Action Input: {"query": "Microsoft AutoDev"}

Action '[EXA search tool]' don't exist, these are the only available Actions: EXA search tool: EXA search tool(query: 'string') - Search for a webpage based on the query.

Thought: To efficiently gather information about Microsoft AutoDev, I will use the EXA search tool to find relevant and authoritative sources.

Action: EXA search tool
Action Input: {"query": "Microsoft AutoDev"}

Title: What is automated ML? AutoML - Azure Machine Learning
URL: https://learn.microsoft.com/en-us/azure/machine-learning/concept-automated-ml
ID: 1ZBeyx1oMqIgPTZNbWSsAg
Score: 0.20764565467834473
Published Date: 2023-02-24
Author: None
Text: None
Highlights: None
Highlight Scores: None


Ti