## Use Case: Research Paper Analysis and Visualization

*[Coding along with the Udemy online course [Mastering AutoGen: Building Multi-Agent Systems](https://www.udemy.com/course/autogen-agent-systems/?couponCode=MTST7102224A2) by Paulo Dichone]*

__Scenario:__ Agents should be able to research papers on a given topic, analyze them, and produce a visualization of the findings.

__Sequential Workflow of the Agents performing their task:__

1. The __User Proxy Agent__ initiates the chat (`initiate_chat`).
2. The __Assistant Agent__ finds papers on the topic.
3. It passes its findings to an __Assistant Agent__ that analyzes the specific applications studied in these papers.
4. Finally, an __Assistant Agent__ generates a bar chart and saves it.
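
The sequential workflow above can be sketched in plain Python as a chain in which each step receives the previous step's result. This is only an illustration: `run_pipeline` and the three stand-in step functions are hypothetical names, and no LLM or AutoGen call is made.

```python
from typing import Callable, List

# A step takes the previous result (a string) and returns a new one.
Step = Callable[[str], str]


def run_pipeline(initial_message: str, steps: List[Step]) -> List[str]:
    """Run the steps in order, feeding each one the previous step's output."""
    outputs = []
    carry = initial_message
    for step in steps:
        carry = step(carry)
        outputs.append(carry)
    return outputs


# Hypothetical stand-ins for the three agent tasks.
def find_papers(topic: str) -> str:
    return f"papers about {topic}"


def analyze_applications(papers: str) -> str:
    return f"applications found in {papers}"


def make_bar_chart(applications: str) -> str:
    return f"bar chart of {applications}"


if __name__ == "__main__":
    for line in run_pipeline(
        "machine learning in healthcare",
        [find_papers, analyze_applications, make_bar_chart],
    ):
        print(line)
```

The point of the sketch is the `carry` variable: what reaches a step is the output of the step before it, not the initial message.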

In [1]:
from autogen import AssistantAgent, UserProxyAgent
import pandas as pd
from pprint import pprint
import autogen

In [2]:
api_key = pd.read_csv("~/tmp/chat_gpt/agentic-design-1.txt", sep=" ", header=None)[0][0]
print("Don't be a fool and send your api key to GitHub!")

Don't be a fool and send your api key to GitHub!
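
An alternative to reading the key from a file with pandas is to take it from an environment variable, which keeps it out of both the notebook and the repository. A minimal sketch, assuming the key is exported as `OPENAI_API_KEY`; the helper name `load_api_key` is hypothetical.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises a RuntimeError with a readable message when the variable
    is missing or empty, instead of failing later inside an API call.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set.")
    return key
```

With this helper, `api_key = load_api_key()` could replace the `pd.read_csv` line, and the rest of the notebook would stay unchanged.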


In [3]:
llm_config = {
    "model": "gpt-4o",
    "temperature": 0.4,
    "api_key": api_key
    }
print("Don't be a fool and send your api key to GitHub!")

Don't be a fool and send your api key to GitHub!
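
The same settings can also be expressed with a `config_list` entry inside `llm_config`. A minimal sketch, assuming the `config_list` form supported by pyautogen 0.2; `build_llm_config` is a hypothetical helper, not part of AutoGen.

```python
from typing import Any, Dict


def build_llm_config(
    api_key: str, model: str = "gpt-4o", temperature: float = 0.4
) -> Dict[str, Any]:
    """Build an llm_config dict that keeps the model settings in a config_list."""
    return {
        # One entry per model endpoint; only a single entry is used here.
        "config_list": [{"model": model, "api_key": api_key}],
        "temperature": temperature,
    }
```

Building the dictionary in a function makes it easy to reuse the same configuration for every agent without repeating the key.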


In [4]:
def read_article(file_path):
    # read the article as UTF-8 text (it contains typographic quotes)
    with open(file_path, "r", encoding="utf-8") as file:
        return file.read()

In [5]:
# create an AssistantAgent instance named "assistant"
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
    # treat a missing or None content as an empty string to avoid a TypeError
    is_termination_msg=lambda x: "TERMINATE" in (x.get("content") or ""),
)

In [6]:
# create a UserProxyAgent instance named "user_proxy"
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    # treat a missing or None content as an empty string to avoid a TypeError
    is_termination_msg=lambda x: "TERMINATE" in (x.get("content") or ""),
    max_consecutive_auto_reply=10,
    code_execution_config={
        "work_dir": "work_dir",
        "use_docker": False,
    },
)
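
The termination check used by both agents can also be written as one named function. A minimal sketch; the helper name `is_termination_msg` is chosen here for illustration, and it assumes that a message's `content` may be missing or `None`, in which case applying `in` to the raw value would raise a `TypeError`.

```python
from typing import Any, Dict


def is_termination_msg(message: Dict[str, Any]) -> bool:
    """Return True when the message content contains the word TERMINATE."""
    # Fall back to an empty string when "content" is absent or None.
    content = message.get("content") or ""
    return "TERMINATE" in content


print(is_termination_msg({"content": "All done. TERMINATE"}))  # True
print(is_termination_msg({"content": None}))  # False
```

The function could then be passed to both agents as `is_termination_msg=is_termination_msg`, so the rule is defined in one place.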

In [7]:
# define the agents
content_analysis_agent = AssistantAgent(
    name="Content_Analysis_Agent",
    llm_config=llm_config,
    system_message="""
    You analyze the submitted article for structure, coherence, and completeness.
    """,
)

In [8]:
# task 1: Find research papers
task1 = """
Find arxiv papers that discuss the applications of machine learning in healthcare.
"""
user_proxy.initiate_chat(assistant, message=task1)

user_proxy (to assistant):


Find arxiv papers that discuss the applications of machine learning in healthcare.


--------------------------------------------------------------------------------
assistant (to user_proxy):

To find arXiv papers that discuss the applications of machine learning in healthcare, we can perform a search on the arXiv website using relevant keywords. I'll provide a Python script that uses the arXiv API to search for papers related to this topic and print their titles and abstracts.

```python
# filename: search_arxiv.py
import arxiv

def search_arxiv(query, max_results=10):
    search = arxiv.Search(
        query=query,
        max_results=max_results,
        sort_by=arxiv.SortCriterion.Relevance
    )
    
    for result in search.results():
        print(f"Title: {result.title}")
        print(f"Abstract: {result.summary}\n")
        print(f"Published: {result.published}")
        print(f"PDF: {result.pdf_url}\n")
        print("-" * 80)
```


ChatResult(chat_id=None, chat_history=[{'content': '\nFind arxiv papers that discuss the applications of machine learning in healthcare.\n', 'role': 'assistant', 'name': 'user_proxy'}, {'content': 'To find arXiv papers that discuss the applications of machine learning in healthcare, we can perform a search on the arXiv website using relevant keywords. I\'ll provide a Python script that uses the arXiv API to search for papers related to this topic and print their titles and abstracts.\n\n```python\n# filename: search_arxiv.py\nimport arxiv\n\ndef search_arxiv(query, max_results=10):\n    search = arxiv.Search(\n        query=query,\n        max_results=max_results,\n        sort_by=arxiv.SortCriterion.Relevance\n    )\n    \n    for result in search.results():\n        print(f"Title: {result.title}")\n        print(f"Abstract: {result.summary}\\n")\n        print(f"Published: {result.published}")\n        print(f"PDF: {result.pdf_url}\\n")\n        print("-" * 80)\n\n# Search for machin

In [9]:
# task 2: Analyze the results to list the specific healthcare applications
task2 = "Analyze the results to list the specific healthcare applications studied by these papers."
user_proxy.initiate_chat(assistant, message=task2, clear_history=False)

user_proxy (to assistant):

Analyze the results to list the specific healthcare applications studied by these papers.

--------------------------------------------------------------------------------
assistant (to user_proxy):

To analyze the results for specific healthcare applications studied by the papers, we need to extract and interpret the information from the titles and abstracts of the papers retrieved by the script. Here's a plan to achieve this:

1. **Run the Script**: Execute the provided Python script to get a list of papers related to machine learning applications in healthcare.

2. **Review the Output**: Look at the titles and abstracts printed by the script to identify specific healthcare applications mentioned.

3. **List the Applications**: Based on the information from the titles and abstracts, list the specific applications of machine learning in healthcare that are discussed in these papers.

Please execute the script and share the output here. Onc

In [10]:
# task 3: Generate a bar chart
task3 = """Use this data to generate a bar chart of healthcare applications and the number of papers in each application and save it to a file.
"""
user_proxy.initiate_chat(assistant, message=task3, clear_history=False)

user_proxy (to assistant):

Use this data to generate a bar chart of healthcare applications and the number of papers in each application and save it to a file.


--------------------------------------------------------------------------------
assistant (to user_proxy):

To generate a bar chart of healthcare applications and the number of papers in each application, we need to first extract the relevant information from the paper titles and abstracts. Here's a plan to achieve this:

1. **Extract Data**: Parse the titles and abstracts to identify specific healthcare applications mentioned in the papers.

2. **Count Occurrences**: Count the number of papers that discuss each application.

3. **Generate Bar Chart**: Use a plotting library to create a bar chart based on the counts.

4. **Save the Chart**: Save the generated chart to a file.

Since we don't have the output from the script execution, I'll provide a sample implementation that assumes we have a list of applic

In [11]:
# example usage
file_path = "../assets/data/_article.txt"
article_content = read_article(file_path)

In [12]:
initial_task = f"Analyze the following article for structure, coherence, and completeness: {article_content}"

In [13]:
content_result = user_proxy.initiate_chat(
    recipient=content_analysis_agent,
    message=initial_task,
    max_turns=2,
    summary_method="last_msg",
)

user_proxy (to Content_Analysis_Agent):

Analyze the following article for structure, coherence, and completeness: Signaling that investments in the supply chain sector remain robust, Pando, a startup developing fulfillment management technologies, today announced that it raised $30 million in a Series B round, bringing its total raised to $45 million.

Iron Pillar and Uncorrelated Ventures led the round, with participation from existing investors Nexus Venture Partners, Chiratae Ventures and Next47. CEO and founder Nitin Jayakrishnan says that the new capital will be put toward expanding Pando’s global sales, marketing and delivery capabilities.

“We will not expand into new industries or adjacent product areas,” he told TechCrunch in an email interview. “Great talent is the foundation of the business — we will continue to augment our teams at all levels of the organization. Pando is also open to exploring strategic partnerships and acquisitions with this round of funding.”



In [14]:
style_review_agent = AssistantAgent(
    name="Style_Review_Agent",
    llm_config=llm_config,
    system_message="""
    You review the article for language use, tone, and style consistency.
    """,
)

In [15]:
fact_checking_agent = AssistantAgent(
    name="Fact_Checking_Agent",
    llm_config=llm_config,
    system_message="""
    You verify the factual accuracy of the content.
    """,
)

In [16]:
editorial_feedback_agent = AssistantAgent(
    name="Editorial_Feedback_Agent",
    llm_config=llm_config,
    system_message="""
    You provide comprehensive feedback and suggestions for improvement.
    """,
)

In [17]:
final_review_agent = AssistantAgent(
    name="Final_Review_Agent",
    llm_config=llm_config,
    system_message="""
    You summarize the overall quality of the article and readiness for publication.
    """,
)

In [18]:
pprint(content_result.chat_history[0])

{'content': 'Analyze the following article for structure, coherence, and '
            'completeness: Signaling that investments in the supply chain '
            'sector remain robust, Pando, a startup developing fulfillment '
            'management technologies, today announced that it raised $30 '
            'million in a Series B round, bringing its total raised to $45 '
            'million.\n'
            '\n'
            'Iron Pillar and Uncorrelated Ventures led the round, with '
            'participation from existing investors Nexus Venture Partners, '
            'Chiratae Ventures and Next47. CEO and founder Nitin Jayakrishnan '
            'says that the new capital will be put toward expanding Pando’s '
            'global sales, marketing and delivery capabilities.\n'
            '\n'
            '“We will not expand into new industries or adjacent product '
            'areas,” he told TechCrunch in an email interview. “Great talent '
            'is the foundation o

In [19]:
style_result = user_proxy.initiate_chat(
    recipient=style_review_agent,
    message=content_result.chat_history[0],  # forwards only the first message (the original task); passing the whole ChatResult object leads to an error
    max_turns=2,
    summary_method="last_msg",
)

user_proxy (to Style_Review_Agent):

Analyze the following article for structure, coherence, and completeness: Signaling that investments in the supply chain sector remain robust, Pando, a startup developing fulfillment management technologies, today announced that it raised $30 million in a Series B round, bringing its total raised to $45 million.

Iron Pillar and Uncorrelated Ventures led the round, with participation from existing investors Nexus Venture Partners, Chiratae Ventures and Next47. CEO and founder Nitin Jayakrishnan says that the new capital will be put toward expanding Pando’s global sales, marketing and delivery capabilities.

“We will not expand into new industries or adjacent product areas,” he told TechCrunch in an email interview. “Great talent is the foundation of the business — we will continue to augment our teams at all levels of the organization. Pando is also open to exploring strategic partnerships and acquisitions with this round of funding.”

Pand

In [20]:
fact_result = user_proxy.initiate_chat(
    recipient=fact_checking_agent,
    message=style_result.chat_history[0],
    max_turns=2,
    summary_method="last_msg",
)

user_proxy (to Fact_Checking_Agent):

Analyze the following article for structure, coherence, and completeness: Signaling that investments in the supply chain sector remain robust, Pando, a startup developing fulfillment management technologies, today announced that it raised $30 million in a Series B round, bringing its total raised to $45 million.

Iron Pillar and Uncorrelated Ventures led the round, with participation from existing investors Nexus Venture Partners, Chiratae Ventures and Next47. CEO and founder Nitin Jayakrishnan says that the new capital will be put toward expanding Pando’s global sales, marketing and delivery capabilities.

“We will not expand into new industries or adjacent product areas,” he told TechCrunch in an email interview. “Great talent is the foundation of the business — we will continue to augment our teams at all levels of the organization. Pando is also open to exploring strategic partnerships and acquisitions with this round of funding.”

Pan

In [21]:
feedback_result = user_proxy.initiate_chat(
    recipient=editorial_feedback_agent,
    message=fact_result.chat_history[0],  # forwards only the first message (the original task); passing the whole ChatResult object leads to an error
    max_turns=2,
    summary_method="last_msg",
)

user_proxy (to Editorial_Feedback_Agent):

Analyze the following article for structure, coherence, and completeness: Signaling that investments in the supply chain sector remain robust, Pando, a startup developing fulfillment management technologies, today announced that it raised $30 million in a Series B round, bringing its total raised to $45 million.

Iron Pillar and Uncorrelated Ventures led the round, with participation from existing investors Nexus Venture Partners, Chiratae Ventures and Next47. CEO and founder Nitin Jayakrishnan says that the new capital will be put toward expanding Pando’s global sales, marketing and delivery capabilities.

“We will not expand into new industries or adjacent product areas,” he told TechCrunch in an email interview. “Great talent is the foundation of the business — we will continue to augment our teams at all levels of the organization. Pando is also open to exploring strategic partnerships and acquisitions with this round of funding.”

In [22]:
final_summary = user_proxy.initiate_chat(
    recipient=final_review_agent,
    message=feedback_result.chat_history[0],  # forwards only the first message (the original task); passing the whole ChatResult object leads to an error
    max_turns=2,
    summary_method="last_msg",
)

user_proxy (to Final_Review_Agent):

Analyze the following article for structure, coherence, and completeness: Signaling that investments in the supply chain sector remain robust, Pando, a startup developing fulfillment management technologies, today announced that it raised $30 million in a Series B round, bringing its total raised to $45 million.

Iron Pillar and Uncorrelated Ventures led the round, with participation from existing investors Nexus Venture Partners, Chiratae Ventures and Next47. CEO and founder Nitin Jayakrishnan says that the new capital will be put toward expanding Pando’s global sales, marketing and delivery capabilities.

“We will not expand into new industries or adjacent product areas,” he told TechCrunch in an email interview. “Great talent is the foundation of the business — we will continue to augment our teams at all levels of the organization. Pando is also open to exploring strategic partnerships and acquisitions with this round of funding.”

Pand

In [23]:
print("Final Summary or Report:")
print(final_summary)

Final Summary or Report:
ChatResult(chat_id=None, chat_history=[{'content': 'Analyze the following article for structure, coherence, and completeness: Signaling that investments in the supply chain sector remain robust, Pando, a startup developing fulfillment management technologies, today announced that it raised $30 million in a Series B round, bringing its total raised to $45 million.\n\nIron Pillar and Uncorrelated Ventures led the round, with participation from existing investors Nexus Venture Partners, Chiratae Ventures and Next47. CEO and founder Nitin Jayakrishnan says that the new capital will be put toward expanding Pando’s global sales, marketing and delivery capabilities.\n\n“We will not expand into new industries or adjacent product areas,” he told TechCrunch in an email interview. “Great talent is the foundation of the business — we will continue to augment our teams at all levels of the organization. Pando is also open to exploring strategic partnerships and acquisitions