In [1]:
%%capture --no-stderr
# source: https://github.com/microsoft/autogen/blob/main/notebook/agentchat_groupchat_research.ipynb
# Cell magics such as %%capture must be the first line of the cell.
# !pip install pyautogen==0.2.9
# !pip install openai==1.10.0
# or pip install -r requirements.txt


In [4]:
from autogen import config_list_from_json

config_list_gpt4 = config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={
        "model": ["gpt-4-1106-preview"],
    },
)
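
The `filter_dict` argument keeps only the entries of the config list whose fields match one of the listed values. The sketch below illustrates that idea in plain Python without calling AutoGen; `filter_by_model` is a hypothetical helper, the sample entries are made up, and the `api_key` values are placeholders. It assumes the `OAI_CONFIG_LIST` file (or environment variable) holds a JSON list of objects with at least a `model` field.

```python
import json

# Hypothetical contents of OAI_CONFIG_LIST: a JSON list of model configs.
# The keys below are placeholders, not real credentials.
sample_json = """
[
    {"model": "gpt-4-1106-preview", "api_key": "<your key>"},
    {"model": "gpt-3.5-turbo", "api_key": "<your key>"}
]
"""


def filter_by_model(config_list, models):
    """Keep only the configs whose "model" value is in `models`."""
    return [cfg for cfg in config_list if cfg.get("model") in models]


configs = json.loads(sample_json)
selected = filter_by_model(configs, ["gpt-4-1106-preview"])
print([cfg["model"] for cfg in selected])
```

If the filter matches nothing, the resulting list is empty, so it can be worth checking its length before building `llm_config`.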

In [3]:
gpt4_config = {
    "cache_seed": 42,
    "temperature": 0.0,
    "config_list": config_list_gpt4,
    "timeout": 120,
}

In [5]:
from autogen import UserProxyAgent, AssistantAgent, GroupChat, GroupChatManager

In [6]:
user_proxy = UserProxyAgent(
    name="Admin",
    system_message="A human admin. Interact with the planner to discuss "
    "the plan. Plan execution needs to be approved by this admin.",
    code_execution_config=False
)

In [7]:
engineer = AssistantAgent(
    name="Engineer",
    llm_config=gpt4_config,
    system_message="""Engineer. You follow an approved plan. You write python/shell code to solve tasks. Wrap the code in a code block that specifies the script type. The user can't modify your code. So do not suggest incomplete code which requires others to modify. Don't use a code block if it's not intended to be executed by the executor.
Don't include multiple code blocks in one response. Do not ask others to copy and paste the result. Check the execution result returned by the executor.
If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.
""",
)

In [8]:
scientist = AssistantAgent(
    name="Scientist",
    llm_config=gpt4_config,
    system_message="""Scientist. You follow an approved plan.
    You are able to categorize papers after seeing their abstracts printed.
    You don't write code.""",
)

In [9]:
planner = AssistantAgent(
    name="Planner",
    system_message="""Planner. Suggest a plan. Revise the plan based on feedback
    from admin and critic, until admin approval. The plan may involve an engineer
    who can write code and a scientist who doesn't write code. Explain the plan 
    first. Be clear which step is performed by an engineer, and which step is 
    performed by a scientist.
    """,
    llm_config=gpt4_config
)

In [10]:
executor = UserProxyAgent(
    name="Executor",
    system_message="Executor. Execute the code written by the engineer "
    "and report the result.",
    human_input_mode="NEVER",
    code_execution_config={"last_n_messages": 3,
                           "work_dir": "paper"},
)
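
With `"last_n_messages": 3`, the executor looks back over the last few messages for code to run. The sketch below shows one way such a scan could work; it is an illustration only and not AutoGen's actual extraction logic. `extract_code` and the regular expression are hypothetical, and the fence marker is built with string multiplication purely so the example can be displayed inside a code block.

```python
import re

# Three backticks, built indirectly so this example can itself sit in a fence.
FENCE = "`" * 3
# Matches a fenced block: optional language tag, then the code body.
CODE_BLOCK = re.compile(FENCE + r"(\w*)\n(.*?)" + FENCE, re.DOTALL)


def extract_code(messages, last_n=3):
    """Return (language, code) pairs found in the last `last_n` messages."""
    found = []
    for text in messages[-last_n:]:
        for lang, code in CODE_BLOCK.findall(text):
            found.append((lang or "unknown", code))
    return found


messages = [
    "Old attempt:\n" + FENCE + "python\nprint(0)\n" + FENCE,
    "Plan approved.",
    "Looks fine to me.",
    "Run this:\n" + FENCE + "sh\nls\n" + FENCE,
]
blocks = extract_code(messages, last_n=3)
print(blocks)
```

Only the shell block is returned here because the older Python block falls outside the three-message window.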

In [11]:
critic = AssistantAgent(
    name="Critic", 
    system_message="""Critic. Double check plan, claims, code from other
    agents and provide feedback. Check whether the plan includes adding 
    verifiable info such as source URL.
    """,
    llm_config=gpt4_config,
)

In [12]:
groupchat = GroupChat(
    agents=[user_proxy, engineer, scientist, planner, executor, critic], 
    messages=[], max_round=50
)

In [13]:
manager = GroupChatManager(groupchat=groupchat, 
                           llm_config=gpt4_config)
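
To make the role of `max_round` concrete, here is a simplified, self-contained sketch of a capped group conversation. It is not how `GroupChatManager` is implemented: speaker selection here is plain round-robin as a stand-in, `run_round_robin` is a hypothetical function, and counting the opening message toward the cap is an assumption of this sketch.

```python
from itertools import cycle


def run_round_robin(speakers, first_message, max_round):
    """Run a toy group chat.

    speakers: list of (name, reply_fn) pairs; reply_fn(history) returns a
    string, or None to end the conversation early.
    """
    history = [("Admin", first_message)]
    for name, reply in cycle(speakers):
        if len(history) >= max_round:
            break  # cap reached, stop regardless of who is next
        text = reply(history)
        if text is None:
            break  # a speaker chose to end the chat
        history.append((name, text))
    return history


speakers = [
    ("Planner", lambda history: "Here is a plan."),
    ("Critic", lambda history: "Please add source URLs."),
]
history = run_round_robin(speakers, "Find recent papers.", max_round=4)
for name, text in history:
    print(f"{name}: {text}")
```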

In [3]:
user_proxy.initiate_chat(
    manager,
    message="""
    find papers on LLM applications that aid human learning from arxiv in the last week,
    create a markdown table of different domains.
    """,)

Admin (to chat_manager):


    find papers on LLM applications that aid human learning from arxiv in the last week,
    create a markdown table of different domains.
    

--------------------------------------------------------------------------------
Planner (to chat_manager):

Plan:

1. **Data Collection (Engineer)**:
   - The engineer will write a script to query the arXiv API for recent papers related to Large Language Models (LLMs) and their applications in aiding human learning. The script will filter papers published in the last week.
   - The script will use search queries with keywords like "Large Language Models", "Human Learning", "Educational Technology", etc., to find relevant papers.
   - The engineer will ensure that the script can parse the returned data to extract necessary information such as the title, authors, arXiv ID, and abstract.

2. **Data Processing (Engineer)**:
   - The engineer will write code to process the collected data and categorize 

execute_code was called without specifying a value for use_docker. Since the python docker package is not available, code will be run natively. Note: this fallback behavior is subject to change


Executor (to chat_manager):

exitcode: 0 (execution succeeded)
Code output: 
| Title | Authors | arXiv ID | Abstract | Domain |
| --- | --- | --- | --- | --- |
| [DRDT: Dynamic Reflection with Divergent Thinking for LLM-based Sequential Recommendation](http://arxiv.org/abs/2310.20689v2) |  | 2310.20689v2 | The rise of Large Language Models (LLMs) has sparked interest in their application to sequential recommendation tasks as they can provide supportive i... | STEM Education |
| [Instruction Tuning with Human Curriculum](http://arxiv.org/abs/2310.06983v1) |  | 2310.06983v1 | The dominant paradigm for instruction tuning is the random-shuffled training of maximally diverse instruction-response pairs. This paper explores the ... | Skill Acquisition |
| [SELF: Language-Driven Self-Evolution for Large Language Models](http://arxiv.org/abs/2309.09530v1) |  | 2309.09530v1 | Large Language Models (LLMs) have demonstrated remarkable versatility across various domains. To further advance


Executor (to chat_manager):

exitcode: 0 (execution succeeded)
Code output: 
| Title | Authors | arXiv ID | Published Date | Abstract | Domain |
| --- | --- | --- | --- | --- | --- |
| [DRDT: Dynamic Reflection with Divergent Thinking for LLM-based
  Sequential Recommendation](http://arxiv.org/abs/2312.11336v1) | Yu Wang, Zhiwei Liu, Jianguo Zhang, Weiran Yao, Shelby Heinecke, Philip S. Yu | 2312.11336v1 | 2023-12-18 | The rise of Large Language Models (LLMs) has sparked interest in their
application to sequential recommendation tasks as they can provide supportive
i... | STEM Education |
| [Learning From Mistakes Makes LLM Better Reasoner](http://arxiv.org/abs/2310.20689v2) | Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen | 2310.20689v2 | 2023-10-31 | Large language models (LLMs) recently exhibited remarkable reasoning
capabilities on solving math problems. To further improve this capability, this
w... | Skill Acquisition |
| [Instruction Tuning


Executor (to chat_manager):

exitcode: 0 (execution succeeded)
Code output: 
| Title | Authors | arXiv ID | Published Date | Abstract | Domain |
| --- | --- | --- | --- | --- | --- |



--------------------------------------------------------------------------------
Scientist (to chat_manager):

The execution of the corrected script has succeeded, but it appears that no papers matching the criteria were found for the specified date range of the last week. This could mean that there were no papers related to Large Language Models (LLMs) and their applications in aiding human learning published on arXiv during this period, or that the search query may need to be adjusted to capture a broader set of relevant papers.

If it is essential to find papers within this specific timeframe and topic, I would recommend expanding the search query to include a wider range of keywords or related topics, or extending the date range slightly to ensure that relevant papers are not misse
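
The data-collection step in the plan, and the suggestion to widen the date range, could be sketched roughly as follows. This is not the script the Engineer agent produced, which is not shown in the output. `build_query_url` and `recent_entries` are hypothetical helpers, the keyword list is an assumption, and the sample feed uses made-up identifiers. The example parses a small inline Atom document instead of calling the network; in real use the URL would be fetched first.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace used by arXiv feeds


def build_query_url(keywords, max_results=50):
    """Build an arXiv API query URL, newest submissions first."""
    query = " OR ".join(f'all:"{k}"' for k in keywords)
    params = {
        "search_query": query,
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return "http://export.arxiv.org/api/query?" + urlencode(params)


def recent_entries(atom_xml, now, days=7):
    """Return (title, id, published date) for entries newer than `days`."""
    cutoff = now - timedelta(days=days)
    rows = []
    for entry in ET.fromstring(atom_xml).iter(ATOM + "entry"):
        published = datetime.strptime(
            entry.findtext(ATOM + "published"), "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if published >= cutoff:
            # Collapse line breaks so the title fits on one table row.
            title = " ".join(entry.findtext(ATOM + "title").split())
            rows.append(
                (title, entry.findtext(ATOM + "id"), published.date().isoformat())
            )
    return rows


# Made-up feed with one recent and one old entry.
sample_feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00001v1</id>
    <published>2023-12-18T00:00:00Z</published>
    <title>A Recent
  Paper</title>
  </entry>
  <entry>
    <id>http://arxiv.org/abs/0000.00002v1</id>
    <published>2023-09-12T00:00:00Z</published>
    <title>An Older Paper</title>
  </entry>
</feed>"""

url = build_query_url(["large language models", "human learning"], max_results=5)
now = datetime(2023, 12, 20, tzinfo=timezone.utc)
rows = recent_entries(sample_feed, now, days=7)
print(url)
print(rows)
```

Raising `days` widens the window, which is one way to act on the Scientist's recommendation when a strict one-week filter returns an empty table.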

In [4]:
# Display text with markdown formatting
from IPython.display import Markdown

# Specify the text to display
text = """| Title | Authors | arXiv ID | Published Date | Abstract | Domain |
| --- | --- | --- | --- | --- | --- |
| [DRDT: Dynamic Reflection with Divergent Thinking for LLM-based Sequential Recommendation](http://arxiv.org/abs/2312.11336v1) | Yu Wang, Zhiwei Liu, Jianguo Zhang, Weiran Yao, Shelby Heinecke, Philip S. Yu | 2312.11336v1 | 2023-12-18 | The rise of Large Language Models (LLMs) has sparked interest in their application to sequential recommendation tasks as they can provide supportive i... | STEM Education |
| [Learning From Mistakes Makes LLM Better Reasoner](http://arxiv.org/abs/2310.20689v2) | Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen | 2310.20689v2 | 2023-10-31 | Large language models (LLMs) recently exhibited remarkable reasoning capabilities on solving math problems. To further improve this capability, this w... | Skill Acquisition |
| [Instruction Tuning with Human Curriculum](http://arxiv.org/abs/2310.09518v1) | Bruce W. Lee, Hyunsoo Cho, Kang Min Yoo | 2310.09518v1 | 2023-10-14 | The dominant paradigm for instruction tuning is the random-shuffled training of maximally diverse instruction-response pairs. This paper explores the ... | Skill Acquisition |
| [Violation of Expectation via Metacognitive Prompting Reduces Theory of Mind Prediction Error in Large Language Models](http://arxiv.org/abs/2310.06983v1) | Courtland Leer, Vincent Trost, Vineeth Voruganti | 2310.06983v1 | 2023-10-10 | Recent research shows that Large Language Models (LLMs) exhibit a compelling level of proficiency in Theory of Mind (ToM) tasks. This ability to imput... | Skill Acquisition |
| [SELF: Language-Driven Self-Evolution for Large Language Models](http://arxiv.org/abs/2310.00533v3) | Jianqiao Lu, Wanjun Zhong, Wenyong Huang, Yufei Wang, Fei Mi, Baojun Wang, Weichao Wang, Lifeng Shang, Qun Liu | 2310.00533v3 | 2023-10-01 | Large Language Models (LLMs) have demonstrated remarkable versatility across various domains. To further advance LLMs, we propose 'SELF' (Self-Evoluti... | STEM Education |
| [Adapting Large Language Models via Reading Comprehension](http://arxiv.org/abs/2309.09530v1) | Daixuan Cheng, Shaohan Huang, Furu Wei | 2309.09530v1 | 2023-09-18 | We explore how continued pre-training on domain-specific corpora influences large language models, revealing that training on the raw corpora endows t... | Skill Acquisition |
| [Re-Reading Improves Reasoning in Language Models](http://arxiv.org/abs/2309.06275v1) | Xiaohan Xu, Chongyang Tao, Tao Shen, Can Xu, Hongbo Xu, Guodong Long, Jian-guang Lou | 2309.06275v1 | 2023-09-12 | Reasoning presents a significant and challenging issue for Large Language Models (LLMs). The predominant focus of research has revolved around develop... | Skill Acquisition |"""

# Display the text
Markdown(text)

| Title | Authors | arXiv ID | Published Date | Abstract | Domain |
| --- | --- | --- | --- | --- | --- |
| [DRDT: Dynamic Reflection with Divergent Thinking for LLM-based Sequential Recommendation](http://arxiv.org/abs/2312.11336v1) | Yu Wang, Zhiwei Liu, Jianguo Zhang, Weiran Yao, Shelby Heinecke, Philip S. Yu | 2312.11336v1 | 2023-12-18 | The rise of Large Language Models (LLMs) has sparked interest in their application to sequential recommendation tasks as they can provide supportive i... | STEM Education |
| [Learning From Mistakes Makes LLM Better Reasoner](http://arxiv.org/abs/2310.20689v2) | Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen | 2310.20689v2 | 2023-10-31 | Large language models (LLMs) recently exhibited remarkable reasoning capabilities on solving math problems. To further improve this capability, this w... | Skill Acquisition |
| [Instruction Tuning with Human Curriculum](http://arxiv.org/abs/2310.09518v1) | Bruce W. Lee, Hyunsoo Cho, Kang Min Yoo | 2310.09518v1 | 2023-10-14 | The dominant paradigm for instruction tuning is the random-shuffled training of maximally diverse instruction-response pairs. This paper explores the ... | Skill Acquisition |
| [Violation of Expectation via Metacognitive Prompting Reduces Theory of Mind Prediction Error in Large Language Models](http://arxiv.org/abs/2310.06983v1) | Courtland Leer, Vincent Trost, Vineeth Voruganti | 2310.06983v1 | 2023-10-10 | Recent research shows that Large Language Models (LLMs) exhibit a compelling level of proficiency in Theory of Mind (ToM) tasks. This ability to imput... | Skill Acquisition |
| [SELF: Language-Driven Self-Evolution for Large Language Models](http://arxiv.org/abs/2310.00533v3) | Jianqiao Lu, Wanjun Zhong, Wenyong Huang, Yufei Wang, Fei Mi, Baojun Wang, Weichao Wang, Lifeng Shang, Qun Liu | 2310.00533v3 | 2023-10-01 | Large Language Models (LLMs) have demonstrated remarkable versatility across various domains. To further advance LLMs, we propose 'SELF' (Self-Evoluti... | STEM Education |
| [Adapting Large Language Models via Reading Comprehension](http://arxiv.org/abs/2309.09530v1) | Daixuan Cheng, Shaohan Huang, Furu Wei | 2309.09530v1 | 2023-09-18 | We explore how continued pre-training on domain-specific corpora influences large language models, revealing that training on the raw corpora endows t... | Skill Acquisition |
| [Re-Reading Improves Reasoning in Language Models](http://arxiv.org/abs/2309.06275v1) | Xiaohan Xu, Chongyang Tao, Tao Shen, Can Xu, Hongbo Xu, Guodong Long, Jian-guang Lou | 2309.06275v1 | 2023-09-12 | Reasoning presents a significant and challenging issue for Large Language Models (LLMs). The predominant focus of research has revolved around develop... | Skill Acquisition |
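
Markdown table rows need to stay on a single line, and titles or abstracts returned by a feed can contain line breaks. A minimal sketch of a table builder that normalizes each cell is shown below. `clean_cell` and `to_markdown_table` are hypothetical helpers and are not part of the notebook's generated script; the sample row is made up.

```python
def clean_cell(value):
    """Collapse all whitespace runs to single spaces and escape pipes."""
    return " ".join(str(value).split()).replace("|", "\\|")


def to_markdown_table(headers, rows):
    """Render headers and rows as a Markdown table, one line per row."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(clean_cell(cell) for cell in row) + " |")
    return "\n".join(lines)


table = to_markdown_table(
    ["Title", "Domain"],
    [["A Title Broken\n  Across Lines", "STEM Education"]],
)
print(table)
```

Passing the result to `Markdown(...)` from `IPython.display` would then render each paper as one row.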