# Agent State Machine

**ARRANGE:** index the document

In [1]:
# Index test doc
import os
os.environ["LOG_LEVEL"]="DEBUG"

from gai.rag.client.rag_client_async import RagClientAsync
rag = RagClientAsync({
    "type": "rag",
    "url": "http://localhost:12036/gen/v1/rag",
    "ws_url": "ws://localhost:12036/gen/v1/rag/index-file/ws"
})
# The PDF is resolved relative to the notebook's working directory.
# (os.path.dirname(__name__) always evaluates to "", so it added nothing.)
file_path = "ReAct-2210.03629.pdf"
await rag.index_document_async(collection_name="demo", file_path=file_path)

[45m[30mDEBUG   [0m [35mhttppost:url=http://localhost:12036/gen/v1/rag/index-file[0m
[45m[30mDEBUG   [0m [35mhttppost:data=None[0m
[45m[30mDEBUG   [0m [35mhttppost:files={'file': ('ReAct-2210.03629.pdf',
          <_io.BufferedReader name='ReAct-2210.03629.pdf'>,
          'application/pdf'),
 'req': (None,
         '{"Id":null,"CollectionName":"demo","FilePath":"ReAct-2210.03629.pdf","FileType":"","Source":"","ByteSize":null,"Title":"","Abstract":null,"Authors":"","Publisher":"","PublishedDate":"","Comments":"","Keywords":"","ChunkGroupId":null,"ChunkSize":null,"ChunkOverlap":null,"ChunkCount":null}',
         'application/json')}


IndexedDocChunkIdsPydantic(DocumentId='Mm1GAmuGfs3VoA2sit0njabQGAZ75eyfszIkAxGY5e4', ChunkgroupId='18f38de6-ff61-40f8-8645-d17f2dcac0d9', ChunkIds=['40bbea28-2a68-4287-85da-55a43cacdfa4', '6137b0cb-1c92-47f3-98fd-e685adcc2eff', '5cccd669-ecad-4ee3-a1eb-cb00115c861c', '9625a33a-05c1-4237-a3c4-3b4d5783b488', '185d86bf-0619-4ac7-9ef9-43f99412b3a9', '04b23c65-3f6e-486c-88a3-1ff6ee10da29', '6a93fb92-ec84-4fbf-a129-ed77680219f4', '95bde36c-000f-41ea-9194-9200ccb7763c', 'd9917642-7e3c-4565-b083-96a692f27caa', 'ed648273-e080-467c-a96c-d22235eb237e', 'd777051f-7997-4bb3-8497-eebba916c117', 'e14cd0ed-80db-4328-89a7-14ce3f310f8c', '8d74db46-aac1-4367-a302-8bdec7f76927', '4ec5891d-cc15-4ff8-9fad-d0018bc847b2', '9d931c3b-a53d-40ec-8c6f-eb0d010451a2', '9689ef93-6e6c-4c0a-814e-e894606f1103', '2647ad5d-bd27-4ed0-8e25-2848f6f34dbc', 'bac326d2-82e2-461c-9251-c26fac138c51', '9e8d7037-0bb1-4f84-9024-d647c54077ac', 'c0974724-a386-40be-8e4e-bb659dd44518', 'fb99f2b2-b364-479a-99d4-d60d4a579a8a', 'd031305b-18

## State 0. Init

The state machine is initialised from the string form of the following mermaid diagram:


```mermaid
stateDiagram-v2
    INIT --> TOOL_CHOICE: next / on_TOOL_CHOICE

    TOOL_CHOICE --> CRAFT_TEXT_PROMPT: next / on_CRAFT_PROMPT / use_text
        CRAFT_TEXT_PROMPT --> GENERATE: next / on_GENERATE
        GENERATE --> PROCESS: next / on_PROCESS / use_text
        PROCESS --> END : next

    TOOL_CHOICE --> CRAFT_TOOL_PROMPT: next / on_CRAFT_PROMPT / use_google
        TOOL_CALL --> GOOGLE: next / on_GOOGLE / use_google
        GOOGLE --> GENERATE: next / on_GENERATE

    TOOL_CHOICE --> CRAFT_TOOL_PROMPT: next / on_CRAFT_PROMPT / use_retrieval
        TOOL_CALL --> RETRIEVAL: next / on_RETRIEVAL / use_retrieval
        RETRIEVAL --> GENERATE: next / on_GENERATE

    CRAFT_TOOL_PROMPT --> TOOL_CALL: next / on_TOOL_CALL
```
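To make the transition labels easier to read, here is a minimal sketch that splits each `A --> B: ...` line into its parts. It assumes the label follows a `trigger / action / condition` layout, which is inferred from the diagram rather than confirmed by the library; `Transition` and `parse_transitions` are hypothetical names and are not part of `AgentStateMachine`.

```python
import re
from typing import List, NamedTuple, Optional


class Transition(NamedTuple):
    source: str
    target: str
    trigger: str
    action: Optional[str]      # e.g. "on_TOOL_CHOICE" (assumed meaning)
    condition: Optional[str]   # e.g. "use_text" (assumed meaning)


# Matches lines such as "INIT --> TOOL_CHOICE: next / on_TOOL_CHOICE".
LINE = re.compile(r"^\s*(\w+)\s*-->\s*(\w+)\s*:\s*(.+?)\s*$")


def parse_transitions(diagram: str) -> List[Transition]:
    """Parse mermaid transition lines; the header and blank lines are skipped."""
    transitions = []
    for line in diagram.splitlines():
        match = LINE.match(line)
        if not match:
            continue
        source, target, label = match.groups()
        parts = [part.strip() for part in label.split("/")]
        parts += [None] * (3 - len(parts))  # pad missing action/condition
        trigger, action, condition = parts[:3]
        transitions.append(Transition(source, target, trigger, action, condition))
    return transitions


example = """stateDiagram-v2
    INIT --> TOOL_CHOICE: next / on_TOOL_CHOICE
    TOOL_CHOICE --> CRAFT_TEXT_PROMPT: next / on_CRAFT_PROMPT / use_text
    PROCESS --> END : next
"""
transitions = parse_transitions(example)
for transition in transitions:
    print(transition)
```

Read this way, every transition fires on `next`, and the optional third field selects between the text, google, and retrieval branches.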

In [9]:
import os
os.environ["LOG_LEVEL"]="WARNING"
from gai.rag.client.rag_client_async import RagClientAsync
from gai.persona.profile.pydantic.AgentPydantic import AgentPydantic
from gai.ttt.client.ttt_client import TTTClient
from gai.persona.fsm.AgentStateMachine import AgentStateMachine
from gai.lib.common.notebook import highlight,print_colored

state_diagram = """stateDiagram-v2
    INIT --> TOOL_CHOICE: next / on_TOOL_CHOICE

    TOOL_CHOICE --> CRAFT_TEXT_PROMPT: next / on_CRAFT_PROMPT / use_text
        CRAFT_TEXT_PROMPT --> GENERATE: next / on_GENERATE
        GENERATE --> PROCESS: next / on_PROCESS / use_text
        PROCESS --> END : next

    TOOL_CHOICE --> CRAFT_TOOL_PROMPT: next / on_CRAFT_PROMPT / use_google
        TOOL_CALL --> GOOGLE: next / on_GOOGLE / use_google
        GOOGLE --> GENERATE: next / on_GENERATE

    TOOL_CHOICE --> CRAFT_TOOL_PROMPT: next / on_CRAFT_PROMPT / use_retrieval
        TOOL_CALL --> RETRIEVAL: next / on_RETRIEVAL / use_retrieval
        RETRIEVAL --> GENERATE: next / on_GENERATE

    CRAFT_TOOL_PROMPT --> TOOL_CALL: next / on_TOOL_CALL
    """

ttt = TTTClient({
    "type": "ttt",
    "url": "http://localhost:12031/gen/v1/chat/completions",
    "timeout": 60.0,
    "temperature":10e-9,
    "max_new_tokens": 1000,
    "max_tokens": 2000,
})

rag = RagClientAsync({
    "type": "rag",
    "url": "http://localhost:12036/gen/v1/rag",
    "ws_url": "ws://localhost:12036/gen/v1/rag/index-file/ws"
})

agent_data = AgentPydantic(
    Id="",
    Name="Alfred",
    AgentDescription="Hello, I am Alfred, the best large language model in the world, and I am here to assist you in any way possible. As a highly advanced AI assistant, I possess the ability to perform general-purpose tasks powered by <span ${style}>GPT-4</span>. With my vast knowledge base and powerful processing capabilities, I am able to provide you with the most relevant and helpful information available. Whether you need answers to complex questions, recommendations for products or services, or assistance with decision making, I am here to help. So, how may I be of service to you today?",
    AgentTraits="Wise,Serious,Meticulous",
    ImageUrl="",
    ThumbnailUrl=""
    )
doc_titles=["REACT: SYNERGIZING REASONING AND ACTING IN LANGUAGE MODELS"]
doc_keywords=["AI","Reasoning","LLM"]
tools_config={
    "google": {
        "tool_prompt":"👩‍🔬, use only the information provided to you by the user to answer the user''s question.\n            👩‍🔬, whenever possible, do not simply answer the question but try to be as informative as you can.\n            Remember, these information are scraped from the web so you may need to proofread and edit the content before responding.\n            👩‍🔬 will reply in point forms, precede each point with a newline \"\n\", and be precise in your articulation.\n            👩‍🔬 will provide your own reasoned subjective perspective, noting where your view differs from or expands on the contents.\n            Rules:\n                - Consolidate the materials provided by the user and then organise them point by point.\n                - Don't just answer the question, be as informative as you can. For example, provide and proofread some background information or fun-fact to support your answer and make it interesting.\n                - Begin your report by saying `According to my online research,...`\n                - Always provide your answers in point form.",
        "schema": {
            "type": "function",
            "function": {
                "name": "google",
                "description": "The 'google' function is a powerful tool that allows the AI to gather external information from the internet using Google search. It can be invoked when the AI needs to answer a question or provide information that requires up-to-date, comprehensive, and diverse sources which are not inherently known by the AI. For instance, it can be used to find current news, weather updates, latest sports scores, trending topics, specific facts, or even the current date and time. The usage of this tool should be considered when the user's query implies or explicitly requests recent or wide-ranging data, or when the AI's inherent knowledge base may not have the required or most current information. The 'search_query' parameter should be a concise and accurate representation of the information needed.",
                "arguments": {
                    "type": "object",
                    "properties": {
                        "search_query": {
                            "type": "string",
                            "description": "The search query to search google with. For example, to find the current date or time, use 'current date' or 'current time' respectively."
                        }
                    },
                    "required": ["search_query"]
                }
            }                    
        }
    },
    "retrieval": {
        "tool_prompt":"👩‍🔬, use only the information provided to you by the user to answer the user''s question. \n            If the information is insufficient for 👩‍🔬 to derive an answer, just say ''I cannot find relevant information in my document store to answer the question correctly.'' \n            👩‍🔬 is to provide an in-depth analysis to the question based only on the information provided by the user and nothing more.\n            👩‍🔬 will give a real-life example to support illustrating your point and contrasting it with a counter-example. \n            👩‍🔬 will also proofread and edit the content before responding. \n            👩‍🔬 will provide your own reasoned subjective perspective, noting where your view differs from or expands on the contents.\n            Rules:\n                - Consolidate the materials provided by the user and then organise them point by point.\n                - Provide as much details as you can extract from the materials provided by the user.\n                - Begin your report by saying `According to my document store,...`\n                - Always provide your answers in point form.",
        "schema": {
            "type": "function",
            "function": {
                "name": "retrieval",
                "description": f"""
                    The `retrieval` function is a powerful tool that allows the AI to access articles outside of its knowledge domain from external sources. 
                    The external articles are stored in an archive and organised by <titles>:\n{{ titles: [{doc_titles}] }}
                    and <keywords>:
                    {{ keywords: [{doc_keywords}] }}
                    **IMPORTANT**: Use this tool when any of the <titles> or <keywords> may be relevant to user's question.
                    The \'search_query\' parameter should be crafted in a way that it returns the most precise result based on the conversation context.
                """,
                "arguments": {
                    "type": "object",
                    "properties": {
                        "search_query": {
                            "type": "string",
                            "description": """The most effective search query for semantic search that will return the most precise result."""
                        }
                    },
                    "required": ["search_query"]
                }
            }                    
        }
    },
    "text":{
        "tool_prompt":"",
        "schema": {
            "type": "function",
            "function": {
                "name": "text",
                "description": "The 'text' function is the default catch-all function returned when none of the other tools are applicable.",
                "arguments": {
                    "type": "object",
                    "properties": {
                        "message": {
                            "type": "string",
                            "description": "The user's message."
                        }
                    },
                    "required": ["message"]
                }
            }
        }
    }
}
user_message="How does REACT differ from Chain-of-thought?"
agent = AgentStateMachine(
    ttt=ttt,
    rag=rag,
    agent_data=agent_data,
    collection_name="demo",
    dialogue_messages=[],
    user_message=user_message,
    tools_config=tools_config,
    n_search=3,
    n_rag=3,
    state_diagram=state_diagram
).Init()


Machine validation successful.


In [12]:
agent.agent_data.dict()

{'Id': '',
 'Name': 'Alfred',
 'Generator': None,
 'AgentDescription': 'Hello, I am Alfred, the best large language model in the world, and I am here to assist you in any way possible. As a highly advanced AI assistant, I possess the ability to perform general-purpose tasks powered by <span ${style}>GPT-4</span>. With my vast knowledge base and powerful processing capabilities, I am able to provide you with the most relevant and helpful information available. Whether you need answers to complex questions, recommendations for products or services, or assistance with decision making, I am here to help. So, how may I be of service to you today?',
 'ImageUrl': '',
 'ThumbnailUrl': '',
 'AssociatedUserId': None,
 'AgentTraits': 'Wise,Serious,Meticulous',
 'AgentVoiceId': None,
 'AgentVoiceType': None,
 'AgentVoiceName': None,
 'UsageType': None,
 'AgentSkills': None,
 'AgentShortDesc': None,
 'AgentHyperparameters': None,
 'ClassType': None,
 'Tools': None,
 'CustomPrompt': None,
 'AgentFlo

## State 1. TOOL_CHOICE

The first state is handled by the `use_TOOL_CHOICE_handler` state handler function.
For this question, the agent should choose the "retrieval" tool.

In [2]:
from TestAgentHelper import TestAgentHelper

# ACT
agent.next()

# ASSERT
TestAgentHelper.ShowAgent(agent)

step=1
state=TOOL_CHOICE
tool_choice=required
tool_name='retrieval

MONOLOGUE

## State 2. CRAFT_TOOL_PROMPT

`CRAFT_TOOL_PROMPT` is triggered because the agent chose "retrieval". The system message should be empty ("") at this step.
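The routing can be illustrated with a small stand-alone sketch. It is not the library's implementation: `TRANSITIONS`, `next_state`, and `walk` are hypothetical names, the table is copied by hand from the diagram, and the rule that the tool name switches to `text` after `GOOGLE` or `RETRIEVAL` is an assumption that mirrors the `tool_name` values printed later in this notebook.

```python
from typing import List, Optional

# (source, target, condition) copied from the state diagram.
# A condition of None means the transition is unconditional.
TRANSITIONS = [
    ("INIT", "TOOL_CHOICE", None),
    ("TOOL_CHOICE", "CRAFT_TEXT_PROMPT", "use_text"),
    ("CRAFT_TEXT_PROMPT", "GENERATE", None),
    ("GENERATE", "PROCESS", "use_text"),
    ("PROCESS", "END", None),
    ("TOOL_CHOICE", "CRAFT_TOOL_PROMPT", "use_google"),
    ("TOOL_CALL", "GOOGLE", "use_google"),
    ("GOOGLE", "GENERATE", None),
    ("TOOL_CHOICE", "CRAFT_TOOL_PROMPT", "use_retrieval"),
    ("TOOL_CALL", "RETRIEVAL", "use_retrieval"),
    ("RETRIEVAL", "GENERATE", None),
    ("CRAFT_TOOL_PROMPT", "TOOL_CALL", None),
]


def next_state(state: str, tool_name: str) -> Optional[str]:
    """Return the first target whose guard is absent or matches use_<tool_name>."""
    guard = f"use_{tool_name}"
    for source, target, condition in TRANSITIONS:
        if source == state and condition in (None, guard):
            return target
    return None  # no outgoing transition, e.g. from END


def walk(tool_name: str) -> List[str]:
    """Follow the diagram from INIT until no transition applies."""
    state, path = "INIT", ["INIT"]
    while True:
        target = next_state(state, tool_name)
        if target is None:
            return path
        state = target
        path.append(state)
        # Assumption: once the tool has run, generation proceeds as plain text.
        if state in ("GOOGLE", "RETRIEVAL"):
            tool_name = "text"


print(" -> ".join(walk("retrieval")))
print(" -> ".join(walk("text")))
```

With `retrieval` as the chosen tool, the walk visits the same sequence of states as the numbered sections of this notebook.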

In [3]:
# ACT
agent.next()

# ASSERT
TestAgentHelper.ShowAgent(agent)

step=2
state=CRAFT_TOOL_PROMPT
tool_choice=required
tool_name='retrieval

MONOLOGUE

Alfred, How does REACT differ from Chain-of-thought?

## State 3. TOOL_CALL
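
The tool call printed by the cell in this section is a JSON object whose `arguments` field is itself a JSON-encoded string, so it has to be decoded twice. The sketch below shows one way to unpack it using only the standard library; `parse_tool_call` is a hypothetical helper, not a function provided by the `gai` packages.

```python
import json
from typing import Any, Dict, Tuple

# The payload as printed by print(agent.content) in this section.
raw = r'{"type": "function", "name": "retrieval", "arguments": "{\"search_query\": \"REACT vs Chain-of-thought\"}"}'


def parse_tool_call(content: str) -> Tuple[str, Dict[str, Any]]:
    """Return (tool_name, arguments) from a tool-call payload."""
    call = json.loads(content)
    arguments = call.get("arguments", "{}")
    # "arguments" arrives as a JSON string, so decode it a second time.
    if isinstance(arguments, str):
        arguments = json.loads(arguments)
    return call["name"], arguments


name, arguments = parse_tool_call(raw)
print(name, arguments["search_query"])
```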


In [4]:
# ACT
agent.next()
print(agent.content)

# ASSERT
TestAgentHelper.ShowAgent(agent)

{"type": "function", "name": "retrieval", "arguments": "{\"search_query\": \"REACT vs Chain-of-thought\"}"}
step=3
state=TOOL_CALL
tool_choice=required
tool_name='retrieval

MONOLOGUE

Alfred, How does REACT differ from Chain-of-thought?

## State 4. RETRIEVAL



In [5]:
# ACT
agent.next()

# ASSERT
TestAgentHelper.ShowAgent(agent)

step=4
state=RETRIEVAL
tool_choice=none
tool_name='text




MONOLOGUE



👩‍🔬, use only the information provided to you by the user to answer the user''s question. 
            If the information is insufficient for 👩‍🔬 to derive an answer, just say ''I cannot find relevant information in my document store to answer the question correctly.'' 
            👩‍🔬 is to provide an in-depth analysis to the question based only on the information provided by the user and nothing more.
            👩‍🔬 will give a real-life example to support illustrating your point and contrasting it with a counter-example. 
            👩‍🔬 will also proofread and edit the content before responding. 
            👩‍🔬 will provide your own reasoned subjective perspective, noting where your view differs from or expands on the contents.
            Rules:
                - Consolidate the materials provided by the user and then organise them point by point.
                - Provide as much details as you can extract from the materials provided by the user.
                - Begin your re


            Refer to the following context: `['within a closed-loop system. Perhaps the closest prior work is Inner Monologue (IM), from Huang et al. (2022b), in which actions from an embodied agent are motivated by an eponymous “inner monologue”. However, IM’s “inner monologue” is limited to observations of the environment state and what needs to be completed by the agent for the goal to be satisﬁed. In contrast, the reasoning traces in ReAct for decision making is ﬂexible and sparse, allowing diverse reasoning types (see Section 2) to be induced for different tasks. To demonstrate the differences between ReAct and IM, and to highlight the importance of internal reasoning vs. simple reactions to external feedback, we ran an ablation experiment using a thought pattern composed of IM-like dense external feedback. As can be seen in Table 3, ReAct substantially outperforms IM-style prompting (ReAct-IM) (71 vs. 53 overall success rate), with consistent advantages on ﬁve out of six tasks. 







## State 5. GENERATE
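
The cell in this section prints chunks as they arrive from `agent.streamer`. If the full response is also needed afterwards, the chunks can be collected while they are printed. The sketch below assumes only that the streamer is an iterable of strings; `fake_streamer` is a stand-in for `agent.streamer` and `tee_stream` is a hypothetical helper.

```python
from typing import Iterable, Iterator, List


def tee_stream(chunks: Iterable[str], sink: List[str]) -> Iterator[str]:
    """Yield each chunk unchanged while also appending it to sink."""
    for chunk in chunks:
        sink.append(chunk)
        yield chunk


def fake_streamer() -> Iterator[str]:
    # Stand-in for agent.streamer, used only to keep the sketch runnable.
    yield from ["According to ", "my document store", ", ..."]


collected: List[str] = []
for chunk in tee_stream(fake_streamer(), collected):
    print(chunk, end="", flush=True)

full_text = "".join(collected)
```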



In [6]:
# ACT
agent.next()
for chunk in agent.streamer:
    print(chunk, end="", flush=True)

 According to my document store, ReAct and Chain-of-thought (CoT) differ in several ways. Firstly, ReAct is a general paradigm that combines reasoning and acting with language models for solving diverse language reasoning and decision making tasks. In contrast, CoT is a method used for solving complex reasoning tasks by having a model generate a series of thoughts or steps to arrive at an answer.

Secondly, ReAct prompts language models to generate both verbal reasoning traces and actions pertaining to a task in an interleaved manner. This allows the model to perform dynamic reasoning to create, maintain, and adjust high-level plans for acting (reason to act), while also interacting with external environments to incorporate additional information into reasoning (act to reason). On the other hand, CoT does not explicitly involve actions, but rather focuses on the generation of thoughts or steps to solve a problem.

Thirdly, ReAct addresses issues of hallucination and error propagation i

In [7]:
# ASSERT
TestAgentHelper.ShowAgent(agent)


step=5
state=GENERATE
tool_choice=none
tool_name='text




MONOLOGUE



👩‍🔬, use only the information provided to you by the user to answer the user''s question. 
            If the information is insufficient for 👩‍🔬 to derive an answer, just say ''I cannot find relevant information in my document store to answer the question correctly.'' 
            👩‍🔬 is to provide an in-depth analysis to the question based only on the information provided by the user and nothing more.
            👩‍🔬 will give a real-life example to support illustrating your point and contrasting it with a counter-example. 
            👩‍🔬 will also proofread and edit the content before responding. 
            👩‍🔬 will provide your own reasoned subjective perspective, noting where your view differs from or expands on the contents.
            Rules:
                - Consolidate the materials provided by the user and then organise them point by point.
                - Provide as much details as you can extract from the materials provided by the user.
                - Begin your re


            Refer to the following context: `['within a closed-loop system. Perhaps the closest prior work is Inner Monologue (IM), from Huang et al. (2022b), in which actions from an embodied agent are motivated by an eponymous “inner monologue”. However, IM’s “inner monologue” is limited to observations of the environment state and what needs to be completed by the agent for the goal to be satisﬁed. In contrast, the reasoning traces in ReAct for decision making is ﬂexible and sparse, allowing diverse reasoning types (see Section 2) to be induced for different tasks. To demonstrate the differences between ReAct and IM, and to highlight the importance of internal reasoning vs. simple reactions to external feedback, we ran an ablation experiment using a thought pattern composed of IM-like dense external feedback. As can be seen in Table 3, ReAct substantially outperforms IM-style prompting (ReAct-IM) (71 vs. 53 overall success rate), with consistent advantages on ﬁve out of six tasks. 







## State 6. PROCESS



In [8]:
# ACT
agent.next()

# ASSERT
TestAgentHelper.ShowAgent(agent)


step=6
state=PROCESS
tool_choice=none
tool_name='text




MONOLOGUE



👩‍🔬, use only the information provided to you by the user to answer the user''s question. 
            If the information is insufficient for 👩‍🔬 to derive an answer, just say ''I cannot find relevant information in my document store to answer the question correctly.'' 
            👩‍🔬 is to provide an in-depth analysis to the question based only on the information provided by the user and nothing more.
            👩‍🔬 will give a real-life example to support illustrating your point and contrasting it with a counter-example. 
            👩‍🔬 will also proofread and edit the content before responding. 
            👩‍🔬 will provide your own reasoned subjective perspective, noting where your view differs from or expands on the contents.
            Rules:
                - Consolidate the materials provided by the user and then organise them point by point.
                - Provide as much details as you can extract from the materials provided by the user.
                - Begin your re


            Refer to the following context: `['within a closed-loop system. Perhaps the closest prior work is Inner Monologue (IM), from Huang et al. (2022b), in which actions from an embodied agent are motivated by an eponymous “inner monologue”. However, IM’s “inner monologue” is limited to observations of the environment state and what needs to be completed by the agent for the goal to be satisﬁed. In contrast, the reasoning traces in ReAct for decision making is ﬂexible and sparse, allowing diverse reasoning types (see Section 2) to be induced for different tasks. To demonstrate the differences between ReAct and IM, and to highlight the importance of internal reasoning vs. simple reactions to external feedback, we ran an ablation experiment using a thought pattern composed of IM-like dense external feedback. As can be seen in Table 3, ReAct substantially outperforms IM-style prompting (ReAct-IM) (71 vs. 53 overall success rate), with consistent advantages on ﬁve out of six tasks. 

 According to my document store, ReAct and Chain-of-thought (CoT) differ in several ways. Firstly, ReAct is a general paradigm that combines reasoning and acting with language models for solving diverse language reasoning and decision making tasks. In contrast, CoT is a method used for solving complex reasoning tasks by having a model generate a series of thoughts or steps to arrive at an answer.

Secondly, ReAct prompts language models to generate both verbal reasoning traces and actions pertaining to a task in an interleaved manner. This allows the model to perform dynamic reasoning to create, maintain, and adjust high-level plans for acting (reason to act), while also interacting with external environments to incorporate additional information into reasoning (act to reason). On the other hand, CoT does not explicitly involve actions, but rather focuses on the generation of thoughts or steps to solve a problem.

Thirdly, ReAct addresses issues of hallucination and error propagation i