<img src="https://storage.googleapis.com/mle-courses-prod/users/61b6fa1ba83a7e37c8309756/private-files/5ca3e740-cf30-11ef-a16d-9b898737f71b-Screen_Shot_2025_01_10_at_15.53.26.png" width=700 />

Author: [ProtonX Team](https://protonx.coursemind.io/courses/677e0eeb02a8c600bdeb64e5/info)

Requirements:
- An [OpenAI key](https://platform.openai.com/settings/organization/api-keys) to use as the LLM backend
- A [Serp API](https://serper.dev/api-key) key to search Google

In [None]:
!pip install serpapi
!pip install google-search-results



In [None]:
from openai import OpenAI
from google.colab import userdata

In [None]:
def reasoning_step(state, user_input, intermediate_results):
    client = OpenAI(api_key=userdata.get('open_ai_key'))

    # Build the system message from the state, user input, and intermediate results
    messages = [
        {
            "role": "system",
            "content": (
                "You are a reasoning and acting agent. Based on the current state and user input, decide the next action.\n"
                f"State: {state}\n"
                f"User Input: {user_input}\n"
                f"Intermediate Results: {intermediate_results}\n\n"
                "Respond with one of these actions:\n"
                "- Search(query)\n"
                "- Do nothing\n"
                "- Summarize(results): Only call when the length is greater than 100\n"
            )
        }
    ]

    response = client.chat.completions.create(
        model="gpt-4.1",
        messages=messages,
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "agent_schema",
                "schema": {
                    "type": "object",
                    "required": [],
                    "properties": {}
                },
                "strict": False
            }
        },
        tools=[
            {
                "type": "function",
                "function": {
                    "name": "search_action",
                    "strict": True,
                    "parameters": {
                        "type": "object",
                        "required": ["query"],
                        "properties": {
                            "query": {
                                "type": "string",
                                "description": "The search query string to be submitted to the search engine."
                            }
                        },
                        "additionalProperties": False
                    },
                    "description": "Performs a search using the SERP API and returns the organic results."
                }
            },
            {
                "type": "function",
                "function": {
                    "name": "do_nothing",
                    "strict": True,
                    "parameters": {
                        "type": "object",
                        "required": [],
                        "properties": {},
                        "additionalProperties": False
                    },
                    "description": "A function that performs no action."
                }
            },
            {
                "type": "function",
                "function": {
                    "name": "summarize_action",
                    "description": "Summarizes product results using OpenAI's GPT-4 model",
                    "parameters": {
                        "type": "object",
                        "required": ["results"],
                        "properties": {
                            "results": {
                                "type": "string",
                                "description": "The product results to summarize"
                            }
                        },
                        "additionalProperties": False
                    },
                    "strict": True
                }
            }
        ],
        tool_choice="required",
        temperature=1,
        max_completion_tokens=2048,
        top_p=1,
        frequency_penalty=0,
        presence_penalty=0
    )

    return response
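
The tool definitions above all share the same shape, so they could be generated by a small helper instead of being written out by hand. The sketch below is one possible way to do that. `make_tool` is a hypothetical helper name introduced here for illustration (it is not part of the OpenAI SDK); it only builds the plain dictionaries that the `tools=[...]` argument expects.

```python
def make_tool(name, description, properties=None, required=None):
    """Build one function-tool definition in the shape used by the tools list.

    `make_tool` is a hypothetical helper for illustration, not an SDK function.
    """
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "strict": True,
            "parameters": {
                "type": "object",
                "required": list(required or []),
                "properties": dict(properties or {}),
                "additionalProperties": False,
            },
        },
    }


tools = [
    make_tool(
        "search_action",
        "Performs a search using the SERP API and returns the organic results.",
        properties={
            "query": {
                "type": "string",
                "description": "The search query string to be submitted to the search engine.",
            }
        },
        required=["query"],
    ),
    make_tool("do_nothing", "A function that performs no action."),
    make_tool(
        "summarize_action",
        "Summarizes product results using OpenAI's GPT-4 model",
        properties={
            "results": {
                "type": "string",
                "description": "The product results to summarize",
            }
        },
        required=["results"],
    ),
]

# List the tool names that would be offered to the model
print([tool["function"]["name"] for tool in tools])
```

The resulting `tools` list could then be passed to `client.chat.completions.create(..., tools=tools)` in place of the inline definitions.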

Suppose no search has been performed yet:

In [None]:
assistant_message = reasoning_step(
    "start",
    "I lost my phone and I want to find the price of the newest iPhone. It seems the newest is the iPhone 15?",
    []
).choices[0].message

In [None]:
assistant_message

ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_WLHzdQFRyMjhyKyvFPqVQAYd', function=Function(arguments='{"query":"iPhone 15 newest model price 2023"}', name='search_action'), type='function')])

A search has already been run and returned information:

In [None]:
assistant_message_a = reasoning_step(
    "start",
    "iPhone 14 price?",
    [{
            "action": "search_action",
            "parameters": "iPhone 14 price?",
            "result": "price: $1000"
    }
     ]
).choices[0].message
assistant_message_a

ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_EuooyZ0bPFQ2ireNlWGwFC0C', function=Function(arguments='{}', name='do_nothing'), type='function')])

ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_wu7qpAlqxwPp6mACHfU9l70p', function=Function(arguments='{}', name='do_nothing'), type='function')])

Quality evaluation


```
Query: iphone 14 price?
Expect: the final output contains the price $500

```
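
A check like the one described above can be automated with a simple substring test. This is only a minimal sketch: `output_contains_price` is a hypothetical helper, and the sample answer strings are made up for illustration rather than taken from a real agent run.

```python
def output_contains_price(final_output, expected_price):
    """Return True if the expected price string appears in the agent's final output."""
    if final_output is None:
        return False
    return expected_price in str(final_output)


# Made-up outputs, for illustration only
sample_pass = "The iPhone 14 currently costs about $500."
sample_fail = "price: $1000"

print(output_contains_price(sample_pass, "$500"))
print(output_contains_price(sample_fail, "$500"))
```

A plain substring test is deliberately loose: it would also accept `$5000`, so a stricter evaluation might parse the number out of the answer first.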




In [None]:
import json

def process_tool_calls(response):
    # Extract tool call information from the assistant message
    tool_calls = response.tool_calls

    if tool_calls:
        # Extract details of the first tool call
        tool_call_id = tool_calls[0].id
        tool_function_name = tool_calls[0].function.name
        # Arguments arrive as a JSON string; parse them with json.loads instead of eval
        tool_query_string = json.loads(tool_calls[0].function.arguments)

        # Print extracted details
        print('Tool Function Name:', tool_function_name)
        print('Tool Query String:', tool_query_string)

        return tool_function_name, tool_query_string

    else:
        print("No tool calls identified.")
        return None, None

process_tool_calls(assistant_message)

Tool Function Name: summarize_action
Tool Query String: {'results': 'price: $1000'}


('summarize_action', {'results': 'price: $1000'})

In [None]:
def search_action(query):
    from serpapi import GoogleSearch
    api_key = userdata.get("serpapi")
    search = GoogleSearch({
        "q": query,
        "api_key": api_key,
        "num": 1})

    results = search.get_dict()

    # Check if the results contain weather information
    if 'answer_box' in results and results.get('answer_box', {}).get('type') == 'weather_result':
        return extract_weather_data(results)

    # Return the full results if not a weather query
    return results

def extract_weather_data(results):
    """
    Extracts weather data from SerpAPI weather results

    Args:
        results (dict): The SerpAPI results dictionary

    Returns:
        dict: Dictionary containing formatted weather information
    """
    weather_data = {}

    if 'answer_box' in results:
        answer_box = results['answer_box']

        # Extract current weather information
        weather_data['current'] = {
            'temperature': answer_box.get('temperature'),
            'unit': answer_box.get('unit'),
            'weather': answer_box.get('weather'),
            'location': answer_box.get('location'),
            'date': answer_box.get('date'),
            'humidity': answer_box.get('humidity'),
            'precipitation': answer_box.get('precipitation'),
            'wind': answer_box.get('wind')
        }

        # Extract forecast if available
        if 'forecast' in answer_box:
            weather_data['forecast'] = answer_box['forecast']

        # Extract hourly forecast if available
        if 'hourly_forecast' in answer_box:
            weather_data['hourly'] = answer_box['hourly_forecast']

    return weather_data

In [None]:
# Query (Vietnamese): "Today's weather"
search_action("Thời tiết hôm nay")

{'error': 'Your account has run out of searches.'}

In [None]:
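def search_action(query):
    from serpapi import GoogleSearch

    api_key = userdata.get("serpapi")
    search = GoogleSearch({
        "q": query,
        "api_key": api_key,
        "num": 1})

    results = search.get_dict()

    # Debug output: show the raw response
    print("--->", results)

    # Failures (for example an exhausted quota) come back under an "error" key
    if "error" in results:
        return f"Search error: {results['error']}"

    # Not every response has "organic_results", and not every item has a
    # "snippet", so read both defensively instead of indexing directly
    info = []
    for item in results.get("organic_results", []):
        snippet = item.get("snippet")
        if snippet:
            info.append(snippet)

    return "\n".join(info)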

In [None]:
# Query (Vietnamese): "Weather at Hồ Gươm (Hoàn Kiếm Lake) today"
search_action("thời tiết Hồ Gươm hôm nay")

---> {'search_metadata': {'id': '68031605409d3aa21b784cd5', 'status': 'Success', 'json_endpoint': 'https://serpapi.com/searches/1eeaeb5ea8699521/68031605409d3aa21b784cd5.json', 'created_at': '2025-04-19 03:18:29 UTC', 'processed_at': '2025-04-19 03:18:29 UTC', 'google_url': 'https://www.google.com/search?q=th%E1%BB%9Di+ti%E1%BA%BFt+H%E1%BB%93+G%C6%B0%C6%A1m+h%C3%B4m+nay&oq=th%E1%BB%9Di+ti%E1%BA%BFt+H%E1%BB%93+G%C6%B0%C6%A1m+h%C3%B4m+nay&num=1&sourceid=chrome&ie=UTF-8', 'raw_html_file': 'https://serpapi.com/searches/1eeaeb5ea8699521/68031605409d3aa21b784cd5.html', 'total_time_taken': 1.5}, 'search_parameters': {'engine': 'google', 'q': 'thời tiết Hồ Gươm hôm nay', 'google_domain': 'google.com', 'num': '1', 'device': 'desktop'}, 'search_information': {'query_displayed': 'thời tiết Hồ Gươm hôm nay', 'total_results': 2110000, 'time_taken_displayed': 0.2, 'organic_results_state': 'Results for exact spelling', 'results_for': 'Hoàn Kiếm Lake, Hang Trong'}, 'knowledge_graph': {'entity_type': '

'Thời tiết hiện tại ; RealFeel®. 86° ; RealFeel Shade™. 83° ; Chỉ số UV tối đa. 1 Thấp ; Gió. ĐĐN 6 mi/h ; Gió giật mạnh. 12 mi/h.'

In [None]:
# Query (Vietnamese): "Today's weather in Hà Nội for swimming at Hồ Gươm"
search_action("Thời tiết hôm nay Hà Nội đi bơi ở Hồ Gươm")

['Thời tiết hiện tại. 08:22. 77°F. Có mây. RealFeel® 86°. Rất ấm áp. RealFeel Shade™ 83°. Rất ấm áp. RealFeel®. 86°. RealFeel Shade™. 83°. Chỉ số UV tối đa.']


'Thời tiết hiện tại. 08:22. 77°F. Có mây. RealFeel® 86°. Rất ấm áp. RealFeel Shade™ 83°. Rất ấm áp. RealFeel®. 86°. RealFeel Shade™. 83°. Chỉ số UV tối đa.'
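
The responses seen so far come in different shapes: an error payload such as `{'error': ...}` and a full result that carries `organic_results`. The sketch below shows one way to pull snippets out of either shape without raising an exception. `extract_snippets` is a hypothetical helper, and the sample dictionaries are trimmed, hand-written stand-ins rather than real API responses.

```python
def extract_snippets(results):
    """Join the snippets of the organic results, tolerating missing keys."""
    # Error payloads carry no results at all
    if "error" in results:
        return ""

    snippets = []
    # Default to an empty list when "organic_results" is absent
    for item in results.get("organic_results", []):
        snippet = item.get("snippet")
        if snippet:
            snippets.append(snippet)
    return "\n".join(snippets)


# Hand-written stand-ins for three response shapes (not real API output)
error_response = {"error": "Your account has run out of searches."}
ok_response = {
    "organic_results": [
        {"snippet": "77F, cloudy."},
        {"title": "An item without a snippet"},
    ]
}
no_organic_response = {"answer_box": {"type": "weather_result"}}

print(repr(extract_snippets(error_response)))
print(repr(extract_snippets(ok_response)))
print(repr(extract_snippets(no_organic_response)))
```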

In [None]:
def do_nothing():
    return None

In [None]:
def summarize_action(results):
    client = OpenAI(api_key=userdata.get('open_ai_key'))
    prompt = f"Summarize the following product results:\n{results}"

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": prompt},
        ]
    )
    return response.choices[0].message.content


Open source model

In [None]:
import transformers

In [None]:
def summarize_using_open_source_model(doc):
    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    # Use the GPU when one is available, otherwise fall back to the CPU
    device = "cuda" if torch.cuda.is_available() else "cpu"

    tokenizer = AutoTokenizer.from_pretrained("minhtoan/t5-small-wikilingua_vietnamese")
    model = AutoModelForSeq2SeqLM.from_pretrained("minhtoan/t5-small-wikilingua_vietnamese")
    model.to(device)
    model.eval()

    tokenized_text = tokenizer.encode(doc, return_tensors="pt").to(device)

    # Generate the summary with beam search; no gradients are needed for inference
    with torch.no_grad():
        summary_ids = model.generate(
                            tokenized_text,
                            max_length=256,
                            num_beams=5,
                            repetition_penalty=2.5,
                            length_penalty=1.0,
                            early_stopping=True
                        )
    output = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return output

In [None]:
summarize_using_open_source_model("Hà Nội hiện có 30 đơn vị hành chính cấp huyện (12 quận, 17 huyện, một thị xã) và 526 đơn vị hành chính cấp xã (160 phường, 345 xã và 21 thị trấn). Thành phố chưa công bố số phường xã sau sắp xếp, nhưng nếu giảm 70% theo định hướng của lãnh đạo, Hà Nội sẽ còn 133 phường, xã và thị trấn.")


'Hà Nội hiện có 30 đơn vị hành chính cấp huyện (12 quận, 17 huyện, một thị xã) và 526 đơn vị hành chính cấp xã (160 phường, 345 xã và 21 thị trấn). Theo định hướng của lãnh đạo, Hà Nội sẽ còn 133 phường, xã và thị trấn.'

In [None]:
def answer_question(question, search_results):
    client = OpenAI(api_key=userdata.get('open_ai_key'))
    prompt = f"Answer the following question using the provided search results:\n\nQuestion: {question}\n\nSearch Results:\n{search_results}"
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": prompt},
        ]
    )
    return response.choices[0].message.content


In [None]:
function_mapping = {
    "search_action": search_action,
    "do_nothing": do_nothing,
    "summarize_action": summarize_action,
    "answer_question": answer_question,
}



In [None]:
# Query (Vietnamese): "Today's weather in Hà Nội for swimming at Hồ Gươm"
function_mapping["search_action"]("Thời tiết hôm nay Hà Nội đi bơi ở Hồ Gươm")

['Thời tiết hiện tại. 08:22. 77°F. Có mây. RealFeel® 86°. Rất ấm áp. RealFeel Shade™ 83°. Rất ấm áp. RealFeel®. 86°. RealFeel Shade™. 83°. Chỉ số UV tối đa.']


'Thời tiết hiện tại. 08:22. 77°F. Có mây. RealFeel® 86°. Rất ấm áp. RealFeel Shade™ 83°. Rất ấm áp. RealFeel®. 86°. RealFeel Shade™. 83°. Chỉ số UV tối đa.'

In [None]:
# function_to_call = function_mapping[tool_function_name]
# function_response = function_to_call(**tool_query_string)
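
The two commented lines above describe the dispatch step: look up the function by the tool name chosen by the model, then call it with the parsed arguments as keyword arguments. The sketch below runs that step offline. The stub functions, `stub_mapping`, and `dispatch` are hypothetical names introduced for illustration, so no API key is needed.

```python
import json


def search_stub(query):
    # Stand-in for the real search tool; it only echoes the query
    return f"results for: {query}"


def do_nothing_stub():
    return None


# Same idea as function_mapping, but pointing at the stubs
stub_mapping = {
    "search_action": search_stub,
    "do_nothing": do_nothing_stub,
}


def dispatch(tool_function_name, raw_arguments, mapping):
    """Parse the JSON argument string and call the mapped function with it."""
    arguments = json.loads(raw_arguments)
    function_to_call = mapping[tool_function_name]
    return function_to_call(**arguments)


print(dispatch("search_action", '{"query":"iPhone 15 newest model price 2023"}', stub_mapping))
print(dispatch("do_nothing", "{}", stub_mapping))
```

Unpacking with `**arguments` works because the keys in each tool schema match the parameter names of the Python function, so the two need to be kept in sync.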

In [None]:
# Get the current date in DD-MM-YYYY format

def get_current_date():
    from datetime import datetime
    current_date = datetime.now().strftime("%d-%m-%Y")
    return current_date

In [None]:
get_current_date()

'19-04-2025'

In [None]:
from openai import OpenAI
from google.colab import userdata

def reasoning_step_advanced(state, user_input, intermediate_results):
    client = OpenAI(api_key=userdata.get('open_ai_key'))

    current_date = get_current_date()
    # Build the system message from the date, state, user input, and intermediate results
    messages = [
        {
            "role": "system",
            "content": (
                "You are a reasoning and acting agent. Based on the current state and user input, decide the next action.\n"
                f"Current date is: {current_date}\n"
                f"State: {state}\n"
                f"User Input: {user_input}\n"
                f"Intermediate Results: {intermediate_results}\n\n"
                "Respond with one of these actions:\n"
                "- Search(query)\n"
                "- Do nothing\n"
                "- answer_question(results)\n"
            )
        }
    ]

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "agent_schema",
                "schema": {
                    "type": "object",
                    "required": [],
                    "properties": {}
                },
                "strict": False
            }
        },
        tools=[
            {
                "type": "function",
                "function": {
                    "name": "search_action",
                    "strict": True,
                    "parameters": {
                        "type": "object",
                        "required": ["query"],
                        "properties": {
                            "query": {
                                "type": "string",
                                "description": "The search query string to be submitted to the search engine."
                            }
                        },
                        "additionalProperties": False
                    },
                    "description": "Performs a search using the SERP API and returns the organic results."
                }
            },
            {
                "type": "function",
                "function": {
                    "name": "do_nothing",
                    "strict": True,
                    "parameters": {
                        "type": "object",
                        "required": [],
                        "properties": {},
                        "additionalProperties": False
                    },
                    "description": "A function that performs no action."
                }
            },
            {
                "type": "function",
                "function": {
                    "name": "summarize_action",
                    "description": "Summarizes product results using OpenAI's GPT-4 model",
                    "parameters": {
                        "type": "object",
                        "required": ["results"],
                        "properties": {
                            "results": {
                                "type": "string",
                                "description": "The product results to summarize"
                            }
                        },
                        "additionalProperties": False
                    },
                    "strict": True
                }
            },
            {
                "type": "function",
                "function": {
                    "name": "answer_question",
                    "description": "Answers the question using the search results",
                    "parameters": {
                        "type": "object",
                        "required": ["question", "search_results"],
                        "properties": {
                            "question": {
                                "type": "string",
                                "description": "The question to be answered"
                            },
                            "search_results": {
                                "type": "string",
                                "description": "The search results from the API used to answer the question"
                            }
                        },
                        "additionalProperties": False
                    },
                    "strict": True
                }
            }
        ],
        tool_choice="required",
        temperature=1,
        max_completion_tokens=2048,
        top_p=1,
        frequency_penalty=0,
        presence_penalty=0
    )

    return response

In [None]:
def react_agent(user_input, max_steps=5):
    # max_steps is a safety cap added here (an assumption, not in the original
    # notebook) so the loop always terminates
    state = "start"
    intermediate_results = []

    for _ in range(max_steps):
        # Reasoning step: decide the next action
        action_response = reasoning_step_advanced(state, user_input, intermediate_results).choices[0].message
        tool_function_name, tool_query_string = process_tool_calls(action_response)
        state = "in_progress"

        # Stop when the model chooses do_nothing or returns no tool call
        if tool_function_name is None or tool_function_name == "do_nothing":
            break

        # Execute the selected action
        function_to_call = function_mapping[tool_function_name]
        action_result = function_to_call(**tool_query_string)

        print(action_result)

        # Save intermediate results
        intermediate_results.append({
            "action": tool_function_name,
            "parameters": tool_query_string,
            "result": action_result
        })

    state = "end"
    # Return the most recent result, or None if no action was executed
    return intermediate_results[-1]["result"] if intermediate_results else None

# Query (Vietnamese): "Is today's weather suitable for swimming at Hồ Gươm?"
user_query = "Thời tiết hôm nay có phù hợp để đi bơi ở Hồ Gươm không?"
final_answer = react_agent(user_query)
print(f"Final Output: {final_answer}")

Tool Function Name: search_action
Tool Query String: {'query': 'Thời tiết Hà Nội ngày 19-04-2025'}


KeyError: 'organic_results'
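
The run above stops with a `KeyError` inside the search tool, so the loop itself never gets to finish. Its control flow can still be exercised without any API keys by swapping the model and the search tool for stubs. The sketch below is a simplified stand-in, not the notebook's agent: `run_loop`, `scripted_decide`, `stub_tools`, and the `max_steps` cap are all hypothetical names introduced for illustration.

```python
def run_loop(decide, tools, max_steps=5):
    """Repeat decide -> act until 'do_nothing' is chosen or the step cap is hit."""
    intermediate_results = []

    for _ in range(max_steps):
        # Reasoning step: here a plain function stands in for the LLM call
        name, arguments = decide(intermediate_results)

        if name is None or name == "do_nothing":
            break

        # Acting step: call the chosen tool with its keyword arguments
        result = tools[name](**arguments)
        intermediate_results.append({
            "action": name,
            "parameters": arguments,
            "result": result,
        })

    # Most recent result, or None if no action was ever executed
    return intermediate_results[-1]["result"] if intermediate_results else None


def scripted_decide(intermediate_results):
    # Scripted policy: search once, then stop
    if not intermediate_results:
        return "search_action", {"query": "weather in Hanoi"}
    return "do_nothing", {}


# Stub tool that echoes its query instead of calling a search API
stub_tools = {"search_action": lambda query: f"stub result for '{query}'"}

print(run_loop(scripted_decide, stub_tools))
```

Capping the number of iterations guards against a model that never chooses `do_nothing`, and returning `None` on an empty history avoids indexing into an empty list.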