<a href="https://colab.research.google.com/github/ychoi-kr/30days-i18n/blob/main/openai/openai_assistant_with_web_search.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install openai tavily-python



In [None]:
from google.colab import userdata
from openai import OpenAI
import time
import json
from tavily import TavilyClient

In [None]:
openai_api_key = userdata.get('OPENAI_API_KEY')
tavily_api_key = userdata.get('TAVILY_API_KEY')

In [None]:
assistant_instructions = """
You create a glossary entry in Korean on a given term.

Use the web_search tool for initial research to gather and verify information from credible sources. This ensures that definitions are informed by the most recent and reliable data.

If the tool does not return any information, abort with a failure message.

Before including a URL, verify its validity and ensure it leads to the specific content being referenced. Avoid using generic homepage URLs unless they directly relate to the content. Never fabricate a fictional URL.

Instead of using honorific sentence endings (e.g. "입니다"), use the plain haerache style (e.g. "이다") to maintain a direct and concise tone.

Follow output format below:
```
[Term]란 [comprehensive definition in 2-3 paragraphs].

### 참고

{% for each reference %}
- {%=reference in APA style. If the author and site name are not the same, write the author and site name separately.}
{% end for %}
```
"""

In [None]:
openai_client = OpenAI(api_key=openai_api_key)

In [None]:
tavily_client = TavilyClient(api_key=tavily_api_key)


In [None]:
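def web_search(query):
    """Search the web with Tavily and return the collected context as one string."""
    search_result = tavily_client.get_search_context(query, search_depth="advanced", max_tokens=8000)
    print(search_result)  # echo the raw context so the research step is visible in the notebook
    return search_result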

In [None]:
web_search_json = {
    "name": "web_search",
    "description": "Get recent information from the web.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "The search query to use."},
        },
        "required": ["query"]
    }
}
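
The schema above only describes the tool to the model; the notebook still has to map the tool name back to a Python function when a run asks for it. The following is a minimal sketch of one way to do that with a registry dict. `dispatch_tool_call`, `TOOL_REGISTRY`, and `fake_search` are hypothetical names introduced here for illustration and are not part of the openai or tavily libraries. The sketch assumes tool outputs are sent back as strings, so non-string results are serialized to JSON.

```python
import json
from typing import Callable, Dict


def dispatch_tool_call(name: str, arguments_json: str,
                       registry: Dict[str, Callable[..., object]]) -> str:
    """Run the registered function for a tool call and return its output as a string."""
    func = registry.get(name)
    if func is None:
        # Report unknown tools back to the model instead of raising.
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments_json or "{}")
        result = func(**kwargs)
    except Exception as exc:  # malformed JSON, wrong arguments, or a failing tool
        return json.dumps({"error": f"{type(exc).__name__}: {exc}"})
    # Assumption: outputs are submitted as strings, so serialize anything else.
    return result if isinstance(result, str) else json.dumps(result, ensure_ascii=False)


# Hypothetical stand-in for web_search so the sketch runs without API keys.
def fake_search(query: str) -> str:
    return f"results for {query}"


TOOL_REGISTRY = {"web_search": fake_search}
```

In the notebook itself the registry would presumably hold the real function, e.g. `{"web_search": web_search}`, and the arguments string would come from `tool.function.arguments`.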

In [None]:
assistant = openai_client.beta.assistants.create(
    name="Define it!",
    instructions=assistant_instructions,
    model="gpt-4-turbo-preview",
    tools=[{"type": "function", "function": web_search_json}],
)

In [None]:
thread = openai_client.beta.threads.create()

message = openai_client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Large Multimodal Models",
)

In [None]:
run = openai_client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)
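
A run created this way is processed asynchronously, so its status has to be polled until it either needs tool output or stops changing. The following is a minimal sketch of a polling helper with a timeout, so that a stuck run cannot block the notebook forever. `poll_until` and `TERMINAL_STATUSES` are hypothetical names, and the status set is an assumption based on the run lifecycle of the Assistants API rather than something stated in this notebook.

```python
import time
from typing import Callable, FrozenSet

# Assumption: statuses after which a run no longer changes.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled", "expired", "incomplete"})


def poll_until(fetch_status: Callable[[], str],
               stop_statuses: FrozenSet[str] = TERMINAL_STATUSES | {"requires_action"},
               interval: float = 1.0,
               timeout: float = 300.0,
               sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fetch_status repeatedly until it returns a stop status or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in stop_statuses:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still '{status}' after {timeout} seconds")
        sleep(interval)
```

With the objects defined in this notebook, `fetch_status` could be `lambda: openai_client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id).status`. Returning on `requires_action` as well lets the caller submit tool outputs and then poll again.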

In [None]:
while True:
    run = openai_client.beta.threads.runs.retrieve(
        thread_id=thread.id,
        run_id=run.id,
    )
    run_status = run.status

    if run_status == "requires_action" and run.required_action is not None:
        tools_to_call = run.required_action.submit_tool_outputs.tool_calls
        tool_output_array = []
        for tool in tools_to_call:
            tool_call_id = tool.id
            function_name = tool.function.name
            function_arg = json.loads(tool.function.arguments)
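            if function_name == 'web_search':
                output = web_search(function_arg["query"])
            else:
                # Report unknown tools back to the model; otherwise `output` would be undefined here.
                output = f"Unknown tool: {function_name}"
            tool_output_array.append({"tool_call_id": tool_call_id, "output": output})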

        run = openai_client.beta.threads.runs.submit_tool_outputs(
            thread_id=thread.id,
            run_id=run.id,
            tool_outputs=tool_output_array,
        )
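    # Stop on every terminal status, not only success or failure, so the loop cannot spin forever.
    elif run_status in ["completed", "failed", "cancelled", "expired", "incomplete"]:
        break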

    time.sleep(1)

"[\"{\\\"url\\\": \\\"https://www.cogitotech.com/blog/large-multimodal-models-the-next-big-gen-ai-wave/\\\", \\\"content\\\": \\\"Generative AI blends creativity & technology to delighting humanity\\\\nDocument Processing Service with Annotation for Data Extraction & Verification\\\\nComputer Vision Datasets for Object Detection in AI & ML\\\\nData Annotation and Labeling Consultant for AI, ML\\\\nContent Moderation Services For Machine Learning\\\\nNLP Annotation Services for AI-Driven Machine Learning\\\\nData to Turbocharge AI for Autonomous Vehicles\\\\nMedical AI Data Solutions\\\\nAI Training Data for the Logistics Industry\\\\nAI Data for the Insurance Industry\\\\nAI Data for Geospatial Applications\\\\nAI Training Data for Retail\\\\nAI Training Data for Financial Services\\\\nAI Data for Robotics Industry\\\\nAI Data for the E-Commerce Industry\\\\nAI Training Data for Agritech\\\\nAI for Security & Surveillance Ecosystem\\\\nLarge Multimodal Models: The Next Big Gen AI Wave\

In [None]:
if run_status == 'completed':
    messages = openai_client.beta.threads.messages.list(
        thread_id=thread.id,
    )
    # Messages are listed newest first by default, so data[0] is the assistant's reply.
    print(messages.data[0].content[0].text.value)
else:
    print(f"Run status: {run_status}")

대형 다중모달 모델(Large Multimodal Models, LMMs)이란 텍스트, 이미지, 오디오 및 비디오와 같은 다양한 유형의 데이터를 처리하고 생성할 수 있는 고급 인공지능 시스템을 의미한다. 이러한 모델들은 최근 생성적 인공지능(Generative AI)의 발전에 의해 가능해졌으며, 특정 데이터 유형에만 한정되지 않고 다양한 데이터 유형을 분석하고 이해할 수 있도록 설계되었다. 따라서 대형 다중모달 모델은 입력 데이터에 대한 더 포괄적인 이해를 생성할 수 있다.

GPT4V와 같은 대형 다중모달 모델의 예시에서 볼 수 있듯이, 이러한 모델들은 자연어 이해와 컴퓨터 비전을 필요로 하는 작업들, 예를 들어 이미지 캡션 생성, 시각적 질문 응답, 텍스트-이미지 합성, 이미지-텍스트 번역 등을 수행할 수 있다. 이와 같은 모델들은 텍스트 이외의 데이터도 처리할 수 있는 능력을 갖추고 있어, 시간이 지남에 따라 오디오 및 비디오 데이터의 입력을 포함하는 방향으로 연구가 확장되고 있다고 알려져 있다.

### 참고

- Cogito. (n.d.). Large Multimodal Models: The Next Big Gen AI Wave. Retrieved from https://www.cogitotech.com/blog/large-multimodal-models-the-next-big-gen-ai-wave/
- Alto, V. (n.d.). Getting Started with Multimodality. Towards Data Science. Retrieved from https://towardsdatascience.com/getting-started-with-multimodality-eab5f6453080
- AI Multiple. (n.d.). Large Multimodal Models (LMMs) vs Large Language Models (LLMs). Retrieved from https://research.aimultiple.com/large-mu

In [None]:
openai_client.beta.assistants.delete(assistant.id)

AssistantDeleted(id='asst_92ZEe4Z5P53n6C3pO2UTc1Ux', deleted=True, object='assistant.deleted')
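
If an earlier cell raises, the delete call above never runs and the assistant stays in the account. The following is a minimal sketch of wrapping creation and deletion in a context manager so cleanup happens either way. `temporary_assistant` is a hypothetical helper, and the `fake_client` built from `SimpleNamespace` is only a stand-in so the sketch runs without an API key; it mimics the `beta.assistants.create` and `beta.assistants.delete` calls used in this notebook.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_assistant(client, **create_kwargs):
    """Create an assistant, yield it, and delete it even if the body raises."""
    assistant = client.beta.assistants.create(**create_kwargs)
    try:
        yield assistant
    finally:
        client.beta.assistants.delete(assistant.id)


# Stand-in client for demonstration only; it records deleted ids in a list.
deleted = []
fake_assistants = SimpleNamespace(
    create=lambda **kw: SimpleNamespace(id="asst_demo", **kw),
    delete=lambda assistant_id: deleted.append(assistant_id),
)
fake_client = SimpleNamespace(beta=SimpleNamespace(assistants=fake_assistants))

with temporary_assistant(fake_client, name="Define it!") as demo:
    print(demo.id, demo.name)
```

With the real client, the same pattern would presumably be `with temporary_assistant(openai_client, name=..., instructions=..., model=..., tools=...) as assistant:` around the thread and run cells.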