# Intro to Built-In Functions from `contrib.functions`


## Initial Setup

Let's first import the necessary modules. The agents themselves are defined in the Simple Example section.

In [1]:
import os
from autogen import AssistantAgent, UserProxyAgent
from autogen.agentchat.contrib.functions import youtube_utils as yt
from autogen.agentchat.contrib.functions import file_utils as fu

## Functions and Requirements

A Python function can have many requirements, for example third-party Python packages and secrets.

### Accessing requirements
You can access a function's requirements via its `python_packages` and `env_var` properties.

In [2]:
# get the requirements for the youtube transcript function
print("Code: ", yt.get_youtube_transcript.name)
print("Required python packages: ", yt.get_youtube_transcript.python_packages)


Code:  get_youtube_transcript
Required python packages:  ['youtube_transcript_api==0.6.0']


### Testing and pre-installing requirements

We also provide methods to install the required Python packages. To do this, execute the following method in your execution environment. If required secrets are missing, the method will raise an error.

This is especially useful when setup is costly and must be completed before the function is actually invoked in an end task (in this case, when the agent calls it).
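
Separately from the library's own installer, a quick pre-flight check can be sketched with the standard library alone. The helper `missing_packages` below is hypothetical (it is not part of `contrib.functions`) and assumes pinned `name==version` strings like the one printed by `python_packages` above.

```python
from importlib import metadata


def missing_packages(python_packages):
    """Return the pinned specs that are absent or installed at another version.

    Hypothetical helper: assumes each spec looks like "name==version".
    """
    unmet = []
    for spec in python_packages:
        name, _, wanted = spec.partition("==")
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            unmet.append(spec)  # package is not installed at all
            continue
        if wanted and installed != wanted:
            unmet.append(spec)  # installed, but at a different version
    return unmet


# Example with the requirement string printed earlier in this notebook.
print(missing_packages(["youtube_transcript_api==0.6.0"]))
```

An empty list means every pinned package is already present at the requested version, so the costly setup step can be skipped.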

## Simple Example

In [3]:
config_list = [
    {
        "model": "gpt-4",
        "api_key": os.environ.get("OPENAI_API_KEY"),
    }
]

assistant = AssistantAgent(name="coder", llm_config={"config_list": config_list, "cache": None})
user = UserProxyAgent(
    name="user",
    code_execution_config={
        "work_dir": "/tmp",
    },
    human_input_mode="NEVER",
    is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
)
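
The `is_termination_msg` lambda above returns `None` or an empty string (both falsy) rather than `False` when a message carries no text content, as can happen for tool-call messages. As a sketch, the same check can be written as a named function that always returns a `bool`. The standalone function below is illustrative only and works on plain dictionaries.

```python
def is_termination_msg(message):
    """Return True only when the message text contains the TERMINATE keyword."""
    content = message.get("content") or ""  # treat missing or None content as empty
    return "TERMINATE" in content


print(is_termination_msg({"content": "All done. TERMINATE"}))  # True
print(is_termination_msg({"content": None}))  # False
print(is_termination_msg({}))  # False
```

Such a function could be passed as `is_termination_msg=is_termination_msg` in place of the lambda.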

In [4]:
assistant.register_for_llm(description="Fetch transcript of a youtube video")(yt.get_youtube_transcript)
user.register_for_execution()(yt.get_youtube_transcript)

result = user.initiate_chat(
    assistant,
    message="Please summarize the video: https://www.youtube.com/watch?v=9iqn1HhFJ6c",
    summary_method="last_msg",
)

user (to coder):

Please summarize the video: https://www.youtube.com/watch?v=9iqn1HhFJ6c

--------------------------------------------------------------------------------
coder (to user):

***** Suggested tool Call (call_zCo0cdMpn3jfN8jlu7LWGqcT): get_youtube_transcript *****
Arguments: 
{
"youtube_link": "https://www.youtube.com/watch?v=9iqn1HhFJ6c"
}
***************************************************************************************

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING FUNCTION get_youtube_transcript...
user (to coder):

***** Response from calling tool "call_zCo0cdMpn3jfN8jlu7LWGqcT" *****
**********************************************************************

--------------------------------------------------------------------------------
coder (to user):

The video discusses the potential and risks of Arti

## Advanced: Registering Multiple Functions

Let's import multiple functions and use them to accomplish more complex tasks.

In [5]:
# register multiple file reading functions
for foo in [
    # fu.read_text_from_image,
    fu.read_text_from_pdf,
    fu.read_text_from_docx,
    fu.read_text_from_pptx,
    fu.read_text_from_xlsx,
    fu.read_text_from_audio,
]:
    foo_desc = foo.__doc__  # get docstring of the function
    assistant.register_for_llm(description=foo_desc)(foo)
    user.register_for_execution()(foo)
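
The loop above relies on every function having a docstring; `foo.__doc__` is `None` for a function without one. A minimal sketch of a more defensive lookup follows. The helper name `describe` and its fallback wording are hypothetical; only `inspect.getdoc` from the standard library is used.

```python
import inspect


def describe(fn):
    """Return a cleaned docstring for fn, or a generic fallback description."""
    doc = inspect.getdoc(fn)  # dedents the docstring and trims surrounding blank lines
    if doc:
        return doc
    return f"Call the function {fn.__name__}"  # hypothetical fallback text


def with_doc(path):
    """Read text from a file."""


def without_doc(path):
    pass


print(describe(with_doc))
print(describe(without_doc))
```

In the registration loop, `foo_desc = describe(foo)` would then always yield a non-empty description.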

In [6]:
# Note: despite the variable names, dummy_png points to a JPEG image and dummy_mp3 to a WAV file.
dummy_png = "https://upload.wikimedia.org/wikipedia/commons/thumb/0/0f/Captioned_image_dataset_examples.jpg/1024px-Captioned_image_dataset_examples.jpg"
dummy_pdf = "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"
dummy_mp3 = "https://github.com/realpython/python-speech-recognition/raw/master/audio_files/harvard.wav"

result = user.initiate_chat(
    assistant,
    message=f"Please summarize the contents of the following files: {' '.join([dummy_png, dummy_pdf, dummy_mp3])}",
    summary_method="last_msg",
)

user (to coder):

Please summarize the contents of the following files: https://upload.wikimedia.org/wikipedia/commons/thumb/0/0f/Captioned_image_dataset_examples.jpg/1024px-Captioned_image_dataset_examples.jpg https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf https://github.com/realpython/python-speech-recognition/raw/master/audio_files/harvard.wav

--------------------------------------------------------------------------------
coder (to user):

***** Suggested tool Call (call_A82Eb6pF1WcBZR6rBlfTEL0h): read_text_from_pdf *****
Arguments: 
{
  "file_path": "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"
}
***********************************************************************************

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING FUNCTION read_text_from_pdf...
user (to coder):

***** Response f

## Advanced: Functions that Require Secrets

In this example, we use a function that needs a secret, e.g., an `OPENAI_API_KEY`, in order to work. One such function uses GPT-4 Vision to perform image understanding.
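
Before starting a chat that depends on a secret, it can help to fail fast with a clear message. The sketch below is one way to do that with the standard library; `require_env` is a hypothetical helper, not part of `contrib.functions`, and it assumes secrets are supplied as environment variables.

```python
import os


def require_env(*names):
    """Return the named environment variables, or raise if any is unset or empty."""
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise EnvironmentError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: os.environ[name] for name in names}


# Report a missing key up front instead of failing inside the tool call.
try:
    require_env("OPENAI_API_KEY")
    print("OPENAI_API_KEY is set")
except EnvironmentError as err:
    print(err)
```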

In [7]:
assistant.register_for_llm(description="Use gpt4 vision to understand an image")(fu.caption_image_using_gpt4v)
user.register_for_execution()(fu.caption_image_using_gpt4v)

result = user.initiate_chat(
    assistant,
    message=f"Please summarize the contents of the following image using gpt4v: {dummy_png}",
    summary_method="last_msg",
)

user (to coder):

Please summarize the contents of the following image using gpt4v: https://upload.wikimedia.org/wikipedia/commons/thumb/0/0f/Captioned_image_dataset_examples.jpg/1024px-Captioned_image_dataset_examples.jpg

--------------------------------------------------------------------------------
coder (to user):

***** Suggested tool Call (call_PeT4QTSuC1T8BOzGfCswdCc8): caption_image_using_gpt4v *****
Arguments: 
{
  "file_path_or_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/0/0f/Captioned_image_dataset_examples.jpg/1024px-Captioned_image_dataset_examples.jpg"
}
******************************************************************************************

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING FUNCTION caption_image_using_gpt4v...
user (to coder):

***** Response from calling tool "call_PeT4QTSuC1T8BOzGfCswdCc8" *****