## Preamble - Install Deps

There are only a few dependencies for this tutorial.

In [1]:
!pip install pyautogen openai matplotlib opencv-python requests


## Introduction - Art Generator

In this example, we will use the AutoGen framework to construct an agent capable of generating an image through DALL-E 3 and saving it to local disk.

For this demo, we use the [function calling](https://readme.fireworks.ai/docs/function-calling) feature launched by Fireworks. We initialize two agents: `UserProxyAgent` and `AssistantAgent`. The `AssistantAgent` can issue calls to the provided functions but cannot execute them, while the `UserProxyAgent` executes the function calls issued by the `AssistantAgent`. To achieve this behaviour, we use two decorators provided by the AutoGen library, `register_for_llm` and `register_for_execution`. These decorators let us define ordinary Python functions and turn them into the [JSON Spec](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat) needed by the function calling API.
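
To make the idea of turning a Python function into a JSON spec more concrete, here is a minimal sketch that reads `Annotated` type hints and builds an OpenAI-style function description. It is an illustration only and not AutoGen's actual implementation; the helper `function_to_spec` and the `_JSON_TYPES` table are hypothetical names introduced for this example, and it assumes every parameter is annotated with `Annotated[<type>, "<description>"]`.

```python
import inspect
import json
from typing import Annotated, get_type_hints

# Hypothetical mapping from Python types to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def function_to_spec(func, name, description):
    """Build an OpenAI-style function spec from Annotated parameter hints."""
    hints = get_type_hints(func, include_extras=True)
    properties, required = {}, []
    for param_name, param in inspect.signature(func).parameters.items():
        hint = hints[param_name]
        # Annotated[str, "text"] exposes the base type and the description.
        base_type = hint.__origin__
        param_description = hint.__metadata__[0]
        properties[param_name] = {
            "type": _JSON_TYPES[base_type],
            "description": param_description,
        }
        # Parameters without a default value must be supplied by the model.
        if param.default is inspect.Parameter.empty:
            required.append(param_name)
    return {
        "name": name,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def show_image(path: Annotated[str, "The path to the image file"]) -> str:
    return ""


spec = function_to_spec(show_image, "show_image", "Display an image file.")
print(json.dumps(spec, indent=2))
```

The model only ever sees this description; the function body stays on the executing side.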

Finally, we set up the system prompts for both agents. We ask the `AssistantAgent` to be a helpful agent and to focus on generating the correct function calls, and we leave the `UserProxyAgent` as is. For more advanced use cases, we can ask the `UserProxyAgent` to be a plan generator.
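
The system prompt used below asks the assistant to end its final summary with the word TERMINATE, and the user proxy stops the chat once it sees that word. A termination check along these lines can be tried in isolation; this is a sketch under the assumption that messages are dictionaries with an optional `content` key, and the tolerance for a trailing period is a design choice of this example rather than something AutoGen requires.

```python
def is_termination_msg(message: dict) -> bool:
    """Return True when the message content ends with the word TERMINATE."""
    # content may be missing or None, e.g. for messages that only carry a function call.
    content = message.get("content") or ""
    # Ignore trailing whitespace and an optional trailing period.
    return content.rstrip().rstrip(".").endswith("TERMINATE")


print(is_termination_msg({"content": "Here is your image. TERMINATE"}))
print(is_termination_msg({"content": None}))
```

Writing the check as a named function rather than a lambda makes it easy to unit test before handing it to the agent.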


In [17]:
import autogen
import os
from IPython import get_ipython
from typing_extensions import Annotated
from typing import List
import uuid
import requests  # to perform HTTP requests
from pathlib import Path
from openai import OpenAI
from matplotlib import pyplot as plt
import cv2

In [12]:
config_list = autogen.config_list_from_dotenv(
  dotenv_file_path="~/secret_keys.env",
  model_api_key_map={
      #"gpt-4": {
      #  "api_key_env_var": "OPENAI_API_KEY",
      #},
      "accounts/fireworks/models/function-call-v1": {
          "api_key_env_var": "FW_API_KEY",
          "base_url": "https://api.fireworks.ai/inference/v1",
      },
  },
  filter_dict={
      "model": {
          #"gpt-4",
          "accounts/fireworks/models/function-call-v1"
      }
  }
)

In [None]:
config_list
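
For readers who do not keep keys in a dotenv file, a roughly equivalent configuration can be assembled by hand. The sketch below assumes that each entry is a dictionary with `model`, `api_key` and `base_url` keys and that the key is already exported as the `FW_API_KEY` environment variable; `build_config_list` is a hypothetical helper, not part of AutoGen.

```python
import os


def build_config_list(model: str, api_key_env_var: str, base_url: str) -> list:
    """Assemble a one-entry config list from an environment variable."""
    return [
        {
            "model": model,
            "api_key": os.environ.get(api_key_env_var, ""),
            "base_url": base_url,
        }
    ]


manual_config_list = build_config_list(
    model="accounts/fireworks/models/function-call-v1",
    api_key_env_var="FW_API_KEY",
    base_url="https://api.fireworks.ai/inference/v1",
)

# Print the entry without leaking the key itself.
for entry in manual_config_list:
    masked = {**entry, "api_key": "***" if entry["api_key"] else "<missing>"}
    print(masked)
```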

In [14]:
llm_config = {
    "config_list": config_list,
    "timeout": 120,
    "temperature": 0
}
chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="You are a helpful assistant that can use available functions when needed to solve problems. At each point, do your best to determine if the user's request has been addressed. If the request HAS been addressed, respond with a summary of the result. The summary must be written as a coherent helpful response to the user request e.g. 'Sure, here is result to your request ' or 'The tallest mountain in Africa is ..' etc. The summary MUST end with the word TERMINATE. If the  user request is pleasantry or greeting, you should respond with a pleasantry or greeting and TERMINATE.",
    llm_config=llm_config,
)

# create a UserProxyAgent instance named "user_proxy"
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    # The system message asks for a final "TERMINATE"; accept it with or without a trailing period.
    is_termination_msg=lambda x: (x.get("content") or "").rstrip().rstrip(".").endswith("TERMINATE"),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    code_execution_config={"work_dir": "coding"},
)


@user_proxy.register_for_execution()
@chatbot.register_for_llm(name="generate_and_save_images", description="Function to paint, draw or illustrate images based on the user's query or request. Generates images from a given query using OpenAI's DALL-E model and saves them to disk. Use this function anytime there is a request to create an image.")
def generate_and_save_images(
    query: Annotated[str, "A natural language description of the image to be generated"],
    image_size: Annotated[str, "The size of the image to be generated. (default is '1024x1024')"] = "1024x1024",
) -> dict:
    """
    Generate an image with DALL-E 3 and save it to the current directory.

    :param query: A natural language description of the image to be generated.
    :param image_size: The size of the image to be generated (default is '1024x1024').
    :return: A dictionary whose "path" key holds the filename of the saved image.
    """

    client = OpenAI()  # Initialize the OpenAI client
    response = client.images.generate(model="dall-e-3", prompt=query, n=1, size=image_size)  # Generate images

    # List to store the file names of saved images
    saved_files = []

    # Check if the response is successful
    if response.data:
        for image_data in response.data:
            # Generate a random UUID as the file name
            file_name = str(uuid.uuid4()) + ".png"  # Assuming the image is a PNG
            file_path = Path(file_name)

            img_url = image_data.url
            img_response = requests.get(img_url, timeout=60)  # time out instead of hanging on a stalled download
            if img_response.status_code == 200:
                # Write the binary content to a file
                with open(file_path, "wb") as img_file:
                    img_file.write(img_response.content)
                    print(f"Image saved to {file_path}")
                    saved_files.append(str(file_path))
            else:
                print(f"Failed to download the image from {img_url}")
    else:
        print("No image data found in the response!")

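    # Fail with a clear message if nothing was saved, instead of an IndexError below
    if not saved_files:
        raise RuntimeError("No image was generated or downloaded.")

    # Return the path of the saved image (only one is requested, since n=1)
    return {"path": saved_files[0]}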

# Example usage of the function:
# generate_and_save_images("A cute baby sea otter")

@user_proxy.register_for_execution()
@chatbot.register_for_llm(name="show_image", description="A function that is capable of displaying an image given the path to an image file in png or jpg or jpeg.")
def show_image(path: Annotated[str, "The path to the image file that needs to be displayed"]) -> str:
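  img = cv2.imread(path)  # OpenCV loads images in BGR channel order
  if img is None:
    return f"Could not read an image from {path}"
  plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # matplotlib expects RGB
  plt.axis("off")
  plt.show()
  return f"Displayed {path}"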
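
The file-saving step of `generate_and_save_images` can be exercised without calling the image API or the network. The sketch below isolates it in a hypothetical helper, `save_image_bytes`, which writes raw bytes to a UUID-named `.png` file; the temporary directory and the placeholder bytes are assumptions made only for this example.

```python
import tempfile
import uuid
from pathlib import Path


def save_image_bytes(content: bytes, out_dir: Path) -> Path:
    """Write raw image bytes to a uniquely named .png file and return its path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # A random UUID avoids overwriting images from earlier requests.
    file_path = out_dir / f"{uuid.uuid4()}.png"
    file_path.write_bytes(content)
    return file_path


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder bytes (the PNG signature only), not a viewable image.
    saved = save_image_bytes(b"\x89PNG\r\n\x1a\n", Path(tmp))
    print(saved.name, saved.stat().st_size)
```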



In [None]:
chatbot.llm_config

In [None]:
# start the conversation
user_proxy.initiate_chat(
    chatbot,
    message="paint and show an image of a glass of ethiopian coffee, freshly brewed in a tall glass cup, on a table right in front of a lush green forest scenery.",
)
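
During this chat, the assistant replies with function calls and the user proxy executes them and sends the results back. The round trip can be sketched without any model or network access as follows. The registry `FUNCTION_MAP`, the `register` decorator and the message shape are simplified, hypothetical stand-ins and do not reproduce AutoGen's internals.

```python
import json

# Hypothetical registry mapping function names to Python callables.
FUNCTION_MAP = {}


def register(name):
    """Decorator that makes a function available for execution by name."""
    def decorator(func):
        FUNCTION_MAP[name] = func
        return func
    return decorator


@register("show_image")
def show_image(path: str) -> str:
    # Stand-in for the real display function.
    return f"displayed {path}"


def execute_function_call(function_call: dict) -> dict:
    """Run the requested function and wrap its result as a reply message."""
    name = function_call["name"]
    func = FUNCTION_MAP.get(name)
    if func is None:
        return {"role": "function", "name": name, "content": f"Error: function {name} not found"}
    # The model sends its arguments as a JSON-encoded string.
    arguments = json.loads(function_call["arguments"])
    return {"role": "function", "name": name, "content": str(func(**arguments))}


call = {"name": "show_image", "arguments": json.dumps({"path": "coffee.png"})}
reply = execute_function_call(call)
print(reply)
```

Keeping proposal and execution separate means the model never runs code itself; it can only ask for functions that were explicitly registered.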