<div style="display: flex; justify-content: flex-start; align-items: center; gap: 15px; margin-bottom: 20px;">
  <a target="_blank" href="https://colab.research.google.com/github.com/SylphAI-Inc/AdalFlow/blob/main/notebooks/tutorials/adalflow_modelclient.ipynb">
    <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
  </a>
  <a href="https://github.com/SylphAI-Inc/AdalFlow/blob/main/tutorials/adalflow_modelclient_sync_and_async.py" target="_blank" style="display: flex; align-items: center;">
      <img src="https://github.githubassets.com/images/modules/logos_page/GitHub-Mark.png" alt="GitHub" style="height: 20px; width: 20px; margin-right: 5px;">
      <span style="vertical-align: middle;"> Open Source Code [Partial]</span>
  </a>
</div>

# 🤗 Welcome to AdalFlow!
## The PyTorch-like library to auto-optimize any LLM task pipeline

Thanks for trying us out! We're here to provide you with the best LLM application development experience you can dream of 😊. If you have any questions or concerns, [come talk to us on Discord](https://discord.gg/ezzszrRZvT); we're always here to help! ⭐ <i>Star us on <a href="https://github.com/SylphAI-Inc/AdalFlow">GitHub</a> </i> ⭐


# Quick Links

GitHub repo: https://github.com/SylphAI-Inc/AdalFlow

Full Tutorials: https://adalflow.sylph.ai/index.html#.

Deep dive on each API: check out the [developer notes](https://adalflow.sylph.ai/tutorials/index.html).

Common use cases along with the auto-optimization:  check out [Use cases](https://adalflow.sylph.ai/use_cases/index.html).

# Author
This notebook was created by community contributor [Ajith](https://github.com/ajithvcoder/).

# Outline

This is a quick introduction to what AdalFlow is capable of. We will cover:

* How to use the model client in sync and async calls
* How to develop a custom model client using AdalFlow

**Next: Try our [auto-optimization](https://colab.research.google.com/drive/1n3mHUWekTEYHiBdYBTw43TKlPN41A9za?usp=sharing)**


# Installation

1. Use `pip` to install the `adalflow` Python package. We will need `openai`, `groq`, and `faiss` (CPU version) from the extra packages.

  ```bash
  pip install adalflow[openai,groq,faiss-cpu]
  ```
2. Set up the `openai` and `groq` API keys as environment variables.

### Install adalflow

In [1]:
# Install adalflow with necessary dependencies
from IPython.display import clear_output

!pip install -U adalflow[openai,groq,faiss-cpu]

clear_output()

In [None]:
!pip uninstall httpx anyio -y
!pip install "anyio>=3.1.0,<4.0"
!pip install httpx==0.24.1

### Set Environment Variables

Note: Enter your API keys in the cell below.

In [None]:
%%writefile .env

OPENAI_API_KEY="PASTE-OPENAI_API_KEY_HERE"
GROQ_API_KEY="PASTE-GROQ_API_KEY-HERE"

Writing .env


In [3]:
from adalflow.utils import setup_env

# Load environment variables - Make sure to have OPENAI_API_KEY in .env file and .env is present in current folder
setup_env(".env")
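
Before calling any client, it can help to confirm that the keys are actually visible to the process. The sketch below assumes `setup_env` loads the `.env` values into the process environment; `missing_env_keys` is a hypothetical helper, not part of AdalFlow, and it only inspects `os.environ` (or a mapping you pass in).

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env_keys(
    required: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Hypothetical usage: report missing keys up front instead of failing inside an API call
missing = missing_env_keys(["OPENAI_API_KEY", "GROQ_API_KEY"])
if missing:
    print(f"Missing API keys: {missing}")
else:
    print("All API keys are set.")
```

Note that this only checks for presence; a placeholder value left unchanged in `.env` would still count as set.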

### Basic Vanilla Usage Example - model_client() - LLM Chat

In [4]:
from adalflow.components.model_client import OpenAIClient
from adalflow.core.types import ModelType

In [5]:
# Initialize the OpenAI client for API interactions
openai_client = OpenAIClient()
query = "What is the capital of France?"

# Set the model type to Large Language Model (LLM)
model_type = ModelType.LLM

# Construct the prompt by formatting the user's query
prompt = f"User: {query}\n"

# Configure model parameters:
# - model: Specifies GPT-3.5-turbo as the model to use
# - temperature: Controls randomness (0.5 = balanced between deterministic and creative)
# - max_tokens: Limits the response length to 100 tokens
model_kwargs = {"model": "gpt-3.5-turbo", "temperature": 0.5, "max_tokens": 100}

# Convert the inputs into the format required by OpenAI's API
api_kwargs = openai_client.convert_inputs_to_api_kwargs(
    input=prompt, model_kwargs=model_kwargs, model_type=model_type
)
print(f"api_kwargs: {api_kwargs}")


response = openai_client.call(api_kwargs=api_kwargs, model_type=model_type)

# Extract the text from the chat completion response
response_text = openai_client.parse_chat_completion(response)
print(f"response_text: {response_text}")

api_kwargs: {'model': 'gpt-3.5-turbo', 'temperature': 0.5, 'max_tokens': 100, 'messages': [{'role': 'system', 'content': 'User: What is the capital of France?\n'}]}
response_text: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=7, prompt_tokens=16, total_tokens=23), raw_response='The capital of France is Paris.', metadata=None)
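
In the output above, `parse_chat_completion` returns a `GeneratorOutput` whose `data` field is `None` and whose `raw_response` field carries the answer text. A minimal sketch of pulling the text out is shown below; `extract_text` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in that mirrors the printed fields so the example runs without an API key.

```python
from types import SimpleNamespace
from typing import Any


def extract_text(output: Any) -> Any:
    """Prefer the parsed `data` field; fall back to `raw_response` when it is None."""
    data = getattr(output, "data", None)
    if data is not None:
        return data
    return getattr(output, "raw_response", None)


# Stand-in object mirroring the fields printed above (not a real GeneratorOutput)
fake_output = SimpleNamespace(data=None, raw_response="The capital of France is Paris.")
print(extract_text(fake_output))
```

With a real client, the same helper could be applied to `response_text` from the cell above, assuming the field names match those shown in the printed output.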


### Basic Vanilla Usage Example - model_client() - Embedding

In [6]:
openai_client = OpenAIClient()
query = "What is the capital of France?"

# Specify the model type to be used, setting it to EMBEDDER for embedding functionality
model_type = ModelType.EMBEDDER

# Create a batch of inputs by duplicating the query; useful for testing batch embedding capabilities
input = [query] * 2

# Set the model's parameters:
# - "text-embedding-3-small" is the model being used
# - "dimensions" defines the dimensionality of the embeddings
# - "encoding_format" specifies the data format for the embeddings
model_kwargs = {
    "model": "text-embedding-3-small",
    "dimensions": 8,
    "encoding_format": "float",
}

# Convert the inputs and model parameters to the format expected by the API using OpenAI client's helper method
api_kwargs = openai_client.convert_inputs_to_api_kwargs(
    input=input, model_kwargs=model_kwargs, model_type=model_type
)
print(f"api_kwargs: {api_kwargs}")

response = openai_client.call(api_kwargs=api_kwargs, model_type=model_type)

# Parse the embedding response to extract the embeddings in a usable format
reponse_embedder_output = openai_client.parse_embedding_response(response)
print(f"reponse_embedder_output: {reponse_embedder_output}")

api_kwargs: {'model': 'text-embedding-3-small', 'dimensions': 8, 'encoding_format': 'float', 'input': ['What is the capital of France?', 'What is the capital of France?']}
reponse_embedder_output: EmbedderOutput(data=[Embedding(embedding=[0.63402575, 0.24025092, 0.42818537, 0.37026355, -0.3518905, -0.041650757, -0.21627253, 0.21798527], index=0), Embedding(embedding=[0.63402575, 0.24025092, 0.42818537, 0.37026355, -0.3518905, -0.041650757, -0.21627253, 0.21798527], index=1)], model='text-embedding-3-small', usage=Usage(prompt_tokens=14, total_tokens=14), error=None, raw_response=None, input=None)


### Adalflow - model_client() - **OpenAI model** Embedding Usage (ModelType.EMBEDDER)

In [48]:
from typing import List
import numpy as np
from adalflow.core.types import ModelType, EmbedderOutput
from adalflow.components.model_client import OpenAIClient
from dataclasses import dataclass
from enum import Enum
from numpy.linalg import norm

In [49]:
@dataclass
class EmbeddingCollection:
    collection: EmbedderOutput  # Parsed response returned by get_openai_embedding
    cindex: int


@dataclass
class Usage:
    prompt_tokens: int
    total_tokens: int

In [50]:
openai_client = OpenAIClient()

In [51]:
def get_openai_embedding(text):
    # Set model type to EMBEDDER for embedding functionality
    model_type = ModelType.EMBEDDER

    # Prepare input and model-specific parameters
    input = text
    model_kwargs = {
        "model": "text-embedding-3-small",
        "dimensions": 8,
        "encoding_format": "float",
    }

    # Convert inputs to the required API format
    api_kwargs = openai_client.convert_inputs_to_api_kwargs(
        input=input, model_kwargs=model_kwargs, model_type=model_type
    )
    print(f"api_kwargs: {api_kwargs}")  # Debug output to verify API arguments

    # Call OpenAI API and parse response for embeddings
    response = openai_client.call(api_kwargs=api_kwargs, model_type=model_type)
    reponse_embedder_output = openai_client.parse_embedding_response(response)
    print(
        f"reponse_embedder_output: {reponse_embedder_output}"
    )  # Debug output to verify embeddings
    return reponse_embedder_output


def process_embeddings(embeddings_collection):
    # Extract embedding data for each item in the collection
    embeddingOutput = [emb.collection for emb in embeddings_collection]
    embeddingDataList = [each_emb_out.data for each_emb_out in embeddingOutput]
    embeddingList = [
        each_item.embedding
        for each_emb_data in embeddingDataList
        for each_item in each_emb_data
    ]

    # Convert to numpy array for easier manipulation and calculations
    embeddings_array = np.array(embeddingList)

    def calculate_similarity(emb1, emb2):
        # Compute cosine similarity between two embeddings
        return np.dot(emb1, emb2) / (norm(emb1) * norm(emb2))

    def get_average_embedding(embeddings_list):
        # Calculate the mean embedding across a list of embeddings
        return np.mean(embeddings_list, axis=0)

    def find_nearest_neighbors(
        query_index: int, embedding_list: List[List[float]], k: int = 5
    ):
        # Find top-k most similar embeddings to a query embedding, based on cosine similarity
        query_embedding = embedding_list[query_index]
        similarities = [
            (i, calculate_similarity(query_embedding, emb))
            for i, emb in enumerate(embedding_list)
            if i != query_index
        ]
        return sorted(similarities, key=lambda x: x[1], reverse=True)[:k]

    # Return dictionary of functions and processed data for further use
    return {
        "embeddings_array": embeddings_array,
        "calculate_similarity": calculate_similarity,
        "average_embedding": get_average_embedding,
        "find_nearest_neighbors": find_nearest_neighbors,
    }


# Demonstrate embeddings usage with sample data
def demonstrate_embeddings_usage(sample_embeddings, input_text_list):
    # Initialize processor and retrieve embeddings array
    processor = process_embeddings(sample_embeddings)
    embeddings = processor["embeddings_array"]

    print("1. Analyzing Semantic Similarities:")
    print("-" * 50)

    # Select a few random indices for similarity testing
    num_indices = 5
    assert len(input_text_list) == len(embeddings)
    indices = np.random.choice(len(input_text_list), num_indices, replace=False)
    selected_text = np.array(input_text_list)[indices]
    selected_embeddings = np.array(embeddings)[indices]

    # Display selected texts and their embeddings
    print("Selected indices:", indices)
    print("Selected texts:", selected_text)
    print("Selected embeddings:", selected_embeddings)

    # Calculate similarity between each pair of selected texts
    for i in range(len(selected_text)):
        for j in range(i + 1, len(selected_text)):
            similarity = processor["calculate_similarity"](
                selected_embeddings[i], selected_embeddings[j]
            )
            print(f"\nComparing:\n'{selected_text[i]}' \nwith:\n'{selected_text[j]}'")
            print(f"Similarity score: {similarity:.4f}")

    print("\n2. Finding Nearest Neighbors:")
    print("-" * 50)

    # Find and display the 3 nearest neighbors for the first text
    query_idx = 0
    neighbors = processor["find_nearest_neighbors"](query_idx, embeddings, k=3)
    print(f"\nQuery text: '{input_text_list[query_idx]}'")
    print("\nNearest neighbors:")

    for idx, similarity in neighbors:
        print(f"- '{input_text_list[idx]}' (similarity: {similarity:.4f})")

    print("\n3. Using Average Embeddings:")
    print("-" * 50)

    # Calculate and compare the average embedding for texts containing "Paris"
    paris_indices = [i for i, text in enumerate(input_text_list) if "Paris" in text]
    paris_embeddings = embeddings[paris_indices]
    avg_paris_embedding = processor["average_embedding"](paris_embeddings)

    print("\nComparing average 'Paris' embedding with all texts:")
    for i, text in enumerate(input_text_list):
        similarity = processor["calculate_similarity"](
            avg_paris_embedding, embeddings[i]
        )
        print(f"- '{text}' (similarity: {similarity:.4f})")

In [52]:
def run_model_client_embedding_usage():
    # Define a set of sample texts to test embedding and similarity functionalities
    sample_texts = [
        "What is the capital of France?",
        "Paris is the capital of France.",
        "What is the population of France?",
        "How big is Paris?",
        "What is the weather like in Paris?",
    ]

    # Duplicate each sample text to form an input list with repeated entries (for embedding testing)
    input_text_list = [text for text in sample_texts for _ in range(2)]

    # Generate embeddings for each text in the input list, and store them in an EmbeddingCollection
    embeddings_collection = [
        EmbeddingCollection(collection=get_openai_embedding(text), cindex=i)
        for i, text in enumerate(input_text_list)
    ]
    print(
        embeddings_collection
    )  # Debugging output to verify embeddings collection content

    # Demonstrate the usage of embeddings by analyzing similarities, finding neighbors, etc.
    demonstrate_embeddings_usage(embeddings_collection, input_text_list)

In [53]:
run_model_client_embedding_usage()

api_kwargs: {'model': 'text-embedding-3-small', 'dimensions': 8, 'encoding_format': 'float', 'input': ['What is the capital of France?']}
reponse_embedder_output: EmbedderOutput(data=[Embedding(embedding=[0.63402575, 0.24025092, 0.42818537, 0.37026355, -0.3518905, -0.041650757, -0.21627253, 0.21798527], index=0)], model='text-embedding-3-small', usage=Usage(prompt_tokens=7, total_tokens=7), error=None, raw_response=None, input=None)
api_kwargs: {'model': 'text-embedding-3-small', 'dimensions': 8, 'encoding_format': 'float', 'input': ['What is the capital of France?']}
reponse_embedder_output: EmbedderOutput(data=[Embedding(embedding=[0.63402575, 0.24025092, 0.42818537, 0.37026355, -0.3518905, -0.041650757, -0.21627253, 0.21798527], index=0)], model='text-embedding-3-small', usage=Usage(prompt_tokens=7, total_tokens=7), error=None, raw_response=None, input=None)
api_kwargs: {'model': 'text-embedding-3-small', 'dimensions': 8, 'encoding_format': 'float', 'input': ['Paris is the capital o
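
The `calculate_similarity` helper above computes cosine similarity, that is, the dot product of two vectors divided by the product of their norms. As a small worked example, the sketch below applies the same formula to toy vectors (chosen for illustration, not real embeddings) using only the standard library.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors chosen for illustration only
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal
print(cosine_similarity([3.0, 4.0], [4.0, 3.0]))  # 24 / 25
```

Because the duplicated inputs in the recorded output produce identical embeddings, their pairwise similarity is 1 up to floating-point rounding.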

### Adalflow - model_client() - **OpenAI model** LLM Multichat Usage (ModelType.LLM)

In [13]:
from adalflow.components.model_client import OpenAIClient
from adalflow.core.types import ModelType
from adalflow.utils import setup_env
from typing import List, Dict

In [14]:
class ChatConversation:
    def __init__(self):
        # Initialize the OpenAI client for managing API calls
        self.openai_client = OpenAIClient()
        # Initialize an empty conversation history to store chat messages
        self.conversation_history: str = ""
        # Model parameters to customize the API call
        self.model_kwargs = {
            "model": "gpt-3.5-turbo",
            "temperature": 0.5,  # Controls randomness; 0.5 for balanced responses
            "max_tokens": 100,  # Limits the response length
        }

    def add_user_message(self, message: str):
        """Add a user message to the conversation history"""
        self.conversation_history += (
            f"<USER> {message} </USER>"  # Format for user message
        )

    def add_assistant_message(self, message: str):
        """Add an assistant message to the conversation history"""
        self.conversation_history += (
            f"<ASSISTANT> {message} </ASSISTANT>"  # Format for assistant message
        )

    def get_response(self) -> str:
        """Get response from the model based on conversation history"""
        # Convert the conversation history and model parameters into API arguments
        api_kwargs = self.openai_client.convert_inputs_to_api_kwargs(
            input=self.conversation_history,
            model_kwargs=self.model_kwargs,
            model_type=ModelType.LLM,
        )
        print(f"api_kwargs: {api_kwargs}")  # Debugging output to verify API parameters

        # Call the API with the generated arguments to get a response
        response = self.openai_client.call(
            api_kwargs=api_kwargs, model_type=ModelType.LLM
        )
        print("response: ", response)  # Debugging output for raw API response

        # Parse the API response; parse_chat_completion returns a GeneratorOutput
        # whose raw_response field holds the assistant's text
        parsed_output = self.openai_client.parse_chat_completion(response)
        # Store only the text in the history so later turns see a clean transcript
        self.add_assistant_message(parsed_output.raw_response)
        return parsed_output  # Return the parsed output to the caller

In [15]:
def check_chat_conversation():
    # Initialize a new chat conversation
    chat = ChatConversation()

    # Example list of user questions to simulate a multi-turn conversation
    questions = [
        "What is the capital of France?",
        "What is its population?",
        "Tell me about its famous landmarks",
    ]

    # Iterate through each question in the list
    for question in questions:
        print(f"\nUser: {question}")  # Display the user's question
        chat.add_user_message(
            question
        )  # Add the user question to the conversation history

        response = (
            chat.get_response()
        )  # Get assistant's response based on conversation history
        print(f"Assistant: {response}")  # Display the assistant's response

    # Display the full conversation history after all exchanges
    print("\nFull Conversation History:")
    print(chat.conversation_history)  # Print the accumulated conversation history

In [16]:
# Each question depends on the previous ones, and the chat responds appropriately
check_chat_conversation()


User: What is the capital of France?
api_kwargs: {'model': 'gpt-3.5-turbo', 'temperature': 0.5, 'max_tokens': 100, 'messages': [{'role': 'system', 'content': '<USER> What is the capital of France? </USER>'}]}
response:  ChatCompletion(id='chatcmpl-ASHotWDnw55BOd5d3zWzs0ucxztJr', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The capital of France is Paris.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None))], created=1731305047, model='gpt-3.5-turbo-0125', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=7, prompt_tokens=20, total_tokens=27, completion_tokens_details=CompletionTokensDetails(audio_tokens=0, reasoning_tokens=0, accepted_prediction_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))
Assistant: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage
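
Follow-up questions such as "What is its population?" only make sense because every call resends the whole tagged history. The sketch below isolates that accumulation pattern with a stub in place of the API call; `run_conversation` and `count_turns` are hypothetical names used for illustration only.

```python
from typing import Callable, List


def run_conversation(questions: List[str], reply_fn: Callable[[str], str]) -> str:
    """Accumulate a tagged history string, calling reply_fn(history) once per turn."""
    history = ""
    for question in questions:
        history += f"<USER> {question} </USER>"
        answer = reply_fn(history)  # the model sees every earlier turn
        history += f"<ASSISTANT> {answer} </ASSISTANT>"
    return history


def count_turns(history: str) -> str:
    """Stub standing in for the API call: report how many user turns are visible."""
    return f"I can see {history.count('<USER>')} user message(s)."


print(
    run_conversation(
        ["What is the capital of France?", "What is its population?"], count_turns
    )
)
```

One consequence of this design is that the prompt grows with every turn, so long conversations consume more input tokens per call.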

### Adalflow - model_client() - **OpenAI model** LLM Multichat Usage (ModelType.LLM) - asynchronous (async())

In [17]:
import asyncio
from adalflow.components.model_client import OpenAIClient
from adalflow.core.types import ModelType
from typing import List

In [18]:
class ChatConversationAsync:
    def __init__(self):
        # Initialize the OpenAI client; the same client exposes call() (sync) and acall() (async)
        self.openai_client = OpenAIClient()

        # Default model parameters for the chat
        self.model_kwargs = {
            "model": "gpt-3.5-turbo",  # Model used for chat
            "temperature": 0.5,  # Controls randomness in response
            "max_tokens": 100,  # Maximum tokens in the generated response
        }

    async def get_response(self, message: str) -> str:
        """Asynchronously get a response from the model for a given user message"""

        # Convert input message and model parameters into the format expected by the API
        api_kwargs = self.openai_client.convert_inputs_to_api_kwargs(
            input=message,  # User's message input
            model_kwargs=self.model_kwargs,  # Model-specific settings
            model_type=ModelType.LLM,  # Specify the model type as a language model (LLM)
        )
        print(f"api_kwargs: {api_kwargs}")  # Log the API arguments for debugging

        # Make an asynchronous API call to OpenAI's model
        response = await self.openai_client.acall(
            api_kwargs=api_kwargs,  # Pass the prepared arguments
            model_type=ModelType.LLM,  # Specify the model type again
        )
        print("response: ", response)  # Print the raw response from the API

        # Parse the API response to extract the assistant's reply (chat completion)
        response_text = self.openai_client.parse_chat_completion(response)
        return response_text  # Return the parsed response text

In [19]:
async def check_chat_conversations_async():
    # Create an instance of ChatConversationAsync to handle asynchronous operations
    chat = ChatConversationAsync()

    # List of unrelated questions that will be handled in parallel
    questions = [
        "What is the capital of France?",  # Question 1
        "Is dog a wild animal?",  # Question 2
        "Tell me about amazon forest",  # Question 3
    ]

    # Create one coroutine per question; nothing runs until they are awaited
    tasks = [chat.get_response(question) for question in questions]

    # Run all coroutines concurrently and collect the responses in question order
    responses = await asyncio.gather(*tasks)

    # Print the responses from the assistant along with the respective user questions
    for question, response in zip(questions, responses):
        print(f"\nUser: {question}")
        print(f"Assistant: {response}")

In [20]:
# Run the asynchronous function if in a file
# asyncio.run(check_chat_conversations_async())

# in jupyter notebook
await check_chat_conversations_async()

api_kwargs: {'model': 'gpt-3.5-turbo', 'temperature': 0.5, 'max_tokens': 100, 'messages': [{'role': 'system', 'content': 'What is the capital of France?'}]}
api_kwargs: {'model': 'gpt-3.5-turbo', 'temperature': 0.5, 'max_tokens': 100, 'messages': [{'role': 'system', 'content': 'Is dog a wild animal?'}]}
api_kwargs: {'model': 'gpt-3.5-turbo', 'temperature': 0.5, 'max_tokens': 100, 'messages': [{'role': 'system', 'content': 'Tell me about amazon forest'}]}
response:  ChatCompletion(id='chatcmpl-ASHqEOWoBOIiulzd0aoXeyKKb9npb', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The capital of France is Paris.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None))], created=1731305130, model='gpt-3.5-turbo-0125', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=7, prompt_tokens=14, total_tokens=21, completion_tokens_details=CompletionTokensDetails(audio
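
The `zip(questions, responses)` loop above relies on `asyncio.gather` returning results in the order the awaitables were passed in, not the order in which they finish. The sketch below demonstrates this with a simulated call; `fake_llm_call` is a hypothetical stand-in that sleeps instead of contacting an API.

```python
import asyncio
from typing import List


async def fake_llm_call(question: str, delay: float) -> str:
    """Stand-in for an API call: wait `delay` seconds, then echo the question."""
    await asyncio.sleep(delay)
    return f"answer to: {question}"


async def ask_all(questions: List[str], delays: List[float]) -> List[str]:
    # gather() runs the coroutines concurrently and returns results in input order,
    # regardless of which one finishes first
    return await asyncio.gather(
        *(fake_llm_call(q, d) for q, d in zip(questions, delays))
    )


if __name__ == "__main__":
    # In a notebook, use `await ask_all(...)` instead of asyncio.run(...)
    results = asyncio.run(ask_all(["slow", "fast"], [0.2, 0.01]))
    print(results)
```

Here the "fast" call finishes first, yet its answer still appears second in the returned list.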

### Adalflow - model_client() - **OpenAI model** LLM Multichat Usage (ModelType.LLM) - Benchmark sync() vs async()

In [21]:
import asyncio
import time
# OpenAIClient provides both .call() (sync) and .acall() (async)
from adalflow.components.model_client import OpenAIClient
from adalflow.core.types import ModelType

In [22]:
# Initialize the OpenAI client
openai_client = OpenAIClient()

# Sample prompt for testing
prompt = "Tell me a joke."

model_kwargs = {"model": "gpt-3.5-turbo", "temperature": 0.5, "max_tokens": 100}

In [23]:
# Synchronous function for benchmarking .call()
def benchmark_sync_call(api_kwargs, runs=10):
    """
    Benchmark the synchronous .call() method by running it multiple times.

    Parameters:
    - api_kwargs: The arguments to be passed to the API call
    - runs: The number of times to run the call (default is 10)
    """
    # Record the start time of the benchmark
    start_time = time.time()

    # Perform synchronous API calls for the specified number of runs
    responses = [
        openai_client.call(
            api_kwargs=api_kwargs,  # API arguments
            model_type=ModelType.LLM,  # Model type (e.g., LLM for language models)
        )
        for _ in range(runs)  # Repeat 'runs' times
    ]

    # Record the end time after all calls are completed
    end_time = time.time()

    # Output the results of each synchronous call
    for i, response in enumerate(responses):
        print(f"sync call {i + 1} completed: {response}")

    # Print the total time taken for all synchronous calls
    print(f"\nSynchronous benchmark completed in {end_time - start_time:.2f} seconds")


# Asynchronous function for benchmarking .acall()
async def benchmark_async_acall(api_kwargs, runs=10):
    """
    Benchmark the asynchronous .acall() method by running it multiple times concurrently.

    Parameters:
    - api_kwargs: The arguments to be passed to the API call
    - runs: The number of times to run the asynchronous call (default is 10)
    """
    # Record the start time of the benchmark
    start_time = time.time()

    # Create a list of asynchronous tasks for the specified number of runs
    tasks = [
        openai_client.acall(
            api_kwargs=api_kwargs,  # API arguments
            model_type=ModelType.LLM,  # Model type (e.g., LLM for language models)
        )
        for _ in range(runs)  # Repeat 'runs' times
    ]

    # Execute all tasks concurrently and wait for them to finish
    responses = await asyncio.gather(*tasks)

    # Record the end time after all tasks are completed
    end_time = time.time()

    # Output the results of each asynchronous call
    for i, response in enumerate(responses):
        print(f"Async call {i + 1} completed: {response}")

    # Print the total time taken for all asynchronous calls
    print(f"\nAsynchronous benchmark completed in {end_time - start_time:.2f} seconds")

In [24]:
api_kwargs = openai_client.convert_inputs_to_api_kwargs(
    input=prompt, model_kwargs=model_kwargs, model_type=ModelType.LLM
)

# Run both benchmarks
print("Starting synchronous benchmark...\n")
benchmark_sync_call(api_kwargs)

print("\nStarting asynchronous benchmark...\n")
await benchmark_async_acall(api_kwargs)

Starting synchronous benchmark...

sync call 1 completed: ChatCompletion(id='chatcmpl-ASHqYcxCVNAnLlsrnRvxh5cRrQOwf', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Why couldn't the bicycle stand up by itself? Because it was two-tired!", refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None))], created=1731305150, model='gpt-3.5-turbo-0125', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=17, prompt_tokens=12, total_tokens=29, completion_tokens_details=CompletionTokensDetails(audio_tokens=0, reasoning_tokens=0, accepted_prediction_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))
sync call 2 completed: ChatCompletion(id='chatcmpl-ASHqZz3G3jqGlHtKjoO9mbYjjS1Af', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Why did the scarecr
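
To see why the concurrent version is expected to finish sooner, the sketch below repeats the benchmark with simulated latency instead of real API calls. All function names are hypothetical; `time.sleep` and `asyncio.sleep` stand in for network waits, and `time.perf_counter` is used here because it is a monotonic clock intended for measuring intervals.

```python
import asyncio
import time


def fake_sync_call(delay: float) -> None:
    time.sleep(delay)  # blocks the whole thread while waiting


async def fake_async_call(delay: float) -> None:
    await asyncio.sleep(delay)  # yields to other tasks while waiting


def time_sync(runs: int, delay: float) -> float:
    """Run the blocking call `runs` times back to back and return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(runs):
        fake_sync_call(delay)
    return time.perf_counter() - start


async def time_async(runs: int, delay: float) -> float:
    """Run `runs` simulated calls concurrently and return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_async_call(delay) for _ in range(runs)))
    return time.perf_counter() - start


if __name__ == "__main__":
    sync_elapsed = time_sync(runs=5, delay=0.05)
    # In a notebook, use `await time_async(...)` instead of asyncio.run(...)
    async_elapsed = asyncio.run(time_async(runs=5, delay=0.05))
    print(f"sync: {sync_elapsed:.2f}s, async: {async_elapsed:.2f}s")
```

The sequential total grows roughly with `runs * delay`, while the concurrent total stays close to a single `delay`. Real API speedups will vary with network latency and provider rate limits.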

### Adalflow - model_client() - **OpenAI model** LLM Multichat Usage (ModelType.LLM) - Additional Utils -
- get_first_message_content()
- get_all_messages_content()
- get_probabilities()

In [25]:
from adalflow.components.model_client import OpenAIClient
from adalflow.core.types import ModelType
from adalflow.utils import setup_env
from adalflow.components.model_client.openai_client import (
    get_first_message_content,
    get_all_messages_content,
    get_probabilities,
)
from adalflow.core import Generator

In [26]:
def check_openai_additional_utils(func, model_kwargs):
    """
    Demonstrate the OpenAI client with a custom chat completion parser by
    generating a response from the LLM for a fixed sample query.

    Parameters:
    - func: A function that will be used to parse the chat completion (for custom parsing).
    - model_kwargs: The additional model parameters (e.g., temperature, max_tokens) to be used in the model.

    Returns:
    - output: The generated response from the model based on the query.
    """

    # Initialize the OpenAI client with a custom chat completion parser
    openai_client = OpenAIClient(chat_completion_parser=func)

    # Define a sample query (user question)
    query = "What is the capital of France?"

    # Pass the query to the Generator's prompt template through the input_str variable
    prompt_kwargs = {
        "input_str": query,
    }

    # Initialize the Generator with the OpenAI client and model parameters
    generator = Generator(model_client=openai_client, model_kwargs=model_kwargs)

    # Execute the generator to get a response for the prompt (using the defined prompt_kwargs)
    output = generator(prompt_kwargs=prompt_kwargs)

    # Return the generated output (response from the LLM)
    return output

In [27]:
def run_utils_functions():
    """
    This function runs a series of utility functions using different model
    configurations for generating responses. It demonstrates how to check
    OpenAI model outputs using various utility functions.
    """

    # Define the model arguments for the probability-based function (with logprobs)
    probability_model_kwargs = {
        "model": "gpt-3.5-turbo",  # Specify the model version
        "logprobs": True,  # Return the log probability of each generated token
        "n": 2,  # Request 2 completions for each query
    }

    # Define general model arguments for most other functions
    model_kwargs = {
        "model": "gpt-3.5-turbo",  # Specify the model version
        "temperature": 0.5,  # Control the randomness of responses (lower is more deterministic)
        "max_tokens": 100,  # Set the maximum number of tokens (sub-word units, not words) in the response
    }

    # List of functions to run with corresponding model arguments
    func_list = [
        [
            get_probabilities,
            probability_model_kwargs,
        ],  # Function to get probabilities with specific kwargs
        [
            get_first_message_content,
            model_kwargs,
        ],  # Function to get first message content
        [
            get_all_messages_content,
            model_kwargs,
        ],  # Function to get all messages content in multi-chat scenarios
    ]

    # Loop through each function and its corresponding arguments
    for each_func in func_list:
        # Check the function output using the specified arguments
        result = check_openai_additional_utils(each_func[0], each_func[1])

        # Print the function and result for debugging purposes
        print(f"Function: {each_func[0].__name__}, Model Args: {each_func[1]}")
        print(f"Result: {result}")

In [28]:
run_utils_functions()

[ChatCompletionTokenLogprob(token='The', bytes=[84, 104, 101], logprob=-7.076218e-05, top_logprobs=[]), ChatCompletionTokenLogprob(token=' capital', bytes=[32, 99, 97, 112, 105, 116, 97, 108], logprob=-1.9361265e-07, top_logprobs=[]), ChatCompletionTokenLogprob(token=' of', bytes=[32, 111, 102], logprob=-0.00020163313, top_logprobs=[]), ChatCompletionTokenLogprob(token=' France', bytes=[32, 70, 114, 97, 110, 99, 101], logprob=-1.2664457e-06, top_logprobs=[]), ChatCompletionTokenLogprob(token=' is', bytes=[32, 105, 115], logprob=-6.704273e-07, top_logprobs=[]), ChatCompletionTokenLogprob(token=' Paris', bytes=[32, 80, 97, 114, 105, 115], logprob=0.0, top_logprobs=[]), ChatCompletionTokenLogprob(token='.', bytes=[46], logprob=-2.1769476e-05, top_logprobs=[])]
[ChatCompletionTokenLogprob(token='The', bytes=[84, 104, 101], logprob=-7.076218e-05, top_logprobs=[]), ChatCompletionTokenLogprob(token=' capital', bytes=[32, 99, 97, 112, 105, 116, 97, 108], logprob=-1.9361265e-07, top_logprobs=[]
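
The `logprob` values in the output above are natural-log probabilities, so exponentiating one gives the token's probability, and exponentiating the sum gives the probability of the whole sequence. The sketch below works through the numbers printed for the first completion; the helper names are hypothetical.

```python
import math
from typing import List


def token_probability(logprob: float) -> float:
    """Convert a natural-log probability into a probability in [0, 1]."""
    return math.exp(logprob)


def sequence_probability(logprobs: List[float]) -> float:
    """Probability of the whole sequence: exp of the summed token logprobs."""
    return math.exp(sum(logprobs))


# Logprobs copied from the first completion in the output above
logprobs = [
    -7.076218e-05,   # 'The'
    -1.9361265e-07,  # ' capital'
    -0.00020163313,  # ' of'
    -1.2664457e-06,  # ' France'
    -6.704273e-07,   # ' is'
    0.0,             # ' Paris'
    -2.1769476e-05,  # '.'
]

print(f"P(' Paris') = {token_probability(logprobs[5])}")
print(f"P(sequence) = {sequence_probability(logprobs):.6f}")
```

Summing logprobs before exponentiating avoids multiplying many small probabilities together, which can underflow for long sequences.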

### Adalflow - model_client() - **Groq model** LLM Multichat Usage (ModelType.LLM)

In [33]:
from adalflow.components.model_client import GroqAPIClient
from adalflow.core.types import ModelType
from adalflow.utils import setup_env
from typing import List, Dict

In [29]:
class ChatConversation:
    def __init__(self):
        """
        Initialize a new ChatConversation object.
        - GroqAPIClient is used to interact with the Groq model.
        - conversation_history keeps track of the conversation between the user and assistant.
        - model_kwargs contains the model parameters like temperature and max tokens.
        """
        self.groq_client = (
            GroqAPIClient()
        )  # Initialize GroqAPIClient for model interaction
        self.conversation_history: str = (
            ""  # Initialize conversation history as an empty string
        )
        self.model_kwargs = {
            "model": "llama3-8b-8192",  # Specify the model to use
            "temperature": 0.5,  # Set the temperature for response variability
            "max_tokens": 100,  # Limit the number of tokens in the response
        }

    def add_user_message(self, message: str):
        """
        Add a user message to the conversation history in the required format.
        The message is wrapped with <USER> tags for better processing by the assistant.
        """
        self.conversation_history += (
            f"<USER> {message} </USER>"  # Append user message to history
        )

    def add_assistant_message(self, message: str):
        """
        Add an assistant message to the conversation history in the required format.
        The message is wrapped with <ASSISTANT> tags for better processing.
        """
        self.conversation_history += (
            f"<ASSISTANT> {message} </ASSISTANT>"  # Append assistant message to history
        )

    def get_response(self) -> str:
        """
        Generate a response from the assistant based on the conversation history.
        - Converts the conversation history and model kwargs into the format required by the Groq API.
        - Calls the API to get the response.
        - Parses and adds the assistant's reply to the conversation history.
        """
        # Prepare the request for the Groq API, converting the inputs into the correct format
        api_kwargs = self.groq_client.convert_inputs_to_api_kwargs(
            input=self.conversation_history,  # Use the conversation history as input
            model_kwargs=self.model_kwargs,  # Include model-specific parameters
            model_type=ModelType.LLM,  # Specify the model type (Large Language Model)
        )
        print(f"api_kwargs: {api_kwargs}")  # Log the API request parameters

        # Call the Groq model API to get the response
        response = self.groq_client.call(
            api_kwargs=api_kwargs,
            model_type=ModelType.LLM,  # Specify the model type again for clarity
        )
        print("response: ", response)  # Log the API response

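        # Parse the response; parse_chat_completion returns a GeneratorOutput
        # (see the printed "Assistant:" output below), not a plain string
        response_text = self.groq_client.parse_chat_completion(response)

        # Add the assistant's message to the conversation history
        # (the f-string in add_assistant_message stores the object's string form)
        self.add_assistant_message(response_text)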

        # Return the assistant's response text
        return response_text

In [31]:
def check_chat_conversation():
    """
    This function simulates a multi-turn conversation between a user and an assistant.
    It demonstrates how user inputs are processed, and the assistant generates responses,
    while maintaining the conversation history for each query.
    """
    # Initialize the ChatConversation object
    chat = ChatConversation()  # This creates an instance of the ChatConversation class

    # Define a list of user questions for a multi-turn conversation
    questions = [
        "What is the capital of France?",  # First user question
        "What is its population?",  # Second user question
        "Tell me about its famous landmarks",  # Third user question
    ]

    # Loop through each question and get the assistant's response
    for question in questions:
        # Print the current question from the user
        print(f"\nUser: {question}")

        # Add the user's message to the conversation history
        chat.add_user_message(question)

        # Get the assistant's response based on the conversation history
        response = chat.get_response()

        # Print the assistant's response
        print(f"Assistant: {response}")

    # After the conversation, print the full conversation history
    print("\nFull Conversation History:")
    print(
        chat.conversation_history
    )  # This will print all messages (user and assistant) in the conversation history

In [34]:
check_chat_conversation()


User: What is the capital of France?
api_kwargs: {'model': 'llama3-8b-8192', 'temperature': 0.5, 'max_tokens': 100, 'messages': [{'role': 'system', 'content': '<USER> What is the capital of France? </USER>'}]}
response:  ChatCompletion(id='chatcmpl-c68fccb5-ed2b-4745-be81-acbac792387f', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The capital of France is Paris.', role='assistant', function_call=None, tool_calls=None))], created=1731305352, model='llama3-8b-8192', object='chat.completion', system_fingerprint='fp_a97cfe35ae', usage=CompletionUsage(completion_tokens=8, prompt_tokens=23, total_tokens=31, completion_time=0.006666667, prompt_time=0.003034232, queue_time=0.010475318, total_time=0.009700899), x_groq={'id': 'req_01jccxebfgf5qbnaea72y9atrm'})
Assistant: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=8, prompt_tokens=23, total_tokens=31), raw_response='The capital of France is Paris
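
The `api_kwargs` line above shows that the whole tagged history is sent as one message on every call. Here is a minimal offline sketch of how that string grows across turns. The helpers `format_turn` and `build_history` are hypothetical names for illustration and make no API call.

```python
from typing import List, Tuple


def format_turn(role: str, message: str) -> str:
    """Wrap one message in the <USER>/<ASSISTANT> tags used by ChatConversation."""
    tag = role.upper()
    return f"<{tag}> {message} </{tag}>"


def build_history(turns: List[Tuple[str, str]]) -> str:
    """Concatenate tagged turns into the single prompt string sent on each call."""
    return "".join(format_turn(role, message) for role, message in turns)


turns = [
    ("user", "What is the capital of France?"),
    ("assistant", "The capital of France is Paris."),
    ("user", "What is its population?"),
]
history = build_history(turns)
print(history)
```

Because the printed `Assistant:` line is a `GeneratorOutput`, one option is to append only its `raw_response` text to the history, which keeps the prompt compact on later turns.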

### Adalflow - model_client() - **Groq model** LLM Multichat Usage (ModelType.LLM) - asynchronous (async())

In [35]:
import asyncio
from adalflow.components.model_client import GroqAPIClient
from adalflow.core.types import ModelType
from typing import List

In [36]:
class ChatConversation:
    def __init__(self):
        # Using an asynchronous client for communication with GroqAPI
        self.groq_client = GroqAPIClient()  # Create an instance of GroqAPIClient
        # Model configuration parameters (e.g., Llama model with 8b parameters and 8192 context length)
        self.model_kwargs = {
            "model": "llama3-8b-8192",  # Llama model with specific size
            "temperature": 0.5,  # Degree of randomness in the model's responses
            "max_tokens": 100,  # Maximum number of tokens in the response
        }

    async def get_response(self, message: str) -> str:
        """Get response from the model for a single message asynchronously"""

        # Convert the user input message to the appropriate format for the Groq API
        api_kwargs = self.groq_client.convert_inputs_to_api_kwargs(
            input=message,  # User's input message
            model_kwargs=self.model_kwargs,  # Model parameters
            model_type=ModelType.LLM,  # Model type for large language models (LLM)
        )
        print(f"api_kwargs: {api_kwargs}")  # Print the API arguments for debugging

        # Asynchronously call the Groq API with the provided API arguments
        response = await self.groq_client.acall(
            api_kwargs=api_kwargs,  # Pass the API arguments
            model_type=ModelType.LLM,  # Specify the model type
        )
        print("response: ", response)  # Print the API response for debugging

        # Parse the response to extract the assistant's reply from the API response
        response_text = self.groq_client.parse_chat_completion(response)
        return response_text  # Return the assistant's response text

In [37]:
async def check_chat_conversations():
    # Create an instance of ChatConversation
    chat = ChatConversation()

    # List of unrelated questions for independent async calls
    questions = [
        "What is the capital of France?",
        "Is dog a wild animal ?",
        "Tell me about amazon forest",
    ]

    # Run each question as an independent asynchronous task
    tasks = [chat.get_response(question) for question in questions]
    # Gather all the responses concurrently
    responses = await asyncio.gather(*tasks)

    # Display each response alongside the question
    for question, response in zip(questions, responses):
        print(f"\nUser: {question}")
        print(f"Assistant: {response}")

In [38]:
await check_chat_conversations()

api_kwargs: {'model': 'llama3-8b-8192', 'temperature': 0.5, 'max_tokens': 100, 'messages': [{'role': 'system', 'content': 'What is the capital of France?'}]}
api_kwargs: {'model': 'llama3-8b-8192', 'temperature': 0.5, 'max_tokens': 100, 'messages': [{'role': 'system', 'content': 'Is dog a wild animal ?'}]}
api_kwargs: {'model': 'llama3-8b-8192', 'temperature': 0.5, 'max_tokens': 100, 'messages': [{'role': 'system', 'content': 'Tell me about amazon forest'}]}
response:  ChatCompletion(id='chatcmpl-d2fb086a-5d23-409e-b060-4c00578611fe', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The capital of France is Paris.', role='assistant', function_call=None, tool_calls=None))], created=1731305379, model='llama3-8b-8192', object='chat.completion', system_fingerprint='fp_6a6771ae9c', usage=CompletionUsage(completion_tokens=8, prompt_tokens=17, total_tokens=25, completion_time=0.006666667, prompt_time=0.003519913, queue_time=0.010127806000000
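
Pairing `questions` with `responses` through `zip` is safe because `asyncio.gather` returns results in the order the awaitables were passed in, not the order in which they finish. The sketch below shows this with simulated latency. `fake_acall` is a hypothetical stand-in for a network call and does not touch the Groq API.

```python
import asyncio


async def fake_acall(question: str, delay: float) -> str:
    """Hypothetical stand-in for a network call; sleeps instead of calling an API."""
    await asyncio.sleep(delay)
    return f"answer to: {question}"


async def ask_all() -> list:
    questions = ["first", "second", "third"]
    delays = [0.03, 0.01, 0.02]  # the second call finishes first
    tasks = [fake_acall(q, d) for q, d in zip(questions, delays)]
    # gather returns results in the order the awaitables were passed in
    return await asyncio.gather(*tasks)


results = asyncio.run(ask_all())
print(results)
```

Inside a notebook, where an event loop is already running, replace `asyncio.run(ask_all())` with `await ask_all()`, as the cells above do.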

### Adalflow - model_client() - **Groq model** LLM Multichat Usage (ModelType.LLM) - Benchmark sync() vs async()

In [39]:
import asyncio
import time
from adalflow.components.model_client import (
    GroqAPIClient,
)  # Assuming GroqAPI with .call() and .acall() is available
from adalflow.core.types import ModelType

In [40]:
# Initialize the Groq client
groq_client = GroqAPIClient()

# Sample prompt for testing
prompt = "Tell me a joke."

model_kwargs = {"model": "llama3-8b-8192", "temperature": 0.5, "max_tokens": 100}

In [41]:
# Synchronous function for benchmarking .call()
def benchmark_sync_call(api_kwargs, runs=10):
    # List to store responses from each synchronous call
    responses = []

    # Record the start time for benchmarking
    start_time = time.time()

    # Perform synchronous API calls in a loop
    responses = [
        groq_client.call(  # Calling the API synchronously
            api_kwargs=api_kwargs,  # Passing the API arguments
            model_type=ModelType.LLM,  # Defining the model type
        )
        for _ in range(runs)  # Repeat the call 'runs' times
    ]

    # Record the end time after all calls are completed
    end_time = time.time()

    # Print out the response from each synchronous call
    for i, response in enumerate(responses):
        print(f"sync call {i + 1} completed: {response}")

    # Print the total time taken for the synchronous benchmark
    print(f"\nSynchronous benchmark completed in {end_time - start_time:.2f} seconds")


# Asynchronous function for benchmarking .acall()
async def benchmark_async_acall(api_kwargs, runs=10):
    # Record the start time for benchmarking
    start_time = time.time()

    # Create a list of tasks for asynchronous API calls
    tasks = [
        groq_client.acall(  # Calling the API asynchronously
            api_kwargs=api_kwargs,  # Passing the API arguments
            model_type=ModelType.LLM,  # Defining the model type
        )
        for _ in range(runs)  # Repeat the call 'runs' times
    ]

    # Await the completion of all tasks concurrently
    responses = await asyncio.gather(
        *tasks
    )  # Gather all the responses from asynchronous calls

    # Record the end time after all asynchronous calls are completed
    end_time = time.time()

    # Print out the response from each asynchronous call
    for i, response in enumerate(responses):
        print(f"Async call {i + 1} completed: {response}")

    # Print the total time taken for the asynchronous benchmark
    print(f"\nAsynchronous benchmark completed in {end_time - start_time:.2f} seconds")

In [42]:
api_kwargs = groq_client.convert_inputs_to_api_kwargs(
    input=prompt, model_kwargs=model_kwargs, model_type=ModelType.LLM
)

# Run both benchmarks
print("Starting synchronous benchmark...\n")
benchmark_sync_call(api_kwargs)

print("\nStarting asynchronous benchmark...\n")
await benchmark_async_acall(api_kwargs)

Starting synchronous benchmark...

sync call 1 completed: ChatCompletion(id='chatcmpl-a6bc4231-b712-4014-a87d-0e9368f5d8f4', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Why couldn't the bicycle stand up by itself?\n\nBecause it was two-tired!", role='assistant', function_call=None, tool_calls=None))], created=1731305394, model='llama3-8b-8192', object='chat.completion', system_fingerprint='fp_179b0f92c9', usage=CompletionUsage(completion_tokens=18, prompt_tokens=15, total_tokens=33, completion_time=0.015, prompt_time=0.000141559, queue_time=0.01454033, total_time=0.015141559), x_groq={'id': 'req_01jccxfkx7epcsynkkex05e6v6'})
sync call 2 completed: ChatCompletion(id='chatcmpl-00586f1c-f6fb-4650-a549-ff24d462c6bf', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Why couldn't the bicycle stand up by itself?\n\nBecause it was two-tired!", role='assistant', function_call=None, tool_
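
The same comparison can be tried offline, without an API key. This is a sketch under stated assumptions: `fake_call` and `fake_acall` are hypothetical stand-ins that sleep for a fixed `LATENCY` instead of calling Groq, and `time.perf_counter()` is used because it is a monotonic clock intended for measuring intervals.

```python
import asyncio
import time

LATENCY = 0.05  # simulated per-request latency in seconds (an assumption)


def fake_call() -> str:
    """Blocking stand-in for client.call()."""
    time.sleep(LATENCY)
    return "ok"


async def fake_acall() -> str:
    """Non-blocking stand-in for client.acall()."""
    await asyncio.sleep(LATENCY)
    return "ok"


def time_sync(runs: int) -> float:
    start = time.perf_counter()
    for _ in range(runs):
        fake_call()  # each call waits for the previous one to finish
    return time.perf_counter() - start


async def time_async(runs: int) -> float:
    start = time.perf_counter()
    await asyncio.gather(*(fake_acall() for _ in range(runs)))  # calls overlap
    return time.perf_counter() - start


sync_elapsed = time_sync(5)
async_elapsed = asyncio.run(time_async(5))
print(f"sync:  {sync_elapsed:.2f}s")
print(f"async: {async_elapsed:.2f}s")
```

The sequential total grows with the number of runs, while the concurrent total stays close to a single request's latency. Real API calls may also be limited by provider rate limits, which this sketch does not model. In a notebook, use `await time_async(5)` instead of `asyncio.run(...)`.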

### Adalflow - model_client() - **Custom Model** client building (ModelType.LLM) and (ModelType.EMBEDDER) - Synchronous
Note: This section uses the OpenAI API as an example of building a custom model client in AdalFlow. Although AdalFlow already ships an OpenAI client, the code below can serve as starter code for anyone who wants to build a custom model client.

In [43]:
# Build a simple custom third-party model client and use it.
# convert_inputs_to_api_kwargs() is modified to follow the OpenAI prompt format,
# and call() uses the appropriate OpenAI API call.

import openai
from adalflow.core.model_client import ModelClient
from adalflow.core.types import ModelType, GeneratorOutput, EmbedderOutput
from openai.types import (
    CreateEmbeddingResponse,
)
from adalflow.components.model_client.utils import parse_embedding_response

In [45]:
class SimpleCustomModelClient(ModelClient):
    # Initialize the custom model client
    def __init__(self):
        # Call the parent class's initializer; add any custom setup after this line
        super().__init__()

    # Method to convert input into API parameters for different model types (LLM or Embedder)
    def convert_inputs_to_api_kwargs(
        self, input=None, model_kwargs={}, model_type=ModelType.UNDEFINED
    ):
        """
        Convert the inputs into API arguments based on the model type.

        Args:
            input (str): The input text to be processed.
            model_kwargs (dict): Additional model parameters like temperature, max_tokens, etc.
            model_type (ModelType): The type of model to use (LLM or Embedder).

        Returns:
            dict: API arguments formatted for the specified model type.
        """
        if (
            model_type == ModelType.LLM
        ):  # If the model type is a large language model (LLM)
            return {
                "model": model_kwargs[
                    "model"
                ],  # Set the model to use (e.g., GPT-3, GPT-4)
                "messages": input,  # Provide the input as the message
                "temperature": model_kwargs[
                    "temperature"
                ],  # Set the temperature (creativity of the response)
                "max_tokens": model_kwargs[
                    "max_tokens"
                ],  # Max tokens to generate in the response
            }
        elif model_type == ModelType.EMBEDDER:  # If the model type is an embedder
            return {
                "model": model_kwargs["model"],  # Model name for embedding
                "input": [input],  # Provide the input in a list format for embedding
            }
        else:
            # Raise an error if the model type is unsupported
            raise ValueError(f"model_type {model_type} is not supported")

    # Method to make the actual API call to OpenAI for either completions (LLM) or embeddings
    def call(self, api_kwargs={}, model_type=ModelType.UNDEFINED):
        """
        Call the appropriate OpenAI API method based on the model type (LLM or Embedder).

        Args:
            api_kwargs (dict): Arguments to be passed to the API call.
            model_type (ModelType): The type of model (LLM or Embedder).

        Returns:
            Response: The API response from OpenAI.
        """
        if model_type == ModelType.LLM:  # If the model type is LLM (e.g., GPT-3, GPT-4)
            return openai.chat.completions.create(
                **api_kwargs
            )  # Call the chat API for completion
        elif model_type == ModelType.EMBEDDER:  # If the model type is Embedder
            return openai.embeddings.create(**api_kwargs)  # Call the embedding API
        else:
            # Raise an error if an invalid model type is passed
            raise ValueError(f"Unsupported model type: {model_type}")

    # Method to parse the response from a chat completion API call
    def parse_chat_completion(self, completion):
        """
        Parse the response from a chat completion API call into a custom output format.

        Args:
            completion: The completion response from the OpenAI API.

        Returns:
            GeneratorOutput: A custom data structure containing the parsed response.
        """
        # Note: GeneratorOutput is an AdalFlow dataclass that contains the parsed completion data
        return GeneratorOutput(
            data=completion,  # Store the raw completion data
            error=None,  # No error in this case
            raw_response=str(completion),  # Store the raw response as a string
        )

    # Method to parse the response from an embedding API call
    def parse_embedding_response(
        self, response: CreateEmbeddingResponse
    ) -> EmbedderOutput:
        """
        Parse the response from an embedding API call into a custom output format.

        Args:
            response (CreateEmbeddingResponse): The response from the embedding API.

        Returns:
            EmbedderOutput: A custom data structure containing the parsed embedding response.
        """
        try:
            # Attempt to parse the embedding response using a helper function
            return parse_embedding_response(response)
        except Exception as e:
            # If parsing fails, return an error message with the raw response
            return EmbedderOutput(data=[], error=str(e), raw_response=response)

In [46]:
def build_custom_model_client():
    # Instantiate the custom model client (SimpleCustomModelClient)
    custom_client = SimpleCustomModelClient()

    # Define the query for the model to process
    query = "What is the capital of France?"

    # Set the model type for a Large Language Model (LLM)
    model_type = ModelType.LLM

    # Prepare the message prompt as expected by the OpenAI chat API.
    # This format is suitable for GPT-like models (e.g., gpt-3.5-turbo).
    message_prompt = [
        {
            "role": "user",  # Define the user role in the conversation
            "content": [
                {
                    "type": "text",  # Specify that the input is a text type
                    "text": query,  # The actual query to be processed by the model
                }
            ],
        }
    ]

    # Print message indicating the usage of the LLM model type
    print("ModelType LLM")

    # Define additional model parameters like model name, temperature, and max tokens for LLM
    model_kwargs = {"model": "gpt-3.5-turbo", "temperature": 0.5, "max_tokens": 100}

    # Convert the input message and model kwargs into the required API parameters
    api_kwargs = custom_client.convert_inputs_to_api_kwargs(
        input=message_prompt, model_kwargs=model_kwargs, model_type=model_type
    )

    # Print the API arguments that will be passed to the call method
    print(f"api_kwargs: {api_kwargs}")

    # Call the LLM model using the prepared API arguments
    result = custom_client.call(api_kwargs, ModelType.LLM)

    # Print the result of the LLM model call (response from OpenAI)
    print(result)

    # Parse the chat completion response and output a more structured result
    response_text = custom_client.parse_chat_completion(result)

    # Print the structured response from the chat completion
    print(f"response_text: {response_text}")

    # Switch to using the Embedder model type
    print("ModelType EMBEDDER")

    # Define model-specific parameters for the embedding model
    model_kwargs = {
        "model": "text-embedding-3-small",
        "dimensions": 8,
        "encoding_format": "float",
    }

    # Convert the input query for the embedder model
    api_kwargs = custom_client.convert_inputs_to_api_kwargs(
        input=query, model_kwargs=model_kwargs, model_type=ModelType.EMBEDDER
    )

    # Print the API arguments that will be passed to the embedder model
    print(f"embedder api_kwargs: {api_kwargs}")

    # Call the Embedder model using the prepared API arguments
    result = custom_client.call(api_kwargs, ModelType.EMBEDDER)

    # Print the result of the Embedder model call (embedding response)
    print(result)

    # Parse the embedding response and output a more structured result
    response_text = custom_client.parse_embedding_response(result)

    # Print the structured response from the embedding model
    print(f"response_text: {response_text}")

In [47]:
build_custom_model_client()

ModelType LLM
api_kwargs: {'model': 'gpt-3.5-turbo', 'messages': [{'role': 'user', 'content': [{'type': 'text', 'text': 'What is the capital of France?'}]}], 'temperature': 0.5, 'max_tokens': 100}
ChatCompletion(id='chatcmpl-ASHw0PEDqdMlIAIZwr8w2t4L3C9u2', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The capital of France is Paris.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None))], created=1731305488, model='gpt-3.5-turbo-0125', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=7, prompt_tokens=14, total_tokens=21, completion_tokens_details=CompletionTokensDetails(audio_tokens=0, reasoning_tokens=0, accepted_prediction_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))
response_text: GeneratorOutput(id=None, data=ChatCompletion(id='chatcmpl-ASHw0PEDqdMlIAIZwr8w2t4L3C9u2',
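
The custom client above follows a three-step flow: convert inputs to API arguments, call the provider, then parse the response. The sketch below reproduces that flow offline so it can be run without an API key. `EchoClient` and `StubOutput` are hypothetical stand-ins; they do not subclass `ModelClient` or use `GeneratorOutput`, and the response shape is only an imitation of a chat completion.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class StubOutput:
    """Hypothetical stand-in for GeneratorOutput (only the fields used here)."""

    data: Optional[str] = None
    error: Optional[str] = None
    raw_response: Optional[str] = None


class EchoClient:
    """Hypothetical offline client following the convert -> call -> parse flow."""

    def convert_inputs_to_api_kwargs(
        self, input: str, model_kwargs: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        api_kwargs = dict(model_kwargs or {})  # copy so the caller's dict is not mutated
        api_kwargs["messages"] = [{"role": "user", "content": input}]
        return api_kwargs

    def call(self, api_kwargs: Dict[str, Any]) -> Dict[str, Any]:
        # A real client would send api_kwargs to a provider here
        last_message = api_kwargs["messages"][-1]["content"]
        return {"choices": [{"message": {"content": f"echo: {last_message}"}}]}

    def parse_chat_completion(self, completion: Dict[str, Any]) -> StubOutput:
        try:
            text = completion["choices"][0]["message"]["content"]
            return StubOutput(data=text, raw_response=str(completion))
        except (KeyError, IndexError, TypeError) as e:
            # Report malformed responses through the error field instead of raising
            return StubOutput(error=str(e), raw_response=str(completion))


client = EchoClient()
api_kwargs = client.convert_inputs_to_api_kwargs(
    "What is the capital of France?", {"model": "stub-model", "temperature": 0.5}
)
output = client.parse_chat_completion(client.call(api_kwargs))
print(output.data)
```

Keeping the three steps separate makes each one easy to test in isolation: the conversion step can be checked without a network call, and the parsing step can be checked against canned responses.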

# Adalflow multimodal model client

In [None]:
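from adalflow.core import Generator
from adalflow.components.model_client.openai_client import OpenAIClient


def analyze_single_image():
    """Example of analyzing a single image with a vision-capable model (gpt-4o-mini)"""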
    client = OpenAIClient()

    gen = Generator(
        model_client=client,
        model_kwargs={
            "model": "gpt-4o-mini",
            "images": "https://raw.githubusercontent.com/openai/openai-cookbook/main/examples/images/happy_cat.jpg",
            "max_tokens": 300,
        },
    )

    response = gen(
        {"input_str": "What do you see in this image? Be detailed but concise."}
    )
    print("\n=== Single Image Analysis ===")
    print(f"Description: {response.raw_response}")


def analyze_multiple_images():
    """Example of analyzing multiple images in one prompt"""
    client = OpenAIClient()

    # List of images to analyze together
    images = [
        "https://raw.githubusercontent.com/openai/openai-cookbook/main/examples/images/happy_cat.jpg",
        "https://raw.githubusercontent.com/openai/openai-cookbook/main/examples/images/sad_cat.jpg",
    ]

    gen = Generator(
        model_client=client,
        model_kwargs={"model": "gpt-4o-mini", "images": images, "max_tokens": 300},
    )

    response = gen(
        {
            "input_str": "Compare and contrast these two images. What are the main differences?"
        }
    )
    print("\n=== Multiple Images Analysis ===")
    print(f"Comparison: {response.raw_response}")


def generate_art_with_dalle():
    """Example of generating art using DALL-E 3"""
    client = OpenAIClient()

    gen = Generator(
        model_client=client,
        model_kwargs={
            "model": "dall-e-3",
            "size": "1024x1024",
            "quality": "standard",
            "n": 1,
        },
    )

    response = gen(
        {
            "input_str": "A serene Japanese garden with a small bridge over a koi pond, cherry blossoms falling gently in the breeze"
        }
    )
    print("\n=== Art Generation with DALL-E 3 ===")
    print(f"Generated Image URL: {response.data}")


def create_image_variations(image_path="path/to/your/image.jpg"):
    """Example of creating variations of an existing image"""
    client = OpenAIClient()

    gen = Generator(
        model_client=client,
        model_kwargs={
            "model": "dall-e-2",
            "image": image_path,
            "n": 2,  # Generate 2 variations
            "size": "1024x1024",
        },
    )

    response = gen({"input_str": ""})
    print("\n=== Image Variations ===")
    print(f"Variation URLs: {response.data}")


def edit_image_with_mask(image_path="path/to/image.jpg", mask_path="path/to/mask.jpg"):
    """Example of editing specific parts of an image using a mask"""
    client = OpenAIClient()

    gen = Generator(
        model_client=client,
        model_kwargs={
            "model": "dall-e-2",
            "image": image_path,
            "mask": mask_path,
            "n": 1,
            "size": "1024x1024",
        },
    )

    response = gen({"input_str": "Replace the masked area with a beautiful sunset"})
    print("\n=== Image Editing ===")
    print(f"Edited Image URL: {response.data}")


def mixed_image_text_conversation():
    """Example of having a conversation that includes both images and text"""
    client = OpenAIClient()

    gen = Generator(
        model_client=client,
        model_kwargs={
            "model": "gpt-4o-mini",
            "images": [
                "https://raw.githubusercontent.com/openai/openai-cookbook/main/examples/images/happy_cat.jpg",
                "https://path/to/local/image.jpg",  # Replace with your local image path
            ],
            "max_tokens": 300,
        },
    )

    conversation = """<START_OF_SYSTEM_PROMPT>You are a helpful assistant skilled in analyzing images and providing detailed descriptions.</END_OF_SYSTEM_PROMPT>
    <START_OF_USER_PROMPT>I'm showing you two images. Please analyze them and tell me what emotions they convey.</END_OF_USER_PROMPT>"""

    response = gen({"input_str": conversation})
    print("\n=== Mixed Image-Text Conversation ===")
    print(f"Assistant's Analysis: {response.raw_response}")

In [None]:
if __name__ == "__main__":
    print("OpenAI Image Processing Examples\n")

    # Basic image analysis
    analyze_single_image()

    # Multiple image analysis
    analyze_multiple_images()

    # Image generation
    generate_art_with_dalle()

    # create_image_variations(<path_to_image>)
    # edit_image_with_mask(<path_to_image>, <path_to_mask>)
    # mixed_image_text_conversation()  # takes no arguments; replace the placeholder image URL inside the function first
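
For a vision request, the `images` entry in `model_kwargs` has to end up inside the chat message itself. Here is a minimal sketch of the OpenAI chat format for mixed text and image input. The helper `build_vision_messages` is hypothetical, and how AdalFlow's `OpenAIClient` builds this internally is not shown in this notebook.

```python
from typing import Dict, List, Union


def build_vision_messages(prompt: str, images: Union[str, List[str]]) -> List[Dict]:
    """Build a single user message holding one text part and one part per image URL."""
    urls = [images] if isinstance(images, str) else list(images)
    content = [{"type": "text", "text": prompt}]
    for url in urls:
        content.append({"type": "image_url", "image_url": {"url": url}})
    return [{"role": "user", "content": content}]


messages = build_vision_messages(
    "What do you see in this image?",
    "https://raw.githubusercontent.com/openai/openai-cookbook/main/examples/images/happy_cat.jpg",
)
print(messages)
```

Accepting either a single URL or a list mirrors the two ways `images` is passed in `analyze_single_image()` and `analyze_multiple_images()` above.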

# Image generation with DALL-E and image understanding

In [None]:
from adalflow.core import Generator
from adalflow.components.model_client.openai_client import OpenAIClient
from adalflow.core.types import ModelType

In [None]:
class ImageGenerator(Generator):
    """Generator subclass for image generation."""

    model_type = ModelType.IMAGE_GENERATION


def test_vision_and_generation():
    """Test both vision analysis and image generation"""
    client = OpenAIClient()

    # 1. Test Vision Analysis
    vision_gen = Generator(
        model_client=client,
        model_kwargs={
            "model": "gpt-4o-mini",
            "images": "https://upload.wikimedia.org/wikipedia/en/7/7d/Lenna_%28test_image%29.png",
            "max_tokens": 300,
        },
    )

    vision_response = vision_gen(
        {"input_str": "What do you see in this image? Be detailed but concise."}
    )
    print("\n=== Vision Analysis ===")
    print(f"Description: {vision_response.raw_response}")

    # 2. Test DALL-E Image Generation
    dalle_gen = ImageGenerator(
        model_client=client,
        model_kwargs={
            "model": "dall-e-3",
            "size": "1024x1024",
            "quality": "standard",
            "n": 1,
        },
    )

    # For image generation, input_str becomes the prompt
    response = dalle_gen(
        {"input_str": "A happy siamese cat playing with a red ball of yarn"}
    )
    print("\n=== DALL-E Generation ===")
    print(f"Generated Image URL: {response.data}")

# Invalid image URL - Generator output still works!

In [None]:
def test_invalid_image_url():
    """Test Generator output with invalid image URL"""
    client = OpenAIClient()
    gen = Generator(
        model_client=client,
        model_kwargs={
            "model": "gpt-4o-mini",
            "images": "https://invalid.url/nonexistent.jpg",
            "max_tokens": 300,
        },
    )

    print("\n=== Testing Invalid Image URL ===")
    response = gen({"input_str": "What do you see in this image?"})
    print(f"Response with invalid image URL: {response}")

In [None]:
if __name__ == "__main__":
    print("Starting OpenAI Vision and DALL-E test...\n")
    test_invalid_image_url()
    test_vision_and_generation()
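
When a call can fail, as in the invalid-URL test above, the caller should inspect the returned output before using it. The sketch below assumes that the failure is reported through the output's `error` field rather than raised as an exception; the heading above suggests this, but the notebook records no output for that cell. `StubGeneratorOutput` and `describe` are hypothetical stand-ins that mirror the `data`, `error`, and `raw_response` fields seen in the printed `GeneratorOutput` earlier.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StubGeneratorOutput:
    """Hypothetical stand-in exposing the data/error/raw_response fields."""

    data: Any = None
    error: Optional[str] = None
    raw_response: Optional[str] = None


def describe(output: StubGeneratorOutput) -> str:
    """Return a short status line, checking the error field first."""
    if output.error is not None:
        return f"failed: {output.error}"
    return f"ok: {output.raw_response}"


failed = StubGeneratorOutput(error="could not download image")
succeeded = StubGeneratorOutput(raw_response="A cat sitting on a sofa.")
print(describe(failed))
print(describe(succeeded))
```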

# Issues and feedback

If you encounter any issues, please report them here: [GitHub Issues](https://github.com/SylphAI-Inc/LightRAG/issues).

For feedback, you can use either the [GitHub discussions](https://github.com/SylphAI-Inc/LightRAG/discussions) or [Discord](https://discord.gg/ezzszrRZvT).