# 05 Multi Modal | 01 OpenAI GPT4 Vision

## GPT-4 Vision

GPT-4 Vision is a multimodal model that accepts both text and images as input. It combines natural language understanding with visual information, so a single prompt can ask questions about the content of an image. This enables applications that need to analyze text and images together.


## Azure Environment

To run the sample code, you need Azure service information such as the endpoint and API key. Make sure you are using a gpt-4 deployment with model version `vision` or `vision preview`. ([Details and instructions can be found here](../01_CreateEnvironment/01_Environment.ipynb))


## Step 1: Create OpenAIClient

The OpenAIClient from Azure.AI.OpenAI is a .NET client library that serves as the central entry point for .NET code that interacts with a deployed Azure OpenAI Large Language Model. It provides methods to access the OpenAI REST APIs for tasks such as text completion, text embedding, and chat completion. It also lets developers specify the model, engine, and options for each request, such as temperature, frequency penalty, presence penalty, and stop sequences.

The OpenAIClient can connect to any Azure OpenAI resource or to the non-Azure OpenAI inference endpoint, making it a versatile and powerful tool for .NET development with OpenAI.


In [1]:
#r "nuget: Azure.AI.OpenAI, 1.0.0-beta.16"
#r "nuget: DotNetEnv, 2.5.0"

using Azure; 
using Azure.AI.OpenAI;
using DotNetEnv;
using System.IO;
using System.Text.Json; 

//configuration file is created during environment creation
static string _configurationFile = @"../Configuration/application.env";
Env.Load(_configurationFile);

string oAiApiKey = Environment.GetEnvironmentVariable("WS_AOAIVISION_APIKEY") ?? "WS_AOAIVISION_APIKEY not found";
string oAiEndpoint = Environment.GetEnvironmentVariable("WS_AOAIVISION_ENDPOINT") ?? "WS_AOAIVISION_ENDPOINT not found";
string chatCompletionDeploymentName = Environment.GetEnvironmentVariable("WS_AOAIVISION_DEPLOYMENTNAME") ?? "WS_AOAIVISION_DEPLOYMENTNAME not found";
string storageConnectionString = Environment.GetEnvironmentVariable("WS_STORAGE_CONNECTIONSTRING") ?? "WS_STORAGE_CONNECTIONSTRING not found";
string assetsFolder = Path.Combine(Directory.GetCurrentDirectory(), "..", "..", "assets");

AzureKeyCredential azureKeyCredential = new AzureKeyCredential(oAiApiKey);
OpenAIClient openAIClient = new OpenAIClient(new Uri(oAiEndpoint), azureKeyCredential);


Console.WriteLine($"OpenAI Client created...");

OpenAI Client created...


## Step 2: Upload image to Azure Storage and create Shared Access Signature

We expect the LLM to identify the attraction shown in this image: 

![LittleMermaid](../Assets/Vision/LittleMermaid.jpg) 

It should deliver the name, location and country of the attraction. 

Images can be provided to an instance of Azure OpenAI in two ways: 

- through a publicly available URI or 
- as a base64-encoded string. 
  
In this sample we upload the image to a Storage Account and create a Shared Access Signature which can be used to securely provide the image.
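
For comparison, the base64 option listed above can be sketched as follows. This is a minimal illustration, not part of the original sample: `ToDataUrl` is a hypothetical helper, and the image path is assumed to match the one used elsewhere in this notebook. It only builds the `data:` URL string; how that string is handed to the client library depends on the SDK version and is not shown here.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of the original sample): builds a data URL
// of the form "data:<mime type>;base64,<payload>" from raw image bytes.
static string ToDataUrl(byte[] imageBytes, string mimeType) =>
    $"data:{mimeType};base64,{Convert.ToBase64String(imageBytes)}";

// Assumption: same relative image path as used in the upload cell.
string imagePath = "../assets/Vision/LittleMermaid.jpg";
if (File.Exists(imagePath))
{
    string dataUrl = ToDataUrl(File.ReadAllBytes(imagePath), "image/jpeg");
    Console.WriteLine($"Data URL length: {dataUrl.Length} characters");
}
```

Base64 encoding increases the payload size by roughly a third, so a URL-based approach is often more practical for large images. The rest of this notebook uses the Storage Account and Shared Access Signature approach.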

In [2]:
#r "nuget: Azure.Storage.Blobs"

using Azure.Storage.Blobs;
using Azure.Storage.Sas;

string containerName = "skit";
string blobName = "LittleMermaid.jpg";
string localFilePath = "../assets/Vision/LittleMermaid.jpg";
string sasUrl = "";

BlobServiceClient blobServiceClient = new BlobServiceClient(storageConnectionString);
BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient(containerName); 
containerClient.CreateIfNotExists();

BlobClient blobClient = containerClient.GetBlobClient(blobName);
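// Upload the local image, overwriting any existing blob with the same name
using (FileStream fileStream = File.OpenRead(localFilePath))
{
    await blobClient.UploadAsync(fileStream, overwrite: true);
}

// Create a read-only SAS for this single blob, valid for one day
BlobSasBuilder sasBuilder = new BlobSasBuilder()
{
    BlobContainerName = containerName,
    BlobName = blobClient.Name,
    Resource = "b", // "b" = the SAS applies to a single blob
    ExpiresOn = DateTimeOffset.UtcNow.AddDays(1)
};
// Use blob-level permissions to match the blob-level resource type
sasBuilder.SetPermissions(BlobSasPermissions.Read);

sasUrl = blobClient.GenerateSasUri(sasBuilder).ToString();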
Console.WriteLine($"Blob uploaded to {sasUrl}");


Blob uploaded to https://wsstorage418.blob.core.windows.net/skit/LittleMermaid.jpg?sv=2023-11-03&se=2024-04-24T09%3A53%3A41Z&sr=b&sp=r&sig=%2BxXDb1l5mf3gJRGQt1vfKzoK4mwV8jXZOdjznxoeSos%3D
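
The query string of the generated URL encodes what the Shared Access Signature allows. As a rough sketch, the parameters can be inspected with the standard library. `GetSasParameter` is a hypothetical helper introduced only for illustration, and the URL below uses a placeholder account name and signature rather than a real SAS.

```csharp
using System;
using System.Web;

// Hypothetical helper: returns the decoded value of one SAS query parameter,
// or an empty string if the parameter is not present.
static string GetSasParameter(string url, string name) =>
    HttpUtility.ParseQueryString(new Uri(url).Query)[name] ?? string.Empty;

// Example URL with a placeholder account and signature (not a real SAS).
string exampleSasUrl =
    "https://example.blob.core.windows.net/skit/LittleMermaid.jpg" +
    "?sv=2023-11-03&se=2024-04-24T09%3A53%3A41Z&sr=b&sp=r&sig=placeholder";

Console.WriteLine($"Service version (sv): {GetSasParameter(exampleSasUrl, "sv")}");
Console.WriteLine($"Expires (se):         {GetSasParameter(exampleSasUrl, "se")}");
Console.WriteLine($"Resource (sr):        {GetSasParameter(exampleSasUrl, "sr")}");
Console.WriteLine($"Permissions (sp):     {GetSasParameter(exampleSasUrl, "sp")}");
```

Here `sr=b` means the signature is scoped to a single blob, `sp=r` grants read access only, and `se` is the expiry time in UTC. The `sig` parameter is the signature itself and should be treated as a secret for as long as the SAS is valid.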


## Step 3: Compose ChatCompletionsOptions

Each chat follows a similar structure, where _System_, _Assistant_ and _User_ messages are added in sequence. Parameters such as _Temperature_ can be set per call. The image URL and the text instruction describing how to process the image are provided together in a single user message:

```csharp
new ChatRequestUserMessage(
    new ChatMessageTextContentItem(userMessage),
    new ChatMessageImageContentItem(new Uri(sasUrl))
)
```

***System prompt:*** You are an assistant who helps travel agency staff to find interesting attractions around the world. You name the place of interest, the city and the country in which the attraction is located. You do not provide any further information.

***Prompt:*** Which place of interest is this?


In [3]:
//Define System Prompt
string systemMessage = @" 
    You are an assistant who helps travel agency staff to find interesting attractions around the world. 
    You name the place of interest, the city and the country in which the attraction is located. 
    You do not provide any further information.
"; 

string userMessage = @"
    Which place of interest is this?
";

//Compose Chat (Simplified - No few shot learning in this example)
ChatCompletionsOptions chatCompletionsOptions = new ChatCompletionsOptions()
{
    DeploymentName = chatCompletionDeploymentName,
    Messages =
    {
        new ChatRequestSystemMessage(systemMessage),
        // A single user message carries both the text instruction and the image URL
        new ChatRequestUserMessage(
            new ChatMessageTextContentItem(userMessage),
            new ChatMessageImageContentItem(new Uri(sasUrl))
        )
    },
    MaxTokens = 100, 
    NucleusSamplingFactor = 0.1f,
    Temperature = 0.1f,
};

Console.WriteLine($"ChatCompletionsOptions created...");



ChatCompletionsOptions created...


## Step 4: Call ChatCompletion API

The final step is to call the `GetChatCompletionsAsync()` method to get the response from the LLM.



In [4]:
Response<ChatCompletions> response = await openAIClient.GetChatCompletionsAsync(chatCompletionsOptions);
Console.WriteLine(response.Value.Choices[0].Message.Content);

The Little Mermaid statue, Copenhagen, Denmark.
