# 02 REST API | 04 Multi Modal

## GPT-4 Vision

GPT-4 Vision is a multimodal model that accepts both text and images as input. It combines natural language understanding with visual information, so it can answer questions about the content of an image in the context of the accompanying text. This enables applications that need to analyze text and images together.

### Further information

- [Use GPT-4 Turbo with Vision](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/gpt-with-vision)
- [GPT-4 Turbo with Vision concepts](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/gpt-with-vision)

## Azure Environment

Running the sample code requires Azure service details such as the endpoint and API key. Make sure you are using a `gpt-4` deployment with model version `vision` or `vision preview` ([details and instructions can be found here](../01_DemoEnvironment/01_Environment.ipynb)).

We expect the LLM to identify the attraction provided in this image: 

![LittleMermaid](../../media/img/03_SDK/04_LittleMermaidSmall.jpg) 

It should deliver the name, location and country of the attraction. 

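The LLM accepts the image either as a URL or as a base64-encoded string. This sample embeds the image in the request as a base64-encoded string. Alternatively, the image can be uploaded to a Storage Account and referenced through a Shared Access Signature URL, which grants secure access to the image.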

## Step 1: Set Up Parameters

In [None]:
#r "nuget: DotNetEnv, 2.5.0"
#r "nuget: System.Text.Json, 7.0.3"
using DotNetEnv;

using System.Net;
using System.Net.Http;
using System.Text; // Encoding.UTF8 is used in Step 4
using System.Text.Json.Nodes;
using System.Text.Json;
using System.IO; 

static string _configurationFile = @"../01_DemoEnvironment/conf/application.env";
Env.Load(_configurationFile);

string apiBase = Environment.GetEnvironmentVariable("SKIT_AOAIVISION_ENDPOINT") ?? "SKIT_AOAIVISION_ENDPOINT not found";
string apiKey = Environment.GetEnvironmentVariable("SKIT_AOAIVISION_APIKEY") ?? "SKIT_AOAIVISION_APIKEY not found";
string deploymentName = Environment.GetEnvironmentVariable("SKIT_AOAIVISION_DEPLOYMENTNAME") ?? "SKIT_AOAIVISION_DEPLOYMENTNAME not found";
string assetsFolder = Path.Combine(Directory.GetCurrentDirectory(), "..", "..", "assets");

string apiVersion = "2023-12-01-preview"; //may change in the future

Expected output:
```
Installed Packages
    DotNetEnv, 2.5.0
    System.Text.Json, 7.0.3
```
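
The fallback strings in the cell above mean that a missing variable only surfaces later, as a failed request against a malformed URL. A minimal sketch of a fail-fast alternative follows. The helper `RequireEnv` and the variable name `SKIT_DEMO_VALUE` are assumptions made for illustration and are not part of the original notebook.

```csharp
using System;

// Hypothetical helper (not part of the original notebook): returns the value of an
// environment variable, or throws with a message that names the missing variable.
static string RequireEnv(string name)
{
    var value = Environment.GetEnvironmentVariable(name);
    if (string.IsNullOrWhiteSpace(value))
    {
        throw new InvalidOperationException($"Environment variable '{name}' is not set.");
    }
    return value;
}

// Demo with a throwaway variable so the sketch runs without the real configuration.
Environment.SetEnvironmentVariable("SKIT_DEMO_VALUE", "https://example.invalid/");
Console.WriteLine(RequireEnv("SKIT_DEMO_VALUE"));
```

In the notebook, the same helper could wrap the three `SKIT_AOAIVISION_*` lookups so that a configuration problem is reported in Step 1 rather than in Step 4.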

## Step 2: Retrieve image

Images can be provided to an Azure OpenAI deployment in two ways:

- through a publicly available URI, or
- as a base64-encoded string.

With a publicly available URI, the model retrieves the image directly from the web. A base64-encoded string carries the image's binary data, so the image travels inside the request payload itself.

This sample provides the image as a base64-encoded string. To provide it through a publicly available URI instead (e.g. an Azure Storage Account URL with a Shared Access Signature), replace `data:image/jpeg;base64,...` with the image's URI:

```csharp
new JsonObject {
    { 
        "image_url",  new JsonObject 
        {
            { "url", "<<PUBLICLY AVAILABLE URI>>" }
        }
    },
    {"type", "image_url" }
}
```
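
Both input styles share the same `image_url` content part and differ only in the string passed as `url`. The sketch below illustrates this with two hypothetical helpers, `ImagePart` and `ToDataUri`, which are assumptions for this example and not part of the original notebook. The byte array and the `example.invalid` URI are placeholders.

```csharp
using System;
using System.Text.Json.Nodes;

// Hypothetical helper: builds the "image_url" content part for either input style.
static JsonObject ImagePart(string url) => new JsonObject
{
    { "type", "image_url" },
    { "image_url", new JsonObject { { "url", url } } }
};

// Hypothetical helper: wraps raw bytes in a data URI; the caller supplies the MIME type.
static string ToDataUri(byte[] bytes, string mimeType) =>
    $"data:{mimeType};base64,{Convert.ToBase64String(bytes)}";

byte[] sampleBytes = { 1, 2, 3 }; // stand-in for File.ReadAllBytes(...)
JsonObject fromBase64 = ImagePart(ToDataUri(sampleBytes, "image/jpeg"));
JsonObject fromUri = ImagePart("https://example.invalid/image.jpg"); // placeholder URI

Console.WriteLine(fromBase64.ToJsonString());
Console.WriteLine(fromUri.ToJsonString());
```

Keeping the MIME type as a parameter avoids hard-coding `image/jpeg` when other image formats are used.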



In [None]:
string base64Image = Convert.ToBase64String(File.ReadAllBytes(Path.Combine(assetsFolder, "docs", "02_REST_API", "LittleMermaid.jpg")));

Console.WriteLine("Image converted to base64-encoded string");

Expected output:
```
Image converted to base64-encoded string
```

## Step 3: Create payload

In [None]:
//Define System Prompt
string systemMessage = @" 
    You are an assistant who helps travel agency staff to find interesting attractions around the world. 
    You name the place of interest, the city and the country in which the attraction is located. 
    You do not provide any further information.
"; 

string userMessage = @"
    Which place of interest is this?
";

JsonObject requestPayload = new JsonObject
{
    { "messages", new JsonArray
        {
            new JsonObject
            {
                { "content", systemMessage },
                { "role", "system" }
            },
            new JsonObject 
            {
                { "content", 
                    new JsonArray {
                        new JsonObject {
                            { "text", userMessage },
                            { "type", "text" }
                        }, 
                        new JsonObject {
                            { 
                                "image_url",  new JsonObject 
                                {
                                    { "url", string.Concat("data:image/jpeg;base64,", base64Image) }
                                }
                            },
                            {"type", "image_url" }
                        }
                    }
                }, 
                { "role", "user" }
            }
        }
    },
    { "max_tokens", 200 },
    { "temperature", 0.7 },
    { "frequency_penalty", 0 },
    { "presence_penalty", 0 },
    { "top_p", 0.95 },
    { "model", "gpt4vision"}
};

string payload = JsonSerializer.Serialize(requestPayload, new JsonSerializerOptions
{
    WriteIndented = true // Optional: to make the JSON string more readable
});

Console.WriteLine("Request payload created...");

Expected output:
```
Request payload created...
```

## Step 4: Call OpenAI Endpoint

The following code cell uses an instance of `HttpClient` to call the REST API of the deployed Azure OpenAI instance.

In [None]:

// Normalize the base URL so it works with or without a trailing slash
string endpoint = $"{apiBase.TrimEnd('/')}/openai/deployments/{deploymentName}/chat/completions?api-version={apiVersion}";

using (HttpClient httpClient = new HttpClient())
{
    httpClient.BaseAddress = new Uri(endpoint);
    httpClient.DefaultRequestHeaders.Add("api-key",apiKey);
    httpClient.DefaultRequestHeaders.Accept.Add(new System.Net.Http.Headers.MediaTypeWithQualityHeaderValue("application/json"));

    var stringContent = new StringContent(payload, Encoding.UTF8, "application/json");

    var response = await httpClient.PostAsync(endpoint, stringContent);

    if (response.IsSuccessStatusCode)
    {
        using (var responseStream = await response.Content.ReadAsStreamAsync())
        {
            // Parse the JSON response using JsonDocument
            using (var jsonDoc = await JsonDocument.ParseAsync(responseStream))
            {
                // Access the message content dynamically
                JsonElement jsonElement = jsonDoc.RootElement;
                string messageContent = jsonElement.GetProperty("choices")[0].GetProperty("message").GetProperty("content").GetString();

                // Output the message content
                Console.WriteLine("Output: " + messageContent);
            }
        }
    }
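    else
    {
        // Print the status code and the response body, which carries the service's error details
        string errorBody = await response.Content.ReadAsStringAsync();
        Console.WriteLine($"Error: {(int)response.StatusCode} {response.ReasonPhrase}");
        Console.WriteLine(errorBody);
    }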
}


Expected output:
```
Output: The Little Mermaid statue, Copenhagen, Denmark.
```
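
The cell above reads `choices[0].message.content` with `GetProperty`, which throws if a property is missing. A more defensive variant can use `TryGetProperty` instead, as in the following sketch. The helper `ExtractContent` is hypothetical, and the sample JSON is an abbreviated, illustrative response body rather than captured service output.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper: returns the first choice's message text, or an empty string
// if the expected properties are missing from the response body.
static string ExtractContent(string json)
{
    using (JsonDocument doc = JsonDocument.Parse(json))
    {
        JsonElement root = doc.RootElement;
        if (root.TryGetProperty("choices", out JsonElement choices)
            && choices.ValueKind == JsonValueKind.Array
            && choices.GetArrayLength() > 0
            && choices[0].TryGetProperty("message", out JsonElement message)
            && message.TryGetProperty("content", out JsonElement content)
            && content.ValueKind == JsonValueKind.String)
        {
            return content.GetString() ?? string.Empty;
        }
        return string.Empty;
    }
}

// Abbreviated, illustrative response body (an assumption for this sketch).
string sampleResponse = @"{
  ""choices"": [
    { ""message"": { ""role"": ""assistant"", ""content"": ""The Little Mermaid statue, Copenhagen, Denmark."" } }
  ]
}";

Console.WriteLine("Output: " + ExtractContent(sampleResponse));
```

Returning an empty string keeps the sketch short; a caller that needs to distinguish a missing answer from an empty one could return a nullable value or throw instead.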

## Next Steps

Now that you understand how to use the REST API to interact with Azure OpenAI, explore how to use the Azure OpenAI client library for .NET in the [next notebook](../03_SDK/01_ChatCompletion.ipynb).