# Introduction

In this notebook, we demonstrate how you can generate advertising content for your inventory to boost sales. We do this with the [vector similarity search](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/vector-search) functionality of Azure Cosmos DB for MongoDB vCore. We use OpenAI embeddings to generate vectors for the inventory descriptions, which capture the meaning of each description rather than just its keywords. The vectors are then stored and indexed in the MongoDB vCore database. When generating content for an advertisement, we also vectorize the advertisement topic and find the matching inventory items. We then use retrieval-augmented generation (RAG), sending the top matches to OpenAI to generate a catchy advertisement.
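
To make "vector similarity" concrete, here is a minimal sketch of cosine similarity, the measure the index later in this notebook is configured with (`'similarity': 'COS'`). The three-dimensional vectors and their names are made up for illustration; real `text-embedding-ada-002` vectors have 1536 dimensions, and the database computes the score for you.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy stand-ins for embeddings (hypothetical values, not real model output).
ad_topic = [1.0, 0.0, 1.0]
rain_boot = [0.9, 0.1, 0.8]
sandal = [0.0, 1.0, 0.1]

# The item pointing in nearly the same direction as the topic scores higher.
print(cosine_similarity(ad_topic, rain_boot) > cosine_similarity(ad_topic, sandal))
```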

# Scenario

1. A shoe retailer wants to sell more shoes.
2. The retailer wants to run advertisements that capitalize on recent trends.
3. The retailer wants to generate the advertisement content from the inventory items that match the trend.

## Azure OpenAI <a class="anchor" id="azureopenai"></a>

Let's set up our Azure OpenAI resource. Currently, access to this service is granted only by application. You can apply for access to Azure OpenAI by completing the form at https://aka.ms/oai/access. Once you have access, complete the following steps:

- Create an Azure OpenAI resource following this quickstart: https://learn.microsoft.com/azure/ai-services/openai/how-to/create-resource?pivots=web-portal
- Deploy a `completions` and `embeddings` model 
    - For more information on `completions`, go here: https://learn.microsoft.com/azure/ai-services/openai/how-to/completions
    - For more information on `embeddings`, go here: https://learn.microsoft.com/azure/ai-services/openai/how-to/embeddings
- Copy the endpoint, key, and deployment names (embeddings model, completions model) into the environment (.env) file.

## Create an Azure Cosmos DB for MongoDB vCore resource<a class="anchor" id="cosmosdb"></a>
Let's start by creating an Azure Cosmos DB for MongoDB vCore Resource following this quick start guide: https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/quickstart-portal

Then copy the connection details (server, user, pwd) into the environment file.

# Preliminaries <a class="anchor" id="preliminaries"></a>
First, let's start by installing the packages that we'll need later. 

In [1]:
! pip install numpy
! pip install openai==1.2.3
! pip install pymongo
! pip install python-dotenv
! pip install azure-core
! pip install azure-cosmos
! pip install tenacity
! pip install gradio
! pip show openai






Name: openai
Version: 1.2.3
Summary: The official Python library for the openai API
Home-page: 
Author: 
Author-email: OpenAI <support@openai.com>
License: 
Location: C:\Users\Khelan Modi\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages
Requires: anyio, distro, httpx, pydantic, tqdm, typing-extensions
Required-by: 


Please use the example.env as a template to provide the necessary keys and endpoints in your own .env file.
Make sure to modify the env_name accordingly. 
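
Before running the cells below, it can help to check that the .env file actually contains every setting the notebook reads. The following is a small sketch; the key names are the ones used by the code cells in this notebook, while `missing_settings` is a hypothetical helper introduced here. The plain dict stands in for the result of `dotenv_values(env_name)`.

```python
# Keys read from the .env file by the cells in this notebook.
# (`dali_api_key` is only needed if you enable image generation.)
REQUIRED_KEYS = [
    "openai_api_type",
    "openai_api_key",
    "openai_api_endpoint",
    "openai_api_version",
    "mongo_vcore_connection_string",
]

def missing_settings(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in `config`."""
    return [key for key in required if not config.get(key)]

# A plain dict standing in for dotenv_values("your.env").
sample = {"openai_api_type": "azure", "openai_api_key": "placeholder"}
print(missing_settings(sample))
```

An empty list means all settings are present; anything else names the entries still to be filled in.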

In [2]:
import json
import time
import openai

from dotenv import dotenv_values
from openai import AzureOpenAI

# Specify the name of your .env file
env_name = "example.env" # following the example.env template, change this to your own .env file name
config = dotenv_values(env_name)

openai.api_type = config['openai_api_type']
openai.api_key = config['openai_api_key']
openai.api_base = config['openai_api_endpoint']
openai.api_version = config['openai_api_version']

client = AzureOpenAI(
    api_key=openai.api_key,
    api_version=openai.api_version,
    azure_endpoint = openai.api_base
)


# Create embeddings <a class="anchor" id="loaddata"></a>
Here we'll define a helper that uses Azure OpenAI to create a vector embedding from a piece of text, and try it on a sample ad topic.

In [3]:
import openai

def generate_embeddings(text):
    try:
        response = client.embeddings.create(
            input=text, model="text-embedding-ada-002")
        embeddings = response.data[0].embedding
        return embeddings
    except Exception as e:
        print(f"An error occurred: {e}")
        return None

embeddings = generate_embeddings("Shoes for San Francisco summer")

if embeddings is not None:
    print(embeddings)

[0.013928048312664032, -0.018227633088827133, -0.002117219613865018, -0.028481490910053253, -0.0009161046473309398, 0.006358173675835133, -0.03551717475056648, 0.006937965750694275, 0.019374188035726547, -0.019608711823821068, 0.00758290383964777, -0.00019462134514469653, 0.006964024156332016, -0.008905352093279362, -0.0035178419202566147, -0.018318835645914078, 0.01792796514928341, 0.004856576211750507, 0.028716012835502625, -0.017849789932370186, -0.017458919435739517, 0.004286555573344231, 0.004788173828274012, -0.02371286042034626, -0.00694448035210371, -0.004768630024045706, 0.01878788135945797, -0.011022571474313736, -0.02457277663052082, -0.021706387400627136, 0.027256760746240616, -0.00280124437995255, -0.023139582946896553, -0.0028338171541690826, 0.0006429018685594201, -0.006709957495331764, -0.005426596850156784, 0.0009234335157088935, 0.012117010541260242, -0.010325517505407333, 0.012384106405079365, -0.0068663060665130615, -0.003034138586372137, -0.0029103627894073725, 0.0
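
Embedding calls can fail transiently (for example, when a rate limit is hit), and `generate_embeddings` above simply returns `None` in that case. One option is to retry the underlying call with exponential backoff. The sketch below uses only the standard library; `with_retries` is a hypothetical helper, and the `tenacity` package installed earlier offers a decorator-based alternative.

```python
import time

def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on an exception wait base_delay * 2**n seconds and retry.

    At most `attempts` calls are made; the last error is re-raised.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))

# Usage with the client defined earlier (not executed here):
# response = with_retries(lambda: client.embeddings.create(
#     input="Shoes for San Francisco summer", model="text-embedding-ada-002"))
```

Note that the retry has to wrap the raw client call, since `generate_embeddings` swallows exceptions before a wrapper could see them.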

# Connect and setup Cosmos DB for MongoDB vCore

## Set up the connection

In [4]:
import pymongo

env_name = "example.env"
config = dotenv_values(env_name)

mongo_conn = config['mongo_vcore_connection_string']
mongo_client = pymongo.MongoClient(mongo_conn)

## Set up the DB and collection

In [5]:
DATABASE_NAME = "AdgenDatabase"
COLLECTION_NAME = "AdgenCollection"

mongo_client.drop_database(DATABASE_NAME)
db = mongo_client[DATABASE_NAME]
collection = db[COLLECTION_NAME]

if COLLECTION_NAME not in db.list_collection_names():
    # Creates an unsharded collection that uses the DB's shared throughput
    db.create_collection(COLLECTION_NAME)
    print("Created collection '{}'.\n".format(COLLECTION_NAME))
else:
    print("Using collection: '{}'.\n".format(COLLECTION_NAME))

Created collection 'AdgenCollection'.



## Create the vector index

**IMPORTANT: You can only create one index per vector property.** That is, you cannot create more than one index that points to the same vector property. If you want to change the index type (e.g., from IVF to HNSW) you must drop the index first before creating a new index.

### IVF
IVF is the default vector indexing algorithm, which works on all cluster tiers. It's an approximate nearest neighbors (ANN) approach that uses clustering to speed up the search for similar vectors in a dataset.

In [6]:
db.command({
  'createIndexes': COLLECTION_NAME,
  'indexes': [
    {
      'name': 'vectorSearchIndex',
      'key': {
        "contentVector": "cosmosSearch"
      },
      'cosmosSearchOptions': {
        'kind': 'vector-ivf',
        'numLists': 1,
        'similarity': 'COS',
        'dimensions': 1536
      }
    }
  ]
});
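
The index above uses `numLists: 1`, which places every vector in a single cluster, so a search effectively scans the whole collection. That is fine for the 300 items used here. For larger collections, a commonly cited starting point (stated here as an assumption; please confirm it against the vector search documentation linked in the introduction) is one list per 1,000 documents up to a million documents, and the square root of the document count beyond that. `suggest_num_lists` is a hypothetical helper that sketches this rule.

```python
import math

def suggest_num_lists(document_count):
    """Heuristic starting value for the IVF `numLists` option (assumed rule of thumb)."""
    if document_count <= 0:
        raise ValueError("document_count must be positive")
    if document_count <= 1_000_000:
        # One list per 1,000 documents, but never fewer than one list.
        return max(1, document_count // 1000)
    return math.isqrt(document_count)

print(suggest_num_lists(300))        # 1
print(suggest_num_lists(250_000))    # 250
print(suggest_num_lists(4_000_000))  # 2000
```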

### HNSW

HNSW stands for Hierarchical Navigable Small World, a graph-based data structure that partitions vectors into clusters and subclusters. With HNSW, you can perform approximate nearest neighbor (ANN) search at higher speeds and with greater accuracy. As a preview feature, it must be enabled using Azure Feature Enablement Control (AFEC) by selecting the "mongoHnswIndex" feature. For more information, see [enable preview features](https://learn.microsoft.com/azure/azure-resource-manager/management/preview-features).

HNSW works on M50 cluster tiers and higher while in preview.

In [8]:
db.command(
{ 
    "createIndexes": "ExampleCollection",
    "indexes": [
        {
            "name": "VectorSearchIndex",
            "key": {
                "contentVector": "cosmosSearch"
            },
            "cosmosSearchOptions": { 
                "kind": "vector-hnsw", 
                "m": 16, # default value 
                "efConstruction": 64, # default value 
                "similarity": "COS", 
                "dimensions": 1536
            } 
        } 
    ] 
}
)

OperationFailure: hnsw index is not supported yet, full error: {'ok': 0.0, 'errmsg': 'hnsw index is not supported yet', 'code': 115, 'codeName': 'CommandNotSupported'}
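
As noted at the start of this section, only one index may point at a given vector property, so switching index types on the same collection means dropping the old index first. A minimal sketch of that sequence follows; `replace_vector_index` is a hypothetical helper that relies on pymongo's `Collection.drop_index` and `Database.command`, and it assumes the cluster supports the new index kind.

```python
def replace_vector_index(db, collection_name, old_index_name, new_index_spec):
    """Drop an existing vector index, then create its replacement."""
    # The old index must be removed before a new one can target the same property.
    db[collection_name].drop_index(old_index_name)
    return db.command({
        "createIndexes": collection_name,
        "indexes": [new_index_spec],
    })

# Same HNSW options as in the cell above.
hnsw_spec = {
    "name": "vectorSearchIndex",
    "key": {"contentVector": "cosmosSearch"},
    "cosmosSearchOptions": {
        "kind": "vector-hnsw",
        "m": 16,
        "efConstruction": 64,
        "similarity": "COS",
        "dimensions": 1536,
    },
}

# Usage with the objects defined earlier (not executed here):
# replace_vector_index(db, COLLECTION_NAME, "vectorSearchIndex", hnsw_spec)
```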

## Upload data to the collection
A simple `insert_many()` to insert our data in JSON format into the newly created DB and collection.

In [7]:
# Load the shoe inventory, which already includes a vector for each item
with open("./data/shoes_with_vectors.json", mode="r") as data_file:
    data = json.load(data_file)

result = collection.insert_many(data)

print(f"Number of data points added: {len(result.inserted_ids)}")

Number of data points added: 300
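
The data file used above already contains a `contentVector` for every shoe. If you start from your own inventory without vectors, a sketch like the following could add them before the insert. `attach_vectors` is a hypothetical helper, and which text field was embedded to produce the sample file is not stated in this notebook, so embedding the `description` field is an assumption. In the notebook you would pass `generate_embeddings` as `embed`; a toy function stands in for it here.

```python
def attach_vectors(items, embed, text_field="description", vector_field="contentVector"):
    """Return copies of `items` with an embedding stored under `vector_field`."""
    enriched = []
    for item in items:
        vector = embed(item[text_field])
        if vector is None:
            continue  # generate_embeddings returns None on failure; skip those items
        enriched.append({**item, vector_field: vector})
    return enriched

def fake_embed(text):
    # Stand-in for generate_embeddings: returns a tiny made-up vector.
    return [float(len(text)), 0.0]

shoes = [{"name": "Trail Runner", "description": "Lightweight waterproof trail shoe"}]
with_vectors = attach_vectors(shoes, fake_embed)
print(list(with_vectors[0].keys()))
```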


# Vector Search in Cosmos DB for MongoDB vCore

In [8]:
# Function to assist with vector search
def vector_search(query, num_results=3):
    """Embed `query` and return a cursor over the `num_results` most similar documents."""
    query_vector = generate_embeddings(query)

    pipeline = [
        {
            '$search': {
                "cosmosSearch": {
                    "vector": query_vector,
                    "numLists": 1,
                    "path": "contentVector",
                    "k": num_results
                },
                "returnStoredSource": True }},
        {'$project': { 'similarityScore': { '$meta': 'searchScore' }, 'document' : '$$ROOT' } }
    ]
    results = collection.aggregate(pipeline)
    return results

## Perform vector search query

In [9]:
query = "Shoes for Seattle sweater weather"
results = vector_search(query, 3)

print("\nResults:\n")
for result in results: 
    print(f"Similarity Score: {result['similarityScore']}")  
    print(f"Title: {result['document']['name']}")  
    print(f"Price: {result['document']['price']}")  
    print(f"Material: {result['document']['material']}") 
    print(f"Image: {result['document']['img_url']}") 
    print(f"Purchase: {result['document']['purchase_url']}\n")


Results:

Similarity Score: 0.8336247653551396
Title: Nature Breeze Rainforest Women's Stylish Buckle Strap Winter Rain Boot Shoes
Price: 57.98
Material: None
Image: https://ignitedemotorage.z5.web.core.windows.net/images/37.png
Purchase: https://www.kohls.com/product/prd-4675970/columbia-womens-crestwood-mid-waterproof.jsp?skuid=37797153&ci_mcc=ci&utm_campaign=WOMENS%20CASUAL%20SHOES&utm_medium=CSE&utm_source=bing&CID=shopping20&utm_campaignid=470254723&utm_adgroupid=1238051084185804&gclid=4e558534c2561628ce7ee09ca4a8f995&gclsrc=3p.ds&msclkid=4e558534c2561628ce7ee09ca4a8f995

Similarity Score: 0.8329114913940489
Title: Columbia Women's Redmond Mid Waterproof Hiking Shoes
Price: 95
Material: None
Image: https://ignitedemotorage.z5.web.core.windows.net/images/249.png
Purchase: https://www.kohls.com/product/prd-4675970/columbia-womens-crestwood-mid-waterproof.jsp?skuid=37797153&ci_mcc=ci&utm_campaign=WOMENS%20CASUAL%20SHOES&utm_medium=CSE&utm_source=bing&CID=shopping20&utm_campaignid=47

# Q&A over the data with GPT-4

Next, we'll create a helper function to feed prompts into the `Completions` model. Then we'll create an interactive loop where you can pose questions to the model and receive information grounded in your data.

In [None]:
# This function grounds the model with system instructions and the vector search results.

def generate_completion(prompt):
    # `prompt` holds the vector search results; the question itself is read
    # from the global `user_input` set in the input loop below.
    system_prompt = '''
    You are an intelligent assistant for Microsoft Shoe store.
    You are designed to provide helpful answers to user questions about shoes given the information about to be provided.
        - Only answer questions related to the information provided below, provide 3 clear suggestions in a list format.
        - Write two lines of whitespace between each answer in the list.
        - If you're unsure of an answer, you can say "I don't know" or "I'm not sure" and recommend users search themselves.
    '''

    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]

    # Add each matching product as grounding context (the embedding itself is left out).
    for item in prompt:
        doc = item['document']
        messages.append({"role": "system", "content": f"{doc['name']} (${doc['price']}): {doc['description']}"})

    response = client.chat.completions.create(
        model="gpt-4",
        messages=messages
    )
    
    return response

In [None]:
# Create a loop of user input and model output. You can now perform Q&A over the sample data!

user_input = ""
print("*** Please ask your model questions about Shoes. Type 'end' to end the session.\n")
user_input = input("Prompt: ")
while user_input.lower() != "end":
    results_for_prompt = vector_search(user_input)
   # print(f"User Prompt: {user_input}")
    completions_results = generate_completion(results_for_prompt)
    print("\n")
    # The openai>=1.0 client returns an object, so use attribute access rather than dict keys
    print(completions_results.choices[0].message.content)
    user_input = input("Prompt: ")


# Generating Ad content with GPT-4 

Finally, we put it all together by creating an ad caption and an ad image via the `Completions` API and the `DALL-E 3` API, and combining them with the vector search results.

In [12]:
from openai import OpenAI

def generate_ad_title(ad_topic):
    system_prompt = '''
    You are Heelie, an intelligent assistant for generating witty and captivating taglines for online advertisements.
        - The ad campaign taglines that you generate are short and typically under 100 characters.
    '''

    user_prompt = f'''Generate a catchy, witty, and short sentence (less than 100 characters) 
                    for an advertisement for selling shoes for {ad_topic}'''
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

    response = client.chat.completions.create(
        model="gpt-4",
        messages=messages
    )
    
    return response.choices[0].message.content

def Generate_ad_image(ad_topic):
    daliClient = OpenAI(
        api_key=config['dali_api_key']
    )

    image_prompt = f'''
        Generate a photorealistic image of an ad campaign for selling {ad_topic}. 
        The image should be clean, with the item being sold in the foreground with an easily identifiable landmark of the city in the background.
        The image should also try to depict the weather of the location for the time of the year mentioned.
        The image should not have any generated text overlay.
    '''

    response = daliClient.images.generate(
        model="dall-e-3",
        prompt= image_prompt,
        size="1024x1024",
        quality="standard",
        n=1,
        )

    return response.data[0].url

def render_html_page(ad_topic):

    # Find the matching shoes from the inventory
    results = vector_search(ad_topic, 4)
    
    ad_header = generate_ad_title(ad_topic)
    #ad_image_url = Generate_ad_image(ad_topic)


    with open('./data/ad-start.html', 'r', encoding='utf-8') as html_file:
        html_content = html_file.read()

    html_content += f'''<header>
            <h2>{ad_header}</h2>
        </header>'''    

    # html_content += f'''
    #         <section class="ad">
    #         <img src="{ad_image_url}" alt="Base Ad Image" class="ad-image">
    #     </section>'''

    for result in results: 
        html_content += f''' 
        <section class="product">
            <img src="{result['document']['img_url']}" alt="{result['document']['name']}" class="product-image">
            <div class="product-details">
                <h3 class="product-title" color="gray">{result['document']['name']}</h3>
                <p class="product-price">{"$"+str(result['document']['price'])}</p>
                <p class="product-description">{result['document']['description']}</p>
                <a href="{result['document']['purchase_url']}" class="buy-now-button">Buy Now</a>
            </div>
        </section>
        '''

    html_content += '''</article>
                    </body>
                    </html>'''

    return html_content
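
Product names and descriptions come straight from the database and may contain characters such as `'`, `&`, or `<` that would break the generated markup. A sketch of a safer product card, using the standard library's `html.escape`, is shown below. `render_product_card` is a hypothetical helper that mirrors the markup in `render_html_page`; it is not part of the original notebook.

```python
from html import escape

def render_product_card(doc):
    """Build one product <section>, escaping every value taken from the database."""
    name = escape(str(doc["name"]))
    price = escape(str(doc["price"]))
    description = escape(str(doc["description"]))
    img_url = escape(str(doc["img_url"]))
    purchase_url = escape(str(doc["purchase_url"]))
    return f'''
    <section class="product">
        <img src="{img_url}" alt="{name}" class="product-image">
        <div class="product-details">
            <h3 class="product-title">{name}</h3>
            <p class="product-price">${price}</p>
            <p class="product-description">{description}</p>
            <a href="{purchase_url}" class="buy-now-button">Buy Now</a>
        </div>
    </section>
    '''
```

Inside the loop in `render_html_page`, this could be used as `html_content += render_product_card(result['document'])`.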

# Putting it all together

In [13]:
import gradio as gr

css = """
    button { background-color: purple; color: red; }
"""

with gr.Blocks(css=css, theme=gr.themes.Default(spacing_size=gr.themes.sizes.spacing_sm, radius_size="none")) as demo:
    subject = gr.Textbox(placeholder="Ad Keywords", label="Prompt for Heelie!!")
    btn = gr.Button("Generate Ad")
    output_html = gr.HTML(label="Generated Ad HTML")

    btn.click(render_html_page, [subject], output_html)

    btn = gr.Button("Copy HTML")

if __name__ == "__main__":
    demo.launch()   

Running on local URL:  http://127.0.0.1:7861

To create a public link, set `share=True` in `launch()`.
