In [1]:
from azure.core.credentials import AzureKeyCredential
from dotenv import load_dotenv

import os

In [2]:
load_dotenv(override=True)

AZURE_SEARCH_SERVICE = os.getenv("AZURE_SEARCH_SERVICE")
AZURE_SEARCH_KEY = os.getenv("AZURE_SEARCH_KEY")
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_VERSION = os.getenv("AZURE_OPENAI_VERSION")
AZURE_OPENAI_KEY = os.getenv("AZURE_OPENAI_API_KEY")
AZURE_OPENAI_DEPLOYMENT = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME")
AZURE_STORAGE_CONNECTION = os.getenv("AZURE_STORAGE_CONNECTION")
AZURE_OPENAI_EMBEDDING_MODEL = os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME")

credential = AzureKeyCredential(AZURE_SEARCH_KEY)
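
An unset variable comes back from `os.getenv` as `None`, which tends to surface later as a less obvious error inside a client call. A minimal sketch of an up-front check follows; `require_env` is a hypothetical helper, not part of the original notebook or of any Azure SDK.

```python
import os


def require_env(*names):
    """Return {name: value} for the given variables; raise if any is unset or empty.

    Hypothetical helper for illustration only.
    """
    values = {name: os.getenv(name) for name in names}
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return values
```

Calling `require_env("AZURE_SEARCH_SERVICE", "AZURE_SEARCH_KEY")` right after `load_dotenv` would fail fast and name the missing setting.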

In [3]:
from azure.search.documents import SearchClient
from openai import AzureOpenAI

index_name = "py-rag-training-idx"

openai_client = AzureOpenAI(
    api_key=AZURE_OPENAI_KEY,
    api_version=AZURE_OPENAI_VERSION,
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
    azure_deployment=AZURE_OPENAI_DEPLOYMENT,
)

search_client = SearchClient(
    endpoint=AZURE_SEARCH_SERVICE,
    index_name=index_name,
    credential=credential,
)

GROUNDED_PROMPT="""
You are an AI assistant that helps users learn from the information found in the source material.
Answer the query using only the sources provided below.
Use bullets if the answer has multiple points.
If the answer is longer than 3 sentences, provide a summary.
Answer ONLY with the facts listed in the list of sources below. Cite your source when you answer the question.
If there isn't enough information below, say you don't know.
Do not generate answers that don't use the sources below.
Query: {query}
Sources:\n{sources}
"""

In [4]:
from azure.search.documents.models import VectorizableTextQuery

query = "What's the NASA earth book about?"
# Hybrid query: the same text drives keyword search and, via the index's vectorizer, vector search.
vector_query = VectorizableTextQuery(text=query, k_nearest_neighbors=50, fields="text_vector")

search_results = search_client.search(
    search_text=query,
    vector_queries=[vector_query],
    select=["title", "chunk", "locations"],
    top=5,
)

# Flatten the top results into one delimited string for the prompt.
sources_formatted = "=================\n".join(
    f'TITLE: {document["title"]}, CONTENT: {document["chunk"]}, LOCATIONS: {document["locations"]}'
    for document in search_results
)
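
Before sending sources to the model, it can help to see which chunks were actually retrieved. The sketch below assumes each result behaves like a dict and carries the `@search.score` key alongside the selected fields; `summarize_hits` is a hypothetical helper, not an SDK function.

```python
def summarize_hits(documents, preview_chars=80):
    """Return (title, score, preview) tuples for a list of result dicts.

    Hypothetical helper: missing fields come back as None or an empty preview.
    """
    rows = []
    for doc in documents:
        chunk = doc.get("chunk") or ""
        rows.append((doc.get("title"), doc.get("@search.score"), chunk[:preview_chars]))
    return rows


# In the notebook, materialize the results once so they can be both inspected
# and formatted; the paged result iterator may not be re-iterable after one pass:
# documents = list(search_results)
# for row in summarize_hits(documents):
#     print(row)
```

If the results are materialized this way, `sources_formatted` would need to be built from `documents` rather than from `search_results`.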

In [5]:
response = openai_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": GROUNDED_PROMPT.format(query=query, sources=sources_formatted)
        }
    ],
    model=AZURE_OPENAI_DEPLOYMENT,
)

print(response.choices[0].message.content)

- The NASA Earth book presents inspiring satellite images of Earth that tell a story of our 4.5-billion-year-old planet, highlighting the dynamic features of land, wind, water, ice, and air as seen from above (page-8.pdf).
- The images and their scientific context emphasize that our planet is as compelling as any fiction, aiming to inspire readers by showing the beauty and complexity of Earth from space (page-8.pdf).
- The book is a collaborative work involving writers, reporters, image designers, and scientists, all working to make scientific imagery and data accessible to the public (page-171.pdf).
- The book offers a satellite-based perspective of Earth, continuing NASA’s mission to help people explore and understand our planet (page-8.pdf).
- Summary: The NASA Earth book uses inspiring satellite imagery to celebrate Earth’s natural wonders, providing a scientific and artistic look at our home planet as seen from space, with the goal of making this perspective accessible and inspiri
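
The retrieval and generation steps in cells 4 and 5 can be wrapped in one function so that further questions do not repeat the same code. This is a sketch under assumptions: `answer_query` and its `make_vector_query` parameter are hypothetical names, and the clients are passed in rather than read from notebook globals.

```python
def answer_query(query, search_client, openai_client, deployment, prompt,
                 make_vector_query, top=5):
    """Retrieve the top chunks for `query`, then ask the chat model to answer from them.

    Hypothetical helper; `make_vector_query` is a callable that turns the query
    text into a vector query object for the search client.
    """
    results = search_client.search(
        search_text=query,
        vector_queries=[make_vector_query(query)],
        select=["title", "chunk", "locations"],
        top=top,
    )
    # Same source layout as the cells above: one delimited entry per chunk.
    sources = "=================\n".join(
        f'TITLE: {doc["title"]}, CONTENT: {doc["chunk"]}, LOCATIONS: {doc["locations"]}'
        for doc in results
    )
    response = openai_client.chat.completions.create(
        messages=[{"role": "user", "content": prompt.format(query=query, sources=sources)}],
        model=deployment,
    )
    return response.choices[0].message.content


# Possible use in this notebook, with the objects defined in earlier cells:
# print(answer_query(
#     "What's the NASA earth book about?",
#     search_client, openai_client, AZURE_OPENAI_DEPLOYMENT, GROUNDED_PROMPT,
#     lambda q: VectorizableTextQuery(text=q, k_nearest_neighbors=50, fields="text_vector"),
# ))
```

Passing the clients and the query factory as arguments keeps the function easy to exercise with stand-in objects, without calling the live services.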

In [6]:
query = "Are there any cloud formations specific to oceans and large bodies of water?"
vector_query = VectorizableTextQuery(text=query, k_nearest_neighbors=50, fields="text_vector")

search_results = search_client.search(
    search_text=query,
    vector_queries=[vector_query],
    select=["title", "chunk", "locations"],
    top=5,
)

sources_formatted = "=================\n".join(
    f'TITLE: {document["title"]}, CONTENT: {document["chunk"]}, LOCATIONS: {document["locations"]}'
    for document in search_results
)

response = openai_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": GROUNDED_PROMPT.format(query=query, sources=sources_formatted)
        }
    ],
    model=AZURE_OPENAI_DEPLOYMENT
)

print(response.choices[0].message.content)

- Marine stratocumulus clouds are a type of low-level cloud formation that is specific to oceans and large bodies of water. They are essentially fog and are a persistent feature off the coast of Peru and Chile, developing most often during the winter and early spring. Prevailing winds can sometimes push these marine stratocumulus clouds inland through valleys that open to the ocean, but they are often blocked by coastal mountains and hills. (page-15.pdf)
- Another type of cloud formation that can form over oceans is the undular bore or solitary wave, which is produced when air masses from land and the ocean collide, such as off the coast of Mauritania. This interaction creates a wave structure in the atmosphere, with parts favorable for cloud formation. (page-23.pdf)

Summary:
There are cloud formations specific to oceans and large bodies of water, including persistent marine stratocumulus clouds (essentially fog over the Pacific coast of Peru and Chile) and atmospheric waves like undu