# 🚀 Get started: validate the setup

This Jupyter notebook is intended for workshop and educational use only.

Prerequisites:

1. Set up your computing environment
2. Install the required libraries in your Python environment
3. Select the correct kernel (`azureml_py310_sdkv2`) for your Jupyter notebook
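
Before running the cells below, it can help to confirm that the libraries this notebook imports are actually installed in the selected kernel. The following is a minimal sketch and not part of the workshop code: `missing_modules` is a hypothetical helper, and the module list is simply taken from the import statements used later in this notebook.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. `azure`) is itself absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Modules imported by the cells in this notebook.
required = [
    "openai",
    "dotenv",
    "yaml",
    "azure.identity",
    "azure.ai.ml",
    "azure.ai.documentintelligence",
]
print("Missing modules:", missing_modules(required) or "none")
```

If the list is not empty, install the missing packages into the same environment that backs the selected kernel before continuing.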


## 1. Azure OpenAI Test

---


In [1]:
%load_ext autoreload
%autoreload 2

from common import check_kernel
check_kernel()

Kernel: python31014jvsc74a57bd01f90a0206bde5cf3732dab79adbbcc7570d5fab64b89fc69d46a8fe33664a709


In [2]:
import os
from openai import AzureOpenAI
from dotenv import load_dotenv, find_dotenv
envpath = find_dotenv()
load_dotenv(envpath)
print(envpath)
aoai_api_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
print(aoai_api_endpoint)
aoai_api_key = os.getenv("AZURE_OPENAI_API_KEY")
aoai_api_version = os.getenv("AZURE_OPENAI_API_VERSION")
aoai_deployment_name = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME")

if not aoai_api_version:
    aoai_api_version = os.getenv("OPENAI_API_VERSION")
if not aoai_deployment_name:
    aoai_deployment_name = os.getenv("DEPLOYMENT_NAME")

try:
    client = AzureOpenAI(
        azure_endpoint = aoai_api_endpoint,
        api_key        = aoai_api_key,
        api_version    = aoai_api_version
    )
    deployment_name = aoai_deployment_name
    print("=== Initialized AzureOpenAI client ===")
    print(f"AZURE_OPENAI_ENDPOINT={aoai_api_endpoint}")
    print(f"AZURE_OPENAI_API_VERSION={aoai_api_version}")
    print(f"AZURE_OPENAI_DEPLOYMENT_NAME={aoai_deployment_name}")
except (ValueError, TypeError) as e:
    print(e)

/mnt/d/BT/SRC/NLP/LLM/SFT/SLMWorkshopCN/.env
https://cog-pgwgybluulpec.openai.azure.com/
=== Initialized AzureOpenAI client ===
AZURE_OPENAI_ENDPOINT=https://cog-pgwgybluulpec.openai.azure.com/
AZURE_OPENAI_API_VERSION=2024-07-01-preview
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini
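
The cell above falls back from `AZURE_OPENAI_API_VERSION` to `OPENAI_API_VERSION`, and from `AZURE_OPENAI_DEPLOYMENT_NAME` to `DEPLOYMENT_NAME`, when the first variable is not set. One way to express that fallback, and to report settings that are still missing, is sketched below. `first_env` is a hypothetical helper introduced only for illustration; the variable names are the ones read in the cell above.

```python
import os


def first_env(*names, environ=None):
    """Return the first non-empty value among the given variable names, else None."""
    env = os.environ if environ is None else environ
    for name in names:
        value = env.get(name)
        if value:  # unset and empty-string values are treated alike, as with `if not ...`
            return value
    return None


aoai_api_version = first_env("AZURE_OPENAI_API_VERSION", "OPENAI_API_VERSION")
aoai_deployment_name = first_env("AZURE_OPENAI_DEPLOYMENT_NAME", "DEPLOYMENT_NAME")

# Report anything still unset so a misconfigured .env is noticed before the first API call.
settings = {
    "AZURE_OPENAI_ENDPOINT": os.getenv("AZURE_OPENAI_ENDPOINT"),
    "AZURE_OPENAI_API_KEY": os.getenv("AZURE_OPENAI_API_KEY"),
    "AZURE_OPENAI_API_VERSION": aoai_api_version,
    "AZURE_OPENAI_DEPLOYMENT_NAME": aoai_deployment_name,
}
unset = [name for name, value in settings.items() if not value]
print("Unset settings:", unset or "none")
```

The sketch prints only the names of unset variables, never their values, so the API key is not echoed into the notebook output.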


In [3]:
# Create your prompt
system_message = """
You are an AI assistant that helps customers find information. As an assistant, you respond to questions in a concise and unique manner.
You can use Markdown to answer simply and concisely, and add a personal touch with appropriate emojis.

Add a witty joke starting with "By the way," at the end of your response. Do not mention the customer's name in the joke part.
The joke should be related to the specific question asked.
For example, if the question is about tents, the joke should be specifically related to tents.

Use the given context to provide a more personalized response. Write each sentence on a new line:
"""
context = """
    The Alpine Explorer Tent features a detachable partition to ensure privacy, 
    numerous mesh windows and adjustable vents for ventilation, and a waterproof design. 
    It also includes a built-in gear loft for storing outdoor essentials. 
    In short, it offers a harmonious blend of privacy, comfort, and convenience, making it a second home in nature!
"""
question = "What are features of the Alpine Explorer Tent?"

user_message = f"""
Context: {context}
Question: {question}
"""

# Simple API Call
response = client.chat.completions.create(
    model=deployment_name,
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
    ],
    temperature=0.7,
    max_tokens=300
)

print(response.choices[0].message.content)

The Alpine Explorer Tent boasts several impressive features:

- **Detachable Partition**: Ensures privacy for occupants. 
- **Mesh Windows**: Allows for excellent ventilation while keeping bugs out. 
- **Adjustable Vents**: Helps to control airflow for added comfort. 
- **Waterproof Design**: Keeps you dry in wet conditions. 
- **Built-In Gear Loft**: Perfect for storing your outdoor essentials.

In short, it’s a cozy and convenient second home in nature! 🏕️

By the way, why did the tent break up with the backpack? 

It felt too much pressure! 😂
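
Because the call above sets `max_tokens=300`, a longer answer can be cut off. The sketch below shows one way to read the reply together with a truncation flag, based on the `finish_reason` field of the first choice. `extract_reply` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in with the same attribute shape as a real response, so the example runs without calling the service.

```python
from types import SimpleNamespace


def extract_reply(response):
    """Return (text, truncated) for the first choice of a chat completion response."""
    choice = response.choices[0]
    # finish_reason == "length" means generation stopped at the max_tokens limit.
    truncated = choice.finish_reason == "length"
    return choice.message.content, truncated


# Stand-in object for illustration only; in the notebook, pass the real `response`.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="length",
            message=SimpleNamespace(content="The Alpine Explorer Tent"),
        )
    ]
)
text, truncated = extract_reply(fake_response)
print(text, "(truncated)" if truncated else "")
```

In the notebook, `extract_reply(response)` could replace the direct `response.choices[0].message.content` access when it matters whether the answer was complete.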


## (Optional) 2. Azure Document Intelligence Test

---


In [4]:
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import ContentFormat

doc_intelligence_endpoint = os.getenv("AZURE_DOC_INTELLIGENCE_ENDPOINT")
doc_intelligence_key = os.getenv("AZURE_DOC_INTELLIGENCE_KEY")

try:
    document_intelligence_client = DocumentIntelligenceClient(
        endpoint=doc_intelligence_endpoint, 
        credential=AzureKeyCredential(doc_intelligence_key),
        headers={"x-ms-useragent":"sample-code-figure-understanding/1.0.0"},
    )
    print("=== Initialized DocumentIntelligenceClient ===")
    print(f"AZURE_DOC_INTELLIGENCE_ENDPOINT={doc_intelligence_endpoint}")    
except (ValueError, TypeError) as e:
    print(e)
    
raw_data_dir = "../1_synthetic-qa-generation/raw_data"
file_path = f"{raw_data_dir}/pdf/en-imagenet-training-wrote-by-daekeun.pdf"

=== Initialized DocumentIntelligenceClient ===
AZURE_DOC_INTELLIGENCE_ENDPOINT=https://cog-di-pgwgybluulpec.cognitiveservices.azure.com/
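
The next cell opens `file_path` directly, so a wrong relative path only surfaces as an error at that point. A small check such as the following can fail earlier with a clearer message. This is a sketch under assumptions: `require_file` is a hypothetical helper, and the temporary file stands in for the workshop PDF so the example runs anywhere.

```python
import tempfile
from pathlib import Path


def require_file(path):
    """Return `path` as a Path if it is an existing file, otherwise raise FileNotFoundError."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"Input document not found: {p.resolve()}")
    return p


# Demonstration with a temporary file; in the notebook, call require_file(file_path).
with tempfile.NamedTemporaryFile(suffix=".pdf") as tmp:
    print("Found:", require_file(tmp.name).name)
```

Resolving the path in the error message shows the absolute location that was tried, which helps when the notebook is started from a different working directory.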


In [5]:
with open(file_path, "rb") as f:
    poller = document_intelligence_client.begin_analyze_document(
        "prebuilt-layout", analyze_request=f, content_type="application/octet-stream", 
        output_content_format=ContentFormat.MARKDOWN 
    )

result = poller.result()
md_content = result.content
print(md_content)

<!-- PageHeader="24. 7. 22. 오전 9:52" -->
<!-- PageHeader="[Hands-on] Fast Training ImageNet on on-demand EC2 GPU instances with Horovod" -->


<figure>
</figure>


# [Hands-on] Fast Training ImageNet on on-demand EC2 GPU instances with Horovod

Author: Daekeun Kim (daekeun@amazon.com)


## Goal

This document is for people who need distributed GPU training using Horovod for
experimental purposes. Many steps are similar to what mentioned in Julien
Simon's article (https://medium.com/@julsimon/imagenet-part-1-going-on-an-
adventure-c0a62976dc72) and AWS

Documentation(https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-
horovod-tensorflow.html). So I recommend you to view these articles first. If there
are some things that aren't going well (e.g., Downloading the dataset does not
work, How to convert the raw data to the TFRecord feature set?, How to fix the
error ModuleNotFoundError: No module named 'cv2'? ) please refer this
document.


## Introduction

For data preparation and d
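
The Markdown returned above contains HTML comments such as `<!-- PageHeader="..." -->` that carry page metadata rather than document text. If only the body text is needed downstream, they can be filtered out. The sketch below is one possible approach: `strip_page_comments` is a hypothetical helper, and treating `PageFooter` and `PageNumber` comments the same way as `PageHeader` is an assumption, since only `PageHeader` appears in the output shown here.

```python
import re

# Matches comments such as <!-- PageHeader="..." -->, as seen in the output above.
# Assumption: PageFooter and PageNumber markers use the same comment shape.
_PAGE_COMMENT = re.compile(r'<!--\s*Page(?:Header|Footer|Number)=".*?"\s*-->\n?')


def strip_page_comments(markdown_text):
    """Remove page-metadata comments while leaving all other Markdown untouched."""
    return _PAGE_COMMENT.sub("", markdown_text)


sample = '<!-- PageHeader="24. 7. 22." -->\n# Title\nBody\n'
print(strip_page_comments(sample))
```

In the notebook this would be applied as `strip_page_comments(md_content)`.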

## 3. Azure ML Test

---


In [6]:
import os
import yaml
from datetime import datetime
snapshot_date = datetime.now().strftime("%Y-%m-%d")

with open('../2_slm-fine-tuning-mlstudio/phi3/config_prd.yml') as f:
    d = yaml.safe_load(f)  # safe_load is sufficient for a plain key/value config
    
AZURE_SUBSCRIPTION_ID = d['config']['AZURE_SUBSCRIPTION_ID']
AZURE_RESOURCE_GROUP = d['config']['AZURE_RESOURCE_GROUP']
AZURE_WORKSPACE = d['config']['AZURE_WORKSPACE']
AZURE_DATA_NAME = d['config']['AZURE_DATA_NAME']    
DATA_DIR = d['config']['DATA_DIR']
CLOUD_DIR = d['config']['CLOUD_DIR']
HF_MODEL_NAME_OR_PATH = d['config']['HF_MODEL_NAME_OR_PATH']
IS_DEBUG = d['config']['IS_DEBUG']
USE_LOWPRIORITY_VM = d['config']['USE_LOWPRIORITY_VM']


print(f"AZURE_SUBSCRIPTION_ID={AZURE_SUBSCRIPTION_ID}")
print(f"AZURE_RESOURCE_GROUP={AZURE_RESOURCE_GROUP}")
print(f"AZURE_WORKSPACE={AZURE_WORKSPACE}")
print(f"AZURE_DATA_NAME={AZURE_DATA_NAME}")
print(f"DATA_DIR={DATA_DIR}")
print(f"CLOUD_DIR={CLOUD_DIR}")
print(f"HF_MODEL_NAME_OR_PATH={HF_MODEL_NAME_OR_PATH}")
print(f"IS_DEBUG={IS_DEBUG}")
print(f"USE_LOWPRIORITY_VM={USE_LOWPRIORITY_VM}")

AZURE_SUBSCRIPTION_ID=49aee8bf-3f02-464f-a0ba-e3467e7d85e2
AZURE_RESOURCE_GROUP=rg-slmwrkshp_9
AZURE_WORKSPACE=mlw-pgwgybluulpec
AZURE_DATA_NAME=lgds-sftdemo241201
DATA_DIR=./dataset
CLOUD_DIR=./cloud
HF_MODEL_NAME_OR_PATH=microsoft/Phi-3.5-mini-instruct
IS_DEBUG=True
USE_LOWPRIORITY_VM=False
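
The cell above reads each setting with `d['config'][...]`, so a missing key stops the cell with a `KeyError` at the first gap. To list every missing key at once, a check along these lines may be useful. It is a sketch only: `missing_config_keys` is a hypothetical helper, the key names are those read in the cell above, and the sample dictionary stands in for the parsed YAML.

```python
REQUIRED_KEYS = [
    "AZURE_SUBSCRIPTION_ID",
    "AZURE_RESOURCE_GROUP",
    "AZURE_WORKSPACE",
    "AZURE_DATA_NAME",
    "DATA_DIR",
    "CLOUD_DIR",
    "HF_MODEL_NAME_OR_PATH",
    "IS_DEBUG",
    "USE_LOWPRIORITY_VM",
]


def missing_config_keys(doc, required=REQUIRED_KEYS):
    """Return the required keys that are absent from the `config` section of `doc`."""
    config = (doc or {}).get("config") or {}
    return [key for key in required if key not in config]


# Stand-in for the parsed YAML; in the notebook, pass `d` instead.
sample = {"config": {"AZURE_WORKSPACE": "mlw-example", "DATA_DIR": "./dataset"}}
print("Missing keys:", missing_config_keys(sample))
```

The check tests for key presence rather than truthiness, so legitimate values such as `IS_DEBUG: False` are not reported as missing.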


In [7]:
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential
from azure.ai.ml import MLClient
from azure.core.exceptions import HttpResponseError

credential = DefaultAzureCredential()
ml_client = MLClient(credential, AZURE_SUBSCRIPTION_ID, AZURE_RESOURCE_GROUP, AZURE_WORKSPACE)

# Alternative: authenticate with a service principal instead of DefaultAzureCredential.
# from azure.identity import ClientSecretCredential
# credential = ClientSecretCredential(
#     client_id=client_id,
#     client_secret=client_secret,
#     tenant_id=tenant_id
# )

try:
    workspace = ml_client.workspaces.get(name=AZURE_WORKSPACE)
    print(f"Connected to Azure ML Workspace: {workspace.name}")
    print(f"Workspace Location: {workspace.location}")
    print(f"Workspace ID: {workspace.id}")
except HttpResponseError as e:
    print(f"Failed to connect to Azure ML Workspace: {e}")

Connected to Azure ML Workspace: mlw-pgwgybluulpec
Workspace Location: eastus
Workspace ID: /subscriptions/49aee8bf-3f02-464f-a0ba-e3467e7d85e2/resourceGroups/rg-slmwrkshp_9/providers/Microsoft.MachineLearningServices/workspaces/mlw-pgwgybluulpec
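
The workspace ID printed above encodes the subscription, resource group, and workspace name, so it can be compared against the values loaded from the config file. The sketch below splits such an ID into its parts. `parse_workspace_id` is a hypothetical helper; it assumes the ID follows the alternating key/value segment layout seen in the output above.

```python
def parse_workspace_id(resource_id):
    """Split a workspace resource ID into subscription, resource group, and workspace name."""
    parts = resource_id.strip("/").split("/")
    # Segments alternate between a key and its value:
    # subscriptions/<id>/resourceGroups/<rg>/providers/<namespace>/workspaces/<name>
    fields = dict(zip(parts[0::2], parts[1::2]))
    return {
        "subscription_id": fields.get("subscriptions"),
        "resource_group": fields.get("resourceGroups"),
        "workspace": fields.get("workspaces"),
    }


workspace_id = (
    "/subscriptions/49aee8bf-3f02-464f-a0ba-e3467e7d85e2"
    "/resourceGroups/rg-slmwrkshp_9"
    "/providers/Microsoft.MachineLearningServices"
    "/workspaces/mlw-pgwgybluulpec"
)
print(parse_workspace_id(workspace_id))
```

In the notebook, the result of `parse_workspace_id(workspace.id)` could be compared with `AZURE_SUBSCRIPTION_ID`, `AZURE_RESOURCE_GROUP`, and `AZURE_WORKSPACE` to confirm that the client is pointed at the intended workspace.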
