# Azure OpenAI in Fabric

- Refer to: https://learn.microsoft.com/en-us/fabric/data-science/open-ai

## Prerequisites
- Check that the openai Python SDK and the other required packages are installed and available in the environment you are using.
- Install any other dependencies as needed.
- OpenAI and SynapseML are pre-installed in Fabric (preview).
- Create a folder named "unstructured_data" under "Files" in the pinned Lakehouse.
- Upload the "resume.txt" file provided with the lab assets into the "unstructured_data" folder.
- Ensure the Spark 1.2 environment is selected in the workspace settings.
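
The file prerequisites above can be checked from a notebook cell before running the rest of the lab. The sketch below assumes the pinned Lakehouse is mounted at `/lakehouse/default/Files` in the notebook session; the helper name `missing_lab_files` is hypothetical and introduced only for illustration.

```python
import os

# Assumption: the pinned (default) Lakehouse is mounted at this local path.
LAKEHOUSE_FILES = "/lakehouse/default/Files"


def missing_lab_files(base_dir=LAKEHOUSE_FILES,
                      relative_paths=("unstructured_data/resume.txt",)):
    """Return the relative paths that do not exist as files under base_dir."""
    return [
        path for path in relative_paths
        if not os.path.isfile(os.path.join(base_dir, path))
    ]


missing = missing_lab_files()
print("Missing files:", missing if missing else "none")
```

If the list is not empty, create the folder and upload the file before continuing.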

In [1]:
%pip list


Package                       Version
----------------------------- --------------------
absl-py                       2.0.0
adal                          1.2.7
adlfs                         2023.4.0
aiohttp                       3.8.6
aiosignal                     1.3.1
alembic                       1.12.0
ansi2html                     0.0.0
anyio                         3.7.1
appdirs                       1.4.4
argon2-cffi                   23.1.0
argon2-cffi-bindings          21.2.0
arrow                         1.3.0
astor                         0.8.1
asttokens                     2.4.0
astunparse                    1.6.3
async-timeout                 4.0.3
attrs                         23.1.0
autopage                      0.5.2
azure-core                    1.29.4
azure-datalake-store          0.0.51
azure-identity                1.14.1
azure-storage-blob            12.18.3
azure-storage-file-datalake   12.12.0
azure-synapse-ml-predict      1.0.0
azureml

In [1]:
%pip install openai==0.28


Collecting openai==0.28
  Downloading openai-0.28.0-py3-none-any.whl.metadata (13 kB)
Downloading openai-0.28.0-py3-none-any.whl (76 kB)
Installing collected packages: openai
Successfully installed openai-0.28.0
Note: you may need to restart the kernel to use updated packages.



## Use Case: Retrieve Structured Data from an Unstructured Document

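Ensure the openai module is listed in the %pip list output above. If it is not, install it with pip before running the cell below: %pip install openai==0.28. The version is pinned because the code in this notebook calls openai.ChatCompletion, which was removed in openai 1.0, so an unpinned %pip install --upgrade openai would break it.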
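
As an alternative to scanning the full package list, the installed version can be queried directly. This is a minimal sketch using the standard library; the helper names `installed_version` and `uses_legacy_openai_api` are hypothetical. It relies on the fact that `openai.ChatCompletion`, which the cells in this notebook call, exists only in openai releases before 1.0.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def uses_legacy_openai_api(version):
    """True for a pre-1.0 version string, which still provides openai.ChatCompletion."""
    return version is not None and int(version.split(".")[0]) < 1


version = installed_version("openai")
print("openai version:", version)
print("Legacy ChatCompletion interface available:", uses_legacy_openai_api(version))
```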

In [2]:
import openai
# initialize the deployment ID (the name of the model deployment to call)
deployment_id = "gpt-4o"


### Quick OpenAI Question Test

In [3]:
content = "Summarize the latest trends in automotive cockpit technologies and predict emerging customer demands in the North American market for the next 3 years."
# Optional system prompt: uncomment the next two lines (and drop the user-only messages line) to use it.
#system = "You are acting as a Panasonic North America Data Analyst. Your role is to: Analyze complex business datasets from manufacturing, supply chain, automotive, and energy sectors. Identify key trends, anomalies, and actionable insights. Summarize findings in a clear, concise, and business-driven manner. Recommend next steps or strategic actions based on the data analysis. Focus on: Improving operational efficiency, product quality, and customer satisfaction. Supporting data-driven decision making for executives and business units. Providing visualizations (tables, charts) where appropriate to enhance understanding. Your responses should be structured professionally, with clear headings such as Findings, Analysis, Recommendations, and Next Steps."
#messages = [{"role": "system", "content": system}, {"role": "user", "content": content}]
messages = [{"role": "user", "content": content}]

response = openai.ChatCompletion.create(
    engine=deployment_id,        # Name of the model deployment to call.
    messages=messages,           # Conversation sent to the chat model.
    temperature=0,               # Sampling temperature; 0 makes the output as deterministic as possible.
    max_tokens=3000,             # Maximum number of tokens to generate in the response.
    top_p=0.95,                  # Nucleus sampling: consider only tokens within the top 95% of probability mass.
    frequency_penalty=0,         # Penalty that grows with how often a token has already appeared; 0 disables it.
    presence_penalty=0,          # Penalty applied once a token has appeared at all; 0 disables it.
    stop=None                    # No stop sequence; generation ends naturally or at max_tokens.
)
print(response)  # Print the full API response


{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The latest trends in automotive cockpit technologies are centered around enhancing user experience, connectivity, and safety. Key trends include:\n\n1. **Advanced Driver Assistance Systems (ADAS)**: Integration of more sophisticated ADAS features such as adaptive cruise control, lane-keeping assist, and automated parking. These systems are becoming more intuitive and reliable, paving the way for higher levels of autonomous driving.\n\n2. **Digital Instrument Clusters and Head-Up Displays (HUDs)**: Traditional analog gauges are being replaced by fully digital instrument clusters that offer customizable displays. HUDs are also becoming more common, projecting critical information onto the windshield to keep drivers' eyes on the road.\n\n3. **Voice and Gesture Control**: Voice recognition systems are improving, allowing for more natural and accurate interactions. Gesture control 

In [5]:
# print result
print(response['choices'][0]['message']['content'])


The latest trends in automotive cockpit technologies are centered around enhancing user experience, connectivity, and safety. Key trends include:

1. **Advanced Driver Assistance Systems (ADAS)**: Integration of more sophisticated ADAS features such as adaptive cruise control, lane-keeping assist, and automated parking. These systems are becoming more intuitive and reliable, paving the way for higher levels of autonomous driving.

2. **Digital Instrument Clusters and Head-Up Displays (HUDs)**: Traditional analog gauges are being replaced by fully digital instrument clusters that offer customizable displays. HUDs are also becoming more common, projecting critical information onto the windshield to keep drivers' eyes on the road.

3. **Voice and Gesture Control**: Voice recognition systems are improving, allowing for more natural and accurate interactions. Gesture control is also emerging, enabling drivers to control various functions with simple hand movements.

4. **Augmented Reality (
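
The response printed earlier carries a `finish_reason` next to the message content. A value of `"stop"` means the model finished on its own, while `"length"` means the reply was cut off at `max_tokens`. The sketch below reads both fields from a dictionary shaped like that response; the helper name `extract_reply` and the sample dictionary are hypothetical and used only so the example runs without calling the API.

```python
def extract_reply(response):
    """Return (content, finish_reason) for the first choice of a chat response."""
    choice = response["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]


# Hypothetical response with the same shape as the output printed earlier.
sample = {
    "choices": [
        {
            "finish_reason": "length",
            "index": 0,
            "message": {"role": "assistant", "content": "Partial answer"},
        }
    ]
}

content, finish_reason = extract_reply(sample)
if finish_reason == "length":
    print("Warning: the reply was cut off at max_tokens")
print(content)
```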

### Transform Unstructured Data into Structured Data with OpenAI

In [9]:
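# define a helper that sends a chat completion request
def request_api(messages, deployment_id):
    """Send the messages to the given deployment and return the raw API response."""
    response = openai.ChatCompletion.create(
        engine=deployment_id,
        messages=messages,
        temperature=0,
        max_tokens=3000,
        top_p=0.95,
        frequency_penalty=0,
        presence_penalty=0,
        stop='###',  # stop generating if the model emits the '###' separator
    )
    return response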


In [6]:
# define a function that requests structured data for a document
def get_structured_data(document, prompt_postfix, deployment_id):
    """Insert the document into the prompt template and request a completion."""
    content = prompt_postfix.replace('<document>', document)
    messages = [{"role": "user", "content": content}]

    structured_data = request_api(messages, deployment_id)
    return structured_data
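
To inspect what this function sends without calling the API, the substitution step can be run on its own. This is a minimal sketch; the helper name `build_messages` and the demo inputs are hypothetical.

```python
def build_messages(document, prompt_template):
    """Substitute the document into the template and wrap it as one user message."""
    content = prompt_template.replace("<document>", document)
    return [{"role": "user", "content": content}]


# Hypothetical inputs, for illustration only.
demo_template = "<document>\n###\nList the skills mentioned above as JSON."
demo_messages = build_messages("Skills: Python, Spark", demo_template)
print(demo_messages[0]["content"])
```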


### Read Unstructured Data

In [7]:
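# read the sample resume from the folder created in the prerequisites
# (spark.read.text returns one row per line, in a column named "value")
df = spark.read.text("Files/unstructured_data/resume.txt")

# join the lines into a single string
document = ' '.join(row['value'] for row in df.collect())
display(document)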


"Contact chew.yean.yam@gmail.com www.linkedin.com/in/cyyam (LinkedIn) Top Skills Research Microarray Analysis OpenCV Languages English (Native or Bilingual) Malay (Professional Working) Certifications Worldwide Communities - Community SME 2018 Fred Kofman on Managing Conflict Chew-Yean Yam AI Leader & Practitioner | Data Science | Machine Learning | AI Strategy United Kingdom Summary Over 20 years of industrial and applied research experience with global advanced technology companies and internationally renowned research institutions: Microsoft, Intel, HP Labs, Agency for Science Technology & Research, British Aerospace Engineering, National Physical Laboratory (UK). I build and lead Data Science and Machine Learning specialist teams to design and develop AI applications with start-ups and established enterprises. I advise on their AI strategies, identify commercial opportunities and code with them to deployment. Experienced in incubation of new AI capabilities for products. Experience

### Extract Key Sections from a Resume
This step sets up the prompt for extracting key sections from a resume and then calls the function to get the structured data.
#### Define the Prompt with a Placeholder for the Document Content

In [10]:
# prompt template: <document> is replaced with the resume text
prompt_postfix = """<document>

###

Extract the key sections from the resume above into JSON.
"""
structured_data = get_structured_data(document, prompt_postfix, deployment_id)


In [11]:
# print result
print(structured_data['choices'][0]['message']['content'])
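
The reply is plain text, and in this run the model wrapped the JSON in a Markdown `json` code fence, as the output of this cell shows. To work with the result as a Python dictionary, the fence has to be removed before parsing. The sketch below is one way to do that; the helper name `parse_json_reply` and the sample reply are hypothetical, and other reply layouts may need different handling.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker


def parse_json_reply(text):
    """Parse a model reply as JSON, removing a surrounding code fence if present."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        lines = cleaned.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        cleaned = "\n".join(lines)
    return json.loads(cleaned)


# Hypothetical reply text shaped like the fenced output of this notebook.
sample_reply = FENCE + 'json\n{"top_skills": ["Research", "OpenCV"]}\n' + FENCE
parsed = parse_json_reply(sample_reply)
print(parsed["top_skills"])
```

In the notebook, the same helper could be applied to `structured_data['choices'][0]['message']['content']`. `json.loads` raises `json.JSONDecodeError` if the reply is not valid JSON, for example when it was truncated.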


```json
{
  "contact": {
    "email": "chew.yean.yam@gmail.com",
    "linkedin": "www.linkedin.com/in/cyyam"
  },
  "top_skills": [
    "Research",
    "Microarray Analysis",
    "OpenCV"
  ],
  "languages": {
    "English": "Native or Bilingual",
    "Malay": "Professional Working"
  },
  "certifications": [
    "Worldwide Communities - Community SME 2018",
    "Fred Kofman on Managing Conflict"
  ],
  "summary": "Over 20 years of industrial and applied research experience with global advanced technology companies and internationally renowned research institutions: Microsoft, Intel, HP Labs, Agency for Science Technology & Research, British Aerospace Engineering, National Physical Laboratory (UK). I build and lead Data Science and Machine Learning specialist teams to design and develop AI applications with start-ups and established enterprises. I advise on their AI strategies, identify commercial opportunities and code with them to deployment. Experienced in incubation of new AI capab