<div class="alert alert-block alert-success">
    
    
### <center> Intelligent SQL Assistant</center>
## <center> For SQL Code Generation </center>    
### <center> Utilizing OpenAI </center>



**Author**: Atef Bader, PhD
<br>
**Last Edit**: 2/12/2024
<br>
**Revised By**: Edward Arroyo, PhD
<br>
<br>
    
    
</div>

![image-2.png](attachment:image-2.png)


<div class="alert alert-block alert-danger">
    
## Deliverables:

You are required to submit **three** files with the naming convention <font color = 'red'> <b>LastName_Assignment_5</b> </font>:  

1. **IPYNB Script**: This is your original notebook file ( <font color = 'red'> <b>LastName_Assignment_5.ipynb</b> </font>).  

2. **HTML Document**: This file must include all the source code you have written along with its output. Follow these steps to generate the HTML file: 

   - After completing your work in the Jupyter Notebook, go to the menu bar at the top. 

   - Click on File, then hover over Download as. 

   - Select HTML (.html) 

   - Save the file with the appropriate naming convention ( <font color = 'red'> <b>LastName_Assignment_5.html</b> </font>). 
   - **Ensure all code and outputs are properly displayed in the HTML document.**

3. **MP4 Video Recording:** A live demo recording lasting between 5 and 10 minutes.

**Note**: You are required to provide your code and its output immediately following each requirement for this assignment.
    
</div>



<div class="alert alert-block alert-warning">

    
  

## Learning Objectives:

- Develop an **Intelligent SQL Assistant** that will:
    - Use private, personal, and real-time data (including daily tasks, calendar, diet, exercise, personal notes/documents, sports scores, etc.).
    - Be domain/goal specific (applicable to retailers, financial applications, insurance applications, etc.).
    - Make function calls to the LLM/GPT API (OpenAI) to integrate GPT's capabilities with external tools and APIs.
    - Convert natural language into API calls or database queries.
    - Answer questions posed in natural or programming languages.
    - Employ OpenAI to generate, extract, and transform text, whether natural or code-based.
    - Execute personal/business process workflows using real-time and geospatial data, for example, booking flights, ordering food, etc.

- Our **Intelligent Virtual Assistant** will differ from Apple Siri, Amazon Alexa, Google Assistant, or general internet chatbots by generating code and interacting with external tools to complete specific **tasks**.

</div>

<hr style="border:5px solid orange"> </hr>

</div>



<div class="alert alert-block alert-success">

## Learning Outcomes:

- Create an **Intelligent Virtual Assistant** capable of:
    - Using the OpenAI API to generate SQL code.
    - Employing prompt chaining for conversational AI interactions with a database server.
    - Transforming natural language queries into SQL commands.
    - Executing generated SQL queries in a chained conversation with a Database Server, such as PostgreSQL.

</div>

<hr style="border:5px solid orange"> </hr>

</div>

<div class="alert alert-info">

## Why Build/Use OpenAI Plugins, function_call, and Functions?

- The AI model acts as an intelligent API caller. It uses API specifications and a natural-language description of when to use the API to proactively perform actions. For instance, in response to a query like "Where should I stay in Paris for a couple of nights?", the model might call a hotel reservation plugin API, process the API response, and generate a comprehensive answer by integrating the API data with its natural language processing capabilities.

- ChatGPT relies on historical data, which necessitates the use of plugins/function_calls for scenarios involving:
    1. Real-time public data, such as sports scores, stock prices, and the latest news.
    2. Both real-time and non-real-time personal/private data, including company documents and personal notes.
    3. Personal/business process workflows, such as booking flights or ordering food.

</div>
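The function-calling flow described above can be sketched without any network access: the model replies with a function name plus JSON-encoded arguments, and the application dispatches that request to local code. This is a minimal sketch, not OpenAI's implementation; `find_hotels`, its return value, and `sample_call` are hypothetical stand-ins for a real plugin and a real model reply.

```python
import json

def find_hotels(city, nights):
    """Hypothetical stand-in for a hotel-reservation API call."""
    return {"city": city, "nights": nights, "hotels": ["Example Hotel A", "Example Hotel B"]}

# Registry mapping the function names advertised to the model to Python callables.
AVAILABLE_FUNCTIONS = {"find_hotels": find_hotels}

def dispatch_function_call(function_call):
    """Run the function the model asked for and return its result as a JSON string."""
    name = function_call["name"]
    if name not in AVAILABLE_FUNCTIONS:
        return json.dumps({"error": f"function {name} does not exist"})
    arguments = json.loads(function_call["arguments"])  # arguments arrive as a JSON string
    return json.dumps(AVAILABLE_FUNCTIONS[name](**arguments))

# Shape of a function_call inside an assistant message (sample data, not a real reply).
sample_call = {"name": "find_hotels", "arguments": '{"city": "Paris", "nights": 2}'}
print(dispatch_function_call(sample_call))
```

The JSON string returned by the dispatcher is what would be sent back to the model so that it can phrase the final answer in natural language.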

<div class="alert alert-info">

## Prompt Engineering to Improve the Performance of LLMs

- This process entails refining the prompt to more effectively guide the model, which can lead to improved outcomes, particularly for complex tasks.

- Consider exploring various techniques, such as:
    - Defining the role, task, and context.
    - Using the phrase "Let’s think step by step" to direct the model towards incremental reasoning.
    - Employing few-shot learning.

- To constrain the context utilized by the OpenAI System, always start your prompt with the instruction:
    - "Do not assume, use only YOUR_KEYWORD."

- Use delimiters to clearly separate the different sections of the input. Common delimiters include triple backticks, triple quotes (`"""`), angle brackets (`< >`), XML-style tags (`<tag> </tag>`), and colons (`:`).

- Specify your preferred output format:
    - HTML
    - JSON

</div>
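Few-shot learning, listed among the techniques above, amounts to placing a handful of worked examples in the conversation before the real question. The sketch below builds such a `messages` list for a chat-style API; `build_few_shot_messages` and the example pairs are illustrative assumptions, and nothing is sent to the API.

```python
def build_few_shot_messages(instruction, examples, question):
    """Return a chat `messages` list: system instruction, worked examples, then the real question."""
    messages = [{"role": "system", "content": instruction}]
    for user_text, assistant_text in examples:
        # Each example is presented as a past user/assistant exchange.
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": question})
    return messages

# Hypothetical example pairs showing the desired question-to-SQL style.
examples = [
    ("List all customers.", "SELECT * FROM customer;"),
    ("How many invoices are there?", "SELECT COUNT(*) FROM invoice;"),
]
messages = build_few_shot_messages(
    "Do not assume, use only SQL. Reply with a single SQL statement.",
    examples,
    "List all vendors.",
)
for m in messages:
    print(m["role"], "->", m["content"])
```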


<div class="alert alert-info">
    
### References:

Several prompt examples, utility/helper functions, and code snippets in this script have been adapted from the following sources:
- [OpenAI Platform Overview](https://platform.openai.com/docs/introduction/overview)
- [OpenAI Playground](https://platform.openai.com/playground)
- [OpenAI API Reference](https://platform.openai.com/docs/api-reference)
- [Rate Limits Overview](https://platform.openai.com/docs/guides/rate-limits/overview)
- [ChatGPT Prompt Engineering Course](https://learn.deeplearning.ai/chatgpt-prompt-eng/lesson/2/guidelines)
- [OpenAI Cookbook on GitHub](https://github.com/openai/openai-cookbook)
- Caelen, O. & Blete, M. (2023). *Developing Apps with GPT-4 and ChatGPT*. Sebastopol, CA: O’Reilly. ISBN-13: 978-1-098-15248-2

</div>
  

<div class="alert alert-block alert-warning">

### Create Your API Key at the Following URL:
[https://platform.openai.com/account/api-keys](https://platform.openai.com/account/api-keys)

</div>

<div class="alert alert-block alert-warning">
    
### Adding Your OpenAI API Key to System Environment Variables

For detailed instructions, visit: [https://www.immersivelimit.com/tutorials/adding-your-openai-api-key-to-system-environment-variables](https://www.immersivelimit.com/tutorials/adding-your-openai-api-key-to-system-environment-variables)
    
</div>
  

<div class="alert alert-block alert-warning">

### Saving your key using Environment Variables (Windows)


1. **Open System Properties**:
   - Right-click on `This PC` or `My Computer` on your desktop or in File Explorer, then click `Properties`.
   - Click `Advanced system settings` on the left sidebar.
   - In the System Properties window, go to the `Advanced` tab and click `Environment Variables`.

2. **Add New Environment Variable**:
   - Under the `User variables` section, click `New`.
   - For the Variable name, enter `OPENAI_API_KEY`.
   - For the Variable value, enter your actual OpenAI API key.

3. **Confirm and Apply**:
   - Click `OK` to close each dialog box.

    
    
    

    
#### Usage:
You will read this environment variable when creating the OpenAI client, like so:
- `client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))`
    
</div>
    


<div class="alert alert-block alert-warning">
    
![EV_new.png](attachment:EV_new.png)
    
</div>

<div class="alert alert-block alert-warning">
    
![EV_new1.png](attachment:EV_new1.png)
    
</div>

<div class="alert alert-block alert-warning">
    
![EV_new2.png](attachment:EV_new2.png)
    
</div>

<div class="alert alert-block alert-warning">
    
![EV_new3.png](attachment:EV_new3.png)
    
</div>

<div class="alert alert-block alert-warning">

### Save Your Key by Modifying the .zshrc File (Mac)

To configure your environment with the OpenAI API key, follow these instructions:

1. Navigate to your home directory and locate one of the following files: `.bashrc`, `.bash_profile`, or `.zshrc`. The file you select depends on the shell you are using: `.bashrc` or `.bash_profile` for Bash, and `.zshrc` for Zsh (the default shell on recent versions of macOS).

2. Use a text editor of your choice to open the file you selected.
   
3. Scroll to the end of the file and add the following line:
    
   ```
   export OPENAI_API_KEY='your secret key'
   ```
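   Replace `'your secret key'` with your actual OpenAI API key.

4. Save the file, then open a new terminal (or `source` the file you edited) and restart Jupyter so that the variable is visible to your notebook.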
    
![OPEN_AI_API.png](attachment:OPEN_AI_API.png)
    
#### Usage:
You will read this environment variable when creating the OpenAI client, like so:
- `client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))`
    
</div>
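Whichever operating system you use, it helps to confirm that the variable is actually visible to Python before calling the API. The following is a minimal sketch; `load_api_key` and `mask` are hypothetical helper names, and the key shown is a fake value used only for the demonstration.

```python
import os

def load_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the key stored in `name`, or raise a clear error if it is missing or empty."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Restart the terminal and Jupyter after editing your environment."
        )
    return key

def mask(key):
    """Show only the last four characters so the key never appears in notebook output."""
    return "*" * max(len(key) - 4, 0) + key[-4:]

# Demonstration with a fake value; in the notebook you would call load_api_key() with no arguments.
print(mask(load_api_key({"OPENAI_API_KEY": "sk-example-1234"})))
```

Printing only a masked key matters here because the HTML export of the notebook includes every cell output.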

<div class="alert alert-block alert-warning">

### Installing OpenAI and Other Required Packages

- Execute the command in the following cell *once* to install the necessary packages. After installation, it's recommended to comment out the code in the cell to prevent re-execution.

</div>

In [1]:
import sys
!{sys.executable} -m pip install openai 
!{sys.executable} -m pip install tenacity
!{sys.executable} -m pip install termcolor
!{sys.executable} -m pip install requests




In [2]:
from openai import OpenAI

In [3]:
import os
#import openai

import json
import requests
from tenacity import retry, wait_random_exponential, stop_after_attempt
from termcolor import colored

# See https://platform.openai.com/docs/models/gpt-3-5-turbo
GPT_MODEL = "gpt-3.5-turbo-1106"



<div class="alert alert-info">
    
### OpenAI Documentation URL:
- [OpenAI API Reference Introduction](https://platform.openai.com/docs/api-reference/introduction)
    
### List of OpenAI Models:   
- [OpenAI Models Overview](https://platform.openai.com/docs/models/overview)

### Model in Use:
- `gpt-3.5-turbo-1106` (the value of `GPT_MODEL` defined above)

</div>


In [4]:

client = OpenAI(
    # Alternatively, hard code the API key like this:
    # api_key = "this is my api key"
    api_key=os.environ.get("OPENAI_API_KEY"), 
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        }
    ],
    model = GPT_MODEL,
)

In [5]:
# Helper Function

def get_completion(prompt, model=GPT_MODEL):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(        
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message.content



<div class="alert alert-info">

### Different Types of Output Formats:
1. List
2. Old-Style Table/Text
3. HTML
4. JSON

</div>


In [6]:
text = """
what is the list of best \
electrical vehicles with mileage range and cost.
"""
prompt = f"""
Create a list \
```{text}```
"""

In [7]:
message = get_completion(prompt)

print(message)

1. Tesla Model S - Mileage Range: 370 miles, Cost: $79,990
2. Chevrolet Bolt EV - Mileage Range: 259 miles, Cost: $36,620
3. Nissan Leaf - Mileage Range: 226 miles, Cost: $31,600
4. Audi e-tron - Mileage Range: 204 miles, Cost: $65,900
5. Hyundai Kona Electric - Mileage Range: 258 miles, Cost: $37,190


In [8]:
text = """
what is the list of best \
electrical vehicles with mileage range and cost.
"""

In [9]:

prompt = f"""
Create a table \
```{text}```
"""

In [10]:
response = get_completion(prompt)
print(response)

| Vehicle Model | Mileage Range (miles) | Cost (USD) |
|---------------|-----------------------|------------|
| Tesla Model S | 370                   | $79,990    |
| Chevrolet Bolt | 259                   | $36,620    |
| Nissan Leaf | 226                   | $31,600    |
| BMW i3 | 153                   | $44,450    |
| Hyundai Kona Electric | 258                   | $37,190    |


In [11]:
from IPython.display import display, HTML

<div class="alert alert-info">
Create HTML Output
</div>

In [12]:

prompt = f"""
Create HTML table \
```{text}```
"""

In [13]:
response = get_completion(prompt)
display(HTML(response))

| Vehicle | Mileage Range (miles) | Cost ($) |
|---------|-----------------------|----------|
| Tesla Model S | 375 | 79990 |
| Nissan Leaf | 226 | 31600 |
| Chevrolet Bolt EV | 259 | 36620 |
| Tesla Model 3 | 250 | 39990 |
| Audi e-tron | 204 | 65900 |


<div class="alert alert-info">
Create JSON Output
</div>

In [14]:
prompt = f"""
Create JSON \
```{text}```
"""

In [15]:
response = get_completion(prompt)
print(response)

{
  "best_electrical_vehicles": [
    {
      "name": "Tesla Model S",
      "mileage_range": "402 miles",
      "cost": "$79,990"
    },
    {
      "name": "Chevrolet Bolt EV",
      "mileage_range": "259 miles",
      "cost": "$36,620"
    },
    {
      "name": "Nissan Leaf",
      "mileage_range": "226 miles",
      "cost": "$31,600"
    },
    {
      "name": "Audi e-tron",
      "mileage_range": "222 miles",
      "cost": "$65,900"
    },
    {
      "name": "Kia Niro EV",
      "mileage_range": "239 miles",
      "cost": "$39,090"
    }
  ]
}


<div class="alert alert-info">

**Prompt Mini Template:**
1. Do not assume ...
2. Task ...
3. Output Format ...

</div>

In [16]:
prompt = f"""

Do not assume, use only JSON format

Your task :  
    what is the list of best electrical vehicles with mileage range and cost.


Use the following format:
    Output JSON: summary and Manufacturer, Model, mileage_range, cost


"""

response = get_completion(prompt)

print(response)

{
  "summary": "List of best electrical vehicles with mileage range and cost",
  "vehicles": [
    {
      "Manufacturer": "Tesla",
      "Model": "Model S",
      "mileage_range": "402 miles",
      "cost": "$79,990"
    },
    {
      "Manufacturer": "Chevrolet",
      "Model": "Bolt EV",
      "mileage_range": "259 miles",
      "cost": "$36,620"
    },
    {
      "Manufacturer": "Nissan",
      "Model": "Leaf",
      "mileage_range": "226 miles",
      "cost": "$31,620"
    },
    {
      "Manufacturer": "Audi",
      "Model": "e-tron",
      "mileage_range": "222 miles",
      "cost": "$65,900"
    },
    {
      "Manufacturer": "Ford",
      "Model": "Mustang Mach-E",
      "mileage_range": "305 miles",
      "cost": "$42,895"
    }
  ]
}
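A reply like the one above is still a plain string, so it has to be parsed before the program can use it. The sketch below assumes that the model may or may not wrap its JSON in a Markdown code fence; `parse_json_reply` is a hypothetical helper and `sample_reply` is made-up data, not an actual model response.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker

def parse_json_reply(reply):
    """Parse a model reply as JSON, tolerating an optional Markdown code fence around it."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with or without a language tag) ...
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # ... and everything from the closing fence onward.
        text = text.rsplit(FENCE, 1)[0]
    return json.loads(text)

# Made-up reply wrapped in a code fence, for demonstration only.
sample_reply = (
    FENCE + "json\n"
    + '{"summary": "demo", "vehicles": [{"Manufacturer": "Tesla", "Model": "Model S"}]}\n'
    + FENCE
)
data = parse_json_reply(sample_reply)
for vehicle in data["vehicles"]:
    print(vehicle["Manufacturer"], vehicle["Model"])
```

If the reply is not valid JSON, `json.loads` raises `json.JSONDecodeError`, which is a useful signal to re-prompt the model with a stricter output-format instruction.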


<div class="alert alert-block alert-danger">
    
### Be Aware of OpenAI's Current Call Limit:
<br>
    
- RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-XYZ123... on requests per minute.
    - Limit: 3 / min. Please try again in 20s.

</div>
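One common way to cope with a rate limit is to wait and retry with exponentially growing delays. The sketch below illustrates the idea using only the standard library; `RateLimited` is a hypothetical exception standing in for the API's rate-limit error, and `flaky_call` simulates a request that fails twice before succeeding. The notebook's own `chat_completion_request` utility relies on the `tenacity` package for the same purpose.

```python
import random
import time

class RateLimited(Exception):
    """Hypothetical exception standing in for the API's rate-limit error."""

def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); on RateLimited, wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))

# Demonstration with a fake call that fails twice, then succeeds.
attempts = {"count": 0}

def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimited("Rate limit reached")
    return "ok"

waits = []
result = call_with_backoff(flaky_call, sleep=waits.append)  # record waits instead of sleeping
print(result, [round(w, 1) for w in waits])
```

The random jitter spreads out retries so that several clients hitting the limit at once do not all retry at the same instant.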
            

<div class="alert alert-block alert-warning">
    
## Prompt Engineering
- An **Effective Prompt** will clearly state the following elements within the prompt body:
    1. Role
    2. Context
    3. Task
    4. Output Format
</div>
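The four elements listed above can be assembled programmatically so that every prompt in a project follows the same layout. This is a minimal sketch; `build_prompt` is a hypothetical helper, and the labels it emits are one possible convention rather than a format required by the API.

```python
def build_prompt(role, context, task, output_format):
    """Assemble the four elements into one labeled prompt string, skipping empty ones."""
    sections = [
        ("Role", role),
        ("Context", context),
        ("Task", task),
        ("Output Format", output_format),
    ]
    return "\n".join(
        f"{label}: {text.strip()}" for label, text in sections if text and text.strip()
    )

prompt = build_prompt(
    role="You are a salesperson for electric vehicles.",
    context="I want to buy an electric vehicle.",
    task="Suggest a list of vehicles to consider.",
    output_format="JSON with the keys Manufacturer, Model, mileage_range, cost.",
)
print(prompt)
```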


<div class="alert alert-info">
    
## Add Context to Your Question

</div>


In [17]:
prompt = """
Context: I want to buy an electric vehicle. 
Task: Give me a suggestion for a list of vehicles to consider."""



In [18]:
response = get_completion(prompt)

print(response)

Sure! Here are some popular electric vehicles to consider:

1. Tesla Model 3
2. Nissan Leaf
3. Chevrolet Bolt EV
4. BMW i3
5. Hyundai Kona Electric
6. Kia Niro EV
7. Audi e-tron
8. Ford Mustang Mach-E
9. Volkswagen ID.4
10. Porsche Taycan

These are just a few options to consider, and there are many other electric vehicles available on the market. It's important to research and test drive different models to find the best fit for your needs and preferences.


<div class="alert alert-info">
    
## Instruct the Model to Ask Questions About the Task Context

</div>

In [19]:

prompt = """
Context: I want to buy an electric vehicle. 
Task: Give me a suggestion for a list of vehicles to consider.
Do not perform the requested task! Instead, can you ask me questions about the context so that when I answer, you can perform the requested task more efficiently?
"""



In [20]:
response = get_completion(prompt)

print(response)

Sure! Here are some questions to help me understand your needs and preferences when it comes to buying an electric vehicle:

1. What is your budget for purchasing an electric vehicle?
2. What is your primary reason for wanting to buy an electric vehicle (e.g. environmental concerns, cost savings on fuel, etc.)?
3. How far do you typically drive in a day or week?
4. Do you have access to charging stations at home or work?
5. Are there any specific features or specifications you are looking for in an electric vehicle (e.g. range, size, brand preference, etc.)?
6. Do you have any specific models or brands in mind that you are interested in?
7. Are there any incentives or rebates available in your area for purchasing an electric vehicle?
8. Do you have any concerns or reservations about buying an electric vehicle that you would like to address?


<div class="alert alert-info">
    
## Add Details to the Task Before Asking the Model

</div>
 

In [21]:

prompt = """
Context: I want to buy an electric vehicle. 
Task: Give me a suggestion for a list of vehicles to consider. I drive to work more than 250 miles every day, and I have a budget of $40,000 for the vehicle to buy.

"""

 

In [22]:
response = get_completion(prompt)

print(response)

1. Tesla Model 3 Long Range - With a range of over 300 miles, the Tesla Model 3 Long Range is a great option for your daily commute. It also falls within your budget at around $40,000.

2. Chevrolet Bolt EV - The Chevrolet Bolt EV offers a range of over 250 miles and is priced within your budget. It's a practical and affordable option for your daily driving needs.

3. Nissan Leaf Plus - The Nissan Leaf Plus has a range of over 200 miles and is a more budget-friendly option compared to the Tesla Model 3. It's a reliable choice for your daily commute.

4. Hyundai Kona Electric - The Hyundai Kona Electric offers a range of over 250 miles and is priced competitively. It's a versatile and efficient option for your daily driving requirements.

5. Kia Niro EV - The Kia Niro EV has a range of over 250 miles and is within your budget. It's a spacious and practical choice for your daily commute.


<div class="alert alert-info">
    
## OpenAI Examples Can Be Found Here:
- https://platform.openai.com/examples
<br>
<br>

![image.png](attachment:image.png)

</div>

<div class="alert alert-info">
    
# Examples

</div>

<div class="alert alert-info">
    
### Grammar correction
- Corrects sentences to standard English.

</div>

In [23]:

response = get_completion("Correct this to standard English: She no went to the market.")

print(response)

She did not go to the market.


<div class="alert alert-info">
    
## Python to Natural Language

- Explain a piece of Python code in a language people can understand.

</div>

In [24]:
response = get_completion("""
# Python 3 
def hello(x): 
    print('hello '+str(x)) 
# Explanation of what the code does""")

print(response)

The code defines a function called hello that takes a parameter x. Inside the function, it prints the string "hello " followed by the value of x. This function can be called with a specific value for x to print a personalized greeting.


<div class="alert alert-info">
    
## SQL Request

- Simple SQL query building.

</div>

In [25]:
response = get_completion("Create a SQL request to find all users who live in California and have over 1000 credits")

print(response)

SELECT * 
FROM users 
WHERE state = 'California' 
AND credits > 1000;


<div class="alert alert-info">
    
## The Role Instructs the Model to Respond in the Manner That the Role Defines

</div>

In [26]:

prompt = """
Role: You are a salesperson for electric vehicles.
Context: I want to buy an electric vehicle. 
Task: Give me a suggestion for a list of vehicles to consider. I drive to work more than 250 miles every day, and I have a budget of $40,000 for the vehicle to buy.

"""


In [27]:
response = get_completion(prompt)

print(response)

Based on your daily commute of more than 250 miles, I would recommend considering electric vehicles with longer range capabilities. Some options to consider within your budget of $40,000 are the Tesla Model 3, Chevrolet Bolt EV, and the Nissan Leaf Plus. These vehicles offer ranges of over 200 miles on a single charge, making them suitable for your daily driving needs. Additionally, they are known for their reliability and performance, making them great options for your electric vehicle purchase.


<div class="alert alert-info">
    
### Transformers in Large Language Models (LLMs) vs. Arithmetic Logic Units (ALUs) and Floating-Point Units (FPUs)

- ALUs and FPUs are hardware components designed to perform exact mathematical calculations. The transformer architecture of an LLM contains no equivalent component: numbers are tokenized and processed the same way as words, so the model predicts a plausible-looking result rather than computing one. As the following cells show, it answers small sums correctly but can fail on larger operands.
    
</div>


In [28]:
prompt = "How much is 2+2?"

response = get_completion(prompt)

print(response)

2+2 equals 4.


In [29]:
prompt = "How much is 4*5?"

response = get_completion(prompt)

print(response)

4*5 equals 20.


<div class="alert alert-info">
    
### The Answer Provided by ChatGPT for the Calculation of (123 * 456789) is Incorrect
- The correct answer is 56,185,047.

</div>


In [30]:
prompt = "How much is 123 * 456789?"

response = get_completion(prompt)

print(response)

The product of 123 and 456789 is 56,296,347.


<div class="alert alert-info">
    
## Zero-shot Chain of Thought (CoT) Strategy
- Empirical evidence shows that adding "Let's think step by step" to the prompt can help the model tackle more complex reasoning challenges. This approach, known as the Zero-shot CoT strategy, was first detailed in the scientific paper "Large Language Models are Zero-Shot Reasoners" by Takeshi Kojima et al., published in 2022.
- Incorporating this sentence encourages the model to break the problem into smaller, more manageable sub-problems. This can let the model solve, in a single attempt, problems it would otherwise get wrong, although it does not guarantee a correct answer.
- For more information on Chain of Thought prompting, visit: [Chain of Thought Prompting](https://learnprompting.org/docs/intermediate/chain_of_thought).

</div>


In [31]:
prompt = "How much is 123 * 456789? Let's think step by step."

response = get_completion(prompt)

print(response)


First, let's multiply 123 by 9:

123 * 9 = 1107

Next, let's multiply 123 by 80 and add a zero at the end:

123 * 80 = 9840

Now, let's add these two results together:

1107 + 9840 = 10947

Finally, let's multiply 123 by 400000 and add four zeros at the end:

123 * 400000 = 49200000

Now, let's add this result to the previous one:

10947 + 49200000 = 49210947

So, 123 * 456789 = 49210947.


<div class="alert alert-info">
    
## Is This Answer Correct?
### Compare the Following Three Answers Received:
- From your calculator
- From the raw prompt
- From the prompt including "Let's think step by step"

</div>


<div class="alert alert-block alert-danger">
    
Although the model took significantly longer to provide an answer breakdown with the "Let's think step by step" prompt, giving the impression that it was dissecting the problem into sub-problems to demonstrate its logical reasoning and the mathematical calculations performed, it ultimately produced the **WRONG answer**.

</div>
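The "calculator" side of the comparison can be done in Python, whose integers are exact. The sketch below redoes the place-value decomposition that the model attempted and compares the exact product with the two model answers copied from the outputs above; the variable names are illustrative.

```python
a, b = 123, 456789

# Decompose b by place value, which is the strategy the model attempted.
place_values = [400000, 50000, 6000, 700, 80, 9]
assert sum(place_values) == b

partials = [a * p for p in place_values]
for p, partial in zip(place_values, partials):
    print(f"{a} * {p:>6} = {partial:>10,}")

exact = sum(partials)
print(f"exact product = {exact:,}")

# Answers copied from the two model responses shown above.
model_answers = {"raw prompt": 56296347, "step by step": 49210947}
for label, answer in model_answers.items():
    print(f"{label}: {answer:,} (off by {answer - exact:,})")
```

Comparing the partial products with the model's step-by-step output shows where it went wrong: it used only the 400000, 80, and 9 terms and skipped 50000, 6000, and 700.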

<div class="alert alert-info">
    
## Inferring Sentiments About Products/Services Based on Their Reviews
</div>


In [32]:
lamp_review = """
Needed a nice lamp for my bedroom, and this one had \
additional storage and not too high of a price point. \
Got it fast.  The string to our lamp broke during the \
transit and the company happily sent over a new one. \
Came within a few days as well. It was easy to put \
together.  I had a missing part, so I contacted their \
support and they very quickly got me the missing piece! \
Lumina seems to me to be a great company that cares \
about their customers and products!!
"""

In [33]:
prompt = f"""
Identify the following items from the review text: 
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple single quotes. \
Format your response as a JSON object with \
"Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

{
  "Sentiment": "positive",
  "Anger": false,
  "Item": "lamp",
  "Brand": "Lumina"
}


<div class="alert alert-block alert-danger">

## Now that you have learned the basics of OpenAI Chat Completion, let's use it to connect to and query a relational database application.
- We will use https://api.openai.com/v1/chat/completions to send a POST request with JSON input.
- Functions and function_call will be utilized in the conversation with the database.
- The utility functions were originally provided by the OpenAI Cookbook examples.

</div>
    

<div class="alert alert-block alert-danger">

### Note that you should have already created and populated the "saleco" database in the first assignment.


</div>

<div class="alert alert-block alert-warning">

## Build the SaleCo Database on PostgreSQL
- Install **PostgreSQL 14** on your computer.
- Add PostgreSQL to your system's PATH environment variable (Windows example): `set PATH=%PATH%;C:\Program Files\PostgreSQL\14\bin;`.
- Use **psql** to run the following SQL scripts, in order, to build the database schema and populate the tables. For example: `psql -p 5433 -U postgres -f Build-DB-SaleCo.sql`. Replace `5433` with the port your server listens on (the PostgreSQL default is `5432`, which is also the port used by the connection code later in this notebook).
    1. Build-DB-SaleCo.sql
    2. LoadRowsIntoDB.sql


![image.png](attachment:image.png)
    
</div>


### Utilities

First, let's define a few utilities for making calls to the Chat Completions API and for keeping track of the conversation state.

In [34]:
@retry(wait=wait_random_exponential(multiplier=1, max=40), stop=stop_after_attempt(3))
def chat_completion_request(messages, functions=None, function_call=None, model=GPT_MODEL):
    """Send a Chat Completions request and return the raw `requests` response."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + client.api_key,
    }
    
    # temperature=0.0 keeps the generated SQL as deterministic as possible
    json_data = {"model": model, "messages": messages, "temperature": 0.0}
    if functions is not None:
        json_data.update({"functions": functions})
    if function_call is not None:
        json_data.update({"function_call": function_call})
    try:
        response = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers=headers,
            json=json_data,
            timeout=60,  # avoid hanging indefinitely on a stalled connection
        )
        return response
    except Exception as e:
        print("Unable to generate ChatCompletion response")
        print(f"Exception: {e}")
        # Re-raise so that @retry can try again; returning the exception object
        # would only fail later when the caller invokes .json() on it.
        raise

In [35]:
def pretty_print_conversation(messages):
    role_to_color = {
        "system": "red",
        "user": "green",
        "assistant": "blue",
        "function": "magenta",
    }
    
    for message in messages:
        if message["role"] == "system":
            print(colored(f"system: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "user":
            print(colored(f"user: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "assistant" and message.get("function_call"):
            print(colored(f"assistant: {message['function_call']}\n", role_to_color[message["role"]]))
        elif message["role"] == "assistant" and not message.get("function_call"):
            print(colored(f"assistant: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "function":
            print(colored(f"function ({message['name']}): {message['content']}\n", role_to_color[message["role"]]))

In [36]:
import psycopg2

#establishing the connection
conn = psycopg2.connect(
   database="saleco", user='postgres', password='password', host='127.0.0.1', port= '5432'
)

In [37]:
#Creating a cursor object using the cursor() method
cursor = conn.cursor()

#Executing a select statement using the execute() method
cursor.execute("select version()")


In [38]:
# Fetch a single row using fetchone() method.
data = cursor.fetchone()
print("Connection established to: ",data)

Connection established to:  ('PostgreSQL 14.10 on x86_64-apple-darwin20.6.0, compiled by Apple clang version 12.0.5 (clang-1205.0.22.9), 64-bit',)


In [39]:
def get_table_names(conn):
    """Return a list of table names."""
    table_names = []
    cursor = conn.cursor()
    cursor.execute("SELECT table_name FROM information_schema.tables WHERE table_schema='public' AND table_type='BASE TABLE';")
    for table in cursor.fetchall():
        table_names.append(table[0])
    return table_names


def get_column_names(conn, table_name):
    """Return a list of column names."""
    column_names = []
    cursor = conn.cursor()
    cursor.execute("SELECT column_name FROM information_schema.columns WHERE table_schema = 'public' AND table_name   = %s;", [table_name])
    for col in cursor.fetchall():
        column_names.append(col[0])
    return column_names


def get_database_info(conn):
    """Return a list of dicts containing the table name and columns for each table in the database."""
    table_dicts = []
    for table_name in get_table_names(conn):
        columns_names = get_column_names(conn, table_name)
        table_dicts.append({"table_name": table_name, "column_names": columns_names})
    return table_dicts


In [40]:
table_names = get_table_names(conn)
print(len(table_names), table_names)

7 ['vendor', 'product', 'customer', 'invoice', 'line', 'v', 'p']


In [41]:
column_names_customer_table = get_column_names(conn, 'customer')
print(len(column_names_customer_table), column_names_customer_table)

7 ['cus_code', 'cus_lname', 'cus_fname', 'cus_initial', 'cus_areacode', 'cus_phone', 'cus_balance']


In [42]:
get_database_info(conn)

[{'table_name': 'vendor',
  'column_names': ['v_code',
   'v_name',
   'v_contact',
   'v_areacode',
   'v_phone',
   'v_state',
   'v_order']},
 {'table_name': 'product',
  'column_names': ['p_code',
   'p_descript',
   'p_indate',
   'p_qoh',
   'p_min',
   'p_price',
   'p_discount',
   'v_code']},
 {'table_name': 'customer',
  'column_names': ['cus_code',
   'cus_lname',
   'cus_fname',
   'cus_initial',
   'cus_areacode',
   'cus_phone',
   'cus_balance']},
 {'table_name': 'invoice',
  'column_names': ['inv_number', 'cus_code', 'inv_date']},
 {'table_name': 'line',
  'column_names': ['inv_number',
   'line_number',
   'p_code',
   'line_units',
   'line_price']},
 {'table_name': 'v',
  'column_names': ['v_code',
   'v_name',
   'v_contact',
   'v_areacode',
   'v_phone',
   'v_state',
   'v_order']},
 {'table_name': 'p',
  'column_names': ['p_code',
   'p_descript',
   'p_indate',
   'p_qoh',
   'p_min',
   'p_price',
   'p_discount',
   'v_code']}]

In [43]:
database_schema_dict = get_database_info(conn)
database_schema_string = "\n".join(
    [
        f"Table: {table['table_name']}\nColumns: {', '.join(table['column_names'])}"
        for table in database_schema_dict
    ]
)

print(database_schema_string)

Table: vendor
Columns: v_code, v_name, v_contact, v_areacode, v_phone, v_state, v_order
Table: product
Columns: p_code, p_descript, p_indate, p_qoh, p_min, p_price, p_discount, v_code
Table: customer
Columns: cus_code, cus_lname, cus_fname, cus_initial, cus_areacode, cus_phone, cus_balance
Table: invoice
Columns: inv_number, cus_code, inv_date
Table: line
Columns: inv_number, line_number, p_code, line_units, line_price
Table: v
Columns: v_code, v_name, v_contact, v_areacode, v_phone, v_state, v_order
Table: p
Columns: p_code, p_descript, p_indate, p_qoh, p_min, p_price, p_discount, v_code


In [44]:
functions = [
    {
        "name": "ask_database",
        "description": "Use this function to answer user questions about saleco. Output should be a fully formed SQL query.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": f"""
                            SQL query extracting info to answer the user's question.
                            SQL should be written using this database schema:
                            {database_schema_string}
                            The query should be returned in plain text, not in JSON.
                            """,
                }
            },
            "required": ["query"],
        },
    }
]
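The `functions` parameter used above is the older form of function calling in the Chat Completions API; more recent API versions accept the same JSON schema wrapped in a `tools` list. The following is a minimal sketch of that wrapping, not part of the original notebook. The helper name `to_tools` and the shortened `example_functions` spec are hypothetical and stand in for the `functions` list defined above.

```python
def to_tools(function_specs):
    """Wrap legacy function specs in the `tools` list format."""
    return [{"type": "function", "function": spec} for spec in function_specs]


# Shortened stand-in for the `functions` list (assumption, for illustration only).
example_functions = [
    {
        "name": "ask_database",
        "description": "Answer user questions about saleco with a SQL query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    }
]

tools = to_tools(example_functions)
print(tools[0]["type"], tools[0]["function"]["name"])
```

If `tools` were used instead of `functions`, the assistant message would carry `tool_calls` rather than `function_call`, so the dispatch code in this notebook would need a matching change.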

In [45]:
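def ask_database(conn, query):
    """Function to query Postgres database with a provided SQL query."""
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        results = str(cursor.fetchall())
    except Exception as e:
        # Roll back so a failed query does not leave the transaction aborted
        # and block every later query on this connection.
        conn.rollback()
        results = f"query failed with error: {e}"
    return results

def execute_function_call(message):
    """Run the function named in the assistant's function_call and return its result as a string."""
    if message["function_call"]["name"] == "ask_database":
        # The arguments arrive as a JSON string; the SQL text is under "query".
        query = json.loads(message["function_call"]["arguments"])["query"]
        results = ask_database(conn, query)
    else:
        results = f"Error: function {message['function_call']['name']} does not exist"
    return results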
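
The dispatch logic can be checked without calling the API by mocking the assistant message. The sketch below is illustrative only and rests on several assumptions: it uses an in-memory SQLite database in place of the Postgres connection, a toy `invoice` table with made-up rows, a hand-written `function_call` message shaped like the ones printed in this notebook, and a variant of `execute_function_call` that takes the connection as an argument instead of reading the global `conn`.

```python
import json
import sqlite3


def ask_database(conn, query):
    """Run a SQL query and return the fetched rows as a string."""
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        results = str(cursor.fetchall())
    except Exception as e:
        results = f"query failed with error: {e}"
    return results


def execute_function_call(conn, message):
    """Dispatch a function_call message to the matching local function."""
    if message["function_call"]["name"] == "ask_database":
        query = json.loads(message["function_call"]["arguments"])["query"]
        return ask_database(conn, query)
    return f"Error: function {message['function_call']['name']} does not exist"


# Stand-in database (assumption: SQLite instead of Postgres, made-up rows).
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE invoice (inv_number INTEGER, cus_code INTEGER)")
demo_conn.executemany(
    "INSERT INTO invoice VALUES (?, ?)", [(1, 10), (2, 10), (3, 20)]
)

# Hand-written message in the same shape as the assistant messages in this notebook.
mock_message = {
    "function_call": {
        "name": "ask_database",
        "arguments": json.dumps(
            {
                "query": "SELECT cus_code, COUNT(*) FROM invoice "
                "GROUP BY cus_code ORDER BY 2 DESC"
            }
        ),
    }
}
unknown_message = {"function_call": {"name": "other", "arguments": "{}"}}

known_result = execute_function_call(demo_conn, mock_message)
unknown_result = execute_function_call(demo_conn, unknown_message)
print(known_result)
print(unknown_result)
```

Because every outcome is returned as a string, including failures, the result can always be appended to `messages` as the function's reply.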

<div class="alert alert-info">
    
## Examples: Text Descriptions to SQL Queries on `SaleCo` Database

- Hi, who are the top 3 customers by number of invoices?
- What is the name of the vendor with the most products?
- List the product names from the vendor with the most products.

</div>
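
The cells that follow repeat the same three steps for each question: append the user message, request a completion, and run any function call the model returns. A helper that bundles those steps might look like the sketch below. The name `ask` and its `request_fn` and `execute_fn` parameters are hypothetical, and the two stubs stand in for the API and the database so that the block runs offline.

```python
def ask(messages, question, request_fn, execute_fn):
    """Append a question, get the assistant reply, and run any function call."""
    messages.append({"role": "user", "content": question})
    assistant_message = request_fn(messages)
    messages.append(assistant_message)
    if assistant_message.get("function_call"):
        results = execute_fn(assistant_message)
        messages.append(
            {
                "role": "function",
                "name": assistant_message["function_call"]["name"],
                "content": results,
            }
        )
    return messages


# Stubs for illustration only; they do not call OpenAI or a database.
def stub_request(messages):
    return {
        "role": "assistant",
        "content": None,
        "function_call": {
            "name": "ask_database",
            "arguments": '{"query": "SELECT 1;"}',
        },
    }


def stub_execute(message):
    return "[(1,)]"


history = [
    {"role": "system", "content": "Answer user questions by generating SQL queries."}
]
ask(
    history,
    "Hi, who are the top 3 customers by number of invoices?",
    stub_request,
    stub_execute,
)
print([m["role"] for m in history])
```

In this notebook, `request_fn` could be `lambda m: chat_completion_request(m, functions).json()["choices"][0]["message"]` and `execute_fn` could be `execute_function_call`.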

In [46]:
messages = []
messages.append({"role": "system", "content": "Answer user questions by generating SQL queries against the Saleco Database."})
messages.append({"role": "user", "content": "Hi, who are the top 3 customer by number of invoices?"})
chat_response = chat_completion_request(messages, functions)
assistant_message = chat_response.json()["choices"][0]["message"]
messages.append(assistant_message)
if assistant_message.get("function_call"):
    results = execute_function_call(assistant_message)
    messages.append({"role": "function", "name": assistant_message["function_call"]["name"], "content": results})
pretty_print_conversation(messages)

system: Answer user questions by generating SQL queries against the Saleco Database.

user: Hi, who are the top 3 customer by number of invoices?

assistant: {'name': 'ask_database', 'arguments': '{"query":"SELECT cus_code, COUNT(inv_number) AS num_invoices\\nFROM invoice\\nGROUP BY cus_code\\nORDER BY num_invoices DESC\\nLIMIT 3;"}'}

function (ask_database): [(10011, 3), (10014, 2), (10015, 1)]


In [47]:
# conn.commit()

In [48]:
messages.append({"role": "user", "content": "What is the name of the vendor with the most products?"})
chat_response = chat_completion_request(messages, functions)
assistant_message = chat_response.json()["choices"][0]["message"]
messages.append(assistant_message)
if assistant_message.get("function_call"):
    results = execute_function_call(assistant_message)
    messages.append({"role": "function", "content": results, "name": assistant_message["function_call"]["name"]})
pretty_print_conversation(messages)

system: Answer user questions by generating SQL queries against the Saleco Database.

user: Hi, who are the top 3 customer by number of invoices?

assistant: {'name': 'ask_database', 'arguments': '{"query":"SELECT cus_code, COUNT(inv_number) AS num_invoices\\nFROM invoice\\nGROUP BY cus_code\\nORDER BY num_invoices DESC\\nLIMIT 3;"}'}

function (ask_database): [(10011, 3), (10014, 2), (10015, 1)]

user: What is the name of the vendor with the most products?

assistant: {'name': 'ask_database', 'arguments': '{"query":"SELECT v_name, COUNT(p_code) AS num_products\\nFROM product\\nJOIN vendor ON product.v_code = vendor.v_code\\nGROUP BY vendor.v_code, v_name\\nORDER BY num_products DESC\\nLIMIT 1;"}'}

function (ask_database): [('Bryson, Inc.', 4)]


In [49]:
messages.append({"role": "user", "content": "List the product names of the vendor with the most products?"})
chat_response = chat_completion_request(messages, functions)
assistant_message = chat_response.json()["choices"][0]["message"]
messages.append(assistant_message)
if assistant_message.get("function_call"):
    results = execute_function_call(assistant_message)
    messages.append({"role": "function", "content": results, "name": assistant_message["function_call"]["name"]})
pretty_print_conversation(messages)

system: Answer user questions by generating SQL queries against the Saleco Database.

user: Hi, who are the top 3 customer by number of invoices?

assistant: {'name': 'ask_database', 'arguments': '{"query":"SELECT cus_code, COUNT(inv_number) AS num_invoices\\nFROM invoice\\nGROUP BY cus_code\\nORDER BY num_invoices DESC\\nLIMIT 3;"}'}

function (ask_database): [(10011, 3), (10014, 2), (10015, 1)]

user: What is the name of the vendor with the most products?

assistant: {'name': 'ask_database', 'arguments': '{"query":"SELECT v_name, COUNT(p_code) AS num_products\\nFROM product\\nJOIN vendor ON product.v_code = vendor.v_code\\nGROUP BY vendor.v_code, v_name\\nORDER BY num_products DESC\\nLIMIT 1;"}'}

function (ask_database): [('Bryson, Inc.', 4)]

user: List the product names of the vendor with the most products?

assistant: {'name': 'ask_database', 'arguments': '{"query":"SELECT p_descript\\nFROM product\\nJOI

<div class="alert alert-info">
<hr style="border:5px solid orange"></hr>

## Requirements
- Write your code in the cell provided below each requirement.
- The .ipynb and HTML documents you submit must include both the source code and the output for the specified requirements.

**Note**:
- **Requirement 1**: For each prompt you create, use `get_completion` to obtain the response.
- **Requirement 2**: Refer to "Examples: Text Descriptions to SQL Queries on `SaleCo` Database".

<hr style="border:5px solid orange"></hr>

</div>

<div class="alert alert-block alert-danger">

**Requirement 1:** Prompt Engineering

Your task is to engineer three prompts for a business entrepreneur who plans to start an online business selling Robot Vacuum Cleaners. The prompts should provide answers to the following specifications:
- A summary of suggestions for this business startup.
- The most important features of the Robot Vacuum Cleaner that will attract customers and ensure high customer satisfaction.
- The key performance indicators (KPIs) that should be considered for the Order to Cash (O2C) process of this business.

</div>


In [50]:
# Write your code here


<div class="alert alert-block alert-danger">

**Requirement 2:** Use OpenAI Chat Completion to generate and execute a SQL query that does the following:
    
List the top three most popular product names and their vendors from the SaleCo Database.

</div>

In [51]:
# Write your code here
