### Introduction
* This project uses OpenAI to generate SQL interview questions.
* The user can answer with either SQL code or natural language.
* SQL queries are executed against a local database.

#### Requirements
* OpenAI API key
* Local MySQL server
* Jupyter Notebook
* A highly motivated student

In [None]:
# The cells below use the legacy openai.ChatCompletion API, which was removed in openai 1.0.
!pip install "openai<1"

In [8]:
# https://pypi.org/project/python-dotenv/
# !pip install dotenv
# !pip install --upgrade pip
!pip install python-dotenv

Collecting python-dotenv
  Downloading python_dotenv-1.0.0-py3-none-any.whl (19 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-1.0.0


In [None]:
# sqlite3 is part of the Python standard library, so no pip install is needed.

In [None]:
from sqlalchemy import create_engine

def connect_mysql(user, password, host, port, database):
    """
    Connects to a MySQL database using SQLAlchemy and returns a connection object.
    
    Parameters:
    user (str): Username to connect to the database.
    password (str): Password for the user to connect to the database.
    host (str): Hostname or IP address of the MySQL server.
    port (int): Port number to connect to the MySQL server.
    database (str): Name of the database to connect to.
    
    Returns:
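    conn (sqlalchemy.engine.Connection): Open connection; the caller should close it.
    """
    # Define the connection string (the pymysql driver must be installed: pip install pymysql).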
    conn_str = f"mysql+pymysql://{user}:{password}@{host}:{port}/{database}"

    # Create the SQLAlchemy engine.
    engine = create_engine(conn_str)

    # Connect to the database.
    conn = engine.connect()

    # Return the connection object.
    return conn
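
One caveat with the f-string connection string above: a password containing characters such as `@`, `:` or `/` would produce a malformed URL. A minimal sketch of one way around this, using only the standard library; the helper name `build_mysql_url` is hypothetical and not part of the original notebook.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, port, database):
    """Build a mysql+pymysql URL, percent-encoding the user and password.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# The special characters in the password are percent-encoded.
print(build_mysql_url("myuser", "p@ss:word/1", "localhost", 3306, "mydatabase"))
```

The resulting string could be passed to `create_engine` in place of `conn_str`.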


In [None]:
conn = connect_mysql(user="myuser", 
                     password="mypassword", 
                     host="localhost", 
                     database="mydatabase",
                     port=3306
                    )


In [None]:
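from sqlalchemy import create_engine, text

def execute_query(query):
    # Placeholder credentials; replace with your own (or reuse connect_mysql above).
    engine = create_engine('mysql+pymysql://username:password@localhost/mydatabase')
    # The context manager closes the connection even if the query raises.
    with engine.connect() as conn:
        # text() is required for raw SQL strings in SQLAlchemy 2.x.
        result = conn.execute(text(query)).fetchall()
    return result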


In [6]:
from IPython.display import display, HTML


In [2]:
import openai
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.getenv('OPENAI_API_KEY')

In [4]:
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]


#### What is the difference between a join and a subquery?

In [11]:
prompt = 'What is the difference between a join and a subquery?'
response = get_completion(prompt)
display(HTML(response))
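
When a prompt does not ask for HTML, the response is often plain text, so rendering it with `HTML(...)` can collapse line breaks and may treat a stray `<` as the start of a tag. A small sketch of a wrapper for that case; the helper name `as_preformatted_html` is hypothetical, and it should not be used for responses that are expected to contain real HTML such as tables.

```python
import html


def as_preformatted_html(text):
    """Escape text and wrap it in <pre> so line breaks survive HTML rendering.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    return f"<pre style='white-space: pre-wrap'>{html.escape(text)}</pre>"


print(as_preformatted_html("SELECT a\nFROM t\nWHERE a < 3;"))
```

In the notebook this would be used as `display(HTML(as_preformatted_html(response)))`.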


#### What are window functions in SQL?


In [21]:
prompt = 'What are window functions in SQL? Explain with an example, and show the output of the example in HTML table format'
response = get_completion(prompt)
display(HTML(response))

| id | date | amount | running_total |
|---|---|---|---|
| 1 | 2021-01-01 | 100 | 100 |
| 2 | 2021-01-02 | 200 | 300 |
| 3 | 2021-01-03 | 150 | 450 |
| 4 | 2021-01-04 | 300 | 750 |
| 5 | 2021-01-05 | 250 | 1000 |
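
The running total in the output above can be reproduced locally. The following is a sketch under stated assumptions: it uses an in-memory SQLite database from the standard library as a stand-in for the MySQL server, the table name `payments` is made up for illustration, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE payments (id INTEGER PRIMARY KEY, date TEXT, amount INTEGER)"
)
demo_conn.executemany(
    "INSERT INTO payments (id, date, amount) VALUES (?, ?, ?)",
    [
        (1, "2021-01-01", 100),
        (2, "2021-01-02", 200),
        (3, "2021-01-03", 150),
        (4, "2021-01-04", 300),
        (5, "2021-01-05", 250),
    ],
)

# SUM(...) OVER (ORDER BY date) accumulates amount from the first row
# up to the current row, which yields a running total.
rows = demo_conn.execute(
    """
    SELECT id, date, amount,
           SUM(amount) OVER (ORDER BY date) AS running_total
    FROM payments
    ORDER BY date
    """
).fetchall()
demo_conn.close()

for row in rows:
    print(row)
```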



#### What is a stored procedure and how is it used in SQL?


In [28]:
prompt = 'What is a stored procedure and how is it used in SQL? Explain with an example, place the example query in a <div> element, and show the output of the example in HTML table format'
response = get_completion(prompt)
display(HTML(response))

| EmployeeID | FirstName | LastName | Age | Salary |
|---|---|---|---|---|
| 1 | John | Doe | 30 | 50000 |
| 2 | Jane | Smith | 25 | 45000 |
| 3 | Bob | Johnson | 40 | 60000 |



#### What is the difference between a clustered and non-clustered index in SQL?



In [29]:
prompt = 'What is the difference between a clustered and non-clustered index in SQL? Explain with an example, place the example query in a <div> element, and show the output of the example in HTML table format'
response = get_completion(prompt)
display(HTML(response))


#### How do you handle null values in SQL queries? Can you provide an example of a query where null values need to be handled appropriately?

In [8]:
prompt = 'How to handle null values in SQL queries? Please provide an example query that handles null values'
response = get_completion(prompt)
display(HTML(response))
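
For a hands-on check of NULL handling that does not depend on the model's answer, here is a minimal sketch using an in-memory SQLite database as a stand-in for the local MySQL server. The `employees` table and its rows are invented for illustration.

```python
import sqlite3

null_demo = sqlite3.connect(":memory:")
null_demo.execute("CREATE TABLE employees (name TEXT, bonus INTEGER)")
null_demo.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", 500), ("Raj", None), ("Li", 0)],  # None is stored as NULL
)

# IS NULL is the correct test for missing values.
missing = null_demo.execute(
    "SELECT name FROM employees WHERE bonus IS NULL"
).fetchall()

# "= NULL" evaluates to NULL (not true), so it never matches any row.
wrong = null_demo.execute(
    "SELECT name FROM employees WHERE bonus = NULL"
).fetchall()

# COALESCE substitutes a default for NULL.
filled = null_demo.execute(
    "SELECT name, COALESCE(bonus, 0) FROM employees ORDER BY name"
).fetchall()

# Aggregates such as AVG skip NULL rows: (500 + 0) / 2.
average = null_demo.execute("SELECT AVG(bonus) FROM employees").fetchone()[0]
null_demo.close()

print(missing, wrong, filled, average)
```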



#### Can you explain the differences between an inner join, outer join, left join, and right join?
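
A small sketch that makes the inner versus left join difference concrete, assuming an in-memory SQLite database as a stand-in for MySQL and two invented tables, `customers` and `orders`. A right join mirrors the left join with the table roles swapped; it is left out here because older SQLite versions do not support it.

```python
import sqlite3

join_demo = sqlite3.connect(":memory:")
join_demo.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Raj');
    -- Order 11 points at a customer that does not exist.
    INSERT INTO orders VALUES (10, 1, 50), (11, 3, 75);
    """
)

# INNER JOIN keeps only rows with a match on both sides.
inner = join_demo.execute(
    """
    SELECT c.name, o.total
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
    """
).fetchall()

# LEFT JOIN keeps every customer; unmatched order columns become NULL.
left = join_demo.execute(
    """
    SELECT c.name, o.total
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
    """
).fetchall()
join_demo.close()

print(inner)
print(left)
```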


#### What is the difference between a query and a transaction? How do you ensure consistency of data in a transaction?


#### How is a common table expression (CTE) different from a temporary table? Can you provide an example of when you would use a CTE versus a temporary table?


#### What is a stored procedure? Can you provide an example of when you would use a stored procedure?


#### Can you explain the difference between a subquery and a join? When would you use one over the other?

#### What is a view in SQL? Can you provide an example of when you would use a view?

#### How would you write a query to find the top 10 customers by revenue, including their order details, order count, and average order value?

In [13]:
prompt = 'How would you write a query to find the top 10 customers by revenue, including their order details, order count, and average order value?'
response = get_completion(prompt)
display(HTML(response))

#### How would you write a query to find the top N products by revenue, where N is a user-defined parameter?

In [9]:
prompt = 'How would you write a query to find the top N products by revenue, where N is a user-defined parameter?'
response = get_completion(prompt)
display(HTML(response))
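
One way to make N a user-defined parameter is to bind it as a query parameter instead of formatting it into the SQL string. The sketch below assumes an in-memory SQLite database as a stand-in for MySQL; the `sales` table, its columns, and the function name `top_n_products` are hypothetical.

```python
import sqlite3


def top_n_products(conn, n):
    """Return the n products with the highest total revenue."""
    return conn.execute(
        """
        SELECT product, SUM(quantity * unit_price) AS revenue
        FROM sales
        GROUP BY product
        ORDER BY revenue DESC
        LIMIT ?
        """,
        (n,),  # n is bound as a parameter, not interpolated into the SQL
    ).fetchall()


shop = sqlite3.connect(":memory:")
shop.execute("CREATE TABLE sales (product TEXT, quantity INTEGER, unit_price INTEGER)")
shop.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("pen", 10, 2), ("book", 3, 15), ("pen", 5, 2), ("lamp", 1, 40)],
)

print(top_n_products(shop, 2))
```

Binding the value keeps user input out of the SQL text, which avoids SQL injection through the N parameter.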


#### Can you explain what a window function is and provide an example of when you would use one?

In [10]:
prompt = 'How would you write a query to calculate the running total of a particular column over a time period, such as monthly revenue or daily active users?'
response = get_completion(prompt)
display(HTML(response))

#### Can you explain the difference between correlated and non-correlated subqueries?
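
A brief sketch contrasting the two kinds of subquery, assuming an in-memory SQLite database as a stand-in for MySQL and an invented `staff` table. Conceptually, the non-correlated subquery can be evaluated once, while the correlated one refers to the current outer row and is evaluated per row.

```python
import sqlite3

hr = sqlite3.connect(":memory:")
hr.executescript(
    """
    CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO staff VALUES
        ('Ann', 'eng', 120), ('Raj', 'eng', 100),
        ('Li', 'ops', 80), ('Sam', 'ops', 90);
    """
)

# Non-correlated: the inner query does not reference the outer row.
# It compares each salary with the company-wide average (97.5).
above_company_avg = hr.execute(
    """
    SELECT name FROM staff
    WHERE salary > (SELECT AVG(salary) FROM staff)
    ORDER BY name
    """
).fetchall()

# Correlated: the inner query references s.dept from the outer row,
# so each salary is compared with the average of its own department.
above_dept_avg = hr.execute(
    """
    SELECT name FROM staff AS s
    WHERE salary > (SELECT AVG(salary) FROM staff AS d WHERE d.dept = s.dept)
    ORDER BY name
    """
).fetchall()
hr.close()

print(above_company_avg)
print(above_dept_avg)
```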