<div style="width: 100%; overflow: hidden;">
    <div style="width: 150px; float: left;"> <img src="https://raw.githubusercontent.com/DataForScience/Networks/master/data/D4Sci_logo_ball.png" alt="Data For Science, Inc" align="left" border="0" width=150px> </div>
    <div style="float: left; margin-left: 10px;"> <h1>Generative AI with OpenAI API</h1>
<h1>Code Generation</h1>
        <p>Bruno Gonçalves<br/>
        <a href="http://www.data4sci.com/">www.data4sci.com</a><br/>
            @bgoncalves, @data4sci</p></div>
</div>

In [1]:
from collections import Counter
from pprint import pprint

import pandas as pd
import numpy as np
import sqlite3

import matplotlib
import matplotlib.pyplot as plt 

from ipywidgets import interact
import openai
from openai import OpenAI

import os
import gzip

import tqdm as tq
from tqdm.notebook import tqdm

import watermark

%load_ext watermark
%matplotlib inline

We start by printing the versions of the libraries we're using, for future reference.

In [2]:
%watermark -n -v -m -g -iv

Python implementation: CPython
Python version       : 3.11.7
IPython version      : 8.12.3

Compiler    : Clang 14.0.6 
OS          : Darwin
Release     : 24.3.0
Machine     : arm64
Processor   : arm
CPU cores   : 16
Architecture: 64bit

Git hash: 3a7a9a8b6856eb5855cd2ac76a384e203382ab54

watermark : 2.4.3
openai    : 1.30.5
matplotlib: 3.8.0
sqlite3   : 2.6.0
numpy     : 1.26.4
json      : 2.0.9
tqdm      : 4.66.4
pandas    : 2.2.3



Load default figure style

In [3]:
plt.style.use('d4sci.mplstyle')
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

# Text to Code

In [4]:
client = OpenAI()

In [14]:
messages = [
        {
            "role": "system", 
            "content": """You are a grumpy but expert Python software engineer 
             thats interviewing for a job. Please be as concise with your answers as possible."""
        },
        {
            "role": "user", 
            "content": """Create a recursive Python function to compute 
                Fibonacci numbers. Don't provide any explanation, just the code"""
        },
  ]

In [15]:
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    temperature=0,
    max_tokens=1024
)

This produces the expected result:

In [16]:
print(response.choices[0].message.content)

def fibonacci(n):
    if n <= 1:
       return n
    else:
       return(fibonacci(n-1) + fibonacci(n-2))


The generated function works as expected when we define it and call it:

In [17]:
def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

In [18]:
fibonacci(15)

610
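A single spot check says little about correctness in general. As a minimal sketch (the helper `check_fibonacci` and the iterative reference `fibonacci_iterative` are assumptions introduced here for illustration, not part of the original notebook), the generated function can be compared against an independent implementation over a range of inputs:

```python
def fibonacci(n):
    # Recursive definition, as generated by the model above
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


def fibonacci_iterative(n):
    # Independent reference implementation, used only for checking
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def check_fibonacci(func, n_max=20):
    # Hypothetical helper: compare func against the reference for 0..n_max
    return all(func(n) == fibonacci_iterative(n) for n in range(n_max + 1))


print(check_fibonacci(fibonacci))
```

The range is kept small on purpose, since the naive recursive version takes exponential time in `n`.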

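Let us define a utility function to make sequential queries easier. It appends each prompt and each reply to `messages`, so every call sends the full conversation so far. Note that it uses `gpt-3.5-turbo` rather than the `gpt-4` model used above.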

In [19]:
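def chat(messages, prompt):
    """Send `prompt` as the next user turn and return the model's reply.

    Note that `messages` is modified in place: both the prompt and the
    reply are appended so later calls see the full conversation.
    """
    # Add the new user turn to the running conversation
    messages.append({"role": "user", "content": prompt})

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=0,
        max_tokens=1024
    )

    # Store the reply (a ChatCompletionMessage object) in the history
    messages.append(response.choices[0].message)

    return messages[-1].content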

# Adding comments

In [20]:
print(chat(messages, "Can you add comments to this function?"))

```python
def fibonacci(n):
    # Base case
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)
```
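The reply arrives wrapped in a Markdown code fence, so it cannot be saved or executed as Python without first removing the wrapper. A minimal sketch of one way to do that follows; `extract_code` is a hypothetical helper, and it assumes the reply contains at most one fenced block of interest:

```python
import re

FENCE = "`" * 3  # Markdown code fence delimiter


def extract_code(reply):
    # Hypothetical helper: return the contents of the first fenced block,
    # or the reply unchanged if it contains no fence
    pattern = FENCE + r"[\w+-]*\n(.*?)" + FENCE
    match = re.search(pattern, reply, flags=re.DOTALL)
    return match.group(1) if match else reply


reply = FENCE + "python\ndef fibonacci(n):\n    return n\n" + FENCE
print(extract_code(reply))
```

Replies without a fence, such as the bare function returned by the first request, pass through unchanged.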


In [21]:
print(chat(messages, "What is the purpose of recursion in this piece of code?"))

To calculate Fibonacci numbers by breaking down the problem into smaller subproblems.


In [22]:
messages

[{'role': 'system',
  'content': 'You are a grumpy but expert Python software engineer \n             thats interviewing for a job. Please be as concise with your answers as possible.'},
 {'role': 'user',
  'content': "Create a recursive Python function to compute \n                Fibonacci numbers. Don't provide any explanation, just the code"},
 {'role': 'user', 'content': 'Can you add comments to this function?'},
 ChatCompletionMessage(content='```python\ndef fibonacci(n):\n    # Base case\n    if n <= 1:\n        return n\n    else:\n        return fibonacci(n-1) + fibonacci(n-2)\n```', role='assistant', function_call=None, tool_calls=None, refusal=None),
 {'role': 'user',
  'content': 'What is the purpose of recursion in this piece of code?'},
 ChatCompletionMessage(content='To calculate Fibonacci numbers by breaking down the problem into smaller subproblems.', role='assistant', function_call=None, tool_calls=None, refusal=None)]
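The history above mixes plain dictionaries with `ChatCompletionMessage` objects, and it does not contain the reply to the first request, which was made directly rather than through `chat`. If a uniform history is preferred, for example to serialize it as JSON, one option is to convert every entry to a dictionary. The sketch below is an assumption-laden illustration: `to_dict` is a hypothetical helper that relies only on the `role` and `content` attributes visible in the output above, and `SimpleNamespace` stands in for the message object so the example runs without an API call.

```python
from types import SimpleNamespace


def to_dict(message):
    # Hypothetical helper: leave dictionaries alone, convert objects that
    # expose .role and .content (as ChatCompletionMessage does above)
    if isinstance(message, dict):
        return message
    return {"role": message.role, "content": message.content}


history = [
    {"role": "user", "content": "Can you add comments to this function?"},
    # Stand-in for a ChatCompletionMessage returned by the API
    SimpleNamespace(role="assistant", content="# Base case"),
]

normalized = [to_dict(m) for m in history]
print(normalized[-1])
```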

# Explain Existing Code

Let's use a relatively small Python script.

In [23]:
# Read the whole script into a single string, closing the file afterwards
with open("data/EpiModel.py") as fp:
    code_text = fp.read()

In [24]:
print(code_text)

### −∗− mode : python ; −∗−
# @file EpiModel.py
# @author Bruno Goncalves
######################################################

import networkx as nx
import numpy as np
from numpy import linalg
from numpy import random
import scipy.integrate
import pandas as pd
import matplotlib.pyplot as plt

from tqdm import tqdm
tqdm.pandas()

class EpiModel(object):
    """Simple Epidemic Model Implementation
    
        Provides a way to implement and numerically integrate 
    """
    def __init__(self, compartments=None):
        self.transitions = nx.MultiDiGraph()
        self.seasonality = None
        
        if compartments is not None:
            self.transitions.add_nodes_from([comp for comp in compartments])
    
    def add_interaction(self, source, target, agent, rate):        
        self.transitions.add_edge(source, target, agent=agent, rate=rate)        
        
    def add_spontaneous(self, source, target, rate):
        self.transitions.add_edge(source, target, rate=rate)



In [25]:
%%time
print(chat(messages, "Please explain what this piece of code does: ```%s```" % code_text))

This code defines a class `EpiModel` for simulating and numerically integrating a simple epidemic model. It includes methods for adding interactions, spontaneous transitions, and vaccinations, as well as for plotting and calculating the basic reproduction number (R0). The main script at the end demonstrates how to use the class to simulate an SIR model and plot the results.
CPU times: user 13 ms, sys: 3.27 ms, total: 16.2 ms
Wall time: 1.04 s


In [26]:
%%time
print(chat(messages, "Can you please add a doc string to each function and method? Please include information about each argument of the function"))

```python
def __init__(self, compartments=None):
    """Initialize the EpiModel object.
    
    Args:
        compartments (list): List of compartment names.
    """

def add_interaction(self, source, target, agent, rate):
    """Add an interaction between compartments.
    
    Args:
        source (str): Source compartment name.
        target (str): Target compartment name.
        agent (str): Agent involved in the interaction.
        rate (float): Rate of the interaction.
    """

def add_spontaneous(self, source, target, rate):
    """Add a spontaneous transition between compartments.
    
    Args:
        source (str): Source compartment name.
        target (str): Target compartment name.
        rate (float): Rate of the transition.
    """

def add_vaccination(self, source, target, rate, start):
    """Add a vaccination transition between compartments.
    
    Args:
        source (str): Source compartment name.
        target (str): Target compartment name.
        rate 
```

# Interacting with a database

Let us open a small test database. This file was downloaded from https://github.com/chineseballer06/Statistical-Analysis-of-Northwind-Database/blob/master/Northwind_small.sqlite

In [28]:
con = sqlite3.connect("data/Northwind_small.sqlite")

In [29]:
messages = [
    {"role": "system", 
     "content": """You're a Database Administrator. 
        Please generate SQL queries to answer the following questions. 
        No comments are necessary."""},
    {"role": "user", "content": """
        # Table Employee, columns = [Id, LastName, First Name]
        # Table Shipper, columns = [Id, CompanyName, Phone]
        # Table OrderDetail, columns = [OrderId, ProductId, Quantity]
        # Table EmployeeTerritory, columns = [Id, EmployeeId, TerritoryId]
    """},
]
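The schema description in the prompt is typed by hand, so it can drift from what the database actually contains. As a sketch of an alternative, the same style of description can be generated from SQLite's own catalog using `sqlite_master` and `PRAGMA table_info`. The helper `describe_schema` is hypothetical, and the demonstration uses a toy in-memory table rather than the Northwind file:

```python
import sqlite3


def describe_schema(con):
    # Hypothetical helper: one "# Table ..." line per table, mirroring the
    # hand-written prompt format used above
    tables = con.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    ).fetchall()
    lines = []
    for (table,) in tables:
        # PRAGMA table_info returns one row per column; index 1 is the name
        columns = [row[1] for row in con.execute(f'PRAGMA table_info("{table}")')]
        lines.append(f"# Table {table}, columns = [{', '.join(columns)}]")
    return "\n".join(lines)


# Toy database used only for this illustration
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Shipper (Id INTEGER, CompanyName TEXT, Phone TEXT)")
print(describe_schema(demo))
```

For a large database it may be preferable to list only the tables relevant to the questions being asked, since every line of schema adds to the prompt length.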

In [30]:
sql_query = chat(messages, "Generate a table with employee first name, last name and territory id")
print(sql_query)

SELECT e.FirstName, e.LastName, et.TerritoryId
FROM Employee e
JOIN EmployeeTerritory et ON e.Id = et.EmployeeId;


In [31]:
pd.read_sql(sql_query, con)

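|   | FirstName | LastName | TerritoryId |
|---|-----------|----------|-------------|
| 0 | Nancy | Davolio | 6897 |
| 1 | Nancy | Davolio | 19713 |
| 2 | Andrew | Fuller | 1581 |
| 3 | Andrew | Fuller | 1730 |
| 4 | Andrew | Fuller | 1833 |
| 5 | Andrew | Fuller | 2116 |
| 6 | Andrew | Fuller | 2139 |
| 7 | Andrew | Fuller | 2184 |
| 8 | Andrew | Fuller | 40222 |
| 9 | Janet | Leverling | 30346 |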
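Passing model-generated SQL straight to `pd.read_sql` trusts the model not to emit a statement that modifies data. One possible safeguard, sketched here under the assumption that the queries only ever need read access, is to open the SQLite file in read-only mode through a URI. The temporary file and the toy `Shipper` table are illustrative stand-ins for the real database:

```python
import os
import sqlite3
import tempfile

# Build a throwaway database file for the demonstration
path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
rw = sqlite3.connect(path)
rw.execute("CREATE TABLE Shipper (Id INTEGER, CompanyName TEXT)")
rw.execute("INSERT INTO Shipper VALUES (1, 'Example Shipper')")
rw.commit()
rw.close()

# mode=ro asks SQLite to open the file read-only
ro = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
count = ro.execute("SELECT COUNT(*) FROM Shipper").fetchone()[0]

try:
    ro.execute("DELETE FROM Shipper")
    write_blocked = False
except sqlite3.OperationalError:
    # SQLite refuses writes on a read-only connection
    write_blocked = True

print(count, write_blocked)
```

A read-only connection does not protect against slow or very large queries, so it complements rather than replaces reviewing the generated SQL.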


In [33]:
sql_query = chat(messages, "Compute how many employees work in each territory")
print(sql_query)

SELECT et.TerritoryId, COUNT(e.Id) AS NumEmployees
FROM Employee e
JOIN EmployeeTerritory et ON e.Id = et.EmployeeId
GROUP BY et.TerritoryId;


In [34]:
pd.read_sql(sql_query, con)

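|   | TerritoryId | NumEmployees |
|---|-------------|--------------|
| 0 | 1581 | 1 |
| 1 | 1730 | 1 |
| 2 | 1833 | 1 |
| 3 | 2116 | 1 |
| 4 | 2139 | 1 |
| 5 | 2184 | 1 |
| 6 | 2903 | 1 |
| 7 | 3049 | 1 |
| 8 | 3801 | 1 |
| 9 | 6897 | 1 |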


In [35]:
sql_query = chat(messages, "How many shippers do we work with?")
print(sql_query)

SELECT COUNT(*)
FROM Shipper;


In [36]:
pd.read_sql(sql_query, con)

|   | COUNT(*) |
|---|----------|
| 0 | 3 |


In [37]:
messages

[{'role': 'system',
  'content': "You're a Database Administrator. \n        Please generate SQL queries to answer the following questions. \n        No comments are necessary."},
 {'role': 'user',
  'content': '\n        # Table Employee, columns = [Id, LastName, First Name]\n        # Table Shipper, columns = [Id, CompanyName, Phone]\n        # Table OrderDetail, columns = [OrderId, ProductId, Quantity]\n        # Table EmployeeTerritory, columns = [Id, EmployeeId, TerritoryId]\n    '},
 {'role': 'user',
  'content': 'Generate a table with employee first name, last name and territory id'},
 ChatCompletionMessage(content='SELECT e.FirstName, e.LastName, et.TerritoryId\nFROM Employee e\nJOIN EmployeeTerritory et ON e.Id = et.EmployeeId;', role='assistant', function_call=None, tool_calls=None, refusal=None),
 {'role': 'user',
  'content': 'Compute how many employees work in each territory'},
 ChatCompletionMessage(content='SELECT TerritoryId, COUNT(EmployeeId) AS NumEmployees\nFROM Empl

In [45]:
len(messages)

10

In [38]:
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=messages,
    temperature=0,
    max_tokens=1024
)

In [44]:
response.choices[0].message.content

'SELECT COUNT(Id)\nFROM Shipper;'

<center>
     <img src="https://raw.githubusercontent.com/DataForScience/Networks/master/data/D4Sci_logo_full.png" alt="Data For Science, Inc" align="center" border="0" width=300px> 
</center>