<div style="width: 100%; overflow: hidden;">
    <div style="width: 150px; float: left;"> <img src="https://raw.githubusercontent.com/DataForScience/Networks/master/data/D4Sci_logo_ball.png" alt="Data For Science, Inc" align="left" border="0" width=150px> </div>
    <div style="float: left; margin-left: 10px;"> <h1>Generative AI with OpenAI API</h1>
<h1>Code Generation</h1>
        <p>Bruno Gonçalves<br/>
        <a href="http://www.data4sci.com/">www.data4sci.com</a><br/>
            @bgoncalves, @data4sci</p></div>
</div>

In [1]:
from collections import Counter
from pprint import pprint

import pandas as pd
import numpy as np
import sqlite3

import matplotlib
import matplotlib.pyplot as plt 

from ipywidgets import interact
import openai

import os
import gzip

import tqdm as tq
from tqdm.notebook import tqdm

import watermark

%load_ext watermark
%matplotlib inline

We start by printing out the versions of the libraries we're using, for future reference.

In [2]:
%watermark -n -v -m -g -iv

Python implementation: CPython
Python version       : 3.10.9
IPython version      : 8.10.0

Compiler    : Clang 14.0.6 
OS          : Darwin
Release     : 22.5.0
Machine     : x86_64
Processor   : i386
CPU cores   : 16
Architecture: 64bit

Git hash: 40225cd66d36de03a59a6cda7f2fa68bc0164487

pandas    : 1.5.3
tqdm      : 4.64.1
openai    : 0.28.1
sqlite3   : 2.6.0
matplotlib: 3.7.2
watermark : 2.4.2
numpy     : 1.23.5



Load default figure style

In [3]:
plt.style.use('d4sci.mplstyle')
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

# Text to Code

In [4]:
openai.api_key = os.getenv("OPENAI_API_KEY")

In [5]:
messages = [
        {"role": "system", "content": """You are a grumpy but expert Python programmer 
        who is interviewing for a job. Please be as concise with your answers as possible."""},
        {"role": "user", "content": """Create a recursive Python function to compute 
        Fibonacci numbers. Don't provide any explanation, just the code"""},
  ]

In [6]:
response = openai.ChatCompletion.create(
    # GPT-3.5-Turbo now incorporates the Codex functionality
    model="gpt-3.5-turbo",
    messages=messages,
    temperature=0,
    max_tokens=1024
)

This produces the expected result:

In [7]:
print(response["choices"][0]['message']["content"])

def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)


The generated function works as expected:

In [8]:
def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

In [9]:
fibonacci(10)

55

Let us define a utility function to make sequential queries easier:

In [10]:
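def chat(messages, prompt):
    """Send `prompt` as the next user message and record the reply in `messages`."""
    # Add the new user prompt to the running conversation (mutates `messages`)
    messages.append({"role":"user", "content":prompt})
    
    response = openai.ChatCompletion.create(
        # GPT-3.5-Turbo now incorporates the Codex functionality
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=0,
        max_tokens=1024
    )
    
    print(response["choices"][0]['message']['content'])

    # Keep the assistant's reply so follow-up prompts see the full context
    messages.append(response["choices"][0]['message'])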
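The utility above works by growing a single `messages` list: each call appends the user prompt, then the assistant's reply. The sketch below illustrates that bookkeeping without calling the API. The names `chat_with`, `fake_complete`, and `history` are hypothetical, and `fake_complete` is only a stand-in that mimics the shape of a chat completion response. It also checks `finish_reason`, which is `"length"` when a reply is cut off at `max_tokens`.

```python
def chat_with(messages, prompt, complete):
    """Append the prompt, call `complete`, and record its reply in place."""
    messages.append({"role": "user", "content": prompt})
    response = complete(messages)

    choice = response["choices"][0]
    reply = choice["message"]
    messages.append({"role": reply["role"], "content": reply["content"]})

    # finish_reason == "length" signals that the reply hit the max_tokens limit
    truncated = choice["finish_reason"] == "length"
    return reply["content"], truncated


def fake_complete(messages):
    """Hypothetical stand-in for the API call: echoes the last user prompt."""
    return {
        "choices": [{
            "message": {"role": "assistant",
                        "content": "echo: " + messages[-1]["content"]},
            "finish_reason": "stop",
        }]
    }


history = [{"role": "system", "content": "You are a terse assistant."}]
text, truncated = chat_with(history, "Hello", fake_complete)
text2, _ = chat_with(history, "Again", fake_complete)

# One system message plus two user/assistant pairs
print([m["role"] for m in history])
```

Passing the completion function as an argument is a design choice for this sketch only; it keeps the history logic testable without network access.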

# Adding comments

In [11]:
chat(messages, "Can you add comments to this function?")

Sure, here's the code with comments:

```python
def fibonacci(n):
    # Base case: if n is 0 or 1, return n
    if n <= 1:
        return n
    
    # Recursive case: compute the Fibonacci number by summing the previous two numbers
    return fibonacci(n-1) + fibonacci(n-2)
```
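The reply above wraps the code in a Markdown fence and adds a sentence of prose, so it cannot be executed as is. A minimal sketch of pulling the code out with a regular expression follows. The names `extract_code`, `CODE_BLOCK`, and the sample `reply` are hypothetical, and the pattern assumes well-formed, non-nested fences.

```python
import re

# The fence is built programmatically so this example nests cleanly in Markdown
FENCE = "`" * 3
CODE_BLOCK = re.compile(r"`{3}[\w+-]*[ \t]*\n(.*?)`{3}", re.DOTALL)


def extract_code(reply):
    """Return the contents of every fenced block found in a model reply."""
    return [block.strip("\n") for block in CODE_BLOCK.findall(reply)]


# Hypothetical reply with the same shape as the output above
reply = (
    "Sure, here's the code:\n\n"
    + FENCE + "python\n"
    + "def double(x):\n    return 2 * x\n"
    + FENCE + "\n"
)

blocks = extract_code(reply)

namespace = {}
exec(blocks[0], namespace)  # only execute generated code after reviewing it
print(namespace["double"](21))
```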


In [12]:
chat(messages, "What is the purpose of recursion in this piece of code?")

The purpose of recursion in this code is to compute Fibonacci numbers. The function calls itself with smaller values of `n` until it reaches the base case (when `n` is 0 or 1), and then it returns the corresponding Fibonacci number. By recursively calling the function with smaller values, we can build up the Fibonacci sequence.
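The explanation above is correct, but it does not mention the cost: the naive recursion recomputes the same values many times. The sketch below counts the calls and compares against a memoized variant using `functools.lru_cache`. The names `fibonacci_counted`, `fibonacci_cached`, and `calls` are hypothetical and not part of the generated code.

```python
from functools import lru_cache

calls = 0


def fibonacci_counted(n):
    """Same recursion as above, instrumented to count how often it runs."""
    global calls
    calls += 1
    if n <= 1:
        return n
    return fibonacci_counted(n - 1) + fibonacci_counted(n - 2)


@lru_cache(maxsize=None)
def fibonacci_cached(n):
    """Memoized variant: each distinct n is computed only once."""
    if n <= 1:
        return n
    return fibonacci_cached(n - 1) + fibonacci_cached(n - 2)


result = fibonacci_counted(10)
cached = fibonacci_cached(10)
misses = fibonacci_cached.cache_info().misses

print(result, calls)    # 55 177
print(cached, misses)   # 55 11
```

The naive version needs 177 calls to compute the tenth Fibonacci number, while the cached version evaluates each of the 11 distinct arguments once.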


# Explaining Code

Let's use a relatively small Python script:

In [13]:
with open("data/EpiModel.py") as fp:
    code_text = fp.read()

In [14]:
print(code_text)

### -*- mode: python; -*-
# @file EpiModel.py
# @author Bruno Goncalves
######################################################

import networkx as nx
import numpy as np
from numpy import linalg
from numpy import random
import scipy.integrate
import pandas as pd
import matplotlib.pyplot as plt

from tqdm import tqdm
tqdm.pandas()

class EpiModel(object):
    """Simple Epidemic Model Implementation
    
        Provides a way to implement and numerically integrate 
    """
    def __init__(self, compartments=None):
        self.transitions = nx.MultiDiGraph()
        self.seasonality = None
        
        if compartments is not None:
            self.transitions.add_nodes_from([comp for comp in compartments])
    
    def add_interaction(self, source, target, agent, rate):        
        self.transitions.add_edge(source, target, agent=agent, rate=rate)        
        
    def add_spontaneous(self, source, target, rate):
        self.transitions.add_edge(source, target, rate=rate)



In [15]:
chat(messages, "Please explain what this piece of code does: ```%s```" % code_text)

This code defines a class called `EpiModel` that represents a simple epidemic model. It provides methods to add interactions, spontaneous transitions, and vaccination to the model. It also includes methods to simulate and numerically integrate the model. The code also includes a main block that demonstrates the usage of the `EpiModel` class by creating an instance of the class, adding interactions and transitions, and then simulating and plotting the results.


In [16]:
chat(messages, "Can you please add a doc string to each function and method?")

Certainly! Here's the code with added docstrings:

```python
import networkx as nx
import numpy as np
from numpy import linalg
from numpy import random
import scipy.integrate
import pandas as pd
import matplotlib.pyplot as plt

from tqdm import tqdm
tqdm.pandas()

class EpiModel(object):
    """Simple Epidemic Model Implementation
    
    Provides a way to implement and numerically integrate an epidemic model.
    """
    def __init__(self, compartments=None):
        """Initialize the EpiModel object.
        
        Args:
            compartments (list): List of compartment names.
        """
        self.transitions = nx.MultiDiGraph()
        self.seasonality = None
        
        if compartments is not None:
            self.transitions.add_nodes_from([comp for comp in compartments])
    
    def add_interaction(self, source, target, agent, rate):        
        """Add an interaction between two compartments.
        
        Args:
            source (str): Name of the source
```

# Interacting with a database

Let us open a small test database. This file was downloaded from https://github.com/chineseballer06/Statistical-Analysis-of-Northwind-Database/blob/master/Northwind_small.sqlite

In [17]:
con = sqlite3.connect("data/Northwind_small.sqlite")

In [18]:
messages = [
    {"role": "system", "content": """You're a Database Administrator. 
    Please generate SQL queries to answer the following questions. 
    No comments are necessary."""},
    {"role": "user", "content": """
# Table Employee, columns = [Id, LastName, FirstName]
# Table Shipper, columns = [Id, CompanyName, Phone]
# Table OrderDetail, columns = [OrderId, ProductId, Quantity]
# Table EmployeeTerritory, columns = [Id, EmployeeId, TerritoryId]
    """},
]
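The schema description in the prompt above was typed by hand, which is where mistakes in column names can creep in. As a sketch, the same text could be generated from the database itself using SQLite's `sqlite_master` table and `PRAGMA table_info`. The function name `describe_schema` and the in-memory `demo` database are hypothetical; the demo tables only mimic two of the tables listed above.

```python
import sqlite3


def describe_schema(con):
    """Return one '# Table <name>, columns = [...]' line per user table."""
    tables = con.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()

    lines = []
    for (name,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk)
        columns = [row[1] for row in con.execute(f'PRAGMA table_info("{name}")')]
        lines.append(f"# Table {name}, columns = [{', '.join(columns)}]")
    return "\n".join(lines)


# Hypothetical in-memory database standing in for the real file
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Shipper (Id INTEGER PRIMARY KEY, CompanyName TEXT, Phone TEXT)")
demo.execute("CREATE TABLE Employee (Id INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT)")

schema_text = describe_schema(demo)
print(schema_text)
```

The resulting string could be used as the content of the first user message. For a large database it may be preferable to list only the tables relevant to the questions being asked.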

In [19]:
chat(messages, "Generate a table with employee first name, last name and territory id")

SELECT e.FirstName, e.LastName, et.TerritoryId
FROM Employee e
JOIN EmployeeTerritory et ON e.Id = et.EmployeeId;


In [20]:
pd.read_sql( """SELECT Employee.FirstName, Employee.LastName, EmployeeTerritory.TerritoryId
FROM Employee
JOIN EmployeeTerritory ON Employee.Id = EmployeeTerritory.EmployeeId;
""", con)

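|   | FirstName | LastName | TerritoryId |
|---|-----------|----------|-------------|
| 0 | Nancy | Davolio | 6897 |
| 1 | Nancy | Davolio | 19713 |
| 2 | Andrew | Fuller | 1581 |
| 3 | Andrew | Fuller | 1730 |
| 4 | Andrew | Fuller | 1833 |
| 5 | Andrew | Fuller | 2116 |
| 6 | Andrew | Fuller | 2139 |
| 7 | Andrew | Fuller | 2184 |
| 8 | Andrew | Fuller | 40222 |
| 9 | Janet | Leverling | 30346 |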
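In the cell above the generated SQL was copied into `pd.read_sql` by hand. If model output were ever executed automatically, a guard of some kind would be advisable. The sketch below shows one coarse option: accept only a single `SELECT` statement. The function `run_select` and the `demo` database with its made-up rows are hypothetical, and this string check is not a security boundary (it also rejects valid queries whose string literals contain a semicolon).

```python
import sqlite3


def run_select(con, sql):
    """Run a single SELECT statement and return (column names, rows)."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise ValueError("expected a single statement")
    if not statement.upper().startswith("SELECT"):
        raise ValueError("only SELECT queries are allowed")

    cursor = con.execute(statement)
    columns = [col[0] for col in cursor.description]
    return columns, cursor.fetchall()


# Hypothetical in-memory table with made-up rows
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Shipper (Id INTEGER PRIMARY KEY, CompanyName TEXT)")
demo.executemany("INSERT INTO Shipper (CompanyName) VALUES (?)",
                 [("A",), ("B",), ("C",)])

columns, rows = run_select(demo, "SELECT COUNT(*) AS ShipperCount FROM Shipper;")
print(columns, rows)
```

Opening the database file in read-only mode would be a stronger complement to this check.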


In [21]:
chat(messages, "Compute how many employees work in each territory")

SELECT et.TerritoryId, COUNT(*) AS EmployeeCount
FROM EmployeeTerritory et
GROUP BY et.TerritoryId;


In [22]:
pd.read_sql( """SELECT TerritoryId, COUNT(EmployeeId) AS EmployeeCount
FROM EmployeeTerritory
GROUP BY TerritoryId;
""", con)

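|   | TerritoryId | EmployeeCount |
|---|-------------|---------------|
| 0 | 1581 | 1 |
| 1 | 1730 | 1 |
| 2 | 1833 | 1 |
| 3 | 2116 | 1 |
| 4 | 2139 | 1 |
| 5 | 2184 | 1 |
| 6 | 2903 | 1 |
| 7 | 3049 | 1 |
| 8 | 3801 | 1 |
| 9 | 6897 | 1 |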
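The query returned by the model uses `COUNT(*)`, while the one executed above uses `COUNT(EmployeeId)`. The two agree whenever `EmployeeId` is never NULL, but they are not equivalent in general: `COUNT(*)` counts rows, whereas `COUNT(column)` skips NULL values. The sketch below shows the difference on a hypothetical in-memory table with made-up rows, one of which has no employee assigned.

```python
import sqlite3

# Hypothetical data: territory T2 has a row without an employee
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE EmployeeTerritory (Id TEXT, EmployeeId INTEGER, TerritoryId TEXT)"
)
demo.executemany(
    "INSERT INTO EmployeeTerritory VALUES (?, ?, ?)",
    [("a", 1, "T1"), ("b", 2, "T1"), ("c", None, "T2")],
)

rows = demo.execute(
    """SELECT TerritoryId,
              COUNT(*) AS AllRows,
              COUNT(EmployeeId) AS NonNullEmployees
       FROM EmployeeTerritory
       GROUP BY TerritoryId
       ORDER BY TerritoryId"""
).fetchall()

print(rows)  # [('T1', 2, 2), ('T2', 1, 0)]
```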


In [23]:
chat(messages, "How many shippers do we work with?")

SELECT COUNT(*) AS ShipperCount
FROM Shipper;


In [24]:
pd.read_sql( """SELECT COUNT(*) AS ShipperCount
FROM Shipper;
""", con)

|   | ShipperCount |
|---|--------------|
| 0 | 3 |


<center>
     <img src="https://raw.githubusercontent.com/DataForScience/Networks/master/data/D4Sci_logo_full.png" alt="Data For Science, Inc" align="center" border="0" width=300px> 
</center>