# OpenAI Chatbot with TAWA data (v3)

This TAWA output is from TAR 295, which modelled several single-tier Family Tax Credit (FTC) changes, one of which was chosen for Budget 22.

In this version we have reintroduced the margin of error for poverty estimates.

In [1]:
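from aipy.ai import *  # provides Chat (and, via the star import, openai)
import openai  # imported explicitly because openai.Model.list() is called below
import pandas as pd
import yaml
import jinja2
import tiktoken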



## Available models

In [2]:
models = openai.Model.list().data

### GPT-4

In [3]:
[m['id'] for m in models if m['id'].startswith('gpt-4')]

['gpt-4-1106-preview',
 'gpt-4-vision-preview',
 'gpt-4',
 'gpt-4-0314',
 'gpt-4-0613']

### GPT-3.5

In [4]:
[m['id'] for m in models if m['id'].startswith('gpt-3.5')]

['gpt-3.5-turbo-instruct',
 'gpt-3.5-turbo-instruct-0914',
 'gpt-3.5-turbo-0613',
 'gpt-3.5-turbo-0301',
 'gpt-3.5-turbo',
 'gpt-3.5-turbo-16k-0613',
 'gpt-3.5-turbo-16k',
 'gpt-3.5-turbo-1106']

## Load data

`tawa_details` contains the tax year and the reform scenario descriptions.

In [5]:
with open('tawa_details.yaml') as f:
    # safe_load only builds plain Python objects, which is all a details file needs
    tawa_details = yaml.safe_load(f)


In [6]:
tawa = pd.read_csv("input/tawa.csv")

tawa.drop(columns=['Name', 'Rounding_Rule', 'Population'], inplace=True)

In [7]:
fiscals = tawa[
    (tawa.Topic == 'Fiscals') & (tawa.Variable == 'Disposable_Income')][
        ['Scenario', 'Value']]

In [8]:
wnl_cols = [
    'Scenario', 'Variable', 'Winner_Loser', 'Eq_DI_Quantile','Value']
wnl_vars = ['Mean_Weekly_Change', 'Population_In_Category']

wnl = tawa[
    (tawa.Topic == "WnL") & tawa.Variable.isin(wnl_vars) & 
    (tawa.WnL_Group == "All Households")][wnl_cols]
wnl.Eq_DI_Quantile = pd.Categorical(
    wnl.Eq_DI_Quantile, categories=[str(i) for i in range(1, 11)]+['All'], 
    ordered=True)
wnl.Winner_Loser = pd.Categorical(
    wnl.Winner_Loser, categories=['Winners', 'Losers', 'All'], ordered=True)


In [9]:
wnl_wide = wnl.pivot_table(
    index=['Scenario', 'Eq_DI_Quantile'],
    columns=['Winner_Loser', 'Variable'],
    values='Value', aggfunc='first')
wnl_wide.columns = ['_'.join(col).strip() for col in wnl_wide.columns.values]
wnl_wide.reset_index(inplace=True)
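
To make the reshaping step concrete, here is a minimal sketch on a toy frame. The rows and values are made up for illustration and are not taken from the TAWA output; only the column names mirror the ones used above.

```python
import pandas as pd

# Hypothetical long-format rows (illustrative values only).
toy = pd.DataFrame({
    "Scenario": ["S1"] * 4,
    "Eq_DI_Quantile": ["1"] * 4,
    "Winner_Loser": ["Winners", "Winners", "Losers", "Losers"],
    "Variable": ["Mean_Weekly_Change", "Population_In_Category"] * 2,
    "Value": [10.0, 5000.0, -3.0, 200.0],
})

# One row per (Scenario, Eq_DI_Quantile); one column per
# (Winner_Loser, Variable) pair.
toy_wide = toy.pivot_table(
    index=["Scenario", "Eq_DI_Quantile"],
    columns=["Winner_Loser", "Variable"],
    values="Value",
    aggfunc="first",
)

# Flatten the two-level column index into names such as
# "Winners_Mean_Weekly_Change".
toy_wide.columns = ["_".join(col) for col in toy_wide.columns]
toy_wide = toy_wide.reset_index()

print(sorted(toy_wide.columns))
```

The flattened names are easier to refer to from a text template than tuple-valued column labels.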

In [11]:
poverty = tawa[
    (tawa.Topic == "Poverty") & (tawa.Variable == "Change_In_Population_In_Poverty")
    & (tawa.Population_Type == "Children")][
        ['Scenario', 'Poverty_Type', 'Value', 'Margin_Of_Error']]
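
As a rough illustration of why the margin of error matters, the sketch below flags estimates whose interval excludes zero. It assumes `Margin_Of_Error` is a symmetric half-width in the same units as `Value`, which the source data does not confirm, and the numbers are invented for the example.

```python
import pandas as pd

# Hypothetical poverty estimates (not real TAWA results).
toy_poverty = pd.DataFrame({
    "Scenario": ["S1", "S2"],
    "Poverty_Type": ["AHC50", "AHC50"],
    "Value": [-6000, -1500],
    "Margin_Of_Error": [4000, 4000],
})

# Assumption: the interval is Value +/- Margin_Of_Error.
toy_poverty["Lower"] = toy_poverty.Value - toy_poverty.Margin_Of_Error
toy_poverty["Upper"] = toy_poverty.Value + toy_poverty.Margin_Of_Error

# A change is only distinguishable from zero when its size
# exceeds the margin of error.
toy_poverty["Excludes_Zero"] = (
    toy_poverty.Value.abs() > toy_poverty.Margin_Of_Error)

print(toy_poverty)
```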

## Prime model

In [13]:
with open('tawa_priming.jinja2') as f:
    priming_template = jinja2.Template(f.read())

In [14]:
chat = Chat()
chat.add_context(
    priming_template.render(
        tawa_details=tawa_details, fiscals=fiscals, wnl_wide=wnl_wide, 
        poverty=poverty))
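
The contents of `tawa_priming.jinja2` are not shown here. As a sketch of how such a template could turn a data frame into priming text, the example below uses a hypothetical inline template, a hypothetical `tax_year` key, and made-up values.

```python
import jinja2
import pandas as pd

# Hypothetical template; the real tawa_priming.jinja2 may look quite different.
toy_template = jinja2.Template(
    "Tax year: {{ details.tax_year }}\n"
    "Fiscal cost by scenario:\n"
    "{% for row in fiscal_rows %}"
    "- {{ row.Scenario }}: {{ row.Value }}\n"
    "{% endfor %}"
)

# Made-up inputs standing in for tawa_details and fiscals.
toy_details = {"tax_year": 2023}
toy_fiscals = pd.DataFrame({"Scenario": ["S1", "S2"], "Value": [70, 120]})

# Convert the frame to a list of dicts so the template can loop over rows.
toy_context = toy_template.render(
    details=toy_details,
    fiscal_rows=toy_fiscals.to_dict("records"),
)
print(toy_context)
```

Rendering to plain text keeps the priming message readable, so it can be inspected before it is sent to the model.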

Display the number of tokens in the context before asking any questions.

In [15]:
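def num_tokens_from_string(string: str, model_name: str) -> int:
    """Count the tokens in `string` using the tokenizer for `model_name`."""
    # encoding_for_model expects a model name (e.g. 'gpt-4'), not an encoding name
    encoding = tiktoken.encoding_for_model(model_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens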



In [16]:
content = '\n'.join([m['content'] for m in chat.messages])

In [17]:
print(f"Number of tokens: {num_tokens_from_string(content, 'gpt-4')}")

Number of tokens: 2309


At roughly 2,300 tokens, the context is small enough to leave us several options for which model to use.
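
One way to check which models can hold the context is to compare the token count against each model's context window. The sketch below is illustrative: the limits in `CONTEXT_LIMITS` are assumptions based on OpenAI's published figures and should be verified, and `models_that_fit` and `reply_budget` are hypothetical names, not part of `aipy`.

```python
# Assumed context window sizes in tokens; verify against current documentation.
CONTEXT_LIMITS = {
    "gpt-3.5-turbo": 4096,
    "gpt-4": 8192,
    "gpt-3.5-turbo-16k": 16385,
    "gpt-4-1106-preview": 128000,
}


def models_that_fit(n_tokens, reply_budget=500, limits=CONTEXT_LIMITS):
    """Return the models whose window holds the context plus a reply.

    reply_budget is a hypothetical allowance for questions and answers.
    """
    needed = n_tokens + reply_budget
    return [model for model, limit in limits.items() if needed <= limit]


print(models_that_fit(2309))
```

A larger `reply_budget` is worth considering for longer conversations, since every question and answer is added to the context.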

## Ask some questions

In [18]:
chat.ask(
    "From a value for money perspective, which reform best supports low income households?",
    model = "gpt-4")

'The "FTC5a42.7k27" reform provides the best value for money in supporting low-income households. This reform increases the Family Tax Credit (FTC) rate by 5 dollars per week and increases the abatement rate to 27%. It also costs 68 million dollars, considerably less than all other reforms.\n\nFor deciles 1 to 3 of equivalised disposable income (the lowest income households), winners from this reform experience weekly income increases of 10, 9, and 8 dollars respectively. Some households (particularly in deciles 3) do lose from this reform, but the mean weekly change for all households in these deciles remains positive. \n\nMoreover, this reform reduces both fixed-line AHC50 and moving-line BHC50 child poverty measures by 8000 and 5000 children respectively, considering the margins of error. This is a considerable achievement given the lower fiscal cost.'

In [19]:
print(chat.messages[-1]['content'])

The "FTC5a42.7k27" reform provides the best value for money in supporting low-income households. This reform increases the Family Tax Credit (FTC) rate by 5 dollars per week and increases the abatement rate to 27%. It also costs 68 million dollars, considerably less than all other reforms.

For deciles 1 to 3 of equivalised disposable income (the lowest income households), winners from this reform experience weekly income increases of 10, 9, and 8 dollars respectively. Some households (particularly in deciles 3) do lose from this reform, but the mean weekly change for all households in these deciles remains positive. 

Moreover, this reform reduces both fixed-line AHC50 and moving-line BHC50 child poverty measures by 8000 and 5000 children respectively, considering the margins of error. This is a considerable achievement given the lower fiscal cost.


In [20]:
chat.ask("which reform is the most tightly targeted to low incomes?", model = "gpt-4")

'The "FTC5a42.7k27" reform is the most tightly targeted to low incomes. \n\nFor the first three deciles of equivalised disposable income (the lowest income groups), this reform provides weekly income increases of 10, 9, and 8 dollars respectively. This implies that the most substantial gains are seen by the lower-income households, thereby indicating that this reform is directed tightly at this income group.\n\nHowever, this reform also results in more losers amongst these initial deciles compared to the other reforms, with households in the third decile losing on average 5 dollars a week. By the fourth decile, the number of households losing from this reform exceeds those gaining and the weekly losses also grow larger, indicating that this reform targets support below this level.\n\nThis reform reduces child poverty, measured by the fixed-line AHC50 and moving-line BHC50 definitions, by 8000 and 5000 children respectively, considering the margins of error. Again, this shows the reform

In [21]:
print(chat.messages[-1]['content'])

The "FTC5a42.7k27" reform is the most tightly targeted to low incomes. 

For the first three deciles of equivalised disposable income (the lowest income groups), this reform provides weekly income increases of 10, 9, and 8 dollars respectively. This implies that the most substantial gains are seen by the lower-income households, thereby indicating that this reform is directed tightly at this income group.

However, this reform also results in more losers amongst these initial deciles compared to the other reforms, with households in the third decile losing on average 5 dollars a week. By the fourth decile, the number of households losing from this reform exceeds those gaining and the weekly losses also grow larger, indicating that this reform targets support below this level.

This reform reduces child poverty, measured by the fixed-line AHC50 and moving-line BHC50 definitions, by 8000 and 5000 children respectively, considering the margins of error. Again, this shows the reform is eff

Finally, check the number of tokens in the conversation after the questions.

In [22]:
content = '\n'.join([m['content'] for m in chat.messages])
print(f"Number of tokens: {num_tokens_from_string(content, 'gpt-4')}")

Number of tokens: 2736


In [23]:
chat.write_txt(path='output/tawa_ai.txt')

Saved chat log to output/tawa_ai.txt
