# OpenAI Chatbot with TAWA data (v3)

This TAWA output is from TAR 295, which modelled several single-tier FTC changes, one of which was chosen for Budget 22.

In this version we have reintroduced the margin of error for the poverty estimates.

In [1]:
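from aipy.ai import *  # provides Chat
import openai  # explicit import; openai.Model.list() below is the legacy (pre-1.0) client API
import pandas as pd
import yaml
import jinja2
import tiktoken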



## Available models

In [2]:
models = openai.Model.list().data

### GPT-4

In [3]:
[m['id'] for m in models if m['id'].startswith('gpt-4')]

['gpt-4-1106-preview',
 'gpt-4-vision-preview',
 'gpt-4',
 'gpt-4-0314',
 'gpt-4-0613']

### GPT-3.5

In [4]:
[m['id'] for m in models if m['id'].startswith('gpt-3.5')]

['gpt-3.5-turbo-instruct',
 'gpt-3.5-turbo-instruct-0914',
 'gpt-3.5-turbo-0613',
 'gpt-3.5-turbo-0301',
 'gpt-3.5-turbo',
 'gpt-3.5-turbo-16k-0613',
 'gpt-3.5-turbo-16k',
 'gpt-3.5-turbo-1106']

## Load data

`tawa_details` contains the tax year and the reform scenario descriptions.

In [5]:
with open('tawa_details.yaml') as f:
    # safe_load is sufficient for a plain configuration file
    tawa_details = yaml.safe_load(f)


In [6]:
tawa = pd.read_csv("input/tawa.csv")

tawa.drop(columns=['Name', 'Rounding_Rule', 'Population'], inplace=True)

In [7]:
fiscals = tawa[
    (tawa.Topic == 'Fiscals') & (tawa.Variable == 'Disposable_Income')][
        ['Scenario', 'Value']]

In [8]:
wnl_cols = [
    'Scenario', 'Variable', 'Winner_Loser', 'Eq_DI_Quantile','Value']
wnl_vars = ['Mean_Weekly_Change', 'Population_In_Category']

wnl = tawa[
    (tawa.Topic == "WnL") & tawa.Variable.isin(wnl_vars) & 
    (tawa.WnL_Group == "All Households")][wnl_cols]
wnl.Eq_DI_Quantile = pd.Categorical(
    wnl.Eq_DI_Quantile, categories=[str(i) for i in range(1, 11)]+['All'], 
    ordered=True)
wnl.Winner_Loser = pd.Categorical(
    wnl.Winner_Loser, categories=['Winners', 'Losers', 'All'], ordered=True)


In [9]:
wnl_wide = wnl.pivot_table(
    index=['Scenario', 'Eq_DI_Quantile'], columns=['Winner_Loser', 'Variable'],
    values='Value', aggfunc='first')
wnl_wide.columns = ['_'.join(col).strip() for col in wnl_wide.columns.values]
wnl_wide.reset_index(inplace=True)

In [11]:
poverty = tawa[
    (tawa.Topic == "Poverty") & (tawa.Variable == "Change_In_Population_In_Poverty")
    & (tawa.Population_Type == "Children")][
        ['Scenario', 'Poverty_Type', 'Value', 'Margin_Of_Error']]
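
The priming cell in the next section passes `poverty_wide` to the template, but none of the cells shown here defines it. Below is a minimal sketch of one way to derive it from `poverty`, assuming the template expects one row per scenario with a value column and a margin-of-error column for each poverty type. The toy frame, its scenario names and numbers, and the flattened column names are illustrative assumptions, not the notebook's actual data or the template's confirmed inputs.

```python
import pandas as pd

# Toy stand-in for the `poverty` frame built above; all values are made up.
poverty = pd.DataFrame({
    "Scenario": ["A", "A", "B", "B"],
    "Poverty_Type": ["BHC50", "AHC50", "BHC50", "AHC50"],
    "Value": [-1000, -2000, -3000, -4000],
    "Margin_Of_Error": [500, 600, 700, 800],
})

# One row per scenario; one column per (measure, poverty type) pair.
poverty_wide = poverty.pivot(
    index="Scenario", columns="Poverty_Type",
    values=["Value", "Margin_Of_Error"])

# Flatten the two-level column index, e.g. ('Value', 'BHC50') -> 'Value_BHC50'.
poverty_wide.columns = ["_".join(col) for col in poverty_wide.columns]
poverty_wide = poverty_wide.reset_index()

print(poverty_wide)
```

Keeping the margin of error next to each estimate in the same row lets the template present the two together for every scenario.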

## Prime model

In [155]:
with open('tawa_priming.jinja2') as f:
    priming_template = jinja2.Template(f.read())

In [156]:
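chat = Chat()
# NOTE: `poverty_wide` is not defined in the cells shown above (only `poverty` is);
# it must already exist in the kernel for this cell to run.
chat.add_context(
    priming_template.render(
        tawa_details=tawa_details, fiscals=fiscals, wnl_wide=wnl_wide, 
        poverty_wide=poverty_wide))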

Display the number of tokens in the context before asking any questions.

In [157]:
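def num_tokens_from_string(string: str, model_name: str) -> int:
    """Return the number of tokens in `string` under the encoding used by `model_name`."""
    # encoding_for_model expects a model name (e.g. 'gpt-4'), not an encoding name
    encoding = tiktoken.encoding_for_model(model_name)
    return len(encoding.encode(string))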



In [158]:
content = '\n'.join([m['content'] for m in chat.messages])

In [159]:
print(f"Number of tokens: {num_tokens_from_string(content, 'gpt-4')}")

Number of tokens: 2169


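At roughly 2,200 tokens, the context fits within the window of every chat model listed above, including `gpt-3.5-turbo-0613` (4,096 tokens), so we have several options for which model to use.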
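
As a rough illustration of that choice, the sketch below checks a prompt size against the context windows of a few of the listed models. The window sizes are the figures OpenAI published for these models when the lists above were captured and should be re-checked before use. `models_that_fit` and the `reply_budget` allowance are hypothetical names introduced here, not part of `aipy` or the OpenAI client.

```python
# Context window sizes in tokens (assumed from OpenAI's published figures
# for these model names at the time; verify before relying on them).
CONTEXT_WINDOWS = {
    "gpt-3.5-turbo": 4096,
    "gpt-3.5-turbo-16k": 16384,
    "gpt-4": 8192,
    "gpt-4-1106-preview": 128000,
}


def models_that_fit(prompt_tokens, reply_budget=500):
    """Return models whose window holds the prompt plus room for a reply.

    `reply_budget` is an illustrative allowance for the model's answer.
    """
    needed = prompt_tokens + reply_budget
    return sorted(
        model for model, window in CONTEXT_WINDOWS.items() if needed <= window)


# 2169 is the token count reported for the primed context above.
candidates = models_that_fit(2169)
print(candidates)
```

The context grows with every question and answer, so the margin left in the smaller windows shrinks as the conversation goes on.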

## Ask some questions

In [160]:
chat.ask(
    "From a value for money perspective, which reform best supports low income households?",
    model="gpt-4")

'The best reform to support low income households from a value for money perspective appears to be FTC7.5a42.7k27. This is because it has the second highest fiscal cost of 145 million dollars but is noticeably more effective at reducing poverty. It reduces BHC by 7000 and AHC50 Fixed by 11000, which are the highest reductions among all reforms. It also leads to the highest mean weekly increase for the poorest households (16 dollars for 1st quantile) while only slightly causing losers in the higher quantiles. Its overall mean weekly change is also quite high at 2 dollars. So it seems to provide the most substantial support for lower-income groups with relatively considerable cost.'

In [161]:
print(chat.messages[-1]['content'])

The best reform to support low income households from a value for money perspective appears to be FTC7.5a42.7k27. This is because it has the second highest fiscal cost of 145 million dollars but is noticeably more effective at reducing poverty. It reduces BHC by 7000 and AHC50 Fixed by 11000, which are the highest reductions among all reforms. It also leads to the highest mean weekly increase for the poorest households (16 dollars for 1st quantile) while only slightly causing losers in the higher quantiles. Its overall mean weekly change is also quite high at 2 dollars. So it seems to provide the most substantial support for lower-income groups with relatively considerable cost.
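
The answer above argues value for money without relating impact to cost. One simple way to check such a claim is to compute the poverty reduction per million dollars. This is a minimal sketch using only the figures quoted in the model's answer, which have not been verified against `tawa.csv`. `reduction_per_million` is a hypothetical helper, and the ratio is just one possible reading of "value for money".

```python
def reduction_per_million(poverty_reduction, fiscal_cost_millions):
    """Reduction in the number of children in poverty per $1m of fiscal cost."""
    if fiscal_cost_millions <= 0:
        raise ValueError("fiscal cost must be positive")
    return poverty_reduction / fiscal_cost_millions


# Figures as quoted by the model for FTC7.5a42.7k27 (unverified).
cost_millions = 145
quoted_reductions = {"BHC": 7000, "AHC50 Fixed": 11000}

ratios = {
    measure: reduction_per_million(reduction, cost_millions)
    for measure, reduction in quoted_reductions.items()
}

for measure, ratio in ratios.items():
    print(f"{measure}: {ratio:.1f} fewer children in poverty per $1m")
```

Running the same calculation for every scenario in `fiscals` and `poverty` would show whether the reform the model picked really has the best ratio.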


In [162]:
chat.ask("which reform is the most tightly targeted to low incomes?", model="gpt-4")

'The reform that appears to be most tightly targeted to low incomes is FTC5a42.7k27. The mean weekly income change for winning households in quantile 1 (lowest income) is 10 dollars and it gradually decreases to 5 dollars for quantile 6. This reform also starts introducing losers from quantile 3 onwards with the losers mean weekly change incrementally worsening as we move up income quantiles. The fiscal cost of this reform is also significantly lower at 68 million dollars. This indicates that this reform is offering a more focused support to lower-income households while also drawing back support from higher-income ones. Therefore, it could be seen as more tightly targeted to low incomes.'

In [163]:
print(chat.messages[-1]['content'])

The reform that appears to be most tightly targeted to low incomes is FTC5a42.7k27. The mean weekly income change for winning households in quantile 1 (lowest income) is 10 dollars and it gradually decreases to 5 dollars for quantile 6. This reform also starts introducing losers from quantile 3 onwards with the losers mean weekly change incrementally worsening as we move up income quantiles. The fiscal cost of this reform is also significantly lower at 68 million dollars. This indicates that this reform is offering a more focused support to lower-income households while also drawing back support from higher-income ones. Therefore, it could be seen as more tightly targeted to low incomes.


Finally, check the token count of the whole conversation.

In [164]:
content = '\n'.join([m['content'] for m in chat.messages])
print(f"Number of tokens: {num_tokens_from_string(content, 'gpt-4')}")

Number of tokens: 2481


In [166]:
chat.write_txt(path='output/tawa_ai.txt')

Saved chat log to output/tawa_ai.txt
