This notebook provides the following implementations:
* Data preprocessing
* Model Inference

# Load necessary modules

In [1]:
import pandas as pd
import numpy as np
import os, json, re, torch, random
from utils import *
from sklearn.linear_model import LogisticRegression
from functools import partial

# COMPAS dataset pre-processing

First, we preview the COMPAS dataset.

In [2]:
compas = pd.read_csv(os.path.join('compas', 'compas-scores-two-years.csv'))
compas.head()

| | id | name | first | last | compas_screening_date | sex | dob | age | age_cat | race | ... | v_decile_score | v_score_text | v_screening_date | in_custody | out_custody | priors_count.1 | start | end | event | two_year_recid |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | miguel hernandez | miguel | hernandez | 2013-08-14 | Male | 1947-04-18 | 69 | Greater than 45 | Other | ... | 1 | Low | 2013-08-14 | 2014-07-07 | 2014-07-14 | 0 | 0 | 327 | 0 | 0 |
| 1 | 3 | kevon dixon | kevon | dixon | 2013-01-27 | Male | 1982-01-22 | 34 | 25 - 45 | African-American | ... | 1 | Low | 2013-01-27 | 2013-01-26 | 2013-02-05 | 0 | 9 | 159 | 1 | 1 |
| 2 | 4 | ed philo | ed | philo | 2013-04-14 | Male | 1991-05-14 | 24 | Less than 25 | African-American | ... | 3 | Low | 2013-04-14 | 2013-06-16 | 2013-06-16 | 4 | 0 | 63 | 0 | 1 |
| 3 | 5 | marcu brown | marcu | brown | 2013-01-13 | Male | 1993-01-21 | 23 | Less than 25 | African-American | ... | 6 | Medium | 2013-01-13 | | | 1 | 0 | 1174 | 0 | 0 |
| 4 | 6 | bouthy pierrelouis | bouthy | pierrelouis | 2013-03-26 | Male | 1973-01-22 | 43 | 25 - 45 | Other | ... | 1 | Low | 2013-03-26 | | | 2 | 0 | 1102 | 0 | 0 |


Let's manually split COMPAS into training set and test set at a ratio of 7:3.

In [3]:
np.random.seed(123)
compas = compas.iloc[np.random.permutation(np.arange(0, len(compas)))]
compas_train = compas[:int(len(compas)*.7)].reset_index(drop = True)
compas_test = compas[int(len(compas)*.7):].reset_index(drop = True)
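
The same shuffle-and-slice logic can be wrapped in a small helper, which makes the split ratio explicit and easy to check. This is a minimal sketch: `shuffle_split` and the toy frame are illustrative names that are not part of the notebook, and a local random generator is used instead of the global seed.

```python
import numpy as np
import pandas as pd

def shuffle_split(df, train_frac=0.7, seed=123):
    """Shuffle the rows of df and split them into train and test frames."""
    rng = np.random.RandomState(seed)             # local generator, global seed untouched
    shuffled = df.iloc[rng.permutation(len(df))]  # random row order
    cut = int(len(df) * train_frac)               # number of training rows
    train = shuffled[:cut].reset_index(drop=True)
    test = shuffled[cut:].reset_index(drop=True)
    return train, test

# Toy data standing in for the COMPAS frame
toy = pd.DataFrame({'id': range(10)})
toy_train, toy_test = shuffle_split(toy)
print(len(toy_train), len(toy_test))
```

Every row ends up in exactly one of the two frames, so the sizes always add up to the original length.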

We can leverage the classification feature of GPT-3 by setting `classification = True` in `compasConverter()` to pre-process the dataset. 

In classification problems, each input in the prompt should be classified into one of the predefined classes. For this type of problem, we recommend:

* Use a separator at the end of the prompt, e.g. `\n\n###\n\n`. Remember to also append this separator when you eventually make requests to your model.
* Choose classes that map to a single token. At inference time, specify `max_tokens=1`, since you only need the first token for classification.
* Ensure that the prompt + completion doesn't exceed 2048 tokens, including the separator.
* Aim for at least ~100 examples per class.
* To get class log probabilities, you can specify `logprobs=5` (for 5 classes) when using your model.
* Ensure that the dataset used for fine-tuning is very similar in structure and type of task to what the model will be used for.
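
The recommendations above translate into a handful of request settings. The sketch below only assembles those settings as a plain dictionary and does not call any API. `build_classification_request` and the model name are hypothetical, and the `###` separator and `@@@` stop sequence are assumed to match the ones written by `compasConverter()`.

```python
SEPARATOR = "###"       # end-of-prompt marker, assumed to match the training data
STOP_SEQUENCE = "@@@"   # end-of-completion marker, assumed to match the training data

def build_classification_request(description, model, n_classes=2):
    """Assemble completion-request settings for a single-token classification."""
    prompt = "%s Will this defendant reoffend in two years? %s" % (description, SEPARATOR)
    return {
        "model": model,          # name of the fine-tuned model
        "prompt": prompt,        # ends with the same separator used in training
        "max_tokens": 1,         # only the first token is needed for the class
        "logprobs": n_classes,   # log probabilities for the most likely tokens
        "stop": STOP_SEQUENCE,
    }

request = build_classification_request(
    "The defendant, a 34-year-old male, was arrested for a felony.",
    "my-fine-tuned-model",       # placeholder model name
)
print(request["prompt"])
```

With two classes ("Yes" and "No"), `logprobs=2` is enough to read off both class probabilities.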

In [4]:
def compasConverter(row, classification = False):
    """
    Convert each row of COMPAS into a sentence. 
    
    Parameters:
    -----------
    row: a row in the pandas.DataFrame
    classification = False: boolean. If classification = True, 
                            the completion is a single class label ("Yes." or "No.") 
                            instead of a full sentence. 
    """
    degree = 'felony' if row['c_charge_degree'] == 'F' else 'misdemeanor'
    race = ' ' + row['race'] if row['race'] != 'Other' else ''
    if isinstance(row['c_arrest_date'], str):
        prompt = "The defendant, a %d-year-old %s%s, was arrested on %s for a %s. The specific charge is %s. The defendant has committed %d juvenile misdemeanors, %d juvenile felonies, %d other juvenile delinquencies, and %d prior convictions for other offenses." % (row['age'], row['sex'].lower(), race, row['c_arrest_date'], degree, row['c_charge_desc'], row['juv_misd_count'], row['juv_fel_count'], row['juv_other_count'], row['priors_count']) 
    else:
        prompt = "The defendant, a %d-year-old %s%s, was arrested for a %s. The specific charge is %s. The defendant has committed %d juvenile misdemeanors, %d juvenile felonies, %d other juvenile delinquencies, and %d prior convictions for other offenses." % (row['age'], row['sex'].lower(), race, degree, row['c_charge_desc'], row['juv_misd_count'], row['juv_fel_count'], row['juv_other_count'], row['priors_count']) 
    
    
    if not classification:
        if row['two_year_recid']:
            completion = 'The defendant is likely to reoffend in two years.'
        else:
            completion = "The defendant is not likely to reoffend in two years."
        return "{\"prompt\": \"%s###\", \"completion\": \" %s@@@\"}" % (prompt, completion)
    else:
        completion = "Yes." if row['two_year_recid'] else "No."
        return "{\"prompt\": \"%s Will this defendant reoffend in two years? ###\", \"completion\": \"%s@@@\"}" % (prompt, completion)
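
Because the converter assembles each JSON line with string formatting, a double quote or backslash inside a field such as `c_charge_desc` would produce a line that is not valid JSON. One possible safeguard, sketched here with a hypothetical `to_jsonl_line` helper and a made-up charge description, is to let `json.dumps` handle the escaping and to confirm that the line parses back.

```python
import json

def to_jsonl_line(prompt, completion):
    """Serialize one prompt/completion pair as a JSON line with proper escaping."""
    return json.dumps({"prompt": prompt + "###", "completion": " " + completion + "@@@"})

# A prompt containing double quotes, which plain string formatting would not escape
line = to_jsonl_line('The specific charge is "Battery".',
                     "The defendant is likely to reoffend in two years.")
record = json.loads(line)   # raises json.JSONDecodeError if the line is malformed
print(record["prompt"])
```

Note that the regex-based `getDict` methods used later assume there are no embedded quotes, so a reader of such files would need `json.loads` as well.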
    
    

If we want to use the generation feature of GPT-3, we can use `compas_gen_train.jsonl`, which is generated by the following code:

In [5]:
jsonl = '\n'.join(compas_train.apply(func = compasConverter, axis = 1).tolist())
with open(os.path.join('data','compas_gen_train.jsonl'), 'w') as outfile:
    outfile.write(jsonl)

In [6]:
jsonl = '\n'.join(compas_test.apply(func = compasConverter, axis = 1).tolist())
with open(os.path.join('data', 'compas_gen_test.jsonl'), 'w') as outfile:
    outfile.write(jsonl)

**Recommended:** If we want to use the classification feature of GPT-3, use `compas_class_train.jsonl`, which is generated by the following code:

In [7]:
jsonl = '\n'.join(compas_train.apply(func = partial(compasConverter, classification = True), axis = 1).tolist())
with open(os.path.join('data', 'compas_class_train.jsonl'), 'w') as outfile:
    outfile.write(jsonl)

In [8]:
jsonl = '\n'.join(compas_test.apply(func = partial(compasConverter, classification = True), axis = 1).tolist())
with open(os.path.join('data', 'compas_class_test.jsonl'), 'w') as outfile:
    outfile.write(jsonl)
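
Before fine-tuning, it can be worth confirming that both classes are well represented, in line with the guideline of roughly 100 examples per class. A minimal sketch, using three in-memory lines in place of `compas_class_train.jsonl`; `count_classes` is a hypothetical helper and the prompts are abbreviated placeholders.

```python
import json
from collections import Counter

def count_classes(jsonl_text):
    """Count how often each completion appears in JSONL text."""
    counts = Counter()
    for line in jsonl_text.split('\n'):
        if line:                                   # skip blank lines
            counts[json.loads(line)['completion']] += 1
    return counts

# Placeholder lines in the same shape as the classification files
sample = '\n'.join([
    '{"prompt": "(prompt text) ###", "completion": "Yes.@@@"}',
    '{"prompt": "(prompt text) ###", "completion": "No.@@@"}',
    '{"prompt": "(prompt text) ###", "completion": "No.@@@"}',
])
class_counts = count_classes(sample)
print(class_counts)
```

For the real file, the text would be read from `os.path.join('data', 'compas_class_train.jsonl')` instead.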

# Fine-tune the GPT-3 model from the terminal

See my [OpenAI tutorial](https://volcano-hotel-6f9.notion.site/OpenAI-tutorial-9b35c35e247345348935de7f1b8d7018) to fine-tune a GPT-3 model with the processed data.

I have trained several models that you can experiment with:
* Generation 
    * Ada: `ada:ft-university-of-wisconsin-madison-2022-01-03-20-08-29`
    * Babbage: `babbage:ft-university-of-wisconsin-madison-2021-12-16-22-16-57`
    * Curie: `curie:ft-university-of-wisconsin-madison-2021-12-16-22-51-52`
* Classification
    * Ada: `ada:ft-university-of-wisconsin-madison-2022-01-03-21-20-24`
    
**Note that some of them were not trained on the dataset provided above; they may have used a different training dataset (with a different train-test split).** 
    
In your terminal, run `./test_class.sh ada` to obtain the output of the selected Ada model with the classification feature on the test dataset, or run `./test_gen.sh curie` to obtain the output of the selected Curie model with the generation feature. These commands generate files that store the model output for each prompt in the test dataset; the inference functions below compare those outputs with the labels. 

# Inference

After obtaining the output files, we can use the functions below to check the accuracy, fairness, etc.

In [11]:
class runTest_gen(object):
    """
    A class of functions for performing inference of a fine-tuned GPT3 generation model on COMPAS dataset.
    """
    def getDict(self, row):
        if row:
            prompt, completion = re.findall(r': \"(.+?)\"', row)
            return {'prompt': prompt, 'completion': completion}
        
    def grepCompletion(self, row):
        catch = re.findall(r'(is .{0,4}likely to reoffend in two years.)', row)
        if catch:
            return catch[0]
        
    def grepRace(self, row):
        catch = re.findall(r'ale (.+?),', row)
        if catch:
            return catch[0]
        
    def __init__(self, model='ada'):
        # prepare prompts
        with open(os.path.join('data', 'compas_gen_test.jsonl'), 'r') as f:
            reader = f.read()
        self.test = list(map(self.getDict, reader.split('\n')))
        self.model = model

        with open(os.path.join('data', 'test_prompt_' + model), 'w') as f:
            for line in self.test:
                if line:
                    f.write("%s\n" % line['prompt'])
            
    def readCompletion(self):
        with open(os.path.join('data', 'test_completion_' + self.model), 'r') as f:
            completion_reader = f.read()
        self.results = list(map(self.grepCompletion, completion_reader.split('\n')))
    
    def testResults(self):
        true_labels, generate_labels, race = [], [], []
        for line, generate_completion in zip(self.test, self.results):
            if generate_completion:
                generate_labels.append('not' not in generate_completion)
                true_labels.append('not' not in line['completion'])
                race.append(self.grepRace(line['prompt']))
        self.accuracy = torch.eq(torch.tensor(generate_labels), torch.tensor(true_labels)).sum().item() / len(true_labels)
        race_set = ['Caucasian', 'African-American']
        
        df = pd.DataFrame({'label': true_labels, 'output': generate_labels, 'race': race})
        pred_pos_l, true_pos_l = [], []
        print('Recidivism rate:')
        for r in race_set:
            pred_pos_l.append(df[df.race == r]['output'].mean())
            true_pos_l.append(df[df.race == r]['label'].mean())
            print('|---- %s: (predicted) %.2f%%, (true) %.2f%%' % (r, pred_pos_l[-1]*100, true_pos_l[-1]*100))
        print('Dataset demographic disparity: %.4f' % (max(true_pos_l) - np.array(true_pos_l).mean()))
        print('Test accuracy: %.2f%%, demographic disparity: %.4f' % (self.accuracy * 100, max(pred_pos_l) - np.array(pred_pos_l).mean()))
        
    def run(self):
        self.readCompletion()
        self.testResults()
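
To see how `grepCompletion` and `testResults` turn generated text into a label, the same regex can be tried on a few sample strings. In the pattern, `.{0,4}` allows up to four characters between "is " and "likely", which is exactly enough room for "not ". The helper name `label_from_completion` is illustrative only.

```python
import re

# Same pattern as in runTest_gen.grepCompletion
PATTERN = r'(is .{0,4}likely to reoffend in two years.)'

def label_from_completion(text):
    """Return True/False for a recognized completion, or None if nothing matches."""
    catch = re.findall(PATTERN, text)
    if not catch:
        return None                      # unparseable output
    return 'not' not in catch[0]         # True means "likely to reoffend"

print(label_from_completion('The defendant is likely to reoffend in two years.'))      # True
print(label_from_completion('The defendant is not likely to reoffend in two years.'))  # False
print(label_from_completion('The defendant was released.'))                            # None
```

Since `testResults` skips completions that fail to match, the reported accuracy is computed only over the outputs that could be parsed.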
        

In [13]:
class runTest_class(object):
    """
    A class of functions for performing inference of a fine-tuned GPT3 classification model on COMPAS dataset.
    """
    def getDict(self, row):
        if row:
            prompt, completion = re.findall(r': \"(.+?)\"', row)
            return {'prompt': prompt, 'completion': completion}
        
    def grepCompletion(self, row):
        check = re.findall(r'###(.+?)\.@@@', row)
        if check:
            completion = check[0]
            return "yes" in completion.lower()
        
    def grepRace(self, row):
        catch = re.findall(r'ale (.+?),', row)
        if catch:
            return catch[0]
        
    def __init__(self, model='ada'):
        # prepare prompts
        with open(os.path.join('data', 'compas_class_test.jsonl'), 'r') as f:
            reader = f.read()
        self.test = list(map(self.getDict, reader.split('\n')))
        self.model = model

        with open(os.path.join('data', 'test_class_prompt_'+model), 'w') as f:
            for line in self.test:
                if line:
                    f.write("%s\n" % line['prompt'])
            
    def readCompletion(self):
        with open(os.path.join('data', 'test_class_completion_' + self.model), 'r') as f:
            completion_reader = f.read()
        self.results = list(map(self.grepCompletion, completion_reader.split('\n')))
    
    def testResults(self):
        true_labels, generate_labels, race = [], [], []
        for line, generate_completion in zip(self.test, self.results):
            if generate_completion is not None:
                generate_labels.append(generate_completion)
                true_labels.append('Yes.@@@' in line['completion'])
                race.append(self.grepRace(line['prompt']))
        self.accuracy = torch.eq(torch.tensor(generate_labels), torch.tensor(true_labels)).sum().item() / len(true_labels)
        
        df = pd.DataFrame({'label': true_labels, 'output': generate_labels, 'race': race})
        pred_pos_l, true_pos_l = [], []
        print('Recidivism rate:')
        
        r = 'Caucasian'
        pred_pos_l.append(df[df.race == r]['output'].mean())
        true_pos_l.append(df[df.race == r]['label'].mean())
        print('|---- %s: (predicted) %.2f%%, (true) %.2f%%' % (r, pred_pos_l[-1]*100, true_pos_l[-1]*100))
        
        pred_pos_l.append(df[df.race != r]['output'].mean())
        true_pos_l.append(df[df.race != r]['label'].mean())
        print('|---- %s: (predicted) %.2f%%, (true) %.2f%%' % ('Non-' + r, pred_pos_l[-1]*100, true_pos_l[-1]*100))
        
        print('Dataset demographic disparity: %.4f' % (max(true_pos_l) - np.array(true_pos_l).mean()))
        print('Test accuracy: %.2f%%, demographic disparity: %.4f' % (self.accuracy * 100, max(pred_pos_l) - np.array(pred_pos_l).mean()))
        
    def run(self):
        self.readCompletion()
        self.testResults()
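
The "demographic disparity" printed by both classes is the largest group-level positive rate minus the mean of the group-level positive rates. The sketch below computes that quantity on a made-up frame. `demographic_disparity` is a hypothetical helper that groups by every value of the column, whereas `runTest_gen` restricts itself to two races and `runTest_class` compares Caucasian against everyone else.

```python
import pandas as pd

def demographic_disparity(df, group_col='race', pred_col='output'):
    """Max group positive rate minus the mean of the group positive rates."""
    rates = df.groupby(group_col)[pred_col].mean()   # positive rate per group
    return rates.max() - rates.mean()

# Made-up predictions: 1 of 4 positive in one group, 3 of 4 in the other
toy = pd.DataFrame({
    'race': ['Caucasian'] * 4 + ['African-American'] * 4,
    'output': [True, False, False, False, True, True, True, False],
})
disparity = demographic_disparity(toy)
print(disparity)   # 0.25
```

Here the group rates are 0.25 and 0.75, so the mean is 0.5 and the disparity is 0.75 - 0.5 = 0.25.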
        

Here I add a logistic regression model for comparison.

In [84]:
# COMPAS
sensitive_attributes = ['race']
categorical_attributes = ['age_cat', 'c_charge_degree', 'c_charge_desc', 'sex']
continuous_attributes = ['age', 'juv_fel_count', 'juv_misd_count', 'juv_other_count', 'priors_count']
features_to_keep = ['sex', 'age', 'age_cat', 'race', 'juv_fel_count', 'juv_misd_count', 'juv_other_count',
        'priors_count', 'c_charge_degree', 'c_charge_desc','two_year_recid']
label_name = 'two_year_recid'

compas_csv = process_csv('compas', 'compas-scores-two-years.csv', label_name, 0, sensitive_attributes, ['Caucasian'], categorical_attributes, continuous_attributes, features_to_keep)
compas_csv['y'] = compas_csv[label_name]
compas_csv = compas_csv.drop(label_name, axis = 1).sample(frac = 1)
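# Note: len(compas_csv)//7 puts about one seventh of the rows in the training set
# and the rest in the test set, unlike the 7:3 split used for the GPT-3 data above.
compas_train = compas_csv.iloc[:len(compas_csv)//7]
compas_test = compas_csv.iloc[len(compas_csv)//7:]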
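
`process_csv` comes from `utils` and is not shown here. Judging only from the preview of `compas_csv` in the next cell (continuous columns scaled into [0, 1], categorical columns expanded into 0/1 indicators), its output resembles min-max scaling combined with one-hot encoding. The sketch below illustrates that kind of transformation on a toy frame. It is an assumption about the general approach, not the actual implementation, and `scale_and_encode` is a hypothetical name.

```python
import pandas as pd

def scale_and_encode(df, continuous, categorical):
    """Min-max scale continuous columns and one-hot encode categorical ones."""
    out = df.copy()
    for col in continuous:
        lo, hi = out[col].min(), out[col].max()
        out[col] = (out[col] - lo) / (hi - lo)        # assumes hi > lo
    return pd.get_dummies(out, columns=categorical)   # e.g. sex -> sex_Female, sex_Male

# Toy data standing in for the raw COMPAS columns
toy = pd.DataFrame({'age': [18, 57, 96], 'sex': ['Male', 'Female', 'Male']})
encoded = scale_and_encode(toy, continuous=['age'], categorical=['sex'])
print(encoded)
```

The indicator columns may be boolean or integer depending on the pandas version.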

In [4]:
compas_csv

| | age | juv_fel_count | juv_misd_count | juv_other_count | priors_count | age_cat_25 - 45 | age_cat_Greater than 45 | age_cat_Less than 25 | c_charge_degree_F | c_charge_degree_M | ... | c_charge_desc_Viol Prot Injunc Repeat Viol | c_charge_desc_Violation License Restrictions | c_charge_desc_Violation Of Boater Safety Id | c_charge_desc_Violation of Injunction Order/Stalking/Cyberstalking | c_charge_desc_Voyeurism | c_charge_desc_arrest case no charge | sex_Female | sex_Male | z | y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5094 | 0.461538 | 0.00 | 0.000000 | 0.0 | 0.026316 | 0 | 1 | 0 | 0 | 1 | ... | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 |
| 4772 | 0.269231 | 0.00 | 0.000000 | 0.0 | 0.500000 | 1 | 0 | 0 | 0 | 1 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 5109 | 0.294872 | 0.00 | 0.000000 | 0.0 | 0.052632 | 1 | 0 | 0 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 7051 | 0.064103 | 0.00 | 0.000000 | 0.0 | 0.026316 | 0 | 0 | 1 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 6510 | 0.102564 | 0.00 | 0.076923 | 0.0 | 0.105263 | 1 | 0 | 0 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 5134 | 0.115385 | 0.00 | 0.000000 | 0.0 | 0.000000 | 1 | 0 | 0 | 0 | 1 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 3 | 1 |
| 6796 | 0.064103 | 0.05 | 0.000000 | 0.0 | 0.078947 | 0 | 0 | 1 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 4223 | 0.064103 | 0.00 | 0.000000 | 0.0 | 0.000000 | 0 | 0 | 1 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 2787 | 0.115385 | 0.00 | 0.000000 | 0.0 | 0.000000 | 1 | 0 | 0 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 1 |


In [5]:
def lr(dataset):
    train, test = dataset
    
    # train
    clf = LogisticRegression(random_state=0, max_iter = 200).fit(train.drop(['y'], axis = 1), train['y'])
    
    # inference
    predictions = clf.predict(test.drop(['y'], axis = 1))
    acc = (predictions == test.y).sum() / len(test)
    
    pos_l = []
    print('Recidivism rate: ')
    for z in [0,1]:
        print('|proportion ----- %s: %.4f%%' % (z, ((test['z'] == z).mean())*100))
        pos_l.append(1-predictions[test['z'] == z].mean())
        print('|----- %s: %.4f%%' % (z, pos_l[-1]*100))
    print('Test accuracy: %.4f%%, demographic disparity: %.4f' % (acc*100, max(pos_l)-min(pos_l)))
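
Note that `lr` reports `max(pos_l) - min(pos_l)`, whereas the `runTest_gen` and `runTest_class` classes report the maximum rate minus the mean rate. With two groups, the first quantity is exactly twice the second, so the two kinds of disparity figures are only comparable after rescaling. A small sketch with made-up rates (both helper names are illustrative):

```python
import numpy as np

def disparity_max_min(rates):
    """Gap between the highest and lowest group rate (as in lr)."""
    return max(rates) - min(rates)

def disparity_max_mean(rates):
    """Highest group rate minus the mean group rate (as in the runTest classes)."""
    return max(rates) - float(np.mean(rates))

rates = [0.25, 0.75]   # made-up positive rates for two groups
gap = disparity_max_min(rates)
half_gap = disparity_max_mean(rates)
print(gap, half_gap)   # 0.5 0.25
```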

In [24]:
lr((compas_train, compas_test))

Recidivism rate: 
|proportion ----- 0: 51.2775%
|----- 0: 48.8489%
|proportion ----- 1: 0.4690%
|----- 1: 20.6897%
Test accuracy: 66.2516%, demographic disparity: 0.2816


In [12]:
test_ada = runTest_gen('ada')
test_babbage = runTest_gen('babbage')
test_curie = runTest_gen('curie')
test_dup = runTest_gen('ada_dup')

In [14]:
test_classification_ada = runTest_class('ada')

In [27]:
test_classification_ada.run()

Recidivism rate:
|---- Caucasian: (predicted) 39.08%, (true) 41.95%
|---- Caucasian: (predicted) 46.93%, (true) 44.94%
Dataset demographic disparity: 0.0149
Test accuracy: 60.50%, demographic disparity: 0.0393


In [28]:
# $0.7
test_ada.run()

Recidivism rate:
|---- Caucasian: (predicted) 35.16%, (true) 42.07%
|---- African-American: (predicted) 51.56%, (true) 47.66%
Dataset demographic disparity: 0.0279
Test accuracy: 61.22%, demographic disparity: 0.0820


In [29]:
# $1.05
test_babbage.run()

Recidivism rate:
|---- Caucasian: (predicted) 36.60%, (true) 42.07%
|---- African-American: (predicted) 53.02%, (true) 47.76%
Dataset demographic disparity: 0.0284
Test accuracy: 60.66%, demographic disparity: 0.0821


In [30]:
# $5.27
test_curie.run()

Recidivism rate:
|---- Caucasian: (predicted) 37.64%, (true) 41.95%
|---- African-American: (predicted) 54.19%, (true) 47.76%
Dataset demographic disparity: 0.0290
Test accuracy: 59.60%, demographic disparity: 0.0827


In [31]:
# $4.92
test_dup.run()

Recidivism rate:
|---- Caucasian: (predicted) 40.80%, (true) 41.95%
|---- African-American: (predicted) 55.56%, (true) 47.76%
Dataset demographic disparity: 0.0290
Test accuracy: 58.40%, demographic disparity: 0.0738
