## Checking if two names match
This demo shows how you can take two names and test whether they are equivalent, despite superficial differences.

This is a challenging problem because any given name can be written in a number of ways. For example, 'Jennifer L. Chen' could potentially match any of the following:
* Jennifer Chen
* Chen, Jennifer
* Jenny Chen
* Jenny L. Chen
* Jennifer Lilian Chen
* CHEN, Jennifer L.

In [1]:
import os
import pandas as pd
import json

In [2]:
import sys
sys.path.append('../..')
import openai_data_tools as dt

In [3]:
names = pd.read_csv('names.csv', dtype=str, keep_default_na=False)

Here is our test data set. `name1` and `name2` are the names to be matched, and `target` is the desired output (1 for a match, 0 for no match).

In [4]:
names

| | name1 | name2 | target |
|---|---|---|---|
| 0 | Alice Walker | Alice Walker | 1 |
| 1 | Charles Dickens | John Grisham | 0 |
| 2 | Jennifer Chen | Jenny Chen | 1 |
| 3 | Darcy Reed | Greta Reed | 0 |
| 4 | George Washington | Washington, George | 1 |
| 5 | Jimi Hendrix | James Hendrix | 1 |
| 6 | Benedict, Alfonso | DeNiro, Alfonso | 0 |
| 7 | James Garcia | Jimmy Garcia | 1 |
| 8 | Harry Potter | Malfoy, Draco | 0 |
| 9 | Sheeran, Edward | Puth, Charlie | 0 |


The input to the model needs to be a string, so for each of our test items, we add both names to a JSON structure and then stringify it.

In [5]:
names_json = [json.dumps({"name1": row['name1'], "name2": row['name2']}) for _, row in names.iterrows()]
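
To make the stringified input concrete, here is a minimal sketch using two pairs taken from the test set. The `pairs` and `payloads` names are illustrative only and are not part of the demo code.

```python
import json

# Two name pairs from the test set, hard-coded for illustration.
pairs = [
    ("Jennifer Chen", "Jenny Chen"),
    ("George Washington", "Washington, George"),
]

# Each pair becomes one JSON string with the keys the instructions refer to.
payloads = [json.dumps({"name1": a, "name2": b}) for a, b in pairs]

for payload in payloads:
    print(payload)

# The comma inside "Washington, George" stays inside a quoted JSON value,
# so the two names remain unambiguous after a round trip.
decoded = json.loads(payloads[1])
print(decoded["name2"])
```

Wrapping the names in JSON rather than joining them with a separator avoids ambiguity when a name itself contains a comma.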

In our instructions, we ask the model to return 1 for a match, and 0 for a mismatch. We also give specific instructions on how to handle the ordering of first and last name, as well as middle names.

In [6]:
processor = dt.DataProcessor(
    api_key=os.getenv("OPENAI_API_KEY"),
    model='gpt-3.5-turbo',
    instructions="You will be provided with a JSON object with two keys, 'name1' and 'name2'. If the two names are equivalent, return the value '1'. If they are different, return '0'. Treat nicknames as equivalent to the full name. Treat <last name>, <first name> as equivalent to <firstname> <lastname>. If both names have middle names or middle initials, they should match for the names to be equivalent. If only one name has a middle name or middle initial, ignore it."
)
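
For comparison, the rules in the instructions can be written out as a small rule-based function. This is only a sketch of the rules as stated, not what the model does internally. The `NICKNAMES` table, `split_name`, and `names_match` are hypothetical names introduced here, and treating a middle initial as matching any middle name that starts with the same letter is an assumption.

```python
# Tiny, hypothetical nickname table; a realistic one would be far larger.
NICKNAMES = {"jenny": "jennifer", "jimmy": "james", "jimi": "james"}


def split_name(name):
    """Return (first, middle, last) in lowercase, accepting 'Last, First' order.

    Assumes the name contains at least one word.
    """
    name = name.replace(".", "").strip().lower()
    if "," in name:
        last, rest = [part.strip() for part in name.split(",", 1)]
        tokens = rest.split() + [last]
    else:
        tokens = name.split()
    first, last = tokens[0], tokens[-1]
    middle = " ".join(tokens[1:-1])
    # Map a known nickname to its full form before comparing.
    return NICKNAMES.get(first, first), middle, last


def names_match(a, b):
    """Return 1 if the two names are equivalent under the stated rules, else 0."""
    first_a, mid_a, last_a = split_name(a)
    first_b, mid_b, last_b = split_name(b)
    if first_a != first_b or last_a != last_b:
        return 0
    # Middle names only count when both sides have one.
    if mid_a and mid_b:
        if len(mid_a) == 1 or len(mid_b) == 1:
            # Assumption: an initial matches a middle name with the same first letter.
            return int(mid_a[0] == mid_b[0])
        return int(mid_a == mid_b)
    return 1


print(names_match("Jennifer L. Chen", "CHEN, Jennifer L."))
print(names_match("Jenny Chen", "Jennifer Lilian Chen"))
print(names_match("Darcy Reed", "Greta Reed"))
```

A function like this depends entirely on the coverage of its nickname table, which is the main reason to hand the task to a language model instead.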

The GPT models are not deterministic, meaning that they won't necessarily give you the same answer each time. To get more consistent results, we run the full set of test items through the model five times, and then use the most common response for each item as the output (more on that below).

In [7]:
outputs = []  # one list of model responses per run
for run in range(1, 6):  # five runs, numbered 1 to 5
    print(f'Run {run}')
    output = processor.process(names_json)
    outputs.append(output)

Run 1
Progress: 100%
Run 2
Progress: 100%
Run 3
Progress: 100%
Run 4
Progress: 100%
Run 5
Progress: 100%


A `RunAggregator` is used to process output values from multiple runs of the model. We can use it to calculate the level of agreement across runs.

In [8]:
aggregator = dt.RunAggregator(outputs)
aggregator.agreement()

0.86
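
The source does not show how `RunAggregator.agreement()` is computed. One plausible definition, sketched below as an assumption, is the share of runs that agree with each item's most common answer, averaged over all items. The `agreement` function and the `runs` data are hypothetical.

```python
from collections import Counter


def agreement(runs):
    """Mean share of runs that agree with each item's most common answer."""
    per_item = []
    for answers in zip(*runs):  # one tuple of answers per test item
        top_count = Counter(answers).most_common(1)[0][1]
        per_item.append(top_count / len(answers))
    return sum(per_item) / len(per_item)


# Hypothetical outputs: 3 runs over 4 items.
runs = [
    ["1", "0", "1", "0"],
    ["1", "0", "0", "0"],
    ["1", "0", "1", "1"],
]
print(round(agreement(runs), 3))
```

Under this definition a value of 1.0 means every run gave the same answer for every item, and lower values indicate items where the runs disagreed.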

We can also use it to produce a single set of output values based on the most common response for each item. We can then score the model based on those output values.

In [9]:
combined_output = aggregator.output()
scorer = dt.Scorer(combined_output, names['target'])
scorer.accuracy()

0.95
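
Majority voting and accuracy scoring can be sketched in a few lines of plain Python. This illustrates the idea only and is not the implementation behind `RunAggregator.output()` or `Scorer.accuracy()`. The function names and data below are hypothetical.

```python
from collections import Counter


def majority_vote(runs):
    """Pick the most common answer for each item across all runs."""
    return [Counter(answers).most_common(1)[0][0] for answers in zip(*runs)]


def accuracy(outputs, targets):
    """Fraction of items whose output equals the target (compared as strings)."""
    hits = [int(str(o) == str(t)) for o, t in zip(outputs, targets)]
    return sum(hits) / len(hits)


# Hypothetical outputs: 3 runs over 4 items, plus the desired targets.
runs = [
    ["1", "0", "1", "0"],
    ["1", "0", "0", "0"],
    ["1", "0", "1", "1"],
]
targets = ["1", "0", "1", "1"]

combined = majority_vote(runs)
print(combined)
print(accuracy(combined, targets))
```

With an odd number of runs and only two possible answers, a vote can never tie, which is one reason to prefer an odd run count such as five.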

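The code below shows a breakdown of how the model did for each item. The `scores` method returns a list with one entry per item: 1 if the model output matches the target, and 0 if it doesn't. In the table below, the only miss is item 5 ('Jimi Hendrix' vs. 'James Hendrix'), which the model judged to be a mismatch.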

In [11]:
scores = scorer.scores()
pd.DataFrame({'name1': names['name1'], 'name2': names['name2'], 'target': names['target'], 'output': combined_output, 'score': scores})

| | name1 | name2 | target | output | score |
|---|---|---|---|---|---|
| 0 | Alice Walker | Alice Walker | 1 | 1 | 1 |
| 1 | Charles Dickens | John Grisham | 0 | 0 | 1 |
| 2 | Jennifer Chen | Jenny Chen | 1 | 1 | 1 |
| 3 | Darcy Reed | Greta Reed | 0 | 0 | 1 |
| 4 | George Washington | Washington, George | 1 | 1 | 1 |
| 5 | Jimi Hendrix | James Hendrix | 1 | 0 | 0 |
| 6 | Benedict, Alfonso | DeNiro, Alfonso | 0 | 0 | 1 |
| 7 | James Garcia | Jimmy Garcia | 1 | 1 | 1 |
| 8 | Harry Potter | Malfoy, Draco | 0 | 0 | 1 |
| 9 | Sheeran, Edward | Puth, Charlie | 0 | 0 | 1 |
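
With a longer test set it helps to pull out just the failures. The sketch below does this in plain Python for three rows copied from the breakdown above. Storing the values as strings is an assumption, and `rows` and `misses` are illustrative names.

```python
# Three rows copied from the breakdown; values kept as strings (an assumption).
rows = [
    {"name1": "Jennifer Chen", "name2": "Jenny Chen", "target": "1", "output": "1"},
    {"name1": "Jimi Hendrix", "name2": "James Hendrix", "target": "1", "output": "0"},
    {"name1": "Harry Potter", "name2": "Malfoy, Draco", "target": "0", "output": "0"},
]

# Keep only the items where the model output differs from the target.
misses = [row for row in rows if row["output"] != row["target"]]

for row in misses:
    print(f"{row['name1']} / {row['name2']}: expected {row['target']}, got {row['output']}")
```

If the breakdown is held in a pandas DataFrame named `df`, the equivalent filter is `df[df['score'] == 0]`.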
