![NVIDIA Logo](images/nvidia.png)

## Imports

In [1]:
import json
import random
import pandas as pd

from tqdm.notebook import tqdm

from llm_utils.nemo_service_models import NemoServiceBaseModel
from llm_utils.models import PubmedModels
from llm_utils.helpers import plot_experiment_results, accuracy_score
from llm_utils.pubmedqa import generate_prompt_and_answer, strip_response
from llm_utils.prompt_creators import create_prompt_with_examples, create_nemo_prompt_with_examples

## List Models

In [2]:
PubmedModels.list_models()

gpt8b: gpt-8b-000
gpt20b: gpt20b
gpt43b: gpt-43b-001


## Generate Few-shot Prompts

**We will start with GPT43B, which was instruction fine-tuned. To format example shots correctly for this model, we've provided the helper function `create_nemo_prompt_with_examples`.**

**A 3-shot prompt created by this helper function adds `User:` and `Assistant:` markers around each shot to match GPT43B's instruction fine-tuning prompt template.**
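
As a rough illustration of this idea, the sketch below wraps each example shot in `User:` and `Assistant:` turns and leaves the final `Assistant:` turn open for the model to complete. The function name `build_chat_few_shot_prompt`, its arguments, and the exact spacing are assumptions made for illustration; the actual `create_nemo_prompt_with_examples` helper and GPT43B's real template may differ.

```python
def build_chat_few_shot_prompt(examples, question):
    """Hypothetical sketch: wrap each shot in User:/Assistant: turns."""
    turns = []
    for example_question, example_answer in examples:
        # One completed turn per example shot.
        turns.append(f"User: {example_question}\n\nAssistant: {example_answer}")
    # The final turn is left open so the model supplies the answer.
    turns.append(f"User: {question}\n\nAssistant:")
    return "\n\n".join(turns)


# Three example shots, taken from the address-linking task used later in this notebook.
chat_shots = [
    ("Should the following two addresses be linked 14361 cupcake terrace and 14361 cupofcake terrace?", "yes"),
    ("Should the following two addresses be linked 13823 principle terrace and 2689 principal highway?", "no"),
    ("Should the following two addresses be linked 908 toolbox street and 1209 main highway?", "no"),
]
chat_prompt = build_chat_few_shot_prompt(
    chat_shots,
    "Should the following two addresses be linked 13345 door parkway and 13345 dor parkway?",
)
print(chat_prompt)
```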

### Non-Instruction Fine-tuned Prompt Format

**GPT20B and GPT8B are not instruction fine-tuned, so we only need to provide the example prompt/response shots as plain text. We've provided the `create_prompt_with_examples` helper function for this.**
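
For comparison, a plain few-shot prompt can be sketched as a simple concatenation of question/answer pairs followed by the new question, with no turn markers. The function name `build_plain_few_shot_prompt` and the blank-line separator are assumptions for illustration only; the actual `create_prompt_with_examples` helper may format shots differently.

```python
def build_plain_few_shot_prompt(examples, question):
    """Hypothetical sketch: concatenate shots without any chat markers."""
    # Each shot is the question followed directly by its answer.
    blocks = [f"{example_question}\n{example_answer}" for example_question, example_answer in examples]
    # The unanswered question goes last so the model continues from it.
    blocks.append(question)
    return "\n\n".join(blocks)


plain_shots = [
    ("Should the following two addresses be linked 14361 cupcake terrace and 14361 cupofcake terrace?", "yes"),
    ("Should the following two addresses be linked 13823 principle terrace and 2689 principal highway?", "no"),
]
plain_question = "Should the following two addresses be linked 908 toolbox street and 1209 main highway?"
plain_prompt = build_plain_few_shot_prompt(plain_shots, plain_question)
print(plain_prompt)
```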

## Try Few-shot Prompts

In [3]:
llms = {}

In [5]:
llms['gpt8b'] = NemoServiceBaseModel(PubmedModels.gpt8b.value, create_prompt_with_examples=create_prompt_with_examples)
llms['gpt20b'] = NemoServiceBaseModel(PubmedModels.gpt20b.value, create_prompt_with_examples=create_prompt_with_examples)
llms['gpt43b'] = NemoServiceBaseModel(PubmedModels.gpt43b.value, create_prompt_with_examples=create_nemo_prompt_with_examples)

In [6]:
llms

{'gpt8b': <llm_utils.nemo_service_models.NemoServiceBaseModel at 0x7fbf3c6f3380>,
 'gpt20b': <llm_utils.nemo_service_models.NemoServiceBaseModel at 0x7fbf3c6f3410>,
 'gpt43b': <llm_utils.nemo_service_models.NemoServiceBaseModel at 0x7fbf3c6f3440>}

### Few-shot Results with 43B Fine-Tuned Format

In [7]:
!pwd

/workspace/dli/2-PubMedQA


In [8]:
df = pd.read_csv('/workspace/dli/2-PubMedQA/data_62124.csv')

In [9]:
df.head()

| Unnamed: 0.1 | Unnamed: 0 | good address | compare address | label |
|---|---|---|---|---|
| 0 | 0 | 14361 cupcake terrace | 14361 cupofcake terrace | 1 |
| 1 | 1 | 13823 principle terrace | 2689 principal highway | 0 |
| 2 | 2 | 908 toolbox street | 1209 main highway | 0 |
| 3 | 3 | 6785 wheel avenue | 1809 main way | 0 |
| 4 | 4 | 3944 electric screwdriver road | 3944 elecc screwdriver | 1 |
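
The first rows shown above can double as few-shot examples. The sketch below shows one way to turn labeled rows into `QUESTION` / `ANSWER` / `Assistant:` shots programmatically, mapping label `1` to `yes` and `0` to `no`. The helper names `format_question` and `build_shots` are hypothetical, and the three rows are inlined with the standard `csv` module so the example runs without the original CSV file; the same logic could be applied to rows of `df`.

```python
import csv
import io

# First three rows from the output above, inlined so the sketch is self-contained.
SAMPLE_CSV = """good address,compare address,label
14361 cupcake terrace,14361 cupofcake terrace,1
13823 principle terrace,2689 principal highway,0
908 toolbox street,1209 main highway,0
"""


def format_question(good, compare):
    """Hypothetical helper: phrase one address pair as a yes/no question."""
    return (
        f"QUESTION: Should the following two addresses be linked {good} and {compare}?\n"
        "ANSWER (yes|no):"
    )


def build_shots(rows):
    """Hypothetical helper: turn labeled rows into answered example shots."""
    shots = []
    for row in rows:
        # Label 1 means the addresses should be linked.
        answer = "yes" if int(row["label"]) == 1 else "no"
        question = format_question(row["good address"], row["compare address"])
        shots.append(f"{question}\n\nAssistant:{answer}")
    return "\n\n".join(shots)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shot_text = build_shots(rows)
print(shot_text)
```

Generating the shots from data keeps the examples and the evaluated rows easy to tell apart, which helps avoid scoring the model on a row it has already seen in the prompt.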


In [10]:
rowNum = 20  # evaluate rows 10-19; rows 0-2 appear as the few-shot examples in the prompt
c = 0  # count of correct predictions
for i in range(10, rowNum):
    # Earlier zero-shot prompt variants, kept for reference:
    # prompt = f"Should the following two addresses be linked {df['good address'][i]} and {df['compare address'][i]}?"
    # prompt = f"Are the following addresses misspelled versions of the same address or two distinct addresses: {df['good address'][i]} and {df['compare address'][i]}? ANSWER (yes|no)"

    fewPrompt = f'''Objective: Some addresses look the same, but indicate different physical locations and therefore they should not be linked, 
    one indicaation that two addresses should not be linked is they have different house numbers. Alternatively, some 
    addresses are not exact matches but should be linked, they often dont match because of a mispelling in one of the 
    street names.  For example, These pair of addresses should be matched: 990 sizzling place and 990 sizlig place. while, 
    this pair of streets 67 metal way and 87 petal drive indicate different address that should not be matched. 
    
    QUESTION: Should the following two addresses be linked 14361 cupcake terrace and 14361 cupofcake terrace? 
    ANSWER (yes|no): 
    
    Assistant:yes 
    
    QUESTION: Should the following two addresses be linked 13823 principle terrace and 2689 principal highway? 
    ANSWER (yes|no): 
    
    Assistant:no 
    
    QUESTION: Should the following two addresses be linked 908 toolbox street and 1209 main highway? 
    ANSWER (yes|no): 
    
    Assistant:no 

    QUESTION: Should the following two addresses be linked {df['good address'][i]} and {df['compare address'][i]}? 
    ANSWER (yes|no):'''
    
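    # Generate a single token: the prompt asks for a one-word yes/no answer.
    response = llms['gpt43b'].generate(fewPrompt, tokens_to_generate=1, return_type='text').strip()

    # Map the numeric label (1 = linked) to the yes/no vocabulary used in the prompt.
    label = 'yes' if df['label'][i] == 1 else 'no'

    # Double-quoted f-strings keep the nested single quotes valid before Python 3.12.
    print(f"Good Address: {df['good address'][i]}")
    print(f"Compare Address: {df['compare address'][i]}")

    print(f'Response from model: {response}')
    print(f'Actual answer: {label}')
    correct = label == response
    print(f'Response from model correct: {correct}\n')
    if correct:
        c += 1

# Divide by the number of rows evaluated rather than a hard-coded 10.
print(f'the accuracy is {c / (rowNum - 10)}')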


Good Address: 13345 door parkway
Compare Address: 13345 dor parkway
Response from model: yes
Actual answer: yes
Response from model correct: True

Good Address: 6494 bug parkway
Compare Address: 6494 bg parkway
Response from model: yes
Actual answer: yes
Response from model correct: True

Good Address: 1635 arm road
Compare Address: 7683 main place
Response from model: no
Actual answer: no
Response from model correct: True

Good Address: 9329 anchor place
Compare Address: 9329 aynchzor place
Response from model: yes
Actual answer: yes
Response from model correct: True

Good Address: 9910 flower place
Compare Address: 9910 flwe place
Response from model: yes
Actual answer: yes
Response from model correct: True

Good Address: 10393 chalk boulevard
Compare Address: 10393 cfhtalk boulevard
Response from model: no
Actual answer: yes
Response from model correct: False

Good Address: 1519 boat center
Compare Address: 10261 main lane
Response from model: no
Actual answer: no
Response from mode