<a href="https://colab.research.google.com/github/TejSuklikar/GPT-3-Research-Project/blob/main/Cleaned_up_TQA_GPT3_Code_Zero_and_Few_Shot_with_Test_Data.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Load the JSON File**

In [None]:
#Library
import json
import pandas as pd

x = pd.read_json("/content/tqa_v2_test.json")





**Create a Content Table where each row is either a Topic or Adjunct Topic for a given lesson.**

In [None]:
topics = x['topics']
adjunctTopics = x['adjunctTopics']

contentTable = []
for i in range(len(x)):
  # One row per Topic in the lesson.
  for key in topics[i].keys():
    contentTable.append([x['globalID'][i], x['lessonName'][i], "Topic", "", key, topics[i][key]['content']['text']])
  # One row per Adjunct Topic, skipping the Vocabulary entry.
  # This loop sits outside the Topic loop so each adjunct row is added once per lesson.
  for j, key in enumerate(adjunctTopics[i].keys()):
    contentID = "A_" + str(i*1000+j)
    if key != 'Vocabulary':
      contentTable.append([x['globalID'][i], x['lessonName'][i], "Adjunct Topic", str(key), contentID, adjunctTopics[i][key]['content']['text']])


**Create a simpler Content Table with just Topic content**

In [None]:
topics = x['topics']

contentqaTable = []
for i in range(len(x)):
  # Keep only the lesson ID and the text of each Topic.
  for key in topics[i].keys():
    content = topics[i][key]['content']['text']
    contentqaTable.append([x['globalID'][i], content])


**Convert the content table to a Data Frame.**

In [None]:
ct = pd.DataFrame(contentqaTable, columns=['id', 'context'])

**Since there are multiple rows of content for each Lesson ID, get the unique Lesson IDs. This will be useful in constructing Prompts per lesson**

In [None]:
lessonIds = ct.id.unique()


**Create a consolidated Lesson Table with one row per Lesson, and all the associated content**

In [None]:
consolidatedLessonTable = []
for l in lessonIds:
  lessonContext = ""
  lessonContents = ct[ct.id == l]
  for index, row in lessonContents.iterrows():
    lessonContext += row['context'] + "\n"
  consolidatedLessonTable.append([l,lessonContext])
clt = pd.DataFrame(consolidatedLessonTable,columns=['id','content'])
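The same consolidation can also be sketched with a `groupby`, shown here on a small made-up table rather than the real `ct`. The lesson IDs and sentences below are invented for illustration, and the trailing newline after each piece of content is kept so the result matches what the loop above produces.

```python
import pandas as pd

# Toy stand-in for the `ct` table: two lessons, three rows of topic content (invented data).
ct_demo = pd.DataFrame(
    [
        ["L_0001", "Rocks are solid."],
        ["L_0001", "Minerals form crystals."],
        ["L_0002", "Water flows downhill."],
    ],
    columns=["id", "context"],
)

# sort=False keeps lessons in order of first appearance, like ct.id.unique().
clt_demo = (
    ct_demo.groupby("id", sort=False)["context"]
    .agg(lambda texts: "".join(t + "\n" for t in texts))
    .reset_index()
    .rename(columns={"context": "content"})
)
print(clt_demo)
```

Whether this is preferable to the explicit loop is a matter of taste; it mainly avoids the row-by-row `iterrows` pass.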

**Create a Question Answer Table with the Questions, Answer Choices, and Correct Answer per row. Additionally the associated Lesson ID is also stored for looking up and joining to the Lesson Table content.**

In [None]:
questions = x['questions']

questionAnswerTable = []
for i in range(len(x)):
  lessonID = x['globalID'][i]
  for key, q in questions[i]['nonDiagramQuestions'].items():
    questionText = q['beingAsked']['processedText']
    correctAnswerChoice = q['correctAnswer']['processedText']
    # Join the raw answer choices into one "; "-separated string.
    answerChoicesPrompt = "; ".join(choice['rawText'] for choice in q['answerChoices'].values())
    questionAnswerTable.append([lessonID, questionText, correctAnswerChoice, answerChoicesPrompt])

In [None]:
qat = pd.DataFrame(questionAnswerTable,columns=['id','question', 'correct answer', 'answer choices'])

In [None]:
len(qat)

2512

In [None]:
qat.to_csv("qat_test.csv")

**For the Zero Shot Learning experiment, build a Prompt Table with just the instructions and the Questions plus Answer Choices, without the Lesson Content. One Lesson per Row.**

**For the Few Shot Learning experiment, build a Prompt Table that combines the Lesson Content and Question plus Answer Choices in a single string. One Lesson per Row.**

In [None]:
fewShotPromptInstructions = "Use the lesson text below to answer the following questions by picking one of the choices provided. Only include the letter of the answer choice listed. For example, 3.c.\n\n"
zeroShotPromptInstructions = "Answer the following questions by picking one of the choices provided. Only include the letter of the answer choice listed.\n\n"

fewShotPromptTable = []
zeroShotPromptTable = []

fewShotAnswerKeyTable = []
zeroShotAnswerKeyTable = []

for _, lesson in clt.iterrows():
  fsPrompt = fewShotPromptInstructions + "Lesson:\n" + lesson['content'] + "\n\n" + "Questions:\n"
  zsPrompt = zeroShotPromptInstructions + "Questions:\n"
  fsAnswers = ""
  lessonQAT = qat[qat.id == lesson['id']]
  qnum = 1
  # Number the questions 1..n within the lesson; both prompts share the same question block.
  for _, row in lessonQAT.iterrows():
    questionBlock = str(qnum) + ". " + row['question'] + "\n" + row['answer choices'] + "\n\n"
    fsPrompt += questionBlock
    zsPrompt += questionBlock
    fsAnswers = fsAnswers + str(qnum) + ". " + row['correct answer'] + "; "
    zeroShotAnswerKeyTable.append([lesson['id'], qnum, row['correct answer']])
    qnum += 1

  fsAnswers = fsAnswers[:-2]

  # End each prompt with an answer template ("1. ?", "2. ?", ...) for the model to fill in.
  for n in range(1, qnum):
    fsPrompt = fsPrompt + str(n) + ". ?\n"
    zsPrompt = zsPrompt + str(n) + ". ?\n"

  # "====" is also the stop sequence passed to the completions API.
  fsPrompt = fsPrompt + "===="
  zsPrompt = zsPrompt + "===="

  fewShotPromptTable.append([lesson['id'], fsPrompt])
  zeroShotPromptTable.append([lesson['id'], zsPrompt])

  fewShotAnswerKeyTable.append([lesson['id'], fsAnswers])
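
To make the prompt layout easier to see, here is a minimal sketch that assembles one prompt from two invented questions. `build_prompt` is a hypothetical helper written for this illustration, not part of the notebook, and the instruction text is shortened.

```python
def build_prompt(instructions, questions, lesson_text=None):
    """Assemble one prompt in the layout used above (hypothetical helper)."""
    prompt = instructions
    if lesson_text is not None:
        # Only the lesson-based prompt carries the lesson content.
        prompt += "Lesson:\n" + lesson_text + "\n\n"
    prompt += "Questions:\n"
    for n, (question, choices) in enumerate(questions, start=1):
        prompt += f"{n}. {question}\n{choices}\n\n"
    # Answer template for the model to fill in, followed by the terminator.
    for n in range(1, len(questions) + 1):
        prompt += f"{n}. ?\n"
    return prompt + "===="

# Invented questions, for illustration only.
demo_questions = [
    ("Granite is an igneous rock.", "a. true; b. false"),
    ("Which rock forms from cooled magma?", "a. sandstone; b. granite; c. shale"),
]
demo_prompt = build_prompt("Answer the following questions.\n\n", demo_questions)
print(demo_prompt)
```

Passing `lesson_text` gives the lesson-plus-questions variant; leaving it out gives the questions-only variant.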





In [None]:
fspt = pd.DataFrame(fewShotPromptTable,columns=['Lesson ID','Prompt'])
fsakt = pd.DataFrame(fewShotAnswerKeyTable,columns=['Lesson ID','Answers'])
zspt = pd.DataFrame(zeroShotPromptTable,columns=['Lesson ID','Prompt'])
zsakt = pd.DataFrame(zeroShotAnswerKeyTable,columns=['Lesson ID','Question Number','Answer'])

In [None]:
fspt['prompt length'] = fspt.apply(lambda x: len(str(x['Prompt'])), axis=1)

In [None]:
fspt.to_csv("fspt_test.csv")

In [None]:
fsakt.to_csv("fsakt_test.csv")
zspt.to_csv("zspt_test.csv")
zsakt.to_csv("zsakt_test.csv")

**Install openai and enter my API key**

In [None]:
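# The code below uses openai.Completion, which was removed in openai 1.0, so stay on the 0.x series.
!pip install --upgrade "openai<1.0"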



In [None]:
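import os
from getpass import getpass

# Never hard-code the key in the notebook; read it from the environment or prompt for it.
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY") or getpass("OpenAI API key: ")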

**Some functions for calling the completions api and processing the results**

A function to extract each individual answer out of the returned api response, into a separate row

In [None]:
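def response_to_table(lId, r, answer_table):
  # Each line of the completion should look like "3. c"; append one row per line.
  for line in r.strip().split("\n"):
    parts = line.split(".")
    if len(parts) < 2:
      continue  # skip blank or malformed lines that have no "<number>." prefix
    answer_table.append([lId, parts[0].strip(), parts[1].strip()])
  return answer_table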
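
As a worked example of the row extraction, the sketch below parses a short completion in the same "number. letter" shape. `split_numbered_answers` is a hypothetical regex-based variant written for this illustration; it is not the function used in the experiments.

```python
import re

def split_numbered_answers(text):
    """Return (question_number, answer) pairs from lines shaped like '3. c'."""
    pairs = []
    for line in text.strip().splitlines():
        match = re.match(r"\s*(\d+)\s*\.\s*(.+)", line)
        if match:  # lines without a leading "<number>." are ignored
            pairs.append((int(match.group(1)), match.group(2).strip()))
    return pairs

sample_completion = "\n\n1. e\n2. d\n3. b"
print(split_numbered_answers(sample_completion))
# prints [(1, 'e'), (2, 'd'), (3, 'b')]
```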

A function to call the api for each prompt and process the returned completion

In [None]:
import openai

def lesson_answer(lId, p, answerTable):
  openai.api_key = OPENAI_API_KEY

  # temperature=0 keeps the output as repeatable as possible;
  # "====" is the same terminator that ends every prompt.
  response = openai.Completion.create(
    model="text-davinci-002",
    prompt=p,
    temperature=0,
    max_tokens=200,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=["===="]
  )
  # Only parse completions that finished cleanly; ones cut off by max_tokens
  # (finish_reason 'length') are dropped and have to be re-run.
  if response['choices'][0]['finish_reason'] == 'stop':
    answerTable = response_to_table(lId, response['choices'][0]['text'], answerTable)

  return answerTable

Test out the execution on one prompt

In [None]:
import openai

openai.api_key = OPENAI_API_KEY

response = openai.Completion.create(
    model="text-davinci-002",
    prompt=fspt['Prompt'][2],
    temperature=0,
    max_tokens=200,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=["===="]
)

In [None]:
print (response)

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "text": "\n\n1. e\n2. d\n3. b\n4. c\n5. g\n6. a\n7. f\n8. a\n9. a\n10. b\n11. b\n12. b\n13. b\n14. a\n15. a\n16. b\n17. b\n18. d\n19. a\n20. d\n21. b\n22. a\n23. d\n24. d"
    }
  ],
  "created": 1663018016,
  "id": "cmpl-5plFAdwqW1kMslsyXjD3cjj5vJic2",
  "model": "text-davinci-002",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 97,
    "prompt_tokens": 2181,
    "total_tokens": 2278
  }
}


**Run the few shot experiment. Since some prompts error out, we will run the experiments in batches.**

In [None]:
resultsTable = []

In [None]:
subsetTable = fspt.iloc[0:49]
for row in subsetTable.itertuples():
  resultsTable = lesson_answer(row[1],row[2], resultsTable)
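
Because a single failing prompt stops the whole loop, one option is to catch the error per lesson and note which lessons need a re-run. The sketch below is only an illustration: `run_batch` and `fake_answer` are hypothetical names, and `fake_answer` stands in for `lesson_answer` so the example runs without calling the API.

```python
def run_batch(rows, answer_fn, results):
    """Call answer_fn(lesson_id, prompt, results) per row; collect IDs that raise."""
    failed = []
    for lesson_id, prompt in rows:
        try:
            results = answer_fn(lesson_id, prompt, results)
        except Exception as err:  # broad on purpose: any failure just marks the lesson for a re-run
            failed.append((lesson_id, repr(err)))
    return results, failed

# Stand-in for lesson_answer so the sketch runs offline (invented behavior).
def fake_answer(lesson_id, prompt, results):
    if "bad" in prompt:
        raise ValueError("could not parse completion")
    results.append([lesson_id, "1", "a"])
    return results

rows = [("L_0001", "ok prompt"), ("L_0002", "bad prompt"), ("L_0003", "ok prompt")]
results, failed = run_batch(rows, fake_answer, [])
print(len(results), [lesson for lesson, _ in failed])
# prints 2 ['L_0002']
```

With the real tables, the rows could be supplied as `zip(fspt['Lesson ID'], fspt['Prompt'])` and `lesson_answer` passed as `answer_fn`.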


**Run the Zero Shot experiment in batches**

In [None]:
zsResultsTable = []

In [None]:
subsetTable = zspt.iloc[0:49]
for row in subsetTable.itertuples():
  zsResultsTable = lesson_answer(row[1],row[2], zsResultsTable)

**Clean up**

We encounter some issues with the returned results:
1. The answer text is returned instead of the letter. For example, "true" instead of "a".
2. The answer text is included in addition to the letter. For example, "a. True" instead of "a".

I clean these up manually by exporting the results to a CSV, fixing the issues there, and then importing the corrected CSV.

Additionally, the dtype of the Question Number column needs to be set to int.
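
Part of this cleanup could be automated. The following is a minimal sketch, assuming true/false questions always list "a. true; b. false" (the same mapping applied to the Zero Shot answers further down); `normalize_answer` is a hypothetical helper, and anything it cannot reduce to a letter is left unchanged for manual review.

```python
import re

# Assumption: true/false questions list their choices as "a. true; b. false".
TRUE_FALSE_LETTERS = {"true": "a", "false": "b"}

def normalize_answer(raw):
    """Reduce a returned answer such as 'a. True' or 'true' to a single letter."""
    text = str(raw).strip().lower()
    if text in TRUE_FALSE_LETTERS:
        return TRUE_FALSE_LETTERS[text]
    # A lone leading letter ("a", "a. True", "c) shale") is taken as the choice.
    match = re.match(r"([a-z])\b", text)
    return match.group(1) if match else text

for raw in ["a. True", "true", "False", "c", "granite"]:
    print(repr(raw), "->", repr(normalize_answer(raw)))
```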

In [None]:
fsrat = pd.DataFrame(resultsTable,columns=['Lesson ID','Question Number','Returned Answer'])

In [None]:
zsrat = pd.DataFrame(zsResultsTable,columns=['Lesson ID','Question Number','Returned Answer'] )

In [None]:
convert_dict = {'Question Number': int}
zsrat = zsrat.astype(convert_dict)

In [None]:
fsrat_clean = fsrat.drop_duplicates(subset = ['Lesson ID','Question Number'],keep='first').reset_index(drop=True)

In [None]:
fsrat_clean = fsrat_clean[fsrat_clean['Lesson ID'] != 'L_0886']

In [None]:
convert_dict = {'Question Number': int}
fsrat_clean = fsrat_clean.astype(convert_dict)

In [None]:
zsrat_clean = pd.read_csv("/content/zs_answers_compared.csv")
zsrat = zsrat_clean.drop(['Unnamed: 0','Answer','Is Correct'],axis=1)

In [None]:
# Map spelled-out true/false answers to their letters; leave everything else as returned.
tf_to_letter = {'true': 'a', 'false': 'b'}
zsrat['Returned Answer'] = zsrat['Returned Answer'].apply(lambda a: tf_to_letter.get(a.lower(), a))

In [None]:
zsrat['Returned Answer'] = zsrat.apply(lambda x:  x['Returned Answer'][0], axis=1)

In [None]:
fsrat_clean.dtypes

Lesson ID          object
Question Number     int64
Returned Answer    object
dtype: object

**Create a table for each experiment that compares the results returned to the answer key. These are compt (for Few Shot) and zsCompt (for Zero Shot)**

In [None]:
compt = fsrat_clean.merge(zsakt, how='inner', on=['Lesson ID', 'Question Number'])

In [None]:
zsCompt = zsrat.merge(zsakt, how='inner', on=['Lesson ID', 'Question Number'])

**Compare the returned answer to the answer from the answer key, and set "Is Correct" to True if they are equal, and False if not equal**

In [None]:
compt['Is Correct'] = compt.apply(lambda x: x['Answer'] == x['Returned Answer'], axis=1)

In [None]:
zsCompt['Is Correct'] = zsCompt.apply(lambda x: x['Answer'] == x['Returned Answer'], axis=1)

**Zero Shot Accuracy**

In [None]:
len(zsCompt[zsCompt['Is Correct']==True])/len(zsCompt)

In [None]:
compt.to_csv("fs_answers_compared.csv")
zsCompt.to_csv("zs_answers_compared.csv")

**Few Shot Accuracy**

Some prompts from the Few Shot experiment had to be re-run manually in the playground. I exported the answers into a CSV, recorded the answers from the playground manually and imported the updated CSV.

In [None]:
fsCompt = pd.read_csv("/content/fs_answers_compared_fixed_playground.csv")

In [None]:
len(fsCompt[fsCompt['Is Correct']==1])/len(fsCompt)

0.8483927019982623

In [None]:
combined_results_table = zsCompt.merge (fsCompt,how='inner',left_on=['Lesson ID','Question Number'], right_on=['Lesson ID','Question Number'],suffixes=('_zs','_fs'))

**Stats on the combined results: 2,254 Questions and Answers across 179 lessons were run in both experiments. Only these common rows are compared.**

In [None]:
combined_results_table.nunique()

Lesson ID             179
Question Number        34
Returned Answer_zs     11
Answer_zs               7
Is Correct_zs           2
Returned Answer_fs     10
Answer_fs               7
Is Correct_fs           2
dtype: int64

In [None]:
len(combined_results_table)

2254

**Comparison of Accuracy for the common questions**

In [None]:
print("Zero Shot Accuracy: " + str(len(combined_results_table[combined_results_table['Is Correct_zs']==True])/len(combined_results_table)) + "  Few Shot Accuracy: " + str(len(combined_results_table[combined_results_table['Is Correct_fs']==True])/len(combined_results_table)))

Zero Shot Accuracy: 0.7275953859804791  Few Shot Accuracy: 0.8478260869565217
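
Beyond the two overall accuracies, a cross-tabulation shows how often the two runs agree on the same question. The sketch below uses five invented rows with the same `Is Correct_zs` and `Is Correct_fs` columns as `combined_results_table`; the numbers it prints say nothing about the real results.

```python
import pandas as pd

# Invented correctness flags for five questions, for illustration only.
demo = pd.DataFrame({
    "Is Correct_zs": [True, False, True, False, True],
    "Is Correct_fs": [True, True, True, False, False],
})

# The mean of a boolean column is the fraction of True values, i.e. the accuracy.
zs_accuracy = demo["Is Correct_zs"].mean()
fs_accuracy = demo["Is Correct_fs"].mean()

# Rows: Zero Shot correct or not; columns: Few Shot correct or not.
agreement = pd.crosstab(demo["Is Correct_zs"], demo["Is Correct_fs"])
print(zs_accuracy, fs_accuracy)
print(agreement)
```

The off-diagonal cells count the questions that only one of the two prompts answered correctly.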


In [None]:
combined_results_table.to_csv("combined_results_final.csv")