# Controlling GPT-3 Outputs for Question Answering

Here's a quick tutorial on how to use GPT-3 to search through a predefined list of facts and return the best answer to a question.

In [37]:
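#@title Installs
!pip install sentence-transformers spacy -q
# The code below uses the legacy openai.Completion interface, which was removed in openai 1.0
!pip install "openai<1.0" -q
# Download the spaCy English model in case it is not already installed
!python -m spacy download en_core_web_sm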

In [27]:
#@title Imports
from sentence_transformers import SentenceTransformer, util
import torch
import openai
import spacy

# Loading Models and Connecting to OpenAI

Below we load the models used to split text into sentences (spaCy) and to search through those sentences (Sentence Transformers). To connect to GPT-3, you'll need an OpenAI account and an API key, which you can get [here](https://beta.openai.com/account/api-keys).

In [38]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
embmodel = SentenceTransformer('msmarco-MiniLM-L-6-v3').to(device)
nlp = spacy.load("en_core_web_sm")

openai.api_key = "YOUR KEY HERE"

# Creating a List of Sentences to be Used as Answers

This is a predefined list of facts. The most relevant ones are passed to GPT-3, which chooses one of them as its response to the user's query.

In [None]:
biography = """
My name is Henry Leonardi.
I am a senior at Ohio State University.
I am studying linguistics.
I speak Spanish and Italian fluently.
I am from Cincinnati, Ohio.
I am a fourth year student studying linguistics and minoring in computer information systems.
I have 3 years of academic research experience in the field of Natural Language Processing (NLP).
I interned at Kyndi and currently work part-time as an NLP Engineer at Holocron Technologies.
At Kyndi, I helped automate data annotation tasks and am currently leading a project using GPT-3.
At Holocron, I trained and implemented text classification models and have added semantic search capabilities to their database.
I'm interested in machine learning and dialogue systems.
I'm excited about the potential of NLP and machine learning technologies.
"""

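# Break the text above into sentences, which will be used as answers to user queries.
# Newlines become spaces so that adjacent sentences stay separated for spaCy.
facts = [sent.text.strip() for sent in nlp(biography.strip().replace("\n", " ")).sents]
# Keep the embeddings as a tensor on the model's device so they can be compared with the query embedding
fact_embs = embmodel.encode(facts, convert_to_tensor=True)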
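
Because each fact in the string above already sits on its own line, a simpler alternative is to split on newlines and skip spaCy for this step. This is a minimal sketch of that alternative, not what the notebook does; `split_facts` and the shortened `sample_bio` are hypothetical names introduced here for illustration.

```python
def split_facts(text):
    """Return one fact per non-empty line, with surrounding whitespace removed."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# A shortened copy of the biography, used only for this illustration
sample_bio = """
My name is Henry Leonardi.
I am a senior at Ohio State University.
I am studying linguistics.
"""

sample_facts = split_facts(sample_bio)
print(sample_facts)
```

Sentence segmentation with spaCy remains the better choice when the source text is free-form prose rather than one fact per line.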

# Using a Query to Find Relevant Answers

In [41]:
def relevant_answers(query, label_embs, answers):

  # Embed the query using a sentence transformers model
  query_embedding = embmodel.encode(query, convert_to_tensor=True, show_progress_bar=False)

  # Find the sentences in the dataset most similar to the query
  cos_scores = util.cos_sim(query_embedding, label_embs)[0]
  # Cap k so that topk does not fail when there are fewer than 5 answers
  top_results = torch.topk(cos_scores, k=min(5, len(answers)))
  top_answers = [answers[i] for i in top_results.indices.tolist()]

  # Create a dictionary mapping number labels to the top answers
  answer_dict = dict(enumerate(top_answers))
  # The last label is a fallback for queries that have no answer in the fact dataset
  answer_dict[len(top_answers)] = "I don't have a good response to that question"
  return answer_dict
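
To make the ranking step concrete, here is a small pure-Python sketch of what `util.cos_sim` followed by `torch.topk` computes: the cosine similarity between the query vector and each fact vector, then the indices of the highest scores. The three-dimensional vectors are made-up toy values, not real embeddings, and `cosine` and `top_k` are hypothetical helper names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k(query_vec, fact_vecs, k=5):
    """Return the indices of the k facts most similar to the query, best first."""
    scores = [cosine(query_vec, vec) for vec in fact_vecs]
    k = min(k, len(scores))  # never ask for more results than there are facts
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy vectors standing in for embeddings (assumed values, for illustration only)
toy_fact_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
toy_query_vec = [1.0, 0.1, 0.0]

print(top_k(toy_query_vec, toy_fact_vecs, k=2))
```

The returned indices play the same role as the indices from `torch.topk`: they select which facts are shown to GPT-3.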

# Using GPT-3 to Choose the Best Answer

In [42]:
def find_answer(query, label_embs, answers):

    #Get the top answers
    answer_dict = relevant_answers(query, label_embs, answers)

    #Prompt GPT-3 with the top answers instructing it to return the index of the best one
    response = openai.Completion.create(
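      # text-davinci-003 has since been retired by OpenAI; gpt-3.5-turbo-instruct is its suggested replacement on this endpoint
      model="text-davinci-003",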
      prompt=f"Here is a dictionary with the answers and their number labels:\n\n{str(answer_dict)}\n\nThe chatbot takes user queries as inputs, and returns the number label of the answer that will help them the most (the answer doesn't always have to be a perfect match). \n\nUser Input: \"{query}\"\nChatbot's returned number label: ",
      temperature=0.7,
      max_tokens=1,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0
    )
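    try:
        # The completion should be a single number label; anything else falls through to the fallback
        num_label = int(response["choices"][0]["text"].strip())
        answer = answer_dict[num_label]
    except (ValueError, KeyError):
        answer = "Sorry, I don't have a good response for that"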
    return answer
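
The `try`/`except` above relies on the completion being a clean integer. As a hedged sketch of a stricter check, the helper below pulls the first integer out of the raw completion text and falls back when there is none or when it is not a key in the dictionary. `parse_label`, `FALLBACK`, and `example_answers` are hypothetical names introduced here; the code only processes a string and does not call the OpenAI API.

```python
import re

FALLBACK = "Sorry, I don't have a good response for that"


def parse_label(completion_text, answer_dict):
    """Map raw completion text to an answer, or to FALLBACK if no valid label is found."""
    match = re.search(r"\d+", completion_text)
    if match is None:
        return FALLBACK
    return answer_dict.get(int(match.group()), FALLBACK)


# A tiny stand-in for the dictionary built by relevant_answers
example_answers = {
    0: "I am studying linguistics.",
    1: "I don't have a good response to that question",
}

print(parse_label(" 0", example_answers))    # valid label
print(parse_label("none", example_answers))  # no number in the text
print(parse_label("7", example_answers))     # number that is not a label
```

Keeping the parsing separate from the API call also makes it possible to test this logic without spending tokens.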

In [50]:
find_answer("what are your interests?", fact_embs, facts)

"I'm interested in machine learning and dialogue systems."