# 3. Train a fine-tuning model specialized for Q&A
This notebook uses the dataset of context, question and answer pairs to additionally create adversarial question and context pairs, in which the question was not generated from that context. In those cases the model is trained to answer "No appropriate context found to answer the question." We will also train a discriminator model, which predicts whether or not the question can be answered from the context.

We will add hard adversarial examples as well, which will be based either on semantically similar sections, or neighbouring sections, originating from the same article.

In [26]:
import os

import openai
import pandas as pd

# Read the API key from the environment instead of hard-coding a secret in the notebook
openai.api_key = os.environ["OPENAI_API_KEY"]

df = pd.read_csv('deloitte_qa.csv')
olympics_search_fileid = "file-YwQzzozDj4JLtYIU3cqryyBv"
df.head()

| Unnamed: 0 | text | context | questions | answers |
|---|---|---|---|---|
| 0 | “Technologies that are easy to understand and ... | “Technologies that are easy to understand and ... | 1. What technologies does Pavel Šiška believe ... | 1. Pavel Šiška believes that technologies that... |
| 1 | Milan Kulhánek is a Partner in the Consulting ... | Milan Kulhánek is a Partner in the Consulting ... | 1. What is Milan Kulhánek's role at Deloitte C... | 1. Milan Kulhánek is a Partner in the Consulti... |
| 2 | “Technologies that are easy to understand and ... | “Technologies that are easy to understand and ... | 1. What technologies does Pavel Šiška believe ... | 1. Technologies that are easy to understand an... |
| 3 | For companies that want to use full potential ... | For companies that want to use full potential ... | 1. What services does the company offer to sup... | 1. The company offers a broad range of service... |
| 4 | Tax authorities require the taxpayers to prove... | Tax authorities require the taxpayers to prove... | 1. What do tax authorities require of taxpayer... | 1. Tax authorities require taxpayers to prove ... |


Split the sections into a training and testing set

In [19]:
df['title'] = df.context

In [20]:
from sklearn.model_selection import train_test_split
train_df, test_df = train_test_split(df, test_size=0.1, random_state=42)
len(train_df), len(test_df)

(42, 5)

In [21]:
df

| Unnamed: 0 | text | context | questions | answers | title |
|---|---|---|---|---|---|
| 0 | “Technologies that are easy to understand and ... | “Technologies that are easy to understand and ... | 1. What technologies does Pavel Šiška believe ... | 1. Pavel Šiška believes that technologies that... | “Technologies that are easy to understand and ... |
| 1 | Milan Kulhánek is a Partner in the Consulting ... | Milan Kulhánek is a Partner in the Consulting ... | 1. What is Milan Kulhánek's role at Deloitte C... | 1. Milan Kulhánek is a Partner in the Consulti... | Milan Kulhánek is a Partner in the Consulting ... |
| 2 | “Technologies that are easy to understand and ... | “Technologies that are easy to understand and ... | 1. What technologies does Pavel Šiška believe ... | 1. Technologies that are easy to understand an... | “Technologies that are easy to understand and ... |
| 3 | For companies that want to use full potential ... | For companies that want to use full potential ... | 1. What services does the company offer to sup... | 1. The company offers a broad range of service... | For companies that want to use full potential ... |
| 4 | Tax authorities require the taxpayers to prove... | Tax authorities require the taxpayers to prove... | 1. What do tax authorities require of taxpayer... | 1. Tax authorities require taxpayers to prove ... | Tax authorities require the taxpayers to prove... |
| 5 | \n  Tomáš Babáček is a partner in the Regulat... | \n  Tomáš Babáček is a partner in the Regulat... | 1. What is Tomáš Babáček's job title?\n2. What... | 1. Tomáš Babáček is a partner in the Regulator... | \n  Tomáš Babáček is a partner in the Regulat... |
| 6 | Working with Deloitte Private means you will w... | Working with Deloitte Private means you will w... | 1. What are the benefits of working with a tru... | 1. The benefits of working with a trusted advi... | Working with Deloitte Private means you will w... |
| 7 | It is ever more typical nowadays for business ... | It is ever more typical nowadays for business ... | 1. What are some of the benefits of an interna... | 1. Some of the benefits of an international bu... | It is ever more typical nowadays for business ... |
| 8 | Risk management tools consolidate a large numb... | Risk management tools consolidate a large numb... | 1. What are some of the benefits of using risk... | 1. Risk management tools can help reduce unnec... | Risk management tools consolidate a large numb... |
| 9 | It does not happen just in the Czech Republic ... | It does not happen just in the Czech Republic ... | 1. What is the main issue discussed in the tex... | 1. The main issue discussed in the text is tax... | It does not happen just in the Czech Republic ... |


We check that the separator we intend to use isn't present within the contexts.
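The notebook does not show the code for this check, so the following is only a minimal sketch of what it could look like. The function name `find_separator_collisions` and the sample contexts are illustrative assumptions; the separators are the ones that appear in the prompts built later in this notebook.

```python
def find_separator_collisions(contexts, separators=("\nQuestion:", "\nAnswer:", "\n Related:")):
    """Return (position, separator) pairs for contexts that already contain a separator."""
    collisions = []
    for i, context in enumerate(contexts):
        for sep in separators:
            if sep in context:
                collisions.append((i, sep))
    return collisions


# Toy contexts for illustration only; the second one contains a stray separator
sample_contexts = [
    "A paragraph about risk management tools.",
    "Some text.\nQuestion: is this a stray separator?",
]
print(find_separator_collisions(sample_contexts))
```

In the notebook itself the same function could be called on `df.context.tolist()`; an empty result would mean the separators are safe to use.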

## 3.1 Create the fine-tuning datasets for Q&A and discriminator models
The fine-tuning dataset is created in the following way. For every corresponding question, answer and context pair we create:
- Positive example: correct question, answer, context pair
- Negative examples:
  - random negative example, where the random context is paired with the question 
  - two hard negative examples
    - one originating from the same article (rows sharing a `title`); because `title` is set equal to `context` above, no such rows exist in this dataset and this case is skipped
    - another, which is most similar to the correct context

This process is noisy, as sometimes the question might be answerable given a different context, but on average we hope this won't affect the performance too much.

We apply the same process of dataset creation for both the discriminator and the Q&A model. We apply the process separately for the training and testing sets, to ensure that examples from the training set don't appear in the test set.
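To make the record format concrete, here is a small sketch of how a single positive or negative example is laid out for each model. The helper `make_record` and the toy contexts are illustrative assumptions; the prompt and completion strings follow the ones used in `create_fine_tuning_dataset`.

```python
NO_CONTEXT = " No appropriate context found to answer the question."


def make_record(context, question, answer=None, discriminator=False):
    """Build one prompt/completion record; answer=None marks a negative example."""
    if discriminator:
        prompt = f"{context}\nQuestion: {question}\n Related:"
        completion = " yes" if answer is not None else " no"
    else:
        prompt = f"{context}\nQuestion: {question}\nAnswer:"
        completion = f" {answer}" if answer is not None else NO_CONTEXT
    return {"prompt": prompt, "completion": completion}


# Toy data for illustration only
context = "The first human-made object in space was the Soviet Union satellite Sputnik 1 on 4 October 1957."
unrelated = "An unrelated paragraph about tax audits."
question = "What was the first human-made object in space?"

positive = make_record(context, question, answer="Sputnik 1")
negative = make_record(unrelated, question)
print(positive["completion"])
print(negative["completion"])
```

Every completion starts with a space, which is the convention the fine-tuning data preparation tool recommends for this prompt/completion format.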

In [22]:
import random

def get_random_similar_contexts(question, context, file_id=olympics_search_fileid, search_model='ada', max_rerank=10):
    """
    Find similar contexts to the given context using the search file
    """
    try:
        results = openai.Engine(search_model).search(
            search_model=search_model, 
            query=question, 
            max_rerank=max_rerank,
            file=file_id
        )
        candidates = []
        for result in results['data'][:3]:
            if result['text'] == context:
                continue
            candidates.append(result['text'])
        random_candidate = random.choice(candidates)
        return random_candidate
    except Exception as e:
        print(e)
        return ""

def create_fine_tuning_dataset(df, discriminator=False, n_negative=1, add_related=False):
    """
    Create a dataset for fine tuning the OpenAI model; either for a discriminator model, 
    or a model specializing in Q&A, where it says if no relevant context is found.

    Parameters
    ----------
    df: pd.DataFrame
        The dataframe containing the question, answer and context pairs
    discriminator: bool
        Whether to create a dataset for the discriminator
    n_negative: int
        The number of random negative samples to add (using a random context)
    add_related: bool
        Whether to add the related contexts to the correct context. These are hard negative examples

    Returns
    -------
    pd.DataFrame
        The dataframe containing the prompts and completions, ready for fine-tuning
    """
    rows = []
    for i, row in df.iterrows():
        for q, a in zip(("1." + row.questions).split('\n'), ("1." + row.answers).split('\n')):
            if len(q) >10 and len(a) >10:
                if discriminator:
                    rows.append({"prompt":f"{row.context}\nQuestion: {q[2:].strip()}\n Related:", "completion":f" yes"})
                else:
                    rows.append({"prompt":f"{row.context}\nQuestion: {q[2:].strip()}\nAnswer:", "completion":f" {a[2:].strip()}"})

    for i, row in df.iterrows():
        for q in ("1." + row.questions).split('\n'):
            if len(q) >10:
                for j in range(n_negative + (2 if add_related else 0)):
                    random_context = ""
                    if j == 0 and add_related:
                        # add the related contexts based on originating from the same wikipedia page
                        subset = df[(df.title == row.title) & (df.context != row.context)]
                        
                        if len(subset) < 1:
                            continue
                        random_context = subset.sample(1).iloc[0].context
                    elif j == 1 and add_related:
                        # add the related contexts based on the most similar contexts according to the search
                        random_context = get_random_similar_contexts(q[2:].strip(), row.context, search_model='ada', max_rerank=10)
                    else:
                        while True:
                            # add random context, which isn't the correct context
                            random_context = df.sample(1).iloc[0].context
                            if random_context != row.context:
                                break
                    if discriminator:
                        rows.append({"prompt":f"{random_context}\nQuestion: {q[2:].strip()}\n Related:", "completion":f" no"})
                    else:
                        rows.append({"prompt":f"{random_context}\nQuestion: {q[2:].strip()}\nAnswer:", "completion":f" No appropriate context found to answer the question."})

    return pd.DataFrame(rows) 

In [27]:
for name, is_disc in [('discriminator', True), ('qa', False)]:
    for train_test, dt in [('train', train_df), ('test', test_df)]:
        ft = create_fine_tuning_dataset(dt, discriminator=is_disc, n_negative=1, add_related=True)
        ft.to_json(f'{name}_{train_test}.jsonl', orient='records', lines=True)

Rate limit reached for default in organization org-uTbKXOI3BONOsvlv80LDu7Rn on requests per min. Limit: 60.000000 / min. Current: 64.000000 / min. Contact support@openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://beta.openai.com/account/billing to add a payment method.
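These rate-limit messages are printed by the `except` branch of `get_random_similar_contexts`, which then returns an empty string, so every rate-limited call produces a hard negative example with an empty context. One common way to avoid this is to retry with exponential backoff. The sketch below is an assumption about how that could be done, not part of the original notebook; `call_with_backoff` is a hypothetical helper, and a real version would probably catch only the library's rate-limit exception rather than every `Exception`.

```python
import random
import time


def call_with_backoff(fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # give up after the last attempt
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))


# Simulated flaky call that fails twice before succeeding
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("Rate limit reached")
    return "ok"


delays = []  # record the requested delays instead of actually sleeping
result = call_with_backoff(flaky, sleep=delays.append)
print(result, attempts["n"])
```

In the notebook, the search call inside `get_random_similar_contexts` could be wrapped as `call_with_backoff(lambda: ...)` so that a rate-limited request is retried instead of silently yielding an empty context.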

We formatted the data according to the recommendations from the fine-tuning tool, which is available using
> openai tools fine_tunes.prepare_data -f qa_train.jsonl

We highly recommend that you use this tool, which suggests improvements in your data formatting for fine-tuning.
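Before running the tool, a quick local sanity check of the JSONL files can catch obvious problems. The following is a lightweight sketch and not a substitute for `prepare_data`; the function `check_jsonl_records` and the sample lines are illustrative assumptions.

```python
import json


def check_jsonl_records(lines):
    """Return (line_number, problem) pairs for records that look malformed."""
    problems = []
    for n, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((n, "not valid JSON"))
            continue
        if not isinstance(record, dict) or set(record) != {"prompt", "completion"}:
            problems.append((n, "expected exactly the keys 'prompt' and 'completion'"))
            continue
        if not record["completion"].startswith(" "):
            problems.append((n, "completion should start with a space"))
    return problems


# Toy lines for illustration only
sample_lines = [
    '{"prompt": "Some context\\nQuestion: Why?\\nAnswer:", "completion": " Because."}',
    '{"prompt": "Some context\\nQuestion: Why?\\nAnswer:", "completion": "Because."}',
    '{"prompt": "missing completion"}',
    'not json',
]
for problem in check_jsonl_records(sample_lines):
    print(problem)
```

Because the function only iterates over lines, an open file handle for `qa_train.jsonl` can be passed in directly.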


## 3.2 Submit the datasets for fine-tuning

In [32]:
import sys


In [36]:
!pip3 install openai

Defaulting to user installation because normal site-packages is not writeable
You should consider upgrading via the '/Library/Frameworks/Python.framework/Versions/3.9/bin/python3.9 -m pip install --upgrade pip' command.


In [44]:
train_file = openai.File.create(file=open("discriminator_train.jsonl", "rb"), purpose="fine-tune")  # upload the data first
openai.FineTune.create(training_file=train_file["id"], model="ada")  # then reference the uploaded file's ID

In [43]:
!openai api fine_tunes.create -t "discriminator_train.jsonl" -v "discriminator_test.jsonl" --batch_size 16  --compute_classification_metrics --classification_positive_class " yes" --model ada


(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2022-07-04 17:45:14] Created fine-tune: ft-QDmUoNr3Sp3l5X5XjyFLkRlK
[2022-07-04 17:45:24] Fine-tune costs $0.09
[2022-07-04 17:45:24] Fine-tune enqueued. Queue number: 0
[2022-07-04 17:45:29] Fine-tune started
[2022-07-04 17:46:03] Completed epoch 1/4
[2022-07-04 17:46:19] Completed epoch 2/4
[2022-07-04 17:46:33] Completed epoch 3/4
[2022-07-04 17:46:46] Completed epoch 4/4
[2022-07-04 17:47:08] Uploaded model: ada:ft-personal-2022-07-04-15-47-06
[2022-07-04 17:47:10] Uploaded result file: file-bBQKc6u4S3qOAGmByA5CWnUx
[2022-07-04 17:47:11] Fine-tune succeeded

Job complete! Status: succeeded 🎉
Try out your fine-tuned model:

openai api completions.create -m ada:ft-personal-2022-07-04-15-47-06 -p <YOUR_PROMPT>

In [38]:
!openai api fine_tunes.create -t "qa_train.jsonl" -v "qa_test.jsonl" --batch_size 16

zsh:1: command not found: openai


## 3.3 Using the fine-tuned models

We will now use the fine-tuned discriminator and the fine-tuned Q&A model. By requesting logprobs, we can see how certain the discriminator is in a `yes` vs `no` answer.

In [47]:
ft_discriminator = "ada:ft-personal-2022-07-04-15-47-06"
ft_qa = "curie:ft-personal-2022-07-04-15-50-49"

def apply_ft_discriminator(context, question, discriminator_model):
    """
    Apply the fine tuned discriminator to a question, to assess whether it can be answered from the context.
    """
    prompt = f"{context}\nQuestion: {question}\n Related:"
    result = openai.Completion.create(model=discriminator_model, prompt=prompt, max_tokens=1, temperature=0, top_p=1, n=1, logprobs=2)
    return result['choices'][0]['logprobs']['top_logprobs']

apply_ft_discriminator('The first human-made object in space was the Soviet Union satellite Sputnik 1 on 4 October 1957.', 
                        'What was the first human-made object in space?', ft_discriminator)

[<OpenAIObject at 0x7f96188fc710> JSON: {
   " no": -2.2787914,
   " yes": -0.10880088
 }]

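The discriminator assigns a much higher log probability to ` yes` than to ` no`, even though this context and question are unrelated to the training data, which suggests that it generalizes to different contexts and questions.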
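Log probabilities are easier to read once converted to probabilities by exponentiating them. The helper below, `logprobs_to_probs`, is an illustrative assumption; the input values are copied from the output above.

```python
import math


def logprobs_to_probs(top_logprobs):
    """Convert a token -> log probability mapping into token -> probability."""
    return {token: math.exp(lp) for token, lp in top_logprobs.items()}


probs = logprobs_to_probs({" no": -2.2787914, " yes": -0.10880088})
for token, p in probs.items():
    print(repr(token), round(p, 3))
```

For this example the discriminator puts roughly 90% of the probability mass on ` yes` and roughly 10% on ` no`.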

In [49]:
QUESTION = 'Who is the partner for consulting in Deloitte?'

def apply_ft_qa_answer(context, question, answering_model):
    """
    Apply the fine-tuned Q&A model to a question, given a context
    """
    prompt = f"{context}\nQuestion: {question}\nAnswer:"
    result = openai.Completion.create(model=answering_model, prompt=prompt, max_tokens=30, temperature=0, top_p=1, n=1, stop=['.','\n'])
    return result['choices'][0]['text']

apply_ft_qa_answer('There are many partners in Deloitte, for example Jan Siska is the partner for Consulting.', 
                    QUESTION, ft_qa)


' No appropriate context found to answer the question'

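With an appropriate context we would expect the model to answer the question. In this run, however, it returned the fallback response even though the context names the Consulting partner, so the fine-tuned model can be overly conservative.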

In [10]:
apply_ft_qa_answer('The first human-made object in space was the Soviet Union satellite Sputnik 1 on 4 October 1957.',
                    'What is impressive about the Soviet Union?', ft_qa)

' The Soviet Union was the first country to successfully launch a satellite into space'

In [11]:
apply_ft_qa_answer('The first human-made object in space was the Soviet Union satellite Sputnik 1 on 4 October 1957.',
                    'How many cars were produced in the Soviet Union in 1970?', ft_qa)

' No appropriate context found to answer the question'

In these two examples the model answers when the context supports an answer, and says that no appropriate context was found when it does not.

We can also combine a discriminator with a base model or with a fine-tuned Q&A model. The discriminator then decides whether or not the question can be answered from the given context.

In [50]:
def answer_question_conditionally(answering_model, discriminator_model, context, question, discriminator_logprob_yes_modifier=0):
    # top_logprobs is a list with one dict per generated token; max_tokens=1, so take the first
    logprobs = apply_ft_discriminator(context, question, discriminator_model)[0]
    yes_logprob = logprobs[' yes'] if ' yes' in logprobs else -100
    no_logprob = logprobs[' no'] if ' no' in logprobs else -100
    if yes_logprob + discriminator_logprob_yes_modifier < no_logprob:
        return " No appropriate context found to answer the question based on the discriminator."
    return apply_ft_qa_answer(context, question, answering_model)
answer_question_conditionally(ft_qa, ft_discriminator, 
                                "Crowdless games are a rare although not unheard-of occurrence in sports. \
                                 When they do occur, it is usually the result of events beyond the control \
                                 of the teams or fans, such as weather-related concerns, public health concerns, \
                                 or wider civil disturbances unrelated to the game. For instance, \
                                 the COVID-19 pandemic caused many sports leagues around the world \
                                 to be played behind closed doors.",
                                "Could weather cause a sport event to have no crowd?")

' No'

The above function illustrates how to potentially combine a discriminator and a fine-tuned Q&A model. This gives a more fine-grained control over how certain we want the model to be before it answers the question.
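The effect of `discriminator_logprob_yes_modifier` can be seen in isolation with the decision rule alone. The function `should_answer` below is an illustrative restatement of the comparison in `answer_question_conditionally`, and the log probabilities are the ones from the Sputnik example earlier in this section.

```python
def should_answer(yes_logprob, no_logprob, yes_modifier=0.0):
    """Answer only if the (shifted) ' yes' log probability is at least the ' no' one."""
    return yes_logprob + yes_modifier >= no_logprob


yes_lp, no_lp = -0.10880088, -2.2787914

print(should_answer(yes_lp, no_lp))                     # no shift: answer
print(should_answer(yes_lp, no_lp, yes_modifier=-3.0))  # stricter: decline
```

A negative modifier makes the combined system more conservative, while a positive one makes it more willing to answer.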

We'll now take a look at how the answers endpoint works: it combines search, which retrieves the relevant context from a knowledge base, with the fine-tuned Q&A model, which answers the question given that context.

## 3.4 Answering the question based on a knowledge base
Finally we can use a logic similar to the [/answers](https://beta.openai.com/docs/api-reference/answers) endpoint, where we first search for the relevant context, and then ask a Q&A model to answer the question given that context. If you'd like to see the implementation details, check out the [`answers_with_ft.py`](answers_with_ft.py) file.
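The search-then-answer logic can be sketched without any API calls. This is not the implementation in `answers_with_ft.py`: the word-overlap ranking is a toy stand-in for the search step, and `retrieve_context`, `answer_from_knowledge_base` and `answer_fn` are hypothetical names introduced for illustration.

```python
import re


def _words(text):
    """Lowercased set of word tokens in text."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve_context(question, documents, max_docs=2):
    """Rank documents by word overlap with the question (toy stand-in for search)."""
    q_words = _words(question)
    ranked = sorted(documents, key=lambda d: len(q_words & _words(d)), reverse=True)
    return "\n\n".join(ranked[:max_docs])


def answer_from_knowledge_base(question, documents, answer_fn, max_docs=2):
    """Retrieve a context first, then let answer_fn(context, question) produce the answer."""
    context = retrieve_context(question, documents, max_docs)
    return answer_fn(context, question)


documents = ["Puppy A is happy.", "Puppy B is sad."]


def echo_prompt(context, question):
    # Stand-in for the Q&A model: returns the prompt that would be sent to it
    return f"{context}\nQuestion: {question}\nAnswer:"


print(answer_from_knowledge_base("Which puppy is happy?", documents, echo_prompt, max_docs=1))
```

In the notebook, `answer_fn` could be something like `lambda c, q: apply_ft_qa_answer(c, q, ft_qa)`, with the retrieval step replaced by a real search over the knowledge base.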

In [68]:
QUESTION = 'Who is the partner in the Regulatory & Compliance department at Deloitte Legal in Prague?'



from answers_with_ft import answer_question
answer_question(olympics_search_fileid, ft_qa, QUESTION)

' Tomáš Babáček is a partner in the Regulatory & Compliance department at Deloitte Legal in Prague'

In [72]:
ft_qa

'curie:ft-personal-2022-07-04-15-50-49'

In [None]:
import os
import openai
openai.api_key = os.environ["OPENAI_API_KEY"]  # read the key from the environment, never hard-code it
openai.Answer.create(
  search_model="ada",
  model="curie",
  question='Who is the partner in the Regulatory & Compliance department at Deloitte Legal in Prague',
  documents=["Puppy A is happy.", "Puppy B is sad."],
  examples_context="In 2017, U.S. life expectancy was 78.6 years.",
  examples=[["What is human life expectancy in the United States?","78 years."]],
  max_tokens=5,
  stop=["\n", "<|endoftext|>"],
)