<br>

# AquinasGPT
<br>

The objective of this project is to lay the groundwork for a chatbot that articulates its responses in the style of Thomas Aquinas' Summa Theologiae. This project holds special importance to me, given my study of the Summa at [Thomas Aquinas College](https://www.thomasaquinas.edu/) (I even wrote my senior thesis on Aquinas' work, the Division and Method of the Sciences, which can be found [here](https://www.linkedin.com/in/jonscheaffer/overlay/1635484058296/single-media-viewer/?profileId=ACoAAC4PAjwBaKwgpkGvOJTXmVHGwbVUXyklzvk)). Additionally, the Summa's distinctive structure of question and response lends itself well to the format of prompts and completions. I will first try a prompt engineering approach and then attempt fine-tuning.

<br>

For the purposes of this project I will be testing my prompts with the questions:

1.) What is the meaning of Life?

2.) Is the pursuit of career advancement the ultimate source of meaning in life?

3.) Is Data Science the greatest of all career paths?

<br>

I purposely start vague and then get more specific. My goal is to analyze the differences between prompt engineering and fine-tuning. My hypothesis is that fine-tuning will work best for questions 1 and 2 but will struggle with question 3. Prompt engineering, given its flexibility, may yield the best results for question 3, but it may fall short in style and format on questions 1 and 2.

In [23]:
question1 = "What is the meaning of Life?"
question2 = "Is the pursuit of career advancement the ultimate source of meaning in life?"
question3 = "Is Data Science the greatest of all career paths?"

In [20]:
#Imports and API key 
import os
from dotenv import load_dotenv
import openai
import pandas as pd

load_dotenv()
openai.api_key = os.environ.get('OPENAI_KEY')

# Prompt Engineering
<br>

### First Attempt with Vague Instructions
<br>

In [None]:
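def get_completion(prompt, model="gpt-3.5-turbo"):
    """Send a single user message and return the text of the model's reply."""
    # Uses the legacy ChatCompletion interface (openai package versions before 1.0).
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # minimize randomness so runs are easier to compare
    )
    return response.choices[0].message["content"]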

In [4]:
prompt =  f""" 
    In the same argument format and style that was used by Thomas Aquinas in his Summa Theologiae\
    answer the question [{question1}]. The answer should be from the perspective of the teachings of the \
    Catholic Church (though this does not have to be explicitly mentioned)\
    """
response = get_completion(prompt)
print(response)

In [2]:
prompt =  f""" 
    In the same argument format and style that was used by Thomas Aquinas in his Summa Theologiae\
    answer the question [{question2}]. The answer should be from the perspective of the teachings of the \
    Catholic Church (though this does not have to be explicitly mentioned)\
    """
response = get_completion(prompt)
print(response)

In [8]:
prompt =  f""" 
    In the same argument format and style that was used by Thomas Aquinas in his Summa Theologiae\
    answer the question [{question3}]. The answer should be from the perspective of the teachings of the \
    Catholic Church (though this does not have to be explicitly mentioned)\
    """
response = get_completion(prompt)
print(response)
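The three cells above differ only in the question they insert, so the prompt could be built by one function and reused. The following is a minimal sketch under that assumption: `build_vague_prompt` is a hypothetical helper that is not part of the original notebook, and it uses the same wording as the cells above with spelling normalized. It only builds the strings and makes no API call.

```python
def build_vague_prompt(question: str) -> str:
    """Build the first-attempt prompt for one question (hypothetical helper)."""
    return (
        "In the same argument format and style that was used by Thomas Aquinas "
        "in his Summa Theologiae, answer the question "
        f"[{question}]. "
        "The answer should be from the perspective of the teachings of the "
        "Catholic Church (though this does not have to be explicitly mentioned)."
    )


# The three test questions listed at the top of the notebook.
questions = [
    "What is the meaning of Life?",
    "Is the pursuit of career advancement the ultimate source of meaning in life?",
    "Is Data Science the greatest of all career paths?",
]

# One prompt per question, in the same order as the list above.
prompts = [build_vague_prompt(q) for q in questions]
```

Each entry of `prompts` could then be passed to `get_completion` in a loop instead of repeating the cell three times.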

### Second Attempt with Specific Instructions
<br>

In [None]:
prompt = f""" Answer in a clear way, in the manner of Thomas Aquinas
    not contradicting the teaching of the Catholic Church, the question [{question1}]
    answers should be in the format:
    
    Obj. 1: ...
    ...
    Obj. N: ...
    
    On the contrary, ...(concise)
    
    I answer that, ...(more nuanced)
    
    Reply to Obj. 1: ... (Note: if the "I answer that" answers an objection fully then no need for that reply)
    ...
    Reply to Obj. N: ...
    """
response = get_completion(prompt)
print(response)

In [None]:
prompt = f""" Answer in a clear way, in the manner of Thomas Aquinas
    not contradicting the teaching of the Catholic Church, the question [{question2}]
    answers should be in the format:
    
    Obj. 1: ...
    ...
    Obj. N: ...
    
    On the contrary, ...(concise)
    
    I answer that, ...(more nuanced)
    
    Reply to Obj. 1: ... (Note: if the "I answer that" answers an objection fully then no need for that reply)
    ...
    Reply to Obj. N: ...
    """
response = get_completion(prompt)
print(response)

In [None]:
prompt = f""" Answer in a clear way, in the manner of Thomas Aquinas
    not contradicting the teaching of the Catholic Church, the question [{question3}]
    answers should be in the format:
    
    Obj. 1: ...
    ...
    Obj. N: ...
    
    On the contrary, ...(concise)
    
    I answer that, ...(more nuanced)
    
    Reply to Obj. 1: ... (Note: if the "I answer that" answers an objection fully then no need for that reply)
    ...
    Reply to Obj. N: ...
    """
response = get_completion(prompt)
print(response)
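Since the goal is to compare style and format across approaches, it may help to check mechanically whether a response follows the requested layout. The sketch below is one possible way to do that, not something the original notebook does: `check_article_format` is a hypothetical helper, and the section markers it looks for are taken directly from the prompt template above. The sample text is a placeholder, not real model output.

```python
import re


def check_article_format(text: str) -> dict:
    """Count the sections of the requested article layout found in `text`."""
    # Lines that start with "Obj. N:" (lines starting with "Reply to" do not match).
    objections = re.findall(r"^[ \t]*Obj\. (\d+):", text, flags=re.MULTILINE)
    # Lines that start with "Reply to Obj. N:".
    replies = re.findall(r"^[ \t]*Reply to Obj\. (\d+):", text, flags=re.MULTILINE)
    return {
        "objections": len(objections),
        "has_on_the_contrary": "On the contrary," in text,
        "has_i_answer_that": "I answer that," in text,
        "replies": len(replies),
    }


# Placeholder text that mimics the requested layout.
sample = (
    "Obj. 1: It seems that ...\n"
    "Obj. 2: Further, ...\n\n"
    "On the contrary, ...\n\n"
    "I answer that, ...\n\n"
    "Reply to Obj. 1: ...\n"
)

report = check_article_format(sample)
```

Because the prompt allows a reply to be omitted when the "I answer that" already resolves an objection, fewer replies than objections is not treated as an error here; the counts are only reported.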


# Fine-Tuning
<br>


In [25]:
#Reading csv cleaned in Data_prep file
ft = pd.read_csv("fine_tuning.csv")[["prompt", "completion"]]

#Converting to JSON Lines (one record per line)
ft.to_json("AquinasGPT.jsonl", orient='records', lines=True)
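The data-preparation output in this section reports that every prompt ends with `\n\n###\n\n` and every completion ends with `\nEND`, and it recommends a leading space on each completion. The Data_prep file that produces these suffixes is not shown here, so the following is only a sketch of how a single record could be formatted to match: `format_record` and the two constants are hypothetical names, and the example question and answer are placeholders.

```python
import json

PROMPT_SEPARATOR = "\n\n###\n\n"  # suffix reported for every prompt
COMPLETION_STOP = "\nEND"         # suffix reported for every completion


def format_record(question: str, answer: str) -> dict:
    """Return one prompt-completion pair in the reported layout (hypothetical helper)."""
    prompt = question.strip() + PROMPT_SEPARATOR
    # Leading space on the completion, as the preparation tool recommends.
    completion = " " + answer.strip() + COMPLETION_STOP
    return {"prompt": prompt, "completion": completion}


# Placeholder content for illustration only.
record = format_record("Whether God exists?", "Obj. 1: ... I answer that, ...")

# One JSON object per line is the JSONL layout written by to_json(..., lines=True).
line = json.dumps(record)
```

Keeping the separator and stop string in named constants makes it easier to reuse exactly the same values later when querying the fine-tuned model.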

In [26]:
!openai tools fine_tunes.prepare_data -f AquinasGPT.jsonl -q

Analyzing...

- Your file contains 582 prompt-completion pairs
- There are 13 examples that are very long. These are rows: [194, 208, 245, 352, 358, 366, 367, 385, 421, 437, 530, 580, 581]
For conditional generation, and for classification the examples shouldn't be longer than 2048 tokens.
- All prompts end with suffix `\n\n###\n\n`
- All completions end with suffix `\nEND`
- The completion should start with a whitespace character (` `). This tends to produce better results due to the tokenization we use. See https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset for more details

Based on the analysis we will perform the following actions:
- [Recommended] Remove 13 long examples [Y/n]: Y
- [Recommended] Add a whitespace character to the beginning of the completion [Y/n]: Y


Your data will be written to a new JSONL file. Proceed [Y/n]: Y

Wrote modified file to `AquinasGPT_prepared (2).jsonl`
Feel free to take a look!

Now use that file when fine-tuning:
> openai a

In [None]:
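openai.Completion.create(
    model=FINE_TUNED_MODEL,  # ID of the fine-tuned model; not defined in this notebook
    prompt=question1 + "\n\n###\n\n",  # training prompts end with this separator
    stop=["\nEND"],  # training completions end with this suffix
)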

In [None]:
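openai.Completion.create(
    model=FINE_TUNED_MODEL,  # ID of the fine-tuned model; not defined in this notebook
    prompt=question2 + "\n\n###\n\n",  # training prompts end with this separator
    stop=["\nEND"],  # training completions end with this suffix
)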

In [None]:
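openai.Completion.create(
    model=FINE_TUNED_MODEL,  # ID of the fine-tuned model; not defined in this notebook
    prompt=question3 + "\n\n###\n\n",  # training prompts end with this separator
    stop=["\nEND"],  # training completions end with this suffix
)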