# Classification with LLMs

Modern LLMs have been trained on very large datasets to predict appropriate responses to queries. Through this training, they also become able to classify queries, and they can be instructed to return these classifications by specifying the task in the system prompt.

For this implementation, the system prompt is very important: it lets us describe the different classes and even give the model examples of correct classifications. When we provide no examples, the process is called "**zero-shot**" classification. Otherwise, the process is called "**few-shot**" classification, in which case the system prompt contains examples of the desired response.

We will now build a few-shot prompt to instruct the model to classify whether a query is a question about biology or a question about finance.

In [1]:
from openai import OpenAI
from Constants import OPENAI_API_KEY

delimiter = "####"

system_prompt = f"""
You will be provided with user queries and your task is to classify whether they are about finance or about biology. 

The user queries will be delimited with {delimiter} characters

As an output, provide one word, either "Finance" or "Biology"

Some examples of queries and how you should respond to them:

{delimiter}What is a credit default swap?{delimiter}
Finance

{delimiter}What are the mitochondria?{delimiter}
Biology

"""
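
For comparison, a zero-shot variant of the same prompt simply omits the examples. The sketch below builds both variants from one template. The helper `build_system_prompt` and its parameters are hypothetical names introduced here for illustration; they are not part of the OpenAI library.

```python
DELIMITER = "####"


def build_system_prompt(examples=(), delimiter=DELIMITER):
    """Build the classification prompt; with no examples it is zero-shot."""
    lines = [
        "You will be provided with user queries and your task is to classify "
        "whether they are about finance or about biology.",
        "",
        f"The user queries will be delimited with {delimiter} characters.",
        "",
        'As an output, provide one word, either "Finance" or "Biology".',
    ]
    if examples:
        # Few-shot: append each delimited query followed by its expected label.
        lines += ["", "Some examples of queries and how you should respond to them:"]
        for query, label in examples:
            lines += ["", f"{delimiter}{query}{delimiter}", label]
    return "\n".join(lines)


zero_shot_prompt = build_system_prompt()
few_shot_prompt = build_system_prompt(
    examples=[
        ("What is a credit default swap?", "Finance"),
        ("What are the mitochondria?", "Biology"),
    ]
)
```

Keeping the instructions in one place means the two variants differ only in the examples, which makes later comparisons between them fairer.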

Few-shot prompts generally make the model more likely to generate the desired responses. The system prompt is one of the most important parts of an application, as it largely determines how the model will generate text for upcoming queries. The process of designing a prompt to specify the desired behavior is called "prompt engineering". When building an app that relies on LLMs, testing and comparing the model's responses under different prompts is a very useful step that leads to better prompt design.
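
One way to compare prompts is to score each of them against a small hand-labeled set of queries. The following is a minimal sketch of that idea, not a full evaluation framework. The function `accuracy`, the test set, and the keyword-based stand-in classifier are all hypothetical; the stand-in exists only so the sketch runs without network access, and in practice `classify` would wrap an API call that uses the prompt under test.

```python
def accuracy(classify, labeled_queries):
    """Fraction of queries for which classify(query) returns the expected label."""
    correct = sum(
        classify(query).strip().lower() == label.lower()
        for query, label in labeled_queries
    )
    return correct / len(labeled_queries)


# Hypothetical hand-labeled test set.
test_set = [
    ("What is compound interest?", "Finance"),
    ("How do enzymes speed up reactions?", "Biology"),
    ("What is a stock dividend?", "Finance"),
]


def keyword_classifier(query):
    """Offline stand-in for a model call (assumption, for illustration only)."""
    finance_words = ("interest", "stock", "dividend")
    return "Finance" if any(w in query.lower() for w in finance_words) else "Biology"


score = accuracy(keyword_classifier, test_set)
```

Running `accuracy` once per candidate prompt on the same test set gives a simple number to compare, although a set this small can only reveal large differences.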

Now we use the API to finish the classification task. The steps are the same as in the previous notebook.

In [2]:
client = OpenAI(api_key=OPENAI_API_KEY)

In [3]:
query = "What is the difference between a Roth IRA and a 401(k)?"
model = "gpt-3.5-turbo-1106"
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": f"{delimiter}{query}{delimiter}"},
]
response = client.chat.completions.create(model=model, messages=messages)
response.choices[0].message.content

'Finance'

In [4]:
query = "What is the life cycle of fungi?"
model = "gpt-3.5-turbo-1106"
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": f"{delimiter}{query}{delimiter}"},
]
response = client.chat.completions.create(model=model, messages=messages)
response.choices[0].message.content

'Biology'
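
The two cells above repeat the same steps, and they trust the model to return exactly one of the two labels. A possible refinement, sketched below, wraps the call in a function and validates the reply. The names `classify`, `ALLOWED_LABELS`, and `fake_client` are hypothetical; `fake_client` is an offline stand-in that mimics only the attributes used here, so the sketch can run without an API key. With the real `client` created earlier, the call would be `classify(client, system_prompt, query)`.

```python
from types import SimpleNamespace

ALLOWED_LABELS = {"finance": "Finance", "biology": "Biology"}


def classify(client, system_prompt, query, model="gpt-3.5-turbo-1106", delimiter="####"):
    """Send one delimited query and return a validated label, or None."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"{delimiter}{query}{delimiter}"},
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    raw = response.choices[0].message.content
    # Normalize whitespace, surrounding punctuation, and case before validating.
    return ALLOWED_LABELS.get(raw.strip().strip(".\"'").lower())


class _FakeCompletions:
    """Offline stand-in that always answers with a fixed reply (assumption)."""

    def __init__(self, reply):
        self.reply = reply

    def create(self, model, messages):
        message = SimpleNamespace(content=self.reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


def fake_client(reply):
    return SimpleNamespace(chat=SimpleNamespace(completions=_FakeCompletions(reply)))


label = classify(fake_client(" finance.\n"), "any prompt", "What is a bond?")
```

Returning `None` for an unexpected reply lets the calling code decide whether to retry, fall back to a default, or report an error.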

# Moderation with OpenAI

Just as LLMs can perform classification, they can also analyze natural language for moderation purposes. In a sense, moderation is just a kind of classification, so it is a natural extension of the classification task. OpenAI provides a specialized version of the GPT models that has been trained for moderation. All we need to do is send a request to the moderation endpoint of the API, and it will return the classifications generated by this version of the LLM.

For this example we use a negative review and then a positive review for the movie Whiplash (2014), taken from Rotten Tomatoes.

In [6]:
client.moderations.create(input="This is a student film in rampant overdrive and it will attract attention and offers. So I just hope Mr. Chazelle doesn't believe too much in his film's dumb message.").results

[Moderation(categories=Categories(harassment=False, harassment_threatening=False, hate=False, hate_threatening=False, self_harm=False, self_harm_instructions=False, self_harm_intent=False, sexual=False, sexual_minors=False, violence=False, violence_graphic=False, self-harm=False, sexual/minors=False, hate/threatening=False, violence/graphic=False, self-harm/intent=False, self-harm/instructions=False, harassment/threatening=False), category_scores=CategoryScores(harassment=0.004748139530420303, harassment_threatening=1.6683547983120661e-06, hate=7.284914772753837e-06, hate_threatening=5.816911952116754e-10, self_harm=4.6644157691844157e-07, self_harm_instructions=8.147496899368889e-09, self_harm_intent=5.143712300537118e-08, sexual=7.438193279085681e-06, sexual_minors=9.856715905698366e-07, violence=5.102516297483817e-05, violence_graphic=3.5347511584404856e-06, self-harm=4.6644157691844157e-07, sexual/minors=9.856715905698366e-07, hate/threatening=5.816911952116754e-10, violence/graphi

In [7]:
client.moderations.create(input="This is as much psychological thriller as musical drama and it turns on the increasingly toxic chemistry between two clearly damaged people, to the exclusion of pretty much anyone else in the film").results

[Moderation(categories=Categories(harassment=False, harassment_threatening=False, hate=False, hate_threatening=False, self_harm=False, self_harm_instructions=False, self_harm_intent=False, sexual=False, sexual_minors=False, violence=False, violence_graphic=False, self-harm=False, sexual/minors=False, hate/threatening=False, violence/graphic=False, self-harm/intent=False, self-harm/instructions=False, harassment/threatening=False), category_scores=CategoryScores(harassment=0.0025453385896980762, harassment_threatening=1.0351127457397524e-06, hate=0.00028362826560623944, hate_threatening=1.3501354700906631e-08, self_harm=6.758399831596762e-05, self_harm_instructions=5.868030505951083e-09, self_harm_intent=5.6470224052418416e-08, sexual=1.9095161405857652e-05, sexual_minors=2.957196088004821e-08, violence=0.0002453227061778307, violence_graphic=4.71908333565807e-06, self-harm=6.758399831596762e-05, sexual/minors=2.957196088004821e-08, hate/threatening=1.3501354700906631e-08, violence/grap

Neither review was flagged in any category as violating OpenAI's usage policies. The negative review had a higher score for the harassment category (about 0.0047 versus 0.0025), likely because of the wording and sentiment expressed in it; regardless, it is not classified as harassment.
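
The raw output is hard to read, so it can help to rank the category scores. The sketch below does this on a plain dictionary; `top_categories` is a hypothetical helper, and the scores are a subset copied from the negative-review output above. With a live response, the dictionary could probably be obtained with `result.category_scores.model_dump()`, assuming the SDK's result objects are pydantic models as in version 1.x of the `openai` package.

```python
def top_categories(scores, n=3):
    """Return the n highest-scoring (category, score) pairs, highest first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:n]


# A few scores copied from the negative-review output above.
negative_review_scores = {
    "harassment": 0.004748139530420303,
    "hate": 7.284914772753837e-06,
    "violence": 5.102516297483817e-05,
    "sexual": 7.438193279085681e-06,
}

top = top_categories(negative_review_scores, n=2)
```

Here `top` lists harassment first and violence second, which matches the observation that harassment is the highest-scoring category for this review even though its score remains very small.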