# Github repo: https://github.com/jusbraun/genai_classifier_dataharvest

## When should we be using LLMs?
1. When I can tell when it's wrong: simple coding tasks (instant feedback)
2. When it doesn't matter that it's wrong: summary of unimportant meeting that I will probably never look at again.

When neither of these conditions is met, it's important to set up your pipeline so that you can measure the error rate. This requires two things: constraining the LLM's choices so that it sticks to the provided options, and a hand-labelled test set against which you can benchmark the LLM's performance.
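
As a rough sketch of what "measuring the error rate" can look like, the snippet below compares model predictions with hand labels and counts any answer outside the allowed options as an error. The label set, the toy data, and the function name `error_rate` are made up for illustration and are not part of the workshop code.

```python
# Hypothetical label set and toy data, for illustration only.
ALLOWED_LABELS = frozenset({"confession", "intoxication", "other circumstances", "false"})

def error_rate(predictions, gold_labels, allowed=ALLOWED_LABELS):
    """Share of items where the prediction differs from the hand label.

    A prediction outside the allowed options always counts as an error.
    """
    if len(predictions) != len(gold_labels):
        raise ValueError("predictions and gold_labels must have equal length")
    if not gold_labels:
        raise ValueError("cannot compute an error rate on an empty test set")
    errors = sum(
        1
        for pred, gold in zip(predictions, gold_labels)
        if pred not in allowed or pred != gold
    )
    return errors / len(gold_labels)

gold = ["confession", "false", "intoxication", "confession"]
pred = ["confession", "false", "confession", "remorse"]  # "remorse" is not an allowed option
print(error_rate(pred, gold))  # 2 errors out of 4 -> 0.5
```

The larger and more representative the hand-labelled set, the more you can trust the resulting number.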

But it's always worth asking: Can I regex this? Or is there some other kind of deterministic rule set I can use to get to the same result? When you have the option, it's always better to use an approach where you fully understand every step along the way!!!
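
To make the "Can I regex this?" question concrete, here is a minimal sketch of a deterministic keyword rule. The pattern and the example sentences are invented for illustration; a real rule set would have to be checked against the hand-labelled data just like an LLM.

```python
import re

# Hypothetical pattern: flags paragraphs that mention a confession.
CONFESSION_PATTERN = re.compile(r"\bconfess(?:ed|es|ion|ions)?\b", re.IGNORECASE)

def mentions_confession(text):
    """Return True if the text contains a form of the word 'confess'."""
    return CONFESSION_PATTERN.search(text) is not None

# Made-up example sentences, not taken from the test set.
examples = [
    "The defendant confessed at the first police interview.",
    "The court cannot see any reason to reduce the sentence.",
]
for text in examples:
    print(mentions_confession(text), "|", text)
```

A rule like this is fully transparent, but it only detects the keyword: "the defendant did not confess" would match as well, which is exactly the kind of case where a simple rule stops being enough.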

## Using LLM APIs
### OLLAMA
Ollama allows you to run various LLMs locally on your machine. This includes models optimized for text-only tasks (e.g., Llama 3) as well as multimodal models (e.g., LLaVA).
1. Download Ollama at https://ollama.com/
2. In your Terminal, type 'ollama pull <model_name>'
3. In your Terminal, type 'ollama run <model_name>'. You now have a chatbot that runs locally on your machine!
4. If you want more than a chatbot and need to run systematic tasks, check out Ollama for Python ('pip install ollama', https://github.com/ollama/ollama-python) or the code below. Your Python code connects to your locally running LLM, so all the advantages of running LLMs locally still apply.

### OpenAI
OpenAI provides access to its various models through an easy-to-use API.
1. Buy API credits at https://platform.openai.com/
2. Create an API key and set it as an environment variable (OPENAI_API_KEY)
3. Install OpenAI's package: 'pip install openai'
4. Also check out OpenAI's API tutorials (https://platform.openai.com/docs/quickstart).
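
One way to handle step 2 is to read the key from the environment and fail early with a clear message if it is missing. This is only a sketch: the helper name `get_openai_api_key` is hypothetical, and the OpenAI client also picks up `OPENAI_API_KEY` on its own when it is set.

```python
import os

def get_openai_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"Environment variable {env_var} is not set. "
            "Set it in your shell before starting Python."
        )
    return key

try:
    get_openai_api_key()
    print("API key found")
except RuntimeError as err:
    print(err)
```

Keeping the key in the environment rather than in the notebook makes it less likely that it ends up in a shared file or a public repository.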

There are many additional packages that give you an integrated interface for accessing various LLMs. The LangChain packages are especially useful (https://github.com/langchain-ai/langchain).

In [1]:
import ollama #Ollama python library
from openai import OpenAI #OpenAI Python library

import pandas as pd
import time
import json
import os

#lists of distinct aggravating and mitigating circumstances used for classification
from mitigating_aggravating_ai_lists import aggravating_list, mitigating_list 

## The scenario
This workshop builds on a real-world (ongoing) investigation into potential bias in the Norwegian criminal justice system. It builds on work by Christian Nicolai and Henrik Bøe.

In order to analyze whether decisions are fair, we need to compare similar criminal cases. A key factor in determining the severity of punishment is the presence or absence of mitigating and aggravating circumstances. But we have thousands of cases and not enough people power to hand-code all of them. So we thought, maybe an LLM can get the job done for us.

The task is pretty straightforward: We have 'test_set' (paragraphs from judgements with mitigating/aggravating circumstances, plus hand-coded labels) as well as aggravating_list and mitigating_list (unique aggravating/mitigating circumstances, each corresponding roughly to a letter in the Norwegian criminal code). The LLM's task is to figure out which, if any, aggravating or mitigating circumstance applies to each paragraph.

Because the investigation is ongoing, and due to privacy concerns, we cannot share the underlying data for this analysis. But you can use this notebook as a template for your own classification tasks.

In [2]:
test_set = pd.read_csv('../data/test_set.csv', index_col='Unnamed: 0') #load test_data
test_set.head() # show test_data

Unnamed: 0,id,type,text,hand_coded
1,2,mitigating,"LE-2017 -68515 concerned a pair of parents, or...",other
2,2,mitigating,The court cannot see that there are mitigating...,FALSE
3,3,aggravating,When assessing the punishment for item I a) an...,multiple crimes grouped into single sentence
4,3,aggravating,"The court finds that items I b) and c), as wel...",multiple crimes grouped into single sentence
5,3,aggravating,"In this assessment, the court has taken into a...",intoxication


In [3]:
# for the Open AI API to work, you need to set your API key as an environment variable.
os.environ["OPENAI_API_KEY"] = "your_OpenAI_API key"

In [4]:
# starts an OpenAI client
client = OpenAI()

In [5]:
# System prompts give LLMs a 'role' to play. In our case, the prompt tells the LLM to classify the judgements,
# specifies the format of the response, and lists the unique aggravating/mitigating circumstances
systems_prompt_aggravating = """You are an assistant who reads short text extracts from judgements. You must decide whether the text contains information about any aggravating circumstances in the case. You must identify the most relevant aggravating circumstance from the list below and enter it in JSON format in English.
You must only mention circumstances that are aggravating for this sentence, not circumstances for previous sentences. If it does not say that the circumstance is aggravating or if the text about the circumstance is written in the past tense, you must answer {"aggravating": "false"}.
You must answer in json format in English in this way: {"aggravating": "death threats"}. Name only one aggravating circumstance, the one you think is most important.
If you do not find a relevant circumstance in the list, answer {"aggravating": "other circumstances"}. If information about aggravating circumstances is missing, or if you are unable to interpret it, answer {"aggravating": "false"}.
If the wording of the circumstance is written in the past tense, for example "it was aggravating" or "it was aggravating", answer {"aggravating": "false"}.
Here is the list:
"""+aggravating_list

systems_prompt_mitigating = """You are an assistant who reads short text extracts from judgements. You must decide whether the text contains information about any mitigating circumstances in the case. You must identify the most relevant mitigating circumstance from the list below and enter it in JSON format in English.
You must only mention circumstances that are stated to be mitigating for this case, not circumstances that are mitigating for previous court cases. If it does not say that the circumstance is mitigating, or if the text about the circumstance is written in the past tense, you must answer {"mitigating": "false"}.
You must respond in json format in English in this way: {"mitigating": "self defense"}. Name only one mitigating circumstance, the one you think is the most important.
If you do not find a relevant circumstance in the list, answer {"mitigating": "other circumstances"}. If information about mitigating circumstances is missing or if you are unable to interpret it, answer {"mitigating": "false"}.
If the wording of the circumstance is written in the past tense, for example "it was mitigating" or "there was mitigating", answer {"mitigating": "false"}.
Here is the list:
"""+mitigating_list
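
Because the prompts ask for a single-key JSON object, each reply can be checked mechanically before it is trusted. The sketch below is one possible way to do that; the function name `parse_label` and the small `allowed` set are assumptions for illustration, not the actual contents of `mitigating_list`.

```python
import json

def parse_label(raw_reply, expected_key, allowed_labels):
    """Return the label from a JSON reply, or None if the reply is unusable."""
    try:
        reply = json.loads(raw_reply)
    except (json.JSONDecodeError, TypeError):
        return None  # not valid JSON (or not a string at all)
    # Expect exactly one key, matching the type of circumstance we asked about.
    if not isinstance(reply, dict) or list(reply.keys()) != [expected_key]:
        return None
    label = reply[expected_key]
    if label is False:
        label = "false"  # the model may return a JSON boolean instead of the string
    if not isinstance(label, str):
        return None
    label = label.strip().lower()
    # Reject anything that is not one of the provided options.
    return label if label in allowed_labels else None

# Hypothetical subset of options, for illustration only.
allowed = {"confession", "self defense", "other circumstances", "false"}
print(parse_label('{"mitigating": "Confession"}', "mitigating", allowed))  # confession
print(parse_label('{"mitigating": "remorse"}', "mitigating", allowed))     # None
print(parse_label("Sure! Here is the JSON you asked for.", "mitigating", allowed))  # None
```

Returning `None` for every unusable reply keeps the failure cases visible, so they can be counted as errors instead of silently disappearing from the evaluation.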

## Customizing your API calls
Using APIs gives you more control over the LLM's behavior than an online chatbot does. There are a few parameters you want to pay especially close attention to:
1. The model you are using
2. Role: 'system' prompts allow you to give general behavioral instructions to the LLM. 'user' prompts are the actual messages the LLM responds to.
3. Seed: Setting a seed makes the model's output more reproducible (as long as the model itself and the other parameters don't change). Pick any number.
4. Temperature: how 'creative' the LLM should be. For most tasks you will want to set the temperature to zero.

The full list of parameters for Ollama and GPT are available here: https://pypi.org/project/ollama-python/ and here: https://platform.openai.com/docs/api-reference/chat/create

If you want to get fancy, you can also look into training a custom LLM or optimizing your local LLM's performance.

In [6]:
test_set_classified = [] #results list (converted to a data frame below)
my_seed = 9999 #seed

#loop over each row in the data frame
for index, row in test_set.iterrows():
    
    ### SET UP SYSTEMS PROMPT
    systems_prompt = ""
    if row['type'] == 'mitigating':
        systems_prompt = systems_prompt_mitigating
    else:
        systems_prompt = systems_prompt_aggravating
    
    ### OLLAMA ###
    start_llama = time.time() # measure start time
    response_llama = ollama.chat(model='llama3', #specify model
                                 messages=[
      {
        'role': 'system', # systems prompt
        'content': systems_prompt,
      },{
          'role' : 'user', # user prompt (the actual judgement paragraph)
          'content' : row['text']
      }
    ], options={
            'seed' : my_seed, # set seed
            'temperature' : 0 # set temperature
        }
    )
    end_llama = time.time() # measure end time
    time_llama = end_llama - start_llama #calculate run time
    row['time_llama'] = time_llama #save run time
    #extract Llama response
    try:
        response_llama_json = json.loads(response_llama['message']['content'])
        response_llama_json_keys = list(response_llama_json.keys())
        response_llama_json_values = list(response_llama_json.values())
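    except (json.JSONDecodeError, AttributeError, TypeError):
        print('Llama did not return JSON')
        test_set_classified.append(row)
        continue # skip to the next row (a bare 'next' does nothing in Python)
    #ensure that response is well formatted
    if not (len(response_llama_json_keys) == 1 and response_llama_json_keys[0] == row['type']):
        print('JSON Llama output for index:', index, 'not well formatted')
        test_set_classified.append(row) # keep the row so it still shows up in the results
        continue # skip to the next row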
    #save response
    row['llama'] = response_llama_json_values[0]
    
    
    ### OPEN AI ###
    start_gpt = time.time() #measure start time
    response_gpt35 = client.chat.completions.create(
      model="gpt-3.5-turbo", # specify model
      response_format={ "type": "json_object" },  #specify response type
      messages=[
        {'role': 'system', 'content': systems_prompt}, #systems prompt
        {'role': 'user', 'content': row['text']} #user prompt (the actual judgement paragraph)
      ],
        seed=my_seed, # set seed
        temperature=0 # set temperature
    )
    end_gpt = time.time() # measure end time
    time_gpt = end_gpt - start_gpt #calculate run time
    row['time_gpt'] = time_gpt # save run time
    
    # extract openai response
    try:
        response_gpt35_json = json.loads(response_gpt35.choices[0].message.content)
        response_gpt35_json_keys = list(response_gpt35_json.keys())
        response_gpt35_json_values = list(response_gpt35_json.values())
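    except (json.JSONDecodeError, AttributeError, TypeError):
        print('GPT did not return JSON')
        test_set_classified.append(row)
        continue # skip to the next row (a bare 'next' does nothing in Python)
    # ensure that response is well formatted
    if not (len(response_gpt35_json_keys) == 1 and response_gpt35_json_keys[0] == row['type']):
        print('JSON GPT 3.5 output for index:', index, 'not well formatted')
        test_set_classified.append(row) # keep the row so it still shows up in the results
        continue # skip to the next row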
    # save response
    row['gpt35'] = response_gpt35_json_values[0]
    

    test_set_classified.append(row) # add new row to results list
    
    print(row)


id                                                            2
type                                                 mitigating
text          LE-2017 -68515 concerned a pair of parents, or...
hand_coded                                                other
time_llama                                             6.010967
llama                                    young age of defendant
time_gpt                                               0.750304
gpt35                                   limited mental capacity
Name: 1, dtype: object
id                                                            2
type                                                 mitigating
text          The court cannot see that there are mitigating...
hand_coded                                                FALSE
time_llama                                             1.472724
llama                                                     false
time_gpt                                               0.700658
gpt35            

Llama did not return JSON
id                                                           11
type                                                aggravating
text            - 3 - 21-153912MED -TVIN/TGJO The court's as...
hand_coded                                     repeated offense
time_llama                                            17.389808
llama              multiple crimes grouped into single sentence
time_gpt                                               0.931248
gpt35                                          repeated offense
Name: 17, dtype: object
id                                                           13
type                                                aggravating
text          Violence against the police entails as a start...
hand_coded    multiple crimes grouped into single sentence; ...
time_llama                                             4.866497
llama                                          repeated offense
time_gpt                                              

id                                                           18
type                                                aggravating
text          Section 79 letter a of the Criminal Code appli...
hand_coded                                     repeated offense
time_llama                                             1.946537
llama                                          repeated offense
time_gpt                                               0.939189
gpt35                                          repeated offense
Name: 33, dtype: object
id                                                           18
type                                                 mitigating
text          Neurological specialist statement of 05.01.202...
hand_coded                                                FALSE
time_llama                                             4.088214
llama                                                     false
time_gpt                                               0.822137
gpt35           

id                                                           23
type                                                aggravating
text          The defendant is an almost 29-year-old. In the...
hand_coded                                     repeated offense
time_llama                                             4.892401
llama                                          repeated offense
time_gpt                                               0.597961
gpt35                                          repeated offense
Name: 49, dtype: object
id                                                           24
type                                                aggravating
text          In aggravating circumstances, the defendant wa...
hand_coded             driving related aggravating circumstance
time_llama                                             1.457929
llama                  driving related aggravating circumstance
time_gpt                                               0.575475
gpt35           

id                                                           31
type                                                aggravating
text          3 The court's assessment of the sentencing Whe...
hand_coded    multiple crimes grouped into single sentence; ...
time_llama                                             5.591339
llama                                                   threats
time_gpt                                               1.202747
gpt35                                                   threats
Name: 65, dtype: object
id                                                           31
type                                                 mitigating
text          In addition, the court emphasized Section 78, ...
hand_coded                              limited mental capacity
time_llama                                             4.594898
llama                                   limited mental capacity
time_gpt                                               0.687383
gpt35           

id                                                           41
type                                                aggravating
text          Strl Section 77 k) stipulates that, in an aggr...
hand_coded                                     repeated offense
time_llama                                             1.417902
llama                               previous criminal sanctions
time_gpt                                               0.521691
gpt35                                          repeated offense
Name: 81, dtype: object
id                                                           41
type                                                aggravating
text          Both mother and daughter have experienced the ...
hand_coded    vulnerable victim, underaged victim, victim de...
time_llama                                              2.58952
llama                                violation of private place
time_gpt                                               0.870421
gpt35           

id                                                           48
type                                                aggravating
text          In the present case, the sexual intercourse to...
hand_coded    vulnerable victim, underaged victim, victim de...
time_llama                                             3.519679
llama         vulnerable victim, underaged victim, victim de...
time_gpt                                               0.868374
gpt35         vulnerable victim, underaged victim, victim de...
Name: 97, dtype: object
id                                                           48
type                                                 mitigating
text          Based on the fact that there are three cases o...
hand_coded                                                FALSE
time_llama                                             3.527243
llama                                                     false
time_gpt                                               0.717744
gpt35           

id                                                           56
type                                                aggravating
text          It follows from the Road Traffic Act section 3...
hand_coded             driving related aggravating circumstance
time_llama                                             5.200485
llama                                              intoxication
time_gpt                                               0.877053
gpt35                                              intoxication
Name: 113, dtype: object
id                                                           56
type                                                aggravating
text          In the direction of increasing penalties, emph...
hand_coded                                     repeated offense
time_llama                                             1.677345
llama                                          repeated offense
time_gpt                                               1.378974
gpt35          

id                                                           61
type                                                 mitigating
text          Furthermore, the long time that has passed sin...
hand_coded                                 long processing time
time_llama                                             1.906274
llama                                      long processing time
time_gpt                                               0.631128
gpt35                                      long processing time
Name: 129, dtype: object
id                                                           62
type                                                aggravating
text          It is aggravating that the defendant hit the v...
hand_coded                 risk to life or health of the victim
time_llama                                              4.48423
llama                                          repeated offense
time_gpt                                               0.626135
gpt35          

id                                                           67
type                                                 mitigating
text          The defendant has made unreserved confessions ...
hand_coded                                           confession
time_llama                                             3.442395
llama                                                confession
time_gpt                                               0.664829
gpt35                                                confession
Name: 145, dtype: object
id                                                           70
type                                                 mitigating
text          It is the robbery in post I that is the most s...
hand_coded                                           confession
time_llama                                             2.091442
llama                                                confession
time_gpt                                               0.521902
gpt35          

In [7]:
test_set_classified = pd.DataFrame(test_set_classified) #convert results list to pandas dataframe
test_set_classified.head()

Unnamed: 0,id,type,text,hand_coded,time_llama,llama,time_gpt,gpt35
1,2,mitigating,"LE-2017 -68515 concerned a pair of parents, or...",other,6.010967,young age of defendant,0.750304,limited mental capacity
2,2,mitigating,The court cannot see that there are mitigating...,FALSE,1.472724,false,0.700658,confession
3,3,aggravating,When assessing the punishment for item I a) an...,multiple crimes grouped into single sentence,5.812534,multiple crimes grouped into single sentence,0.688785,repeated offense
4,3,aggravating,"The court finds that items I b) and c), as wel...",multiple crimes grouped into single sentence,1.685954,multiple crimes grouped into single sentence,1.011522,multiple crimes grouped into single sentence
5,3,aggravating,"In this assessment, the court has taken into a...",intoxication,3.75493,"vulnerable victim, underaged victim, victim de...",0.998499,driving related aggravating circumstance


In [8]:
test_set_classified.to_csv('../data/test_set_classified.csv') #save results data frame
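
With the classified results in hand, the natural next step is to benchmark each model against the hand-coded labels. The following is a minimal sketch under stated assumptions: rows are plain dictionaries (for example from `test_set_classified.to_dict('records')`), multiple hand-coded labels are separated by `;`, and a prediction counts as correct if it matches any of them. The toy rows and the function names are made up for illustration.

```python
def is_match(prediction, hand_coded):
    """True if the prediction equals any ';'-separated hand-coded label (case-insensitive)."""
    if not isinstance(prediction, str):
        return False  # a missing prediction (None or NaN) counts as a miss
    gold = {part.strip().lower() for part in str(hand_coded).split(";")}
    return prediction.strip().lower() in gold

def accuracy(rows, model_column):
    """Share of rows where the model's label matches the hand-coded label."""
    if not rows:
        raise ValueError("no rows to evaluate")
    hits = sum(is_match(row.get(model_column), row["hand_coded"]) for row in rows)
    return hits / len(rows)

# Toy rows, invented for illustration; not the real test set.
rows = [
    {"hand_coded": "confession", "llama": "confession", "gpt35": "confession"},
    {"hand_coded": "FALSE", "llama": "false", "gpt35": "confession"},
    {"hand_coded": "repeated offense; intoxication", "llama": "intoxication", "gpt35": None},
    {"hand_coded": "other", "llama": "young age of defendant", "gpt35": "other"},
]
for model in ("llama", "gpt35"):
    print(model, accuracy(rows, model))
```

Before comparing, it is worth checking that the hand-coded vocabulary and the options given to the model are spelled the same way (for instance 'other' versus 'other circumstances'); otherwise correct answers are counted as errors.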