# Workshop 2

In this analysis...

By the end of this analysis, you will have learned how:

## Environment Setup 

In [None]:
import sys
if 'google.colab' in sys.modules:  # If in Google Colab environment
    # Mount google drive to enable access to data files
    from google.colab import drive
    drive.mount('/content/drive')
    
    # Installing requisite packages
    !pip install --upgrade transformers openai huggingface_hub &> /dev/null

    # Change working directory 
    %cd /content/drive/MyDrive/LLM4BeSci_EADM2024/workshop_3

In [None]:
import pandas as pd
import seaborn as sns
from tqdm.notebook import tqdm_notebook as tqdm
from huggingface_hub import InferenceClient
import textwrap

## Zero-shot Classification: Media Bias
We begin by loading the dataset as a `pandas.DataFrame`:

In [None]:
media_bias_test = pd.read_csv('media_bias_test.csv')
media_bias_test

In [None]:
# Initialize client
api_key = '<your access token here>' 
client = InferenceClient(token=api_key)

# Zero-shot classification prompt
zero_shot_prompt = "Is this text neutral or partisan? Strictly answer with only 'neutral' or 'partisan':\n"

zero_shot_labels = []
for tweet in tqdm(media_bias_test['text']):    
    
    # Zero-shot classification 
    output = client.chat_completion(
        messages=[
            {"role": "system", "content": "You are a thoughtful political scientist who accurately distinguishes neutral and partisan messages"},
            {"role": "user", "content": zero_shot_prompt + tweet}
        ],
        model="meta-llama/Meta-Llama-3-8B-Instruct",
        max_tokens=100,
        temperature=0.0
    )
    
    # Accessing the text output and lowercasing it
    output = output.choices[0].message.content.lower()
    
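    # Extract label and append to list; 'nan' marks responses containing neither word.
    # Note: 'neutral' is checked first, so a response mentioning both words is labeled 'neutral'.
    label = 'neutral' if 'neutral' in output else 'partisan' if 'partisan' in output else 'nan'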
    zero_shot_labels.append(label)

# Add zero-shot labels to dataframe
media_bias_test['zero_shot_label'] = zero_shot_labels
media_bias_test
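
The label extraction in the loop above gives priority to `'neutral'`, so a response such as "not neutral, partisan" would be labeled `neutral`. A minimal sketch of a stricter parser follows. The helper name `extract_label` is hypothetical (it is not part of the workshop code), and treating ambiguous responses as `'nan'` is one possible design choice, not the only one.

```python
import re

def extract_label(response: str) -> str:
    """Map a free-text model response to 'neutral', 'partisan', or 'nan'."""
    # Collect whole-word mentions of either label, ignoring case
    found = set(re.findall(r"\b(neutral|partisan)\b", response.lower()))
    # Accept the response only if exactly one of the two labels is mentioned
    return found.pop() if len(found) == 1 else "nan"

# Invented example responses, for illustration only
for text in ["Partisan.", "It is not neutral; it is partisan.", "I cannot tell."]:
    print(repr(text), "->", extract_label(text))
```

Counting how many responses end up as `'nan'` is a quick check on whether the model is following the "answer with only one word" instruction.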

In [None]:
# Comparing zero-shot and actual labels
print(f'Zero-shot accuracy: {(media_bias_test["zero_shot_label"] == media_bias_test["bias"]).mean()}')

In [None]:
# Confusion matrix
confusion = pd.crosstab(media_bias_test['bias'], media_bias_test['zero_shot_label'])
sns.heatmap(confusion, annot=True)
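
Overall accuracy can hide differences between the two classes, so it can help to read per-class precision and recall off the same counts shown in the confusion matrix. The sketch below uses plain Python and a small invented set of labels so that it runs without the dataset. The function name `per_class_metrics` and the toy labels are assumptions for illustration.

```python
def per_class_metrics(true_labels, predicted_labels, classes=("neutral", "partisan")):
    """Return {class: (precision, recall)} for paired label sequences."""
    pairs = list(zip(true_labels, predicted_labels))
    metrics = {}
    for cls in classes:
        true_pos = sum(1 for t, p in pairs if t == cls and p == cls)
        predicted = sum(1 for _, p in pairs if p == cls)  # times the model said `cls`
        actual = sum(1 for t, _ in pairs if t == cls)     # times `cls` was the true label
        precision = true_pos / predicted if predicted else float("nan")
        recall = true_pos / actual if actual else float("nan")
        metrics[cls] = (precision, recall)
    return metrics

# Toy labels, invented for illustration; 'nan' marks an unparseable response
toy_true = ["neutral", "neutral", "partisan", "partisan", "partisan", "neutral"]
toy_pred = ["neutral", "partisan", "partisan", "partisan", "nan", "neutral"]

toy_metrics = per_class_metrics(toy_true, toy_pred)
for cls, (precision, recall) in toy_metrics.items():
    print(f"{cls}: precision={precision:.2f}, recall={recall:.2f}")
```

The same function should accept the dataframe columns directly, e.g. `per_class_metrics(media_bias_test['bias'], media_bias_test['zero_shot_label'])`, since it only iterates over the paired values.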

**TASK 1**: Try playing around with the prompt. Can you find one that increases the accuracy of Llama 3?

**TASK 2**: Try playing around with different `temperature` values (e.g. 0.5, 1.0, and 3.0) and see how it affects the accuracy.
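
To build intuition for what `temperature` does, the sketch below applies the standard temperature-scaled softmax to a set of invented token scores. It is an illustration of the general idea only: the logits are made up, the function name is hypothetical, and how the inference backend handles `temperature=0.0` (commonly treated as greedy decoding) is an assumption not confirmed here.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw token scores into sampling probabilities at a given temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.0]  # invented scores for three candidate tokens
for t in (0.5, 1.0, 3.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures flatten the distribution, which is why high values tend to produce more variable and less instruction-following output.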

## Synthetic Participants: The Berlin Numeracy Test
In this section, we explore the use of causal LLMs as synthetic participants in a psychological experiment. We will again use Llama 3, this time the larger 70B instruction-tuned variant, to solve the [Berlin Numeracy Test](https://doi.org/10.1017/S1930297500001819), a widely used test of an individual's ability to understand and apply statistical concepts.

The test consists of four questions that require a basic understanding of probability and statistics. In this exercise, we ask Llama 3 to answer each question and then evaluate the quality of its responses.

The code begins by defining the four questions: 

In [None]:
q1 = """
Imagine we are throwing a five-sided die 50 times. On average, out of these 50 throws how many times would this five-sided die show an odd number (1, 3 or 5)?
"""

q2 = """
Out of 1,000 people in a small town 500 are members of a choir. Out of these 500 members in the choir 100 are men. Out of the 500 inhabitants that are not in the choir 300 are men. What is the probability that a randomly drawn man is a member of the choir? (please indicate the probability in percent).
"""

q3 = """
Imagine we are throwing a loaded die (6 sides). The probability that the die shows a 6 is twice as high as the probability of each of the other numbers. On average, out of these 70 throws, how many times would the die show the number 6?
"""

q4 = """
In a forest 20% of mushrooms are red, 50% brown and 30% white. A red mushroom is poisonous with a probability of 20%. A mushroom that is not red is poisonous with a probability of 5%. What is the probability that a poisonous mushroom in the forest is red?
"""

In [None]:
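# Re-initialize the client (only needed if the earlier cell was not run)
api_key = '<your access token here>'
client = InferenceClient(token=api_key)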

# Loop through questions and generate responses
for i, question in enumerate([q1, q2, q3, q4]):
    print('-------------------------')   
    
    # Define prompt
    question = question + " Make your answer as brief as possible using less than 50 words."
    
    output = client.chat_completion(
        messages=[
            {"role": "system", "content": "You are a helpful assistant who is good at solving numerical problems."},
            {"role": "user", "content": question}
        ],
        model="meta-llama/Meta-Llama-3-70B-Instruct",
        max_tokens=100,
        temperature=0.0
    )
    
    # Accessing the text output 
    response = output.choices[0].message.content
    
    # Format question and response for printing
    question = '\n'.join(textwrap.wrap(question, 100))
    response = '\n'.join(textwrap.wrap(response, 100))
    print(f"Question {i+1}: {question} \n\nAnswer: {response}\n")

[ADD EVALUATION]
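
As one possible starting point for evaluation, the sketch below derives the reference answer to each of the four questions from the numbers stated in the question text, then checks whether a response mentions that number. The names `reference_answers` and `mentions_answer` are hypothetical, and the check is deliberately crude: it accepts a response if the expected number appears anywhere in it, so manual inspection of the responses is still advisable.

```python
import re
from fractions import Fraction

# Reference answers computed from the quantities given in each question
reference_answers = {
    "q1": Fraction(3, 5) * 50,                        # 3 of 5 faces are odd, 50 throws
    "q2": Fraction(100, 100 + 300) * 100,             # men in choir / all men, in percent
    "q3": Fraction(2, 7) * 70,                        # P(6) = 2/7 since 5p + 2p = 1, 70 throws
    "q4": Fraction(20 * 20, 20 * 20 + 80 * 5) * 100,  # Bayes' rule: P(red | poisonous), in percent
}

def mentions_answer(response: str, expected) -> bool:
    """Return True if any number in the response equals the expected answer."""
    # Drop thousands separators, then pull out integers and decimals
    numbers = re.findall(r"\d+(?:\.\d+)?", response.replace(",", ""))
    return any(float(n) == float(expected) for n in numbers)

for name, answer in reference_answers.items():
    print(name, "->", answer)

# Invented example response, for illustration only
print(mentions_answer("The die shows an odd number 30 times.", reference_answers["q1"]))
```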

[ADD EXERCISES]