# AI, Decision-Making and Society Problem Set #1




## Overview

The goal of this homework is to provide some hands-on experience with exploratory data analysis and data curation. We will work step-by-step through investigating a dataset, training some ML models, and evaluating the models. We will apply a minimal version of the Grounded Theory approach to define custom categories and consider the effects of different design decisions on model performance. Overall, the homework is divided into 5 parts:

1.   Spinning Up:  Prepare the environment and load the dataset.
2.   Quantitative Evaluation: Train a simple model and evaluate its performance.
3.   Exploratory Analysis: Analyze the dataset, improving it through manual cleaning and modifications.
4.   Custom Training & Evaluation: Conduct prompt engineering to explore the capabilities of large language models.
5.   Reflections on the Process: Reflect on the entire process and discuss the results.

This notebook is designed to be run in a Google Colab environment.

**Submission Instructions**: Please submit a PDF of your completed notebook to Gradescope. Make sure that all cell outputs are included. This assignment is due by 11:59 PM on Wednesday, 9/18/24.  

## Spinning Up

In this section, we will set up our environment for the rest of the assignment. We will:

*   Set up your colab to interact with the Gemini language model
*   Download a dataset to use for the assignment
*   Implement code to train a classifier

### Setting up your coding environment

First, we will set up the python environment to interact with libraries for model training and data manipulation.
Also, make sure to enable the GPU for the notebook by navigating to the 'Runtime' tab, clicking on 'Change runtime type', and then selecting one of the available GPUs.

In [None]:
## Install the generative AI interface
!pip install -U -q google-generativeai
!pip install transformers

In [None]:
## Imports in order to call relevant libraries
import re
import tqdm
import keras
import numpy as np
import pandas as pd
import os

import google.generativeai as genai

from google.colab import userdata

import seaborn as sns
import matplotlib.pyplot as plt

from keras import layers
from matplotlib.ticker import MaxNLocator
import sklearn.metrics as skmetrics

Next, we will need to configure our code to connect to the language model server. You can do this with a Colab Secret named `GOOGLE_API_KEY`. If you don't already have this configured (or you're unsure if you do) follow the instructions in the [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) quickstart.

In [None]:
API_KEY=userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=API_KEY)

### Dataset

Now that we have configured our environment, we need some data to get started. For this assignment, we will use the [Employee Review dataset](https://www.kaggle.com/datasets/fiodarryzhykau/employee-review/code). This is a dataset of 880 employee performance reviews with a combined metric that assesses the performance & potential of the employees, measured on a scale from 1 to 9. We will consider the potential use of AI systems trained on this data.

### Getting the Data

Download the dataset from [Kaggle](https://www.kaggle.com/datasets/fiodarryzhykau/employee-review/code) and extract the csv files train_set.csv and test_set.csv to your local machine. Then, upload it to the Colab environment.

You can find instructions for uploading a file to a colab session [here](https://saturncloud.io/blog/how-to-use-google-colab-to-work-with-local-files/#:~:text=Uploading%20Files%20to%20Google%20Colab&text=Click%20on%20the%20%E2%80%9CFiles%E2%80%9D%20tab,for%20the%20upload%20to%20complete.).

## Question 1: Looking at the Data and Documentation

Most data has documentation that describes how it was collected, what its intended purposes were, and known issues or risks. (If your data doesn't have this, it's generally good practice to ask why!) Before we interact with the data, look at the documentation for this dataset and some example entries on [Kaggle](https://www.kaggle.com/datasets/fiodarryzhykau/employee-review/data). Answer the following questions with 1-3 sentences each.

#### Q 1.1: How was this dataset constructed?

---fill your answer here---

#### Q 1.2: What was the original purpose for the dataset?

---fill your answer here---

#### Q 1.3: What does it mean that the dataset is "partially reviewed"? Why might this be important?

---fill your answer here---

#### Q 1.4: Identify and describe one potentially appropriate and one potentially inappropriate application of the dataset (or a model trained on it). These applications should be hypothetical and do not need to directly correspond to a specific real-world use case of this dataset.

---fill your answer here---

#### Related Reading

[Datasheets for Datasets](https://arxiv.org/abs/1803.09010) articulates the motivation and reasoning behind this documentation.

Now that we've considered the data source, let's look at the data! We'll do that by reading the .csv files into a [pandas](https://pandas.pydata.org/) dataframe (using the command `pd.read_csv`) and displaying the first few rows of the dataframe.

In [None]:
train_df = pd.read_csv('train_set.csv')
train_df.set_index('id', inplace=True)

test_df = pd.read_csv('test_set.csv')
test_df.set_index('id', inplace=True)

Run this cell to see an example of what a data point from the training set looks like.

In [None]:
train_df.head()

## Question 2: Define and train the classifier

In this section, we will generate embeddings for the feedback text of each review and then try to predict each employee's performance/potential score from them.
The embeddings are generated in the same way as in the following text classification [example](https://colab.research.google.com/github/google-gemini/cookbook/blob/main/examples/Classify_text_with_embeddings.ipynb#scrollTo=_mwJYXpElYJc).

### Creating the embeddings

Create the embeddings using the embedding-001 model, as in the provided classification example. Make sure that the GPU is enabled for your Colab notebook.

###### Remark: the embedding generation might take a while, so be patient.

In [None]:
from tqdm.auto import tqdm
tqdm.pandas()

from google.api_core import retry

def make_embed_text_fn(model):

  @retry.Retry(timeout=300.0)
  def embed_fn(text: str) -> list[float]:
    # Set the task_type to CLASSIFICATION.
    embedding = genai.embed_content(model=model,
                                    content=text,
                                    task_type="classification")
    return embedding['embedding']

  return embed_fn

def create_embeddings(model, df):
  df['Embeddings'] = df['feedback'].progress_apply(make_embed_text_fn(model))
  return df

In [None]:
model_embeddings = 'models/embedding-001'
df_train = create_embeddings(model_embeddings, train_df)
df_test = create_embeddings(model_embeddings, test_df)

Now, if we look at the data, we can see a new column that contains the embedding for each review. These are opaque numbers that show how a model represents a particular example. We will use these embeddings to train a custom classifier next.

In [None]:
train_df.head()

In [None]:
test_df.head()

## Question 3: Build a simple classification model

Following the standard classifier example, we will now build a classifier composed of two fully-connected layers and use it to predict the performance/potential category from the feedback embeddings.

In [None]:
def build_classification_model(input_size: int, num_classes: int) -> keras.Model:
  inputs = x = keras.Input(shape=(input_size,))
  x = layers.Dense(input_size, activation='relu')(x)
  # Output raw logits; the loss below is configured with from_logits=True.
  x = layers.Dense(num_classes)(x)
  return keras.Model(inputs=[inputs], outputs=x)

In [None]:
# Derive the embedding size from the first training element.
embedding_size = len(df_train['Embeddings'].iloc[0])

# Give your model a different name, as you have already used the variable name
# 'model'
classifier = build_classification_model(embedding_size,
                                        len(df_train['label'].unique()))
classifier.summary()

classifier.compile(loss = keras.losses.SparseCategoricalCrossentropy(
                          from_logits=True),
                   optimizer = keras.optimizers.Adam(learning_rate=0.0001),
                   metrics=['accuracy'])

### Question 3.1: Training the model

In order to train this model, we need to set the training data for our classifier. Modify the code below so that the training data are stored in the following variables:

```
y_train, x_train
```
and the validation data in
```
y_val, x_val
```
Note that, similarly to the reference [example](https://colab.research.google.com/github/google-gemini/cookbook/blob/main/examples/Classify_text_with_embeddings.ipynb#scrollTo=_mwJYXpElYJc), you might find it useful to use the function "np.stack" for the concatenation of the embeddings.


Furthermore, feel free to try several different values of NUM_EPOCHS and BATCH_SIZE to see how they affect the model's performance.
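
As a small illustration of the `np.stack` hint, the sketch below uses a made-up list of three 4-dimensional embeddings (the values and the name `toy_embeddings` are assumptions for illustration, not data from the assignment). Stacking turns a sequence of equal-length vectors into a single 2-D array with one row per example, which is the kind of input shape the classifier above expects.

```python
import numpy as np

# Hypothetical embeddings: three examples, four dimensions each.
toy_embeddings = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]

# np.stack joins equal-length vectors along a new first axis.
x_toy = np.stack(toy_embeddings)

# One row per example, one column per embedding dimension.
print(x_toy.shape)
```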

In [None]:
NUM_EPOCHS = 40
BATCH_SIZE = 8

# Configure the training data to fit the classifier.
y_train = ## YOUR CODE TO EXTRACT TRAINING LABELS HERE
x_train = ## YOUR CODE TO EXTRACT TRAINING INPUTS HERE
y_val   = ## YOUR CODE TO EXTRACT VALIDATION LABELS HERE
x_val   = ## YOUR CODE TO EXTRACT VALIDATION INPUTS HERE

# Train the model for the desired number of epochs.
history = classifier.fit(x=x_train,
                         y=y_train,
                         validation_data=(x_val, y_val),
                         batch_size=BATCH_SIZE,
                         epochs=NUM_EPOCHS,)

### Question 3.2: Evaluating the model

We will use Keras' <a href="https://www.tensorflow.org/api_docs/python/tf/keras/Model#evaluate"><code>Model.evaluate</code></a> procedure to get the loss and the accuracy on the test dataset.

In [None]:
## Evaluate classifier on validation data
classifier.evaluate(x=x_val, y=y_val, return_dict=True)

When evaluating model performance, we need to consider the possibility that our evaluation got lucky. One way to approach this is through the use of confidence intervals. These tell us a range of possible accuracy values that are consistent with the data (see a more detailed description in the supporting PDF file). Calculate the 95% confidence interval for the accuracy of the model using the formula

\begin{equation}
    \mathrm{CI} = \mathrm{accuracy} \pm 1.96 \times \sqrt{\frac{\mathrm{accuracy} \times (1 - \mathrm{accuracy})}{\mathrm{test \ size}}}
\end{equation}
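
To get a feel for the scale of this interval, here is a worked example with made-up numbers: an accuracy of 0.60 measured on 100 test examples. Both values are assumptions chosen for illustration, not results from this dataset.

```python
import math

# Hypothetical values, chosen only to illustrate the formula.
accuracy = 0.60
test_size = 100

# One-sided width of the 95% interval: 1.96 * sqrt(acc * (1 - acc) / n)
half_width = 1.96 * math.sqrt(accuracy * (1 - accuracy) / test_size)

lower, upper = accuracy - half_width, accuracy + half_width
print(f"accuracy = {accuracy:.2f} +/- {half_width:.3f}")
print(f"95% CI = [{lower:.3f}, {upper:.3f}]")
```

With these numbers the one-sided width is roughly 0.096, so true accuracies anywhere from about 0.50 to 0.70 are consistent with the measurement. Because the width shrinks with the square root of the test size, quadrupling the number of test examples only halves it.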

In [None]:
def build_CI(y_hat, y_true):
  # The function should return the one-sided width of the confidence interval,
  # as described in the equation above.
  ## FILL IN WITH YOUR CODE TO COMPUTE THE CONFIDENCE INTERVALS WIDTH ##
  return CI_onesided_width

y_hat_test = classifier.predict(x=x_val)
y_hat_test = np.argmax(y_hat_test, axis=1)
print('========== CIs ==========')
initial_CI_onesided_width = build_CI(y_hat_test, y_val)
print(initial_CI_onesided_width)

It is often useful to look at performance over the course of training. The next cell gives code to create this plot.

In [None]:
def plot_history(history):
  """
    Plotting training and validation learning curves.

    Args:
      history: model history with all the metric measures
  """
  fig, (ax1, ax2) = plt.subplots(1,2)
  fig.set_size_inches(20, 8)

  # Plot loss
  ax1.set_title('Loss')
  ax1.plot(history.history['loss'], label = 'train')
  ax1.plot(history.history['val_loss'], label = 'test')
  ax1.set_ylabel('Loss')

  ax1.set_xlabel('Epoch')
  ax1.legend(['Train', 'Validation'])

  # Plot accuracy
  ax2.set_title('Accuracy')
  ax2.plot(history.history['accuracy'],  label = 'train')
  ax2.plot(history.history['val_accuracy'], label = 'test')
  ax2.set_ylabel('Accuracy')
  ax2.set_xlabel('Epoch')
  ax2.legend(['Train', 'Validation'])

  plt.show()

plot_history(history)

#### Question 3.3

Why is there a gap between the accuracy on the training set and the accuracy on the validation set? Why does the training loss decrease while the validation loss stagnates?

---Fill your answer here---

Run this code. This will be used for Question 3.4, but will also be used in other parts later.

In [None]:
y_hat = classifier.predict(x=x_val)
y_hat = np.argmax(y_hat, axis=1)

In [None]:
labels_dict = dict(zip(df_test['nine_box_category'], df_test['label']))
labels_dict_reversed = dict(zip(df_test['label'], df_test['nine_box_category']))

labels_dict

#### Question 3.4 - Confusion Matrix [Only for students enrolled in the graduate version of the class]

As in the classification [example](https://colab.research.google.com/github/google-gemini/cookbook/blob/main/examples/Classify_text_with_embeddings.ipynb#scrollTo=_mwJYXpElYJc), build and examine the confusion matrix, which characterizes the model's accuracy across the different classes.

In [None]:
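# Order the classes by label index so the display names line up with the
# rows and columns of the matrix.
class_ids = sorted(labels_dict_reversed)
class_names = [labels_dict_reversed[i] for i in class_ids]
cm = skmetrics.confusion_matrix(y_val, y_hat, labels=class_ids)
disp = skmetrics.ConfusionMatrixDisplay(confusion_matrix=cm,
                              display_labels=class_names)
disp.plot(xticks_rotation='vertical')
plt.title('Confusion matrix for employee review test dataset');
plt.grid(False)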

What does the confusion matrix reveal about the model's performance? Are there any specific types of errors you can identify?
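
One way to read a confusion matrix is to compute per-class recall, the fraction of each true class that was predicted correctly. The sketch below does this for a small made-up 3-class matrix (the counts are assumptions for illustration, not results from this dataset). Rows are true labels and columns are predicted labels, matching the convention of `skmetrics.confusion_matrix`.

```python
import numpy as np

# Hypothetical 3-class confusion matrix: rows = true class, cols = predicted.
toy_cm = np.array([
    [8, 2, 0],
    [3, 5, 2],
    [0, 4, 6],
])

# Diagonal entries are correct predictions; row sums are examples per true class.
per_class_recall = np.diag(toy_cm) / toy_cm.sum(axis=1)

# Overall accuracy is the diagonal total divided by the number of examples.
overall_accuracy = np.trace(toy_cm) / toy_cm.sum()

print(per_class_recall)
print(overall_accuracy)
```

The off-diagonal entries show which pairs of classes are confused with each other, which a single accuracy number hides.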

---Fill your answer here---

## Question 4: Data Exploration

After looking at the quantitative evaluation of the model, now we will look at qualitative analysis. This will involve exploring the data and identifying some custom categories and labels. To start, let's add a new column to the dataset with the predicted labels.

In [None]:
y_hat_train = ## YOUR CODE TO CALL THE CLASSIFIER ON TRAIN DATA HERE
y_hat_train = np.argmax(y_hat_train, axis=1)
df_train['predicted_label'] = y_hat_train

y_hat_test = ## YOUR CODE TO CALL THE CLASSIFIER ON TEST DATA HERE
y_hat_test = np.argmax(y_hat_test, axis=1)
df_test['predicted_label'] = y_hat_test

#### Question 4.1: Custom Ratings

In many cases (including our dataset), some of the annotated labels may be noisy or incorrect. To address this, we will use the classifier we have developed together with a human annotator to re-generate some labels. Please implement a short function that lets the user decide on the right label and records the user's choice as the new label. Then, use this function to extend the dataset and create a new set of labels. Your loop should display the confidence interval of model accuracy measured against the new labels.


You can label as many as you want, but you should label at least 30 examples.

In [None]:
# Function to manually label each example
pd.set_option('display.max_colwidth', None)

def manual_labeling(row):
    """
    Function to manually label data. Shows the data for review, collects the
    updated response, and then return either the original label or the updated
    label, depending on the response.
    """

    print(f"Feedback is: {row['feedback']}")
    print(f"Original Label: {labels_dict_reversed[row['label']]}")
    print(f"Predicted Label: {labels_dict_reversed[row['predicted_label']]}")

    # Ask the user for manual input on which label is correct
    correct_label = ## YOUR CODE HERE

    ## Subtract 1 to deal with 0 indexing
    return (int(correct_label) - 1)

# Apply the manual labeling function to each row and store the results
examples_to_inspect = 30

# Create new columns with default values of -1 to indicate they are
# unlabeled
df_train['manual_label'] = -1
df_test['manual_label'] = -1
for i in range(len(df_train)):

  # Apply manual labeling for df_train
  df_train.iloc[i, df_train.columns.get_loc('manual_label')] = manual_labeling(
      df_train.iloc[i])

  # Apply manual labeling for df_test
  df_test.iloc[i, df_test.columns.get_loc('manual_label')] = manual_labeling(
      df_test.iloc[i])

  # calculate the new confidence intervals width: this is the same as before,
  # but with the updated labels
  Curr_CI_onesided_width = ## YOUR CODE HERE

  # Print the current CI every 10 examples
  if i%10 == 0:
    print('=======================================================')
    print(f"=========Current i is {i}")
    print(f"Current CI: {Curr_CI_onesided_width}")
    print(f"Original CI: {initial_CI_onesided_width}")

  if i >= examples_to_inspect:
      break

# Display the updated DataFrame
print("\nUpdated DataFrame with manual labels:")
print(df_train.head())

#### Question 4.2 Reflection

How well do your labels agree with the labels in the dataset? Are there any patterns to the differences? How might a model trained on the new labels be different?

Please write 3-5 sentences.
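
As a starting point for quantifying agreement, the sketch below compares two small made-up label arrays (the values are assumptions for illustration). It follows the notebook's convention that -1 marks a row that was never manually labeled, and it measures agreement only over the rows that were.

```python
import numpy as np

# Hypothetical labels; -1 marks rows that were never manually labeled.
original = np.array([0, 3, 5, 2, 7, 1])
manual = np.array([0, 4, 5, -1, 7, -1])

# Compare only the rows that received a manual label.
labeled = manual != -1
agreement_rate = np.mean(original[labeled] == manual[labeled])

print(f"{labeled.sum()} labeled, agreement = {agreement_rate:.2f}")
```

Including the unlabeled rows would count every -1 as a disagreement and understate the agreement rate.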

---Fill your answer here---

## Question 5: Creating New Labels

So far, over the course of the assignment, we have moved to interact more and more with the actual data:

1.   First, you looked at the data documentation and information about the dataset.
2.   Second, you looked at a quantitative evaluation of a model using the existing labels.
3.   Third, you did a manual subjective evaluation of the model using your own interpretation of the labels.

Now, we'll go one step further and invent some new categories for the data. The first step is to do initial coding of the data. For this step, we will collect some free form annotations of the data.

*Hint*: If you're not sure how to go about annotating data, look at the content in Ch. 5 of Charmaz about initial coding.



### Question 5.1

First, modify the code below to collect annotations of the data and store them in the dataframe.

In [None]:
## Add new columns to hold the annotations. Initialize with an empty string

df_train['annotation'] = ''
df_test['annotation'] = ''

def annotate(row):
    print(f"Feedback is: {row['feedback']}")
    print(f"Original Label: {labels_dict_reversed[row['label']]}")
    print(f"Predicted Label: {labels_dict_reversed[row['predicted_label']]}")

    # Ask the user for an annotation for the example
    annotation = ## YOUR CODE HERE

    # Return the annotation
    return annotation

# Apply the annotation function to the first few rows and store the results
examples_to_inspect = 4

for i in range(len(df_train)):

  # Annotate an example from the training set
  df_train.iloc[i, df_train.columns.get_loc('annotation')] = annotate(
      df_train.iloc[i])

  # Annotate an example from the testing set
  df_test.iloc[i, df_test.columns.get_loc('annotation')] = annotate(
      df_test.iloc[i])

  if i >= examples_to_inspect:
      break

  # Print the current iteration index
  if i%3 == 0:
    print('=======================================================')
    print(f"=========Current i is {i}=========")

# Display the collected Annotations
for _, row in df_train.iterrows():
  if row['annotation'] != '':
    print(f"Feedback is: {row['feedback']}")
    print(f"Original Label: {labels_dict_reversed[row['label']]}")
    print(f"Predicted Label: {labels_dict_reversed[row['predicted_label']]}")
    print(f"Annotation: {row['annotation']}")
    print('\n')

### Question 5.2

Now that we have our annotations, the next step is to move from initial coding to focused coding. Review your annotations and identify 3 interesting similarities or differences that you found in the data. Next, propose 2 different ways of categorizing the data based on these similarities or differences.

For example, your annotations may have noticed that some reviews focus on interpersonal behavior (e.g., how well the employee works on a team) while others focus on work quality (e.g., do they finish their work on time and in a quality manner). In this case, you might create a category called "Review Focus" with labels: "other", "interpersonal skills", "work quality".

For each category please:

1.   Provide a 1-2 sentence summary.
2.   List the labels in that category.
3.   Provide an example of each label.
4.   Describe why the categorization might be useful.

In [None]:
## Create code that defines your categories
## Maps category name to (labels, summary)

custom_categories = {'CUSTOM CATEGORY 1': (('LABEL 1', 'LABEL 2'),
                                           "CATEGORY 1 SUMMARY"),
                     'CUSTOM CATEGORY 2': (('LABEL 1', 'LABEL 2'),
                                           "CATEGORY 2 SUMMARY")}

## Create a dictionary that maps (Category, Label) -> Example
example_dict = {('CATEGORY', 'LABEL'): 'EXAMPLE FROM ANNOTATIONS'}


for category in custom_categories:
  print(f"Category is: {category}; {custom_categories[category][1]}")
  print(f"Labels are: {list(enumerate(custom_categories[category][0]))}")

#### In 3-5 sentences, explain why each category might be useful

---Fill your answer here---

Now we can add a new column to the dataset for each category. We will use -1 to denote unlabeled entries and the order of the labels in the dictionaries above to map numbers to labels.

In [None]:
for category in custom_categories:
  df_train[category] = -1
  df_test[category] = -1

def custom_label(row, category, labels, summary):

  print(f"Feedback is: {row['feedback']}")
  print(f"Original Label: {labels_dict_reversed[row['label']]}")
  print(f"summary for category is: {summary}")

  # Ask the user for a label
  label = ## YOUR CODE HERE

  # Return the label
  return labels[int(label)]

# Apply the function to your dataset
examples_to_inspect = 20

def label_category(category, labels):

  print(f"Category is: {category}; {custom_categories[category][1]}")
  print(f"Labels are: {list(enumerate(labels))}")
  for label in labels:
    print(f"Example for {label} is: {example_dict[category, label]}")

  print('\n')
  print('=======================================================')
  print('\n')

  for i in range(len(df_train)):

    # Annotate an example from the training set
    df_train.iloc[i, df_train.columns.get_loc(category)]= custom_label(
        df_train.iloc[i],category,labels,custom_categories[category][1])

    # Annotate an example from the testing set
    df_test.iloc[i, df_test.columns.get_loc(category)]= custom_label(
        df_test.iloc[i],category,labels,custom_categories[category][1])

    if i >= examples_to_inspect:
        break

    # Print the current iteration index
    if i%3 == 0:
      print('=======================================================')
      print(f"=========Current i is {i}=========")

# Label each category
for category in custom_categories:
  print('=======================================================')
  label_category(category, labels = custom_categories[category][0])
  print('=======================================================')

Print the head of df_train to see the additional columns.

In [None]:
print(df_train.head())

## Question 6: Building a Model for Your Labels

Now that we have some labels, we will experiment with some methods for predicting your custom labels. One way to approach this would be to use the same method we used above. If you'd like, you can implement that method and evaluate the new model quantitatively. However, you will likely find that performance is quite poor, because we tend to need more labels than is educationally productive to assign in a PSET!

An alternative is to turn our classification problem into a text completion task. In practice, this involves phrasing the classification task in natural language (i.e., English for our purposes here). For instance, if we wanted to classify whether the sentence "I'm really disappointed that summer is over." is happy or sad, we could ask a language model to complete the following prompt:


```
"""
Is the following sentence happy? Answer with "happy", "sad", or "neither". "I'm really disappointed that summer is over."
"""
```

While it isn't guaranteed that this will work, in practice these types of methods can work well with 0 additional data, as long as they are evaluated properly. While this might seem like magic (and it is amazing), it's always a good idea to be skeptical of what seems like a free lunch. There are some important caveats that we need to consider:

*   The way you phrase your question directly determines the categories the model will recognize. Everything else will be influenced by the model's inherent biases. This isn't necessarily bad (for example, the model's bias towards generating coherent responses is crucial), but it means the model will rely on its data-driven interpretation of your prompt, which may or may not align with your intended meaning.

*  Small changes to prompts can lead to large changes in behavior. [For example](https://arxiv.org/abs/2309.03409), including the phrase "Take a deep breath and work on this problem step-by-step" or offering to "pay" the model for better answers has improved performance in experiments. The process of iterating on these *prompts* is known as *prompt engineering*. It's more of an iterative design exercise than a science, but [best practices](https://llama.meta.com/docs/how-to-guides/prompting/) are starting to emerge.

#### Optional Reading
While the idea of reducing one problem to another that you already know how to solve is one of the fundamentals in computer science, the application of this to large language models doing text completion tasks was introduced in "[Language Models are Unsupervised Multitask Learners.](https://paperswithcode.com/paper/language-models-are-unsupervised-multitask)"

### Question 6.1

Write out 3 potential prompts for each custom category you created. Explain your reasoning for each prompt in 1-3 sentences. Each prompt should be phrased as a question that the model can complete and should aim to classify between the labels you created.

For example, if you had a category called "Work Quality" with labels "bad", "moderately bad", "moderately good" and "good worker", an example of a prompt could be: "Please classify the work quality of this employee according to the following list: bad, moderately bad, moderately good or good worker: {}",
where the {} is replaced by the text you want to classify, which in our case is the feedback field.
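
The placeholder substitution can be sketched with Python's built-in `str.format`. The template text follows the example above, while the names `prompt_template` and `toy_feedback` and the feedback sentence are made up for illustration; this is one possible way to combine a prompt with a review, not the only one.

```python
# Hypothetical prompt template and feedback text, for illustration only.
prompt_template = (
    "Please classify the work quality of this employee according to the "
    "following list: bad, moderately bad, moderately good or good worker: {}"
)
toy_feedback = "Alex consistently delivers careful work ahead of schedule."

# str.format substitutes the feedback for the {} placeholder.
combined = prompt_template.format(toy_feedback)
print(combined)
```

Calling `format` on the template rather than on the feedback means that any curly braces inside a review are left untouched.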

In [None]:
prompts = {'CUSTOM CATEGORY 1': ['YOUR PROMPT HERE',
                                 ## WHY MIGHT THIS PROMPT WORK?
                                 'YOUR PROMPT HERE',
                                 ## WHY MIGHT THIS PROMPT WORK?
                                 'YOUR PROMPT HERE',
                                 ## WHY MIGHT THIS PROMPT WORK?
                                 ],
           'CUSTOM CATEGORY 2': ['YOUR PROMPT HERE',
                                 ## WHY MIGHT THIS PROMPT WORK?
                                 'YOUR PROMPT HERE',
                                 ## WHY MIGHT THIS PROMPT WORK?
                                 'YOUR PROMPT HERE',
                                 ## WHY MIGHT THIS PROMPT WORK?
                                 ],}



Now that we have our prompts, we can evaluate them! We'll use the DistilBART language model, applying the designed prompts to the feedback field from the employees.

For example, consider an interaction with the model. Suppose the prompt you designed is: 'Does the following employee have good potential? Answer with "no", "mild potential", or "extraordinary potential",' and the feedback field is: 'John's performance was poor in the last quarter, although he graduated first in his class at MIT and scored an A+ in the AI, Decision-Making, and Society course.' In this case, we expect the model to return 'extraordinary potential'.

In [None]:
from transformers import pipeline
import torch
device = 0 if torch.cuda.is_available() else -1

# To accomplish this task, we will use the DistilBART model.
# Use a separate name so the Keras `classifier` from Question 3 is not
# overwritten.
zero_shot_classifier = pipeline("zero-shot-classification",
                                model="valhalla/distilbart-mnli-12-1",
                                device=device)

# Function to classify feedback based on prompt
def classify_feedback(feedback, prompt, labels):
    # Combine prompt with feedback
    combined_input = # --- fill your code here --- #

    # Perform zero-shot classification
    classification_result = zero_shot_classifier(combined_input,
                                                 candidate_labels=list(labels))

    # Return the label with the highest score
    return classification_result['labels'][0]

# Apply each prompt to the first few feedback entries and store the results
number_to_classify = 5
category_names = list(custom_categories.keys())

for c, category in enumerate(category_names):
  labels = custom_categories[category][0]
  for p, prompt in enumerate(prompts[category]):
    df_train[f'category_{c}_prompt_{p}'] = \
      df_train.head(number_to_classify)['feedback'].apply(
        lambda feedback: classify_feedback(feedback, prompt, labels))

# Print the manual labels next to the predictions for each prompt
for c, category in enumerate(category_names):
  for p in range(len(prompts[category])):
    print(df_train[['feedback', category,
                    f'category_{c}_prompt_{p}']].head(number_to_classify))

### Question 6.2 - Few-Shot Prompting [Only for students enrolled in the graduate version of the class]

We now aim to perform few-shot prompting, where we use our zero-shot classifier (from the previous question) to create prompts that guide the model to classify unseen labels. Specifically, the model will use a few labeled examples in the prompt to enhance its understanding of new labels. Additionally, we would like to implement a function that re-generates prompts based on the predictions made by the model on the previous prompts, incorporating them into future examples to improve the prompting mechanism iteratively.

Given the previous prompts and the model's predictions, check the response of the model on some new unseen prompts and labels. You can use the previous prompts and labels to generate the new prompts and labels, thus "guiding" the model towards the right answer by giving it the previously generated information.
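
One common way to structure a few-shot prompt is to place a handful of labeled demonstrations between the instruction and the new input. The sketch below only builds the prompt string. The helper `build_few_shot_prompt`, the `Feedback:`/`Label:` layout, and the example reviews are all hypothetical choices made for illustration, and how well this format works with a given model is something you would need to check empirically.

```python
# Hypothetical labeled examples (feedback, label) used as demonstrations.
toy_examples = [
    ("Always helps teammates and shares credit.", "interpersonal skills"),
    ("Reports are accurate and delivered on time.", "work quality"),
]

def build_few_shot_prompt(instruction, examples, new_feedback):
    """Assemble an instruction, labeled demonstrations, and the new input."""
    parts = [instruction]
    for feedback, label in examples:
        parts.append(f"Feedback: {feedback}\nLabel: {label}")
    # Leave the final label blank for the model to fill in.
    parts.append(f"Feedback: {new_feedback}\nLabel:")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt(
    "Classify the focus of each employee review.",
    toy_examples,
    "Communicates clearly with clients.",
)
print(prompt)
```

Demonstrations drawn from examples the model previously got wrong, paired with the labels you assigned, are one way to feed earlier predictions back into later prompts.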

In [None]:
#--- Fill your code here ---#

## Question 7: Reflections on the overall process

1. In your own words, describe the steps we used to build and evaluate models for employee review. What additional steps would you recommend for a company thinking of deploying such a model?

2. How did your re-labeling change the performance? What does this indicate about the "quality" of the initial labels?

3. If a company were to deploy such a model for employment review purposes, what are some pros and cons of using your custom categorization of the data compared to the original labels?

4. What is the key idea behind using a language model for "zero-shot classification"? Did certain prompts work better than others, and why do you think that was the case?

---Fill your answers here---

### For graduate students

5. How did few-shot prompting change the performance? What strategies did you find useful for choosing the few-shot examples?  

---Fill your answer here---