# Interpreting Classifier Weights

In this experiment, you will train models to distinguish examples of two different genres of Shakespeare's plays: comedies and tragedies. (We'll ignore the histories, sonnets, etc.) Since he died four hundred years ago, Shakespeare has not written any more plays, although scraps of various other works have come to light. We are not, therefore, interested in building models simply to help categorize an unbounded stream of future documents, as we might be in other applications of text classification; rather, we are interested in what a classifier might have to tell us about what we mean by the terms “comedy” and “tragedy”.

You will start by copying and running your `createBasicFeatures` function from the experiment with movie reviews. Do the features the classifier focuses on tell you much about comedy and tragedy in general?

You will then implement another featurization function `createInterestingFeatures`, which will focus on only those features you think are informative for distinguishing between comedy and tragedy. Accuracy on leave-one-out cross-validation may go up, but it is more important to look at the features given the highest weight by the classifier. Interpretability in machine learning, of course, may be harder to define than accuracy, although accuracy at some tasks is hard enough.

In [7]:
import json
import requests
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate,LeaveOneOut
import numpy as np

In [10]:
#read in the shakespeare corpus
def readShakespeare():
  raw = requests.get("https://raw.githubusercontent.com/dasmiq/cs6120-assignment2/refs/heads/main/shakespeare_plays.json").text.strip()
  corpus = [json.loads(line) for line in raw.split("\n")]

  #remove histories from the data, as we're only working with tragedies and comedies
  corpus = [entry for entry in corpus if entry["genre"] != "history"]
  return corpus
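
The loader above treats the file as JSON Lines: one JSON object per line, each with at least a `genre` and a `text` field. As a rough sketch of the same parse-and-filter step without a network call, the snippet below runs it on an inline sample. The three records, the `title` field, and the helper name `parse_plays` are made up for illustration and are not part of the assignment.

```python
import json
from collections import Counter

# Hypothetical stand-in for the downloaded file: one JSON object per line.
sample_raw = "\n".join([
    json.dumps({"title": "Play A", "genre": "comedy", "text": "love and marriage"}),
    json.dumps({"title": "Play B", "genre": "history", "text": "the king and the crown"}),
    json.dumps({"title": "Play C", "genre": "tragedy", "text": "blood and revenge"}),
])

def parse_plays(raw):
    """Parse JSON Lines text and drop histories, mirroring readShakespeare."""
    records = [json.loads(line) for line in raw.strip().split("\n")]
    return [r for r in records if r["genre"] != "history"]

sample_corpus = parse_plays(sample_raw)
genre_counts = Counter(r["genre"] for r in sample_corpus)
print(genre_counts)
```

Running the same `Counter` over the result of `readShakespeare()` is a quick way to check that only two labels remain before any features are built.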

This is where you will implement two functions to featurize the data:

In [17]:
# TODO: Implement createBasicFeatures
# NB: The current contents are for testing only
# This function should return:
#  -a sparse matrix (scipy.sparse) of document features
#  -a list of the correct genre for each document
#  -a list of the vocabulary used by the features, such that the ith term of the
#    list is the word whose counts appear in the ith column of the matrix.

# This function should create a feature representation using all tokens that
# contain an alphabetic character.
import re
from collections import Counter
from scipy.sparse import csr_matrix

def createBasicFeatures(corpus):
    # Extract genre labels
    genres = [doc['genre'] for doc in corpus]
    
    # Extract tokens and build vocabulary
    vocab_counter = Counter()
    all_tokens = []
    
    for doc in corpus:
        tokens = [token.lower() for token in re.findall(r'\S+', doc['text']) 
                  if re.search(r'[a-zA-Z]', token)]
        all_tokens.append(tokens)
        vocab_counter.update(tokens)
    
    # Create vocabulary
    vocab = sorted(vocab_counter.keys())
    vocab_to_idx = {word: idx for idx, word in enumerate(vocab)}
    
    # Build sparse matrix
    rows, cols, data = [], [], []

    for doc_idx, tokens in enumerate(all_tokens):
        token_counts = Counter(tokens)
        for token, count in token_counts.items():
            rows.append(doc_idx)
            cols.append(vocab_to_idx[token])
            data.append(count)
    
    texts = csr_matrix((data, (rows, cols)), 
                       shape=(len(corpus), len(vocab)))
    
    return texts, genres, vocab
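
To see what the tokenization rule in `createBasicFeatures` keeps and drops, here is a small sketch that applies the same two regular expressions to a made-up string. The helper name `basic_tokens` and the input are hypothetical. Note that a token with punctuation attached (`love,`) becomes a different feature from the bare word (`love`), so the rule works best on text whose punctuation is already split off by whitespace.

```python
import re

def basic_tokens(text):
    """Split on whitespace, keep tokens with an ASCII letter, and lowercase them."""
    return [tok.lower() for tok in re.findall(r"\S+", text)
            if re.search(r"[a-zA-Z]", tok)]

# Hypothetical input, not taken from the corpus.
tokens = basic_tokens("O , my Lord ! 'tis 1603 ; Love, love")
print(tokens)
# → ['o', 'my', 'lord', "'tis", 'love,', 'love']
```

Pure punctuation and the number `1603` are dropped because they contain no letter.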


In [24]:
# TODO: Implement createInterestingFeatures. Describe your features and what
# they might tell you about the difference between comedy and tragedy.
# This function can add other features you want that help classification
# accuracy, such as bigrams, word prefixes and suffixes, etc.
def createInterestingFeatures(corpus):
    """
    Features focused on literary themes rather than character names.
    - Theme words: death, love, marriage, blood, honor, revenge
    - Emotional words: joy, sorrow, fear, laugh
    - Bigrams in which at least one word is a theme word
    - Common pronouns and function words (i, you, my, what, not, etc.)
    """
    # Extract genre labels
    genres = [doc['genre'] for doc in corpus]
    
    # Define theme words for comedy and tragedy
    theme_words = {
        'death', 'die', 'blood', 'kill', 'murder', 'revenge', 'honour', 'honor',
        'love', 'marry', 'marriage', 'wedding', 'joy', 'laugh', 'merry', 'happy',
        'sorrow', 'weep', 'cry', 'tears', 'fear', 'hate', 'war', 'peace'
    }
    
    # Extract tokens and build features
    vocab_counter = Counter()
    all_features = []
    
    # General common words (not character names), defined once outside the loop
    common_words = {'i', 'you', 'my', 'thy', 'me', 'him', 'her', 'his', 'our',
                    'what', 'how', 'why', 'not', 'no', 'yes', 'good', 'bad'}

    for doc in corpus:
        tokens = [token.lower() for token in re.findall(r'\S+', doc['text'])
                  if re.search(r'[a-zA-Z]', token)]

        # Keep only unigrams that are theme words or common words
        # (the two sets are disjoint, so no token is counted twice)
        features = [token for token in tokens
                    if token in theme_words or token in common_words]

        # Add bigrams in which at least one token is a theme word
        for i in range(len(tokens) - 1):
            if tokens[i] in theme_words or tokens[i+1] in theme_words:
                features.append(f"{tokens[i]}_{tokens[i+1]}")

        all_features.append(features)
        vocab_counter.update(features)
    
    # Create vocabulary
    vocab = sorted(vocab_counter.keys())
    vocab_to_idx = {word: idx for idx, word in enumerate(vocab)}
    
    # Build sparse matrix
    rows, cols, data = [], [], []
    for doc_idx, features in enumerate(all_features):
        feature_counts = Counter(features)
        for feature, count in feature_counts.items():
            rows.append(doc_idx)
            cols.append(vocab_to_idx[feature])
            data.append(count)
    
    texts = csr_matrix((data, (rows, cols)), shape=(len(corpus), len(vocab)))
    return texts, genres, vocab

In [25]:
#given a numpy matrix representation of the features for the training set, the
# vector of true classes for each example, and the vocabulary as described
# above, this computes the accuracy of the model using leave one out cross
# validation and reports the most indicative features for each class
def evaluateModel(X,y,vocab,penalty="l1"):
  #create and fit the model
  model = LogisticRegression(penalty=penalty,solver="liblinear")
  results = cross_validate(model,X,y,cv=LeaveOneOut())

  #determine the average accuracy
  scores = results["test_score"]
  avg_score = sum(scores)/len(scores)

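  #determine the most informative features
  # this requires us to fit the model to everything, because we need a
  # single model to draw coefficients from, rather than 26
  model.fit(X,y)
  # coef_ points toward model.classes_[1]. scikit-learn sorts the labels, so
  # assuming they are the strings "comedy" and "tragedy", the most negative
  # weights indicate comedy and the most positive indicate tragedy.
  # NB: in the names below, "pos" therefore means model.classes_[0] (comedy)
  # and "neg" means model.classes_[1] (tragedy).
  neg_class_prob_sorted = model.coef_[0, :].argsort()
  pos_class_prob_sorted = (-model.coef_[0, :]).argsort()

  termsToTake = 20
  pos_indicators = [vocab[i] for i in neg_class_prob_sorted[:termsToTake]]
  neg_indicators = [vocab[i] for i in pos_class_prob_sorted[:termsToTake]]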

  return avg_score,pos_indicators,neg_indicators

def runEvaluation(X,y,vocab):
  print("----------L1 Norm-----------")
  avg_score,pos_indicators,neg_indicators = evaluateModel(X,y,vocab,"l1")
  print("The model's average accuracy is %f"%avg_score)
  print("The most informative terms for pos are: %s"%pos_indicators)
  print("The most informative terms for neg are: %s"%neg_indicators)
  #this call will fit a model with L2 regularization
  print("----------L2 Norm-----------")
  avg_score,pos_indicators,neg_indicators = evaluateModel(X,y,vocab,"l2")
  print("The model's average accuracy is %f"%avg_score)
  print("The most informative terms for pos are: %s"%pos_indicators)
  print("The most informative terms for neg are: %s"%neg_indicators)
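
When reading the "pos" and "neg" lists, it helps to know which class a coefficient's sign points to. In a binary scikit-learn `LogisticRegression`, `coef_[0]` is oriented toward `classes_[1]`, and `classes_` is sorted. The sketch below checks this on toy data. The two-word vocabulary and the counts are made up, and the label strings `"comedy"` and `"tragedy"` are an assumption about the corpus.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy counts, made up for illustration: column 0 is frequent in the comedies,
# column 1 in the tragedies.
toy_vocab = ["marry", "murder"]
X_toy = np.array([[3, 0], [2, 0], [0, 3], [0, 2]])
y_toy = ["comedy", "comedy", "tragedy", "tragedy"]

toy_model = LogisticRegression(solver="liblinear").fit(X_toy, y_toy)

# classes_ is sorted, so classes_[1] ("tragedy") is the class that positive
# coefficients push toward.
print(toy_model.classes_)
for term, weight in zip(toy_vocab, toy_model.coef_[0]):
    leaning = toy_model.classes_[1] if weight > 0 else toy_model.classes_[0]
    print(f"{term}: {weight:+.3f} -> {leaning}")
```

Under this convention, the terms with the most negative weights are the comedy indicators and those with the most positive weights are the tragedy indicators.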


In [26]:
corpus = readShakespeare()

Run the following to train and evaluate two models with basic features:

In [27]:
X,y,vocab = createBasicFeatures(corpus)
runEvaluation(X, y, vocab)

----------L1 Norm-----------
The model's average accuracy is 0.615385
The most informative terms for pos are: ['you', 'duke', 'helena', 'i', 'prospero', 'sir', 'leontes', 'a', 'private', 'preserving', 'preservers', 'preserver', 'preserved', 'preserve', 'preservative', 'president', 'preservation', 'presents', 'presentment', 'presently']
The most informative terms for neg are: ['him', 's', 'iago', 'imogen', 'brutus', 'lear', 'o', 'and', 'rom', 'ham', 'the', 'preserving', 'preservers', 'preserver', 'pretense', 'preserved', 'preserve', 'pretend', 'president', 'pretext']
----------L2 Norm-----------
The model's average accuracy is 0.769231
The most informative terms for pos are: ['i', 'you', 'duke', 'prospero', 'a', 'helena', 'your', 'antonio', 'sir', 'leontes', 'hermia', 'for', 'lysander', 'ariel', 'sebastian', 'demetrius', 'camillo', 'stephano', 'me', 'parolles']
The most informative terms for neg are: ['iago', 'othello', 's', 'him', 'imogen', 'what', 'lear', 'brutus', 'his', 'cassio', 'o
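
A plausible reading of the L1 lists above is that only a handful of coefficients are nonzero. Once those run out, `argsort` fills the remaining slots with zero-weight columns in an arbitrary order, which would explain alphabetical runs such as 'preserving', 'preservers', 'preserver'. The sketch below shows one way to drop zero weights before ranking. The helper name `top_nonzero_terms` and the weights are hypothetical.

```python
def top_nonzero_terms(weights, vocab, k=20):
    """Rank terms by weight, skipping features whose weight is exactly zero.

    Returns (most_negative_terms, most_positive_terms), each of length <= k.
    """
    nonzero = [(w, term) for w, term in zip(weights, vocab) if w != 0]
    ranked = sorted(nonzero)  # most negative weight first
    most_negative = [term for w, term in ranked if w < 0][:k]
    most_positive = [term for w, term in reversed(ranked) if w > 0][:k]
    return most_negative, most_positive

# Made-up weights for illustration only.
toy_weights = [-1.2, 0.0, 0.0, 0.8, 0.0, -0.3]
toy_vocab = ["you", "preserve", "preserved", "him", "president", "duke"]
result = top_nonzero_terms(toy_weights, toy_vocab)
print(result)
# → (['you', 'duke'], ['him'])
```

With a fitted model this could be called as `top_nonzero_terms(model.coef_[0].tolist(), vocab)`, and the returned lists may then be shorter than 20 terms.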

Run the following to train and evaluate two models with features that are interesting for distinguishing comedy and tragedy:

In [28]:
X,y,vocab = createInterestingFeatures(corpus)
runEvaluation(X, y, vocab)

----------L1 Norm-----------
The model's average accuracy is 0.846154
The most informative terms for pos are: ['honour', 'you', 'i', 'me', 'no', 'thy', 'my', 'neck_happy', 'nearer_death', 'near_death', 'nay_weep', 'nature_honour', 'native_blood', 'name_revenge', 'n_marriage', 'my_wedding', 'my_tears', 'my_sorrow', 'native_honour', 'nectar_death']
The most informative terms for neg are: ['what', 'him', 'not', 'our', 'her', 'nectar_death', 'neck_happy', 'nearer_death', 'near_death', 'a_cry', 'nature_honour', 'native_honour', 'native_blood', 'name_revenge', 'n_marriage', 'my_wedding', 'nay_weep', 'ned_death', 'needs_marry', 'my_tears']
----------L2 Norm-----------
The model's average accuracy is 0.846154
The most informative terms for pos are: ['honour', 'you', 'i', 'your_honour', 'me', 'no', 'love', 'marry', 'mine_honour', 'die', 'marriage', 'good', 'merry', 'die_to', 'must_die', 'to_death', 'you_love', 'love_is', 'and_blood', 'honour_s']
The most informative terms for neg are: ['him', '

**TODO**: Based on the most informative features in the output of the classifier evaluation, what do these classifiers tell you about the differences between comedy and tragedy?

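The basic features model reaches 61.5% leave-one-out accuracy with L1 and 76.9% with L2, but it relies mostly on character names. In the output, the "pos" lists line up with comedies and the "neg" lists with tragedies: the top features include "helena", "prospero", and "leontes" on the comedy side and "iago", "othello", "lear", and "brutus" on the tragedy side, mixed with generic words such as "i", "you", and "him". The model is therefore largely memorizing which characters belong to which plays instead of learning real differences between comedy and tragedy. The alphabetical runs in the L1 lists ("preserving", "preservers", "preserver", and so on) are most likely zero-weight features that fill the list once the few nonzero coefficients run out, so they carry no information. The model works somewhat, but it tells us little that is meaningful about the genres.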

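The interesting features model performs better, at 84.6% accuracy for both L1 and L2 (up from 76.9% for the best basic model), and its top features are easier to read as literary patterns. Under L2, the comedy list includes "marry" and "marriage", which makes sense since comedies often end with weddings, along with positive words like "love", "merry", and "good". Personal words like "i", "you", and "me" also lean toward comedy, which suggests a focus on direct address and personal relationships. The picture is not entirely cheerful, though: "die", "must_die", "to_death", and "and_blood" also appear in the L2 comedy list, so death is talked about in the comedies as well.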

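Tragedies show a different pattern. In the L1 output, the features with nonzero weight on the tragedy side are "what", "him", "not", "our", and "her": a question word, a negation, and pronouns that refer to others rather than to the speaker, which suggests more questioning, conflict, and talk about absent people. Theme words about violence ("kill", "blood", "revenge", "war") and grief ("weep", "cry", "tears") would be expected to lean toward tragedy, but the tragedy lists shown above do not confirm this, and the L2 list is cut off after its first term. The features are thus consistent with the familiar view that Shakespeare's comedies focus on love and marriage, while the evidence that his tragedies are marked by words of death and suffering is weaker than one might expect.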
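
Because leave-one-out accuracy is just the fraction of plays classified correctly, the accuracies quoted above can be turned back into counts. The sketch below does this arithmetic, assuming the corpus has 26 plays, a number taken from the "rather than 26" comment in `evaluateModel`. The dictionary keys are labels chosen here for readability.

```python
N_PLAYS = 26  # assumption: from the "rather than 26" comment in evaluateModel

# Accuracies as printed in the evaluation output.
reported = {
    "basic, L1": 0.615385,
    "basic, L2": 0.769231,
    "interesting, L1": 0.846154,
    "interesting, L2": 0.846154,
}

# Each held-out play is either right or wrong, so accuracy * N is a whole number.
correct_counts = {name: round(acc * N_PLAYS) for name, acc in reported.items()}
for name, n_correct in correct_counts.items():
    print(f"{name}: {n_correct}/{N_PLAYS} plays correct")
```

Under that assumption, the gap between 76.9% and 84.6% amounts to two plays, so differences in accuracy on a corpus this small should be read with caution.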