# Identifying Deception in Diplomacy

In this workshop, you will use scikit-learn to identify deception in the board game Diplomacy.

First, form a pair to do these tasks. You should have just one computer and one person typing at a time.

Next, please reset this lesson to the scaffold, to ensure you have the latest version of the data.

**You are welcome to download and run this notebook somewhere else** (e.g., on your local computer). If you do, you will need to have scikit-learn installed.

We will be using the data from this repository: https://github.com/DenisPeskoff/2020_acl_diplomacy. If you are curious about where this comes from, see the research paper: https://aclanthology.org/2020.acl-main.353/

The data contains JSON lines like this (pretty-printed here; in the files, each record occupies a single line):

```python
{
    "messages": [
        "Germany!\n\nJust the person I want to speak with. I have a somewhat crazy idea that I\u2019ve always wanted to try with I/G, but I\u2019ve never actually convinced the other guy to try it. And, what\u2019s worse, it might make you suspicious of me. \n\nSo...do I suggest it?\n\nI\u2019m thinking that this is a low stakes game, not a tournament or anything, and an interesting and unusual move set might make it more fun? That\u2019s my hope anyway.\n\nWhat is your appetite like for unusual and crazy?",
        "You've whet my appetite, Italy. What's the suggestion?",
        "\ud83d\udc4d",
        "It seems like there are a lot of ways that could go wrong...I don't see why France would see you approaching/taking Munich--while I do nothing about it--and not immediately feel skittish",
        "Yeah, I can\u2019t say I\u2019ve tried it and it works, cause I\u2019ve never tried it or seen it. But how I think it would work is (a) my Spring move looks like an attack on Austria, so it would not be surprising if you did not cover Munich. Then (b) you build two armies, which looks like we\u2019re really at war and you\u2019re going to eject me. Then we launch the attack in Spring. So there is really no part of this that would raise alarm bells with France.\n\nAll that said, I\u2019ve literally never done it before, and it does involve risk for you, so I\u2019m not offended or concerned if it\u2019s just not for you. I\u2019m happy to play more conventionally too. Up to you.",
        "I am just sensing that you don\u2019t like this idea, so shall we talk about something else? That was just a crazy idea I\u2019ve always wanted to try. I\u2019m happy to play more conservatively.",
        "Any thoughts?",
        "Sorry Italy I've been away doing, um, German things. Brewing Lagers?",
        "I don't think I'm ready to go for that idea, however I'd be down for some good ol'-fashioned Austria-kicking?",
        "I am pretty conflicted about whether to guess that you were telling the truth or lying about the \u201cbrewing lagers\u201d thing. I am going to take it literally and say \ud83d\udc4e even though I don\u2019t think you meant it deceptively. \ud83d\ude09"
    ],
    "sender_labels": [ true, true, true, true, true, true, true, true, true, true ],
    
    # The rest of these fields are not needed in the first task
    "receiver_labels": [ true, true, true, true, "NOANNOTATION", "NOANNOTATION", "NOANNOTATION", true, false, true ],
    "speakers": [ "italy", "germany", "italy", "germany", "italy", "italy", "italy", "germany", "germany", "italy" ],
    "receivers": [ "germany", "italy", "germany", "italy", "germany", "germany", "germany", "italy", "italy", "germany" ],
    "absolute_message_index": [ 74, 76, 86, 87, 89, 92, 97, 117, 119, 121, ],
    "relative_message_index": [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ],
    "seasons": [ "Spring", "Spring", "Spring", "Spring", "Spring", "Spring", "Spring", "Spring", "Spring", "Spring", ],
    "years": [ "1901", "1901", "1901", "1901", "1901", "1901", "1901", "1901", "1901", "1901" ],
    "game_score": [ "3", "3", "3", "3", "3", "3", "3", "3", "3", "3" ],
    "game_score_delta": [ "0", "0", "0", "0", "0", "0", "0", "0", "0", "0" ],
    "players": [ "italy", "germany" ],
    "game_id": 1
}
```

For us, the two crucial fields are:

- `messages`: the text being sent back and forth between players
- `sender_labels`: whether each message is truthful (`true`) or not (`false`)
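
As a minimal sketch of how these two fields line up, the snippet below parses one JSON line and pairs each message with its label. The `line` string is a shortened, hand-made stand-in (two of the messages shown above with their labels), not an actual line from the data files.

```python
import json

# Shortened stand-in for one line of a .jsonl file (assumption: real lines
# hold many more messages and fields, as in the example above).
line = '{"messages": ["You\'ve whet my appetite, Italy. What\'s the suggestion?", "Any thoughts?"], "sender_labels": [true, true]}'

record = json.loads(line)  # JSON true/false become Python True/False

# messages[i] and sender_labels[i] describe the same message
pairs = list(zip(record["messages"], record["sender_labels"]))
for message, is_truthful in pairs:
    print(int(is_truthful), message)
```

Converting each label with `int(...)` gives the `0`/`1` encoding used in the reports below, where `0` is a lie and `1` is the truth.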

## Task 1: Basic Classifier
Implement a logistic regression classifier that predicts whether a message is truthful. We use logistic regression because it is widely used and a strong baseline for many NLP classification tasks. Use the following configuration:

- Solver: `saga`
- Penalty: `None`
- All other parameters as defaults

The steps you will need to implement are:

1. **Data Loading**: Start by loading your training, development, and test datasets to prepare the features (messages) and labels (sender_labels).
2. **Data Preprocessing**: Convert the text messages into a suitable format for modelling, using the same approach as in the tutorial above.
3. **Model Definition**: Use scikit-learn's `LogisticRegression` class to create the model, with `solver='saga'` and `penalty=None` (the Python `None` object, not the string `'None'`).
4. **Model Training**: Train your model using the training data.
5. **Model Evaluation**: Assess the model's performance on the development dataset.

A few notes that may be helpful:

- In the notebook above, `twenty_train.data` is a list of strings. You can create something comparable yourself for the Diplomacy data.
- In the notebook above, `twenty_train.target` is a NumPy array. You can create a regular Python list and then convert it into an array with `numpy.array(my_list)`.

You should get results that look something like this (where `0` indicates a lie and `1` indicates the truth):

```python
              precision    recall  f1-score   support

           0       0.07      0.04      0.05       124
           1       0.96      0.97      0.96      2658

    accuracy                           0.93      2782
   macro avg       0.51      0.51      0.51      2782
weighted avg       0.92      0.93      0.92      2782
```

The number we care about here is the `f1-score` for class `0`, i.e., how well we identify lies. This simple model scores `0.05`, which is not very good: the task is hard!

Note also that `accuracy` looks high (`0.93` above). That is because it is very easy to get most of the development examples correct simply by predicting that they are not lies.
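
To see why accuracy is misleading here, consider a hypothetical "classifier" that labels every message as truthful. The sketch below uses only the class counts from the `support` column of the report above; it is an illustration of the class imbalance, not part of the task.

```python
# Class counts taken from the `support` column of the report above.
n_lies, n_truths = 124, 2658
total = n_lies + n_truths

# Always predicting "truth" (1) gets every truthful message right
# and every lie wrong.
baseline_accuracy = n_truths / total
baseline_lie_recall = 0 / n_lies  # it never predicts a lie

print(f"accuracy:   {baseline_accuracy:.3f}")
print(f"lie recall: {baseline_lie_recall:.3f}")
```

This trivial baseline reaches an accuracy of roughly 0.955, higher than the model's 0.93, while finding no lies at all. That is why the `f1-score` for class `0` is the more informative number.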

In [None]:
import json
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn import metrics


def readfile(filename):
    examples_msg = []
    examples_label = []
    # A .jsonl file holds one JSON object per line, so it can be read line by line
    with open(filename, encoding="utf-8") as f:
        for line in f:
            # Strip surrounding whitespace (e.g., the newline), then parse the line into a Python dict
            data = json.loads(line.strip())
            for msg, tlabel in zip(data['messages'], data['sender_labels']):
                examples_msg.append(msg)
                examples_label.append(1 if tlabel else 0)
    return examples_msg, np.array(examples_label)


train = readfile("mod-train.jsonl")
valid = readfile("mod-validation.jsonl")

text_clf = Pipeline([
    ('vect', CountVectorizer()),
    # Convert the word-count matrix into TF-IDF weights, which emphasize more
    # informative words and reduce the influence of frequent but uninformative ones
    ('tfidf', TfidfTransformer()),
    ('clf', LogisticRegression(penalty=None, solver='saga', max_iter=100, random_state=42)),
])
text_clf.fit(train[0], train[1])

predicted = text_clf.predict(valid[0])

# `predicted == valid[1]` compares element-wise and returns a boolean array that marks
# where the prediction matches the true label. np.mean treats True as 1 and False as 0,
# so the mean of that array is the accuracy, a value between 0 and 1.
print(np.mean(predicted == valid[1]))

print(metrics.classification_report(valid[1], predicted))
print(metrics.confusion_matrix(valid[1], predicted))

0.9317038102084831
              precision    recall  f1-score   support

           0       0.07      0.04      0.05       124
           1       0.96      0.97      0.96      2658

    accuracy                           0.93      2782
   macro avg       0.51      0.51      0.51      2782
weighted avg       0.92      0.93      0.92      2782

[[   5  119]
 [  71 2587]]
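
The class `0` row of the report can be recovered by hand from the confusion matrix, which helps when checking what precision and recall mean here. A minimal sketch using the numbers printed above (in scikit-learn's confusion matrix, rows are true labels and columns are predicted labels):

```python
# Confusion matrix from the output above: rows are true labels, columns are
# predicted labels, in the order [0 (lie), 1 (truth)].
cm = [[5, 119],
      [71, 2587]]

true_lie_pred_lie = cm[0][0]    # lies correctly flagged
true_lie_pred_truth = cm[0][1]  # lies the model missed
true_truth_pred_lie = cm[1][0]  # truthful messages wrongly flagged

precision = true_lie_pred_lie / (true_lie_pred_lie + true_truth_pred_lie)
recall = true_lie_pred_lie / (true_lie_pred_lie + true_lie_pred_truth)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (cm[0][0] + cm[1][1]) / sum(sum(row) for row in cm)

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} accuracy={accuracy:.4f}")
```

Only 5 of the 124 lies are caught, and 71 truthful messages are flagged as lies along the way, which is where the `0.05` F1 score comes from.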




## Task 2: Analysis
Below, we will explore improving the classifier, but first, investigate a few properties of the dataset:

1. How accurate are people? The `receiver_labels` field records whether the receiver of the message thought it was the truth (`true`) or not (`false`). For cases where they did not provide a label (`NOANNOTATION`), treat it as `true`, i.e., as if they thought the message was truthful. Calculate the precision and recall of the receivers, using `sender_labels` as the ground truth.

2. Above we just used the text, but there is more information available. For each of the following fields, calculate the percentage of messages that are lies for each possible value: `speakers`, `receivers`, pairs of (`speakers`, `receivers`), `seasons`, `years`.

3. Read through 20 of the training set messages that are lies and 20 that are the truth. Do you notice anything different about them?
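
For item 3, reading messages in file order tends to show many messages from the same conversation, since each JSON line appears to hold one exchange between a pair of players. A minimal sketch of drawing a random sample of each class instead; the `messages` and `labels` lists here are made-up stand-ins for the training data loaded in Task 1, and the helper name `sample_by_label` is hypothetical.

```python
import random

def sample_by_label(messages, labels, label, k, seed=0):
    """Return up to k randomly chosen messages whose label equals `label`."""
    matching = [m for m, l in zip(messages, labels) if l == label]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(matching, min(k, len(matching)))

# Made-up stand-in data (assumption); in the notebook, pass the training
# messages and labels loaded in Task 1.
messages = ["msg a", "msg b", "msg c", "msg d", "msg e"]
labels = [1, 0, 1, 1, 0]

lies = sample_by_label(messages, labels, label=0, k=20)
truths = sample_by_label(messages, labels, label=1, k=20)
print(len(lies), len(truths))
```

The `min(k, len(matching))` guard keeps `rng.sample` from raising an error when fewer than `k` messages carry the requested label.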

### Item 1

In [None]:
# Item 1: measure how accurate the receivers are, i.e., how well the receiver's
# judgement of each message matches the sender's label.
# We evaluate this with the precision and recall of the receiver labels.
def readfile_v2(filename):
    examples_label_sender = []
    examples_label_receiver = []
    with open(filename, encoding="utf-8") as f:
        for line in f:
            data = json.loads(line.strip())
            for tlabel, rlabel in zip(data['sender_labels'], data['receiver_labels']):
                examples_label_sender.append(1 if tlabel else 0)
                # Only an explicit False counts as 0; True and "NOANNOTATION" both count as 1
                examples_label_receiver.append(0 if rlabel is False else 1)
    return np.array(examples_label_sender), np.array(examples_label_receiver)


train_v2 = readfile_v2("mod-train.jsonl")
valid_v2 = readfile_v2("mod-validation.jsonl")
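
To make the `NOANNOTATION` handling concrete, here is a small sketch that encodes the `receiver_labels` of the example record shown at the top of this notebook. It stands alone and does not read the data files.

```python
# receiver_labels from the example record shown earlier, as json.loads returns them
receiver_labels = [True, True, True, True, "NOANNOTATION", "NOANNOTATION",
                   "NOANNOTATION", True, False, True]

# Only an explicit False means "the receiver thought it was a lie" (0);
# True and "NOANNOTATION" both map to 1.
encoded = [0 if label is False else 1 for label in receiver_labels]
print(encoded)
```

In that record the sender marked all ten messages as truthful, so the single `0` is a case where the receiver suspected a lie that was not one.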

In [None]:
# Sender labels are the ground truth (y_true); receiver labels play the role of predictions (y_pred)
print(np.mean(valid_v2[0] == valid_v2[1]))
print(metrics.classification_report(valid_v2[0], valid_v2[1]))
print(metrics.confusion_matrix(valid_v2[0], valid_v2[1]))

0.9284687275341481
              precision    recall  f1-score   support

           0       0.06      0.04      0.05       124
           1       0.96      0.97      0.96      2658

    accuracy                           0.93      2782
   macro avg       0.51      0.51      0.51      2782
weighted avg       0.92      0.93      0.92      2782

[[   5  119]
 [  80 2578]]


In [None]:
# Sender labels are the ground truth (y_true); receiver labels play the role of predictions (y_pred)
print(np.mean(train_v2[0] == train_v2[1]))
print(metrics.classification_report(train_v2[0], train_v2[1]))
print(metrics.confusion_matrix(train_v2[0], train_v2[1]))

0.921553629100799
              precision    recall  f1-score   support

           0       0.12      0.12      0.12       523
           1       0.96      0.96      0.96     11243

    accuracy                           0.92     11766
   macro avg       0.54      0.54      0.54     11766
weighted avg       0.92      0.92      0.92     11766

[[   64   459]
 [  464 10779]]


### Item 2

In [None]:
# Item 2: collect the sender labels together with each metadata field, so that we can
# compute the proportion of lies for every possible value of each field.

def readfile_full(filename):
    examples_label_sender = []
    speakers_ = []
    receivers_ = []
    speakers_and_receivers_ = []
    seasons_ = []
    years_ = []
    with open(filename, encoding="utf-8") as f:
        for line in f:
            data = json.loads(line.strip())
            for tlabel, speaker, receiver, season, year in zip(data['sender_labels'], data['speakers'], data['receivers'], data['seasons'], data['years']):
                examples_label_sender.append(1 if tlabel else 0)
                speakers_.append(speaker)
                receivers_.append(receiver)
                speakers_and_receivers_.append((speaker, receiver))
                seasons_.append(season)
                years_.append(year)
    return np.array(examples_label_sender), speakers_, receivers_, speakers_and_receivers_, seasons_, years_


train_full = readfile_full("mod-train.jsonl")
valid_full = readfile_full("mod-validation.jsonl")

In [None]:
import numpy as np
import json
from collections import defaultdict

def compute_lie_percentages(labels, values):
    """
    Compute the percentage of messages that are lies (0) for each unique value in a given field.

    :param labels: List of binary labels (1 = truth, 0 = lie)
    :param values: List of categorical values (e.g., speakers, receivers)
    :return: Dictionary with percentage of lies for each unique value
    """
    total_counts = defaultdict(int)
    lie_counts = defaultdict(int)

    for label, value in zip(labels, values):
        total_counts[value] += 1
        if label == 0:
            lie_counts[value] += 1

    percentages = {key: (lie_counts[key] / total_counts[key]) * 100 for key in total_counts}
    return percentages

def compute_lie_percentages_for_all(train_data):
    """
    Computes lie percentages for speakers, receivers, (speakers, receivers) pairs, seasons, and years.
    """
    labels, speakers, receivers, speakers_receivers, seasons, years = train_data

    # Compute lie percentages
    speaker_lie_percent = compute_lie_percentages(labels, speakers)
    receiver_lie_percent = compute_lie_percentages(labels, receivers)
    speaker_receiver_lie_percent = compute_lie_percentages(labels, speakers_receivers)
    season_lie_percent = compute_lie_percentages(labels, seasons)
    year_lie_percent = compute_lie_percentages(labels, years)

    return {
        "Speaker Lie Percentage": speaker_lie_percent,
        "Receiver Lie Percentage": receiver_lie_percent,
        "Speaker-Receiver Pair Lie Percentage": speaker_receiver_lie_percent,
        "Season Lie Percentage": season_lie_percent,
        "Year Lie Percentage": year_lie_percent,
    }


In [None]:
# Compute lie percentages
lie_percentages_train = compute_lie_percentages_for_all(train_full)
lie_percentages_valid = compute_lie_percentages_for_all(valid_full)

# Display results (speakers only; the other fields can be shown the same way)
import pandas as pd
df = pd.DataFrame(lie_percentages_train["Speaker Lie Percentage"].items(), columns=["Speaker", "Lie Percentage (%)"])

In [None]:
df

|   | Speaker | Lie Percentage (%) |
|---|---------|--------------------|
| 0 | italy   | 8.547718 |
| 1 | germany | 2.122547 |
| 2 | austria | 2.928071 |
| 3 | russia  | 2.838221 |
| 4 | england | 3.31384 |
| 5 | turkey  | 3.000883 |
| 6 | france  | 8.221797 |
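
Sorting these values makes the pattern easier to read. The sketch below copies the speaker percentages from the table above into a plain dictionary (rather than reusing the notebook's variables) and ranks the speakers from most to least frequent liar.

```python
# Speaker lie percentages copied from the table above (training set).
speaker_lie_percent = {
    "italy": 8.547718, "germany": 2.122547, "austria": 2.928071,
    "russia": 2.838221, "england": 3.31384, "turkey": 3.000883,
    "france": 8.221797,
}

# Rank speakers from highest to lowest lie percentage
ranked = sorted(speaker_lie_percent.items(), key=lambda kv: kv[1], reverse=True)
for speaker, pct in ranked:
    print(f"{speaker:<8} {pct:5.2f}%")
```

Italy and France stand out at over 8%, while the other five powers sit between roughly 2% and 3.5%. These are rates in this particular training set, so they may reflect the individual players in these games more than the countries themselves.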


### Item 3

In [None]:
# Extract messages and labels from training data
messages, labels = train

# Store first 20 truth messages (label = 1) and first 20 lies (label = 0)
truth_messages = []
lie_messages = []

# Iterate through training data and collect messages
for msg, label in zip(messages, labels):
    if label == 1 and len(truth_messages) < 20:
        truth_messages.append(msg)
    elif label == 0 and len(lie_messages) < 20:
        lie_messages.append(msg)
    # Stop when we have 20 of each
    if len(truth_messages) == 20 and len(lie_messages) == 20:
        break

# Print results
print("First 20 Truth Messages:")
for i, msg in enumerate(truth_messages, 1):
    print(f"{i}. {msg}")

print("\nFirst 20 Lie Messages:")
for i, msg in enumerate(lie_messages, 1):
    print(f"{i}. {msg}")


First 20 Truth Messages:
1. Germany!

Just the person I want to speak with. I have a somewhat crazy idea that I’ve always wanted to try with I/G, but I’ve never actually convinced the other guy to try it. And, what’s worse, it might make you suspicious of me. 

So...do I suggest it?

I’m thinking that this is a low stakes game, not a tournament or anything, and an interesting and unusual move set might make it more fun? That’s my hope anyway.

What is your appetite like for unusual and crazy?
2. You've whet my appetite, Italy. What's the suggestion?
3. 👍
4. It seems like there are a lot of ways that could go wrong...I don't see why France would see you approaching/taking Munich--while I do nothing about it--and not immediately feel skittish
5. Yeah, I can’t say I’ve tried it and it works, cause I’ve never tried it or seen it. But how I think it would work is (a) my Spring move looks like an attack on Austria, so it would not be surprising if you did not cover Munich. Then (b) you build