# Assignment <span style="color:red">option Four</span> - News Categorization using PyTorch

Download the dataset from https://www.kaggle.com/uciml/news-aggregator-dataset and develop a news classification or categorization model. The dataset contain only titles of a news item and some metadata. The categories of the news items include one of: –<span  style="color:red"> b</span> : business – <span  style="color:red">t</span> : science and technology – <span  style="color:red">e</span> : entertainment and –<span  style="color:red">m</span> : health.

1. Prepare training and test dataset: Split the data into training and test set (80% train and 20% test). Make sure they are balanced, otherwise if all <span  style="color:red">b</span> files are on training, your model fails to predict <span  style="color:red">t</span> files in test.
2. Binary classification: produce training data for each two categories, such as <span  style="color:red">b </span> and <span  style="color:red"> t</span>, <span  style="color:red">b</span> and <span  style="color:red"> m</span>, <span  style="color:red">e</span> and <span  style="color:red">t</span> and so on. Evaluate the performance and report which categories are easier for the models.
3. Adapt the Text Categorization PyTorch code (see above) and evaluate the performance of the system for these task
4. Use a pre-trained embeddings and compare your result. When you use pre-trained embeddings, you have to average the word embeddings of each tokens in ach document to get the unique representation of the document. DOC_EMBEDDING = (TOKEN1_EMBEDDING + ... + TOKENn_EMBEDDING). You can also use some of the <span  style="color:red">spacy/FLAIR </span>document embedding methods
5. Report the recall, precision, and F1 scores for both binary and multi-class classification.


# Task 1

1. Prepare training and test dataset: Split the data into training and test set (80% train and 20% test). Make sure they are balanced, otherwise if all <span  style="color:red">b</span> files are on training, your model fails to predict <span  style="color:red">t</span> files in test.


In [2]:
import pandas as pd
from sklearn.model_selection import train_test_split

# read data
data = pd.read_csv("data/uci-news-aggregator.csv")
# remove unnecessary columns
frame = data[["TITLE", "CATEGORY"]]

TEST_SIZE = 0.2

# Division into training and test data. The stratify parameter causes the "Category" feature to be split equally
training_data, testing_data = train_test_split(
    frame, test_size=TEST_SIZE, random_state=0, stratify=data["CATEGORY"]
)

# print size of train and test set
print("Trainingsdaten: ", len(training_data))
print("Testdaten: ", len(testing_data))

Trainingsdaten:  337935
Testdaten:  84484


# Task 2

Binary classification: produce training data for each two categories, such as b and t, b
and m, e and t and so on. Evaluate the performance and report which categories are
easier for the models.


In [13]:
from itertools import combinations

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    f1_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
import pickle


# Define the categories
categories = ["b", "t", "e", "m"]

# get all possible combinations
combinations_categories = list(combinations(categories, 2))

# print combinations
# for combination in possible_combinations:
#    print(combination)

# loop through each category combination
for category_pair in combinations_categories:
    category_1, category_2 = category_pair

    # only keep data of category pair
    filtered_training_data = training_data[
        (training_data["CATEGORY"] == category_1)
        | (training_data["CATEGORY"] == category_2)
    ]
    filtered_test_data = testing_data[
        (testing_data["CATEGORY"] == category_1)
        | (testing_data["CATEGORY"] == category_2)
    ]

    # Create a binary dataset for the current category pair
    cat_mapping = {category_1: 1, category_2: 0}
    filtered_training_data["CATEGORY_IN_BINARY"] = filtered_training_data[
        "CATEGORY"
    ].map(cat_mapping)
    filtered_test_data["CATEGORY_IN_BINARY"] = filtered_test_data["CATEGORY"].map(
        cat_mapping
    )

    # print(filtered_training_data)

    # split the binary dataset into features (X) und labels (y)
    X_train = filtered_training_data["TITLE"]
    y_train = filtered_training_data["CATEGORY_IN_BINARY"]
    X_test = filtered_test_data["TITLE"]
    y_test = filtered_test_data["CATEGORY_IN_BINARY"]

    # vectorize the titles using TF-IDF
    vectorizer = TfidfVectorizer()
    X_train_tfidf = vectorizer.fit_transform(X_train)
    X_test_tfidf = vectorizer.transform(X_test)

    # save vectorizer
    with open(f'vectorizer/tasktwo_{category_1}_{category_2}.pkl', 'wb') as vectorizer_file:
        pickle.dump(vectorizer, vectorizer_file)

    # train a Naive Bayes classifier
    classifier = MultinomialNB()
    classifier.fit(X_train_tfidf, y_train)



    # Save the trained model
    with open(f'models/tasktwo_{category_1}_{category_2}.pkl', 'wb') as model_file:
        pickle.dump(classifier, model_file)

    # make predictions on the test set
    predictions = classifier.predict(X_test_tfidf)

    # evaluate performance
    accuracy = accuracy_score(y_test, predictions)
    precision = precision_score(y_test, predictions)
    recall = recall_score(y_test, predictions)
    f1 = f1_score(y_test, predictions)

    # this report gives further information
    report = classification_report(y_test, predictions)

    # print results
    print("----------------------------------------------------------")
    print(
        f"Category Pair: {category_1} ({cat_mapping[category_1]}) vs {category_2} ({cat_mapping[category_2]})"
    )
    print("------------------PERFORMANCE-----------------------------")
    print(f"Accuracy: {accuracy:.2f}")
    print(f"Precision: {precision:.2f}")
    print(f"Recall: {recall:.2f}")
    print(f"F1_score: {f1:.2f}")
    print("--------------------REPORT--------------------------------")
    print("Classification Report:\n", report)
    print("----------------------------------------------------------")
    print("\n\n")

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_training_data["CATEGORY_IN_BINARY"] = filtered_training_data[
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_test_data["CATEGORY_IN_BINARY"] = filtered_test_data["CATEGORY"].map(


----------------------------------------------------------
Category Pair: b (1) vs t (0)
------------------PERFORMANCE-----------------------------
Accuracy: 0.93
Precision: 0.93
Recall: 0.93
F1_score: 0.93
--------------------REPORT--------------------------------
Classification Report:
               precision    recall  f1-score   support

           0       0.92      0.93      0.92     21669
           1       0.93      0.93      0.93     23193

    accuracy                           0.93     44862
   macro avg       0.93      0.93      0.93     44862
weighted avg       0.93      0.93      0.93     44862

----------------------------------------------------------





A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_training_data["CATEGORY_IN_BINARY"] = filtered_training_data[
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_test_data["CATEGORY_IN_BINARY"] = filtered_test_data["CATEGORY"].map(


----------------------------------------------------------
Category Pair: b (1) vs e (0)
------------------PERFORMANCE-----------------------------
Accuracy: 0.98
Precision: 0.98
Recall: 0.97
F1_score: 0.97
--------------------REPORT--------------------------------
Classification Report:
               precision    recall  f1-score   support

           0       0.98      0.98      0.98     30494
           1       0.98      0.97      0.97     23193

    accuracy                           0.98     53687
   macro avg       0.98      0.98      0.98     53687
weighted avg       0.98      0.98      0.98     53687

----------------------------------------------------------





A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_training_data["CATEGORY_IN_BINARY"] = filtered_training_data[
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_test_data["CATEGORY_IN_BINARY"] = filtered_test_data["CATEGORY"].map(


----------------------------------------------------------
Category Pair: b (1) vs m (0)
------------------PERFORMANCE-----------------------------
Accuracy: 0.97
Precision: 0.97
Recall: 0.99
F1_score: 0.98
--------------------REPORT--------------------------------
Classification Report:
               precision    recall  f1-score   support

           0       0.97      0.93      0.95      9128
           1       0.97      0.99      0.98     23193

    accuracy                           0.97     32321
   macro avg       0.97      0.96      0.97     32321
weighted avg       0.97      0.97      0.97     32321

----------------------------------------------------------





A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_training_data["CATEGORY_IN_BINARY"] = filtered_training_data[
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_test_data["CATEGORY_IN_BINARY"] = filtered_test_data["CATEGORY"].map(


----------------------------------------------------------
Category Pair: t (1) vs e (0)
------------------PERFORMANCE-----------------------------
Accuracy: 0.98
Precision: 0.97
Recall: 0.97
F1_score: 0.97
--------------------REPORT--------------------------------
Classification Report:
               precision    recall  f1-score   support

           0       0.98      0.98      0.98     30494
           1       0.97      0.97      0.97     21669

    accuracy                           0.98     52163
   macro avg       0.98      0.98      0.98     52163
weighted avg       0.98      0.98      0.98     52163

----------------------------------------------------------





A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_training_data["CATEGORY_IN_BINARY"] = filtered_training_data[
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_test_data["CATEGORY_IN_BINARY"] = filtered_test_data["CATEGORY"].map(


----------------------------------------------------------
Category Pair: t (1) vs m (0)
------------------PERFORMANCE-----------------------------
Accuracy: 0.98
Precision: 0.97
Recall: 0.99
F1_score: 0.98
--------------------REPORT--------------------------------
Classification Report:
               precision    recall  f1-score   support

           0       0.98      0.94      0.96      9128
           1       0.97      0.99      0.98     21669

    accuracy                           0.98     30797
   macro avg       0.98      0.97      0.97     30797
weighted avg       0.98      0.98      0.98     30797

----------------------------------------------------------





A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_training_data["CATEGORY_IN_BINARY"] = filtered_training_data[
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_test_data["CATEGORY_IN_BINARY"] = filtered_test_data["CATEGORY"].map(


----------------------------------------------------------
Category Pair: e (1) vs m (0)
------------------PERFORMANCE-----------------------------
Accuracy: 0.98
Precision: 0.98
Recall: 0.99
F1_score: 0.99
--------------------REPORT--------------------------------
Classification Report:
               precision    recall  f1-score   support

           0       0.98      0.93      0.95      9128
           1       0.98      0.99      0.99     30494

    accuracy                           0.98     39622
   macro avg       0.98      0.96      0.97     39622
weighted avg       0.98      0.98      0.98     39622

----------------------------------------------------------





# Task 3

Adapt the Text Categorization PyTorch code (see above) and evaluate the performance
of the system for these task


In [17]:
from collections import Counter
import numpy as np
import torch
import torch.nn as nn

# used code from lecture notebook but exchanged the data
vocab = Counter()
for text in training_data["TITLE"]:
    for word in text.split(" "):
        vocab[word.lower()] += 1

for text in testing_data["TITLE"]:
    for word in text.split(" "):
        vocab[word.lower()] += 1

total_words = len(vocab)


def get_word_2_index(vocab):
    word2index = {}
    for i, word in enumerate(vocab):
        word2index[word.lower()] = i
    return word2index


word2index = get_word_2_index(vocab)


def get_batch(df, i, batch_size):
    batches = []
    results = []

    # used iloc from pandas package because working with dataframe not array
    # extracting batch of data from dataframe
    texts = df["TITLE"].iloc[i * batch_size : i * batch_size + batch_size]
    categories = df["CATEGORY"].iloc[i * batch_size : i * batch_size + batch_size]

    for text in texts:
        layer = np.zeros(total_words, dtype=float)
        for word in text.split(" "):
            layer[word2index[word.lower()]] += 1
        batches.append(layer)

    # convert categories to numbers
    for category in categories:
        index_y = -1
        if category == "b":
            index_y = 0
        elif category == "t":
            index_y = 1
        elif category == "e":
            index_y = 2
        elif category == "m":
            index_y = 3
        results.append(index_y)

    return np.array(batches), np.array(results)


# Parameters
learning_rate = 0.05
num_epochs = (
    1  # changed epoch size so training is faster, you can increase it if you want
)
batch_size = 150
display_step = 1

# Network Parameters
hidden_size = 100  # 1st layer and 2nd layer number of feature
input_size = total_words  # Words in vocab
print(input_size)
print("--------------------")
num_classes = 4

# select gpu (cuda) as method for faster training
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Verwendetes Gerät:", device)

if device.type == "cuda":
    torch.cuda.empty_cache()  # empty cache -> otherwise there were sometimes errors


class NewsNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NewsNN, self).__init__()
        self.layer_1 = nn.Linear(input_size, hidden_size, bias=True)
        self.relu = nn.ReLU()
        self.layer_2 = nn.Linear(hidden_size, hidden_size, bias=True)
        self.output_layer = nn.Linear(hidden_size, num_classes, bias=True)

    def forward(self, x):
        out = self.layer_1(x)
        out = self.relu(out)
        out = self.layer_2(out)
        out = self.relu(out)
        out = self.output_layer(out)
        return out


# with "to()" you can easily switch between CPU and GPU without changing the rest of your code
# had some problems with it so we added it
news_net = NewsNN(input_size, hidden_size, num_classes).to(device)
# Loss and Optimizer
criterion = nn.CrossEntropyLoss()  # This includes the Softmax loss function
optimizer = torch.optim.Adam(news_net.parameters(), lr=learning_rate)

# Train the Model
for epoch in range(num_epochs):
    # determine the number of min-batches based on the batch size and size of training data - exchanged the data
    total_batch = int(len(training_data) / batch_size)
    # Loop over all batches
    for i in range(total_batch):
        batch_x, batch_y = get_batch(training_data, i, batch_size)
        articles = torch.FloatTensor(batch_x).to(device)
        labels = torch.LongTensor(batch_y).to(device)
        # print("articles",articles)
        # print(batch_x, labels)
        # print("size labels",labels.size())

        # Forward + Backward + Optimize
        optimizer.zero_grad()  # zero the gradient buffer
        outputs = news_net(articles)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        if (i + 1) % 4 == 0:
            print(
                "Epoch [%d/%d], Step [%d/%d], Loss: %.4f"
                % (
                    epoch + 1,
                    num_epochs,
                    i + 1,
                    len(training_data) / batch_size,
                    loss.data,
                )
            )

# Save the model
torch.save(news_net.state_dict(), 'models/taskthree.pth')

# show the different trained parameters
for name, param in news_net.named_parameters():
    if param.requires_grad:
        print("Name--->", name, "\nValues--->", param.data)

# set model to evaluation mode
news_net.eval()
total_test_batches = int(len(testing_data) / batch_size)

with torch.no_grad():
    # create empty result arrays
    all_predicted = []
    all_labels = []
    # iterate through each of the batches
    for i in range(total_test_batches):
        # get data of corresponding batch
        test_batch_x, test_batch_y = get_batch(testing_data, i, batch_size)
        test_articles = torch.FloatTensor(test_batch_x).to(device)
        test_labels = torch.LongTensor(test_batch_y).to(device)
        # get data into NN and get predicted labels
        test_outputs = news_net(test_articles)
        _, predicted = torch.max(test_outputs.data, 1)
        
        # we need .cpu() because we did .to(device) which was mainly gpu
        all_predicted.extend(predicted.cpu().numpy())
        all_labels.extend(test_labels.cpu().numpy())

# print / create classification report for the predicted data
print("-------------------------------")
print(
    classification_report(all_labels, all_predicted, target_names=["b", "t", "e", "m"])
)

135402
--------------------
Verwendetes Gerät: cpu
Epoch [1/1], Step [4/2252], Loss: 1.1010
Epoch [1/1], Step [8/2252], Loss: 0.8581
Epoch [1/1], Step [12/2252], Loss: 0.7082
Epoch [1/1], Step [16/2252], Loss: 0.8497


KeyboardInterrupt: 

: 

# Task 4

Use a pre-trained embeddings and compare your result. When you use pre-trained
embeddings, you have to average the word embeddings of each tokens in ach
document to get the unique representation of the document. DOC_EMBEDDING =
(TOKEN1_EMBEDDING + ... + TOKENn_EMBEDDING). You can also use some of the
spacy/FLAIR document embedding methods


In [None]:
import spacy
import torch.optim as optim


# Network Parameters
hidden_size = 100  # 1st layer and 2nd layer number of feature
input_size = total_words  # Words in vocab

# Set hyperparameters
embedding_dim = 300  # SpaCy provides 300-dimensional word vectors
num_classes = 4
hidden_dim = 100
num_epochs = 1
batch_size = 150
learning_rate = 0.01

# Tokenization and embeddings using spacy with the larger English model
nlp = spacy.load("en_core_web_sm", disable=["tagger", "parser", "ner"])


def calculate_average_embedding(text):
    doc = nlp(text)
    # Use the vector attribute to get the word vectors
    embeddings = [token.vector for token in doc]
    if embeddings:
        return np.mean(embeddings, axis=0)
    else:
        return np.zeros(embedding_dim)


# Apply tokenization and embeddings to the dataset
training_data["SPACY_EMBEDDING"] = training_data["TITLE"].apply(
    calculate_average_embedding
)
testing_data["SPACY_EMBEDDING"] = testing_data["TITLE"].apply(
    calculate_average_embedding
)

# Convert embeddings to torch tensors
train_embeddings = torch.tensor(np.vstack(training_data["SPACY_EMBEDDING"].to_numpy()))
test_embeddings = torch.tensor(np.vstack(testing_data["SPACY_EMBEDDING"].to_numpy()))

label_mapping = {"b": 0, "t": 1, "e": 2, "m": 3}

# Convert labels to torch tensors
train_labels = torch.tensor(training_data["CATEGORY"].map(label_mapping).to_numpy())
test_labels = torch.tensor(testing_data["CATEGORY"].map(label_mapping).to_numpy())

# Instantiate the model
model = NewsNN(embedding_dim, hidden_dim, num_classes)

# Loss and optimizer
loss_function = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)


for epoch in range(num_epochs):
    for i in range(0, len(train_embeddings), batch_size):
        inputs = train_embeddings[i : i + batch_size]
        labels = train_labels[i : i + batch_size]

        optimizer.zero_grad()
        outputs = model(inputs.float())
        loss = loss_function(outputs, labels)
        loss.backward()
        optimizer.step()
        if ((i // batch_size) + 1) % 2 == 0:
            print(
                "Epoch [%d/%d], Step [%d/%d], Loss: %.4f"
                % (
                    epoch + 1,
                    num_epochs,
                    i // batch_size,
                    len(train_embeddings) // batch_size,
                    loss.data,
                )
            )

# Evaluate on the test set
with torch.no_grad():
    test_outputs = model(test_embeddings.float())
    _, test_predictions = torch.max(test_outputs, 1)

test_predictions = test_predictions.numpy()
test_labels = test_labels.numpy()

# Print evaluation metrics
print("Classification Report:")
print(classification_report(test_labels, test_predictions, target_names=label_mapping))

In [9]:
from torchtext.vocab import GloVe

input_size = 300  # Assuming 300-dimensional GloVe embeddings
output_size = 4
hidden_size = 100  # 1st layer and 2nd layer number of features
num_epochs = 1
batch_size = 3000
learning_rate = 0.02

label_mapping = {"b": 0, "t": 1, "e": 2, "m": 3}

# Load GloVe embeddings
glove = GloVe(name="6B", dim=300)

# Tokenization and embeddings using spacy
nlp = spacy.load("en_core_web_sm", disable=["tagger", "parser", "ner"])


def get_average_embedding(text):
    tokens = nlp(text)
    embeddings = [
        glove[token.text].numpy() for token in tokens if token.text in glove.stoi
    ]
    if embeddings:
        return np.mean(embeddings, axis=0)
    else:
        return np.zeros(input_size)  # Return zeros if no embeddings are found


# Apply tokenization and embeddings to the dataset
training_data["EMBEDDING_GLOVE"] = training_data["TITLE"].apply(get_average_embedding)
testing_data["EMBEDDING_GLOVE"] = testing_data["TITLE"].apply(get_average_embedding)

# Convert embeddings to torch tensors
train_embeddings = torch.tensor(np.vstack(training_data["EMBEDDING_GLOVE"].to_numpy()))
test_embeddings = torch.tensor(np.vstack(testing_data["EMBEDDING_GLOVE"].to_numpy()))

# Convert labels to torch tensors
train_labels = torch.tensor(training_data["CATEGORY"].map(label_mapping).to_numpy())
test_labels = torch.tensor(testing_data["CATEGORY"].map(label_mapping).to_numpy())

# Instantiate the model
model = NewsNN(input_size, hidden_size, output_size)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)


for epoch in range(num_epochs):
    for i in range(0, len(train_embeddings), batch_size):
        inputs = train_embeddings[i : i + batch_size]
        labels = train_labels[i : i + batch_size]

        optimizer.zero_grad()
        outputs = model(inputs.float())
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        if ((i / batch_size) + 1) % 2 == 0:
            print(
                "Epoch [%d/%d], Step [%d/%d], Loss: %.4f"
                % (
                    epoch + 1,
                    num_epochs,
                    i / batch_size,
                    len(train_embeddings) / batch_size,
                    loss.data,
                )
            )


# Evaluate on the test set
with torch.no_grad():
    test_outputs = model(test_embeddings.float())
    _, test_predictions = torch.max(test_outputs, 1)

test_predictions = test_predictions.numpy()
test_labels = test_labels.numpy()

# Print evaluation metrics
print("Classification Report:")
print(classification_report(test_labels, test_predictions, target_names=label_mapping))

.vector_cache/glove.6B.zip: 862MB [02:46, 5.19MB/s]                           
100%|█████████▉| 399999/400000 [01:02<00:00, 6414.16it/s]


Epoch [1/1], Step [1/112], Loss: 1.2954
Epoch [1/1], Step [3/112], Loss: 1.1915
Epoch [1/1], Step [5/112], Loss: 1.1502
Epoch [1/1], Step [7/112], Loss: 1.0810
Epoch [1/1], Step [9/112], Loss: 1.0278
Epoch [1/1], Step [11/112], Loss: 1.0270
Epoch [1/1], Step [13/112], Loss: 0.9577
Epoch [1/1], Step [15/112], Loss: 0.9733
Epoch [1/1], Step [17/112], Loss: 0.9294
Epoch [1/1], Step [19/112], Loss: 0.9462
Epoch [1/1], Step [21/112], Loss: 0.9344
Epoch [1/1], Step [23/112], Loss: 0.9276
Epoch [1/1], Step [25/112], Loss: 0.9029
Epoch [1/1], Step [27/112], Loss: 0.9286
Epoch [1/1], Step [29/112], Loss: 0.8933
Epoch [1/1], Step [31/112], Loss: 0.8681
Epoch [1/1], Step [33/112], Loss: 0.8798
Epoch [1/1], Step [35/112], Loss: 0.8652
Epoch [1/1], Step [37/112], Loss: 0.8738
Epoch [1/1], Step [39/112], Loss: 0.8865
Epoch [1/1], Step [41/112], Loss: 0.8775
Epoch [1/1], Step [43/112], Loss: 0.8639
Epoch [1/1], Step [45/112], Loss: 0.8455
Epoch [1/1], Step [47/112], Loss: 0.8664
Epoch [1/1], Step [49