# Text Mining - Assignment 3 (35 points total)
This **Home Assignment** is to be submitted and you will be given points for each of the tasks. 
Assume the code is run under python 3.8.
You can use libraries like pytorch, torchvision, torchtext, numpy, nltk, sklearn, and all python standard libraries.

## Formalities
**Submit in a group of 3-4 people by 18.01.2022, 23:59 CET. The deadline is strict!**

Deadline for early submission (prelim feedback): 09.01.2022 23:59.


## Evaluation and Grading
General advice for programming exercises at *CSSH*:
Evaluation of your submission is done semi-automatically. Think of it as this notebook being 
executed once. Afterwards, some test functions are appended to this file and executed as well.

Therefore:
* Submit valid _Python3_ code only!
* Use external libraries only when specified by task.
* Ensure your definitions (functions, classes, methods, variables) follow the specification if
  given. The concrete signature of e.g. a function usually can be inferred from task description, 
  code skeletons and test cases.
* Ensure the notebook does not rely on current notebook or system state!
  * Use `Kernel --> Restart & Run All` to see if you are using any definitions, variables etc. that 
    are not in scope anymore.
* Keep your code idempotent! Running it or parts of it multiple times must not yield different
  results. Minimize usage of global variables.
* Ensure your code / notebook terminates in reasonable time.

**There's a story behind each of these points! Don't expect us to fix your stuff!**

Regarding the scores, you will get no points for a task if your function:
- throws an unexpected error (e.g. takes the wrong number of arguments)
- gets stuck in an infinite loop
- takes much longer than expected (e.g. >1s to compute the mean of two numbers)
- does not produce the desired output (e.g. returns a list sorted in descending order even though we asked for ascending, returns the mean and the std even though we asked for only the mean, or prints an output instead of returning it)

In [1]:
# credentials of all team members
team_members = [
    {
        'first_name': 'Leonardo',
        'last_name': 'Gomes da Matta e Silva',
        'student_id': 384657
    },
    {
        'first_name': 'Florisa',
        'last_name': 'Zanier',
        'student_id': 317700
    },
    {
        'first_name': 'Felix',
        'last_name': 'Paulig',
        'student_id': 394924
    },
    {
        'first_name': 'Lisa',
        'last_name': 'Pühl',
        'student_id': 394649
    }
]

## The Task
We will have a "kaggle"-like competition within the class. You will be given a task and you will have to find a solution that maximizes the accuracy on a held-out test set.
This home assignment will allow you more freedom than those you have previously encountered.

The goal is to perform text classification. You are given a set of small text snippets, each with a label (0 to 5, both included) that you shall use for training.
Your model will be evaluated on a held-out dataset. You provide us (among other things) with this notebook, which is used to load your pretrained model (which is then scored according to accuracy on the held-out data) and to run some other simple tests.

You will have to repeat these steps for one RNN and one CNN architecture:
- Write a dataset class in order to load the data.
- Write a model class that is called `RNN`/`CNN` and has an init function which works when called without arguments.
- Write a function `get_<rnn/cnn>_dataset` that takes a path to a dataset file. This file will be in JSONL format and each line will contain a string stored at key `"text"` and an integer label at key `"label"`.
- Write a function `train_<rnn/cnn>` which performs training of your RNN/CNN architecture.

Please experiment with different RNN/CNN architectures and find one which yields good accuracy.
Also, provide a report (PDF format) which contains a visualization of your architecture and corresponding links/citations if you took inspiration from other code/literature for this assignment.


This home assignment will be worth 35 points distributed as follows:

- 4 + 4 points for the report (4 + 4 means 4 for RNN, 4 for CNN)
  - visualization of final architecture is included (see e.g. https://github.com/szagoruyko/pytorchviz, not optimal but a starting point)
- 8 + 8 points for a working architecture
  - model can be loaded
  - model can be initialized (by calling `RNN()`/`CNN()`)
  - model can be trained
  - you provide a dataset class
- up to 11 points for the accuracy of your model

|Accuracy (%)|$<55$|$\geq55$|$\geq60$|$\geq65$|$\geq70$|$\geq75$|$\geq80$|$\geq85$|$\geq90$|
|---|---|---|---|---|---|---|---|---|---|
|Points|$0$|$4$|$5$|$6$|$7$|$8$|$9$|$10$|$11$|
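
The table can be read as a step function. The sketch below is only an illustration: `points_for_accuracy` is a hypothetical helper, and it assumes the accuracy is given in percent (note that `accuracy_score` from sklearn returns a fraction between 0 and 1, which would have to be multiplied by 100 first).

```python
from bisect import bisect_right

# Lower accuracy bounds (in percent) and the points awarded, copied from the table
THRESHOLDS = [55, 60, 65, 70, 75, 80, 85, 90]
POINTS = [0, 4, 5, 6, 7, 8, 9, 10, 11]

def points_for_accuracy(accuracy_percent: float) -> int:
    """Return the points for a held-out accuracy given in percent (hypothetical helper)."""
    # bisect_right counts how many thresholds are <= accuracy_percent
    return POINTS[bisect_right(THRESHOLDS, accuracy_percent)]

print(points_for_accuracy(72.5))  # 7
```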


It is very important that your model can be loaded!

Please submit the following files:
- Your jupyter notebook (make sure you have provided **credentials**)
- A report (PDF) containing visualizations for both your models
- `rnn.pt` (trained model parameters for `RNN`, see code below)
- `cnn.pt` (trained model parameters for `CNN`, see code below)
- `words2index_rnn.csv`, `words2index_cnn.csv`: files that, if necessary, are used to map words to indices. In principle you can put any kind of data here (you alone are responsible for working with this data), but please stick to these filenames including their extensions, even if you saved e.g. a pickled object.

If your files `rnn.pt` and `cnn.pt` exceed 250MB, please provide us with a link to the files by putting the link into `rnn.txt` and/or `cnn.txt`. We will not go to great lengths to get the files, though. We will simply run: `wget -i rnn.txt -O rnn.pt`. Therefore, if this case applies to your solution, make sure before submission that the files can be retrieved this way!

You are allowed to use a GPU for training, but make sure your model is loaded into CPU/trained on CPU by default. You are also allowed to take inspiration from literature, but you are not allowed to copy and paste code directly. If you take inspiration, specify references here in the code and in your report. You are allowed to use tricks of the trade for training. If you want to use pretrained word vectors, you are only allowed to use those provided by torchtext (e.g. https://pytorch.org/text/stable/vocab.html#glove).
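
A minimal sketch of saving parameters and loading them back onto the CPU, assuming a toy `nn.Linear` as a stand-in for the real `RNN`/`CNN` classes and an in-memory buffer instead of `rnn.pt`/`cnn.pt`. The `map_location="cpu"` argument of `torch.load` remaps tensors that were saved from a GPU, so the file can be loaded on a machine without CUDA.

```python
import io
import torch
import torch.nn as nn

# Toy stand-in for the RNN/CNN classes (hypothetical; only used for illustration)
model = nn.Linear(4, 6)

# Save only the parameters, as requested for rnn.pt / cnn.pt.
# An in-memory buffer replaces the file so the sketch leaves nothing on disk.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# map_location="cpu" places all loaded tensors on the CPU
state = torch.load(buffer, map_location="cpu")
restored = nn.Linear(4, 6)
restored.load_state_dict(state)

print(all(p.device.type == "cpu" for p in restored.parameters()))
```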

In [2]:
import time
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset

import json
import nltk
from torchtext.vocab import GloVe
import csv

### Architectures

In [3]:
def load_embeddings(word2index_path='words2index_cnn.csv', dim=300):
    """
    Load GloVe embeddings as a tensor. If a words2index file is given, it determines the index of the embedding of each word in the resulting
    tensor.
    """

    # Load GloVe embeddings
    glove = GloVe(name='6B', dim=dim)

    # Load word2index dictionary
    word2index = {}
    with open(word2index_path, 'r',  encoding="utf-8") as file:
        csv_reader = csv.reader(file)
        word2index = {row[0]:int(row[1]) for row in csv_reader}

    # Initialize embeddings (words not in glove vocabulary are randomly initialized, padding is zero vector, unknown token is mean embedding)
    embeddings = np.random.uniform(-0.25,0.25,(len(word2index),dim))
    for word in word2index.keys():
        if word in glove.stoi:
            embeddings[word2index[word]] = glove[word]
    embeddings[word2index['<pad>']] = np.zeros(dim)
    embeddings[word2index['<unk>']] = np.mean(embeddings, axis=0)

    return torch.FloatTensor(embeddings)


class CNN(nn.Module):
    """
    Architecture inspired by the original paper on convolutional networks for sentence classification (Kim, 2014).
    Architecture and embedding implementation details inspired by Tran in https://chriskhanhtran.github.io/posts/cnn-sentence-classification/.
    """
    def __init__(self, embeddings_path='words2index_cnn.csv'):
        super().__init__()
        self.embedding_dim = 300
        self.embeddings = load_embeddings(embeddings_path,self.embedding_dim) # Pretrained word embeddings
        self.vocab_size, self.embeddings_dim = self.embeddings.shape # Size of vocabulary and dimension of embeddings
        self.filter_sizes = [2,3,4] # Size of filter kernels
        self.n_filters = [100,100,100] # Number of filters for each kernel size
        self.n_classes = 6 # Number of classes for output
        self.dropout_p = 0.5 # Dropout probability
        self.freeze_embedding = True # Fix embeddings or not

        # Initialize embedding layer
        self.embedding_layer = nn.Embedding.from_pretrained(self.embeddings, freeze=self.freeze_embedding)

        # Initialize convolution layer
        self.convolution_layer = nn.ModuleList([nn.Conv1d(in_channels=self.embeddings_dim,out_channels=self.n_filters[i],kernel_size=self.filter_sizes[i]) for i in range(len(self.filter_sizes))])
        
        # Fully connected layer
        self.fully_connected_layer = nn.Linear(sum(self.n_filters),self.n_classes) # built-in sum yields a plain Python int

        # Dropout
        self.dropout_layer = nn.Dropout(p=self.dropout_p)


    def forward(self, input):
        # Get embeddings of input indexes and reshape tensor to match convolution requirement
        x_embedding = self.embedding_layer(input).permute(0,2,1)
        
        # Apply convolution layer
        x_convolutions = [conv(x_embedding) for conv in self.convolution_layer]

        # Apply ReLU
        x_relu = [F.relu(conv) for conv in x_convolutions]

        # Max-over-time-pooling
        x_pooling = [F.max_pool1d(conv, kernel_size=conv.shape[2]) for conv in x_relu]

        # Concatenate pooling results
        x_concatenated = torch.cat([pool.squeeze(dim=2) for pool in x_pooling], dim=1)

        # Feed concatenated results into fully connected layer
        x_fully_connected = self.fully_connected_layer(self.dropout_layer(x_concatenated))

        return x_fully_connected
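
The shape bookkeeping in the forward pass above can be traced on toy tensors. This is only a sketch: the sizes are made up for illustration and random vectors stand in for the embedding layer output.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

batch_size, seq_len, emb_dim, n_filters = 2, 7, 5, 3  # toy sizes, chosen for illustration

x = torch.randn(batch_size, seq_len, emb_dim)   # stand-in for the embedding layer output
x = x.permute(0, 2, 1)                          # Conv1d expects (batch, channels, length)

pooled = []
for kernel_size in (2, 3, 4):
    conv = nn.Conv1d(in_channels=emb_dim, out_channels=n_filters, kernel_size=kernel_size)
    feature_map = F.relu(conv(x))               # (batch, n_filters, seq_len - kernel_size + 1)
    # Max-over-time pooling keeps the strongest activation of each filter
    pooled.append(feature_map.max(dim=2).values)  # (batch, n_filters)

features = torch.cat(pooled, dim=1)             # (batch, 3 * n_filters)
print(features.shape)  # torch.Size([2, 9])
```

Note that `Conv1d` raises an error if the padded sequence is shorter than the kernel, so a batch consisting only of sentences with fewer tokens than the largest filter size would need extra padding.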

In [4]:
class RNN(nn.Module):
    """
    Unidirectional LSTM for sentence classification.
    Implementation details inspired by Cheng in https://towardsdatascience.com/lstm-text-classification-using-pytorch-2c6c657f8fc0.
    """
    def __init__(self, embeddings_path='words2index_rnn.csv'):
        super().__init__()
        self.embedding_dim = 300
        self.embeddings = load_embeddings(embeddings_path,self.embedding_dim) # Pretrained word embeddings
        self.vocab_size, self.embeddings_dim = self.embeddings.shape # Size of vocabulary and dimension of embeddings
        self.n_classes = 6 # Number of classes for output
        self.dropout_p = 0.2 # Dropout probability
        self.hidden_dim = 128 # Dimension of hidden states

        # Initialize embedding layer
        self.embedding_layer = nn.Embedding.from_pretrained(self.embeddings)

        # Initialize unidirectional LSTM
        self.lstm = nn.LSTM(input_size=self.embedding_dim,hidden_size=self.hidden_dim,bidirectional=False,batch_first=True)

        # Fully connected layer
        self.fully_connected_layer = nn.Linear(self.hidden_dim,self.n_classes)

        # Dropout
        self.dropout_layer = nn.Dropout(p=self.dropout_p)


    def forward(self, input):
        # Separate input
        x = input[0]
        sent_lens = input[1]

        # Get embeddings of input indexes
        x_embedding = self.embedding_layer(x)

        # Pack input sentences
        x_packed = nn.utils.rnn.pack_padded_sequence(x_embedding, sent_lens, batch_first=True, enforce_sorted=False)

        # Apply LSTM
        x_lstm, _ = self.lstm(x_packed)

        # Repad output
        x_padded, sent_lens = nn.utils.rnn.pad_packed_sequence(x_lstm, batch_first=True)

        # Extract the LSTM output at the last valid time step of each sentence
        x_forward = x_padded[torch.arange(len(x_padded)),sent_lens - 1,:]

        # Feed last hidden states into fully connected layer (with dropout)
        x_fully_connected = self.fully_connected_layer(self.dropout_layer(x_forward))

        return x_fully_connected
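
The packing and last-step extraction used in the forward pass above can be checked on toy data. The sketch below assumes a single-layer unidirectional LSTM with made-up sizes; in that setting the output at the last valid time step of each sequence coincides with the final hidden state `h_n` returned by the LSTM.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)
lstm = nn.LSTM(input_size=4, hidden_size=3, batch_first=True)

x = torch.randn(2, 5, 4)          # two padded sequences of (random) embedding vectors
lengths = torch.tensor([5, 2])    # true lengths; these must live on the CPU

packed = pack_padded_sequence(x, lengths, batch_first=True, enforce_sorted=False)
packed_out, (h_n, _) = lstm(packed)
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)

# Output at the last real (non-padded) time step of every sequence
last = out[torch.arange(out.size(0)), out_lengths - 1]

# For a single-layer unidirectional LSTM this equals the final hidden state
print(torch.allclose(last, h_n[-1]))
```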

### Loading data

For reading the `JSONL` file consider using the `json` library from python (in particular `.loads`) and a loop over the lines of the file.
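
A minimal sketch of that loop, assuming two made-up records and an in-memory `io.StringIO` object in place of the real file:

```python
import io
import json

# Two lines in the format described above; the texts are made up for illustration
jsonl_file = io.StringIO(
    '{"text": "what a great day", "label": 3}\n'
    '{"text": "i am so tired", "label": 0}\n'
)

texts, labels = [], []
for line in jsonl_file:
    line = line.strip()
    if not line:          # skip blank lines, e.g. a trailing newline
        continue
    record = json.loads(line)
    texts.append(record["text"])
    labels.append(record["label"])

print(texts, labels)
```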

In [5]:
from torch.utils.data import DataLoader, Dataset
import re
import csv
import os

class cnn_dataset_class(Dataset):
    """
    Dataset class used for training and evaluating our CNN.
    """
    
    # Initialize data
    def __init__(self, data_points, class_labels):
        super().__init__()

        # Define data points and labels
        self.X = data_points
        self.y = class_labels

        # Define number of samples
        self.n_samples = len(data_points)

    # Indexing
    def __getitem__(self, index):
        return torch.LongTensor(self.X[index]), torch.tensor(self.y[index])

    # Returns length of dataset
    def __len__(self):
        return self.n_samples

def get_cnn_dataset(path : str, optional_file : str = 'words2index_cnn.csv'):
    """
    Data preprocessing function. Sentences are tokenized and transformed into sequences of indices.
    If a word2index file is provided, it is used to determine the indices of the words.
    Returns a Dataset object that can be used for training and evaluating our CNN.
    """

    sentences = []
    class_labels = []
    with open(path, 'r', encoding="utf-8") as file:
        json_list = list(file)
        for line in json_list:
            json_obj = json.loads(line)
            sentences.append(json_obj['text'])
            class_labels.append(json_obj['label'])

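    # Tokenize sentences; empty sentences are dropped together with their labels so both lists stay aligned
    tokenized_sentences = []
    kept_labels = []
    for sentence, label in zip(sentences, class_labels):
        sentence = re.sub(r'<.*?>','',sentence)
        sent_tokens = nltk.word_tokenize(sentence)
        if len(sent_tokens) != 0:
            tokenized_sentences.append(sent_tokens)
            kept_labels.append(label)
    class_labels = kept_labels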

    # Load the vocabulary if a word2index file exists; otherwise build it from the data
    word2index = {}
    if optional_file is not None and os.path.exists(optional_file):
        with open(optional_file, 'r',  encoding="utf-8") as file:
            csv_reader = csv.reader(file)
            word2index = {row[0]:int(row[1]) for row in csv_reader}
    else:
        word2index['<pad>'] = 0
        word2index['<unk>'] = 1
        index_count = 2
        for sentence in tokenized_sentences:
            for token in sentence:
                if token not in word2index:
                    word2index[token] = index_count
                    index_count += 1
        
        # Save vocabulary as csv file
        with open('words2index_cnn.csv', 'w',  encoding="utf-8", newline='') as file:
            csv_writer = csv.writer(file)
            for key in word2index.keys():
                csv_writer.writerow([key, word2index[key]])

    # Encode sentences using vocabulary
    data_points = []
    for sentence in tokenized_sentences:
        data_points.append([word2index.get(token, word2index['<unk>']) for token in sentence])

    return cnn_dataset_class(data_points, class_labels)
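
The vocabulary construction and `<unk>` fallback used above can be illustrated in isolation. This is a sketch on made-up tokens; `encode` is a hypothetical helper and not part of the required interface.

```python
# Toy tokenized sentences (made up for illustration)
train_sentences = [["the", "cat", "sat"], ["the", "dog"]]

# Reserve fixed indices for padding and unknown tokens
word2index = {"<pad>": 0, "<unk>": 1}
for sentence in train_sentences:
    for token in sentence:
        # setdefault assigns the next free index only to unseen tokens
        word2index.setdefault(token, len(word2index))

def encode(tokens):
    """Map tokens to indices, falling back to <unk> for unseen tokens."""
    return [word2index.get(token, word2index["<unk>"]) for token in tokens]

print(encode(["the", "bird", "sat"]))  # [2, 1, 4]
```

Reusing the saved vocabulary at test time matters because the rows of the embedding matrix are tied to these indices.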

In [6]:
from torch.nn.utils.rnn import pad_sequence

def cnn_collate(batch):
    """
    Collate function that pads all sequences in the batch.
    """
    data, labels = [], []
    for datapoint in batch:
        data.append(datapoint[0])
        labels.append(datapoint[1])

    data = pad_sequence(data, batch_first=True, padding_value=0)

    return data, torch.stack(labels)

cnn_collate_fn = cnn_collate
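
What `pad_sequence` does inside the collate function can be seen on two toy index sequences (the values are made up; 0 is assumed to be the `<pad>` index, as in the vocabulary above):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

sequences = [torch.tensor([5, 8, 2]), torch.tensor([7])]

# batch_first=True gives shape (batch, longest_sequence); shorter rows are filled with 0
padded = pad_sequence(sequences, batch_first=True, padding_value=0)
lengths = torch.tensor([len(seq) for seq in sequences])

print(padded.tolist())   # [[5, 8, 2], [7, 0, 0]]
print(lengths.tolist())  # [3, 1]
```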

In [7]:
class rnn_dataset_class(Dataset):
    """
    Dataset class used for training and evaluating our RNN.
    """
    
    # Initialize data
    def __init__(self, data_points, class_labels):
        super().__init__()

        # Define data points and labels
        self.X = data_points
        self.y = class_labels

        # Define number of samples
        self.n_samples = len(data_points)

    # Indexing
    def __getitem__(self, index):
        return torch.LongTensor(self.X[index]), torch.tensor(self.y[index])

    # Returns length of dataset
    def __len__(self):
        return self.n_samples

def get_rnn_dataset(path : str, optional_file : str = 'words2index_rnn.csv'):
    """
    Data preprocessing function. Sentences are tokenized and transformed into sequences of indices.
    If a word2index file is provided, it is used to determine the indices of the words.
    Returns a Dataset object that can be used for training and evaluating our RNN.
    """

    sentences = []
    class_labels = []
    with open(path, 'r', encoding="utf-8") as file:
        json_list = list(file)
        for line in json_list:
            json_obj = json.loads(line)
            sentences.append(json_obj['text'])
            class_labels.append(json_obj['label'])

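    # Tokenize sentences; empty sentences are dropped together with their labels so both lists stay aligned
    tokenized_sentences = []
    kept_labels = []
    for sentence, label in zip(sentences, class_labels):
        sentence = re.sub(r'<.*?>','',sentence)
        sent_tokens = nltk.word_tokenize(sentence)
        if len(sent_tokens) != 0:
            tokenized_sentences.append(sent_tokens)
            kept_labels.append(label)
    class_labels = kept_labels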

    # Load the vocabulary if a word2index file exists; otherwise build it from the data
    word2index = {}
    if optional_file is not None and os.path.exists(optional_file):
        with open(optional_file, 'r',  encoding="utf-8") as file:
            csv_reader = csv.reader(file)
            word2index = {row[0]:int(row[1]) for row in csv_reader}
    else:
        word2index['<pad>'] = 0
        word2index['<unk>'] = 1
        index_count = 2
        for sentence in tokenized_sentences:
            for token in sentence:
                if token not in word2index:
                    word2index[token] = index_count
                    index_count += 1
        
        # Save vocabulary as csv file
        with open('words2index_rnn.csv', 'w',  encoding="utf-8", newline='') as file:
            csv_writer = csv.writer(file)
            for key in word2index.keys():
                csv_writer.writerow([key, word2index[key]])
    
    # Encode sentences using vocabulary
    data_points = []
    for sentence in tokenized_sentences:
        data_points.append([word2index.get(token, word2index['<unk>']) for token in sentence])
        
    return rnn_dataset_class(data_points, class_labels)

In [8]:
from torch.nn.utils.rnn import pad_sequence

def rnn_collate(batch):
    """
    Collate function that pads all sequences in the batch and also returns the sentence lengths.
    """
    data, lengths, labels = [], [], []
    for datapoint in batch:
        data.append(datapoint[0])
        lengths.append(len(datapoint[0]))
        labels.append(datapoint[1])

    data = pad_sequence(data, batch_first=True, padding_value=0)

    return [data, lengths], torch.stack(labels)

rnn_collate_fn = rnn_collate

### Training

In [9]:
# Define device for GPU usage (falls back to the CPU if CUDA is unavailable, so `device` is always defined)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    

def train_cnn(cnn_instance : CNN, dataset, max_train_time : float, use_gpu=False):
    """Performs training of the CNN for at most max_train_time
    You do not have to be too strict about max_train_time but try to stick to that limit
    e.g. if max_train_time=1s and after 10min it's still going we would consider this to be a fail
    """
    end_time = time.time() + max_train_time # keep
    
    # Set training parameters
    batch_size = 32
    epochs = 25
    learning_rate = 0.001
    weight_decay = 0.0001
    
    # Define optimizer
    optimizer = torch.optim.Adam(cnn_instance.parameters(), lr=learning_rate, weight_decay=weight_decay)

    # Define loss function
    loss_function = nn.CrossEntropyLoss()

    # Set model to training mode
    cnn_instance.train()

    # If GPU is to be used, send model to device
    if use_gpu:
        cnn_instance.to(device)

    for i in range(epochs):
        dataloader = DataLoader(dataset,batch_size=batch_size,shuffle=True,collate_fn=cnn_collate_fn)
        total_loss = 0
        for data, labels in dataloader:
            # Break out of training loop if the maximum training time has been reached
            if time.time() > end_time:
                break
            
            # If GPU is to be used, send tensors to device
            if use_gpu:
                data = data.to(device)
                labels = labels.to(device)

            # Clear out previous gradients
            cnn_instance.zero_grad()

            # Perform forward pass
            output = cnn_instance(data)

            # Compute loss
            loss = loss_function(output, labels)
            total_loss += loss.detach() * labels.size(0) # weight by the actual batch size; detach so no graph is kept

            # Perform backward pass
            loss.backward()

            # Perform optimization step
            optimizer.step()

        # Break out of training loop if the maximum training time has been reached
        if time.time() > end_time:
            break

        print(f'Total loss for epoch {i+1}: {total_loss.item()}')
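
The time-budget pattern used above can be reduced to a small sketch. Everything here is an assumption for illustration: `train_with_budget` is a hypothetical helper, the model is a toy `nn.Linear`, and the batches are random tensors. Returning from inside the inner loop leaves both loops at once, so no second time check is needed after the epoch.

```python
import time
import torch
import torch.nn as nn

def train_with_budget(model, batches, max_train_time, max_epochs=50):
    """Train until max_epochs or the time budget (in seconds) is used up. Hypothetical helper."""
    end_time = time.time() + max_train_time
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_function = nn.CrossEntropyLoss()
    steps = 0
    for _ in range(max_epochs):
        for data, labels in batches:
            if time.time() > end_time:
                return steps          # one return leaves both loops at once
            optimizer.zero_grad()
            loss = loss_function(model(data), labels)
            loss.backward()
            optimizer.step()
            steps += 1
    return steps

# Toy data: 4 batches of 8 random feature vectors with labels in 0..5
torch.manual_seed(0)
batches = [(torch.randn(8, 10), torch.randint(0, 6, (8,))) for _ in range(4)]
steps = train_with_budget(nn.Linear(10, 6), batches, max_train_time=0.5)
print(steps)
```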


        

In [10]:
# you may add additional parameters with default values

def train_rnn(rnn_instance : RNN, dataset, max_train_time : float, use_gpu=False):
    """Performs training of the RNN for at most max_train_time
    You do not have to be too strict about max_train_time but try to stick to that limit
    e.g. if max_train_time=1s and after 10min it's still going we would consider this to be a fail
    """
    end_time = time.time() + max_train_time # keep
    
    # Set training parameters
    batch_size = 32
    epochs = 30
    learning_rate = 0.003
    weight_decay = 0.0001
    
    # Define optimizer
    optimizer = torch.optim.Adam(rnn_instance.parameters(), lr=learning_rate,weight_decay=weight_decay)

    # Define loss function
    loss_function = nn.CrossEntropyLoss()

    # Set model to training mode
    rnn_instance.train()

    # If GPU is to be used, send model to device
    if use_gpu:
        rnn_instance.to(device)

    for i in range(epochs):
        dataloader = DataLoader(dataset,batch_size=batch_size,shuffle=True,collate_fn=rnn_collate)
        total_loss = 0
        for data, labels in dataloader:
            # Break out of training loop if the maximum training time has been reached
            if time.time() > end_time:
                break

            # If GPU is to be used, send tensors to device
            if use_gpu:
                data[0] = data[0].to(device)
                labels = labels.to(device)

            # Clear out previous gradients
            rnn_instance.zero_grad()

            # Perform forward pass
            output = rnn_instance(data)

            # Compute loss
            loss = loss_function(output, labels)
            total_loss += loss.detach() * labels.size(0) # weight by the actual batch size; detach so no graph is kept

            # Perform backward pass
            loss.backward()

            # Perform optimization step
            optimizer.step()

        # Break out of training loop if the maximum training time has been reached
        if time.time() > end_time:
            break

        print(f'Total loss for epoch {i+1}: {total_loss.item()}')

In [11]:
#import random
#random_seed = 2

#torch.cuda.manual_seed_all(random_seed)
#torch.manual_seed(random_seed)
#random.seed(random_seed)
#np.random.seed(random_seed)


#cnn_dataset = get_cnn_dataset('trainfile.jsonl','words2index_cnn.csv')
#train_size = int(cnn_dataset.__len__() * 0.9)
#test_size = cnn_dataset.__len__() - train_size
#cnn_train, cnn_test = torch.utils.data.random_split(cnn_dataset, [train_size, test_size], generator=torch.Generator().manual_seed(random_seed))

#cnn_dataset_reduced = get_cnn_dataset('trainfile_reduced.jsonl','words2index_cnn.csv')
#train_size_reduced = int(cnn_dataset_reduced.__len__() * 0.9)
#test_size_reduced = cnn_dataset_reduced.__len__() - train_size_reduced
#cnn_train_reduced, cnn_test_reduced = torch.utils.data.random_split(cnn_dataset_reduced, [train_size_reduced, test_size_reduced], generator=torch.Generator().manual_seed(random_seed))

#### Training

In [16]:
cnn_dataset = get_cnn_dataset("trainfile.jsonl", 'words2index_cnn.csv') # make sure this works (potentially also different foldername); creates the vocabulary file if it is missing
cnn_inst = CNN() # make sure this works
train_cnn(cnn_inst, cnn_dataset, 30, use_gpu=False) # make sure this works
torch.save(cnn_inst.state_dict(), "cnn.pt") # save model after training

In [None]:
#rnn_dataset = get_rnn_dataset('trainfile.jsonl','words2index_rnn.csv')
#train_size = int(rnn_dataset.__len__() * 0.9)
#test_size = rnn_dataset.__len__() - train_size
#rnn_train, rnn_test = torch.utils.data.random_split(rnn_dataset, [train_size, test_size], generator=torch.Generator().manual_seed(random_seed))

#rnn_dataset_reduced = get_rnn_dataset('trainfile_reduced.jsonl','words2index_rnn.csv')
#train_size_reduced = int(rnn_dataset_reduced.__len__() * 0.9)
#test_size_reduced = rnn_dataset_reduced.__len__() - train_size_reduced
#rnn_train_reduced, rnn_test_reduced = torch.utils.data.random_split(rnn_dataset_reduced, [train_size_reduced, test_size_reduced], generator=torch.Generator().manual_seed(random_seed))

In [14]:
rnn_dataset = get_rnn_dataset("trainfile.jsonl", 'words2index_rnn.csv') # make sure this works (potentially also different foldername); creates the vocabulary file if it is missing
rnn_inst = RNN() # make sure this works
train_rnn(rnn_inst, rnn_dataset, 30, use_gpu=False) # make sure this works
torch.save(rnn_inst.state_dict(), "rnn.pt") # save model after training

#### Saving

In [None]:
# code here for demonstration, may remove
torch.save(cnn_inst.state_dict(), "cnn.pt")# save model after training 
torch.save(rnn_inst.state_dict(), "rnn.pt")# save model after training

#### Loading

#### Evaluation

In [None]:
# make sure your trained NN can be evaluated using this function here
# DO NOT MODIFY!

from sklearn.metrics import accuracy_score
from torch.utils.data import DataLoader
def evaluate(clf, test_data, batch_size=100, collate_fn=None):
    
    true_labels = []
    inf_labels = []
    clf.eval()
    with torch.no_grad():
        for data, labels in DataLoader(test_data, batch_size=batch_size, collate_fn=collate_fn):
            out = clf(data)
            cls = torch.argmax(F.softmax(out, dim=1), dim=1)
            inf_labels.extend(cls.detach().numpy().tolist())
            true_labels.extend(labels.numpy().tolist())
    clf.train()
    return accuracy_score(true_labels, inf_labels)
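
Since `evaluate` applies the softmax itself before taking the argmax, the models only need to return raw scores (logits); `nn.CrossEntropyLoss` expects logits as well. Softmax preserves the ordering of the scores, so the predicted class is the same either way. A small pure-Python sketch with made-up scores (`softmax` and `argmax` are hypothetical helpers written for this illustration):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]

def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)

logits = [0.3, -1.2, 2.5, 0.0, 1.1, -0.4]   # made-up scores for the six classes
probabilities = softmax(logits)

print(argmax(logits), argmax(probabilities))  # 2 2
```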

In [None]:
# code here for demonstration, may remove
# This demonstrates how we plan on evaluating the model
cnn_loaded = CNN()                               # make sure you can init model without parameters
cnn_loaded.load_state_dict(torch.load("./cnn.pt", map_location="cpu"))  # and you can load it afterwards (onto the CPU)
cnn_loaded.eval() # make sure this works
#cnn_dataset = get_cnn_dataset("trainfile.jsonl", 'words2index_cnn.csv') # make sure this works (potentially also different filenames)
print(evaluate(cnn_loaded, cnn_dataset, collate_fn=cnn_collate_fn)) # should work
#print(evaluate(cnn_loaded, cnn_train, collate_fn=cnn_collate_fn)) # should work

In [None]:
# code here for demonstration, may remove
# This demonstrates how we plan on evaluating the model
rnn_loaded = RNN() 
rnn_loaded.load_state_dict(torch.load("./rnn.pt", map_location="cpu")) # make sure this works (potentially also different filename)
rnn_loaded.eval() # make sure this works
#rnn_dataset = get_rnn_dataset("train_file.jsonl", '/some/path/optional_rnn.csv') # make sure this works (potentially also different foldername)
print(evaluate(rnn_loaded, rnn_dataset, collate_fn=rnn_collate_fn)) # should work
#print(evaluate(rnn_loaded, rnn_train, collate_fn=rnn_collate_fn)) # should work