## Project NLP and Deep Learning

### 1. Project proposal presentation

In the presentation, you have 5 minutes to present your research proposal. During the presentation, you should explain:

* What is the topic of your project, what is the current state of this topic/task/setup
* What is the new part of your project
* What is the research question of your project

We have proposed a number of topics in the slides, which can be found on LearnIt; you can either pick one of these or come up with your own. If you come up with your own topic, we suggest getting pre-approval from Rob van der Goot.

**Deadline for uploading slides: day before the presentation (23:59)** (PDF only; all slides will be merged into one long PDF for a smooth presentation)

### 2. Baseline
To get your project started, begin by implementing a baseline model. Ideally, this is going to be the main baseline that you compare to in your paper. Note that this baseline should be more advanced than just predicting the majority class (O).

We will use the EWT portion of the [Universal NER project](http://www.universalner.org/), which we provide with this notebook for convenience. You can use the train data (`en_ewt-ud-train.iob2`) and dev data (`en_ewt-ud-dev.iob2`) to build your baseline, then upload your predictions on the test data (`en_ewt-ud-test.iob2`).
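As an illustration of the file format, here is a minimal sketch of an IOB2 reader. It assumes tab-separated columns, `#`-prefixed metadata lines, and blank lines between sentences. The function name `read_iob2_lines` and its defaults are hypothetical; it is not the `read_iob2_file` helper used later in this notebook, although it borrows the same `word_index`/`tag_index` idea.

```python
from typing import Iterable, List, Tuple

Sentence = Tuple[List[str], List[str]]  # (words, tags)


def read_iob2_lines(
    lines: Iterable[str], word_index: int = 1, tag_index: int = 2
) -> List[Sentence]:
    """Parse IOB2 lines into one (words, tags) pair per sentence."""
    sentences: List[Sentence] = []
    words: List[str] = []
    tags: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if words:
                sentences.append((words, tags))
                words, tags = [], []
            continue
        if line.startswith("#"):
            # Metadata such as "# text = ..." is skipped (assumed convention).
            continue
        columns = line.split("\t")
        words.append(columns[word_index])
        tags.append(columns[tag_index])
    if words:  # the input may not end with a blank line
        sentences.append((words, tags))
    return sentences


sample = [
    "# text = Rob visited Copenhagen\n",
    "1\tRob\tB-PER\t-\t-\n",
    "2\tvisited\tO\t-\t-\n",
    "3\tCopenhagen\tB-LOC\t-\t-\n",
    "\n",
    "1\tHello\tO\t-\t-\n",
]
parsed = read_iob2_lines(sample)
print(parsed)
```

Note that with `word_index=0` a token that is itself `#` would be mistaken for a metadata line, so the comment check may need adjusting for files without a leading index column.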

It is important to upload your predictions in the same format as the training and dev files, so that the `span_f1.py` script can be used.
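To make the evaluation concrete, the following is a rough sketch of how a span-level F1 over IOB2 tags can be computed. It is not the provided `span_f1.py`; the names `tags_to_spans` and `span_f1_sketch` are hypothetical, and the handling of stray `I-` tags (ignored here) is an assumption that may differ from the official script.

```python
def tags_to_spans(tags):
    """Convert an IOB2 tag sequence to a set of (start, end, label) spans, end exclusive."""
    spans = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel "O" closes a trailing span
        continues_span = tag.startswith("I-") and label == tag[2:]
        if continues_span:
            continue
        if label is not None:
            spans.add((start, i, label))
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        else:
            # "O", or an "I-" tag that does not continue a span (ignored in this sketch)
            start, label = None, None
    return spans


def span_f1_sketch(gold_sents, pred_sents):
    """Micro-averaged precision, recall and F1 over exactly matching spans."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_sents, pred_sents):
        gold_spans, pred_spans = tags_to_spans(gold), tags_to_spans(pred)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "O", "O", "B-LOC"]]
scores = span_f1_sketch(gold, pred)
print(scores)  # {'precision': 0.5, 'recall': 0.5, 'f1': 0.5}
```

In the example, the predicted `PER` span is too short, so only the `LOC` span counts as correct: a span must match in both boundaries and label.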

Note that you do not have to implement your baseline from scratch; you can, for example, use the code from the RNN or BERT assignments as a starting point.

**Deadline: 20-03 on LearnIt (14:00)**

In [1]:
!uname --nodename

desktop2.hpc.itu.dk


In [None]:
!pip install --upgrade pip
!pip install --upgrade torch

In [8]:
!pip install jsonlines

Defaulting to user installation because normal site-packages is not writeable
Collecting jsonlines
  Downloading jsonlines-4.0.0-py3-none-any.whl (8.7 kB)
Installing collected packages: jsonlines
Successfully installed jsonlines-4.0.0

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.1.2[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


#### Helper functions

In [3]:
!free -h

              total        used        free      shared  buff/cache   available
Mem:           30Gi       841Mi       989Mi       101Mi        29Gi        29Gi
Swap:          15Gi       7.0Mi        15Gi


In [1]:
from nlp_cyber_ner.dataset import read_cyner, read_aptner, read_attackner, read_dnrti
from nlp_cyber_ner.dataset import unify_labels_aptner
from nlp_cyber_ner.dataset import clean_aptner, clean_dnrti
from nlp_cyber_ner.dataset import transform_dataset
from nlp_cyber_ner.config import PROCESSED_DATA_DIR, RAW_DATA_DIR, INTERIM_DATA_DIR
from nlp_cyber_ner.dataset import Preprocess

2025-04-03 18:50:09.466 | INFO     | nlp_cyber_ner.config:<module>:11 - PROJ_ROOT path is: /home/macja/repositories/NLP-Cyber-NER


### Cyner Loaders

In [2]:
cyner_path = RAW_DATA_DIR / "cyner"
cyner_train_path = cyner_path / "train.txt"
cyner_dev_path = cyner_path / "valid.txt"
cyner_test_path = cyner_path / "test.txt"

In [None]:
from nlp_cyber_ner.dataset import unify_labels_cyner
unify_labels_cyner(path=cyner_train_path)
unify_labels_cyner(path=cyner_dev_path)
unify_labels_cyner(path=cyner_test_path)

In [4]:
from nlp_cyber_ner.dataset import read_iob2_file


cyner_path = PROCESSED_DATA_DIR / "cyner"
cyner_train_path = cyner_path / "train.unified"
cyner_dev_path = cyner_path / "valid.unified"
cyner_test_path = cyner_path / "test.unified"

cyner_train_data = read_iob2_file(cyner_train_path)
cyner_dev_data = read_iob2_file(cyner_dev_path)
cyner_test_data = read_iob2_file(cyner_test_path)

In [5]:
cyner_pack = transform_dataset(
    cyner_train_data, cyner_dev_data, cyner_test_data,
    True
)
cyner_train_X, cyner_train_y, cyner_dev_X, cyner_dev_y, cyner_test_X, cyner_test_y, cyner_idx2word, cyner_idx2label, cyner_max_len = cyner_pack

# cyner_max_len = 120

In [24]:
cyner_idx2label

['<PAD>',
 'B-Malware',
 'I-Malware',
 'O',
 'B-System',
 'I-System',
 'B-Organization',
 'B-O',
 'I-Organization',
 'I-O',
 'B-Vulnerability',
 'I-Vulnerability']
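The label vocabulary above contains `B-O` and `I-O`, which are not part of the unified label set that the later cells assert on (`end_labels`). A minimal sketch of a check that surfaces such labels is shown here; the helper name `unexpected_labels` is hypothetical, and whether `B-O`/`I-O` should be mapped to `O` is an open question for the unification step, not something this sketch decides.

```python
# Unified label set, copied from the `end_labels` set used for the other datasets.
EXPECTED_LABELS = {
    "B-Organization", "O", "I-Malware", "B-System", "I-Vulnerability",
    "I-Organization", "I-System", "B-Vulnerability", "B-Malware",
}


def unexpected_labels(idx2label, expected=EXPECTED_LABELS, pad="<PAD>"):
    """Return the labels in the vocabulary that are neither expected nor padding."""
    return sorted(set(idx2label) - set(expected) - {pad})


# The vocabulary printed above for the CyNER data.
cyner_labels = [
    "<PAD>", "B-Malware", "I-Malware", "O", "B-System", "I-System",
    "B-Organization", "B-O", "I-Organization", "I-O",
    "B-Vulnerability", "I-Vulnerability",
]
leftovers = unexpected_labels(cyner_labels)
print(leftovers)  # ['B-O', 'I-O']
```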

### Aptner Loaders

In [16]:
aptner_path = RAW_DATA_DIR / "APTNer"
aptner_train_path= aptner_path / "APTNERtrain.txt"
aptner_dev_path= aptner_path / "APTNERdev.txt"
aptner_test_path= aptner_path / "APTNERtest.txt"
clean_aptner(aptner_train_path)
clean_aptner(aptner_dev_path)
clean_aptner(aptner_test_path)
# move to interim folder


OSError: [Errno 22] Invalid argument: 'C:\\Users\\Maciej\\coding-projects\\NLP-Cyber-NER\\data\\raw\\APTNer\\APTNERtest.txt'

In [None]:

aptner_path = INTERIM_DATA_DIR / "APTNer"
aptner_train_path= aptner_path / "APTNERtrain.cleaned"
aptner_dev_path= aptner_path / "APTNERdev.cleaned"
aptner_test_path= aptner_path / "APTNERtest.cleaned"
unify_labels_aptner(aptner_train_path)
unify_labels_aptner(aptner_dev_path)
unify_labels_aptner(aptner_test_path)
# move to processed folder


In [6]:

aptner_path = PROCESSED_DATA_DIR / "APTNer"
aptner_train_path= aptner_path / "APTNERtrain.unified"
aptner_dev_path= aptner_path / "APTNERdev.unified"
aptner_test_path= aptner_path / "APTNERtest.unified"
aptner_train_data = read_aptner(aptner_train_path)
aptner_dev_data = read_aptner(aptner_dev_path)
aptner_test_data = read_aptner(aptner_test_path)
end_labels = {'B-Organization', 'O', 'I-Malware', 'B-System', 'I-Vulnerability', 'I-Organization', 'I-System', 'B-Vulnerability', 'B-Malware'}
A = set(tag for _, tags in aptner_train_data for tag in tags)
B = set(tag for _, tags in aptner_dev_data for tag in tags)
C = set(tag for _, tags in aptner_test_data for tag in tags)
assert A == B == C == end_labels, "The labels in the train, dev and test sets are not the same."

In [19]:
aptner_pack = transform_dataset(
    aptner_train_data, aptner_dev_data, aptner_test_data, True
)
aptner_train_X, aptner_train_y, aptner_dev_X, aptner_dev_y, aptner_test_X, aptner_test_y, aptner_idx2word, aptner_idx2label, aptner_max_len = aptner_pack

### AttackNer Loaders

In [None]:
from nlp_cyber_ner.dataset import unify_labels_attackner, read_iob2_file

attackner_path = RAW_DATA_DIR / "attackner"
attackner_train_path  = attackner_path / "train.json"
attackner_dev_path= attackner_path / "dev.json"
attackner_test_path= attackner_path / "test.json"


unify_labels_attackner(attackner_test_path)
unify_labels_attackner(attackner_train_path)
unify_labels_attackner(attackner_dev_path)
# actually, a little cleaning has been done. For now, unification and cleaning are in the same step
# move to processed folder

In [25]:

attackner_path = PROCESSED_DATA_DIR / "attackner"
attackner_train_path  = attackner_path / "train.unified"
attackner_dev_path= attackner_path / "dev.unified"
attackner_test_path= attackner_path / "test.unified"
attackner_train_data = read_iob2_file(attackner_train_path, word_index=0, tag_index=1)
attackner_dev_data = read_iob2_file(attackner_dev_path, word_index=0, tag_index=1)
attackner_test_data = read_iob2_file(attackner_test_path, word_index=0, tag_index=1)
end_labels = {'B-Organization', 'O', 'I-Malware', 'B-System', 'I-Vulnerability', 'I-Organization', 'I-System', 'B-Vulnerability', 'B-Malware'}
A = set(tag for _, tags in attackner_train_data for tag in tags)
B = set(tag for _, tags in attackner_dev_data for tag in tags)
C = set(tag for _, tags in attackner_test_data for tag in tags)
assert A == B == C == end_labels, "The labels in the train_data, dev_data and test_data sets are not the same."


In [26]:
attackner_pack = transform_dataset(
    attackner_train_data, attackner_dev_data, attackner_test_data, True
)
attackner_train_X, attackner_train_y, attackner_dev_X, attackner_dev_y, attackner_test_X, attackner_test_y, attackner_idx2word, attackner_idx2label, attackner_max_len = attackner_pack

attackner_idx2label

['<PAD>',
 'O',
 'B-Organization',
 'B-System',
 'I-System',
 'B-Vulnerability',
 'I-Vulnerability',
 'B-Malware',
 'I-Organization',
 'I-Malware']

### DN-RTI

In [3]:
from nlp_cyber_ner.dataset import read_iob2_file, clean_dnrti, unify_labels_dnrti
dnrti_path = RAW_DATA_DIR / "DNRTI"
dnrti_train_path = dnrti_path / "train.txt"
dnrti_dev_path = dnrti_path / "valid.txt"
dnrti_test_path = dnrti_path / "test.txt"
clean_dnrti(dnrti_train_path)
clean_dnrti(dnrti_dev_path)
clean_dnrti(dnrti_test_path)
# move to interim folder

2025-03-31 23:27:08.158 | INFO     | nlp_cyber_ner.config:<module>:11 - PROJ_ROOT path is: D:\Projekty Programistyczne\NLP-Cyber-NER


KeyboardInterrupt: 

In [None]:

dnrti_path = INTERIM_DATA_DIR / "DNRTI"
dnrti_train_path = dnrti_path / "train.cleaned"
dnrti_dev_path = dnrti_path / "valid.cleaned"
dnrti_test_path = dnrti_path / "test.cleaned"
unify_labels_dnrti(dnrti_train_path)
unify_labels_dnrti(dnrti_dev_path)
unify_labels_dnrti(dnrti_test_path)
# move to processed folder

In [27]:
dnrti_path = PROCESSED_DATA_DIR / "dnrti"
dnrti_train_path = dnrti_path / "train.unified"
dnrti_dev_path = dnrti_path / "valid.unified"
dnrti_test_path = dnrti_path / "test.unified"

dnrti_train_data = read_iob2_file(dnrti_train_path, word_index=0, tag_index=1)
dnrti_dev_data = read_iob2_file(dnrti_dev_path, word_index=0, tag_index=1)
dnrti_test_data = read_iob2_file(dnrti_test_path, word_index=0, tag_index=1)
end_labels = {'B-Organization', 'O', 'I-Malware', 'B-System', 'I-Vulnerability', 'I-Organization', 'I-System', 'B-Vulnerability', 'B-Malware'}
A = set(tag for _, tags in dnrti_train_data for tag in tags)
B = set(tag for _, tags in dnrti_dev_data for tag in tags)
C = set(tag for _, tags in dnrti_test_data for tag in tags)
assert A == B == C == end_labels, "The labels in the train, dev and test sets are not the same."


In [28]:
dnrti_pack = transform_dataset(
    dnrti_train_data, dnrti_dev_data, dnrti_test_data, True
)
dnrti_train_X, dnrti_train_y, dnrti_dev_X, dnrti_dev_y, dnrti_test_X, dnrti_test_y, dnrti_idx2word, dnrti_idx2label, dnrti_max_len = dnrti_pack

In [36]:
dnrti_test_y

tensor([[2, 1, 1,  ..., 0, 0, 0],
        [1, 1, 2,  ..., 0, 0, 0],
        [2, 1, 1,  ..., 0, 0, 0],
        ...,
        [1, 6, 1,  ..., 0, 0, 0],
        [2, 1, 1,  ..., 0, 0, 0],
        [2, 1, 1,  ..., 0, 0, 0]])
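The label tensor above appears to be right-padded with index 0, which matches `<PAD>` at position 0 of the label vocabularies. As a sketch of what that padding means for evaluation, the plain-Python example below pads index sequences and then computes an accuracy that skips padded positions, which is the same idea as `ignore_index=0` in the loss. The helpers `pad_sequences` and `masked_accuracy` are hypothetical and are not part of `transform_dataset`.

```python
PAD = 0  # assumed padding index, matching '<PAD>' at position 0 of idx2label


def pad_sequences(seqs, max_len, pad=PAD):
    """Right-pad (or truncate) each index sequence to exactly max_len items."""
    padded = []
    for seq in seqs:
        clipped = list(seq[:max_len])
        padded.append(clipped + [pad] * (max_len - len(clipped)))
    return padded


def masked_accuracy(gold, pred, pad=PAD):
    """Token accuracy over non-padded gold positions only."""
    correct = total = 0
    for gold_row, pred_row in zip(gold, pred):
        for g, p in zip(gold_row, pred_row):
            if g == pad:
                continue  # padded position: contributes nothing
            total += 1
            correct += int(g == p)
    return correct / total if total else 0.0


gold = pad_sequences([[2, 1, 1], [1, 6]], max_len=5)
pred = [[2, 1, 3, 4, 4], [1, 1, 0, 0, 0]]
print(gold)  # [[2, 1, 1, 0, 0], [1, 6, 0, 0, 0]]
print(masked_accuracy(gold, pred))  # 0.6
```

Whatever the model predicts at padded positions does not affect the score, so only the five real tokens are counted.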

### Cuda, MLflow

In [15]:
import torch
print(torch.cuda.is_available())
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(torch.__version__)

True
2.5.1


In [16]:

import mlflow
from dotenv import load_dotenv
load_dotenv()
mlflow.set_tracking_uri("https://dagshub.com/PLtier/NLP-Cyber-NER.mlflow")

## Training

### helpers

In [17]:
from nlp_cyber_ner.dataset import get_labels, preds_to_tags
from nlp_cyber_ner.span_f1 import span_f1
import gc
from torch import nn
import torch
from torch.utils.data import DataLoader, TensorDataset

def train_and_eval(
    train_X: torch.Tensor,
    train_y: torch.Tensor,
    dev_X: torch.Tensor,
    dev_labels: list[tuple[str]],
    idx2word: list[str],
    idx2label: list[str],
    max_len: int,
) -> None:


    ### Transforms
    # !nvidia-smi
    # put already to gpu if having space:
    train_X, train_y = train_X.to(device), train_y.to(device)
    dev_X = dev_X.to(device)
    ### Batching

    # TODO: Maybe dtype would need to be changed!
    BATCH_SIZE = 32
    train_dataset = TensorDataset(train_X, train_y)
    train_loader = DataLoader(train_dataset, BATCH_SIZE)  # drop_last=True
    n_batches = len(train_loader)
    ### Training

    torch.manual_seed(0)
    DIM_EMBEDDING = 100
    LSTM_HIDDEN = 100
    LEARNING_RATE = 0.01
    EPOCHS = 15


    class TaggerModel(torch.nn.Module):
        def __init__(self, nwords, ntags):
            super().__init__()
            # Bidirectional LSTM tagger: embedding -> dropout -> BiLSTM -> dropout -> linear
            self.embed = nn.Embedding(nwords, DIM_EMBEDDING)
            self.drop1 = nn.Dropout(p=0.2)
            self.rnn = nn.LSTM(
                DIM_EMBEDDING, LSTM_HIDDEN, batch_first=True, bidirectional=True
            )
            self.drop2 = nn.Dropout(p=0.3)
            self.fc = nn.Linear(LSTM_HIDDEN * 2, ntags)

        def forward(self, input_data):
            word_vectors = self.embed(input_data)
            regular1 = self.drop1(word_vectors)
            output, hidden = self.rnn(regular1)
            regular2 = self.drop2(output)

            predictions = self.fc(regular2)
            return predictions


    model = TaggerModel(len(idx2word), len(idx2label))
    model = model.to(device)  # run on cuda if possible
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
    loss_function = torch.nn.CrossEntropyLoss(ignore_index=0, reduction="sum")

    # creating the batches

    for epoch in range(EPOCHS):
        model.train()
        # reset the gradient
        print(f"Epoch {epoch + 1}\n-------------------------------")
        loss_sum = 0

        # loop over batches
        # types for convenience
        batch_X: torch.Tensor
        batch_y: torch.Tensor
        for batch_X, batch_y in train_loader:
            # TODO: if having memory issues comment .to(device)
            # from one of the previous cells, and uncomment that:
            # batch_X, batch_y = batch_X.to(device), batch_y.to(device)

            optimizer.zero_grad()

            predicted_values = model(batch_X)

            # CrossEntropyLoss expects predictions of shape (N, n_tags) and targets of shape (N,),
            # so the batch and sequence dimensions are flattened together.

            # calculate loss
            loss = loss_function(
                predicted_values.view(batch_X.shape[0] * max_len, -1), batch_y.flatten()
            )  # TODO: Last batch has 31 entries instead of 32 - we don't adjust much for that.
            loss_sum += loss.item()  # avg later

            # update
            loss.backward()
            optimizer.step()

        print(f"Average loss after epoch {epoch + 1}: {loss_sum / n_batches}")

    # set to evaluation mode
    model.eval()

    # eval using span F1; gradients are not needed for inference
    with torch.no_grad():
        predictions_dev = model(dev_X)
    print(predictions_dev.shape)
    # gives a score for each tag, for each token position, for each sentence:
    # shape (n_sentences, max_len, n_tags)
    # we want to assign each token the NER tag with the highest score
    labels_dev = torch.argmax(predictions_dev, 2)
    print(labels_dev.shape)

    labels_dev = preds_to_tags(idx2label, dev_labels, labels_dev)

    metrics = span_f1(dev_labels, labels_dev)

    for k, v in metrics.items():
        mlflow.log_metric(k, v)
    


    del predictions_dev
    del labels_dev
    gc.collect()
    torch.cuda.empty_cache()


### training

In [30]:
# The idea: we have four 'packs' required for training.
# Train the model on each pack's train set, and evaluate it on the dev set of all four packs.
# So we have 4 train sets and 4 dev sets (16 runs), everything logged to MLflow.


train_packs = [
    ("cyner", cyner_train_data),
    ("aptner", aptner_train_data),
    ("attackner", attackner_train_data),
    ("dnrti", dnrti_train_data),
]

dev_packs = [
    ("cyner", cyner_dev_data),
    ("aptner", aptner_dev_data),
    ("attackner", attackner_dev_data),
    ("dnrti", dnrti_dev_data),
]

for train_pack_name, train_pack in train_packs:
    for dev_pack_name, dev_pack in dev_packs:
        name: str = f"train-{train_pack_name}-eval-{dev_pack_name}"
        data_pack = transform_dataset(
            train_pack, dev_pack, train_pack, True  # the third argument (test set) is unused here, but the function requires it
        )

        # train_X, train_y, dev_X, dev_y, test_X, test_y, idx2word, idx2label, max_len = train_pack
        train_X, train_y, dev_X, dev_y, _, _, train_idx2word, train_idx2label, train_max_len = data_pack
        mlflow.set_experiment(name)
        with mlflow.start_run(run_name=name):
            assert isinstance(dev_y, list), "Dev y is not a list!"
            train_and_eval(
                train_X,
                train_y,
                dev_X,
                dev_y,
                train_idx2word,
                train_idx2label,
                train_max_len,
            )

🏃 View run train-cyner-eval-cyner at: https://dagshub.com/PLtier/NLP-Cyber-NER.mlflow/#/experiments/8/runs/be597c27b43a4839bab415bbf18c5f80
🧪 View experiment at: https://dagshub.com/PLtier/NLP-Cyber-NER.mlflow/#/experiments/8


RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
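A device-side assert like the one above is often (though not necessarily here) caused by an index that falls outside the valid range, for example a word index larger than the embedding table or a label index larger than the number of output classes. Re-running the failing step on CPU, or with `CUDA_LAUNCH_BLOCKING=1` as the message suggests, usually gives a more precise error. The sketch below shows the kind of range check that can be run before training; it works on plain nested lists, and the helper `check_index_range` and the toy data are hypothetical.

```python
def check_index_range(rows, size, name):
    """Raise ValueError if any index in the nested list lies outside [0, size)."""
    flat = [index for row in rows for index in row]
    lowest, highest = min(flat), max(flat)
    if lowest < 0 or highest >= size:
        raise ValueError(
            f"{name}: indices span [{lowest}, {highest}] "
            f"but the valid range is [0, {size - 1}]"
        )
    return lowest, highest


# Toy data: 5 known words (indices 0..4) and 4 labels (indices 0..3).
toy_train_X = [[1, 2, 3, 0], [4, 2, 0, 0]]
toy_train_y = [[1, 1, 2, 0], [3, 1, 0, 0]]
print(check_index_range(toy_train_X, size=5, name="word indices"))   # (0, 4)
print(check_index_range(toy_train_y, size=4, name="label indices"))  # (0, 3)

try:
    # A dev-set word index that the embedding table (size 5) cannot look up.
    check_index_range([[7]], size=5, name="dev word indices")
except ValueError as err:
    caught = str(err)
    print(caught)
```

With tensors, the same check can be expressed by comparing the tensor's minimum and maximum against the vocabulary or label count before the data is moved to the GPU.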


In [None]:
# Goal: obtain 4x4 results, i.e. for each pair of train/dev (no test) sets from the four
# datasets, train a model on one and evaluate it on the other. That gives 4x4=16 results.
# Start with a single run:


# MLflow experiment: a simple run on the DNRTI data
mlflow.set_experiment("dnrti")
with mlflow.start_run(run_name="dnrti") as run:
    train_and_eval(
        dnrti_train_X,
        dnrti_train_y,
        dnrti_dev_X,
        dnrti_dev_y,
        dnrti_idx2word,
        dnrti_idx2label,
        dnrti_max_len,
    )
    

In [67]:
# Single run from the 4x4 grid described above: train on CyNER and evaluate on the CyNER dev set.


# MLflow experiment: train on CyNER, evaluate on CyNER
mlflow.set_experiment("train-cyner-eval-cyner")
with mlflow.start_run(run_name="train-cyner-eval-cyner") as run:
    assert isinstance(cyner_dev_y, list), "cyner_dev_y is not a list of tuples, make sure to pass True to transform_dataset"
    train_and_eval(
        cyner_train_X,
        cyner_train_y,
        cyner_dev_X,
        cyner_dev_y,
        cyner_idx2word,
        cyner_idx2label,
        cyner_max_len,
    )

2025/04/03 18:35:02 INFO mlflow.tracking.fluent: Experiment with name 'train-cyner-eval-cyner' does not exist. Creating a new experiment.


Epoch 1
-------------------------------
Average loss after epoch 1: 286.313142971559
Epoch 2
-------------------------------
Average loss after epoch 2: 133.6077549511736
Epoch 3
-------------------------------
🏃 View run train-cyner-eval-cyner at: https://dagshub.com/PLtier/NLP-Cyber-NER.mlflow/#/experiments/8/runs/ad13fea49d0c477fabf251ee4ee4ceaa
🧪 View experiment at: https://dagshub.com/PLtier/NLP-Cyber-NER.mlflow/#/experiments/8


KeyboardInterrupt: 

### Evaluate on dev

### Save test for submission

In [None]:
import gc

from nlp_cyber_ner.dataset import prepare_output_file

# Predict on the test data using the trained TaggerModel.
# NOTE: `model`, `test_X`, `transformer` and `test_data` must exist in the notebook scope;
# train_and_eval above keeps its model local and does not return it.
with torch.no_grad():
    predictions_test = model(test_X)
print(predictions_test.shape)
# gives a score for each tag, for each token position, for each sentence:
# shape (n_sentences, max_len, n_tags)
# we want to assign each token the NER tag with the highest score
labels_test = torch.argmax(predictions_test, 2)
print(labels_test.shape)
### save labels
prepare_output_file(
    transformer, test_data, labels_test, "./en_ewt-ud-test-masked.iob2", "./test.iob2"
)

del predictions_test
del labels_test
gc.collect()
torch.cuda.empty_cache()

### 3. Project proposal

The written proposal should be at most one page in [ACL format](https://github.com/acl-org/acl-style-files) (the bibliography does not count toward the page limit). In it, you should explain the three points listed under the proposal presentation above and place your project in a larger context (previous work).

Make sure your proposal is:
* Novel to some extent
* Doable within the time-frame

*hint* The [ACL Anthology](https://aclanthology.org/) contains almost all peer-reviewed NLP papers.

**Deadline: 03-04 on LearnIt (14:00)**

### 4. Final project
The final project has a maximum size of 5 pages (excluding bibliography and appendix), using the [ACL style files](https://github.com/acl-org/acl-style-files).

Besides the main paper (discussed in class), you have to include:
* Group contributions. State who was responsible for which part of the project. Here you may state if there were any seriously unequal workloads among group members. This should be put in the appendix.
* A report on usage of chatbots. We follow: https://2023.aclweb.org/blog/ACL-2023-policy/
   * Add a section in appendix if you made use of a chatbot (since we do not use a Responsible NLP Checklist)
   * Include each stage on the ACL policy, and indicate to what extent you used a chatbot
   * Use with care! You are responsible for the project, including plagiarism, correctness, etc.

You can also put additional results and details in the appendix. However, the paper itself should be standalone, and understandable without consulting the appendix.

Furthermore, the code should be available on www.github.itu.dk (with a link in a footnote at the end of the abstract), and it should include a README with instructions on how to reproduce your results.

**Deadline: 23-05 on LearnIt** Please check the checklist below before uploading!

Optionally, you can upload a draft one week earlier, by **16-05 (before 09:00)**, for an extra round of feedback.

## Analysis

Analysis is essential for the interpretation of your results. In this section we briefly describe some different types of analysis. We strongly suggest using at least one of these:

* **Ablation study**: Leave out a certain part of the model to study its effect. For example, disable the tokenizer, remove a certain (group of) feature(s), or disable the stop-word removal. If the performance drops a lot, this part of the model contributes heavily to the model's final performance. This is commonly presented in one table, disabling different parts of the model in turn. Note that you can also do this the other way around, i.e. use only one feature (group) at a time, and test performance.
* **Learning curve**: Evaluate how much data your model needs to reach a certain performance. Especially for the data augmentation projects this is essential.
* **Quantitative analysis**: Automated means of analyzing in which cases your model performs worse. This can for example be done with a confusion matrix.
* **Qualitative analysis**: Manually inspect a certain number of errors, and try to categorize them or find trends. This can be combined with the quantitative analysis, e.g., inspect 100 cases of positive reviews predicted to be negative and 100 cases of negative reviews predicted to be positive.
* **Feature importance**: In traditional machine learning methods, one can often extract and inspect the weights of the features. In sklearn these can be found in: `trained_model.coef_`
* **Other metrics**: per class scores, partial matches, or count how often the span-borders were correct, but the label wrong.
* **Input words importance**: To gain insight into which words have an impact on prediction performance (positive or negative), we can analyze per-word impact: given a trained model, replace a given word with the unknown word token and observe the change in prediction score (probability for a class). This is shown in Figure 4 of [Rethmeier et al. (2018)](https://aclweb.org/anthology/W18-6246) (a paper on controversy detection), also shown below: red-colored tokens were important for controversy detection, blue-colored tokens decreased prediction scores.

<img width=400px src=example.png>
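The per-word impact analysis described in the last bullet can be sketched as follows. This is only an illustration: `toy_score` is a hypothetical stand-in for a trained model's class probability, `word_importance` is a hypothetical helper, and `<UNK>` is an assumed name for the unknown-word token.

```python
def word_importance(tokens, score_fn, unk="<UNK>"):
    """Return (token, delta) pairs, where delta is the drop in score
    when that token alone is replaced by the unknown-word token."""
    base = score_fn(tokens)
    deltas = []
    for i, token in enumerate(tokens):
        masked = tokens[:i] + [unk] + tokens[i + 1:]
        deltas.append((token, base - score_fn(masked)))
    return deltas


# Toy stand-in for a trained model: a fixed weight per known word.
TOY_WEIGHTS = {"terrible": 0.5, "great": -0.25}


def toy_score(tokens):
    """Hypothetical score for the 'negative' class."""
    return 0.5 + sum(TOY_WEIGHTS.get(token, 0.0) for token in tokens)


importance = word_importance(["the", "terrible", "movie"], toy_score)
print(importance)  # [('the', 0.0), ('terrible', 0.5), ('movie', 0.0)]
```

A positive delta means the word pushed the score up (it would be colored red in the figure), and a negative delta means it pushed the score down (blue).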

Note that this is a non-exhaustive list, and you are encouraged to also explore additional analyses.
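As one possible starting point for the quantitative analysis, the sketch below counts token-level (gold, predicted) tag pairs and lists the most frequent confusions. The helper names `tag_confusion` and `most_common_errors` and the toy tag sequences are hypothetical; for NER, a span-level variant may be more informative.

```python
from collections import Counter


def tag_confusion(gold_sents, pred_sents):
    """Count how often each (gold_tag, predicted_tag) pair occurs."""
    counts = Counter()
    for gold, pred in zip(gold_sents, pred_sents):
        for gold_tag, pred_tag in zip(gold, pred):
            counts[(gold_tag, pred_tag)] += 1
    return counts


def most_common_errors(counts, n=5):
    """Return the n most frequent off-diagonal pairs, ties broken alphabetically."""
    errors = [(pair, c) for pair, c in counts.items() if pair[0] != pair[1]]
    return sorted(errors, key=lambda item: (-item[1], item[0]))[:n]


gold = [["B-Malware", "O", "B-System"], ["O", "B-Malware"]]
pred = [["B-Malware", "O", "O"], ["O", "B-System"]]
confusion = tag_confusion(gold, pred)
print(most_common_errors(confusion))
# [(('B-Malware', 'B-System'), 1), (('B-System', 'O'), 1)]
```

The diagonal entries (gold equals predicted) are the correct tokens; sorting the off-diagonal entries shows which label pairs the model mixes up most often and where manual inspection is likely to pay off.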

### Checklist final project
Please check all these items before handing in your final report. You only have to upload a PDF file on LearnIt; make sure a link to the code is included in the report and the code is accessible.

* Are all group members and their email addresses specified?
* Does the group report include a representative project title?
* Does the group report contain an abstract?
* Does the introduction clearly specify the research intention and research question?
* Does the group report adequately refer to the relevant literature?
* Does the group report properly use figures, tables and examples?
* Does the group report provide and discuss the empirical results?
* Is the group report proofread?
* Does the pdf contain the link to the project’s github repo?
* Is the github repo accessible to the public (within ITU)?
* Is the group report maximum 5 pages long, excluding references and appendix?
* Are the group contributions added in the appendix?
* Does the repository contain all scripts and code to reproduce the results in the group report? Are instructions provided on how to run the code?
