
Siamese Transformer Networks for Key-Point Analysis

Project for the Human Languages Technologies course @ University of Pisa

Unipi logo

Authors: Alberto Marinelli, Valerio Mariani

Abstract

Pretrained language models are nowadays the standard approach to many Natural Language Processing tasks. For this reason, we decided to experiment with Transformers inside a Siamese network to solve such tasks and to understand how they work. Specifically, given the extensive pre-training of the available language models and the small size of the dataset, we chose to use a pre-trained BERT model, and its variant RoBERTa, inside a Siamese network. This architecture was used to tackle Task 1 of IBM's shared task 2021: given a set of debatable topics, a set of key points, and a set of arguments supporting or contesting each topic, the goal is to match each argument to the key point it expresses.
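
As a rough illustration of this setup, the sketch below (not the project's exact code) uses the HuggingFace transformers library: one pre-trained encoder with shared weights embeds both the argument and a candidate key point via mean pooling, and the pair is scored with cosine similarity. The model names, pooling strategy, and scoring function here are assumptions made for illustration.

```python
# Minimal sketch of a Siamese Transformer matcher (assumed, not the project's exact code).
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModel

MODEL_NAME = "bert-base-uncased"  # e.g. "roberta-base" for the Siamese-RoBERTa variant
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = TFAutoModel.from_pretrained(MODEL_NAME)  # shared weights => "Siamese"

def embed(texts):
    """Mean-pool the token embeddings of a batch of sentences."""
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="tf")
    hidden = encoder(**enc).last_hidden_state             # (batch, seq_len, dim)
    mask = tf.cast(enc["attention_mask"], tf.float32)[..., tf.newaxis]
    return tf.reduce_sum(hidden * mask, axis=1) / tf.reduce_sum(mask, axis=1)

arguments = ["School uniforms limit students' self-expression."]
key_points = ["Uniforms restrict freedom of expression.",
              "Uniforms reduce bullying."]

arg_emb = tf.math.l2_normalize(embed(arguments), axis=1)
kp_emb = tf.math.l2_normalize(embed(key_points), axis=1)
scores = tf.matmul(arg_emb, kp_emb, transpose_b=True)     # cosine similarities
best_kp = tf.argmax(scores, axis=1)                        # matched key point per argument
```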

Running the Project

All the experimentation code is provided in a dedicated notebook that can be run on Google Colab. To speed up the computation required for training and inference, it is suggested to change the runtime type to GPU.
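
As a quick sanity check before training (an assumed step, not part of the repository), the snippet below verifies that TensorFlow can see the Colab GPU:

```python
# Confirm the GPU runtime is active before running training or inference cells.
import tensorflow as tf
print("GPUs visible to TensorFlow:", tf.config.list_physical_devices("GPU"))
```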

Main results

Despite the relatively small size of the dataset we were given, especially when compared to the corpora these models were pre-trained on, both SBERT and Siamese-RoBERTa generalized well from the training set to the development (validation) set in their ability to select the correct key point for a sentence among a set of candidates. Using the given metrics on the validation set, we obtained ~75% Mean Average Precision with SBERT and ~80% Mean Average Precision with Siamese-RoBERTa.
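
For reference, the snippet below shows a simplified, generic Mean Average Precision computation over ranked key-point candidates; it is only an illustration and not the official shared-task evaluation script.

```python
# Simplified illustration of Mean Average Precision (not the official scorer).
def average_precision(ranked_labels):
    """ranked_labels: list of 0/1 relevance flags, ordered by model score (best first)."""
    hits, precisions = 0, []
    for i, rel in enumerate(ranked_labels, start=1):
        if rel:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / hits if hits else 0.0

def mean_average_precision(all_ranked_labels):
    return sum(average_precision(r) for r in all_ranked_labels) / len(all_ranked_labels)

# Example: two arguments, each with candidate key points ranked by similarity.
print(mean_average_precision([[1, 0, 1], [0, 1, 0]]))  # ~0.67
```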

Languages and Tools

Python, TensorFlow
