soham-013/ParaBERT-Paraphrase-Identification-using-Siamese-BERT-and-Hand-Crafted-Features

CSCI 544: Applied Natural Language Processing

ParaBERT: Paraphrase Identification using Siamese BERT and Hand Crafted Features

Group Number: 42

Team Members: Kriti Asija, Parth Rohilla, Nishthavan Dahiya, Darsh Patel, Soham Khade

Professor: Mohammad Rostami


Abstract

Paraphrase identification is increasingly valuable, with applications ranging from academic integrity checks to legal document analysis and content originality in digital publishing. Many existing paraphrase detection models focus on word-level context and can miss sentence-level subtleties. This project presents "ParaBERT", a novel approach to paraphrase identification that combines a Siamese BERT network with handcrafted features to capture a more nuanced understanding of semantics. ParaBERT is evaluated against a baseline of various classical models. The final results demonstrate the model's robust performance, with high accuracy and F1 scores on the datasets used.


Dataset

[Figure: dataset]


Proposed Model

[Figure: proposed model]
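
The abstract describes combining a Siamese BERT network with handcrafted features. The snippet below is a minimal, dependency-free sketch of that general idea, not the project's actual implementation. Everything in it is an assumption made for illustration: the two features (`jaccard`, `length_ratio`), the helper names, and the `[u, v, |u - v|, cosine]` combination (a common choice in Siamese sentence encoders) are hypothetical, and the toy vectors stand in for sentence embeddings that a shared-weight BERT encoder would produce.

```python
import math
from typing import List


def handcrafted_features(s1: str, s2: str) -> List[float]:
    """Hypothetical pair features; the README does not list the real ones."""
    t1, t2 = s1.lower().split(), s2.lower().split()
    set1, set2 = set(t1), set(t2)
    union = set1 | set2
    # Token-set overlap between the two sentences.
    jaccard = len(set1 & set2) / len(union) if union else 1.0
    # Ratio of the shorter sentence length to the longer one.
    longer = max(len(t1), len(t2))
    length_ratio = min(len(t1), len(t2)) / longer if longer else 1.0
    return [jaccard, length_ratio]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pair_representation(u: List[float], v: List[float],
                        feats: List[float]) -> List[float]:
    """Concatenate two embeddings from the same encoder with pair features.

    In a Siamese setup, u and v come from one encoder applied to each
    sentence; here they are plain lists so the sketch runs without a model.
    """
    abs_diff = [abs(a - b) for a, b in zip(u, v)]
    return list(u) + list(v) + abs_diff + [cosine(u, v)] + list(feats)


# Toy usage: the vectors below are placeholders for BERT sentence embeddings.
feats = handcrafted_features("the cat sat", "the cat slept")
rep = pair_representation([1.0, 0.0], [1.0, 0.0], feats)
```

The resulting vector would then be fed to a classifier that predicts whether the two sentences are paraphrases. Which classifier and which features the project uses is not stated in this README.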


Results

[Figure: results]