
Language Modelling (text generation, spell correction) and Sentiment Analysis / POS Tagging with MLP, RNN, CNN and BERT models and LLM prompting

VassilisDrouzas/Natural-Language-Processing

 
 


Natural Language Processing

Language Modeling

We build bigram, trigram and linearly interpolated language models and use them for text generation and spell correction.
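As an illustration of the idea, the following is a minimal sketch of linear interpolation between a bigram and a trigram model. It is not taken from the repository: the function names, the toy corpus, the add-alpha (Laplace) smoothing and the weight `lam` are all assumptions chosen for the example.

```python
from collections import Counter


def train_ngram_counts(sentences, n):
    """Count n-grams and their (n-1)-word contexts over padded sentences."""
    ngrams, contexts = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] * (n - 1) + tokens + ["</s>"]
        for i in range(n - 1, len(padded)):
            context = tuple(padded[i - n + 1:i])
            ngrams[context + (padded[i],)] += 1
            contexts[context] += 1
    return ngrams, contexts


def smoothed_prob(ngram, counts, vocab_size, alpha=1.0):
    """Add-alpha estimate of P(last word | preceding words)."""
    ngrams, contexts = counts
    return (ngrams[ngram] + alpha) / (contexts[ngram[:-1]] + alpha * vocab_size)


def interpolated_prob(w1, w2, w3, bigram, trigram, vocab_size, lam=0.5):
    """P(w3 | w1, w2) as a weighted mix of trigram and bigram estimates."""
    p_tri = smoothed_prob((w1, w2, w3), trigram, vocab_size)
    p_bi = smoothed_prob((w2, w3), bigram, vocab_size)
    return lam * p_tri + (1 - lam) * p_bi


# Toy corpus (hypothetical, for illustration only).
sentences = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
# The vocabulary includes the end marker, since it can be predicted.
vocab = sorted({w for s in sentences for w in s} | {"</s>"})

bigram = train_ngram_counts(sentences, 2)
trigram = train_ngram_counts(sentences, 3)

for word in vocab:
    p = interpolated_prob("the", "cat", word, bigram, trigram, len(vocab))
    print(f"P({word} | the cat) = {p:.3f}")
```

In practice the interpolation weight would be tuned on held-out data rather than fixed at 0.5. The same scoring function can rank candidate continuations for generation or candidate corrections for spell correction.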

Source code Report

Sentiment Classification and POS Tagging tasks

We create deep learning models using the Transformers/Datasets, PyTorch and TensorFlow libraries. We also use the keras_tuner / transformers_trainer frameworks to optimize hyperparameters and model architecture.

We briefly mention additional tasks carried out:

  • Sentiment Analysis: Dataset selection, exploratory analysis, custom stopwords, data augmentation.
  • POS Tagging: Dataset selection, exploratory analysis, custom parsing, custom baseline ("smart dummy") model, local caching of heavy computations, automated results generation (Python -> LaTeX).
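To give a sense of what a baseline for POS tagging can look like, here is a minimal sketch of a most-frequent-tag tagger. This is an assumption about one common baseline design, not a description of the repository's "smart dummy" model, which may work differently. The function names and the toy training data are hypothetical.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_sentences):
    """Learn each word's most frequent tag and a global fallback tag."""
    word_tags = defaultdict(Counter)
    all_tags = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            word_tags[word][tag] += 1
            all_tags[tag] += 1
    # Unknown words fall back to the most frequent tag in the training data.
    default_tag = all_tags.most_common(1)[0][0]
    lookup = {w: c.most_common(1)[0][0] for w, c in word_tags.items()}
    return lookup, default_tag


def predict(words, lookup, default_tag):
    """Tag each word independently, ignoring context."""
    return [lookup.get(w, default_tag) for w in words]


# Toy training data (hypothetical, for illustration only).
train = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("cats", "NOUN"), ("chase", "VERB"), ("mice", "NOUN")],
]

lookup, default_tag = train_baseline(train)
print(predict(["the", "cat", "purrs"], lookup, default_tag))
```

A baseline of this kind ignores context entirely, so it provides a floor against which the MLP, RNN, CNN and BERT taggers can be compared.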

Each task features two IPython notebooks containing the executed code, Python source files for repeated custom tasks, and a unified report.

The reports discuss in detail the design decisions for each classifier and include graphs and aggregated results comparing the current model to the previous models.

Simple MLP model

Sentiment classification POS Tagging Report

RNN Model

Sentiment classification POS Tagging Report

CNN Model

Sentiment classification POS Tagging Report

BERT Model

Sentiment classification POS Tagging Report

Languages

  • Jupyter Notebook 86.1%
  • TeX 8.5%
  • Python 5.4%