kwulffert/NLP_Projects

NLP_Projects

This is my corner to experiment with NLP algorithms.

In this notebook the NLP transfer learning method "Universal Language Model Fine-tuning" (ULMFiT) is applied for sentiment classification of travellers' tweets about their experiences flying with six US airlines.

The work is structured as follows:

  • Import libraries.
  • Data load.
  • Data exploration.
  • Text pre-processing.
  • Universal Language Model Fine-tuning (ULMFiT) application.
  • Getting the data ready for modeling.
  • Tweet-generating language model.
  • Sentiment classifier of US airlines tweets.
  • Analysis of results.
  • Conclusions.
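
The text pre-processing step for tweets can be sketched as below. This is a minimal illustration using only the Python standard library, not the notebook's actual code: the function name `clean_tweet` and the specific rules (dropping URLs and @mentions, keeping the word behind a hashtag) are assumptions chosen for the example.

```python
import html
import re

# Hypothetical cleaning rules; the notebook may use different ones.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Return a tweet with HTML entities decoded and URLs/mentions removed."""
    text = html.unescape(text)        # "&amp;" -> "&"
    text = URL_RE.sub(" ", text)      # drop links
    text = MENTION_RE.sub(" ", text)  # drop @handles such as airline accounts
    text = text.replace("#", "")      # keep the hashtag word, drop the symbol
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(clean_tweet("@united Thanks &amp; see you! http://t.co/abc #fly"))
```

Casing is left untouched here on purpose, since whether to lowercase depends on the tokenizer used downstream.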

As Alice's Adventures in Wonderland holds a special place in our home, I found it a great fit for training a word2vec model, using bigrams and trigrams so that the embeddings capture the context around the words, and finding out whether similar words cluster around characters.

The notebook is divided into the following steps:

  • Data load and pre-processing.
  • Text pre-processing.
  • Tokenization and lemmatization.
  • Create bigrams and trigrams.
  • Create and train a word2vec model.
  • Explore similarity of words resulting from the model.
  • Visualisation of similarity: are clusters of words related to characters from the book?
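
The bigram and trigram step can be illustrated with a small standard-library sketch. It is not the notebook's code: `merge_phrases`, its parameters, and the scoring rule (pair count scaled by corpus size and divided by the product of the word counts) are simplified assumptions. Libraries such as gensim provide a `Phrases` class for the same purpose. Running the merge twice turns frequent bigrams into trigrams.

```python
from collections import Counter


def merge_phrases(sentences, min_count=2, threshold=3.0):
    """Join frequent adjacent token pairs with "_" (hypothetical helper)."""
    unigrams = Counter(tok for sent in sentences for tok in sent)
    bigrams = Counter(pair for sent in sentences for pair in zip(sent, sent[1:]))
    total = sum(unigrams.values())

    def is_phrase(a, b):
        count = bigrams[(a, b)]
        if count < min_count:
            return False
        # Simplified association score: high when a and b co-occur
        # more often than their individual frequencies would suggest.
        return count * total / (unigrams[a] * unigrams[b]) >= threshold

    merged = []
    for sent in sentences:
        out, i = [], 0
        while i < len(sent):
            if i + 1 < len(sent) and is_phrase(sent[i], sent[i + 1]):
                out.append(sent[i] + "_" + sent[i + 1])
                i += 2  # skip both tokens of the merged pair
            else:
                out.append(sent[i])
                i += 1
        merged.append(out)
    return merged


if __name__ == "__main__":
    # Toy tokenized sentences, invented for this example.
    corpus = [
        ["the", "white", "rabbit", "ran"],
        ["alice", "saw", "the", "white", "rabbit"],
        ["the", "queen", "saw", "alice"],
    ]
    with_bigrams = merge_phrases(corpus)
    with_trigrams = merge_phrases(with_bigrams)  # second pass builds trigrams
    print(with_trigrams)
```

The merged token lists would then be the input for training the word2vec model, so that a phrase like a character's name gets a single embedding.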
