Skip to content

susantabiswas/Word-Prediction-Ngram

Repository files navigation

HitCount

Word-Prediction-Ngram

Next Word Prediction using n-gram Probabilistic Model.

Various jupyter notebooks are there using different Language Models for next word Prediction.

Input :

The users Enters a text sentence

Output :

Predicts a word which can follow the input sentence

Various Smoothing Techniques have been used in different Language Models along with combination of Interpolation and Backoff in these different Language Models.

Smoothing Techniques Used:

  1. Add 1 
  2. Good Turing 
  3. Simple Knesser Ney
  4. Interpolated Knesser Ney

How the code works:

  1. Cleaning of training corpus ( Removing Punctuations etc)
  2. Creation of Language Model:
    i) Formation of n-grams (Unigram, Bigram, Trigram, Quadgram)
    ii) Probability Dictionary Creation with provision of various Smoothing Mechanism