Language modelling using LSTM and classic English literature
Abstract

Language is “...a systematic means of communicating ideas or feelings by the use of conventionalized signs, sounds, gestures, or marks having understood meanings”. In natural language processing, we aim to build algorithms that can extract information, facts and sentiment quickly, efficiently and accurately from written text or speech. The fundamental problem underlying all of these tasks is understanding the language itself, specifically its semantics and grammar. Language modeling addresses this key issue: it provides a statistical measure of whether a sequence of words makes sense semantically in a particular language. In this report we look at how the English language can be modelled using a deep learning architecture, the Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) units. Two modeling approaches are explored, one at the word level and another at the character level. The report compares and contrasts the two approaches through exhaustive experiments and identifies the tradeoffs and limitations of each.
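A character-level model first needs a vocabulary that maps every distinct character to an integer index (the repository keeps these in character_mappings). The sketch below illustrates that preprocessing step in plain Python; the function names are hypothetical and not taken from preprocess.py:

```python
def build_char_mappings(text):
    """Map each distinct character to an integer index and back."""
    chars = sorted(set(text))
    char_to_idx = {c: i for i, c in enumerate(chars)}
    idx_to_char = {i: c for i, c in enumerate(chars)}
    return char_to_idx, idx_to_char

def encode(text, char_to_idx):
    """Encode raw text as a sequence of integer indices."""
    return [char_to_idx[c] for c in text]

# Tiny example: encode a line and decode it back.
text = "to be or not to be"
c2i, i2c = build_char_mappings(text)
seq = encode(text, c2i)
decoded = "".join(i2c[i] for i in seq)
```

The integer sequence `seq` is what gets fed (after windowing into fixed-length inputs) to the LSTM; a word-level model does the same thing with whitespace-tokenized words instead of characters.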

Setup

  • Use setup.sh to install required packages.
    ./setup.sh
    
  • Change hyperparameters in config.py
  • To train the word-based model:
    python lstm_word_based.py
    
  • To train the character-based model:
    ./runCodeRun.sh
    
  • Trained models are saved in the character_models or word_models folder

Read more about the project