Kshitij Karthick edited this page Sep 15, 2015 · 3 revisions

Trsl

  $ git clone https://github.com/iisc-sa-open/trsl
  $ cd trsl

Dependencies for Trsl

  • Python 2.7
  • Pip
  • Other packages required by trsl are listed in requirements.txt and can be installed with:
  $ sudo pip install -r requirements.txt

Generating the Tree model

A decision tree model can be generated from either of the following inputs:

  • A text corpus and a config file. The config file has the following contents:

       [Trsl]
       ngram_window_size   =  < number of predictor variables >
       samples             =  < number of samples at each leaf node >
       reduction_threshold =  < reduction threshold at which tree growth should be stopped >
       set_filename        =    None
    
       [Set]
       no_of_clusters      =  < number of clusters to be formed from the word vectors >
       no_of_words         =  < number of words to be utilized for word vectorization >
       word2vec_model_path =  < path to the word2vec Google News binary vectors >
    
    Example
       $ cd ..                 # move up to the directory that contains trsl
       $ python -m trsl.main --config ./trsl/examples/example.config --text ./trsl/data/corpus/last_question
    
  • A text corpus and pre-computed word sets (passed with --group)

      $ main.py [-h] [-v] [-s] [-m MODEL] [-c CONFIG] [-t CORPUS] [-g GROUP]
    
      Script used to generate models
    
      optional arguments:
      -h,        --help           show this help message and exit
      -v,        --verbose        increase output verbosity
      -s,        --silent         silence all logging
      -m MODEL,  --model MODEL    pre-computed model file path
      -c CONFIG, --config CONFIG  config file for the model generation
      -t CORPUS, --text CORPUS    text corpus for model generation
      -g GROUP,  --group GROUP    groups of words, pre-clustered words based on vectors [ sets ]
    
    Example
       $ cd ..                 # move up to the directory that contains trsl
       $ python -m trsl.main --group ./trsl/data/word-sets/Inaugural-speeches/Inaugural-speeches-Kmeans-8851words-50clusters.json --text ./trsl/data/corpus/last_question
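
The config file from the first option uses INI-style sections, so it can be read with a standard parser. The snippet below is a minimal sketch, not Trsl's own loader: the section and key names come from the template above, while every value, the assumed types (integer counts, a float threshold), and the helper name `load_settings` are hypothetical. It uses Python 3's `configparser`, whose API differs from the Python 2.7 module listed in the dependencies.

```python
import configparser

# Illustrative values only; choose real ones for your corpus.
EXAMPLE_CONFIG = """
[Trsl]
ngram_window_size   = 3
samples             = 10
reduction_threshold = 0.5
set_filename        = None

[Set]
no_of_clusters      = 50
no_of_words         = 8851
word2vec_model_path = /path/to/word2vec-vectors.bin
"""


def load_settings(text):
    """Parse the [Trsl] and [Set] sections into a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    trsl = parser["Trsl"]
    sets = parser["Set"]
    set_filename = trsl.get("set_filename")
    return {
        # Assumed types: counts are integers, the threshold is a float.
        "ngram_window_size": trsl.getint("ngram_window_size"),
        "samples": trsl.getint("samples"),
        "reduction_threshold": trsl.getfloat("reduction_threshold"),
        # The literal string "None" is treated as "no pre-computed sets".
        "set_filename": None if set_filename == "None" else set_filename,
        "no_of_clusters": sets.getint("no_of_clusters"),
        "no_of_words": sets.getint("no_of_words"),
        "word2vec_model_path": sets.get("word2vec_model_path"),
    }


settings = load_settings(EXAMPLE_CONFIG)
print(settings["ngram_window_size"], settings["set_filename"])
```

Checking a config this way before starting a long model build can catch a missing key or a non-numeric value early.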
    

Predict the Next Word

  • The user provides a sequence of words, as many as the ngram window size.
  • The program predicts the probable next word (a.k.a. target word).

$ predict.py [-h] [-v] [-s] -m MODEL

Example script for target word prediction based on the input predictor variables of ngram window size

optional arguments:
 -h, --help               show this help message and exit
 -v, --verbose            increase output verbosity
 -s, --silent             silence all logging
 -m MODEL, --model MODEL  pre-computed model file path
Example
  $ cd ..                 # move up to the directory that contains trsl
  $ python -m trsl.examples.predict --model ./trsl/data/model/last_question/model
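
To make the input and output of prediction concrete, here is a toy sketch of predicting a target word from a fixed-size window of predictor words. It deliberately swaps in a plain n-gram frequency table for Trsl's decision tree, so it illustrates the interface only, not the model. The function names and the tiny corpus are hypothetical, and the code targets Python 3.

```python
from collections import Counter, defaultdict


def build_table(tokens, window):
    """Map each run of `window` words to a Counter of the words that follow it."""
    table = defaultdict(Counter)
    for i in range(len(tokens) - window):
        context = tuple(tokens[i:i + window])
        table[context][tokens[i + window]] += 1
    return table


def predict_next(table, context):
    """Return the most frequent follower of `context`, or None if it was never seen."""
    counts = table.get(tuple(context))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


corpus = "the cat sat on the mat and the cat sat on the chair".split()
table = build_table(corpus, window=2)  # window plays the role of ngram_window_size
print(predict_next(table, ["the", "cat"]))
```

As in predict.py, the number of words supplied must match the window size the table (or model) was built with.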

Random Walk

  • The user provides a seed sequence of words (as many as the ngram window size) and the number of words to be generated.
  • The program then generates the required amount of text, using the initial sequence of words as a seed.
$ tree_walk.py [-h] [-v] [-s] -m MODEL

Example script for random tree walk based on the input predictor variables 
of ngram window size and the number of words to be generated

optional arguments:
 -h, --help               show this help message and exit
 -v, --verbose            increase output verbosity
 -s, --silent             silence all logging
 -m MODEL, --model MODEL  pre-computed model file path
Example
  $ cd ..                 # move up to the directory that contains trsl
  $ python -m trsl.examples.tree_walk --model ./trsl/data/model/last_question/model
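
The seeded generation loop can be sketched as follows: predict a word from the last window of words, append it, slide the window, and repeat. This is a toy illustration that samples from an n-gram frequency table instead of walking Trsl's decision tree. All names and the corpus are hypothetical, and `random.Random.choices` requires Python 3.6 or later.

```python
import random
from collections import Counter, defaultdict


def build_table(tokens, window):
    """Map each run of `window` words to a Counter of the words that follow it."""
    table = defaultdict(Counter)
    for i in range(len(tokens) - window):
        context = tuple(tokens[i:i + window])
        table[context][tokens[i + window]] += 1
    return table


def random_walk(table, seed_words, length, rng):
    """Extend `seed_words` by up to `length` words sampled in proportion to their counts."""
    window = len(seed_words)
    output = list(seed_words)
    for _ in range(length):
        counts = table.get(tuple(output[-window:]))
        if not counts:
            break  # the current window was never seen, so the walk stops early
        words = list(counts)
        weights = [counts[w] for w in words]
        output.append(rng.choices(words, weights=weights, k=1)[0])
    return output


corpus = "the cat sat on the mat and the cat sat on the chair".split()
table = build_table(corpus, window=2)
text = random_walk(table, ["the", "cat"], length=5, rng=random.Random(0))
print(" ".join(text))
```

Passing an explicit `random.Random` instance keeps the walk reproducible for a given seed value.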