
model3.py

Yuzhou Mao edited this page Dec 13, 2018 · 7 revisions
def main()
    description: you can tune all the parameters here.
    tunable parameters: filename, if_pretrained, n_steps, reverse, n_neurons, batch_size, cpu_or_gpu, cell_ty, n_layers, if_bidirect, if_attention, choose_n, mode_

def get_rnn_cell(att, typ, platform, **kwargs)
    This function returns different types of RNN cells.
    input parameters:
        1. att: boolean, whether to use attention, only applicable to LSTM
        2. typ: string, type of RNN cell
        3. platform: gpu or cpu
        4. **kwargs: other necessary parameters
    returns:
        An RNN cell.

def prepare_input_for_nn(model, sentences, n_steps, stars, reverse=False, training=True)
    description:
    This function prepares the inputs for the neural network and outputs the ground-truth word vectors.
    input parameters:
	1. model: the trained word2Vec model for converting words to embedded vectors.
	2. sentences: a list of list of tokens, where each list consists of tokens from one review.
	3. n_steps: number of time steps in a sequence
	4. stars: a list of integers from 1 to 5, representing the user rating for each review.
	5. reverse: a boolean that sets the direction in which the weights decrease when computing the weighted sum. If True, the furthest word has the largest weight; if False, the closest word has the largest weight.
	6. training: whether the output is used for training or not
    outputs:
	1. A dataset of inputs(reviews), true_words(labels), sequence length, stars list
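
The `reverse` parameter above describes a weighted sum whose weights decrease with distance. This page does not give the weighting scheme, so the following is only a minimal sketch under stated assumptions: the function name `weighted_context_sum` is hypothetical, the input vectors are assumed to be ordered from the furthest word to the closest word, and the linear, normalized weights are an assumption rather than the scheme model3.py necessarily uses.

```python
def weighted_context_sum(vectors, reverse=False):
    """Combine word vectors into one vector using position-based weights.

    `vectors` is assumed to be ordered from the furthest word to the
    closest word. The linear weights 1..n (normalized to sum to 1) are an
    assumption made for illustration only.
    """
    n = len(vectors)
    if n == 0:
        raise ValueError("need at least one vector")
    raw = list(range(1, n + 1))  # closest word (last) gets the largest weight
    if reverse:
        raw.reverse()            # furthest word (first) gets the largest weight
    total = float(sum(raw))
    weights = [w / total for w in raw]
    dim = len(vectors[0])
    # Weighted sum computed one dimension at a time.
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


furthest, closest = [1.0, 0.0], [0.0, 1.0]
print(weighted_context_sum([furthest, closest], reverse=False))
print(weighted_context_sum([furthest, closest], reverse=True))
```

With `reverse=False` the result leans toward the closest word's vector, and with `reverse=True` it leans toward the furthest word's vector.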

def build_nn(n_layers, xpu, cell_type, training, stars, input_ph, n_steps, n_inputs, n_neurons, seq_length_ph, out_size=100, keep_prob=0.5, bidirection=False, attention=False, mode="")
    description: This function builds the neural network architecture.
    input parameters:
	1. n_layers: number of RNN layers
	2. xpu: cpu or gpu
	3. cell_type: RNN cell type
	4. training: placeholder for whether training or not
	5. stars: placeholder for stars
	6. input_ph: input placeholder for neural network
	7. n_steps: number of time steps
	8. n_inputs: input size
	9. n_neurons: number of hidden neurons
	10. seq_length_ph: placeholder for sequence length
	11. out_size: the output dimension size of neural network.
	12. keep_prob: dropout keeping rate
	13. bidirection: whether to use a bidirectional RNN
	14. attention: whether the attention mechanism is used
	15. mode: big_fc or not. big_fc is for the largest fully connected model.
     outputs:
	1. output: the output vectors from neural network.

def get_loss(pred_word, true_word)
     description: This function returns the loss between true word vectors and predicted word vectors.
     input parameters:
	1. pred_word: a list of embedded vectors from our neural network.
	2. true_word: a list of ground-truth word vectors.
     output:
	1. loss: the loss
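
This page does not say which loss `get_loss` computes. As a rough illustration of a loss between predicted and ground-truth word vectors, the sketch below uses mean squared error. Both the choice of mean squared error and the name `mean_squared_error` are assumptions; the actual function operates on TensorFlow tensors and may use a different measure.

```python
def mean_squared_error(pred_words, true_words):
    """Average squared difference over every component of every vector.

    Illustrative stand-in only; the loss used in model3.py is not
    specified on this page.
    """
    if not pred_words or len(pred_words) != len(true_words):
        raise ValueError("inputs must be non-empty and the same length")
    total = 0.0
    count = 0
    for pred, true in zip(pred_words, true_words):
        if len(pred) != len(true):
            raise ValueError("vectors must have the same dimension")
        for p, t in zip(pred, true):
            total += (p - t) ** 2
            count += 1
    return total / count


pred = [[1.0, 2.0], [0.0, 0.0]]
true = [[1.0, 0.0], [0.0, 2.0]]
print(mean_squared_error(pred, true))  # (0 + 4 + 0 + 4) / 4 = 2.0
```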

def get_optimizer(loss, lr=0.005)
     description: This function returns the optimizer for our neural network.
     input parameters:
	1. loss: the loss returned from get_loss function.
	2. lr: the learning rate.
     output:
	1. optimizer: the Adam optimizer.

def train_nn(seq_length_ph,n_steps, n_inputs, training, model, sess, saver,stars_ph, input_ph, word_ph, loss, train_op, dataset, batch_size, num_epoch)
     description: This function is for training our neural network.
     input parameters:
	1. seq_length_ph: placeholder for sequence length
	2. n_steps: number of time steps
	3. n_inputs: input size
	4. training: placeholder for training or not
	5. model: the trained word2Vec model for converting words to embedded vectors.
	6. sess: the tf session.
	7. saver: the tf saver for saving the neural network model.
	8. stars_ph: placeholder for stars
	9. input_ph: the placeholder for input for our neural network.
	10. word_ph: the predicted word vector placeholder.
	11. loss: the loss.
	12. train_op: the optimizer operation returned from get_optimizer() function.
	13. dataset: dataset used for training
	14. batch_size: batch size
	15. num_epoch: the number of epochs to train.
    outputs:
        None
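
The `batch_size` and `num_epoch` parameters suggest a standard mini-batch loop. The sketch below shows only that iteration pattern in plain Python. The generator name `iterate_batches` is hypothetical, and the absence of shuffling is an assumption; in the real function each batch would presumably be fed to `sess` together with `train_op` and `loss`.

```python
def iterate_batches(dataset, batch_size, num_epoch):
    """Yield (epoch, batch) pairs, walking the dataset once per epoch.

    The last batch of an epoch may be smaller than batch_size.
    No shuffling is done here; that is an assumption of this sketch.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for epoch in range(num_epoch):
        for start in range(0, len(dataset), batch_size):
            yield epoch, dataset[start:start + batch_size]


for epoch, batch in iterate_batches([1, 2, 3, 4, 5], batch_size=2, num_epoch=2):
    # A training step on `batch` would run here.
    print(epoch, batch)
```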

def get_prediction(seq_length_ph, training, model, nn_model, test_sentences, stars, stars_ph, input_ph, word_ph, n_steps,reverse=True)
    description: This function generates the predicted embedded vectors output by our neural network.
    input parameters:
	1. seq_length_ph: placeholder for sequence length
	2. training: placeholder for training or not
	3. model: word2Vec model
	4. nn_model: our neural network model
	5. test_sentences: a list of list of tokens as the input for our neural network.
	6. stars: a list of integers from 1 to 5 representing the user rating for each review.
	7. stars_ph: placeholder for stars
	8. input_ph: the input placeholder
	9. word_ph: the output placeholder for our neural network
	10. n_steps: number of time steps
	11. reverse: whether we reverse the sequence or not
    output:
	1. test_true_words: the ground-truth embedded vectors.
	2. test_pred_words: the predicted embedded vectors.

def get_accuracy(model, true_words, pred_words, topn=10)
    description: This function calculates the accuracy.
    input parameters:
	1. model: word2Vec model.
	2. true_words: the ground-truth embedded vectors.
	3. pred_words: the predicted embedded vectors.
	4. topn: top n most similar embedded vectors.
    outputs:
	1. accuracy 
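
The `topn` parameter suggests a top-n accuracy: a prediction counts as correct when the true word is among the `topn` words most similar to the predicted vector. The sketch below illustrates that idea in plain Python under stated assumptions. The names `cosine`, `most_similar`, and `topn_accuracy` are hypothetical, the dict `vocab` stands in for the word2Vec model, and cosine similarity is assumed as the similarity measure.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(vocab, vector, topn):
    """Return the topn words in vocab ranked by similarity to vector."""
    ranked = sorted(vocab, key=lambda w: cosine(vocab[w], vector), reverse=True)
    return ranked[:topn]


def topn_accuracy(vocab, true_words, pred_words, topn=10):
    """Fraction of predictions whose top-n neighbours include the true word."""
    if not true_words or len(true_words) != len(pred_words):
        raise ValueError("inputs must be non-empty and the same length")
    hits = 0
    for true_vec, pred_vec in zip(true_words, pred_words):
        target = most_similar(vocab, true_vec, 1)[0]  # word nearest the true vector
        if target in most_similar(vocab, pred_vec, topn):
            hits += 1
    return hits / len(true_words)


# Toy vocabulary standing in for the word2Vec model (made-up vectors).
vocab = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "food": [0.0, 1.0]}
true_vecs = [[1.0, 0.0], [0.0, 1.0]]
pred_vecs = [[0.9, 0.1], [-1.0, 0.1]]
print(topn_accuracy(vocab, true_vecs, pred_vecs, topn=1))
print(topn_accuracy(vocab, true_vecs, pred_vecs, topn=2))
```

Raising `topn` can only keep or increase the accuracy, since a larger neighbour list contains the smaller one.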