add a script to compute the perplexity of test data #56

ajaech · 2016-10-27T00:33:43Z

The eval.py script can be used to compute perplexity of test data.

Adding eval.py and updates to utils.py and model.py to allow for calculating the perplexity of test files. I also modified the vocabulary to have start, end, and unknown character tokens.
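For reference, perplexity is the exponential of the average negative log-likelihood per character. A minimal sketch of that calculation follows, assuming the model has already produced a probability for each target character. The function name `perplexity` and the example probabilities are illustrative and are not taken from eval.py.

```python
import math


def perplexity(target_probs):
    """Return exp of the mean negative log-probability of the targets."""
    if not target_probs:
        raise ValueError("need at least one probability")
    # Sum of negative log-probabilities over all target characters.
    nll = -sum(math.log(p) for p in target_probs)
    return math.exp(nll / len(target_probs))


# A model that assigns probability 0.25 to every target character
# has a perplexity of about 4, i.e. it is as uncertain as a uniform
# choice among four characters.
uniform = [0.25] * 8
print(perplexity(uniform))
```

A model that predicts every character with probability 1 would score 1.0, the lowest possible perplexity.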

martiansideofthemoon · 2016-11-19T06:59:20Z

utils.py

        count_pairs = sorted(counter.items(), key=lambda x: -x[1])
        self.chars, _ = zip(*count_pairs)
        self.vocab_size = len(self.chars)
        self.vocab = dict(zip(self.chars, range(len(self.chars))))
        with open(vocab_file, 'wb') as f:
            cPickle.dump(self.chars, f)
-        self.tensor = np.array(list(map(self.vocab.get, data)))
+        self.tensor = np.array(list(map(self.vocab.get, ['<S>'] + list(data) + ['</S>'])))


Would it be better to do this after line 59, self.tensor = self.tensor[:self.num_batches * self.batch_size * self.seq_length]? As written, the truncation on that line will usually cut off the trailing </S> token.
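The concern can be seen in a small sketch. It assumes, as the quoted line suggests, that the tensor is truncated to a whole number of batches. The toy sizes and the integer ids used for <S> and </S> are made up for illustration.

```python
import numpy as np

START, END = 0, 1            # hypothetical ids for <S> and </S>
data = np.arange(2, 13)      # eleven stand-in character ids
batch_size, seq_length = 2, 3

# Tokens are added before truncation, as in the diff above.
tensor = np.concatenate(([START], data, [END]))

# Keep only as many elements as fill whole batches.
num_batches = tensor.size // (batch_size * seq_length)
truncated = tensor[:num_batches * batch_size * seq_length]

# The last element (the </S> id) falls outside the kept range.
end_survives = bool((truncated == END).any())
print(truncated.size, end_survives)
```

Appending </S> after the truncation step would keep it, at the cost of the length no longer being an exact multiple of batch_size * seq_length.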

martiansideofthemoon · 2016-11-19T08:09:56Z

model.py

@@ -58,6 +58,29 @@ def loop(prev, _):
        optimizer = tf.train.AdamOptimizer(self.lr)
        self.train_op = optimizer.apply_gradients(zip(grads, tvars))

+    def eval(self, sess, chars, vocab, text):
+        batch_size = 200


Do you mean seq_length?

martiansideofthemoon · 2016-12-18T19:46:44Z

model.py

@@ -58,6 +58,29 @@ def loop(prev, _):
        optimizer = tf.train.AdamOptimizer(self.lr)
        self.train_op = optimizer.apply_gradients(zip(grads, tvars))

+    def eval(self, sess, chars, vocab, text):


It's probably better to move this to eval.py

hugovk · 2017-02-16T07:03:57Z

@ajaech This PR has merge conflicts.

ajaech added 2 commits October 26, 2016 17:30

allow perplexity calculation on test data

7efb5c1

Adding eval.py and updates to utils.py and model.py to allow for calculating the perplexity of test files. I also modified the vocabulary to have start, end, and unknown character tokens.

remove dead code

5373bcd

martiansideofthemoon reviewed Nov 19, 2016


martiansideofthemoon mentioned this pull request Nov 26, 2016

BRNN + perplexity evaluation #65

Open

martiansideofthemoon reviewed Dec 18, 2016


normanheckscher mentioned this pull request Jan 9, 2017

Perplexity & Performance hunkim/word-rnn-tensorflow#45

Open
