
Process killed by the system #12

Closed
lenhhoxung86 opened this issue Mar 30, 2016 · 5 comments

Comments

@lenhhoxung86

Hello, I've tested your code on my own dataset of 20,000 examples, and the results are quite good. I have another dataset of 300,000 examples, where each example is a short sentence of approximately 20 words. On this new dataset the first 100 steps run fine, but it stops at the evaluation step. The dev set has more than 60,000 examples, and the message when it stops is just "Killed".
I guess the reason is that the dev set is very large, so evaluation consumes a lot of memory. Is that true? And how can I fix it?
Thank you very much.

@dennybritz
Owner

Yeah, the problem is most likely that you are feeding the whole dev set as one batch, which is theoretically possible, but takes a lot of memory. You can break up the dev set into multiple batches of fixed size and then combine the results.
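A minimal sketch of that idea (not code from this thread; the cnn, sess, x_dev and y_dev names follow the repo's train.py, and size-weighted averaging is one assumed way to combine the per-batch results):

# Evaluate the dev set in fixed-size chunks instead of one giant batch,
# then combine the per-batch metrics, weighted by batch size.
import numpy as np

losses, accuracies, sizes = [], [], []
for start in range(0, len(x_dev), 64):
    x_chunk = x_dev[start:start + 64]
    y_chunk = y_dev[start:start + 64]
    feed_dict = {cnn.input_x: x_chunk,
                 cnn.input_y: y_chunk,
                 cnn.dropout_keep_prob: 1.0}
    loss, acc = sess.run([cnn.loss, cnn.accuracy], feed_dict)
    losses.append(loss)
    accuracies.append(acc)
    sizes.append(len(x_chunk))
print("dev loss {:g}, dev acc {:g}".format(
    np.average(losses, weights=sizes), np.average(accuracies, weights=sizes)))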

@lenhhoxung86
Author

I'm able to make it work now. Thank you very much.

@junchaozheng

@lenhhoxung86 After feeding the dev data in multiple batches, how did you deal with the summary step, since the summary is computed per batch?
step, summaries, loss, accuracy = sess.run([global_step, dev_summary_op, cnn.loss, cnn.accuracy], feed_dict)

@lenhhoxung86
Author

lenhhoxung86 commented Jul 18, 2017

@junchaozheng Hi, you can do something like this:

dev_batches = data_helpers.batch_iter_dev(
    list(zip(x_batch, y_batch)), FLAGS.batch_size)
total_losses = np.array([], dtype="float32")
total_predictions = np.array([], dtype=int)
counter = 0
for dev_batch in dev_batches:
    counter += 1
    dev_x_batch, dev_y_batch = zip(*dev_batch)
    feed_dict = {
        cnn.input_x: dev_x_batch,
        cnn.input_y: dev_y_batch,
        cnn.dropout_keep_prob: 1.0
    }
    # cnn.losses is the per-example loss tensor exposed on the model
    # (not the scalar cnn.loss), so the values can be concatenated across batches.
    step, loss_vals, dev_predictions = sess.run(
        [global_step, cnn.losses, cnn.predictions], feed_dict)
    time_str = datetime.datetime.now().isoformat()
    print("{}: step {} batch {}".format(time_str, step, counter))
    print("dev_predictions: {}".format(dev_predictions))
    total_losses = np.concatenate((total_losses, loss_vals), axis=0)
    total_predictions = np.concatenate((total_predictions, dev_predictions), axis=0)
# Combine the per-example results collected over the whole dev set.
loss = tf.reduce_mean(total_losses)
correct_predictions = tf.equal(total_predictions, tf.argmax(y_batch, 1))
accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")
print("loss: {} -- accuracy: {}".format(sess.run(loss), sess.run(accuracy)))
# Now write the summaries (on TF >= 1.0 these are tf.summary.scalar / tf.summary.merge).
total_loss_summary = tf.scalar_summary("loss", loss)
total_acc_summary = tf.scalar_summary("accuracy", accuracy)
total_dev_summary_op = tf.merge_summary([total_loss_summary, total_acc_summary])
summaries = sess.run(total_dev_summary_op)
if writer:
    writer.add_summary(summaries, step)

But in general, I have to say that word2vec and a CNN are not really a good choice for text classification. From my experience, tf-idf features are always the best way to represent text.
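For reference, a minimal tf-idf baseline along those lines could look like this (scikit-learn is assumed here, it is not part of this repo, and the texts/labels are toy placeholders):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data standing in for the real sentences and labels.
texts = ["great movie", "awful plot", "loved every minute", "boring and slow"]
labels = [1, 0, 1, 0]

# tf-idf unigrams/bigrams feeding a linear classifier.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=1),
                    LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["a great but slow movie"]))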

@pmutyala

pmutyala commented Dec 5, 2017

Hello @lenhhoxung86 @junchaozheng, we are seeing the same issue and are wondering what your batch_iter_dev() function definition in data_helpers looks like. Can you please give any pointers? Any help is appreciated.
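(Not an answer from the original poster, but a plausible batch_iter_dev, modeled on the repo's data_helpers.batch_iter with shuffling removed and a single pass, might look like this:)

import numpy as np

def batch_iter_dev(data, batch_size):
    """Yield the dev data in fixed-size batches: one pass, no shuffling (assumed definition)."""
    data = np.array(data)
    data_size = len(data)
    num_batches = int((data_size - 1) / batch_size) + 1
    for batch_num in range(num_batches):
        start_index = batch_num * batch_size
        end_index = min((batch_num + 1) * batch_size, data_size)
        yield data[start_index:end_index]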
