Empty prediction on CNN/DM with beam > 1 #457

Closed
pltrdy opened this Issue Dec 20, 2017 · 7 comments

Contributor

pltrdy commented Dec 20, 2017

I trained a summarization model using @srush's setup (described here). During translation, a beam size greater than 1 results in an empty prediction, while a beam size of 1 produces redundant output.

Preprocessing

The problem may come from the dataset, which I processed again myself. I'm basically running:

  1. A. See's preprocessing
  2. @mataney 's script to get text file from TF bins (gist here)
  3. I then replace the </s> tokens in the target files with </t>
  4. OpenNMT preprocessing as follows:
  python preprocess.py \
      -train_src $data/train.src.txt \
      -train_tgt $data/train.tgt.repeos.txt \
      -valid_src $data/valid.src.txt \
      -valid_tgt $data/valid.tgt.repeos.txt \
      -save_data $root/data \
      -src_seq_length 10000 \
      -tgt_seq_length 10000 \
      -src_seq_length_trunc 400 \
      -tgt_seq_length_trunc 100 \
      -dynamic_dict \
      -share_vocab

(*.repeos.txt files are those with </t> replacing </s>)
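
Step 3 above can be sketched as follows. This is a minimal illustration, not the script actually used in this issue: the function names and the input file name in the usage note are hypothetical, and the sketch assumes whitespace-separated tokens with one summary per line. Note that splitting and re-joining normalizes runs of whitespace to single spaces.

```python
def replace_eos_markers(line, old="</s>", new="</t>"):
    """Swap sentence-end markers token by token, leaving other tokens untouched."""
    return " ".join(new if tok == old else tok for tok in line.split())


def convert_file(src_path, dst_path):
    """Write a copy of src_path in which every </s> token is replaced by </t>."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(replace_eos_markers(line) + "\n")
```

For example, `convert_file("train.tgt.txt", "train.tgt.repeos.txt")` would produce the kind of `*.repeos.txt` file passed to `preprocess.py` above, assuming the unconverted target file is named `train.tgt.txt`.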

Training

  python train.py -data $root/data \
        -save_model $root/model \
        -copy_attn \
        -global_attention mlp \
        -word_vec_size 128 \
        -rnn_size 256 \
        -layers 1 \
        -encoder_type "brnn" \
        -epochs 16 \
        -seed 777 \
        -batch_size 32 \
        -max_grad_norm 2 \
        -share_decoder_embeddings \
        -gpuid 0

Translation

  python translate.py -model "$best_model" \
                      -src $data/test.src.txt \
                      -gpu "$gpu" \
                      -batch_size 1 \
                      -verbose \
                      -beam_size 5

Results

with beam_size = 1: redundant, but not empty, e.g.

PRED 1: <s> new : the crash of germanwings flight 9525 flight 9525 into the french alps . </t> <s> the crash of germanwings flight 9525 flight 9525 into the french alps . </t> <s> the crash of a cell phone video was recovered from a phone at the wreckage site . </t> <s> the crash of the flight 9525 flight 9525 's possible motive . </t>

PRED 2: <s> the icc 's founding rome statute is based at the hague , in the netherlands . </t> <s> the icc opened a preliminary examination into the situation in january . </t> <s> the icc opened a preliminary examination into the situation in the netherlands . </t> <s> the icc opened a preliminary examination into the situation in january . </t>

with beam_size > 1: empty; each beam produces an EOS (token_id = 3)

with n_best > 1: interestingly, I find good predictions in the n_best list that are not ranked first, e.g.

<s> french prosecutor : `` so far no videos were used in the crash investigation '' </t> <s> robin 's comments were `` completely wrong '' and `` unwarranted '' cell phones . </t> <s> `` it is a very disturbing scene , '' he says . </t>

but also some that contain redundancy:

<s> the formal accession was marked with a ceremony at the hague in the netherlands . </t> <s> the formal accession was marked with a ceremony at the hague in the netherlands . </t> <s> the formal accession was marked with a ceremony at the hague . </t>

This is not a trivial problem; I really don't know how this happens.
Any clues are welcome!

Contributor

srush commented Dec 20, 2017

Can you print your logs with -verbose as well?

Collaborator

sebastianGehrmann commented Dec 21, 2017

Hm, I have run into a related problem in the past where one in every ~20 predictions was empty, even with beam size 1. Looking at the other top predictions, everything seems normal. I have not been able to replicate this error consistently yet, but a simple fix is to set the log-probability of EOS to -1e7 or so for the very first step. Let me know if you make progress figuring this bug out!
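
The workaround described above could look roughly like this. It is a minimal sketch on plain Python lists, not code from OpenNMT-py: the function name and the list-of-lists score layout are assumptions for illustration, and `eos_id=3` is simply the token id reported earlier in this issue.

```python
EOS_ID = 3  # token id reported in this issue; other vocabularies may differ


def suppress_eos_at_first_step(log_probs, step, eos_id=EOS_ID, penalty=-1e7):
    """Return a copy of the per-beam log-probabilities with EOS made
    effectively unselectable at step 0; later steps are left unchanged."""
    masked = [row[:] for row in log_probs]  # copy so the caller's scores survive
    if step == 0:
        for row in masked:
            row[eos_id] = penalty
    return masked
```

With scores `[[-2.0, -3.0, -1.5, -0.1]]` at step 0, EOS (index 3) would otherwise win; after masking, the best candidate is index 2. This only prevents a completely empty output; it does not address the redundancy seen with beam size 1.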

Contributor

srush commented Dec 21, 2017

Oh okay. Let me try adding a min_length option.

Contributor

pltrdy commented Dec 21, 2017

I just checked Abisee's work; she does indeed use a min_length option and discards beams that are too short.

Adding such an option would bring the implementation in line with her work.
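
To make the idea concrete, here is a rough sketch of discarding finished hypotheses that are shorter than a minimum length. It is neither Abisee's code nor the option later added to OpenNMT-py: the function name, the `(tokens, score)` tuple layout, and the default `eos_id=3` are assumptions for illustration only.

```python
def filter_short_hypotheses(hypotheses, min_length, eos_id=3):
    """Keep finished hypotheses that have at least min_length tokens before
    EOS, sorted by score with the best (highest) first.

    hypotheses: list of (token_ids, score) pairs.
    """
    kept = []
    for tokens, score in hypotheses:
        # Do not count the trailing EOS towards the length.
        body = tokens[:-1] if tokens and tokens[-1] == eos_id else tokens
        if len(body) >= min_length:
            kept.append((tokens, score))
    return sorted(kept, key=lambda h: h[1], reverse=True)
```

With this filter, an EOS-only hypothesis such as `([3], -0.1)` is dropped for any `min_length >= 1`, so a longer, lower-scoring candidate from the n_best list becomes the top prediction.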

Contributor

mataney commented Dec 21, 2017

@pltrdy Brilliant!
Have you managed to check ROUGE on this?

Contributor

pltrdy commented Jan 2, 2018

@mataney sorry, I was on vacation.
Since the top prediction is mostly empty, I haven't run the ROUGE scoring; the scores would be really bad. This has to be fixed before scoring can take place.
