This repository was archived by the owner on Jul 7, 2023. It is now read-only.

Generating text from arbitrary input string #303

@ghego


I have a language model trained on a very large corpus.
Whatever string I enter in the t2t_decoder interactive mode is completely ignored: the model does not use it when generating text. Any suggestions?

Here are the detailed instructions on how to reproduce this bug:

  1. Train a language model. I trained mine on Wikipedia data using:
PROBLEM=languagemodel_wiki_full32k
MODEL=attention_lm
HPARAMS=attention_lm_base

DATA_DIR=/mnt/data/t2t_data/
TMP_DIR=/mnt/data/t2t_datagen/
TRAIN_DIR=/mnt/data/t2t_train/$PROBLEM/$MODEL-$HPARAMS

WORKER_GPU=16

t2t-trainer \
  --data_dir=$DATA_DIR \
  --problems=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --hparams='batch_size=4096' \
  --output_dir=$TRAIN_DIR \
  --local_eval_frequency=0 \
  --worker_gpu=$WORKER_GPU
  2. Launch the interactive t2t_decoder like this:
PROBLEM=languagemodel_wiki_full32k
MODEL=attention_lm
HPARAMS=attention_lm_base

DATA_DIR=/mnt/data/t2t_data/
TMP_DIR=/mnt/data/t2t_datagen/
TRAIN_DIR=/mnt/data/t2t_train/$PROBLEM/$MODEL-$HPARAMS

WORKER_GPU=0

BEAM_SIZE=1
ALPHA=0.6

t2t-decoder \
  --data_dir=$DATA_DIR \
  --problems=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --hparams='sampling_method=random' \
  --output_dir=$TRAIN_DIR \
  --decode_beam_size=$BEAM_SIZE \
  --decode_alpha=$ALPHA \
  --worker_gpu=$WORKER_GPU \
  --local_eval_frequency=0 \
  --decode_interactive

It should display this:

INTERACTIVE MODE  num_samples=1  decode_length=100
  it=<input_type>     ('text' or 'image' or 'label', default: text)
  pr=<problem_num>    (set the problem number, default: 0)
  in=<input_problem>  (set the input problem number)
  ou=<output_problem> (set the output problem number)
  ns=<num_samples>    (changes number of samples, default: 1)
  dl=<decode_length>  (changes decode length, default: 100)
  <source_string>                (decode)
  q                   (quit)

At this point, try changing the decode_length or entering an arbitrary source_string. The output is always the same.
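To make the expected behavior concrete, here is a toy sketch in Python. It is not tensor2tensor code: the bigram table and the `greedy_decode` helper are hypothetical, invented only for illustration. With `use_prompt=True` the continuation depends on the source string, which is what I would expect from the interactive decoder. With `use_prompt=False` every source string gives the same output, which resembles what I am seeing.

```python
# Toy bigram "language model": maps a token to its most likely next token.
# Hypothetical data for illustration only; not taken from tensor2tensor.
BIGRAMS = {
    "<s>": "the",
    "the": "cat",
    "cat": "sat",
    "sat": "down",
    "down": "</s>",
    "a": "dog",
    "dog": "ran",
    "ran": "away",
    "away": "</s>",
}


def greedy_decode(prompt, decode_length, use_prompt=True):
    """Greedily generate up to `decode_length` tokens.

    With use_prompt=True, decoding starts from the last prompt token.
    With use_prompt=False, the prompt is dropped and decoding always
    starts from the start-of-sequence token.
    """
    tokens = prompt.split() if use_prompt else []
    last = tokens[-1] if tokens else "<s>"
    out = []
    for _ in range(decode_length):
        nxt = BIGRAMS.get(last, "</s>")
        if nxt == "</s>":  # stop at end-of-sequence
            break
        out.append(nxt)
        last = nxt
    return " ".join(out)


if __name__ == "__main__":
    for prompt in ("a", "the"):
        print(prompt, "->", greedy_decode(prompt, 3))
        print(prompt, "-> (prompt ignored)",
              greedy_decode(prompt, 3, use_prompt=False))
```

In this sketch the decode length also changes the output when the prompt is used, whereas in my setup changing dl has no visible effect either.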
