This repository was archived by the owner on Jul 7, 2023. It is now read-only.
Generating text from arbitrary input string #303
I have a language model trained on a very large corpus.
Whatever input string I enter in the t2t_decoder interactive mode, the model completely ignores it and does not use it to generate text. Any suggestions?
Here are the detailed instructions on how to reproduce this bug:
- train a language model. I trained mine on Wikipedia data using:
PROBLEM=languagemodel_wiki_full32k
MODEL=attention_lm
HPARAMS=attention_lm_base
DATA_DIR=/mnt/data/t2t_data/
TMP_DIR=/mnt/data/t2t_datagen/
TRAIN_DIR=/mnt/data/t2t_train/$PROBLEM/$MODEL-$HPARAMS
WORKER_GPU=16
t2t-trainer \
--data_dir=$DATA_DIR \
--problems=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--hparams='batch_size=4096' \
--output_dir=$TRAIN_DIR \
--local_eval_frequency=0 \
--worker_gpu=$WORKER_GPU
- launch the interactive t2t_decoder like this:
PROBLEM=languagemodel_wiki_full32k
MODEL=attention_lm
HPARAMS=attention_lm_base
DATA_DIR=/mnt/data/t2t_data/
TMP_DIR=/mnt/data/t2t_datagen/
TRAIN_DIR=/mnt/data/t2t_train/$PROBLEM/$MODEL-$HPARAMS
WORKER_GPU=0
BEAM_SIZE=1
ALPHA=0.6
t2t-decoder \
--data_dir=$DATA_DIR \
--problems=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--hparams='sampling_method=random' \
--output_dir=$TRAIN_DIR \
--decode_beam_size=$BEAM_SIZE \
--decode_alpha=$ALPHA \
--worker_gpu=$WORKER_GPU \
--local_eval_frequency=0 \
--decode_interactive
It should display this:
INTERACTIVE MODE num_samples=1 decode_length=100
it=<input_type> ('text' or 'image' or 'label', default: text)
pr=<problem_num> (set the problem number, default: 0)
in=<input_problem> (set the input problem number)
ou=<output_problem> (set the output problem number)
ns=<num_samples> (changes number of samples, default: 1)
dl=<decode_length> (changes decode length, default: 100)
<source_string> (decode)
q (quit)
At this point, try changing the decode_length or giving an arbitrary source_string: the output is always the same.
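To make the "always the same" observation easier to verify, the outputs produced for several different inputs can be pasted into a file and checked for duplicates. This is only a minimal sketch: the helper `all_outputs_identical` and the file `outputs.txt` (one decoded output per line) are hypothetical names for illustration and are not part of tensor2tensor.

```shell
# Hypothetical helper, not part of tensor2tensor.
# $1: path to a file holding one decoded output per line.
# Succeeds (exit 0) if all non-blank lines are identical, fails (exit 1) otherwise.
all_outputs_identical() {
  awk '
    NF { seen[$0] = 1 }            # record each distinct non-blank line
    END {
      n = 0
      for (k in seen) n++          # count distinct outputs
      exit (n > 1)                 # more than one distinct output: fail
    }
  ' "$1"
}
```

For example, `all_outputs_identical outputs.txt && echo "outputs do not depend on the input"`. If the check succeeds for clearly different inputs, that suggests the source string is not conditioning the generation.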