Repetitive generation for simple prompt #4

Closed
strin opened this issue Sep 12, 2019 · 8 comments


strin commented Sep 12, 2019

Followed the exact steps documented in README. The model with sequence length 256 running:

ENTER PROMPT: hello this is GPT. how are you?

[image: screenshot of the repetitive output]

Is this error reproducible by others?


strin commented Sep 12, 2019

Similar error for the unicorn example:
[image: screenshot of the repetitive output for the unicorn example]

@kostyan0005

I have the same problem.

Additional info:

keskarnitish (Contributor) commented Sep 12, 2019

This is usually symptomatic of not loading a model.
Are you sure that --model_dir points to the right location and that the entire checkpoint is available there?
Also, which model version are you using?
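One way to sanity-check the checkpoint directory before running generation (a minimal sketch; the `.index`/`.data` file layout is the standard TF1 checkpoint convention, and the helper name is illustrative, not part of this repo):

```python
import glob
import os

def checkpoint_complete(model_dir):
    """Return True if model_dir looks like a complete TF1 checkpoint:
    every .index file has at least one matching .data shard."""
    index_files = glob.glob(os.path.join(model_dir, "*.index"))
    if not index_files:
        return False  # nothing to restore from
    for idx in index_files:
        prefix = idx[:-len(".index")]
        if not glob.glob(prefix + ".data-*"):
            return False  # index present but data shards missing
    return True
```

If this returns False for the directory passed to --model_dir, the model is likely not being loaded, which matches the symptom keskarnitish describes above.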


pradeepthiyyagura commented Sep 13, 2019

I have the same problem, and I am using:

python 2.7.16
tensorflow 1.14.0 gpu_py27h39f1c70_0
model seqlen256_v1.ckpt

Command $python generation.py --model_dir /home/tpradeep/ctrl/seqlen256_v1.ckpt

Quite a few warning messages appear, and then the following:

2019-09-13 08:03:27.284843: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
WARNING:tensorflow:From /home/tpradeep/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py:1066: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
Loading vocabulary from vocab ...
Read 6086453827 words (246531 unique) from vocabulary file.
Loading codes from codes ...
Read 200000 codes from the codes file.
ENTER PROMPT: Wikipedia Salesforce Inc. is
Wikipedia Salesforce Inc. is a

Wikipedia Salesforce Inc. is a software

Wikipedia Salesforce Inc. is a software company

Wikipedia Salesforce Inc. is a software company that

Wikipedia Salesforce Inc. is a software company that provides

Wikipedia Salesforce Inc. is a software company that provides cloud-based

^C^CContinuing
ENTER PROMPT: ^CTraceback (most recent call last):
File "generation.py", line 162, in
prompt = raw_input('ENTER PROMPT: ')
KeyboardInterrupt
^C
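The traceback at the end is just Ctrl-C interrupting the raw_input call, not a model failure. A minimal sketch of a prompt loop that exits cleanly instead (the function names here are placeholders, not the actual generation.py code):

```python
def prompt_loop(read_input, handle):
    """Read prompts until Ctrl-C/EOF; return the prompts that were handled."""
    seen = []
    while True:
        try:
            prompt = read_input("ENTER PROMPT: ")
        except (KeyboardInterrupt, EOFError):
            break  # exit quietly instead of dumping a traceback
        seen.append(prompt)
        handle(prompt)
    return seen
```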

@keskarnitish (Contributor)

@pradeepthiyyagura, your generation script seems to be running fine. Is your concern about the warnings?


pradeepthiyyagura commented Sep 13, 2019

Thank you for the response. My primary concern is not the warnings, but whether the output is in the expected format. I thought it would generate a full sentence or a paragraph of a certain length, as in the examples, instead of printing a new word prediction on every line.

@keskarnitish (Contributor)

> My primary concern is not about the warnings, but is the output in the expected format? I thought it would generate a full sentence or a paragraph of a certain length, as in the examples, instead of displaying every line with a new word prediction.

Aah. I see.

You can print just once by de-indenting the print statements at https://github.com/salesforce/ctrl/blob/master/generation.py#L272 and https://github.com/salesforce/ctrl/blob/master/generation.py#L273 so that they fall outside the generation for loop.
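The fix can be sketched like this (illustrative only; `tokens` and the sampling step are placeholders, not the real variables in generation.py):

```python
def generate(prompt, steps):
    """Toy stand-in for the sampling loop in generation.py."""
    tokens = prompt.split()
    for _ in range(steps):
        tokens.append("word%d" % len(tokens))  # placeholder for sampling a token
        # Printing here, inside the loop, shows one growing line per step,
        # which is the behaviour pradeepthiyyagura observed.
    # De-indented out of the loop: the full sequence prints exactly once.
    print(" ".join(tokens))
    return " ".join(tokens)
```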

@keskarnitish (Contributor)

Seems like this has been fixed by specifying the right --model_dir. Closing for now; reopen as necessary.
