interactive_conditional_samples.py crashes if there is more than one context token #306

Open
Nicholas-Markley opened this issue Jul 26, 2022 · 1 comment

Comments

@Nicholas-Markley

I can run the generate_unconditional_samples.py script on my GPU without issue. However, when I run the interactive_conditional_samples.py script, it crashes if there is more than one context token.

The interactive_conditional_samples.py script works fine as long as the model prompt produces only one context token. For instance, the prompt "please" produces the token list [29688] and generates text correctly. However, it crashes if the prompt produces two or more context tokens. For instance, the prompt "pig" produces the token list [79, 328] and crashes immediately.

When it crashes, I get this error:
failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED

And a little further down I see:

Blas xGEMMBatched launch failed : a.shape=[25,2,64], b.shape=[25,2,64], m=2, n=2, k=64, batch_size=25
         [[{{node sample_sequence/model/h0/attn/MatMul}}]]
         [[sample_sequence/while/Exit_3/_1375]]

If anyone has any insight on what might be going wrong, and how I can fix it, I'd really appreciate the help.
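
For reading the log above, here is a minimal sketch of how the m, n, k and batch_size values follow from the two operand shapes. It assumes the failing node is a batched matmul with transpose_b=True. The helper name batched_gemm_params is hypothetical and is not part of TensorFlow or the GPT-2 code, and the one-token shape [25, 1, 64] is an extrapolation, not taken from a log.

```python
def batched_gemm_params(a_shape, b_shape, transpose_b=True):
    """Return (m, n, k, batch_size) for a batched matmul of a and b.

    Hypothetical helper for illustration only: it mirrors the fields
    printed in the "Blas xGEMMBatched launch failed" message.
    """
    # The last two dims are the matrix; everything before is batch.
    *a_batch, m, k = a_shape
    if transpose_b:
        *b_batch, n, k_b = b_shape
    else:
        *b_batch, k_b, n = b_shape
    if a_batch != b_batch or k != k_b:
        raise ValueError("incompatible shapes: %r and %r" % (a_shape, b_shape))
    # Flatten all leading dims into one batch count.
    batch_size = 1
    for dim in a_batch:
        batch_size *= dim
    return m, n, k, batch_size


# Shapes copied from the log (two context tokens).
print(batched_gemm_params([25, 2, 64], [25, 2, 64]))  # (2, 2, 64, 25)
# Assumed shape for a one-token prompt, which runs without crashing.
print(batched_gemm_params([25, 1, 64], [25, 1, 64]))  # (1, 1, 64, 25)
```

Under these assumptions, the only difference between the working and the failing case is m and n going from 1 to 2. The shapes themselves are valid for a matmul, so they do not explain the crash on their own.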

@huangh12

huangh12 commented May 23, 2023

Update
The problem occurs with TF 1.12/1.15 and TF 2.0, but disappears with TF 2.3.0.


I ran into this problem too. After hours of debugging, it looks like a bug in TensorFlow. Suppose you input three tokens: model.py then computes something like the logic below in w = tf.matmul(q, k, transpose_b=True), which is fine during graph construction but crashes when the session runs.

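import tensorflow as tf  # TF 1.x API (uses tf.Session)

# Same layout as q and k for a three-token prompt: [batch, heads, tokens, features]
a = tf.random.uniform([1, 12, 3, 64])
b = tf.random.uniform([1, 12, 3, 64])
c = tf.matmul(a, b, transpose_b=True)  # graph construction succeeds
with tf.Session() as sess:
    print(sess.run(c).shape)  # the crash happens here, when the session runs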
