Recursion error in transformer module with NeMo Stable Diffusion #461
Comments
FYI just figured that Adding |
Thanks @athitten! That's really helpful. Tagging triage review. Triage team, beyond the obvious "add support for control flow", I'm curious what our options are here.
Staring at the traceback (rather than running it myself), the problem does not look like the modules themselves (litgpt also uses a for loop over a ModuleList). It looks more as if we have a trace that fails to print itself because of some reference cycle (which might be caused by the interpreter erroneously inserting that into the trace).
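To make the reference-cycle hypothesis above concrete, here is a minimal plain-Python sketch. It does not use Thunder at all: `Node` and `safe_repr` are hypothetical names invented for illustration, standing in for "a trace entry that prints its arguments". It only shows how a cycle in an object graph turns a naive recursive printer into a `RecursionError`, and how tracking the nodes currently being printed avoids it.

```python
class Node:
    """Hypothetical stand-in for a trace entry; not a Thunder class."""

    def __init__(self, name, args=()):
        self.name = name
        self.args = list(args)

    def __repr__(self):
        # Naive recursive printer: never terminates if the graph has a cycle.
        inner = ", ".join(repr(a) for a in self.args)
        return f"{self.name}({inner})"


def safe_repr(node, _active=None):
    """Cycle-aware printer: remembers which nodes are currently being printed."""
    active = set() if _active is None else _active
    if id(node) in active:
        return f"<cycle to {node.name}>"
    active.add(id(node))
    try:
        inner = ", ".join(safe_repr(a, active) for a in node.args)
    finally:
        active.discard(id(node))
    return f"{node.name}({inner})"


# Build a two-node cycle: a -> b -> a
a = Node("a")
b = Node("b", [a])
a.args.append(b)

try:
    repr(a)
    hit_recursion = False
except RecursionError:
    hit_recursion = True

print(hit_recursion)  # True
print(safe_repr(a))   # a(b(<cycle to a>))
```

If the real failure has this shape, the interesting question is where the cycle gets introduced, not how the printer recurses; a guard like `safe_repr` would only help with locating it.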
Hi Team ⚡️, I am currently working on this issue and would like to share how I reproduced the same error (just as a reference for anyone else who is working on it). It is quite similar to the code shown above, just smaller :)
self.embeddings = thunder.jit(self.embeddings)
self.encoder = thunder.jit(self.encoder)
self.final_layer_norm = thunder.jit(self.final_layer_norm)
from transformers import CLIPTokenizer, CLIPTextModel
model = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
inputs = tokenizer(["a photo of a cat", "a photo of a dog"], padding=True, return_tensors="pt")
outputs = model(**inputs)
last_hidden_state = outputs.last_hidden_state
pooled_output = outputs.pooler_output  # pooled (EOS token) states
(cc. @t-vi)
@k223kim debugged this more and the infinite recursion is from
|
A more minimal repro, for creating a test as part of a fix:
|
🐛 Bug
NeMo's Stable Diffusion uses CLIPTextModel from HuggingFace transformers. Using thunder.jit with the CLIPTextModel is causing a RecursionError.
To Reproduce
Steps to reproduce the behavior:
Partial stack trace below:
CC: @tfogal
cc @apaz-cli @tfogal