This repository was archived by the owner on Sep 10, 2025. It is now read-only.

Conversation

@Gasoonjia (Contributor) commented on Sep 26, 2024:

This PR enables the llama3.2-11B model with text-only input.
Note that this PR covers only the CLI pipeline; a separate PR will follow for the OpenAI API update.

(torchchat-test) [gasoonjia@server ~/torchchat-32mm (main|REBASE-i|main)]$ python torchchat.py generate llama3.2-11B --prompt "How are you these days"
Using device=cuda NVIDIA PG509-210
Loading model...
Time to load model: 8.88 seconds
-----------------------------------------------------------
How are you these daysI'm just a computer program, so I don't have feelings or emotions like humans do, but thanks for asking! I'm functioning properly and ready to assist with any questions or tasks you may have. How about you? How's your day going?
========================================


Average tokens/sec (total): 11.66
Average tokens/sec (first token): 1.82
Average tokens/sec (next tokens): 13.04

@pytorch-bot (bot) commented on Sep 26, 2024:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1216

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Cancelled Job

As of commit c0c8033 with merge base ec7b510:

NEW FAILURE - The following job has failed:

  • pull / runner-et (macos-14-xlarge) (gh)
    RuntimeError: module was compiled against NumPy C-API version 0x10 (NumPy 1.23) but the running NumPy has C-API version 0xe. Check the section C-API incompatibility at the Troubleshooting ImportError section at https://numpy.org/devdocs/user/troubleshooting-importerror.html#c-api-incompatibility for indications on how to solve this problem.

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Sep 26, 2024.
is_multimodal = False

- seq_len = tokens.size(1)
+ seq_len = x.size(1)
Contributor commented:

What's the difference between tokens and x?

Contributor Author (@Gasoonjia) replied:

No difference; both refer to the text input.
Previously we pulled the text input out of the batch inside prefill and called it tokens.
Now we pull the text input out first, then pass it into the prefill function and call it x.
We renamed it because other single-modality models usually use x to represent the text token inputs.
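
For illustration, here is a minimal sketch of the refactor described above; the function and argument names are assumptions based on this thread, not the exact torchchat signatures.

# Before: prefill pulled the text tokens out of the batch itself.
def prefill_old(model, batch):
    tokens = batch["tokens"]  # text input extracted inside prefill
    seq_len = tokens.size(1)
    return model(tokens), seq_len

# After: the caller extracts the text input and passes it in as x,
# matching the x convention other single-modality models use.
def prefill_new(model, x):
    seq_len = x.size(1)
    return model(x), seq_len

# Caller side (sketch):
#   x = batch["tokens"]
#   logits, seq_len = prefill_new(model, x)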

@joecummings (Member) commented:

You can also check for multimodal content like this so you don't have to check the model type explicitly:

https://github.com/pytorch/torchtune/blob/7da96d18dbddbfcf77e045fe149b1efb6866681d/recipes/dev/generate_v2.py#L123
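
As a rough sketch of that approach (the "encoder_input" key below is an assumption for illustration, not necessarily the exact torchtune/torchchat field name), the check inspects the batch contents rather than the model type:

def batch_is_multimodal(batch: dict) -> bool:
    # Treat the input as multimodal if it carries image/encoder content,
    # regardless of which model class is being run.
    return batch.get("encoder_input") is not None

# Usage (sketch):
#   is_multimodal = batch_is_multimodal(batch)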

@Gasoonjia (Contributor, Author) replied:

@joecummings thanks for your suggestions!
Just curious: how do you handle a mismatch between the input and the model's requirements, e.g., when a user passes an image input to a single-modality (text-only) model?

@Jack-Khuu (Contributor) commented:

The failing test is not relevant. Pushing this through.

@Jack-Khuu merged commit e4b36f9 into main on Sep 27, 2024; 49 of 51 checks passed.