Add support codegen2 #209
Conversation
Oh this is wonderful! I didn't realize CodeGen2 was still a standard GPT-J model! I'll try to test this out as soon as possible and get it merged :) |
Can it be merged? :) |
Great job. Has anyone tried CodeGen-2? Is it worth upgrading? |
Any update? I would really like to try this out. |
Also interested: has anyone tried CodeGen-2, and is it worth upgrading? |
Is it possible to test this PR by pulling the branch, following the build steps, and editing the setup.sh to add the codegen2.5 model as an option? |
CodeGen 2.5 is based on LLama architecture, no longer on CodeGen architecture. |
@michaelfeil Hi, I read the CodeGen 2.5 blog post, and Salesforce does indeed serve it and evaluate its latency on the NVIDIA Triton server. Do you know how to serve CodeGen 2.5 with Triton? I suspect there are other ways to support CodeGen-based models besides converting them to GPT-J. |
I would look for tutorials on how to run llama-2-7b on Triton, and start from there. |
Closed in favor of #230 |
1. General Description
This PR adds support for converting CodeGen-2 models to the GPT-J format.
It does not modify the functionality of the existing converter. The PR looks quite small, but it took hours of debugging to figure out that the CodeGen2 architecture is fully compatible with GPT-J for the large variants (7B and 16B).
The smaller variants (1B and 3.7B) were trained with a different TPU sharding configuration and therefore require a different permutation order when splitting the weights.
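The core of such a conversion is splitting CodeGen's fused qkv projection into the separate q, k, v matrices that GPT-J expects. A minimal sketch of that split is below; it is not the PR's actual code. The `n_shards` parameter is a hypothetical stand-in for the TPU sharding factor mentioned above, which is why the correct value differs between the small and large CodeGen2 variants.

```python
import numpy as np

def split_fused_qkv(qkv_weight: np.ndarray, n_shards: int = 4):
    """Split a CodeGen-style fused qkv weight of shape (3*hidden, hidden)
    into separate q, k, v matrices of shape (hidden, hidden) each.

    Assumes each of the n_shards TPU shards stored its own contiguous
    q, k, v row blocks back to back; n_shards must divide hidden.
    """
    hidden = qkv_weight.shape[-1]
    assert qkv_weight.shape[0] == 3 * hidden
    assert hidden % n_shards == 0
    # View as (shard, {q,k,v}, rows-per-shard, hidden): axis 1 selects
    # which projection a row block belongs to within its shard.
    blocks = qkv_weight.reshape(n_shards, 3, hidden // n_shards, hidden)
    q = blocks[:, 0].reshape(hidden, hidden)
    k = blocks[:, 1].reshape(hidden, hidden)
    v = blocks[:, 2].reshape(hidden, hidden)
    return q, k, v

# Tiny usage example: tag each shard's q/k/v rows with 0/1/2 and
# check that the split recovers them.
hidden, shards = 8, 4
rows_per = hidden // shards
qkv = np.zeros((3 * hidden, hidden))
for g in range(shards):
    base = g * 3 * rows_per
    qkv[base + rows_per: base + 2 * rows_per] = 1.0  # k rows
    qkv[base + 2 * rows_per: base + 3 * rows_per] = 2.0  # v rows
q, k, v = split_fused_qkv(qkv, n_shards=shards)
```

Passing the wrong `n_shards` would still produce tensors of the right shape, just with rows assigned to the wrong projection, which is exactly the kind of silent error that makes this bug hard to debug.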
2. Changes proposed in this PR:
Resolves: #202
3. How to evaluate:
Describe how to evaluate so that the result can be reproduced by the reviewer(s).
1.
Self assessment:
docker-compose build