Add Neuronx TGI #36
Conversation
@@ -0,0 +1,140 @@
# Fetch and extract the TGI sources
Instead of including the sources for the tgi server in this repo, I would rather use those from the optimum-neuron repo that are likely to be up-to-date.
You can cherry-pick the tip of my branch in my fork for instance:
https://github.com/dacorvo/llm-hosting-container/tree/neuronx-tgi
For GPU we needed the Dockerfile to be separate due to legal reviews.
If you look at my branch I keep the Dockerfile, but I added a step to fetch the optimum-neuron sources and install from them directly.
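A step along those lines might look like the following Dockerfile fragment. This is a sketch, not the actual contents of the branch: the clone path, build argument name, and pinned ref are all assumptions.

```dockerfile
# Hypothetical fragment: fetch the optimum-neuron sources at build time
# instead of vendoring them into this repo, then install from the checkout.
# OPTIMUM_NEURON_REF is a placeholder; pin a real tag or commit in practice.
ARG OPTIMUM_NEURON_REF=main
RUN git clone https://github.com/huggingface/optimum-neuron.git /opt/optimum-neuron \
    && cd /opt/optimum-neuron \
    && git checkout "${OPTIMUM_NEURON_REF}" \
    && pip install /opt/optimum-neuron
```

Pinning the ref via a build argument keeps image builds reproducible while still tracking the upstream repo.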
Ah nice, can you open a PR to my branch? Then we can keep this one.
Note that I updated my branch to also remove the server files, so the commit I referenced before is invalid.
You can still cherry-pick the tip of https://github.com/dacorvo/llm-hosting-container/tree/neuronx-tgi
to get the correct commit.
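One way to pick up the tip of that branch locally is to add the fork as a remote. The remote name below is arbitrary; the fork URL and branch name come from the link above.

```shell
# Add the fork as a remote, fetch the branch, and cherry-pick its tip
# onto the current branch.
git remote add dacorvo https://github.com/dacorvo/llm-hosting-container.git
git fetch dacorvo neuronx-tgi
git cherry-pick dacorvo/neuronx-tgi
```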
We can also close this PR and open a new one.
I sent a PR to your branch as requested.
updated 👍🏻
fix(tgi): install optimum-neuron and tgi server from original repo
Discarding in favor of #47.
What does this PR do?
This PR adds the Dockerfile and utilities to run TGI on AWS Inferentia2. It also adds instructions on how to build the container. The build involves more steps than for GPU since we need to build the optimum library first.
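The build flow might be sketched as follows; the paths, tag, and the wheel-building step are assumptions about the workflow, not the documented commands.

```shell
# Illustrative build steps: build the optimum library first,
# then build the container image.
git clone https://github.com/huggingface/optimum-neuron.git
pip wheel --no-deps ./optimum-neuron -w dist/   # hypothetical wheel-build step
docker build -t neuronx-tgi .                   # tag name is illustrative
```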