Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Neuronx TGI #36

Closed
wants to merge 4 commits into from
Closed

Add Neuronx TGI #36

wants to merge 4 commits into from

Conversation

philschmid
Copy link
Contributor

What does this PR?

This PR adds the Dockerfile and utiltilies to run TGI on AWS infernetia2. It also adds instruction on how to build the container. It includes more steps than for GPU since we need to build the optimum library first.

@philschmid philschmid marked this pull request as ready for review November 13, 2023 17:29
@@ -0,0 +1,140 @@
# Fetch and extract the TGI sources
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of including the sources for the tgi server in this repo, I would rather use those from the optimum-neuron repo that are likely to be up-to-date.

Copy link
Contributor

@dacorvo dacorvo Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can cherry-pick the tip of my branch in my fork for instance:
https://github.com/dacorvo/llm-hosting-container/tree/neuronx-tgi

Copy link
Contributor Author

@philschmid philschmid Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for GPU we needed the dockerfile to be separate due to legal reviews.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at my branch I keep the Dockerfile, but I added a step to fetch the optimum-neuron sources and install from them directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah nice can you open a PR to my branch then we can keep this one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I updated my branch to also remove the server files, so the commit i referenced before is invalid.
You can still git cherry-pick https://github.com/dacorvo/llm-hosting-container/tree/neuronx-tgi to get the correct commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can also close this PR and open a new one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sent a pr to your branch as requested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated 👍🏻

@amzn-choeric
Copy link
Contributor

Discarding in favor of #47.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants