ADD TGI v1.0.2 #22

Merged: 8 commits, Aug 28, 2023


philschmid
Contributor

@philschmid philschmid commented Aug 14, 2023

What does this PR do

This PR adds TGI 1.0.1, which includes several improvements and bug fixes compared to 0.9.3.

See:

Features to highlight are:

  • GPTQ int4 inference for llama
  • Exllama kernels for lower latency
  • BNB int4 inference
  • RoPE scaling for deployment, e.g. Amazon/FalconLite

@philschmid
Contributor Author

I cannot add reviewers, so pinging here: @xyang16 @frankfliu @ashivadi

@xyang16
Contributor

xyang16 commented Aug 14, 2023

@amzn-choeric

@weiZhenkun

@amzn-choeric can you review this PR?

@philschmid philschmid changed the title from "ADD TGI v1.0.1" to "ADD TGI v1.0.2" Aug 23, 2023
@philschmid
Contributor Author

Updated to 1.0.2 with the fix for tokenizers

@philschmid
Contributor Author

Thank you for adding the clean and license @amzn-choeric

Contributor

@amzn-choeric amzn-choeric left a comment


Noting that I am looking into slightly changing that one line for THIRD-PARTY-LICENSES, but that can be done as a follow-up PR.

@amzn-choeric amzn-choeric merged commit 42b6fb8 into awslabs:main Aug 28, 2023
4 participants