Hosting bloom-7b1 on Amazon SageMaker using HuggingFace Text Generation Inference (TGI)

TGI Architecture

Text Generation Inference (TGI) is a Rust, Python and gRPC server for text generation inference.

This notebook shows how to deploy bigscience/bloom-7b1, an open-access multilingual language model, to an Amazon SageMaker real-time endpoint with the TGI backend.
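As a rough illustration of that deployment, the sketch below uses the SageMaker Python SDK's `HuggingFaceModel` and `get_huggingface_llm_image_uri`. It is not taken from the notebook itself: the helper names (`build_tgi_env`, `build_payload`, `deploy`), the `ml.g5.2xlarge` instance type, the token limits, and the startup timeout are all assumptions chosen for illustration, and the `sagemaker` import is deferred so the configuration helpers can be run without AWS access.

```python
import json


def build_tgi_env(model_id="bigscience/bloom-7b1", num_gpus=1,
                  max_input_length=1024, max_total_tokens=2048):
    """Build the environment variables passed to the TGI container.

    The token limits are illustrative assumptions, not values from the notebook.
    """
    if max_input_length >= max_total_tokens:
        raise ValueError("max_input_length must be smaller than max_total_tokens")
    # SageMaker expects every environment value to be a string.
    return {
        "HF_MODEL_ID": model_id,
        "SM_NUM_GPUS": str(num_gpus),
        "MAX_INPUT_LENGTH": str(max_input_length),
        "MAX_TOTAL_TOKENS": str(max_total_tokens),
    }


def build_payload(prompt, max_new_tokens=64):
    """Serialize a request body in the JSON shape TGI accepts."""
    return json.dumps({
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    })


def deploy(role_arn, instance_type="ml.g5.2xlarge"):
    """Create a real-time endpoint. Requires AWS credentials and incurs cost.

    The instance type and the health-check timeout are assumptions.
    """
    # Imported lazily so the helpers above work without the SDK installed.
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    image_uri = get_huggingface_llm_image_uri("huggingface")
    model = HuggingFaceModel(role=role_arn, image_uri=image_uri, env=build_tgi_env())
    return model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        # Allow time for the model weights to download and load.
        container_startup_health_check_timeout=600,
    )
```

With an execution role available, `deploy(role_arn)` would return a predictor, and a request could then be sent with something like `predictor.predict({"inputs": "Hello", "parameters": {"max_new_tokens": 64}})`.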

A list of architectures optimized for hosting with TGI can be found in reference 3 below.

References

  1. https://github.com/huggingface/text-generation-inference
  2. https://huggingface.co/bigscience/bloom-7b1
  3. https://github.com/huggingface/text-generation-inference#optimized-architectures