This repository will help you get started using NVIDIA's Triton Inference Server with Outerbounds (built on open-source Metaflow).
Each subdirectory contains a getting-started toolkit for using Triton with Outerbounds. You can find detailed instructions in the README file in each subdirectory.
- The trees directory provides a template for using Metaflow to orchestrate training and tuning of scikit-learn, XGBoost, or LightGBM models, pushing the resulting model to cloud storage so it is ready to be served by a Triton Inference Server.
- The llm directory provides a template for using Metaflow to orchestrate fine-tuning of transformer models, pushing the resulting model and tokenizer state to cloud storage so they are ready to be served by a Triton Inference Server.
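Both templates end the same way: a trained model is written into the directory layout Triton expects (a model repository with one folder per model, a config.pbtxt, and numbered version subdirectories) and then uploaded to cloud storage. The sketch below uses only the standard library to illustrate that layout; the helper name `write_triton_layout`, the backend string, and the config fields are illustrative assumptions and depend on which Triton backend serves your model.

```python
from pathlib import Path


def write_triton_layout(repo: Path, model_name: str, version: int,
                        model_filename: str, model_bytes: bytes,
                        config_text: str) -> Path:
    """Write one entry of a Triton-style model repository:

    repo/
      <model_name>/
        config.pbtxt
        <version>/
          <model_filename>
    """
    model_dir = repo / model_name
    version_dir = model_dir / str(version)
    version_dir.mkdir(parents=True, exist_ok=True)
    (model_dir / "config.pbtxt").write_text(config_text)
    (version_dir / model_filename).write_bytes(model_bytes)
    return model_dir


if __name__ == "__main__":
    # Illustrative config only: real name, backend, inputs, and outputs
    # depend on your model type (e.g. tree ensembles vs. transformers).
    config = 'name: "my_tree_model"\nbackend: "fil"\nmax_batch_size: 8\n'
    write_triton_layout(Path("model_repository"), "my_tree_model", 1,
                        "model.json", b"<serialized model bytes>", config)
```

The flows in this repo push an equivalent directory tree to cloud storage, from which a Triton server can load it directly.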
Prerequisites:
- An inference server where you want to host models. It is best to have a GPU, especially for the /llm workflows.
- Access to a Metaflow deployment.
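For the inference-server prerequisite, one common setup is to run Triton from NVIDIA's published container image and point it at the model repository the flows upload. This is a sketch, not the repo's prescribed setup: the image tag, ports, and bucket path below are placeholders you would replace with your own.

```shell
# Run the Triton server container (pick a current release tag from NGC).
# --model-repository can point at a local path or at cloud storage
# such as an S3 bucket where the Metaflow flows pushed the model.
docker run --rm --gpus=all \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  nvcr.io/nvidia/tritonserver:23.10-py3 \
  tritonserver --model-repository=s3://your-bucket/triton-models
```

Ports 8000/8001/8002 are Triton's default HTTP, gRPC, and metrics endpoints.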