In this example we show an end-to-end GPU-accelerated fraud detection workflow using tree-based models like XGBoost. In the first notebook, Data Preprocessing using RAPIDS and Training XGBoost for Fraud Detection, we demonstrate GPU-accelerated tabular data preprocessing using RAPIDS and training of an XGBoost model for fraud detection on the GPU in SageMaker. In the second notebook, Pre-processing and XGBoost model inference pipeline with NVIDIA Triton Inference Server on Amazon SageMaker, we walk through deploying a data preprocessing and XGBoost model inference pipeline for high-throughput, low-latency inference on Triton in SageMaker.
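To give a sense of what the first notebook does, here is a minimal sketch of GPU-accelerated preprocessing with RAPIDS cuDF followed by GPU training of an XGBoost classifier. The file name, column names, and hyperparameters are placeholders for illustration, not the notebook's actual code.

```python
import cudf
import xgboost as xgb

# Hypothetical file and column names for illustration only; the first notebook
# works with its own credit card transactions dataset and feature engineering.
df = cudf.read_csv("transactions.csv")   # read the CSV directly into GPU memory

# Example GPU-accelerated feature transform on a numeric column
df["amount_scaled"] = (df["amount"] - df["amount"].mean()) / df["amount"].std()

X = df.drop(columns=["is_fraud"])
y = df["is_fraud"]

# XGBoost can consume cuDF DataFrames directly and train on the GPU
dtrain = xgb.DMatrix(X, label=y)
params = {
    "objective": "binary:logistic",
    "tree_method": "gpu_hist",   # GPU histogram-based training
    "max_depth": 8,
    "eval_metric": "aucpr",
}
booster = xgb.train(params, dtrain, num_boost_round=100)
booster.save_model("xgboost.json")
```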
- Launch a SageMaker notebook instance with a `g4dn.xlarge` instance. This example can also be run on a SageMaker Studio notebook instance, but the steps that follow focus on the notebook instance.
  - In Additional Configuration select `Create a new lifecycle configuration`. Specify `rapids-2106` as the name in Configuration Setting and copy-paste the on_start.sh script as the lifecycle configuration start notebook script. This creates the RAPIDS kernel for us to use inside the SageMaker notebook. (A boto3-based alternative to this console step is sketched at the bottom of this page.)
  - For those using AWS on a Windows machine: because of the incompatibility between Windows- and Unix-formatted text, especially in end-of-line characters, you will run into an error if you copy-paste the on_start.sh script. To prevent that, use Notepad++ (or another text editor) to change the end-of-line characters from CRLF to LF in the on_start.sh script:
    - Click on Search > Replace (or Ctrl + H)
    - Find what: \r\n
    - Replace with: \n
    - Search Mode: select Extended
    - Replace All, and then copy-paste the script into the AWS Lifecycle Configuration Start Notebook UI
  - IMPORTANT: In Additional Configuration, for Volume Size in GB specify at least 50 GB.
  - For Git repositories select the option `Clone a public git repository to this notebook instance only` and specify the Git repository URL https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker-triton/fil_ensemble
- Once JupyterLab is ready, launch the 1_prep_rapids_train_xgb.ipynb notebook with the `rapids-2106` conda kernel and run through it to perform GPU-accelerated data preprocessing and XGBoost training on a credit card transactions dataset for the fraud detection use case. Make sure to use the `rapids-2106` kernel for this notebook.
- Launch the 2_triton_xgb_fil_ensemble.ipynb notebook using the `conda_python3` kernel (we don't use RAPIDS in this notebook). Make sure to use the `conda_python3` kernel for this notebook. Please note that this notebook requires the first notebook to have been run to create the required dependencies. Run through this notebook to learn how to deploy the ensemble data preprocessing + XGBoost model inference pipeline using Triton's Python and FIL backends on a Triton SageMaker `g4dn.xlarge` endpoint (an invocation sketch follows below).
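As a rough illustration of how a client might call the deployed ensemble, here is a minimal sketch that sends a single transaction to the SageMaker endpoint using Triton's KServe v2 JSON inference format. The endpoint name, input tensor name, and feature count are placeholders; the second notebook defines the actual ensemble inputs, outputs, and request format.

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

# Placeholder input tensor name and feature count; the second notebook defines
# the real ensemble input names and shapes.
payload = {
    "inputs": [
        {
            "name": "INPUT",     # hypothetical input tensor name
            "shape": [1, 28],    # one transaction with 28 raw features (assumed)
            "datatype": "FP32",
            "data": [0.0] * 28,  # flattened feature values
        }
    ]
}

response = runtime.invoke_endpoint(
    EndpointName="triton-fil-ensemble-endpoint",  # placeholder endpoint name
    ContentType="application/octet-stream",       # JSON body of Triton's v2 inference protocol
    Body=json.dumps(payload),
)
result = json.loads(response["Body"].read().decode("utf-8"))
print(result["outputs"])  # fraud prediction tensor returned by the ensemble
```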
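For the lifecycle configuration step above, if you prefer not to paste on_start.sh into the console UI, the same configuration can be registered programmatically with boto3. This is a minimal sketch assuming on_start.sh is available locally; the script itself must still use Unix (LF) line endings.

```python
import base64
import boto3

sm = boto3.client("sagemaker")

# Read the on_start.sh script; the API expects base64-encoded content.
with open("on_start.sh", "rb") as f:
    on_start_content = base64.b64encode(f.read()).decode("utf-8")

# Register the script as the start-notebook lifecycle configuration named rapids-2106
sm.create_notebook_instance_lifecycle_config(
    NotebookInstanceLifecycleConfigName="rapids-2106",
    OnStart=[{"Content": on_start_content}],
)
```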