Serve a ResNet-50 ONNX model on GPU with an Amazon SageMaker endpoint (SME) with Triton

This example walks you through using NVIDIA Triton Inference Server on an Amazon SageMaker endpoint (SME) with a GPU to deploy a ResNet-50 ONNX model for image classification.
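As a rough illustration of what deploying an ONNX model with Triton involves, the sketch below lays out a Triton model repository (`<model_name>/config.pbtxt` plus `<model_name>/1/model.onnx`) and packs it into a `model.tar.gz` archive. It is not taken from the notebook: the model name `resnet`, the tensor names `input` and `output`, the shapes, and the batch size are all illustrative assumptions, and the real values depend on how the ONNX model was exported.

```python
import tarfile
from pathlib import Path
from string import Template

# Assumed Triton model configuration. The tensor names, dims, and
# max_batch_size are hypothetical placeholders, not values from the notebook.
CONFIG_TEMPLATE = Template("""\
name: "$model_name"
platform: "onnxruntime_onnx"
max_batch_size: 128
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
""")


def build_model_archive(onnx_path: Path, workdir: Path, model_name: str = "resnet") -> Path:
    """Create <model_name>/config.pbtxt and <model_name>/1/model.onnx, then tar them."""
    model_dir = workdir / model_name
    version_dir = model_dir / "1"  # Triton expects numeric version subdirectories
    version_dir.mkdir(parents=True, exist_ok=True)

    (model_dir / "config.pbtxt").write_text(
        CONFIG_TEMPLATE.substitute(model_name=model_name)
    )
    (version_dir / "model.onnx").write_bytes(onnx_path.read_bytes())

    archive = workdir / "model.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Keep the model directory as the top-level entry in the archive.
        tar.add(model_dir, arcname=model_name)
    return archive
```

The resulting archive would typically be uploaded to Amazon S3 and referenced when creating the SageMaker model. The notebook itself remains the authoritative source for the exact configuration.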

Steps to run the notebook

1. Launch a SageMaker notebook instance of type g4dn.xlarge. This example can also be run in a SageMaker Studio notebook, but the steps that follow focus on the notebook instance.
   - For Git repositories, select the option "Clone a public Git repository to this notebook instance only" and specify the Git repository URL.
2. Once JupyterLab is ready, launch the resnet_onnx_backend_SME_triton_v2.ipynb notebook with the conda_python3 kernel and run through it to learn how to host multiple CV models on a g4dn.xlarge GPU behind an SME endpoint.
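The console steps above can also be scripted. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The instance name, IAM role ARN, and Git URL are placeholders you must supply yourself; the README does not state the repository URL, so none is filled in here.

```python
def notebook_instance_request(name: str, role_arn: str, git_url: str) -> dict:
    """Build the parameters for the SageMaker CreateNotebookInstance API call."""
    return {
        "NotebookInstanceName": name,
        # The API spells the g4dn.xlarge instance type with an "ml." prefix.
        "InstanceType": "ml.g4dn.xlarge",
        "RoleArn": role_arn,
        # Passing a URL here clones a public Git repository to this instance.
        "DefaultCodeRepository": git_url,
    }


def create_notebook_instance(request: dict) -> dict:
    """Send the request to SageMaker (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("sagemaker")
    return client.create_notebook_instance(**request)
```

For example, `create_notebook_instance(notebook_instance_request("triton-demo", role_arn, git_url))` would start the instance, where `triton-demo` is a hypothetical name and `role_arn` and `git_url` are your own values.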

Note: This notebook was tested with the conda_pytorch_p39 kernel on an Amazon SageMaker notebook instance of type g4dn.xlarge. It is a modified version of the original sample notebook by Vikram Elango.
