
Batch serving

Batch inference is about using distributed data processing infrastructure to carry out inference asynchronously on a large number of instances at once.

What to optimize: throughput; batch jobs are not latency-sensitive.

End user: usually has no direct interaction with the model. Users interact with the predictions that the batch jobs write to data storage.

Validation: offline
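To make the pattern concrete, here is a minimal sketch of a batch scoring job in Python. The file name, column name, and paths are hypothetical; it assumes a scikit-learn model serialized with joblib and tabular batches stored as Parquet, and a real job would read from and write to cloud storage such as Azure Blob Storage or a data lake.

```python
# score_batch.py - minimal batch scoring sketch (hypothetical file and column names).
# Assumes a scikit-learn model serialized with joblib and tabular input in Parquet.
import joblib
import pandas as pd


def main(model_path: str, input_path: str, output_path: str) -> None:
    model = joblib.load(model_path)              # load the trained model once per job
    batch = pd.read_parquet(input_path)          # read the whole batch of instances
    batch["prediction"] = model.predict(batch)   # score all rows in one vectorized call
    batch.to_parquet(output_path, index=False)   # persist predictions for downstream consumers


if __name__ == "__main__":
    main("model.joblib", "input_batch.parquet", "predictions.parquet")
```

A script like this is typically wrapped in a pipeline step and triggered on a schedule rather than called interactively, which is the shape of the workshop described below.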


Where to start

Learn general MLOps concepts:

Next, learn how to build and run pipelines for batch serving on the Azure cloud:

or, more broadly:


Next step: Advanced workshop: Azure Batch Serving Pipelines

This workshop is a work in progress (WIP).

It will cover a real-life use case of building, publishing, scheduling, and troubleshooting Batch Serving pipelines on Azure with a Python runtime.
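As a hedged preview of the kind of pipeline the workshop will build (the workshop's exact stack is not stated here), the sketch below uses the Azure Machine Learning Python SDK v1 pipeline APIs to define, publish, and schedule a single-step batch scoring pipeline. The workspace configuration, compute cluster name, script, and schedule times are assumptions for illustration.

```python
# Sketch: publish and schedule a batch scoring pipeline with the Azure ML SDK v1.
# Workspace config, compute target, and script names are assumptions for illustration.
from azureml.core import Workspace
from azureml.pipeline.core import Pipeline
from azureml.pipeline.core.schedule import Schedule, ScheduleRecurrence
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()  # reads the workspace config.json downloaded from the portal

score_step = PythonScriptStep(
    name="batch-score",
    script_name="score_batch.py",   # the scoring script sketched earlier
    source_directory="./src",
    compute_target="cpu-cluster",   # assumed AmlCompute cluster name
    allow_reuse=False,
)

pipeline = Pipeline(workspace=ws, steps=[score_step])
published = pipeline.publish(
    name="batch-serving-pipeline",
    description="Nightly batch scoring",
)

# Run the published pipeline every day at 02:00.
recurrence = ScheduleRecurrence(frequency="Day", interval=1, hours=[2], minutes=[0])
Schedule.create(
    ws,
    name="nightly-batch-scoring",
    pipeline_id=published.id,
    experiment_name="batch-serving",
    recurrence=recurrence,
)
```

Publishing gives the pipeline a stable ID that the schedule references, so scoring runs recur without manual triggering; predictions land in storage for downstream consumers, matching the batch serving pattern described above.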