This repository is a companion to the Medium post: "Streamlining Serverless ML Inference: Unleashing Candle Framework's Power in Rust". It provides a practical guide to implementing a vector embedding and search REST service using the Candle framework and Axum in Rust.
The post addresses the challenges of scaling machine learning models in high-throughput, low-latency environments. It explores the Candle framework, a minimalist Rust-based ML framework focused on performance and ease of use, which makes it well suited to cloud-native serverless environments.
Feel free to examine the other branches, which add a few LLMs such as OpenChat and Phi to the toolset here.
The post covers:
- Section 2: Design and components of the vector embedding and search service.
- Section 3: Detailed implementation using the Candle framework with a BERT model (see the embedding sketch after this list).
- Section 4: Wrapping the model inference in a REST web service using Axum (see the Axum sketch below).
- Section 5: Creation of embedding artifacts and service setup (see the safetensors sketch below).
- Section 6: Conclusion and further insights.
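As a taste of Section 3, here is a minimal sketch of loading a BERT model with Candle and computing a mean-pooled sentence embedding. It assumes the `candle-core`, `candle-nn`, `candle-transformers`, `tokenizers`, `hf-hub`, `serde_json`, and `anyhow` crates; the model id `sentence-transformers/all-MiniLM-L6-v2` is an illustrative choice, not necessarily the one used in the post.

```rust
use candle_core::{Device, Tensor};
use candle_nn::VarBuilder;
use candle_transformers::models::bert::{BertModel, Config, DTYPE};
use hf_hub::{api::sync::Api, Repo, RepoType};
use tokenizers::Tokenizer;

fn main() -> anyhow::Result<()> {
    let device = Device::Cpu;

    // Fetch the model files from the Hugging Face Hub (illustrative model id).
    let repo = Api::new()?.repo(Repo::new(
        "sentence-transformers/all-MiniLM-L6-v2".to_string(),
        RepoType::Model,
    ));
    let config: Config =
        serde_json::from_str(&std::fs::read_to_string(repo.get("config.json")?)?)?;
    let tokenizer =
        Tokenizer::from_file(repo.get("tokenizer.json")?).map_err(anyhow::Error::msg)?;
    let vb = unsafe {
        VarBuilder::from_mmaped_safetensors(&[repo.get("model.safetensors")?], DTYPE, &device)?
    };
    let model = BertModel::load(vb, &config)?;

    // Tokenize a sentence and run the forward pass.
    let encoding = tokenizer
        .encode("Candle makes Rust inference simple.", true)
        .map_err(anyhow::Error::msg)?;
    let token_ids = Tensor::new(encoding.get_ids(), &device)?.unsqueeze(0)?;
    let token_type_ids = token_ids.zeros_like()?;
    // NOTE: newer candle versions also take an optional attention mask here.
    let hidden = model.forward(&token_ids, &token_type_ids)?;

    // Mean-pool over the token dimension to get one vector per sentence.
    let (_batch, n_tokens, _hidden_size) = hidden.dims3()?;
    let embedding = (hidden.sum(1)? / (n_tokens as f64))?;
    println!("embedding dims: {:?}", embedding.dims());
    Ok(())
}
```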
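For Section 4, a sketch of how such a model can be wrapped in an Axum handler. It assumes axum 0.7 with tokio; `Embedder`, `EmbedRequest`, and the `/embed` route are hypothetical names, and the embedding call is stubbed out where the real service would run the Candle forward pass from the previous sketch.

```rust
use std::sync::Arc;

use axum::{extract::State, routing::post, Json, Router};
use serde::{Deserialize, Serialize};

/// Hypothetical wrapper around the Candle model and tokenizer.
struct Embedder;

impl Embedder {
    fn embed(&self, _text: &str) -> Vec<f32> {
        // Placeholder: the real service would run the BERT forward pass here.
        vec![0.0; 384]
    }
}

#[derive(Deserialize)]
struct EmbedRequest {
    text: String,
}

#[derive(Serialize)]
struct EmbedResponse {
    embedding: Vec<f32>,
}

async fn embed_handler(
    State(embedder): State<Arc<Embedder>>,
    Json(req): Json<EmbedRequest>,
) -> Json<EmbedResponse> {
    Json(EmbedResponse {
        embedding: embedder.embed(&req.text),
    })
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Share one loaded model across all request handlers.
    let app = Router::new()
        .route("/embed", post(embed_handler))
        .with_state(Arc::new(Embedder));
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
    axum::serve(listener, app).await?;
    Ok(())
}
```

Loading the model once into shared state keeps per-request latency down, which matters in the serverless setting the post targets.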
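Section 5's embedding artifacts can be persisted in the safetensors format that Candle reads natively. A minimal sketch, with made-up tensor and file names:

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let device = Device::Cpu;

    // Stand-in for a batch of precomputed sentence embeddings (8 vectors, dim 384).
    let embeddings = Tensor::randn(0f32, 1.0, (8, 384), &device)?;

    // Persist as a safetensors artifact the search service can load at startup.
    embeddings.save_safetensors("embeddings", "embeddings.safetensors")?;

    // Load it back, e.g. in the service's initialization path.
    let tensors = candle_core::safetensors::load("embeddings.safetensors", &device)?;
    println!("loaded dims: {:?}", tensors["embeddings"].dims());
    Ok(())
}
```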
Special thanks to Hugging Face for the Candle framework and its examples.
For a detailed walkthrough and insights into the challenges and solutions of serverless ML inference at scale, read the full post on Medium.