LLM-Twin is an AI replica that writes social media posts or technical articles (like this one) in your own voice.
- Crawl your digital data from various social media platforms, such as Medium, Substack, and GitHub.
- Clean, normalize, and load the data into a MongoDB NoSQL DB through a series of ETL pipelines.
- Send database changes to a RabbitMQ queue using the CDC (Change Data Capture) pattern (see the CDC sketch after this list).
- Learn to package the crawlers as AWS Lambda functions.
- Consume messages in real time from the queue through a Bytewax streaming pipeline.
- Every message will be cleaned, chunked, embedded, and loaded into a Qdrant vector DB (see the ingestion sketch after this list).
- In the bonus series, we refactor the cleaning, chunking, and embedding logic using Superlinked, a specialized vector compute engine. We will also load and index the vectors to a Redis vector DB.
- Create a custom instruction dataset based on your custom digital data to do SFT.
- Fine-tune an LLM using LoRA or QLoRA (see the LoRA sketch after this list).
- Use Comet ML's experiment tracker to monitor the experiments.
- Evaluate the LLM using Opik.
- Save and version the best model to the Hugging Face model registry.
- Run and automate the training pipeline using AWS SageMaker.
- Load the fine-tuned LLM from the Hugging Face model registry.
- Deploy the LLM as a scalable REST API using AWS SageMaker inference endpoints (see the invocation sketch after this list).
- Enhance the prompts using advanced RAG techniques.
- Monitor the prompts and the LLM-generated results using Opik.
- In the bonus series, we refactor the advanced RAG layer to write more optimal queries using Superlinked.
- Wrap up everything with a Gradio UI (as seen below) where you can start playing around with the LLM Twin to generate content that follows your writing style.
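To make the CDC step concrete, here is a minimal sketch of watching a MongoDB collection and forwarding every change event to RabbitMQ. The connection URIs, database, collection, and queue names are placeholders, not the course's actual configuration (note that MongoDB change streams require a replica set):

```python
import json

import pika
from bson import json_util
from pymongo import MongoClient

# Placeholder connection settings -- swap in your own.
mongo = MongoClient("mongodb://localhost:27017")
collection = mongo["llm_twin"]["posts"]

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="cdc_events", durable=True)

# watch() opens a MongoDB change stream: every insert/update/delete on the
# collection arrives here as a change document, which we publish to RabbitMQ.
for change in collection.watch():
    channel.basic_publish(
        exchange="",
        routing_key="cdc_events",
        body=json.dumps(change, default=json_util.default),
    )
```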
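The feature pipeline's clean, chunk, embed, and load steps boil down to something like the ingestion sketch below. It deliberately skips the Bytewax dataflow wiring, uses naive fixed-size chunking, and the embedding model and collection names are illustrative only:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct
from sentence_transformers import SentenceTransformer

client = QdrantClient("localhost", port=6333)
model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative embedding model


def chunk(text: str, size: int = 500) -> list[str]:
    # Naive fixed-size chunking; the course uses smarter, content-aware splitting.
    return [text[i : i + size] for i in range(0, len(text), size)]


def ingest(doc_id: int, raw_text: str) -> None:
    chunks = chunk(raw_text.strip())  # "cleaning" is reduced to strip() here
    vectors = model.encode(chunks)    # embed each chunk
    client.upsert(
        collection_name="llm_twin",   # hypothetical collection name
        points=[
            PointStruct(id=doc_id * 10_000 + i, vector=vec.tolist(), payload={"text": chk})
            for i, (vec, chk) in enumerate(zip(vectors, chunks))
        ],
    )
```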
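For the fine-tuning step, attaching LoRA adapters with Hugging Face's peft library looks roughly like this; the base model and hyperparameters are examples, not the course's exact choices:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Example base model -- the course may use a different one.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor applied to the updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
```

QLoRA follows the same recipe but first loads the base model quantized to 4-bit, so the adapters can be trained on much smaller GPUs.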
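Once deployed, the endpoint is invoked like any other REST API. Here is a boto3 sketch; the endpoint name and payload schema are assumptions:

```python
import json

import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="llm-twin-endpoint",  # hypothetical endpoint name
    ContentType="application/json",
    Body=json.dumps({"inputs": "Write a LinkedIn post about RAG pipelines."}),
)
print(json.loads(response["Body"].read()))
```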
Alongside the 4 microservices, you will learn to integrate 4 serverless tools:
- Comet ML as your experiment tracker and data registry (see the logging sketch after this list);
- Qdrant as your vector DB;
- AWS SageMaker as your ML infrastructure;
- Opik as your prompt evaluation and monitoring tool.
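As a taste of the experiment-tracking integration, logging a run to Comet ML takes only a few lines (the project name is illustrative, and a COMET_API_KEY environment variable is assumed):

```python
from comet_ml import Experiment

experiment = Experiment(project_name="llm-twin-course")  # illustrative project name
experiment.log_parameter("lora_rank", 16)
experiment.log_metric("train_loss", 0.42, step=100)
experiment.end()
```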
| Category | Requirements |
| --- | --- |
| Skills | Basic understanding of Python and Machine Learning |
| Hardware | Any modern laptop/workstation will do the job, as the LLM fine-tuning and inference will be done on AWS SageMaker. |
| Level | Intermediate |
At Decoding ML we teach how to build production ML systems, so the course follows the structure of a real-world Python project:
llm-twin-course/
├── src/                       # Source code for all the ML pipelines and services
│   ├── data_crawling/         # Data collection pipeline code
│   ├── data_cdc/              # Change Data Capture (CDC) pipeline code
│   ├── feature_pipeline/      # Feature engineering pipeline code
│   ├── training_pipeline/     # Training pipeline code
│   ├── inference_pipeline/    # Inference service code
│   └── bonus_superlinked_rag/ # Bonus RAG optimization code
├── .env.example               # Example environment variables template
├── Makefile                   # Commands to build and run the project
└── pyproject.toml             # Project dependencies
To understand how to install and run the LLM Twin code end-to-end, see the dedicated INSTALL_AND_USAGE document.
This project is based on the LLM Twin: Building Your Production-Ready AI Replica course by Decoding ML. Special thanks to the original authors and contributors for their invaluable course.
This course is an open-source project released under the MIT license. Thus, as long as you distribute our LICENSE and acknowledge our work, you can safely clone or fork this project and use it as a source of inspiration for whatever you want (e.g., university projects, college degree projects, personal projects, etc.).