LLM-Twin: Production-Ready AI Replica

An end-to-end production-ready LLM & RAG system.

From data gathering to productionizing LLMs using LLMOps best practices.



🤖 What is LLM-Twin

LLM-Twin is an AI replica that writes social media posts or technical articles (like this one) in your own voice.

🪈 The architecture of the LLM Twin is split into 4 Python microservices

LLM Twin Architecture

The data collection pipeline

  • Crawl your digital data from various social media platforms, such as Medium, Substack and GitHub.
  • Clean, normalize and load the data to a Mongo NoSQL DB through a series of ETL pipelines.
  • Send database changes to a RabbitMQ queue using the CDC pattern.
  • Learn to package the crawlers as AWS Lambda functions.

The feature pipeline

  • Consume messages in real-time from a queue through a Bytewax streaming pipeline.
  • Every message will be cleaned, chunked, embedded and loaded into a Qdrant vector DB.
  • In the bonus series, we refactor the cleaning, chunking, and embedding logic using Superlinked, a specialized vector compute engine. We will also load and index the vectors to a Redis vector DB.

The training pipeline

  • Create a custom instruction dataset based on your custom digital data to do SFT.
  • Fine-tune an LLM using LoRA or QLoRA.
  • Use Comet ML's experiment tracker to monitor the experiments.
  • Evaluate the LLM using Opik
  • Save and version the best model to the Hugging Face model registry.
  • Run and automate the training pipeline using AWS SageMaker.

The inference pipeline

  • Load the fine-tuned LLM from the Hugging Face model registry.
  • Deploy the LLM as a scalable REST API using AWS SageMaker inference endpoints.
  • Enhance the prompts using advanced RAG techniques.
  • Monitor the prompts and LLM generated results using Opik
  • In the bonus series, we refactor the advanced RAG layer to write more optimal queries using Superlinked.
  • Wrap up everything with a Gradio UI (as seen below) where you can start playing around with the LLM Twin to generate content that follows your writing style.

Gradio UI

Alongside the 4 microservices, you will learn to integrate 4 serverless tools:

  • Comet ML as your experiment tracker and data registry;
  • Qdrant as your vector DB;
  • AWS SageMaker as your ML infrastructure;
  • Opik as your prompt evaluation and monitoring tool.

🎓 Prerequisites

| Category | Requirements |
| --- | --- |
| Skills | Basic understanding of Python and Machine Learning |
| Hardware | Any modern laptop/workstation will do the job, as the LLM fine-tuning and inference will be done on AWS SageMaker. |
| Level | Intermediate |

πŸ—οΈ Project Structure

At Decoding ML we teach how to build production ML systems, thus the course follows the structure of a real-world Python project:

```text
llm-twin-course/
├── src/                       # Source code for all the ML pipelines and services
│   ├── data_crawling/         # Data collection pipeline code
│   ├── data_cdc/              # Change Data Capture (CDC) pipeline code
│   ├── feature_pipeline/      # Feature engineering pipeline code
│   ├── training_pipeline/     # Training pipeline code
│   ├── inference_pipeline/    # Inference service code
│   └── bonus_superlinked_rag/ # Bonus RAG optimization code
├── .env.example               # Example environment variables template
├── Makefile                   # Commands to build and run the project
└── pyproject.toml             # Project dependencies
```

🚀 Install & Usage

To understand how to install and run the LLM Twin code end-to-end, see the dedicated INSTALL_AND_USAGE document.

🥂 Acknowledgments

This project is based on the LLM Twin: Building Your Production-Ready AI Replica course by Decoding ML. Special thanks to the original authors and contributors for their invaluable course.

License

This course is an open-source project released under the MIT license. Thus, as long as you distribute our LICENSE and acknowledge our work, you can safely clone or fork this project and use it as a source of inspiration for whatever you want (e.g., university projects, college degree projects, personal projects, etc.).
