LLM-Twin is an AI replica that writes social media posts or technical articles (like this one) in your own voice.
- Crawl your digital data from various social media platforms, such as Medium, Substack, and GitHub.
- Clean, normalize, and load the data into a MongoDB NoSQL DB through a series of ETL pipelines.
- Send database changes to a RabbitMQ queue using the CDC (Change Data Capture) pattern (see the CDC sketch after this list).
- Learn to package the crawlers as AWS Lambda functions.
- Consume messages in real time from the queue through a Bytewax streaming pipeline.
- Every message will be cleaned, chunked, embedded, and loaded into a Qdrant vector DB (see the ingestion sketch after this list).
- In the bonus series, we refactor the cleaning, chunking, and embedding logic using Superlinked, a specialized vector compute engine. We will also load and index the vectors to a Redis vector DB.
- Create a custom instruction dataset based on your custom digital data to do SFT.
- Fine-tune an LLM using LoRA or QLoRA (see the LoRA sketch after this list).
- Use Comet ML's experiment tracker to monitor the experiments.
- Evaluate the LLM using Opik.
- Save and version the best model to the Hugging Face model registry.
- Run and automate the training pipeline using AWS SageMaker.
- Load the fine-tuned LLM from the Hugging Face model registry.
- Deploy the LLM as a scalable REST API using AWS SageMaker inference endpoints (see the invocation sketch after this list).
- Enhance the prompts using advanced RAG techniques.
- Monitor the prompts and the LLM-generated results using Opik.
- In the bonus series, we refactor the advanced RAG layer to write more optimal queries using Superlinked.
- Wrap up everything with a Gradio UI (as seen below) where you can start playing around with the LLM Twin to generate content that follows your writing style.
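To make the CDC step concrete, here is a minimal sketch of watching a MongoDB collection and forwarding every change event to RabbitMQ. The connection URIs, database, collection, and queue names are placeholders, not the course's actual configuration (note that MongoDB change streams require a replica set):

```python
import json

import pika
from bson import json_util
from pymongo import MongoClient

# Placeholder connection settings -- swap in your own.
mongo = MongoClient("mongodb://localhost:27017")
collection = mongo["llm_twin"]["posts"]

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="cdc_events", durable=True)

# watch() opens a MongoDB change stream: every insert/update/delete on the
# collection arrives here as a change document, which we publish to RabbitMQ.
for change in collection.watch():
    channel.basic_publish(
        exchange="",
        routing_key="cdc_events",
        body=json.dumps(change, default=json_util.default),
    )
```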
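The feature pipeline's clean, chunk, embed, and load steps boil down to something like the ingestion sketch below. It deliberately skips the Bytewax dataflow wiring, uses naive fixed-size chunking, and the embedding model and collection names are illustrative only:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct
from sentence_transformers import SentenceTransformer

client = QdrantClient("localhost", port=6333)
model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative embedding model


def chunk(text: str, size: int = 500) -> list[str]:
    # Naive fixed-size chunking; the course uses smarter, content-aware splitting.
    return [text[i : i + size] for i in range(0, len(text), size)]


def ingest(doc_id: int, raw_text: str) -> None:
    chunks = chunk(raw_text.strip())  # "cleaning" is reduced to strip() here
    vectors = model.encode(chunks)    # embed each chunk
    client.upsert(
        collection_name="llm_twin",   # hypothetical collection name
        points=[
            PointStruct(id=doc_id * 10_000 + i, vector=vec.tolist(), payload={"text": chk})
            for i, (vec, chk) in enumerate(zip(vectors, chunks))
        ],
    )
```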
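For the fine-tuning step, attaching LoRA adapters with Hugging Face's peft library looks roughly like this; the base model and hyperparameters are examples, not the course's exact choices:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Example base model -- the course may use a different one.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor applied to the updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
```

QLoRA follows the same recipe but first loads the base model quantized to 4-bit, so the adapters can be trained on much smaller GPUs.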
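Once deployed, the endpoint is invoked like any other REST API. Here is a boto3 sketch; the endpoint name and payload schema are assumptions:

```python
import json

import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="llm-twin-endpoint",  # hypothetical endpoint name
    ContentType="application/json",
    Body=json.dumps({"inputs": "Write a LinkedIn post about RAG pipelines."}),
)
print(json.loads(response["Body"].read()))
```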
Alongside the 4 microservices, you will learn to integrate 4 serverless tools:
- Comet ML as your experiment tracker and data registry (see the logging sketch after this list);
- Qdrant as your vector DB;
- AWS SageMaker as your ML infrastructure;
- Opik as your prompt evaluation and monitoring tool.
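As a taste of the experiment-tracking integration, logging a run to Comet ML takes only a few lines (the project name is illustrative, and a COMET_API_KEY environment variable is assumed):

```python
from comet_ml import Experiment

experiment = Experiment(project_name="llm-twin-course")  # illustrative project name
experiment.log_parameter("lora_rank", 16)
experiment.log_metric("train_loss", 0.42, step=100)
experiment.end()
```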
| Category | Requirements |
| --- | --- |
| Skills | Basic understanding of Python and Machine Learning |
| Hardware | Any modern laptop/workstation will do the job, as the LLM fine-tuning and inference will be done on AWS SageMaker. |
| Level | Intermediate |
At Decoding ML we teach how to build production ML systems, so the course follows the structure of a real-world Python project:
llm-twin-course/
├── src/                       # Source code for all the ML pipelines and services
│   ├── data_crawling/         # Data collection pipeline code
│   ├── data_cdc/              # Change Data Capture (CDC) pipeline code
│   ├── feature_pipeline/      # Feature engineering pipeline code
│   ├── training_pipeline/     # Training pipeline code
│   ├── inference_pipeline/    # Inference service code
│   └── bonus_superlinked_rag/ # Bonus RAG optimization code
├── .env.example               # Example environment variables template
├── Makefile                   # Commands to build and run the project
└── pyproject.toml             # Project dependencies
To understand how to install and run the LLM Twin code end-to-end, see the dedicated INSTALL_AND_USAGE document.
This project is based on the LLM Twin: Building Your Production-Ready AI Replica course by Decoding ML. Special thanks to the original authors and contributors for their invaluable course.
This course is an open-source project released under the MIT license. Thus, as long as you distribute our LICENSE and acknowledge our work, you can safely clone or fork this project and use it as a source of inspiration for whatever you want (e.g., university projects, college degree projects, personal projects, etc.).