diff --git a/content/learning-paths/servers-and-cloud-computing/rag/_index.md b/content/learning-paths/servers-and-cloud-computing/rag/_index.md
index fc7c9a1377..ebfe968750 100644
--- a/content/learning-paths/servers-and-cloud-computing/rag/_index.md
+++ b/content/learning-paths/servers-and-cloud-computing/rag/_index.md
@@ -3,11 +3,11 @@ title: Deploy a RAG-based Chatbot with llama-cpp-python using KleidiAI on Arm Se
 
 minutes_to_complete: 45
 
-who_is_this_for: This Learning Path is for software developers, ML engineers, and those looking to deploy production-ready LLM chatbots with RAG capabilities, knowledge base integration, and performance optimization for Arm Architecture.
+who_is_this_for: This Learning Path is for software developers, ML engineers, and those looking to deploy production-ready LLM chatbots with Retrieval Augmented Generation (RAG) capabilities, knowledge base integration, and performance optimization for Arm Architecture.
 
 learning_objectives:
     - Set up llama-cpp-python optimized for Arm servers.
-    - Implement RAG architecture using the FAISS vector database.
+    - Implement RAG architecture using the Facebook AI Similarity Search (FAISS) vector database.
     - Optimize model performance through 4-bit quantization.
     - Build a web interface for document upload and chat.
     - Monitor and analyze inference performance metrics.
diff --git a/content/learning-paths/servers-and-cloud-computing/rag/chatbot.md b/content/learning-paths/servers-and-cloud-computing/rag/chatbot.md
index 8e659b8a41..fbd872adf5 100644
--- a/content/learning-paths/servers-and-cloud-computing/rag/chatbot.md
+++ b/content/learning-paths/servers-and-cloud-computing/rag/chatbot.md
@@ -13,9 +13,14 @@ Open the web application in your browser using either the local URL or the exter
 http://localhost:8501 or http://75.101.253.177:8501
 ```
 
+{{% notice Note %}}
+
+To access the links you may need to allow inbound TCP traffic in your instance's security rules. Always review these permissions with caution as they may introduce security vulnerabilities.
+
+{{% /notice %}}
 ## Upload a PDF File and Create a New Index
 
-Now you can upload a PDF file in the web browser by selecting the **Create New Store** option.
+Now you can upload a PDF file in the web browser by selecting the **Create New Store** option.
 
 Follow these steps to create a new index:
 
diff --git a/content/learning-paths/servers-and-cloud-computing/rag/rag_llm.md b/content/learning-paths/servers-and-cloud-computing/rag/rag_llm.md
index 9551af2cb1..7725d7658e 100644
--- a/content/learning-paths/servers-and-cloud-computing/rag/rag_llm.md
+++ b/content/learning-paths/servers-and-cloud-computing/rag/rag_llm.md
@@ -14,7 +14,7 @@ This learning path demonstrates how to build and deploy a Retrieval Augmented Ge
 
 ## Overview
 
-In this Learning Path, you learn how to build a Retrieval Augmented Generation (RAG) chatbot using llama-cpp-python, a Python binding for llama.cpp that enables efficient LLM inference on Arm CPUs.
+In this Learning Path, you learn how to build a RAG chatbot using llama-cpp-python, a Python binding for llama.cpp that enables efficient LLM inference on Arm CPUs.
 
 The tutorial demonstrates how to integrate the FAISS vector database with the Llama-3.1-8B model for document retrieval, while leveraging llama-cpp-python's optimized C++ backend for high-performance inference.