Daethyra/Build-RAGAI

Build-RAGAI

Description

This project seeks to teach you how to build Python applications with generative AI functionality by using the LangChain and Transformers libraries.

While there is a section for OpenAI, most of the code that previously existed there has been repurposed and integrated with either the LangChain or Transformers libraries. This project includes code snippets, package examples, and Jupyter notebooks that you can augment, copy, or learn from, respectively.

If you're new to building AI-powered applications, I suggest you start by playing with and executing the code in the LangChain notebooks. Seeing the code in action, editing it yourself, and creatively brainstorming new ideas is the best way to learn.

Table of Contents

Below you'll find links to, and descriptions of, sections of this project for easy navigation.

This README:

LangChain:

  • Code Snippets: Here you'll find pluggable Python components.

  • Notebooks: Here you'll find Jupyter notebooks that guide you through the use of many different LangChain classes.

    • MergedDataLoader: Learn how to embed and query multiple data sources via MergedDataLoader. In this notebook, we learn how to clone GitHub repositories and scrape web documentation before embedding them into a vectorstore, which we then use as a retriever. By the end of it, you should be comfortable using any source as context in your own RAG projects.
    • Custom Tools: Learn how to create and use custom tools in LangChain agents.
    • Image Generation and Captioning + Video Generation: Learn to create an agent that chooses which generative tool to use based on your prompt. This example begins with the agent generating an image after refining the user's prompt.
    • LangSmith Walkthrough: Learn how to use LangSmith tracing and pull prompts from the LangSmith Hub.
    • Retrieval Augmented Generation: Get started with Retrieval Augmented Generation to enhance the performance of your LLM.
    • MongoDB RAG: Perform similarity searching, metadata filtering, and question-answering with MongoDB.
    • Pinecone and ChromaDB: A more basic but thorough walkthrough of performing retrieval augmented generation with two different vectorstores.
    • FAISS and the HuggingFaceHub: Learn how to use FAISS indexes for similarity search with HuggingFaceHub embeddings. This example is a privacy-friendly option, as everything runs locally. No GPU required!
    • Runnables and Chains (LangChain Expression Language): Learn the difference between Runnables and Chains in LangChain and how to use each. Here you'll dive deep into their specifics.
  • End to End Examples: Here you'll find scripts made to work out of the box.

OpenAI:

  • Code Snippets: Here you'll find code snippets using the OpenAI Python library.

  • Notebooks: Here you'll find Jupyter notebooks that show you how to use the OpenAI Python library.

Transformers:


Getting Started

Installation

Local Code Execution and Testing

This project is developed using PDM. Start by navigating to the root directory of this project, then install PDM using pip:

pip install -U pdm

Then you'll need to install the dependencies using PDM:

pdm install

This command creates a virtual environment in .venv and installs the dependencies into it. On macOS or Linux, run source .venv/bin/activate to activate the environment. On Windows, run .venv/Scripts/activate, or .venv/Scripts/activate.ps1 in PowerShell.

Using a virtual environment keeps the project's dependencies from cross-contaminating your global Python environment.

Once the virtual environment is set up, select it as the kernel for the Jupyter Notebook. In VS Code, you can do this at the top right of the notebook. If you're using a different IDE, consult its documentation for kernel selection.

When selecting the kernel, ensure you choose the one that's located inside of the .venv directory, and not the global Python environment.
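
If you're unsure which interpreter a notebook is using, a quick check is to inspect the interpreter's prefix path from a cell. The snippet below is a minimal sketch that uses only the standard library; the helper name is_inside_venv is hypothetical and not part of this project, and it assumes the environment directory is named .venv, as PDM creates it above.

```python
import sys
from pathlib import Path


def is_inside_venv(prefix: str, venv_dir_name: str = ".venv") -> bool:
    """Return True if the interpreter prefix path contains the venv directory.

    Hypothetical helper for illustration; it only inspects path components.
    """
    return venv_dir_name in Path(prefix).parts


if __name__ == "__main__":
    # sys.prefix is the root directory of the environment running this code.
    print("Interpreter prefix:", sys.prefix)
    print("Running inside .venv:", is_inside_venv(sys.prefix))
```

If the second line prints False, the notebook is most likely attached to the global Python environment, and the kernel should be switched.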


Test Your First Notebook

If you're totally new to building AI-powered applications with access to external data, specifically retrieval augmented generation, check out the RAG Basics notebook. It's the most straightforward notebook, and every other 'RAG' notebook builds upon its concepts.
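
To give a feel for the idea before opening the notebook, here is a toy sketch of the retrieve-then-prompt pattern in plain Python. It is not code from the RAG Basics notebook: word-count cosine similarity stands in for real embeddings, no LLM is called, and every function name and sample document is hypothetical.

```python
import math
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count whitespace-separated words."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vec = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Place the retrieved context ahead of the question."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


if __name__ == "__main__":
    docs = [
        "PDM manages Python dependencies",
        "FAISS performs similarity search over vectors",
        "Colab runs notebooks in the browser",
    ]
    question = "how does similarity search work"
    print(build_prompt(question, retrieve(question, docs)))
```

In a real application, the word counts would be replaced by embeddings stored in a vectorstore, and the resulting prompt would be sent to an LLM.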

Google Colab

Click the badge below to open the RAG Basics notebook in Colab.

Open 'rag_basics.ipynb' In Colab