# RAG DEVELOPMENT
## Create Virtual Environment 

conda config --add channels conda-forge

conda config --set channel_priority strict

conda create --name lil python=3.11.4

conda activate lil

## Add relevant libraries 

conda install ipykernel jupyter jupyterlab

# Choosing an LLM and Embeddings Provider

Several options are available to you for choosing an LLM and embeddings provider.

You can choose from companies that build and serve their own LLMs, like:

- [OpenAI](https://platform.openai.com/docs/models)

- [Anthropic](https://docs.anthropic.com/claude/docs/models-overview)

- [Cohere](https://docs.cohere.com/docs/the-cohere-platform)

- [Mistral](https://docs.mistral.ai/platform/pricing/)

- [Google Gemini](https://ai.google.dev/)

Or you can choose from companies that host and serve open-source models via an API, like:

- [Fireworks AI](https://fireworks.ai/models)

- [Together AI](https://www.together.ai/pricing)

- [Predibase](https://docs.predibase.com/user-guide/inference/models)

- [Hugging Face](https://huggingface.co/docs/text-generation-inference/en/supported_models)

- [Baseten](https://www.baseten.co/library/)

- [Replicate](https://replicate.com/collections/language-models)

- [Lepton AI](https://www.lepton.ai/docs)

- [Clarifai](https://clarifai.com/explore/models)

Countless other providers are continuously entering the market and trying to capture a share.

LlamaIndex has integrations with dozens of LLM and Embeddings providers. You can see them all [here](https://github.com/run-llama/llama_index/tree/main/llama-index-integrations/llms).

Whatever you end up choosing, the installation process will go something like this:
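As a rough sketch of that process, assuming the `llama-index-<category>-<provider>` naming pattern that LlamaIndex integration packages on PyPI follow (for example, `llama-index-llms-openai`), the snippet below builds the `pip install` command for a chosen provider. The helper names `integration_package` and `integration_install_command` are hypothetical and exist only for illustration. Not every provider ships both an LLM and an embeddings package, so check the provider's integration page for the exact package names before installing.

```python
def integration_package(category: str, provider: str) -> str:
    """Build a package name using the llama-index-<category>-<provider> pattern."""
    return f"llama-index-{category}-{provider.strip().lower()}"


def integration_install_command(provider: str, categories=("llms", "embeddings")) -> str:
    """Return a pip command that installs the given integration categories."""
    packages = [integration_package(category, provider) for category in categories]
    return "pip install " + " ".join(packages)


# Assumption: the provider slug matches the one used in the package name.
print(integration_install_command("openai"))
# pip install llama-index-llms-openai llama-index-embeddings-openai
```

Running the printed command in the activated `lil` environment (or prefixing it with `%` in a notebook cell) installs the integration packages.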

# Choosing a Vector Database

Vector databases are an essential part of any RAG system, and as the image below shows, there are a lot to choose from.

So, how should you go about choosing a vector database for Retrieval-Augmented Generation (RAG) applications?

<img src="https://miro.medium.com/v2/resize:fit:1400/format:webp/0*iAX7Y3NfVOtn0dOr.png">
[Image Source](https://blog.det.life/why-you-shouldnt-invest-in-vector-databases-c0cd3f59d23c)

Well, there are several key factors to consider...

## 🚀 Similarity search performance
 
RAG relies heavily on efficient similarity search to retrieve relevant documents or passages. The vector database should provide fast and accurate similarity search capabilities, such as cosine similarity or Euclidean distance, to quickly retrieve relevant information.
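To make the two metrics concrete, here is a minimal pure-Python sketch of cosine similarity and Euclidean distance. It is for illustration only: a real vector database computes these with optimized, approximate index structures rather than a plain loop, and the example vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors; 0.0 means identical."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
doc_same_direction = [2.0, 0.0]  # same direction, different length
doc_orthogonal = [0.0, 3.0]      # unrelated direction

print(cosine_similarity(query, doc_same_direction))   # 1.0
print(cosine_similarity(query, doc_orthogonal))       # 0.0
print(euclidean_distance(query, doc_same_direction))  # 1.0
```

Note that cosine similarity ignores vector length (the first pair scores 1.0), while Euclidean distance does not, which is why the choice of metric should match how the embedding model was trained.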

## 📊 Scalability

As the amount of data grows, the vector database should be able to scale horizontally and handle large-scale indexing and querying. It should efficiently store and manage high-dimensional vectors and support distributed search across multiple nodes if necessary.

## 🔗 Integration with LLM frameworks

The vector database should integrate well with popular LLM orchestration frameworks like LlamaIndex, LangChain, or Instructor. This integration allows seamless interaction between the RAG model and the vector database, enabling efficient retrieval and generation.

## 🌐 Support for various data types

RAG applications may deal with different data types, such as text, images, or audio. The vector database should support storing and indexing vectors derived from various data types, allowing flexibility in the data types that can be retrieved.

## 🛠️ Indexing and updating capabilities

The vector database should provide efficient indexing mechanisms to quickly build and update the index as new data is added or modified. It should handle incremental updates and support real-time indexing if required by the application.
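A toy sketch of what incremental updates mean in practice: points are keyed by an ID, so adding a new document and re-indexing a modified one are the same "upsert" operation. The `TinyVectorIndex` class and its method names are hypothetical and kept in memory for illustration. They are not the API of any particular vector database.

```python
class TinyVectorIndex:
    """Toy in-memory index keyed by point ID (illustration only)."""

    def __init__(self):
        self._points = {}

    def upsert(self, point_id, vector, payload=None):
        """Insert a new point, or overwrite the existing point with the same ID."""
        self._points[point_id] = {"vector": list(vector), "payload": payload or {}}

    def delete(self, point_id):
        """Remove a point if it exists; a missing ID is silently ignored."""
        self._points.pop(point_id, None)

    def get(self, point_id):
        """Return the stored point, or None if the ID is unknown."""
        return self._points.get(point_id)

    def __len__(self):
        return len(self._points)


index = TinyVectorIndex()
index.upsert("doc-1", [0.1, 0.9], {"title": "first draft"})
index.upsert("doc-2", [0.7, 0.3], {"title": "another doc"})

# The document changed: re-embed it and upsert under the same ID.
index.upsert("doc-1", [0.2, 0.8], {"title": "revised draft"})

print(len(index))                            # 2
print(index.get("doc-1")["payload"]["title"])  # revised draft
```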

## 🔍 Retrieval flexibility

The vector database should offer flexibility in retrieval options, such as specifying the number of nearest neighbors to retrieve, setting similarity thresholds, or applying filters based on metadata. This flexibility allows fine-tuning the retrieval process based on the specific requirements of the RAG application.
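The three retrieval knobs mentioned above can be sketched with a brute-force search in plain Python. The function `search` and its parameters `top_k`, `score_threshold`, and `metadata_filter` are hypothetical names chosen for this example. Real vector databases expose similar options under their own names and apply them inside an approximate index instead of scanning every point.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal dimension."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(points, query, top_k=3, score_threshold=None, metadata_filter=None):
    """Return up to top_k (score, id) pairs, best first."""
    results = []
    for point in points:
        payload = point["payload"]
        # Metadata filter: every requested key must match exactly.
        if metadata_filter and any(
            payload.get(key) != value for key, value in metadata_filter.items()
        ):
            continue
        score = cosine_similarity(query, point["vector"])
        # Similarity threshold: drop weak matches.
        if score_threshold is not None and score < score_threshold:
            continue
        results.append((score, point["id"]))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results[:top_k]


points = [
    {"id": "a", "vector": [1.0, 0.0], "payload": {"lang": "en"}},
    {"id": "b", "vector": [0.8, 0.6], "payload": {"lang": "en"}},
    {"id": "c", "vector": [0.0, 1.0], "payload": {"lang": "en"}},
    {"id": "d", "vector": [1.0, 0.1], "payload": {"lang": "fr"}},
]

hits = search(
    points,
    query=[1.0, 0.0],
    top_k=2,
    score_threshold=0.5,
    metadata_filter={"lang": "en"},
)
print([point_id for _, point_id in hits])  # ['a', 'b']
```

Here point `d` is excluded by the metadata filter and point `c` by the similarity threshold, leaving `a` and `b` as the two nearest neighbors.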

## 💾 Data persistence and reliability

The vector database should ensure data persistence and provide data backup and recovery mechanisms. It should be reliable and able to handle potential failures or data loss scenarios.

## 🫂 Community support and documentation

Consider the level of community support and documentation available for the vector database. An active community and comprehensive documentation can greatly assist in troubleshooting, optimizing performance, and staying updated with the latest features and best practices.

## 🎮 Ease of use and deployment

The vector database should be easy to set up, configure, and deploy. It should provide clear APIs or client libraries for integration with the RAG application and have straightforward deployment options, whether on-premises or in the cloud.

## 💰 Cost and licensing

Consider the vector database's cost and licensing model, including pricing, the cost of scaling, and any limitations or restrictions imposed by the licensing terms.

# We're using Qdrant in this course

For all the reasons mentioned above.

Not only that, but...

- 🦙 Qdrant is [one of the most popular vector databases based on downloads on LlamaHub](https://llamahub.ai/?tab=vector_stores).

- 📖 Their documentation is a breath of fresh air – clear, to the point, and without any fluff. It makes getting up and running a breeze if you want to go deeper than the LlamaIndex abstractions.

- 🌍 The development is open, and the team behind Qdrant is technically savvy. It's reassuring to see the level of transparency and expertise.

- 🔐 They've recently added built-in authentication to the dev version. It's a game changer if you're looking for that extra layer of security.

- 🆓 They offer an extremely generous free tier via their hosted cloud, making it easy to test drive Qdrant and see if it fits your needs without any commitment.

- 🤖 OpenAI is rumoured to use Qdrant as an embedding vector database, according to a [Reddit post](https://www.reddit.com/r/ChatGPT/comments/17plmqj/openai_is_using_qdrant_as_a_vector_database/).

- 🤖 xAI also uses Qdrant, as [evidenced by the fork in their GitHub](https://github.com/xai-org).

# LlamaIndex Overview

LlamaIndex is a broad toolkit containing hundreds of integrations:

- 150+ data loaders
- 35+ agent tools
- 50+ LlamaPack templates
- 50+ LLMs
- 25+ embeddings
- 40+ vector stores

<img src="https://miro.medium.com/v2/resize:fit:1100/format:webp/1*m3YA6oLsYsVVBWG67RIIbA.png">
   


- `llama-index-core`: A slimmed-down package that contains the core LlamaIndex abstractions and components, without any integrations. Integrations are available as separate packages.

- `llama-index-integrations`: This folder contains third-party integrations for 19 LlamaIndex abstractions, including data loaders, LLMs, embedding models, vector stores, and more.

- `llama-index-packs`: This folder contains 50+ LlamaPacks, which are templates designed to kickstart a user’s application. You can view all of the packs [here](https://pretty-sodium-5e0.notion.site/ce81b247649a44e4b6b35dfb24af28a6?v=53b3c2ced7bb4c9996b81b83c9f01139).

Each integration and pack needs to be installed before use, so I'll include a line at the top of each notebook that installs the required packages.
