In [4]:
import os
from embedchain import App
from embedchain.models.data_type import DataType

### Load environment variables from the .env file
from dotenv import load_dotenv
load_dotenv()

True

### EmbedChain Library

Documentation: https://docs.embedchain.ai/get-started/quickstart

EmbedChain is an open-source framework that makes it easy to build and deploy retrieval-augmented generation (RAG) applications powered by large language models (LLMs). Its “Conventional but Configurable” approach caters to both software and machine learning engineers.

Key advantages of EmbedChain include:
* Simplifies RAG Development: Building robust RAG pipelines involves complexities like data integration, chunking, indexing, vector storage, and more. EmbedChain streamlines this process.
* Flexible Architecture: Choose components like LLMs, vector databases, data loaders, chunkers, and retrieval strategies to tailor the pipeline to your needs.
* Efficient Data Handling: EmbedChain automatically loads data, generates embeddings for relevant chunks, and stores them in your chosen vector database.
* User-Friendly APIs: Beginners can build LLM apps in just 4 lines of code, while advanced users can deeply customize the RAG pipeline.

The core workflow is straightforward:
* Add Data: Automatically load, chunk, embed, and index your data sources.
* Query: Turn user questions into embeddings to retrieve relevant documents.
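
To make the two steps concrete, here is a minimal, dependency-free sketch of the same add/query idea. It is not how EmbedChain is implemented: the bag-of-words "embedding", the `ToyIndex` class, and the helper names are all hypothetical stand-ins for a real embedding model and vector database.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyIndex:
    """In-memory stand-in for a vector database (hypothetical)."""

    def __init__(self):
        self.entries = []  # list of (chunk, vector) pairs

    def add(self, text, max_words=40):
        # "Add Data": chunk the text, embed each chunk, and index it.
        for chunk in chunk_text(text, max_words):
            self.entries.append((chunk, embed(chunk)))

    def retrieve(self, question, k=1):
        # "Query": embed the question and rank stored chunks by similarity.
        q = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


index = ToyIndex()
index.add("Llama 3 is released in 8B and 70B parameter sizes.")
index.add("Chroma is a vector database that stores embeddings on disk.")
top = index.retrieve("Which sizes does Llama 3 come in?")
print(top[0])
```

In a real RAG pipeline the retrieved chunks are then passed to the LLM as context, which is the part the prompt template in the config controls.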

### Config
Set up your config below. It defines three components: the vector database (`vectordb`), the embedding model (`embedder`), and the LLM (`llm`).

In [2]:
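## The key is read from the environment (or the .env file loaded above); fail early with a clear message if it is missing
assert os.getenv("OPENAI_API_KEY"), "Set OPENAI_API_KEY in your .env file or environment"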

In [3]:
config = {
    'vectordb': {
        'provider': 'chroma',
        'config': {
            'collection_name': 'rag-collection',
            'dir': 'db',
            'allow_reset': True
        }
    },
    'embedder': {
        'provider': 'openai',
        'config': {
            'model': 'text-embedding-3-small'
        }
    },
    'llm': {
        'provider': 'openai',
        'config': {
            'model': 'gpt-3.5-turbo-0125',
            'temperature': 0.5,
            'top_p': 1,
            'stream': False,
            'prompt': (
                "Use the following pieces of context to answer the query at the end.\n"
                "If you don't know the answer, just say that you don't know, don't try to make up an answer.\n"
                "$context\n\nQuery: $query\n\nHelpful Answer:"
            ),
            'system_prompt': (
                "You are an expert at looking at the provided context and answering user's query."
            ),
        }
    }
}
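
The `$context` and `$query` placeholders in the prompt follow the syntax of Python's standard `string.Template`. The sketch below shows how such a template is filled in; it assumes EmbedChain substitutes the retrieved chunks and the user question in a comparable way, which this notebook does not verify. The sample context and query strings are made up for illustration.

```python
from string import Template

# Same prompt text as in the config above.
prompt = Template(
    "Use the following pieces of context to answer the query at the end.\n"
    "If you don't know the answer, just say that you don't know, don't try to make up an answer.\n"
    "$context\n\nQuery: $query\n\nHelpful Answer:"
)

# Illustrative values only; in the real pipeline the context would come from retrieval.
filled = prompt.substitute(
    context="Meta released Llama 3 in 8B and 70B sizes.",
    query="What sizes is Llama 3 available in?",
)
print(filled)
```

Printing the filled prompt like this is a quick way to check what a custom prompt would look like before sending it to an LLM.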

### Embed your documents

* Supported data sources: https://docs.embedchain.ai/components/data-sources/overview
* Supported LLMs: https://docs.embedchain.ai/components/llms

In [4]:
app = App.from_config(config=config)

In [5]:
### Sources about the recently released Llama3 model
youtube_sources = ['https://www.youtube.com/watch?v=cEHFzvU-pzk', 'https://www.youtube.com/watch?v=8Ul_0jddTU4']
web_sources = ['https://www.theverge.com/2024/4/18/24134103/llama-3-benchmark-testing-ai-gemma-gemini-mistral']

In [6]:
## Add your sources to the app
for video in youtube_sources:
    app.add(video, data_type=DataType.YOUTUBE_VIDEO)

for page in web_sources:
    app.add(page, data_type=DataType.WEB_PAGE)

Inserting batches in chromadb: 100%|██████████| 1/1 [00:00<00:00,  2.22it/s]
Inserting batches in chromadb: 100%|██████████| 1/1 [00:00<00:00,  2.13it/s]
Inserting batches in chromadb: 100%|██████████| 1/1 [00:00<00:00,  4.18it/s]


In [7]:
app.query("What different sizes is the Llama3 model available in?")

'The Llama3 model is available in different sizes, including an 8 billion parameter model, a 70 billion parameter model, and a 405 billion parameter model.'

In [8]:
app.query("How does Llama3-8B compare to Mistral 7B model?")

'According to the provided context, Meta claims that Llama3-8B outperformed Mistral 7B in certain benchmarking tests. In the MMLU benchmark, Llama3-8B performed significantly better than Gemma 7B and Mistral 7B. Therefore, based on this information, Llama3-8B is considered to be superior to the Mistral 7B model in specific benchmarking tests.'

### Integrating an open-source model

Use Together AI to access open-source models.

Available inference models: https://docs.together.ai/docs/inference-models

In [6]:
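## The key is read from the environment (or the .env file loaded above); fail early with a clear message if it is missing
assert os.getenv("TOGETHER_API_KEY"), "Set TOGETHER_API_KEY in your .env file or environment"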

In [7]:
### Change the LLM in the config

config = {
    'vectordb': {
        'provider': 'chroma',
        'config': {
            'collection_name': 'rag-collection-opensource',
            'dir': 'db',
            'allow_reset': True
        }
    },
    'embedder': {
        'provider': 'openai',
        'config': {
            'model': 'text-embedding-3-small'
        }
    },
    'llm': {
        'provider': 'together',
        'config': {
            'model': 'mistralai/Mistral-7B-Instruct-v0.2',
            'temperature': 0.5,
            'top_p': 1,
            'prompt': (
                "Use the following pieces of context to answer the query at the end.\n"
                "If you don't know the answer, just say that you don't know, don't try to make up an answer.\n"
                "$context\n\nQuery: $query\n\nHelpful Answer:"
            )
        }
    }
}
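
Since only the LLM section and the collection name change between the two configs, the second one could also be derived from the first instead of being retyped. The helper below is a hypothetical convenience, not part of EmbedChain; it works on plain dictionaries, and the `drop` parameter is an assumption added here to remove keys (such as `stream`) that the new config leaves out.

```python
import copy


def with_llm(base_config, provider, model, collection_name=None, drop=()):
    """Return a copy of base_config that uses a different LLM (hypothetical helper)."""
    new_config = copy.deepcopy(base_config)  # leave the original config untouched
    llm = new_config["llm"]
    llm["provider"] = provider
    llm["config"]["model"] = model
    for key in drop:
        # Remove LLM settings that should not carry over to the new provider.
        llm["config"].pop(key, None)
    if collection_name is not None:
        new_config["vectordb"]["config"]["collection_name"] = collection_name
    return new_config


# Trimmed-down base config, just enough to show the helper at work.
base = {
    "vectordb": {"provider": "chroma", "config": {"collection_name": "rag-collection", "dir": "db"}},
    "llm": {"provider": "openai", "config": {"model": "gpt-3.5-turbo-0125", "stream": False}},
}
variant = with_llm(
    base,
    provider="together",
    model="mistralai/Mistral-7B-Instruct-v0.2",
    collection_name="rag-collection-opensource",
    drop=("stream",),
)
print(variant["llm"])
```

Using a separate collection name per variant keeps each app's embeddings in its own collection, as the two configs in this notebook do.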

In [8]:
app_opensource = App.from_config(config=config)

In [9]:
## Add your sources to the app
for video in youtube_sources:
    app_opensource.add(video, data_type=DataType.YOUTUBE_VIDEO)

for page in web_sources:
    app_opensource.add(page, data_type=DataType.WEB_PAGE)

Inserting batches in chromadb: 100%|██████████| 1/1 [00:00<00:00,  2.18it/s]


In [10]:
app_opensource.query("How does Llama3-8B compare to Mistral 7B model?")

" According to Meta's benchmarking tests, Llama3-8B outperforms Mistral 7B in the MMLU benchmark, which measures general knowledge. Llama3-8B showed more diversity in answering prompts, had fewer false refusals, and could reason better than Mistral 7B in this test. However, it's important to note that benchmark testing AI models is imperfect, and the datasets used to benchmark models can have limitations."

In [11]:
app_opensource.query("How does Llama3 architecture differ from Llama2?")

' Llama3 models have a larger parameter size compared to Llama2 models. The smallest Llama3 model, which is 8B, is already outperforming the largest Llama2 model. Meta claims that both sizes of Llama3 show more diversity in answering prompts, have fewer false refusals, and can reason better than Llama2. Additionally, Llama3 is expected to offer multimodal responses like generating images or transcribing audio files in future larger versions. The larger versions of Llama3, which are over 400B parameters, are currently training and are expected to ideally learn more complex patterns than the smaller versions. Meta did not release a preview of these larger models or compare them to other big models like GPT-4.'

## Exercise

- Create your own RAG collection on a different topic, such as your favorite movie or book.
- Integrate data from a few different source types, such as PDFs, web pages, and videos. If code is involved, you can add a GitHub source too.
- Set an open-source model as the LLM.

Test how your system performs. Change the config for the embeddings, the retriever, or the LLM and observe how the answers change.
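
One way to observe the difference is to ask every app the same questions and print the answers side by side. The sketch below assumes only that each app exposes a `query(question)` method returning a string, as `app` and `app_opensource` do in this notebook; `compare_answers` and the `EchoApp` stand-in are hypothetical and exist only so the example runs without API keys.

```python
def compare_answers(apps, questions):
    """Ask every question of every app and collect the answers side by side.

    apps: dict mapping a label to any object with a query(str) -> str method.
    """
    return {q: {label: app.query(q) for label, app in apps.items()} for q in questions}


class EchoApp:
    """Stand-in for a real app; it just echoes the question with its name."""

    def __init__(self, name):
        self.name = name

    def query(self, question):
        return f"[{self.name}] {question}"


results = compare_answers(
    {"openai": EchoApp("openai"), "opensource": EchoApp("opensource")},
    ["How does Llama3-8B compare to Mistral 7B model?"],
)
for question, answers in results.items():
    print(question)
    for label, answer in answers.items():
        print(f"  {label}: {answer}")
```

With the real apps, the call would look like `compare_answers({"openai": app, "opensource": app_opensource}, questions)`.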
