<a href="https://colab.research.google.com/github/frank-morales2020/MLxDL/blob/main/Embedchain_Demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Embedchain

Embedchain is an open-source retrieval-augmented generation (RAG) framework that makes it easy to create and deploy AI apps. At its core, Embedchain follows the design principle of being "Conventional but Configurable" to serve both software engineers and machine learning engineers.


Here is a very simple demo of how it works.

Check us out: https://github.com/embedchain/embedchain

First of all we install the dependencies:

In [None]:
!pip install colab-env --upgrade
!pip install openai

import colab_env
import os
import openai

# Read the OpenAI API key from the environment.
# To hard-code it instead, use: api_key = "YOUR_OPENAI_API_KEY"
api_key = os.getenv("OPENAI_API_KEY")
openai.api_key = api_key
HUGGINGFACE_ACCESS_TOKEN = os.environ["HUGGINGFACE_ACCESS_TOKEN"]
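Note that `os.getenv` returns `None` when a variable is missing, while `os.environ[...]` raises a `KeyError`. A small helper can make a missing credential fail early with a readable message. This is only a sketch: `require_env` and `DEMO_TOKEN` are hypothetical names used for illustration and are not part of Embedchain or colab-env.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise a clear error if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value

# DEMO_TOKEN is a hypothetical variable, set here only so the example runs on its own.
os.environ["DEMO_TOKEN"] = "example-value"
print(require_env("DEMO_TOKEN"))
```

In the notebook, the same helper could be called with `"OPENAI_API_KEY"` or `"HUGGINGFACE_ACCESS_TOKEN"` before creating the app.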


In [None]:
!pip install --upgrade embedchain

Now we import the dependencies:

In [3]:
from embedchain import App

We instantiate the Embedchain bot. Remember to replace the API key with your own OpenAI API key.

In [4]:
# Read the OpenAI API key from the environment.
# To hard-code it instead, use: api_key = "YOUR_OPENAI_API_KEY"
api_key = os.getenv("OPENAI_API_KEY")
openai.api_key = api_key

app = App()

Now, add different data sources using embedchain's `.add()` method:

In [5]:
app.add("https://en.wikipedia.org/wiki/Elon_Musk")
app.add("https://www.forbes.com/profile/elon-musk")


Inserting batches in chromadb: 100%|██████████| 1/1 [00:01<00:00,  1.53s/it]


Successfully saved https://en.wikipedia.org/wiki/Elon_Musk (DataType.WEB_PAGE). New chunks count: 99


Inserting batches in chromadb: 100%|██████████| 1/1 [00:00<00:00,  3.53it/s]

Successfully saved https://www.forbes.com/profile/elon-musk (DataType.WEB_PAGE). New chunks count: 4





'8cf46026cabf9b05394a2658bd1fe890'
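The "New chunks count" lines report how many pieces each page was split into before being embedded and stored. As a rough sketch of the general idea (this is not Embedchain's actual chunker, and its real chunk size and overlap are not shown in this notebook), fixed-size chunking with overlap might look like the following. The names `chunk_text`, `chunk_size`, and `overlap` are hypothetical.

```python
def chunk_text(text, chunk_size=200, overlap=20):
    """Split text into character windows of chunk_size that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks

# Made-up sample text of 500 characters, used only for the demo.
sample = "".join(chr(97 + i % 26) for i in range(500))
pieces = chunk_text(sample)
print(len(pieces), [len(p) for p in pieces])
```

The overlap means the end of one chunk is repeated at the start of the next, so a sentence cut at a chunk boundary still appears intact in at least one chunk.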

Your bot is ready now. Ask it any question using the `.query()` method:

In [6]:
#Added by Frank Morales January 24th, 2023

#https://docs.embedchain.ai/components/data-sources/overview

from embedchain import App
app = App()
app.add('https://arxiv.org/pdf/1706.03762.pdf', data_type='pdf_file')
print()
app.query("What is the paper 'attention is all you need' about?", citations=False)


Inserting batches in chromadb: 100%|██████████| 1/1 [00:00<00:00,  1.84it/s]


Successfully saved https://arxiv.org/pdf/1706.03762.pdf (DataType.PDF_FILE). New chunks count: 47



'The paper "Attention Is All You Need" proposes a new network architecture called the Transformer, which is based solely on attention mechanisms. It suggests that complex recurrent or convolutional neural networks can be replaced with a simpler architecture that connects the encoder and decoder through attention. The paper discusses how this approach can improve sequence transduction models, such as neural machine translation.'

In [7]:
print(app.query("How many companies does Elon Musk run? Name those"))
print()
print(app.query("How many companies does Bill Gates run? Name those"))

Elon Musk runs multiple companies. Some of the companies he is associated with include Tesla, SpaceX, Neuralink, and The Boring Company.

I don't have enough information to answer the query.
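The second answer reflects how retrieval-augmented generation works: the model is only given context retrieved from sources that were added, and no Bill Gates page has been added yet. The toy sketch below illustrates that behavior under simplifying assumptions. It uses word overlap in place of real embeddings, the chunks are made-up sentences, and `tokenize`, `retrieve`, and `min_overlap` are hypothetical names, not Embedchain's API.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, chunks, top_k=2, min_overlap=2):
    """Return up to top_k chunks that share at least min_overlap words with the question."""
    q = tokenize(question)
    scored = [(len(q & tokenize(c)), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s >= min_overlap]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]

# Made-up chunks standing in for the ingested Elon Musk pages.
store = [
    "Elon Musk leads Tesla and SpaceX.",
    "Elon Musk founded The Boring Company.",
]

print(retrieve("How many companies does Elon Musk run?", store))
print(retrieve("How many companies does Bill Gates run?", store))  # nothing relevant yet

# After a relevant source is added, the same question finds context.
store.append("Bill Gates founded Microsoft.")
print(retrieve("How many companies does Bill Gates run?", store))
```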


In [54]:
app.add("https://en.wikipedia.org/wiki/Bill_Gates")
print()
print(app.query("How many companies does Bill Gates run? Name those"))

Inserting batches in chromadb: 100%|██████████| 1/1 [00:10<00:00, 10.35s/it]


Successfully saved https://en.wikipedia.org/wiki/Bill_Gates (DataType.WEB_PAGE). New chunks count: 66


  Bill Gates is the founder and the chairman of the board at Microsoft, a leading multinational technology company.

  He is also the co-founder and co-chairman of TerraPower, a company that is focused on developing a new generation of nuclear reactors.

  Gates is also the founder and chairman of Cascade Investment, a private investment company.

  In addition, he is the co-founder and co-chairman of the Breakthrough Energy Ventures fund, which invests in clean energy technologies.

  He is also the co-founder and co-chairman of the Gates Foundation, which is focused on improving healthcare and education around the world.

  In addition to his work with these companies, Gates also has a number of other investments and ventures.


In [None]:
!pip install sentence-transformers

In [48]:
#!/usr/bin/env python

#Added by Frank Morales January 24th, 2023  using Mistral LLM and 1536 dim embeddings
#model: 'sentence-transformers/all-mpnet-base-v2' 768
#model: 'sangmini/msmarco-cotmae-MiniLM-L12_en-ko-ja' 1536
import yaml

# https://www.linkedin.com/pulse/science-control-how-temperature-topp-topk-shape-large-puente-viejo-u88yf
# https://medium.com/@jansiml/is-there-an-optimal-temperature-and-top-p-for-code-generation-with-paid-llm-apis-46bfef0e7a36

#{'llm': {'provider': 'huggingface', 'config': {'model': 'mistralai/Mistral-7B-v0.1', 'top_p': 0.95,
#'temperature': 0.8}}, 'embedder': {'provider': 'huggingface', 'config': {'model': 'sangmini/msmarco-cotmae-MiniLM-L12_en-ko-ja'}}}

with open("/content/gdrive/MyDrive/datasets/mistral.yaml", "r") as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)

{'llm': {'provider': 'huggingface', 'config': {'model': 'mistralai/Mistral-7B-v0.1', 'top_p': 0.95, 'temperature': 0.8}}, 'embedder': {'provider': 'huggingface', 'config': {'model': 'sangmini/msmarco-cotmae-MiniLM-L12_en-ko-ja'}}}
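The printed dictionary shows the shape of the configuration: an `llm` section and an `embedder` section, each with a `provider` and a nested `config` holding the `model`. A quick sanity check of that shape before building the app could be sketched as follows. The function `check_config` and its rules are assumptions made for illustration; they are not Embedchain's own validation.

```python
def check_config(cfg):
    """Return a list of human-readable problems found in a config dict (empty if none)."""
    problems = []
    for section in ("llm", "embedder"):
        entry = cfg.get(section)
        if not isinstance(entry, dict):
            problems.append(f"missing section: {section}")
            continue
        if "provider" not in entry:
            problems.append(f"{section}: missing 'provider'")
        if "model" not in (entry.get("config") or {}):
            problems.append(f"{section}: missing 'config.model'")
    return problems

# Same values as the dictionary printed above.
config = {
    "llm": {
        "provider": "huggingface",
        "config": {"model": "mistralai/Mistral-7B-v0.1", "top_p": 0.95, "temperature": 0.8},
    },
    "embedder": {
        "provider": "huggingface",
        "config": {"model": "sangmini/msmarco-cotmae-MiniLM-L12_en-ko-ja"},
    },
}

print(check_config(config))
```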


In [53]:
#Added by Frank Morales January 24th, 2023  using Mistral LLM and 1536 dim embeddings
import os

import warnings
warnings.filterwarnings('ignore')

#!pip install colab-env --upgrade
#import colab_env

#HUGGINGFACE_ACCESS_TOKEN = os.environ["HUGGINGFACE_ACCESS_TOKEN"]

from embedchain import App
app = App.from_config("/content/gdrive/MyDrive/datasets/mistral.yaml")
app.add("https://www.forbes.com/profile/elon-musk")
app.add("https://en.wikipedia.org/wiki/Elon_Musk")

print()
query = "What is the net worth of Elon Musk today?"
response = app.query(query)
print(f"Query : {query}")
print(f"Answer : {response}")

Inserting batches in chromadb: 100%|██████████| 1/1 [00:00<00:00,  1.18it/s]


Successfully saved https://www.forbes.com/profile/elon-musk (DataType.WEB_PAGE). New chunks count: 4


Inserting batches in chromadb: 100%|██████████| 1/1 [00:14<00:00, 14.23s/it]


Successfully saved https://en.wikipedia.org/wiki/Elon_Musk (DataType.WEB_PAGE). New chunks count: 99

Query : What is the net worth of Elon Musk today?
Answer : 
  Elon Musk is a billionaire businessman, engineer, and inventor. His net worth is currently estimated at $236 billion.
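The `temperature` and `top_p` values in the YAML configuration shape how the LLM picks each token, which is one reason an answer like this can differ between runs. Below is a minimal sketch of the general technique (temperature scaling followed by nucleus, or top-p, filtering). The scores are made up and the function names are hypothetical. It is not the Hugging Face implementation.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely indices whose total mass reaches top_p, renormalized."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}

# Made-up scores for four candidate tokens.
logits = [2.0, 1.0, 0.1, -1.0]
probs = softmax_with_temperature(logits, temperature=0.8)
nucleus = top_p_filter(probs, top_p=0.95)

# Sample one token index from the filtered, renormalized distribution.
choice = random.choices(list(nucleus), weights=list(nucleus.values()))[0]
print(nucleus, choice)
```

With these example scores, the least likely token falls outside the 0.95 nucleus and can never be sampled, while the remaining tokens are still chosen at random in proportion to their renormalized probabilities.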
