<a href="https://colab.research.google.com/github/brianMutea/LlamaIndex-Precision-and-Simplicity-in-Information-Retrieval/blob/main/LlamaIndex_Introduction_Precision_and_Simplicity_in_Information_Retrieval.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install -q llama-index==0.9.14.post3 openai==1.3.8 cohere==4.37 deeplake


In [22]:
import os
os.environ["OPENAI_API_KEY"] = "your_openai_api_key"
os.environ["ACTIVELOOP_TOKEN"] = "your_activeloop_token"

In [3]:
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

In [4]:
from llama_index import download_loader

WikipediaReader = download_loader("WikipediaReader")

loader = WikipediaReader()

docs = loader.load_data(pages=["Natural Language Processing", "Artificial Intelligence"])

print(len(docs))

2


## Creating Nodes

In LlamaIndex, ingested documents are transformed into `Node` objects. Nodes are smaller, more granular units of data created from the original documents. Besides their primary content, nodes also carry metadata and contextual information.

LlamaIndex provides a `NodeParser` class designed to convert the content of documents into structured nodes automatically. The `SimpleNodeParser` converts a list of document objects into nodes; in the cell below, `chunk_size=512` caps each chunk at 512 tokens and `chunk_overlap=20` lets consecutive chunks share 20 tokens.

In [5]:
from llama_index.node_parser import SimpleNodeParser

# initialize parser
parser = SimpleNodeParser.from_defaults(chunk_size=512, chunk_overlap=20)

# parse documents into nodes
nodes = parser.get_nodes_from_documents(docs)
print(len(nodes))

[nltk_data] Downloading package punkt to /tmp/llama_index...
[nltk_data]   Unzipping tokenizers/punkt.zip.


50
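
To make `chunk_size` and `chunk_overlap` more concrete, here is a minimal, library-free sketch of overlapping chunking. It is only an illustration: it counts whitespace-separated words, whereas `SimpleNodeParser` counts tokens and respects sentence boundaries, so real chunk counts will differ. The helper name `chunk_words` is hypothetical and not part of LlamaIndex.

```python
def chunk_words(text, chunk_size=8, chunk_overlap=2):
    """Split text into chunks of at most chunk_size words,
    where consecutive chunks share chunk_overlap words."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# 20 toy "words": w0 ... w19
sample = " ".join(f"w{i}" for i in range(20))
for chunk in chunk_words(sample, chunk_size=8, chunk_overlap=2):
    print(chunk)
```

With these toy settings, each chunk begins with the last two words of the previous one, which is the role `chunk_overlap` plays: context at a chunk boundary is not lost entirely.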


## Indices

At the heart of LlamaIndex is the ability to index and search data from varied sources such as documents, PDFs, and database queries. Indexing is the first step in storing information in a database: it transforms unstructured data into embeddings that capture semantic meaning, and it organizes the data so it can be accessed and queried efficiently.

LlamaIndex has a variety of index types, each fulfilling a specific role. Two popular index types are highlighted below.

* Summary Index - extracts a summary from each document and stores it with all the nodes in that document (in the library, this behavior corresponds to the `DocumentSummaryIndex` class).

* Vector Store Index - generates embeddings during index construction and, at query time, retrieves the top-k nodes most similar to the query.
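
The top-k retrieval idea behind a vector store index can be sketched in plain Python. This is a toy illustration, assuming cosine similarity as the measure and hand-written 3-dimensional vectors; real embeddings come from an embedding model, and the function names and node ids here are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_embedding, node_embeddings, k=2):
    """Return the ids of the k nodes most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, emb), node_id)
        for node_id, emb in node_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [node_id for _, node_id in scored[:k]]


# Toy "embeddings" keyed by a made-up node id
node_embeddings = {
    "node-nlp": [0.9, 0.1, 0.0],
    "node-ai": [0.7, 0.7, 0.1],
    "node-cooking": [0.0, 0.1, 0.9],
}
query_embedding = [1.0, 0.2, 0.0]

print(top_k(query_embedding, node_embeddings, k=2))
# prints ['node-nlp', 'node-ai']
```

A vector store performs the same kind of ranking, but over many high-dimensional vectors and usually with an optimized search structure rather than a full scan.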

The Wikipedia documents loaded earlier can be stored in a Deep Lake vector store, and an index object can be created from its data. The `DeepLakeVectorStore` class lets us create the dataset in Activeloop and append documents to it.

To connect to the platform, instantiate `DeepLakeVectorStore` with the dataset path as an argument.

In [6]:
from llama_index.vector_stores import DeepLakeVectorStore

my_activeloop_id = "brianmuteak"
my_activeloop_dataset_name = "LlamaIndex_intro"
dataset_path = f"hub://{my_activeloop_id}/{my_activeloop_dataset_name}"

# create a vector store
vector_store = DeepLakeVectorStore(dataset_path=dataset_path, overwrite=False)



Deep Lake Dataset in hub://brianmuteak/LlamaIndex_intro already exists, loading from the storage


Now, we need to create a storage context using the `StorageContext` class, with the Deep Lake dataset as its vector store. Passing this storage context to `VectorStoreIndex.from_documents` creates the index (generating the embeddings) and stores the results in the defined dataset.

In [7]:
from llama_index.storage.storage_context import StorageContext
from llama_index import VectorStoreIndex

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)

Uploading data to deeplake dataset.


100%|██████████| 25/25 [00:05<00:00,  4.66it/s]

Dataset(path='hub://brianmuteak/LlamaIndex_intro', tensors=['embedding', 'id', 'metadata', 'text'])

  tensor      htype      shape      dtype  compression
  -------    -------    -------    -------  ------- 
 embedding  embedding  (75, 1536)  float32   None   
    id        text      (75, 1)      str     None   
 metadata     json      (75, 1)      str     None   
   text       text      (75, 1)      str     None   


 

## Query Engines

The next step is to use the generated index to query the data. A query engine is a wrapper that combines a retriever and a response synthesizer into a pipeline: it uses the query string to fetch relevant nodes and then sends them to the LLM to generate a response. A query engine can be created by calling the `as_query_engine()` method on an already-created index.

In [8]:
# Reuse the `index` built in the previous cell; as_query_engine() wires up
# a retriever and a response synthesizer with default settings
query_engine = index.as_query_engine()
response = query_engine.query("What does NLP stand for?")
print(response.response)

NLP stands for Natural Language Processing.
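
The retriever-plus-synthesizer pipeline described above can be sketched without any library. This is a simplified illustration, not LlamaIndex's implementation: the retriever ranks nodes by word overlap instead of embeddings, and `fake_llm` is a hypothetical placeholder for a real LLM call. All function names here are made up for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query, nodes, top_k=2):
    """Stand-in retriever: rank nodes by the number of words shared with the query."""
    query_words = tokenize(query)
    ranked = sorted(nodes, key=lambda node: len(query_words & tokenize(node)), reverse=True)
    return ranked[:top_k]


def synthesize(query, retrieved_nodes, llm):
    """Stand-in response synthesizer: put the nodes in a prompt and call the LLM."""
    context = "\n".join(retrieved_nodes)
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return llm(prompt)


def fake_llm(prompt):
    """Placeholder for a real LLM call; it only reports the prompt size."""
    return f"(answer generated from a {len(prompt)}-character prompt)"


def query_engine(query, nodes, llm=fake_llm):
    """Retriever followed by response synthesizer, exposed as a single call."""
    return synthesize(query, retrieve(query, nodes), llm)


nodes = [
    "Natural language processing (NLP) is a subfield of computer science.",
    "Artificial intelligence is intelligence exhibited by machines.",
    "Deep Lake stores embeddings and metadata.",
]
print(query_engine("What does NLP stand for?", nodes))
```

The point of the wrapper is that the caller only supplies a query string; which nodes are fetched and how the prompt is assembled stay internal to the engine.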


## Saving and Loading Indexes Locally


Saving the index to disk can be useful, for example to speed up repeated testing. Storing here means saving the index data, which includes the nodes and their associated embeddings, to disk. This is done by calling the `persist()` method on the `storage_context` object attached to the index.

In [9]:
# persist the index data to disk
index.storage_context.persist()

# By default, the data is saved in the './storage' directory,
# which avoids repeating the processing on later runs

If the index already exists in storage, you can load it directly instead of recreating it. We simply need to determine whether the index already exists on disk and proceed accordingly:

In [20]:
# Index storage check

import os.path

from llama_index import (
    StorageContext,
    VectorStoreIndex,
    download_loader,
    load_index_from_storage,
)

# Check whether the index has already been persisted to disk
if not os.path.exists("./storage"):
    # If not, load the Wikipedia data and create a new index
    WikipediaReader = download_loader("WikipediaReader")
    loader = WikipediaReader()

    docs = loader.load_data(pages=["Natural Language Processing", "Artificial Intelligence"])
    index = VectorStoreIndex.from_documents(docs)
    # Persist the index to ./storage (the default directory)
    index.storage_context.persist()
else:
    # Otherwise, load the index from ./storage, reusing the Deep Lake
    # vector store (`vector_store`) created in the earlier cell
    storage_context = StorageContext.from_defaults(persist_dir="./storage", vector_store=vector_store)
    index = load_index_from_storage(storage_context=storage_context)
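
The check-then-load logic in the cell above is an instance of a general "build once, load afterwards" pattern. The sketch below shows that pattern with only the standard library, assuming a JSON file as the persisted format; `build_index`, `load_or_build`, and `index.json` are hypothetical names for illustration and do not reflect how LlamaIndex lays out its storage directory.

```python
import json
import os
import tempfile


def build_index(docs):
    """Stand-in for the expensive step (chunking, embedding, indexing)."""
    return {"nodes": [doc.lower() for doc in docs]}


def load_or_build(persist_dir, docs):
    """Load the saved index if present; otherwise build it and save it."""
    index_file = os.path.join(persist_dir, "index.json")
    if os.path.exists(index_file):
        with open(index_file, encoding="utf-8") as f:
            return json.load(f), "loaded"
    index = build_index(docs)
    os.makedirs(persist_dir, exist_ok=True)
    with open(index_file, "w", encoding="utf-8") as f:
        json.dump(index, f)
    return index, "built"


with tempfile.TemporaryDirectory() as tmp:
    persist_dir = os.path.join(tmp, "storage")
    _, first = load_or_build(persist_dir, ["Doc A", "Doc B"])
    _, second = load_or_build(persist_dir, ["Doc A", "Doc B"])
    print(first, second)
    # prints: built loaded
```

One design choice worth noting: the sketch tests for the index file itself rather than the directory, so an empty or partially written directory does not get mistaken for a saved index.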

In [21]:
# test
query_engine = index.as_query_engine()
response = query_engine.query("What is Artificial Intelligence?")
print(response.response)

Artificial intelligence (AI) refers to the intelligence exhibited by machines or software, as opposed to the intelligence of humans or other living beings. It is a field of study in computer science that focuses on developing and studying intelligent machines. AI technology is widely used in various industries, government sectors, and scientific research. It encompasses a range of applications such as advanced web search engines, recommendation systems, speech recognition, self-driving cars, generative and creative tools, and superhuman play in strategy games. The goals of AI research include reasoning, knowledge representation, planning, learning, natural language processing, perception, and support for robotics. The ultimate long-term goal of AI is to achieve general intelligence, which refers to the ability to perform any task that a human can do. AI researchers employ various problem-solving techniques, including search algorithms, mathematical optimization, formal logic, artificia

## LangChain vs. LlamaIndex
LangChain and LlamaIndex are both designed to extend the capabilities of LLMs, each with its own strengths.

**LlamaIndex** specializes in processing, structuring, and accessing private or domain-specific data for use in LLM interactions. It suits tasks that demand high precision and quality when working with specialized data. Its main strength lies in linking Large Language Models (LLMs) to any data source.

**LangChain** is dynamic, suited for context-rich interactions, and effective for applications like chatbots and virtual assistants. These features make it well suited to quick prototyping and application development.

Although the two are generally used independently, functions from LangChain and LlamaIndex can be combined where their strengths differ, so they can serve as complementary tools. We also designed a small table below to help you understand the differences better. The video attached to the course also aims to help you decide which tool to use for your application: LlamaIndex, LangChain, OpenAI Assistants, or building everything from scratch yourself.

Here’s a clear comparison of each to help you quickly grasp the essentials on a few relevant topics you may consider when choosing: