# What is LlamaIndex?

LLMs offer a natural language interface between humans and data. Widely
available models come pre-trained on huge amounts of publicly available data like Wikipedia, mailing lists, textbooks, source code and more.

However, while LLMs are trained on a great deal of data, they are not trained on your data, which may be private or specific to the problem you’re trying to solve. It’s behind APIs, in SQL databases, or trapped in PDFs and slide decks.

LlamaIndex solves this problem by connecting to these data sources and adding your data to the data LLMs already have. This is often called Retrieval-Augmented Generation (RAG). RAG enables you to use LLMs to query your data, transform it, and generate new insights. You can ask questions about your data, create chatbots, build semi-autonomous agents, and more.

# What is the difference between LlamaIndex and LangChain?

LangChain is a more general-purpose framework that can be used to build a wide variety of applications. It provides tools for loading, processing, and indexing data, as well as for interacting with LLMs. LangChain is also more flexible than LlamaIndex, allowing users to customize the behavior of their applications.

LlamaIndex is specifically designed for building search and retrieval applications. It provides a simple interface for querying LLMs and retrieving relevant documents. Because it focuses on indexing and retrieval, LlamaIndex is often the more efficient choice for applications that need to process large amounts of data.

#### So which one should you choose?


LangChain is ideal if you are looking for a broader framework to bring multiple tools together. LangChain is also suitable for building intelligent agents capable of performing multiple tasks simultaneously.

On the other hand, if your main goal is smart search and retrieval, LlamaIndex is a great choice. It excels in indexing and retrieval for LLMs, making it a powerful tool for deep exploration of data.


# How can LlamaIndex help?

LlamaIndex provides the following tools:


*   Data connectors:  ingest your existing data from their native source and format. These could be APIs, PDFs, SQL, and (much) more.
*   Data indexes: structure your data in intermediate representations that are easy and performant for LLMs to consume.
*   Engines: provide natural language access to your data. For example:
    *   Query engines are powerful retrieval interfaces for knowledge-augmented output.
    *   Chat engines are conversational interfaces for multi-message, “back and forth” interactions with your data.
*   Data agents: are LLM-powered knowledge workers augmented by tools, from simple helper functions to API integrations and more.
*   Application integrations: tie LlamaIndex back into the rest of your ecosystem. This could be LangChain, Flask, Docker, ChatGPT, or… anything else!

# Retrieval Augmented Generation (RAG)

LLMs are trained on enormous bodies of data but they aren’t trained on your data. Retrieval-Augmented Generation (RAG) solves this problem by adding your data to the data LLMs already have access to.

### Stages in RAG


1.   Loading Stage:

      *   Nodes and Documents:
      
            Document objects represent entire files, while Nodes are smaller chunks of the original document, sized so that an LLM can use them for Q&A.

      *   Connectors

            Plugins that allow us to take in data from a source (such as PDF files) and then use the loaded data in our LLM application.

      Many ready-made data loaders are available on LlamaHub: https://llamahub.ai/
           
2.   Indexing and Storing Stage

      *   Indexes and Embeddings

      Once you’ve ingested your data, LlamaIndex helps you index it into a structure that’s easy to retrieve. This stage involves generating vector embeddings, which are stored in a vector store. Metadata can be stored alongside them.
   
3.   Querying Stage

      *   Retrievers

          A retriever defines how to efficiently retrieve relevant context from a knowledge base (i.e. an index) when given a query.

      *   Node Postprocessors

          A node postprocessor takes in a set of retrieved nodes and applies transformations, filtering, or re-ranking logic to them.

      *   Response Synthesizers

          A response synthesizer generates a response from an LLM, using a user query and a given set of retrieved text chunks.

4.   Evaluation
      
     *    Evaluation provides objective measures of how accurate, faithful and fast your responses to queries are.
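
The loading, indexing, and querying stages above can be sketched in plain Python. This is a toy illustration under stated assumptions, not how LlamaIndex implements them: the `chunk`, `embed`, `cosine`, and `retrieve` helpers and the sample text are made up for this example, and the bag-of-words "embedding" stands in for a learned embedding model and a vector store.

```python
import math
from collections import Counter


def chunk(text, size=7):
    """Loading: split a document into small word-window 'nodes'."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Indexing: toy bag-of-words vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(index, query, top_k=1):
    """Querying: rank nodes by similarity to the query and keep the best ones."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


# Hypothetical sample document, invented for this sketch
document = (
    "Morning sunlight helps set the circadian clock. "
    "Caffeine late in the day can delay sleep onset."
)
nodes = chunk(document)
index = [(node, embed(node)) for node in nodes]

query = "sunlight circadian"
context = retrieve(index, query)
# The retrieved context is placed in the prompt that would be sent to the LLM
prompt = f"Context:\n{context[0]}\n\nQuestion: {query}\nAnswer:"
```

The response-synthesis step would send `prompt` to an LLM; it is left out here so the sketch runs without any model or API key.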




In [1]:
import openai
from sentence_transformers import SentenceTransformer
from llama_index import VectorStoreIndex, SimpleDirectoryReader, StorageContext, load_index_from_storage, ServiceContext
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index.embeddings import LangchainEmbedding
from transformers import AutoTokenizer, AutoModelForCausalLM
from huggingface_hub import login
from llama_index.query_engine import CustomQueryEngine
from llama_index.retrievers import BaseRetriever
from llama_index.prompts import PromptTemplate
from llama_index.response_synthesizers import (get_response_synthesizer,BaseSynthesizer,)

openai.api_key = "sk-xxx"




Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /Users/ravirajsavaliya/.cache/huggingface/token
Login successful


In [2]:
embedding_model = HuggingFaceEmbeddings(model_name ='sentence-transformers/all-mpnet-base-v2')

In [42]:
# Use the same local embedding model when building and when loading the index
service_context = ServiceContext.from_defaults(embed_model=embedding_model)
try:
    storage_context = StorageContext.from_defaults(persist_dir="./storage/cache/andrew/sleep/")
    index = load_index_from_storage(storage_context, service_context=service_context)
    print("Loading from disk")
except Exception:
    # No usable cache yet: build the index from the transcripts
    documents = SimpleDirectoryReader('./AndrewHuberman/sleep').load_data()
    index = VectorStoreIndex.from_documents(documents, service_context=service_context)
    # to store vector data in local storage
    index.storage_context.persist(persist_dir='./storage/cache/andrew/sleep/')


Loading from disk
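
The cell above follows a load-or-build caching pattern: reuse the persisted index if it exists, otherwise build it and write it to disk. A minimal sketch of the same idea using only the standard library is shown here. The `build_index` and `load_or_build` names and the JSON file layout are hypothetical and have nothing to do with LlamaIndex's own storage format.

```python
import json
import os
import tempfile


def build_index(docs):
    """Stand-in for an expensive indexing step (hypothetical)."""
    return {"num_docs": len(docs), "docs": docs}


def load_or_build(persist_path, docs):
    """Return (index, source): load the cache if present, else build and persist."""
    if os.path.exists(persist_path):
        with open(persist_path) as f:
            return json.load(f), "disk"
    index = build_index(docs)
    os.makedirs(os.path.dirname(persist_path), exist_ok=True)
    with open(persist_path, "w") as f:
        json.dump(index, f)
    return index, "built"


cache_dir = tempfile.mkdtemp()
path = os.path.join(cache_dir, "cache", "index.json")

# First call builds and persists; second call reads the persisted copy
first_index, first_source = load_or_build(path, ["doc a", "doc b"])
second_index, second_source = load_or_build(path, ["doc a", "doc b"])
```

Checking for the cache explicitly, rather than wrapping everything in a broad `except`, keeps unrelated errors (such as a typo in a variable name) from being silently treated as a cache miss.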


In [None]:
# Custom RAG

from llama_index.llms import OpenAI
from llama_index.prompts import PromptTemplate

qa_prompt = PromptTemplate(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)

class RAGStringQueryEngine(CustomQueryEngine):
    """RAG String Query Engine."""

    retriever: BaseRetriever
    response_synthesizer: BaseSynthesizer
    llm: OpenAI
    qa_prompt: PromptTemplate

    def custom_query(self, query_str: str):
        # Fetch the nodes most relevant to the query
        nodes = self.retriever.retrieve(query_str)

        # Join the node texts into a single context string
        context_str = "\n\n".join([n.node.get_content() for n in nodes])
        # Fill the engine's own prompt template and send it to the LLM
        response = self.llm.complete(
            self.qa_prompt.format(context_str=context_str, query_str=query_str)
        )
        return str(response)

In [4]:
query_engine = index.as_query_engine()
response = query_engine.query("How does sun light affect sleep?")
print(response)

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Sunlight affects sleep by playing a crucial role in regulating the circadian rhythm, which is the internal 24-hour clock that controls sleep-wake cycles. Exposure to bright sunlight, especially in the morning, helps to synchronize the circadian rhythm and promote alertness and wakefulness during the day. Sunlight exposure early in the day can make it easier to fall asleep at night and improve overall sleep quality. The intensity and duration of sunlight exposure are important factors, with brighter and longer exposure being more beneficial for regulating the circadian rhythm. Even on cloudy days, outdoor light is still brighter than indoor light and can have a positive impact on sleep. Conversely, lack of sunlight, such as staying indoors or using artificial light late at night, can disrupt the circadian rhythm and lead to sleep problems, including delayed sleep-wake cycles and difficulty falling asleep at night.


### Chapter 2

In [39]:
from llama_index.response.pprint_utils import pprint_response
from llama_index.retrievers import VectorIndexRetriever
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.indices.postprocessor import SimilarityPostprocessor, KeywordNodePostprocessor
from llama_index.response_synthesizers import get_response_synthesizer

In [15]:
documents = SimpleDirectoryReader('./AndrewHuberman/sleep').load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
response = query_engine.query("How to sleep better?")
pprint_response(response,show_source=True)



Final Response: To sleep better, there are several things you can try.
First, you can lower your core body temperature by taking a hot shower
or bath and then cooling off afterwards. This can help you relax and
prepare for sleep. Additionally, it's important to establish a
consistent sleep-wake cycle and avoid caffeine in the hours leading up
to bedtime. Creating a calming bedtime routine, such as dimming the
lights and avoiding stimulating activities, can also promote better
sleep. It's worth noting that alcohol and substances like THC may help
some people fall asleep, but they can disrupt the quality of sleep.
It's generally recommended to prioritize behavioral tools, nutrition,
and supplementation before considering prescription drugs. Other tools
that may improve sleep include using earplugs or an eye mask if they
are helpful for you, elevating your feet or the head side of your bed
to enhance sleep depth, and practicing nose breathing during sleep to
alleviate sleep apnea and prom


In [35]:
# Drops retrieved nodes whose similarity score is below 0.83
s_processor = SimilarityPostprocessor(similarity_cutoff=0.83)
# Drops retrieved nodes that mention the keyword "temperature"
k_processor = KeywordNodePostprocessor(exclude_keywords=["temperature"])
retrivever = VectorIndexRetriever(index=index, similarity_top_k=3)

# Swap in s_processor to filter by similarity score instead of by keyword
# query_engine = RetrieverQueryEngine(retriever=retrivever, node_postprocessors=[s_processor])
query_engine = RetrieverQueryEngine(retriever=retrivever, node_postprocessors=[k_processor])


## Filter the results by similarity cutoff (0.83) or by excluded keyword
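
To make the two filters concrete, here is a small sketch of what a similarity cutoff and a keyword exclusion do to a list of retrieved chunks. It is a simplification under stated assumptions: `ScoredChunk`, `similarity_filter`, and `exclude_keywords` are hypothetical names, the texts and scores are invented, and the keyword check is a plain substring match rather than whatever matching the library performs internally.

```python
from dataclasses import dataclass


@dataclass
class ScoredChunk:
    text: str
    score: float


def similarity_filter(chunks, cutoff):
    """Keep only chunks whose similarity score reaches the cutoff."""
    return [c for c in chunks if c.score >= cutoff]


def exclude_keywords(chunks, keywords):
    """Drop chunks whose text contains any excluded keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [c for c in chunks if not any(k in c.text.lower() for k in lowered)]


# Invented retrieval results for illustration only
retrieved = [
    ScoredChunk("Lower your core body temperature before bed.", 0.86),
    ScoredChunk("Earplugs help some sleepers and bother others.", 0.84),
    ScoredChunk("Caffeine late in the day delays sleep.", 0.80),
]

by_score = similarity_filter(retrieved, cutoff=0.83)
by_keyword = exclude_keywords(retrieved, ["temperature"])
```

With these sample values, the score filter drops the 0.80 chunk, while the keyword filter drops the chunk that mentions temperature regardless of its score.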

In [38]:
response = query_engine.query("What can i do for better sleep?")
pprint_response(response,show_source=True)
print(response)

Final Response: To improve your sleep, there are several things you
can try. Some people find using earplugs helpful, while others do not.
It's a personal preference, so you can see if they work for you. If
you sleep in a bright environment or need to sleep on a plane, using
an eye mask can be beneficial. Elevating your feet with a pillow or by
raising the end of your bed by about three to five degrees can
increase the depth of sleep. However, if you suffer from acid reflux,
it's better to elevate the head side of your bed instead. Sleep apnea,
which is bouts of suffocation or lack of oxygenation during sleep, can
be very detrimental to your health. If it's not too serious, training
yourself to be a nose breather while you sleep can help relieve sleep
apnea. You can tape your mouth shut before going to sleep to encourage
nose breathing and prevent snoring. It's also a good idea to be a nose
breather during exercise, as it can translate to being a nose breather
during sleep.
___________

In [40]:
# "no_text" skips the LLM call and returns only the retrieved source nodes
response_synthesizer = get_response_synthesizer(response_mode="no_text")

query_engine = RetrieverQueryEngine(retriever= retrivever, node_postprocessors= [k_processor],response_synthesizer= response_synthesizer)
response = query_engine.query("what can i do for better sleep?")
pprint_response(response,show_source=True)
print(response)

Final Response:
______________________________________________________________________
Source Node 1/1
Node ID: dca2bbbf-f39e-4bd7-834a-e8b564dba23e
Similarity: 0.8449979850780629
Text: I'm one such person. Although, I have family members that like
using earplugs when they sleep. So it's really up to you. You have to
see whether or not those earplugs help or disrupt your sleep. For me,
they're no good. For some people, they really enjoy them. I don't use
an eye mask unless I'm sleeping in a really bright environment or I
need t...
None


In [41]:
response = query_engine.query("what can i do for better sleep?")
print(response.source_nodes)

[NodeWithScore(node=TextNode(id_='dca2bbbf-f39e-4bd7-834a-e8b564dba23e', embedding=None, metadata={'file_path': 'AndrewHuberman/sleep/84_Sleep_Toolkit_Tools_for_Optimizing_Sleep_&_SleepWake_Timing_Huberman_Lab_Podcast_84.txt', 'file_name': '84_Sleep_Toolkit_Tools_for_Optimizing_Sleep_&_SleepWake_Timing_Huberman_Lab_Podcast_84.txt', 'file_type': 'text/plain', 'file_size': 113543, 'creation_date': '2023-11-22', 'last_modified_date': '2023-10-04', 'last_accessed_date': '2023-11-22'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='c24e6fad-804e-4501-9eca-a490ffe8e5e1', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'file_path': 'AndrewHuberman/sleep/84_Sleep_Toolkit_Tools_for_Optimizing_Sleep_&_SleepWake_Timi

## How to count tokens used when creating and querying LlamaIndex

#### Byte Pair Encoding
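
Byte pair encoding builds a vocabulary by repeatedly merging the most frequent adjacent pair of tokens into a single new token. The sketch below shows that merge loop on a short string of characters. It is a toy for intuition only: `most_frequent_pair` and `merge_pair` are hypothetical helpers, and production tokenizers such as `tiktoken` work on bytes with a merge table learned ahead of time.

```python
from collections import Counter


def most_frequent_pair(tokens):
    """Return the most common adjacent token pair, or None if there is none."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(tokens, pair):
    """Replace every non-overlapping occurrence of `pair` with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


text = "aaabdaaabac"
tokens = list(text)  # start from single characters
for _ in range(3):   # three merge rounds
    pair = most_frequent_pair(tokens)
    tokens = merge_pair(tokens, pair)

print(len(text), "characters ->", len(tokens), "tokens")
```

Because frequent sequences collapse into single tokens, the token count of a text is usually much lower than its character count, which is why token usage has to be measured with the tokenizer rather than estimated from string length.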


In [47]:
import tiktoken
from llama_index import ServiceContext
from llama_index.callbacks import CallbackManager, TokenCountingHandler

token_counter = TokenCountingHandler( 
    tokenizer= tiktoken.encoding_for_model("text-embedding-ada-002").encode,
    verbose= True,
)
callback_manager = CallbackManager([token_counter])
service_context = ServiceContext.from_defaults(callback_manager=callback_manager)

In [57]:
from llama_index import VectorStoreIndex, SimpleDirectoryReader, StorageContext, load_index_from_storage
try:
    storage_context = StorageContext.from_defaults(persist_dir='./storage/cache/andrew/sleep')
    # Pass service_context so the token counter's callbacks are attached to the index
    index = load_index_from_storage(storage_context, service_context=service_context)
    print('loading from disk')
except Exception:
    documents = SimpleDirectoryReader('./AndrewHuberman/sleep').load_data()
    index = VectorStoreIndex.from_documents(documents, service_context=service_context)
    index.storage_context.persist(persist_dir='./storage/cache/andrew/sleep/')
    print('persisting to disk')

loading from disk


In [58]:
response = index.as_query_engine().query("How does sleep enhance learning memory?")



In [59]:
print(response)

Sleep enhances learning and memory by facilitating the consolidation of information. During sleep, different stages of sleep play different roles in the learning process. Slow wave sleep, which occurs primarily in the early part of the night, is important for motor learning and the learning of specific details about specific events. It is during this stage that the brain experiences big amplitude activity and the release of neuromodulators like norepinephrine and serotonin. On the other hand, rapid eye movement (REM) sleep, which occurs throughout the night with a larger percentage towards morning, is involved in refreshing the memory and strengthening synapses. During REM sleep, the locus ceruleus is turned off, allowing for the weakening of synapses, which is an important part of lifelong learning. Additionally, the release of neurotransmitters like dopamine, norepinephrine, and galanin during REM sleep helps in rapid learning and strengthening synapses. Overall, sleep, particularly 

### Prompting

In [60]:
from llama_index.prompts import PromptTemplate

text_qa_template_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Using both the context information and also using your own knowledge, "
    "answer the question: {query_str}\n"
    "If the context isn't helpful, you can also answer the question on your own.\n"
)

text_qa_template = PromptTemplate(text_qa_template_str)
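
At query time the two placeholders in the template are filled in: `{context_str}` with the text of the retrieved chunks and `{query_str}` with the user's question. The substitution can be illustrated with plain `str.format`. This sketch does not use `PromptTemplate`, and the chunk texts are placeholders invented for the example.

```python
template = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Answer the question: {query_str}\n"
)

# Hypothetical retrieved chunks; a real query engine supplies these
retrieved_chunks = ["Chunk one text.", "Chunk two text."]

prompt = template.format(
    context_str="\n\n".join(retrieved_chunks),
    query_str="How does sleep enhance learning and memory?",
)
print(prompt)
```

Printing the filled prompt like this is a quick way to check that every line of a multi-line template ends with the newline you intended.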

In [61]:
text_qa_template_str = (
    "You are an Andrew Huberman assistant that can read Andrew Huberman podcast notes.\n"
    "Always answer the query only using the provided context information, "
    "and not prior knowledge.\n"
    "Some rules to follow:\n"
    "1. Never directly reference the given context in your answer.\n"
    "2. Avoid statements like 'Based on the context, ...' or "
    "'The context information ...' or anything along "
    "those lines.\n"
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Answer the question: {query_str}\n"
)

text_qa_template = PromptTemplate(text_qa_template_str)

In [62]:
response = index.as_query_engine(
    text_qa_template = text_qa_template 
).query("How does sleep enhance learning memory?")

In [63]:
print(response)

Sleep enhances learning and memory by facilitating two different types of memory consolidation processes during different stages of sleep. Slow wave sleep, which primarily occurs in the early part of the night, is important for motor learning and the learning of specific details about specific events. During slow wave sleep, there is a release of neuromodulators such as norepinephrine and serotonin, which contribute to the consolidation of motor skills and detailed learning. On the other hand, REM sleep, which occurs throughout the night with a larger percentage towards morning, plays a role in refreshing and strengthening long-term memories. During REM sleep, the locus ceruleus is turned off, allowing for the weakening of synapses and the integration of new information into long-term memory structures. The release of neurotransmitters like dopamine, norepinephrine, and galanin during REM sleep further supports the consolidation of memories and rapid learning. Overall, sleep, particula

In [69]:
display(response.source_nodes)

[NodeWithScore(node=TextNode(id_='6403748f-e029-4405-87c3-2f6ca792e8e7', embedding=None, metadata={'file_path': 'AndrewHuberman/sleep/05_Understanding_and_Using_Dreams_to_Learn_and_to_Forget_Huberman_Lab_Podcast_5.txt', 'file_name': '05_Understanding_and_Using_Dreams_to_Learn_and_to_Forget_Huberman_Lab_Podcast_5.txt', 'file_type': 'text/plain', 'file_size': 75844, 'creation_date': '2023-11-22', 'last_modified_date': '2023-10-04', 'last_accessed_date': '2023-11-22'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='39ab4dbe-f58f-4904-b565-dc03fc312fbd', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'file_path': 'AndrewHuberman/sleep/05_Understanding_and_Using_Dreams_to_Learn_and_to_Forget_Huberman_Lab_Podca

### Metadata

In [72]:
from llama_index import SimpleDirectoryReader
andrew_gina_docs = SimpleDirectoryReader(input_files=["./AndrewHuberman/sleep/115_Dr_Gina_Poe_Use_Sleep_to_Enhance_Learning_Memory_&_Emotional_State_Huberman_Lab_Podcast.txt"]).load_data()



In [79]:
# Assumption: llama-index 0.9.x, where MetadataExtractor is not importable and
# extractors run as transformations inside an IngestionPipeline.
from llama_index.extractors import (
    QuestionsAnsweredExtractor,
    TitleExtractor,
)
from llama_index.ingestion import IngestionPipeline
from llama_index.text_splitter import TokenTextSplitter

# Split each document into 512-token chunks with a 20-token overlap
text_splitter = TokenTextSplitter(separator=" ", chunk_size=512, chunk_overlap=20)

# Each extractor calls the LLM to attach metadata (a title, answerable questions) to every node
pipeline = IngestionPipeline(
    transformations=[
        text_splitter,
        TitleExtractor(nodes=5),
        QuestionsAnsweredExtractor(questions=3),
    ],
)

# Produces nodes with the extracted metadata attached
nodes = pipeline.run(documents=andrew_gina_docs)

In [84]:
pip install llama-index



Defaulting to user installation because normal site-packages is not writeable
You should consider upgrading via the '/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.


In [85]:
from llama_index.node_parser.extractors import MetadataExtractor


ModuleNotFoundError: No module named 'llama_index.node_parser.extractors'