LlamaIndex, previously known as GPT Index, is a data framework that LLM applications use to ingest, organize, and query private or specialized data. This notebook covers:

1.   Creating and Querying an Index
2.   Saving and Loading an Index
3.   Customizing the LLM
4.   Customizing the Prompt
5.   Customizing the Embedding Model


In [14]:
!pip install llama-index pypdf sentence_transformers langchain -q

In [2]:
import os
import openai
openai.api_key = ""
os.environ["OPENAI_API_KEY"] = ""
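
A key typed directly into a cell is easy to leak when the notebook is shared. As a minimal sketch of one alternative, assuming the key was exported in the shell that launched Jupyter, the key can be read from the environment and checked before anything else runs. `require_api_key` and the `DEMO_API_KEY` variable are hypothetical names used only for this illustration; they are not part of `openai` or `llama_index`.

```python
import os


def require_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in env_var, or fail early with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the notebook."
        )
    return key


# Demo with a throwaway variable so the sketch runs anywhere.
os.environ["DEMO_API_KEY"] = "sk-example"
print(require_api_key("DEMO_API_KEY"))  # prints "sk-example"
```

In the notebook itself, the call would be `require_api_key()` with the default variable name.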

# Create Index

Create a folder named `book` and put the PDF file you want to query inside it.
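
Before loading, it can help to confirm that the folder exists and actually contains a PDF. The sketch below uses only the standard library; `list_pdfs` is a hypothetical helper, and the demo runs against a temporary folder rather than the real `book` folder.

```python
import tempfile
from pathlib import Path
from typing import List


def list_pdfs(folder: str = "book") -> List[str]:
    """Return the sorted names of the PDF files directly inside folder."""
    path = Path(folder)
    if not path.is_dir():
        raise FileNotFoundError(
            f"Folder {folder!r} does not exist; create it and add a PDF."
        )
    return sorted(p.name for p in path.iterdir() if p.suffix.lower() == ".pdf")


# Demo on a temporary folder holding one PDF and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sample.pdf").touch()
    (Path(tmp) / "notes.txt").touch()
    print(list_pdfs(tmp))  # prints ['sample.pdf']
```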

In [4]:
from llama_index import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader('book').load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

[nltk_data] Downloading package punkt to /tmp/llama_index...
[nltk_data]   Unzipping tokenizers/punkt.zip.


In [5]:
response = query_engine.query("What is this text about?")
print(response)

This text appears to be the table of contents for a book titled "Bigger Leaner Stronger: The Simple Science of Building the Ultimate Male Body" by Mike Matthews. It provides an overview of the chapters and sections included in the book, as well as information about the author and other books written by him.
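
Conceptually, a vector index answers a query by embedding the question, comparing it with the embeddings of the stored text chunks, and passing only the closest chunks to the LLM as context. The toy sketch below illustrates that idea with a bag-of-words "embedding" and cosine similarity. It is not LlamaIndex's implementation: the real index uses a neural embedding model, and the chunks and function names here are made up for the example.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Score every chunk against the query and keep the k best."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Made-up chunks standing in for pieces of the PDF.
chunks = [
    "chapter one covers muscle growth",
    "chapter two covers fat loss",
    "about the author and his other books",
]
for score, chunk in top_k("what does the book say about fat loss", chunks, k=1):
    print(f"{score:.2f}  {chunk}")
```

Because the LLM only sees the retrieved chunks, a question whose answer is not in any of them cannot be answered from the context.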


In [7]:
response = query_engine.query("when was this book published")
print(response)

I'm sorry, but I cannot determine the publication date of the book based on the given context information.


In [8]:
response = query_engine.query("list 5 important points from this book")
print(response)

This book discusses various topics related to health and fitness. Some important points from this book could include:
1. The book debunks common myths and mistakes related to muscle building and fat loss.
2. It emphasizes that building bigger and stronger muscles is easier than commonly believed.
3. The book highlights the importance of setting fitness goals that will motivate and drive long-term dedication.
4. It emphasizes the significance of having a good training partner for effective workouts.
5. The book emphasizes the importance of measuring progress and understanding it in order to achieve desired results.


In [9]:
response = query_engine.query("what does the book say about common myths and mistakes related to muscle building and fat loss")
print(response)

The book addresses common myths and mistakes related to muscle building and fat loss. It mentions that there are five common myths and mistakes of getting ripped, and it aims to dispel them. Additionally, it states that building bigger, stronger muscles is much easier than most people believe, and that getting shredded is impossible if certain traps are fallen into. The book also discusses the real science of muscle growth and the real science of healthy fat loss, providing three simple rules for effective fat loss methods.


# Saving and Loading Index

In [10]:
index.storage_context.persist("m_index")

In [11]:
from llama_index import StorageContext, load_index_from_storage

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir="m_index")
# load index
new_index = load_index_from_storage(storage_context)

In [12]:
new_query_engine = new_index.as_query_engine()
response = new_query_engine.query("who is this text about?")
print(response)

This text is about Mike Matthews.
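
Persisting pays off when later runs reuse the saved index instead of rebuilding it. The sketch below shows that "load if present, otherwise build" pattern in plain Python. `load_or_build` is a hypothetical helper, and the `build` and `load` callables in the demo are stand-ins; in the notebook they would wrap the `VectorStoreIndex.from_documents(...)` plus `persist(...)` calls and the `load_index_from_storage(...)` call shown above.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(
    persist_dir: str, build: Callable[[], T], load: Callable[[str], T]
) -> T:
    """Load from persist_dir when it exists and is non-empty, else build."""
    if os.path.isdir(persist_dir) and os.listdir(persist_dir):
        return load(persist_dir)
    return build()


# Demo with stand-in callables; no real index is created here.
with tempfile.TemporaryDirectory() as tmp:
    store = os.path.join(tmp, "m_index")

    def build() -> str:
        os.makedirs(store)
        with open(os.path.join(store, "placeholder.json"), "w") as fh:
            fh.write("{}")
        return "built"

    def load(path: str) -> str:
        return "loaded"

    print(load_or_build(store, build, load))  # prints "built"
    print(load_or_build(store, build, load))  # prints "loaded"
```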


# Customizing the LLM

In [17]:
from llama_index import LLMPredictor, ServiceContext
from langchain.chat_models import ChatOpenAI

# Wrap the chat model so LlamaIndex can use it
llm_predictor = LLMPredictor(
    llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo", openai_api_key="")
)

# The service context carries the custom LLM into the index
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)


custom_llm_index = VectorStoreIndex.from_documents(
    documents, service_context=service_context
)

In [18]:
custom_llm_query_engine = custom_llm_index.as_query_engine()
response = custom_llm_query_engine.query("who is this text about?")
print(response)

The text is about Mike Matthews.
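
The cell above sets `temperature=0`. As a conceptual sketch of what that setting means, temperature rescales the model's scores before they become sampling probabilities: lower values concentrate probability on the top candidate, and 0 is treated here as always picking the highest score. The actual sampling happens on the API side; the scores below are made up, and `softmax_with_temperature` is a hypothetical name for this illustration.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities; temperature 0 means greedy argmax."""
    if temperature == 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0, 0.7, 1.5):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```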


# Custom Prompt

In [19]:
from llama_index import Prompt

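# Adjacent string literals are concatenated, so no backslash continuations
# are needed (those would embed each line's indentation in the prompt).
template = (
    "We have provided context information below.\n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given this information, "
    "please answer the question and each answer "
    "should start with code word AI Demos: {query_str}\n"
)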
qa_template = Prompt(template)

In [20]:
query_engine = custom_llm_index.as_query_engine(text_qa_template=qa_template)
response = query_engine.query("who is this text about?")
print(response)

AI Demos: This text is about Michael Matthews.
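
At query time, the retrieved text and the user's question are substituted into the `{context_str}` and `{query_str}` placeholders before the prompt is sent to the LLM. The sketch below uses plain `str.format` as a stand-in for that substitution, which is an assumption about the mechanics rather than LlamaIndex's own code, and the context string is made up for the example.

```python
template = (
    "We have provided context information below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given this information, please answer the question and each answer "
    "should start with code word AI Demos: {query_str}\n"
)

# Made-up context; in the notebook this comes from the retrieved chunks.
filled = template.format(
    context_str="Bigger Leaner Stronger was written by Michael Matthews.",
    query_str="who is this text about?",
)
print(filled)
```

Printing the filled prompt like this is a quick way to check that the wording and spacing are what the model will actually see.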


# Custom Embedding

In [21]:
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index import LangchainEmbedding, ServiceContext

# load in HF embedding model from langchain
embed_model = LangchainEmbedding(HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2'))
service_context = ServiceContext.from_defaults(embed_model=embed_model)

new_index = VectorStoreIndex.from_documents(
    documents,
    service_context=service_context,
)

In [22]:
query_engine = new_index.as_query_engine()
response = query_engine.query("list 5 important points from this book")
print(response)

The book "Bigger Leaner Stronger: The Simple Science of Building the Ultimate Male Body" by Michael Matthews contains important points related to fitness and bodybuilding. Some of the key points from the book could include:

1. The author promises that by following the information provided in the book, readers will be able to achieve a better physique and overall well-being.
2. The book emphasizes the importance of following a structured workout and nutrition plan to achieve desired results.
3. The author encourages readers to provide feedback and leave reviews on platforms like Amazon, indicating that he values the opinions and experiences of his readers.
4. The book offers additional resources and ways to connect with the author, such as through social media platforms like Facebook, Twitter, and Google Plus, as well as through his website and email address.
5. The author mentions that he answers questions from readers every day and is willing to provide assistance and support to thos

In [23]:
query_engine = new_index.as_query_engine()
response = query_engine.query("what does the book say about the importance of following a structured workout")
print(response)

The book emphasizes the importance of following a structured workout. It suggests keeping a training journal to track progress and make notes about strength, weaknesses, and any issues that may arise during workouts. By following a structured workout and keeping track of progress, individuals can ensure continuous improvement and avoid getting stuck or regressing in their fitness journey.


# Do you want to learn more?


**If you're hungry for more insights into topics like these, check out my YouTube channel**
https://www.youtube.com/@codewello

**LlamaIndex docs**
https://gpt-index.readthedocs.io/en/stable/index.html