4. This can be separate
    - composability
    - different indices

# Advanced Indices

Indices are thin data structures that add structure and metadata to your documents so that the right information can be retrieved for the LLM. In this notebook we discuss the advantages and disadvantages of the most popular index types and see them in action. We'll also learn how to compose different types of indices to get the best out of each of them.
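To make "thin data structure" concrete, here is a toy sketch in plain Python. It is not how LlamaIndex implements its indices; the three text chunks and the `lookup` helper are made up for illustration. It only shows that the same nodes can be organised in different ways (an ordered list versus a keyword table), which changes how they are retrieved.

```python
from collections import defaultdict

# Hypothetical toy corpus: each "node" is one chunk of text from a document.
nodes = [
    "Wealth is what people want, and money is a way of moving it.",
    "Startups compress a working life into a few years.",
    "The best startup ideas come from problems the founders have.",
]

# A list-style index simply keeps the node ids in document order.
list_index = list(range(len(nodes)))

# A keyword-table-style index maps each (longer) word to the node ids containing it.
keyword_index = defaultdict(set)
for node_id, text in enumerate(nodes):
    for word in text.lower().replace(",", " ").replace(".", " ").split():
        if len(word) > 4:  # crude stand-in for keyword extraction
            keyword_index[word].add(node_id)


def lookup(keyword):
    """Return the sorted ids of the nodes that mention `keyword`."""
    return sorted(keyword_index.get(keyword.lower(), set()))


print(lookup("startup"))
print(lookup("wealth"))
```

The list index has to visit every node at query time, while the keyword table jumps straight to the matching nodes. That trade-off between coverage and cost comes up again with the real indices in the rest of the notebook.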

In [2]:
# Make sure you have a valid OpenAI API key and that it is set as the
# environment variable "OPENAI_API_KEY".
import os
import logging
import sys

from IPython.display import Markdown, display

from llama_index import (
    GPTVectorStoreIndex, 
    GPTSimpleKeywordTableIndex, 
    SimpleDirectoryReader,
    LLMPredictor,
    ServiceContext
)
from langchain.llms.openai import OpenAIChat

if os.environ.get("OPENAI_API_KEY") is None:
    logging.error("OPENAI_API_KEY not Found! Please add it as an env var")
    
# Uncomment the two lines below to enable logging and see what is happening under the hood.
#logging.basicConfig(stream=sys.stdout, level=logging.INFO)
#logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

logger = logging.getLogger()
logger.disabled = True

In [42]:
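from os import listdir
from llama_index import SimpleDirectoryReader

# Map a readable essay title to the documents loaded from its file.
pg_docs = {}
essays_list = listdir('./essays/')

for file_name in essays_list:
    # skip anything that is not a Markdown essay
    if not file_name.endswith('.md'):
        continue
    key = file_name[:-3]  # drop the ".md" extension
    # drop the prefix before the first "_" and join the remaining words with spaces
    key = ' '.join(key.split('_')[1:])
    pg_docs[key] = SimpleDirectoryReader(
        input_files=[f"./essays/{file_name}"]
    ).load_data()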

startup_blogs = [
    "how to get startup ideas",
    "startup  growth", "schlep blindness", 
    "organic startup ideas",
    "the 18 mistakes that kill startups", 
    "why to not not start a startup",
    "what ive learned from users", "how to make wealth"
]

for blog in startup_blogs:
    assert blog in pg_docs, f"{blog} not found"
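The key derivation in the loading cell can be checked on its own, without reading any files. The sketch below assumes the essays are named like `028_how_to_make_wealth.md` (a numeric prefix, underscores between words); that file name and the helper names `essay_key` and `is_essay` are assumptions made for illustration.

```python
from pathlib import Path


def is_essay(file_name: str) -> bool:
    """True only for files whose extension is exactly '.md'."""
    return Path(file_name).suffix == ".md"


def essay_key(file_name: str) -> str:
    """Turn e.g. '028_how_to_make_wealth.md' into 'how to make wealth'."""
    stem = Path(file_name).stem            # drop the '.md' extension
    return " ".join(stem.split("_")[1:])   # drop the prefix, join words with spaces


# Hypothetical file names, used only to exercise the helpers.
print(essay_key("028_how_to_make_wealth.md"))
print(is_essay("notes.md.bak"))
```

Checking the suffix instead of testing whether `'.md'` appears anywhere in the name avoids picking up files such as backups, whose names would otherwise be sliced incorrectly.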

In [17]:

# Configure the LLM (gpt-3.5-turbo by default; the gpt-4 variant is commented out below)
chatgpt = LLMPredictor(llm=OpenAIChat(temperature=0, model_name="gpt-3.5-turbo"))
service_context = ServiceContext.from_defaults(
    llm_predictor=chatgpt, chunk_size_limit=1024
)

# gpt4 = LLMPredictor(llm=OpenAIChat(temperature=0, model_name="gpt-4"))
# service_context = ServiceContext.from_defaults(
#     llm_predictor=gpt4, chunk_size_limit=1024
# )



In [18]:
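# NOTE: run this cell only after `tree_index` has been built (see the GPTTreeIndex cell below).
# Running it first raises: NameError: name 'tree_index' is not defined
len(tree_index.docstore.docs)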

In [46]:
from llama_index import GPTTreeIndex, GPTListIndex, GPTKeywordTableIndex, GPTEmptyIndex, GPTVectorStoreIndex
from llama_index.indices.document_summary import GPTDocumentSummaryIndex

In [36]:
from llama_index import GPTTreeIndex

tree_index = GPTTreeIndex.from_documents(
    pg_docs['how to make wealth'],
    service_context=service_context
)
tree_qe = tree_index.as_query_engine()
resp = tree_qe.query("what are the underlying principles of startups and entrepreneurship")
print(resp)

The essay discusses the economic proposition of startups and entrepreneurship, including the idea that startups involve technology and take on hard technical problems. The author also suggests that startups allow individuals to compress their working life into a few years and be more productive than in a corporate job. The essay does not provide a comprehensive list of underlying principles of startups and entrepreneurship.
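Conceptually, a tree index is built bottom-up: the leaf nodes are the text chunks, and each parent node holds a summary of a few children. The sketch below is a plain-Python illustration of that idea under stated assumptions, not the library's code. `build_tree`, `fake_summarize`, and the `num_children` parameter are hypothetical names, and the "summary" is just string concatenation standing in for an LLM call.

```python
from typing import Callable, List


def build_tree(
    chunks: List[str],
    summarize: Callable[[List[str]], str],
    num_children: int = 2,
) -> List[List[str]]:
    """Return the tree level by level: leaves first, root last."""
    if num_children < 2:
        raise ValueError("num_children must be at least 2")
    levels = [list(chunks)]
    # Keep grouping the current level until a single root remains.
    while len(levels[-1]) > 1:
        current = levels[-1]
        parents = [
            summarize(current[i:i + num_children])
            for i in range(0, len(current), num_children)
        ]
        levels.append(parents)
    return levels


def fake_summarize(texts: List[str]) -> str:
    # Stand-in for an LLM summary: just show which children were merged.
    return "(" + " + ".join(texts) + ")"


levels = build_tree(["a", "b", "c", "d", "e"], fake_summarize)
for depth, level in enumerate(levels):
    print(depth, level)
```

Because every parent is a summary, building the tree costs LLM calls up front, and a query that walks from the root down only sees the branches it selects. That may be part of why the answer above notes that it cannot give a comprehensive list.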


In [59]:
from llama_index import GPTListIndex

list_index = GPTListIndex.from_documents(
    pg_docs['how to make wealth'],
    service_context=service_context
)
list_qe = list_index.as_query_engine()
resp = list_qe.query(
    "what are the key take aways? explain as if you are paul graham")
print(resp)

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 23323 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens


The key takeaways from Paul Graham's essay are that creating wealth is a legitimate and straightforward way to get rich, and starting or joining a startup in the technology industry is a reliable path to wealth creation. Graham emphasizes that wealth is not the same thing as money and that businesses create wealth by doing something people want. He also notes that there is not a fixed amount of wealth in the world and that wealth can be created and destroyed.

Graham highlights the importance of hard work, endurance, and a willingness to take on hard technical problems in creating a successful startup. He notes that programmers are among the few remaining craftsmen who can create wealth by sitting down in front of a computer and writing software. Graham also points out that there are huge variations in the rate at which wealth is created and that the best programmers can be 36 times more productive than in a random corporate job, earning up to $3 million a year.

Furthermore, Graham ar
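The token log above shows that building the list index used 0 LLM tokens, while the query used 23323. A likely explanation is that a list index sends every node to the LLM at query time and refines the answer as it goes. The following is a minimal sketch of that create-and-refine loop, assuming a stand-in `fake_llm` instead of a real model; the function names are hypothetical and this is not the library's implementation.

```python
def refine_over_all_nodes(nodes, query, llm):
    """Visit every node in order, letting the LLM refine the running answer."""
    answer = None
    llm_calls = 0
    for node in nodes:
        # First call creates an answer; later calls refine the previous one.
        answer = llm(query, node, answer)
        llm_calls += 1
    return answer, llm_calls


def fake_llm(query, context, previous_answer):
    # Stand-in for a real model call: it only records which chunk it saw.
    note = f"saw '{context}'"
    return note if previous_answer is None else f"{previous_answer}; {note}"


chunks = ["chunk 1", "chunk 2", "chunk 3"]
answer, llm_calls = refine_over_all_nodes(
    chunks, "what are the key takeaways?", fake_llm
)
print(llm_calls)
print(answer)
```

The number of LLM calls grows with the number of chunks, so this style of index gives broad coverage of a document at the price of a higher per-query cost.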

In [20]:
# summary index

from llama_index.indices.document_summary import GPTDocumentSummaryIndex

summary_index = GPTDocumentSummaryIndex.from_documents(
    documents=pg_docs['how to make wealth']
)

None
current doc id: 4b8cfb61-b171-4964-97b4-c3e072bce91c


In [16]:
qe = summary_index.as_query_engine()
response = qe.query("how to make wealth? explain in bullet points")
print(response)



• Start or join a small group working on a hard problem
• Work hard and fast, with an emphasis on productivity
• Compress your working life into a few years
• Do something people want
• Understand that money is not wealth
• Disprove the Pie Fallacy
• Create wealth by making things
• Craftsmen can create wealth by making things
• Understand that a job means doing something people want
• Working harder can create more wealth
• Look for jobs with measurement and leverage
• Small groups working on hard problems can create wealth
• Understand that smallness equals measurement
• Recognize that upside must be balanced by downside
• Be aware that if there is no danger, there is likely no leverage
• Use difficulty as a guide not just in selecting the overall aim of your company, but also at decision points along the way
• Develop technology that's too hard for competitors to duplicate
• Pick a hard problem and take the harder choice at every decision point
• Be aware that success is often an 

## Composition

In [22]:
tree_index = GPTTreeIndex.from_documents(
    pg_docs['how to make wealth'],
    service_context=service_context
)

list_index = GPTListIndex.from_documents(
    pg_docs['how to make wealth'],
    service_context=service_context
)

summary_index = GPTDocumentSummaryIndex.from_documents(
    documents=pg_docs['how to make wealth']
)
empty_index = GPTEmptyIndex()

None
current doc id: 4b8cfb61-b171-4964-97b4-c3e072bce91c


In [44]:
empty_index = GPTEmptyIndex(service_context=service_context)


In [45]:
qe = empty_index.as_query_engine()
r = qe.query("what is the essay about?")
print(r)

None


In [47]:
startup_ideas = GPTVectorStoreIndex.from_documents(
    pg_docs["how to get startup ideas"],
    service_context=service_context
)
wealth = GPTVectorStoreIndex.from_documents(
    pg_docs['how to make wealth'],
    service_context=service_context
)
why_startup = GPTVectorStoreIndex.from_documents(
    pg_docs["why to not not start a startup"],
    service_context=service_context
)

In [48]:
from llama_index.indices.composability import ComposableGraph

graph = ComposableGraph.from_indices(
    GPTTreeIndex,
    [startup_ideas, wealth, why_startup],
    index_summaries=[
        "explains how to get startup ideas",
        "explains how to make wealth and why startups are a good choice",
        "explains the reasons why you should start a startup",
    ]
)
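The `index_summaries` are what the root of the graph uses to decide which child index should handle a query. In the real graph that decision is made by the LLM; the sketch below replaces it with simple word overlap so it can run without an API key. The `pick_child` helper is hypothetical and only illustrates the routing idea.

```python
summaries = [
    "explains how to get startup ideas",
    "explains how to make wealth and why startups are a good choice",
    "explains the reasons why you should start a startup",
]


def pick_child(query, summaries):
    """Return the position of the summary sharing the most words with the query."""
    query_words = set(query.lower().split())
    scores = [len(query_words & set(s.lower().split())) for s in summaries]
    return scores.index(max(scores))


print(pick_child("how to make wealth", summaries))
print(pick_child("where do startup ideas come from", summaries))
```

Whatever the selection mechanism, the quality of the summaries matters: a vague or misleading summary sends the query to the wrong child index.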

In [51]:
qe = graph.as_query_engine()
r = qe.query("how do you get wealthy?")
print(r)


You can get wealthy by creating wealth and getting paid for it, as well as by taking advantage of chance, speculation, marriage, inheritance, theft, extortion, fraud, monopoly, graft, lobbying, counterfeiting, and prospecting.


In [52]:
print(r.get_formatted_sources())

> Source (Doc id: 7d1c8d8c-af94-45b8-90a3-2288f7db8b17): According to the essay, one way to get wealthy is by creating wealth and getting paid for it. Thi...

> Source (Doc id: 2a801093-5f24-4372-af51-f15d64f24d02): wealth creation. They just represent a point at the far end of the curve. There is a conservation...

> Source (Doc id: 8f72f854-4353-45af-9233-d54615c903c9): some imaginary Daddy.   
  
 It's also obvious to programmers that there are huge variations in t...


In [53]:
qe = graph.as_query_engine()
r = qe.query("why should I start a startup? how can I get ideas")
print(r)


Starting a startup can be a great way to make a difference in the world and to create something that can have a lasting impact. It can also be a great way to make money and to gain experience in the business world. To get ideas for a startup, one can look for something that is missing in their own life and try to supply that need, even if it seems specific to them. Additionally, it is possible to look at current trends and problems in the world and try to come up with a solution. Finally, it is important to find a cofounder to share the workload and to brainstorm ideas.


In [54]:
print(r.get_formatted_sources())

> Source (Doc id: 94b6b511-f47d-4efd-b38f-b8ae72035ec2): According to the context information, starting a startup can be a good idea because there is no l...

> Source (Doc id: accd98b6-b582-459d-9f20-a40e172e05eb): In the average Y Combinator startup, I'd guess 70% of the idea is new at the end of the first thr...

> Source (Doc id: 57acf0a4-2380-4487-81fc-2186a1546435): holds for them too: if users love you, you can always make money from that somehow, and if they d...


In [55]:
qe = graph.as_query_engine()
r = qe.query("which startup ideas will make me weathly?")
print(r)


The exact startup ideas that will make you wealthy will depend on your individual skills, interests, and the current market conditions. However, the essay suggests that the key to generating wealth is to understand what people want and make customers happy. Therefore, you should focus on creating a startup that solves a problem that people have and that they are willing to pay for. Additionally, you should focus on creating a product or service that is of high quality and that customers are satisfied with.


In [56]:
print(r.get_formatted_sources())

> Source (Doc id: 37c76376-755c-4b1d-8f2b-00f72e3151e1): The essay does not provide specific startup ideas that will make someone wealthy. Instead, it dis...

> Source (Doc id: 92793f54-8e2a-433d-8b19-9beb1e3fae5d): 

028 How to Make Wealth


  
 
  
 **Want to start a startup?** Get funded by Y Combinator.   
 ...

> Source (Doc id: 4063b5c2-31bc-45d6-a955-6fa03d9e935b): where your program is slow, and what would make it faster, you almost always guess wrong.   
  
 ...


In [60]:
full_index = GPTVectorStoreIndex.from_documents(
    documents=[
        pg_docs["how to get startup ideas"][0],
        pg_docs['how to make wealth'][0],
        pg_docs["why to not not start a startup"][0],
    ],
    service_context=service_context
)

In [62]:
qe = full_index.as_query_engine()
r = qe.query("which startup ideas will make me weathly?")
print(r)

The essay does not provide specific startup ideas that will make someone wealthy. Instead, it discusses the economic proposition of startups and the potential for individuals to earn a high income by working hard and solving hard technical problems. It also emphasizes the importance of understanding what people want and making customers happy in order to generate wealth.


In [63]:
r = qe.query("how do you get wealthy?")
print(r)

According to the essay, one way to get wealthy is by creating wealth and getting paid for it. This involves doing something that people want and creating value. However, the essay also notes that there are many other ways to get money, including chance, speculation, marriage, inheritance, theft, extortion, fraud, monopoly, graft, lobbying, counterfeiting, and prospecting.


In [64]:
r = qe.query("why should I start a startup? how can I get ideas")
print(r)

According to the article, the best way to get startup ideas is to look for problems, preferably problems you have yourself. The article also emphasizes the importance of working on a problem that really exists and that there are users who really need what you're making. The article suggests that good startup ideas tend to have three things in common: they're something the founders themselves want, that they themselves can build, and that few others realize are worth doing. The article does not explicitly state why someone should start a startup, but it does provide guidance on how to come up with good startup ideas.
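The last answer only draws on the startup-ideas essay. One plausible reason is that a single vector index retrieves just the few chunks most similar to the query, regardless of which essay they come from. Here is a toy sketch of top-k retrieval by cosine similarity; the two-dimensional vectors and chunk names are invented for illustration, whereas real embeddings have many more dimensions and come from an embedding model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=1):
    """Return the names of the k chunks most similar to the query."""
    ranked = sorted(
        chunk_vecs,
        key=lambda name: cosine(query_vec, chunk_vecs[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embeddings: first axis ~ "ideas", second axis ~ "wealth".
chunk_vecs = {
    "ideas essay, chunk 1": [0.9, 0.1],
    "wealth essay, chunk 1": [0.1, 0.9],
    "why-startup essay, chunk 1": [0.6, 0.4],
}
query_vec = [1.0, 0.2]

print(top_k(query_vec, chunk_vecs, k=1))
print(top_k(query_vec, chunk_vecs, k=2))
```

With a small `k`, a two-part question can be answered from chunks of a single essay. The composed graph behaves differently because each child index is queried separately before the results are combined.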
