## GPT Keyword Table Index Comparisons

This notebook compares three keyword table indexes: GPTSimpleKeywordTableIndex, GPTRAKEKeywordTableIndex, and GPTKeywordTableIndex. They differ only in how keywords are extracted:

- GPTSimpleKeywordTableIndex - uses a simple regex to extract keywords.
- GPTRAKEKeywordTableIndex - uses RAKE (Rapid Automatic Keyword Extraction) to extract keywords.
- GPTKeywordTableIndex - uses GPT (an LLM call) to extract keywords.
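
To make the regex-based option concrete, here is a minimal sketch of what "simple regex" keyword extraction can look like. It is not the library's implementation: the function name `extract_keywords_simple`, the tiny `STOPWORDS` set, and the `max_keywords` parameter are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list (illustration only;
# a real extractor would use a much larger list, e.g. from NLTK).
STOPWORDS = {
    "a", "an", "the", "what", "did", "do", "after",
    "his", "time", "at", "of", "to", "in", "and",
}


def extract_keywords_simple(text, max_keywords=10):
    """Return unique lowercase word tokens from `text`, in first-seen order.

    Stopwords and single-character tokens are dropped.
    """
    tokens = re.findall(r"\w+", text.lower())
    keywords = []
    for token in tokens:
        if len(token) > 1 and token not in STOPWORDS and token not in keywords:
            keywords.append(token)
    return keywords[:max_keywords]


print(extract_keywords_simple("What did the author do after his time at Y Combinator?"))
```

The keywords logged by the notebook cells may differ from what this sketch prints, since the library applies its own stopword list and filtering.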

#### GPTSimpleKeywordTableIndex

In [1]:
from llama_index import GPTSimpleKeywordTableIndex, SimpleDirectoryReader
from IPython.display import Markdown, display

In [None]:
# build keyword index
documents = SimpleDirectoryReader('data').load_data()
index = GPTSimpleKeywordTableIndex.from_documents(documents)
query_engine = index.as_query_engine()

In [None]:
response = query_engine.query("What did the author do after his time at YC?")

In [5]:
display(Markdown(f"<b>{response}</b>"))

<b>

The author went on to write essays and work on other projects, including a new version of the Arc programming language and Hacker News. He also started painting, but stopped after a few months. In 2015, he started working on a new Lisp programming language, which he finished in 2019. The author then moved to England in 2016 with his family and continued writing essays. In 2019, he finished Bel and wrote a bunch of essays on various topics.

The author also worked on building online stores in 1995 after finishing ANSI Common Lisp. He ran the software on servers and let users control it by clicking on links, which was a new concept at the time. In 1996, he co-founded Viaweb with Robert Morris, which was later acquired by Yahoo in 1998. After leaving Yahoo, the author moved back to New York and started painting again. In 2000, he had the idea for a web application that would let people edit code on a server and host the resulting applications, which later became known as "Reddit".</b>

#### GPTRAKEKeywordTableIndex

In [1]:
from llama_index import GPTRAKEKeywordTableIndex, SimpleDirectoryReader
from IPython.display import Markdown, display

In [None]:
# build keyword index
documents = SimpleDirectoryReader('data').load_data()
index = GPTRAKEKeywordTableIndex.from_documents(documents)
query_engine = index.as_query_engine()

In [10]:
response = query_engine.query("What did the author do after his time at YC?")

> Starting query: What did the author do after his time at YC?
Extracted keywords: []


In [11]:
display(Markdown(f"<b>{response}</b>"))

<b>Empty response</b>

#### GPTKeywordTableIndex

In [7]:
from llama_index import GPTKeywordTableIndex, SimpleDirectoryReader
from IPython.display import Markdown, display

In [None]:
# build keyword index
documents = SimpleDirectoryReader('data').load_data()
index = GPTKeywordTableIndex.from_documents(documents)
query_engine = index.as_query_engine()

In [None]:
response = query_engine.query("What did the author do after his time at Y Combinator?")

In [10]:
display(Markdown(f"<b>{response}</b>"))

<b>

After a few years, the author decided to step away from Y Combinator to focus on other projects, such as painting and writing essays. In 2013, he handed over control of Y Combinator to Sam Altman. The author's mother passed away in 2014, and after taking some time to grieve, he returned to writing essays and working on Lisp. He continued working on Lisp until 2019, when he finally completed the project.

In 2015, the author decided to move to England with his family. They originally intended to only stay for a year, but ended up liking it so much that they remained there. The author wrote Bel while living in England. In 2019, he finally finished the project. After completing Bel, the author wrote a number of essays on various topics. He continued writing essays through 2020, but also started thinking about other things he could work on.</b>

## GPT Keyword Table Query Comparisons
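
Build a single GPTKeywordTableIndex, then compare query results with retriever_mode set to "default", "simple", or "rake". These modes extract keywords from the query using GPT, a simple regex, and RAKE, respectively.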

In [None]:
# build table with default GPTKeywordTableIndex
from llama_index import GPTKeywordTableIndex, SimpleDirectoryReader
from IPython.display import Markdown, display

documents = SimpleDirectoryReader('data').load_data()
index = GPTKeywordTableIndex.from_documents(documents)

In [3]:
# default
query_engine = index.as_query_engine(
    retriever_mode="default"
)
response = query_engine.query("What did the author do after his time at Y Combinator?")
display(Markdown(f"<b>{response}</b>"))

> Starting query: What did the author do after his time at Y Combinator?
Extracted keywords: ['y combinator', 'combinator']
> Querying with idx: 235042210695008001: of excluding them, because there were so many s...
> Querying with idx: 7029274505691774319: it was like living in another country, and sinc...
> Querying with idx: 1773317813360405038: browser, and then host the resulting applicatio...
> Querying with idx: 3866067077574405334: person, and from those we picked 8 to fund. The...


<b>

The author went on to write a book about his experiences at Y Combinator, and then moved to England. He started writing essays again and also began working on a new Lisp programming language. He also wrote an essay about how he chooses what to work on.</b>

In [4]:
# simple
query_engine = index.as_query_engine(
    retriever_mode="simple"
)
response = query_engine.query("What did the author do after his time at Y Combinator?")
display(Markdown(f"<b>{response}</b>"))

> Starting query: What did the author do after his time at Y Combinator?
Extracted keywords: ['combinator']
> Querying with idx: 235042210695008001: of excluding them, because there were so many s...
> Querying with idx: 7029274505691774319: it was like living in another country, and sinc...
> Querying with idx: 1773317813360405038: browser, and then host the resulting applicatio...
> Querying with idx: 3866067077574405334: person, and from those we picked 8 to fund. The...


<b>

The author went on to write a book about his experiences at Y Combinator, and then moved to England. He started writing essays again and also began working on a new Lisp programming language. He also wrote an essay about how he chooses what to work on.</b>

In [5]:
# rake
query_engine = index.as_query_engine(
    retriever_mode="rake"
)
response = query_engine.query("What did the author do after his time at Y Combinator?")
display(Markdown(f"<b>{response}</b>"))

> Starting query: What did the author do after his time at Y Combinator?
Extracted keywords: ['combinator']
> Querying with idx: 235042210695008001: of excluding them, because there were so many s...
> Querying with idx: 7029274505691774319: it was like living in another country, and sinc...
> Querying with idx: 1773317813360405038: browser, and then host the resulting applicatio...
> Querying with idx: 3866067077574405334: person, and from those we picked 8 to fund. The...


<b>

The author went on to write a book about his experiences at Y Combinator, and then moved to England. He started writing essays again and also began working on a new Lisp programming language. He also wrote an essay about how he chooses what to work on.</b>
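
The three cells above repeat the same steps with a different retriever_mode each time. As a sketch, they could be folded into one helper. The name `compare_retriever_modes` is hypothetical; it assumes only the `as_query_engine(retriever_mode=...)` and `query(...)` calls already used in this notebook.

```python
def compare_retriever_modes(index, question, modes=("default", "simple", "rake")):
    """Query `index` once per retriever mode and collect each answer as a string."""
    results = {}
    for mode in modes:
        # Same call pattern as the cells above, parameterized by mode.
        query_engine = index.as_query_engine(retriever_mode=mode)
        results[mode] = str(query_engine.query(question))
    return results


# Usage, assuming `index` was built as in the cells above:
# answers = compare_retriever_modes(
#     index, "What did the author do after his time at Y Combinator?"
# )
# for mode, answer in answers.items():
#     print(f"{mode}: {answer}")
```

Collecting the answers in a dict keyed by mode makes it easy to see when different modes return the same response, as happened in the run shown here.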