# Comparing Methods for Structured Retrieval (Auto-Retrieval vs. Recursive Retrieval)


In a naive RAG system, the set of input documents are then chunked, embedded, and dumped to a vector database collection. Retrieval would just fetch the top-k documents by embedding similarity.

This can fail if the set of documents is large - it can be hard to disambiguate raw chunks, and you're not guaranteed to filter for the set of documents that contain relevant context.

In this guide we explore **structured retrieval** - more advanced query algorithms that take advantage of structure within your documents for higher-precision retrieval. We compare the following two methods:

1. **Metadata Filters + Auto-Retrieval**: Tag each document with the right set of metadata. During query-time, use auto-retrieval to infer metadata filters along with passing through the query string for semantic search.
2. **Store Document Hierarchies (summaries -> raw chunks) + Recursive Retrieval**: Embed document summaries and map that to the set of raw chunks for each document. During query-time, do recursive retrieval to first fetch summaries before fetching documents.

In [16]:
from dotenv import load_dotenv
load_dotenv()
import os

In [1]:
%pip install llama-index-llms-openai
%pip install llama-index-vector-stores-weaviate

Note: you may need to restart the kernel to use updated packages.
Collecting llama-index-vector-stores-weaviate
  Downloading llama_index_vector_stores_weaviate-1.1.3-py3-none-any.whl.metadata (717 bytes)
Collecting weaviate-client<5.0.0,>=4.5.7 (from llama-index-vector-stores-weaviate)
  Downloading weaviate_client-4.9.0-py3-none-any.whl.metadata (3.6 kB)
Collecting httpx (from llama-index-core<0.12.0,>=0.11.0->llama-index-vector-stores-weaviate)
  Using cached httpx-0.27.0-py3-none-any.whl.metadata (7.2 kB)
Collecting validators==0.34.0 (from weaviate-client<5.0.0,>=4.5.7->llama-index-vector-stores-weaviate)
  Downloading validators-0.34.0-py3-none-any.whl.metadata (3.8 kB)
Collecting authlib<1.3.2,>=1.2.1 (from weaviate-client<5.0.0,>=4.5.7->llama-index-vector-stores-weaviate)
  Downloading Authlib-1.3.1-py2.py3-none-any.whl.metadata (3.8 kB)
Collecting grpcio-tools<2.0.0,>=1.57.0 (from weaviate-client<5.0.0,>=4.5.7->llama-index-vector-stores-weaviate)
  Downloading grpcio_tools-1.6

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
opentelemetry-proto 1.27.0 requires protobuf<5.0,>=3.19, but you have protobuf 5.28.3 which is incompatible.


In [3]:
import nest_asyncio

nest_asyncio.apply()

In [4]:
import logging
import sys
from llama_index.core import SimpleDirectoryReader
from llama_index.core import SummaryIndex

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

In [5]:
wiki_titles = ["Michael Jordan", "Elon Musk", "Richard Branson", "Rihanna"]
wiki_metadatas = {
    "Michael Jordan": {
        "category": "Sports",
        "country": "United States",
    },
    "Elon Musk": {
        "category": "Business",
        "country": "United States",
    },
    "Richard Branson": {
        "category": "Business",
        "country": "UK",
    },
    "Rihanna": {
        "category": "Music",
        "country": "Barbados",
    },
}

In [7]:
from pathlib import Path

import requests

for title in wiki_titles:
    response = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "format": "json",
            "titles": title,
            "prop": "extracts",
            # 'exintro': True,
            "explaintext": True,
        },
    ).json()
    page = next(iter(response["query"]["pages"].values()))
    wiki_text = page["extract"]

    data_path = Path("data_people")
    if not data_path.exists():
        Path.mkdir(data_path)

    with open(data_path / f"{title}.txt", "w") as fp:
        fp.write(wiki_text)

In [10]:
# Load all wiki documents
docs_dict = {}
for wiki_title in wiki_titles:
    doc = SimpleDirectoryReader(
        input_files=[f"data_people/{wiki_title}.txt"]
    ).load_data()[0]

    doc.metadata.update(wiki_metadatas[wiki_title])
    docs_dict[wiki_title] = doc

In [13]:
docs_dict

{'Michael Jordan': Document(id_='2bf86899-2637-4d7f-a3a8-2ba31440e598', embedding=None, metadata={'file_path': 'data_people\\Michael Jordan.txt', 'file_name': 'Michael Jordan.txt', 'file_type': 'text/plain', 'file_size': 66995, 'creation_date': '2024-10-31', 'last_modified_date': '2024-10-31', 'category': 'Sports', 'country': 'United States'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text='Michael Jeffrey Jordan (born February 17, 1963), also known by his initials MJ, is an American businessman and former professional basketball player. He played 15 seasons in the National Basketball Association (NBA) between 1984 and 2003, winning six NBA championships with the Chicago Bulls. He was integral in popularizing basketball and the NBA around the world in th

In [14]:
from llama_index.llms.openai import OpenAI
from llama_index.core.callbacks import LlamaDebugHandler, CallbackManager
from llama_index.core.node_parser import SentenceSplitter


llm = OpenAI("gpt-4o-mini")
callback_manager = CallbackManager([LlamaDebugHandler()])
splitter = SentenceSplitter(chunk_size=256)

# Metadata Filters + Auto-Retrieval
In this approach, we tag each Document with metadata (category, country), and store in a Weaviate vector db.

During retrieval-time, we then perform "auto-retrieval" to infer the relevant set of metadata filters.

In [None]:
weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]

In [33]:
## Setup Weaviate
import weaviate
from weaviate.classes.init import Auth

# # cloud
# auth_config = weaviate.AuthApiKey(api_key=weaviate_api_key)
# client = weaviate.Client(
#     weaviate_url,
#     auth_client_secret=auth_config,
# )

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=Auth.api_key(weaviate_api_key),
)

INFO:httpx:HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/meta "HTTP/1.1 200 OK"
HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/meta "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://pypi.org/pypi/weaviate-client/json "HTTP/1.1 200 OK"
HTTP Request: GET https://pypi.org/pypi/weaviate-client/json "HTTP/1.1 200 OK"


In [25]:
print(client.is_ready())


True


In [34]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.vector_stores.weaviate import WeaviateVectorStore
from IPython.display import Markdown, display

In [48]:
# drop items from collection first
# client.schema.delete_class("LlamaIndex")


client.collections.delete("LlamaIndex")

INFO:httpx:HTTP Request: DELETE https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema/LlamaIndex "HTTP/1.1 200 OK"
HTTP Request: DELETE https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema/LlamaIndex "HTTP/1.1 200 OK"


In [49]:
from llama_index.core import StorageContext


In [50]:

# If you want to load the index later, be sure to give it a name!
vector_store = WeaviateVectorStore(
    weaviate_client=client, index_name="LlamaIndex"
)


INFO:httpx:HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema/LlamaIndex "HTTP/1.1 404 Not Found"
HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema/LlamaIndex "HTTP/1.1 404 Not Found"
INFO:httpx:HTTP Request: POST https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema "HTTP/1.1 200 OK"
HTTP Request: POST https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema "HTTP/1.1 200 OK"


In [51]:
storage_context = StorageContext.from_defaults(vector_store=vector_store)

In [43]:

# NOTE: you may also choose to define a index_name manually.
# index_name = "test_prefix"
# vector_store = WeaviateVectorStore(weaviate_client=client, index_name=index_name)

In [52]:
# validate that the schema was created
# class_schema = client.schema.get("LlamaIndex")




class_schema=client.collections.get("LlamaIndex")

display(class_schema)

<weaviate.collections.collection.sync.Collection at 0x23290473400>

In [53]:
index = VectorStoreIndex(
    [],
    storage_context=storage_context,
    transformations=[splitter],
    callback_manager=callback_manager,
)

# add documents to index
for wiki_title in wiki_titles:
    index.insert(docs_dict[wiki_title])

**********
Trace: index_construction
**********
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema "HTTP/1.1 200 OK"
HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema "HTTP/1.1 200 OK"
**********
Trace: insert
**********
INFO:httpx:HTTP Request: POST https://api

INFO:httpx:HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/nodes "HTTP/1.1 200 OK"
HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/nodes "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/nodes "HTTP/1.1 200 OK"
HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/nodes "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/nodes "HTTP/1.1 200 OK"
HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/nodes "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/nodes "HTTP/1.1 200 OK"
HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/nodes "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://evnfh4ysmyg6qq2

In [54]:
from llama_index.core.retrievers import VectorIndexAutoRetriever
from llama_index.core.vector_stores.types import MetadataInfo, VectorStoreInfo


vector_store_info = VectorStoreInfo(
    content_info="brief biography of celebrities",
    metadata_info=[
        MetadataInfo(
            name="category",
            type="str",
            description=(
                "Category of the celebrity, one of [Sports, Entertainment,"
                " Business, Music]"
            ),
        ),
        MetadataInfo(
            name="country",
            type="str",
            description=(
                "Country of the celebrity, one of [United States, Barbados,"
                " Portugal]"
            ),
        ),
    ],
)
retriever = VectorIndexAutoRetriever(
    index,
    vector_store_info=vector_store_info,
    llm=llm,
    callback_manager=callback_manager,
    max_top_k=10000,
)

In [55]:
# NOTE: the "set top-k to 10000" is a hack to return all data.
# Right now auto-retrieval will always return a fixed top-k, there's a TODO to allow it to be None
# to fetch all data.
# So it's theoretically possible to have the LLM infer a None top-k value.
nodes = retriever.retrieve(
    "Tell me about a celebrity from the United States, set top k to 10000"
)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:llama_index.core.indices.vector_store.retrievers.auto_retriever.auto_retriever:Using query str: Celebrity biography
Using query str: Celebrity biography
INFO:llama_index.core.indices.vector_store.retrievers.auto_retriever.auto_retriever:Using filters: [('country', '==', 'United States')]
Using filters: [('country', '==', 'United States')]
INFO:llama_index.core.indices.vector_store.retrievers.auto_retriever.auto_retriever:Using top_k: 2
Using top_k: 2
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema/LlamaIndex "HTTP/1.1 200 OK"
HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.wea

In [56]:
print(f"Number of nodes: {len(nodes)}")
for node in nodes[:10]:
    print(node.node.get_content())

Number of nodes: 2
Princeton University Press. ISBN 978-0-691-13751-3.
Porter, David L. (2007). Michael Jordan: A Biography. Greenwood Publishing Group. ISBN 978-0-313-33767-3.
The Sporting News Official NBA Register 1994–95 (1994). The Sporting News. ISBN 978-0-89204-501-3.


== Further reading ==
Dyson, M. E. (1993). Be like Mike?: Michael Jordan and the pedagogy of desire. Cultural Studies, 7(1), 64–72.
Leahy, Michael (2004). When Nothing Else Matters: Michael Jordan's Last Comeback. Simon & Schuster. ISBN 978-0-7432-7648-1.
Mathur, Lynette Knowles, et al. "The wealth effects associated with a celebrity endorser: The Michael Jordan phenomenon." Journal of Advertising Research, vol. 37, no. 3, May–June 1997, pp.
Jordan granted rapper Travis Scott permission to film a music video for his single "Franchise" at his home in Highland Park, Illinois. Jordan appeared in the 2022 miniseries The Captain, which follows the life and career of Derek Jeter.


=== Books ===
Jordan has authored sev

In [57]:
nodes = retriever.retrieve(
    "Tell me about the childhood of a popular sports celebrity in the United"
    " States"
)
for node in nodes:
    print(node.node.get_content())

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:llama_index.core.indices.vector_store.retrievers.auto_retriever.auto_retriever:Using query str: childhood of a popular sports celebrity
Using query str: childhood of a popular sports celebrity
INFO:llama_index.core.indices.vector_store.retrievers.auto_retriever.auto_retriever:Using filters: [('category', '==', 'Sports'), ('country', '==', 'United States')]
Using filters: [('category', '==', 'Sports'), ('country', '==', 'United States')]
INFO:llama_index.core.indices.vector_store.retrievers.auto_retriever.auto_retriever:Using top_k: 2
Using top_k: 2
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema/Lla

In [58]:
nodes = retriever.retrieve(
    "Tell me about the college life of a billionaire who started at company at"
    " the age of 16"
)
for node in nodes:
    print(node.node.get_content())

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:llama_index.core.indices.vector_store.retrievers.auto_retriever.auto_retriever:Using query str: college life of a billionaire who started a company at the age of 16
Using query str: college life of a billionaire who started a company at the age of 16
INFO:llama_index.core.indices.vector_store.retrievers.auto_retriever.auto_retriever:Using filters: []
Using filters: []
INFO:llama_index.core.indices.vector_store.retrievers.auto_retriever.auto_retriever:Using top_k: 2
Using top_k: 2
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema/LlamaIndex "HTTP/1.1 200 OK"
HTTP Request: GET https://evnfh4ysmyg6qq2jgu

In [59]:
nodes = retriever.retrieve("Tell me about the childhood of a UK billionaire")
for node in nodes:
    print(node.node.get_content())

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:llama_index.core.indices.vector_store.retrievers.auto_retriever.auto_retriever:Using query str: childhood of a UK billionaire
Using query str: childhood of a UK billionaire
INFO:llama_index.core.indices.vector_store.retrievers.auto_retriever.auto_retriever:Using filters: []
Using filters: []
INFO:llama_index.core.indices.vector_store.retrievers.auto_retriever.auto_retriever:Using top_k: 2
Using top_k: 2
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema/LlamaIndex "HTTP/1.1 200 OK"
HTTP Request: GET https://evnfh4ysmyg6qq2jgurfa.c0.asia-southeast1.gcp.weaviate.cloud/v1/schema/LlamaIndex "HTTP/1.1 200 O

#  Build Recursive Retriever over Document Summaries


In [60]:
from llama_index.core.schema import IndexNode

In [61]:
# define top-level nodes and vector retrievers
nodes = []
vector_query_engines = {}
vector_retrievers = {}

for wiki_title in wiki_titles:
    # build vector index
    vector_index = VectorStoreIndex.from_documents(
        [docs_dict[wiki_title]],
        transformations=[splitter],
        callback_manager=callback_manager,
    )
    # define query engines
    vector_query_engine = vector_index.as_query_engine(llm=llm)
    vector_query_engines[wiki_title] = vector_query_engine
    vector_retrievers[wiki_title] = vector_index.as_retriever()

    # save summaries
    out_path = Path("summaries") / f"{wiki_title}.txt"
    if not out_path.exists():
        # use LLM-generated summary
        summary_index = SummaryIndex.from_documents(
            [docs_dict[wiki_title]], callback_manager=callback_manager
        )

        summarizer = summary_index.as_query_engine(
            response_mode="tree_summarize", llm=llm
        )
        response = await summarizer.aquery(
            f"Give me a summary of {wiki_title}"
        )

        wiki_summary = response.response
        Path("summaries").mkdir(exist_ok=True)
        with open(out_path, "w") as fp:
            fp.write(wiki_summary)
    else:
        with open(out_path, "r") as fp:
            wiki_summary = fp.read()

    print(f"**Summary for {wiki_title}: {wiki_summary}")
    node = IndexNode(text=wiki_summary, index_id=wiki_title)
    nodes.append(node)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
**********
Trace: index_construction
**********
**********
Trace: index_construction
**********
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
**Summary for Michael Jordan: Michael Jordan, born on February 17, 1963, is a renowned American b

In [62]:
# define top-level retriever
top_vector_index = VectorStoreIndex(
    nodes, transformations=[splitter], callback_manager=callback_manager
)
top_vector_retriever = top_vector_index.as_retriever(similarity_top_k=1)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
**********
Trace: index_construction
**********


In [63]:
# define recursive retriever
from llama_index.core.retrievers import RecursiveRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core import get_response_synthesizer

In [64]:
# note: can pass `agents` dict as `query_engine_dict` since every agent can be used as a query engine
recursive_retriever = RecursiveRetriever(
    "vector",
    retriever_dict={"vector": top_vector_retriever, **vector_retrievers},
    # query_engine_dict=vector_query_engines,
    verbose=True,
)

In [65]:
# run recursive retriever
nodes = recursive_retriever.retrieve(
    "Tell me about a celebrity from the United States"
)
for node in nodes:
    print(node.node.get_content())

[1;3;34mRetrieving with query id None: Tell me about a celebrity from the United States
[0mINFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
[1;3;38;5;200mRetrieved node with id, entering: Rihanna
[0m[1;3;34mRetrieving with query id Rihanna: Tell me about a celebrity from the United States
[0m[1;3;38;5;200mRetrieving text node: During Rihanna's third annual "Diamond Ball", former U.S. president Barack Obama, praised Rihanna's work and stated: "[She's] become a powerful force in the fight to give people dignity." On September 20, 2018, Rihanna was appointed by the government of Barbados to be an Ambassador Extraordinary and Plenipotentiary, with special duties of promoting "education, tourism and investment for the island." 
At the 2020 NAACP Image Awards, hosted by BET, Rihanna accepted the President's Award from Derrick Johnson. Johnson stated that "Rihanna has not only 

In [66]:
nodes = recursive_retriever.retrieve(
    "Tell me about the childhood of a billionaire who started at company at"
    " the age of 16"
)
for node in nodes:
    print(node.node.get_content())

[1;3;34mRetrieving with query id None: Tell me about the childhood of a billionaire who started at company at the age of 16
[0mINFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
[1;3;38;5;200mRetrieved node with id, entering: Elon Musk
[0m[1;3;34mRetrieving with query id Elon Musk: Tell me about the childhood of a billionaire who started at company at the age of 16
[0m[1;3;38;5;200mRetrieving text node: Elon had a tendency to call people stupid. How could I possibly blame that child?" After the incident, Elon was enrolled in private school.
Elon was an enthusiastic reader of books, later attributing his success in part to having read The Lord of the Rings, the Foundation series, and The Hitchhiker's Guide to the Galaxy. At age ten, he developed an interest in computing and video games, teaching himself how to program from the VIC-20 user manual. At age twelve, Elon sold hi