# Small-to-big Retrieval Pack

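This LlamaPack provides an example of small-to-big retrieval, implemented with recursive retrieval: small chunks are embedded and retrieved, and each retrieved small chunk is then followed back to the larger parent chunk it references, which is passed on for synthesis.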
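To make the idea concrete, here is a minimal, dependency-free sketch of small-to-big retrieval. It is a toy illustration and not the pack's implementation: the `Chunk` class, the helper functions, and the sample texts are all hypothetical, and a simple word-overlap score stands in for the embedding similarity a real retriever would use.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str
    text: str
    parent_id: str  # id of the larger chunk this small chunk was cut from


def overlap_score(query: str, text: str) -> int:
    """Toy relevance score: number of shared lowercase words.

    Assumption: this stands in for embedding similarity.
    """
    return len(set(query.lower().split()) & set(text.lower().split()))


def build_small_chunks(parents: dict, window: int = 6) -> list:
    """Cut every parent text into small chunks of `window` words each."""
    small = []
    for parent_id, text in parents.items():
        words = text.split()
        for i in range(0, len(words), window):
            chunk_id = f"{parent_id}-s{i // window}"
            small.append(Chunk(chunk_id, " ".join(words[i : i + window]), parent_id))
    return small


def small_to_big_retrieve(query: str, parents: dict, small_chunks: list, top_k: int = 2) -> list:
    """Rank the small chunks, then return their de-duplicated parent texts."""
    ranked = sorted(
        small_chunks, key=lambda c: overlap_score(query, c.text), reverse=True
    )
    seen, results = set(), []
    for chunk in ranked[:top_k]:
        if chunk.parent_id not in seen:  # several small hits may share one parent
            seen.add(chunk.parent_id)
            results.append(parents[chunk.parent_id])
    return results


# Made-up sample data, purely for illustration.
parents = {
    "node-0": "apples are red fruit that grow on trees in cool climates and are harvested in autumn",
    "node-1": "sharks are large fish that live in the ocean and hunt smaller fish near coral reefs",
}
small_chunks = build_small_chunks(parents)
hits = small_to_big_retrieve("where do sharks hunt", parents, small_chunks)
print(hits)
```

In this sketch both top-ranked small chunks come from the same parent, so only one parent text is returned. Matching happens on the small chunks, while the larger parent is what gets handed on as context.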

In [1]:
import nest_asyncio

nest_asyncio.apply()

### Setup Data

In [None]:
!wget "https://www.dropbox.com/s/f6bmb19xdg0xedm/paul_graham_essay.txt?dl=1" -O paul_graham_essay.txt

In [2]:
from llama_index import SimpleDirectoryReader

# load in some sample data
reader = SimpleDirectoryReader(input_files=["paul_graham_essay.txt"])
documents = reader.load_data()

### Download and Initialize Pack

Note that this pack takes the `documents` loaded above as input.

In [19]:
from llama_index.llama_pack import download_llama_pack

RecursiveRetrieverSmallToBigPack = download_llama_pack(
    "RecursiveRetrieverSmallToBigPack",
    "./recursive_retriever_stb_pack",
    # leave the below commented out (was for testing purposes)
    # llama_hub_url="https://raw.githubusercontent.com/run-llama/llama-hub/jerry/add_llama_packs/llama_hub",
)

In [20]:
recursive_retriever_stb_pack = RecursiveRetrieverSmallToBigPack(
    documents,
)

### Run Pack

In [15]:
# this will run the full pack
response = recursive_retriever_stb_pack.run("What did the author do growing up?")

Retrieving with query id None: What did the author do growing up?
Retrieved node with id, entering: node-0
Retrieving with query id node-0: What did the author do growing up?

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


In [16]:
print(str(response))

The author wrote short stories and also worked on programming, specifically on an IBM 1401 computer in 9th grade. They used an early version of Fortran and had to type programs on punch cards. They also mentioned getting a microcomputer, a TRS-80, in about 1980 and started programming on it.


In [17]:
len(response.source_nodes)

1

### Inspect Modules

In [18]:
modules = recursive_retriever_stb_pack.get_modules()
display(modules)

{'query_engine': <llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine at 0x31d055900>,
 'recursive_retriever': <llama_index.retrievers.recursive_retriever.RecursiveRetriever at 0x31d056a70>,
 'llm': OpenAI(callback_manager=<llama_index.callbacks.base.CallbackManager object at 0x30fd7c0d0>, model='gpt-3.5-turbo', temperature=0.1, max_tokens=None, additional_kwargs={}, max_retries=3, timeout=60.0, default_headers=None, api_key='<redacted>', api_base='https://api.openai.com/v1', api_version=''),
 'embed_model': HuggingFaceEmbedding(model_name='BAAI/bge-small-en', embed_batch_size=10, callback_manager=<llama_index.callbacks.base.CallbackManager object at 0x30fd7c0d0>, tokenizer_name='BAAI/bge-small-en', max_length=512, pooling=<Pooling.CLS: 'cls'>, normalize=True, query_instruction=None, text_instruction=None, cache_folder=None),
 'service_context': ServiceContext(llm_predictor=LLMPredictor(system_prompt=None, query_wrapper_prompt=N