# RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

In [1]:
# NOTE: The custom models defined later in this notebook read the Groq API key from GROQ_API_KEY.
# RAPTOR's default models use OpenAI; if you rely on them, set OPENAI_API_KEY as well (a placeholder such as "not_used" is enough when OpenAI is not called).
import os
os.environ["GROQ_API_KEY"] = " "


In [6]:
# Cinderella story defined in sample.txt
with open('demo/sample.txt', 'r') as file:
    text = file.read()

print(text[:100])

The wife of a rich man fell sick, and as she felt that her end
was drawing near, she called her only


1) **Building**: RAPTOR recursively embeds, clusters, and summarizes chunks of text to construct a tree with varying levels of summarization from the bottom up. You can create a tree from the text in 'sample.txt' using `RA.add_documents(text)`.

2) **Querying**: At inference time, the RAPTOR model retrieves information from this tree, integrating data across lengthy documents at different abstraction levels. You can perform queries on the tree with `RA.answer_question`.
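
The build step can be pictured as a loop that keeps embedding, clustering, and summarizing until a single root remains or no further reduction is possible. The following is a minimal sketch of that idea, not RAPTOR's implementation: `build_tree`, `toy_embed`, `pair_cluster`, and `toy_summarize` are hypothetical stand-ins for a real embedding model, RAPTOR's clustering step, and an LLM summarizer.

```python
from typing import Callable, List, Sequence


def toy_embed(text: str) -> List[float]:
    # Stand-in embedding: one feature (text length). A real model returns a dense vector.
    return [float(len(text))]


def pair_cluster(vectors: Sequence[List[float]]) -> List[List[int]]:
    # Stand-in clustering: sort nodes by embedding, then group neighbours in pairs.
    order = sorted(range(len(vectors)), key=lambda i: vectors[i])
    return [order[i:i + 2] for i in range(0, len(order), 2)]


def toy_summarize(text: str, max_words: int = 8) -> str:
    # Stand-in summarizer: keep the first few words instead of calling an LLM.
    return " ".join(text.split()[:max_words])


def build_tree(
    chunks: Sequence[str],
    embed: Callable[[str], List[float]] = toy_embed,
    cluster: Callable[[Sequence[List[float]]], List[List[int]]] = pair_cluster,
    summarize: Callable[[str], str] = toy_summarize,
    max_layers: int = 5,
) -> List[List[str]]:
    """Return the tree as a list of layers, leaves first."""
    layers = [list(chunks)]
    for _ in range(max_layers):
        current = layers[-1]
        if len(current) <= 1:
            break  # a single root node is left
        vectors = [embed(text) for text in current]
        groups = cluster(vectors)
        if len(groups) >= len(current):
            break  # clustering no longer reduces the node count
        # Each cluster becomes one parent node holding a summary of its children.
        layers.append([summarize(" ".join(current[i] for i in g)) for g in groups])
    return layers


layers = build_tree(["a b c", "d e f g", "h i", "j k l m n"])
print([len(layer) for layer in layers])  # [4, 2, 1]
```

Each pass roughly halves the node count here only because the toy clustering pairs nodes; with a real clustering step the shape of the tree depends on the data.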

### Building the tree

In [2]:
from raptor import RetrievalAugmentation 
import umap

2025-01-25 22:54:17,637 - Loading faiss with AVX512 support.
2025-01-25 22:54:17,638 - Could not load library with AVX512 support due to:
ModuleNotFoundError("No module named 'faiss.swigfaiss_avx512'")
2025-01-25 22:54:17,639 - Loading faiss with AVX2 support.
2025-01-25 22:54:17,738 - Successfully loaded faiss with AVX2 support.


In [3]:
RA = RetrievalAugmentation()

# construct the tree
RA.add_documents(text)

NameError: name 'RetrievalAugmentation' is not defined

### Querying from the tree

```python
question = "your question here"  # any question about the indexed text
RA.answer_question(question)
```
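
The run logged further down reports `Using collapsed_tree` with `Selection Mode: top_k`. As a rough illustration of what that mode does, the sketch below pools nodes from every layer into one candidate list and keeps the `top_k` most similar to the query by cosine similarity. The names `cosine_similarity` and `retrieve_top_k`, the dictionary layout of a node, and the two-dimensional embeddings are assumptions made for this example; RAPTOR's own retriever may differ in its details (for instance, it may also apply a token budget).

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Cosine of the angle between two vectors; 0.0 if either vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_embedding: Sequence[float], nodes: List[Dict], top_k: int = 5) -> List[Dict]:
    # Rank leaves and summaries together, ignoring which layer they came from.
    ranked = sorted(
        nodes,
        key=lambda node: cosine_similarity(query_embedding, node["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


nodes = [
    {"text": "leaf: the ball", "embedding": [1.0, 0.0]},
    {"text": "leaf: the stepmother", "embedding": [0.0, 1.0]},
    {"text": "summary: whole story", "embedding": [0.7, 0.7]},
]
context = retrieve_top_k([1.0, 0.2], nodes, top_k=2)
print([node["text"] for node in context])
```

The texts of the selected nodes would then be concatenated and passed to the QA model as context.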

In [None]:
question = "How did Cinderella reach her happy ending?"

answer = RA.answer_question(question=question)

print("Answer: ", answer)

In [None]:
# Save the tree by calling RA.save("path/to/save")
SAVE_PATH = "demo/cinderella"
RA.save(SAVE_PATH)

In [None]:
# Load the tree back by passing its saved path to RetrievalAugmentation

RA = RetrievalAugmentation(tree=SAVE_PATH)

answer = RA.answer_question(question=question)
print("Answer: ", answer)

## Using other Open Source Models for Summarization/QA/Embeddings

To use other models such as Llama or Mistral, define your own summarization, QA, and embedding classes and pass them to RAPTOR through `RetrievalAugmentationConfig`.

In [3]:
import torch
from raptor import BaseSummarizationModel, BaseQAModel, BaseEmbeddingModel, RetrievalAugmentationConfig
from transformers import AutoTokenizer, pipeline

In [None]:
# To use a gated Hugging Face model such as Gemma, authenticate with Hugging Face first.
# Skip this step if the model is already downloaded.
from huggingface_hub import login
login()

In [4]:
import os
from groq import Groq

class GEMMASummarizationModel(BaseSummarizationModel):
    # The class keeps its "GEMMA" name, but by default it calls a Llama model hosted on Groq.
    def __init__(self, model_name="llama-3.3-70b-versatile"):
        self.model_name = model_name
        self.client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

    def summarize(self, context, max_tokens=150):
        messages = [
            {
                "role": "user",
                "content": f"Write a summary of the following, including as many key details as possible: {context}",
            }
        ]
        response = self.client.chat.completions.create(
            model=self.model_name,
            messages=messages,
            max_tokens=max_tokens,
            temperature=0.7,
            top_p=0.95,
        )
        summary = response.choices[0].message.content.strip()
        return summary

In [5]:
import os
from groq import Groq

class GEMMAQAModel(BaseQAModel):
    # The class keeps its "GEMMA" name, but by default it calls a Llama model hosted on Groq.
    def __init__(self, model_name="llama-3.3-70b-versatile"):
        self.model_name = model_name
        self.client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

    def answer_question(self, context, question):
        messages = [
            {
                "role": "user",
                "content": f"Given Context: {context} Give the best full answer to the question {question}",
            }
        ]
        # Only max_tokens and temperature are passed here; top_k and top_p are left out.
        response = self.client.chat.completions.create(
            model=self.model_name,
            messages=messages,
            max_tokens=256,
            temperature=0.7,
        )
        return response.choices[0].message.content.strip()

In [6]:
from sentence_transformers import SentenceTransformer
class SBertEmbeddingModel(BaseEmbeddingModel):
    def __init__(self, model_name="sentence-transformers/multi-qa-mpnet-base-cos-v1"):
        self.model = SentenceTransformer(model_name)

    def create_embedding(self, text):
        return self.model.encode(text)
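
For quick offline checks it can help to have an embedding model that needs no download. The sketch below is a hypothetical stand-in, not part of RAPTOR: `HashingEmbeddingModel` exposes the same `create_embedding(text)` method as the class above but builds a small bag-of-words vector with the hashing trick. It deliberately does not subclass `BaseEmbeddingModel` so that it runs on its own; to plug it into RAPTOR you would presumably subclass `BaseEmbeddingModel` as shown above. Its vectors carry no semantic meaning, so retrieval quality will be poor.

```python
import hashlib
from typing import List


class HashingEmbeddingModel:
    """Hypothetical offline stand-in exposing create_embedding(text)."""

    def __init__(self, dim: int = 16):
        self.dim = dim

    def create_embedding(self, text: str) -> List[float]:
        # Hashing trick: each token increments one of `dim` buckets.
        vector = [0.0] * self.dim
        for token in text.lower().split():
            digest = hashlib.md5(token.encode("utf-8")).digest()
            bucket = int.from_bytes(digest[:4], "big") % self.dim
            vector[bucket] += 1.0
        return vector


model = HashingEmbeddingModel(dim=16)
vec = model.create_embedding("Cinderella went to the ball")
print(len(vec), sum(vec))  # 16 5.0
```

Note that this returns a plain Python list, whereas `SentenceTransformer.encode` returns a NumPy array for a single string; convert the result if downstream code expects an array.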


In [7]:
RAC = RetrievalAugmentationConfig(
    summarization_model=GEMMASummarizationModel(),
    qa_model=GEMMAQAModel(),
    embedding_model=SBertEmbeddingModel(),
)

2025-01-25 22:54:36,123 - Load pretrained SentenceTransformer: sentence-transformers/multi-qa-mpnet-base-cos-v1
2025-01-25 22:54:38,328 - Use pytorch device: cpu


In [8]:
RA = RetrievalAugmentation(config=RAC)

2025-01-25 22:54:40,512 - Successfully initialized TreeBuilder with Config 
        TreeBuilderConfig:
            Tokenizer: <Encoding 'cl100k_base'>
            Max Tokens: 100
            Num Layers: 5
            Threshold: 0.5
            Top K: 5
            Selection Mode: top_k
            Summarization Length: 100
            Summarization Model: <__main__.GEMMASummarizationModel object at 0x000001F1EF728DA0>
            Embedding Models: {'EMB': <__main__.SBertEmbeddingModel object at 0x000001F1F1296120>}
            Cluster Embedding Model: EMB
        
        Reduction Dimension: 10
        Clustering Algorithm: RAPTOR_Clustering
        Clustering Parameters: {}
        
2025-01-25 22:54:40,513 - Successfully initialized ClusterTreeBuilder with Config 
        TreeBuilderConfig:
            Tokenizer: <Encoding 'cl100k_base'>
            Max Tokens: 100
            Num Layers: 5
            Threshold: 0.5
            Top K: 5
            Selection Mode: top_k
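
The `Max Tokens: 100` entry in the configuration above caps how much text goes into each leaf node. A simplified sketch of that kind of chunking is shown below. It is an illustration under stated assumptions, not RAPTOR's splitter: `split_into_chunks` is a hypothetical helper, sentences are detected with a simple regular expression, and whitespace-separated words stand in for the `cl100k_base` tokens that the real tokenizer would count.

```python
import re
from typing import List


def split_into_chunks(text: str, max_tokens: int = 100) -> List[str]:
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for sentence in sentences:
        n_tokens = len(sentence.split())  # word count as a stand-in for token count
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and current_len + n_tokens > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        # A single sentence longer than max_tokens is kept whole as its own chunk.
        current.append(sentence)
        current_len += n_tokens
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "One two three. Four five six seven. Eight nine. Ten."
print(split_into_chunks(sample, max_tokens=5))
```

Each resulting chunk would become one leaf node, which is then embedded before the upper layers are built.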
            

In [9]:
with open('Processing DPDF_CONNECTMM-0211-02 -.txt', 'r', encoding='utf-8', errors='replace') as file:
    text = file.read()

RA.add_documents(text)

2025-01-25 22:54:44,385 - Creating Leaf Nodes


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2025-01-25 22:55:05,514 - Created 27 Leaf Embeddings
2025-01-25 22:55:05,515 - Building All Nodes
2025-01-25 22:55:05,518 - Using Cluster TreeBuilder
2025-01-25 22:55:05,519 - Constructing Layer 0
2025-01-25 22:55:21,183 - Summarization Length: 100
2025-01-25 22:55:22,114 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2025-01-25 22:55:22,123 - Node Texts Length: 325, Summarized Text Length: 100


2025-01-25 22:55:23,399 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2025-01-25 22:55:23,404 - Node Texts Length: 551, Summarized Text Length: 100


2025-01-25 22:55:24,661 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2025-01-25 22:55:24,664 - Node Texts Length: 340, Summarized Text Length: 101


2025-01-25 22:55:25,784 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2025-01-25 22:55:25,789 - Node Texts Length: 366, Summarized Text Length: 100


2025-01-25 22:55:26,964 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2025-01-25 22:55:26,968 - Node Texts Length: 336, Summarized Text Length: 100


2025-01-25 22:55:28,021 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2025-01-25 22:55:28,026 - Node Texts Length: 191, Summarized Text Length: 100


2025-01-25 22:55:28,501 - Constructing Layer 1
2025-01-25 22:55:28,503 - Stopping Layer construction: Cannot Create More Layers. Total Layers in tree: 1
2025-01-25 22:55:28,504 - Successfully initialized TreeRetriever with Config 
        TreeRetrieverConfig:
            Tokenizer: <Encoding 'cl100k_base'>
            Threshold: 0.5
            Top K: 5
            Selection Mode: top_k
            Context Embedding Model: EMB
            Embedding Model: <__main__.SBertEmbeddingModel object at 0x000001F1F1296120>
            Num Layers: None
            Start Layer: None
        


In [11]:
question = "Review the whole document and list the steps to maintain material master defaults"

answer = RA.answer_question(question=question)

print("Answer: ", answer)

2025-01-25 22:58:23,914 - Using collapsed_tree


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2025-01-25 22:58:26,285 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"


Answer:  Based on the provided document, the steps to maintain Material Master Defaults using transaction code ZDML are as follows:

1. **Maintain Material Master Defaults (MM)**: Open ZDML and enter the plant code. If defaults are to be set on the profit center level, enter the profit center; otherwise, keep it blank. Hit Enter to continue.

2. **Maintain Sales Organization 1**: Keep all attributes blank, as these values are material-specific.

3. **Maintain Sales Organization 2**: 
   - Material statistics group = 1
   - Account assignment group = (as agreed upon with the accounting team)

4. **Maintain Purchasing**: 
   - Enter an initial value for the purchasing group, agreed upon with the purchasing team.
   - Set the GR proc time to 1 if needed.
   - Enter the storage location for purchase orders.

5. **Maintain MRP1 (Code 70)**: 
   - MRP group: Enter the initial MRP group agreed upon with planning.
   - ABC indicator: Set to "C".
   - Procurement: Depending on the plant activit