#### Quick Start: How to Use the RAG Framework in Your Application

This example demonstrates the basic usage of the framework via a YAML configuration file and a Dependency Injection (DI) container.

---

This framework provides the infrastructure for building Retrieval-Augmented Generation (RAG) systems in Python.  
It is designed according to the principles of Clean Architecture and uses YAML-based configuration and a DI container  
to connect components such as retrievers, LLMs, data loaders, and more.


#### Framework Architecture

The architecture separates interfaces from their implementations:

- **Ports**: interfaces that define what a component must implement (e.g., `LLMPort`, `RetrieverPort`).
- **Adapters**: concrete implementations of those interfaces, such as `HuggingFaceInferenceAdapter`.
- **DIContainer**: assembles adapters based on the configuration while keeping components isolated from one another.

This allows you to easily swap out adapters (e.g., replace HuggingFace with OpenAI or your own implementation) without touching the business logic.
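The port/adapter split can be illustrated with a minimal, framework-independent sketch. The names `SketchLLMPort`, `EchoAdapter`, `UpperAdapter`, and `answer` are hypothetical and are not part of `ragbee_fw`; only the `generate(prompt)` method shape mirrors what this guide shows later for `LLMPort`.

```python
from typing import Protocol


class SketchLLMPort(Protocol):
    """Hypothetical port: any object with a matching generate() satisfies it."""

    def generate(self, prompt: str) -> str: ...


class EchoAdapter:
    """Toy adapter that repeats the prompt back."""

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UpperAdapter:
    """Second toy adapter with different behaviour behind the same port."""

    def generate(self, prompt: str) -> str:
        return prompt.upper()


def answer(llm: SketchLLMPort, question: str) -> str:
    # Business logic depends only on the port, never on a concrete adapter.
    return llm.generate(question)


echo_result = answer(EchoAdapter(), "hello")
upper_result = answer(UpperAdapter(), "hello")
```

Swapping `EchoAdapter` for `UpperAdapter` changes the behaviour without any change to `answer`, which is the property the framework relies on when adapters are replaced through configuration.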

#### Glossary of Key Framework Components

- **IngestionService**: responsible for data preparation: loading, chunking, and indexing in the retriever.
- **AnswerService**: handles the main answering pipeline: extracting relevant documents and generating a response using the LLM.
- **Retriever**: component responsible for searching documents. Its interface is defined in `RetrieverPort`.
- **LLM**: large language model. The interface for response generation is defined in `LLMPort`.
- **DIContainer**: a dependency injection container that builds and manages component lifecycles.
- **ADAPTER_REGISTRY**: a global registry of interface implementations (adapters), available for configuration-based injection.

The easiest way to get started with the framework is via a YAML configuration file and the dependency injection container.

All you need to do is:

- load a configured YAML file,
- create an instance of the `DIContainer` and call `build_app_service()` to initialize the components,
- use the `generate_answer` method to generate a response based on retrieved relevant documents.



In [2]:
from ragbee_fw import load_config

In [3]:
app_config_path = "/workspace/src/ragbee_fw/config/app_config.yml"
app_config = load_config(app_config_path)
app_config.llm.token = "your_token"
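Rather than writing the token into the notebook, it can be read from the environment. The helper below is a small sketch; the function name `read_token` and the variable name `HF_TOKEN` are assumptions chosen for illustration, not something the framework requires.

```python
import os


def read_token(env_var: str = "HF_TOKEN") -> str:
    """Return the token stored in env_var, or raise a clear error if it is unset."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return token
```

With this helper in place, the assignment above would become `app_config.llm.token = read_token()`.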

In [4]:
from ragbee_fw import DIContainer


container = DIContainer(app_config)
app = container.build_app_service()

#### Result: Working with `AnswerService`

As a result, we obtain an instance of the `AnswerService` class, which is designed for the main pipeline:

- retrieving text fragments relevant to the user query,
- generating a response using the LLM.
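The two steps listed above can be sketched independently of the framework. This is not the actual `AnswerService` implementation; `answer_query`, `fake_retrieve`, and `fake_generate` are hypothetical stand-ins that only show the order of operations.

```python
from typing import Callable, List


def answer_query(
    query: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> str:
    """Two-step flow: fetch fragments for the query, then hand them to the LLM."""
    fragments = retrieve(query)
    prompt = "\n".join(fragments) + f"\n\nQuestion: {query}"
    return generate(prompt)


# Toy stand-ins so the sketch runs without any model or index.
def fake_retrieve(query: str) -> List[str]:
    return ["Fragment A", "Fragment B"]


def fake_generate(prompt: str) -> str:
    return f"{len(prompt.splitlines())} prompt lines"


result = answer_query("What happened?", fake_retrieve, fake_generate)
```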


In [14]:
type(app)

ragbee_fw.core.services.answer_service.AnswerService

You can now simply pass a question to the `generate_answer` method of the created `AnswerService` instance and get a response from the LLM.

In [None]:
# The query is in Russian: "What were the Spanish conquests in America?"
response = app.generate_answer(query="какие были Испанские завоевания в Америке?")

In [11]:
response

'Испанские завоевания в Америке включали в себя несколько важных событий и кампаний, которые привели к колонизации значительной части американского континента. Вот ключевые моменты:\n\n1. **Экспедиция Кортеса в Мексику (1518 г.)**: Испанский губернатор Кубы отправил экспедицию под руководством Фернандо Кортеса в Мексику. Кортес высадился с 600 человек и, чтобы исключить возможность отступления, сжег все корабли.\n\n2. **Завоевание государства ацтеков (1521 г.)**: Кортес двинулся вглубь страны и достиг столицы ацтеков, Мехико. Первоначально ацтеки принял

---

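> **Note:** The DI container keeps all created objects in an internal dictionary called `_cache`.  
> This may be useful when integrating custom modules or accessing shared dependencies manually.  
> The leading underscore marks it as private by Python convention, so treat it as an implementation detail.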

In [24]:
contain_obj = container._cache
display(contain_obj.keys())

dict_keys(['data_loader', 'text_chunker', 'retriever', 'retriever_with_index', 'llm'])

In [25]:
retriever = contain_obj.get("retriever")
display(retriever)

<ragbee_fw.infrastructure.retriever.bm25_client.BM25Client at 0x7ff36ef6b520>
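The idea behind such a cache is a get-or-build lookup: each component is constructed once and reused afterwards. The sketch below is a generic illustration of that pattern; `CachingBuilder` and its `get` method are hypothetical and do not reproduce the internals of `DIContainer`.

```python
from typing import Any, Callable, Dict


class CachingBuilder:
    """Toy get-or-build cache keyed by component name (hypothetical)."""

    def __init__(self) -> None:
        self._cache: Dict[str, Any] = {}

    def get(self, name: str, factory: Callable[[], Any]) -> Any:
        # Build the object only on the first request; reuse it afterwards.
        if name not in self._cache:
            self._cache[name] = factory()
        return self._cache[name]


builder = CachingBuilder()
first = builder.get("retriever", dict)
second = builder.get("retriever", dict)
same_instance = first is second
```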

#### Low-Level Control Over Dependencies and Custom Module Creation

You can bypass the DI container and manually construct all dependencies.  
This provides maximum flexibility and control when integrating custom modules.

However, this approach requires a deeper understanding of the framework's architecture, including the relationships between ports and adapters in the spirit of hexagonal architecture.


In [5]:
from ragbee_fw.infrastructure.data_loader.file_loader import FileSystemLoader
from ragbee_fw.infrastructure.text_splitter.recursive_text_splitter import RecursiveTextSplitter
from ragbee_fw.infrastructure.retriever.bm25_client import BM25Client 
from ragbee_fw.infrastructure.llm_clients.huggingface_client import HuggingFaceInferenceAdapter

from ragbee_fw import IngestionService
from ragbee_fw import AnswerService

In [None]:
loader = FileSystemLoader()
chunker = RecursiveTextSplitter()
retriever = BM25Client()
llm = HuggingFaceInferenceAdapter(
    model_name="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    provider="cerebras",
    token="your_token",
)


At this point, you can build and inject any custom module into the pipeline that aligns with one of the following responsibilities:

- data loading, chunking, and indexing (for `IngestionService`),
- document retrieval and answer generation (for `AnswerService`).

All you need to do is follow the port interfaces defined in `ragbee_fw.core.ports`.


In [None]:
ingestion = IngestionService(
    loader=loader,
    chunker=chunker,
    retriever=retriever
)
retriever = ingestion.build_index("/workspace/documents")
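As an illustration of a custom ingestion module, here is a minimal fixed-size chunker. It is a sketch under assumptions: the class name `FixedSizeChunker` and the method name `split` are hypothetical, and the real chunker port in `ragbee_fw.core.ports` may require a different method name or signature.

```python
from typing import List


class FixedSizeChunker:
    """Hypothetical chunker: fixed-size character windows with overlap."""

    def __init__(self, size: int = 10, overlap: int = 2) -> None:
        if not 0 <= overlap < size:
            raise ValueError("overlap must satisfy 0 <= overlap < size")
        self.size = size
        self.overlap = overlap

    def split(self, text: str) -> List[str]:
        # Each window starts (size - overlap) characters after the previous one.
        step = self.size - self.overlap
        return [text[i:i + self.size] for i in range(0, len(text), step)]


chunks = FixedSizeChunker(size=4, overlap=1).split("abcdefgh")
# chunks == ["abcd", "defg", "gh"]
```

Once adapted to the actual port, such a class could be passed as the `chunker` argument in place of `RecursiveTextSplitter`.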

In [None]:
answer_service = AnswerService(
    retriever=retriever,
    llm=llm
)
# The query is in Russian: "What were the Spanish conquests in America?"
response = answer_service.generate_answer(
    query="какие были Испанские завоевания в Америке?"
)

In [None]:
response

'Испанские завоевания в Америке включали в себя несколько ключевых событий и завоеваний, которые существенно повлияли на историю региона. Вот основные моменты:\n\n1. **Экспедиция Кортеса в Мексику (1518 г.)**: Испанский губернатор острова Кубы отправил экспедицию под предводительством Фернандо Кортеса в Мексику. Кортес сжег все корабли после высадки, чтобы исключить возможность возвращения, и двинулся вглубь страны к государству ацтеков.\n\n2. **Захват Мехико (1521 г.)**: Испанцы беспрепятственно вошли в столицу ацтеков, Мехико, где их приняли за богов.

#### Creating a Custom Module and Registering it in the Dependency Container

The easiest way to integrate your own module is by registering it in the DI container through `ADAPTER_REGISTRY`.  
This lowers the entry barrier and simplifies integration.

Steps:

- Implement a custom class that adheres to the port interface (see `ragbee_fw.core.ports`).  
  You'll also find abstract base classes there to guide and validate your implementation.
- Register your module in the `ADAPTER_REGISTRY`.
- Define your module in the YAML configuration file (`app_config.yml` in this example).
- Proceed with the standard generation process using the DI container.


In [1]:
from ragbee_fw.core.ports.llm_port import LLMPort, BaseLLM
from ragbee_fw import DIContainer, ADAPTER_REGISTRY


In [2]:
display(ADAPTER_REGISTRY)

{'data_loader': {'file_loader': ragbee_fw.infrastructure.data_loader.file_loader.FileSystemLoader},
 'text_chunker': {'recursive_splitter': ragbee_fw.infrastructure.text_splitter.recursive_text_splitter.RecursiveTextSplitter},
 'retriever': {'bm25': ragbee_fw.infrastructure.retriever.bm25_client.BM25Client},
 'llm': {'HF': ragbee_fw.infrastructure.llm_clients.huggingface_client.HuggingFaceInferenceAdapter}}

In [None]:
class MyDummyResponse(LLMPort):
    def generate(self, prompt: str) -> str:
        # custom logic goes here
        return f"I got {len(prompt)} chars in out prompt: {prompt}"


In [4]:
DIContainer.register_adapter(
    component="llm",
    adapter_type="dummy",
    cls=MyDummyResponse,
)

In [5]:
display(ADAPTER_REGISTRY)

{'data_loader': {'file_loader': ragbee_fw.infrastructure.data_loader.file_loader.FileSystemLoader},
 'text_chunker': {'recursive_splitter': ragbee_fw.infrastructure.text_splitter.recursive_text_splitter.RecursiveTextSplitter},
 'retriever': {'bm25': ragbee_fw.infrastructure.retriever.bm25_client.BM25Client},
 'llm': {'HF': ragbee_fw.infrastructure.llm_clients.huggingface_client.HuggingFaceInferenceAdapter,
  'dummy': __main__.MyDummyResponse}}

Once registered, your module becomes available to the DI container and can replace the default implementation in the pipeline.
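The registry printed above is a nested mapping from component name to adapter type to class. A minimal sketch of how registration and lookup over such a mapping could work is shown below; `REGISTRY`, `register_adapter`, `resolve`, and `DummyLLM` are hypothetical names, and the real `DIContainer.register_adapter` may behave differently (for example, it may validate the class against the port).

```python
from typing import Dict, Type

# Hypothetical stand-in for a nested registry: component -> adapter type -> class.
REGISTRY: Dict[str, Dict[str, Type]] = {"llm": {}}


def register_adapter(component: str, adapter_type: str, cls: Type) -> None:
    """Store cls under REGISTRY[component][adapter_type]."""
    REGISTRY.setdefault(component, {})[adapter_type] = cls


def resolve(component: str, adapter_type: str) -> Type:
    """Look up the class that a config entry such as `type: dummy` refers to."""
    try:
        return REGISTRY[component][adapter_type]
    except KeyError:
        raise KeyError(
            f"No adapter '{adapter_type}' registered for '{component}'"
        ) from None


class DummyLLM:
    def generate(self, prompt: str) -> str:
        return f"{len(prompt)} chars"


register_adapter("llm", "dummy", DummyLLM)
llm = resolve("llm", "dummy")()
reply = llm.generate("hello")
```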

The next step is to define it in the YAML configuration file:

```yaml
# app_config.yml
llm:
  type: dummy
  model_name: any_name
  token: ""
  provider: ""
  prompt: ""
  max_new_tokens: 0
```

In this example, however, we will directly modify the loaded Pydantic config object instead of editing the YAML file.

In [6]:
from ragbee_fw import load_config

app_config_path = "/workspace/src/ragbee_fw/config/app_config.yml"
app_config = load_config(app_config_path)

In [7]:
from ragbee_fw.core.models.app_config import LLM


app_config.llm = LLM(
    type="dummy",
    model_name="any_name",
    token="",
    provider="",
    prompt="",
    max_new_tokens=0,
)

> 💡 **Tip:** The `load_config(...)` function returns a Pydantic model instance.  
> This means you can work with it like any other Python object:
>
> - update fields directly (`app_config.llm = ...`),
> - validate values,
> - export it using `.dict()` or `.json()`.

To save the configuration back to YAML:

```python
from pathlib import Path
import yaml

with Path("new_config.yml").open("w", encoding="utf-8") as f:
    yaml.safe_dump(app_config.dict(), f, allow_unicode=True)
```
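The `.dict()` call above is the Pydantic v1 spelling; in Pydantic v2 it is deprecated in favour of `.model_dump()`. This guide does not state which version the framework pins, so a small compatibility helper can cover both cases. The function name `config_to_dict` is hypothetical.

```python
from typing import Any, Dict


def config_to_dict(config: Any) -> Dict[str, Any]:
    """Export a Pydantic model to a plain dict on either major version."""
    if hasattr(config, "model_dump"):  # Pydantic v2
        return config.model_dump()
    return config.dict()  # Pydantic v1
```

The result can then be passed to `yaml.safe_dump` in place of `app_config.dict()`.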

In [8]:
display(app_config.llm)

LLM(type='dummy', model_name='any_name', token='', provider='', base_url=None, prompt='', max_new_tokens=0, return_full_response=False, params=None)

Now you have an application configuration object that includes your custom LLM module (e.g., `MyDummyResponse`).  
From here, you can proceed with the standard framework usage.

In [9]:
container = DIContainer(app_config)
app = container.build_app_service()

In [10]:
# The query is in Russian: "What were the Spanish conquests in America?"
response = app.generate_answer(query="какие были Испанские завоевания в Америке?")

In [11]:
response

'I got 4150 chars in out prompt: Based on the following fragments:\n\n[1]  прибыль.   Испанские завоевания в Америке.\n   В 1518 г. испанский губернатор острова Кубы послал экспедицию из 600 человек во главе с Фернандо Кортесом в только что открытую Мексику. Кортес сжег после высадки все корабли, чтобы нельзя было вернуться домой, и двинулся в глубь страны к государству ацтеков. Испанцы беспрепятственно проникли в столицу ацтеков Мехико. Ацтеки приняли белокожих испанцев за богов. Но мирные отношения быстро закончились. Испанцы учинили страшный погром. В 1521 г. Кортес окончательно
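The dummy output makes the prompt passed to the LLM visible: it opens with "Based on the following fragments:" followed by numbered fragments. A prompt of that general shape could be assembled as in the sketch below. Only the header and the `[1]` numbering are taken from the output above; the function name `build_prompt`, the separator between fragments, and the placement of the question are assumptions, since the output is truncated before that point.

```python
from typing import List


def build_prompt(fragments: List[str], question: str) -> str:
    """Number each fragment and append the question (layout is an assumption)."""
    numbered = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(fragments, start=1)
    )
    return f"Based on the following fragments:\n\n{numbered}\n\nQuestion: {question}"


prompt = build_prompt(["first fragment", "second fragment"], "What happened?")
```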