### Custom MBA courses RAG project
---

In [1]:
# Make the project's packages importable from this notebook
import sys
import os

# Add the project root (one level up from notebooks/) to the Python path
project_root = os.path.abspath("..")
if project_root not in sys.path:
    sys.path.append(project_root)

In [2]:
# Set up the models used for routing, chatting, and embedding
from core.config.llm_setup import LLMsetups

router_llm = LLMsetups.ROUTER_LLM
chat_llm = LLMsetups.CHAT_LLM
embed_model = LLMsetups.EMBED_MODEL



In [3]:
# Load the collection names and descriptions from the custom JSON file
import json

docs_path = "../documents"
collections_mba_json = docs_path + "/collections_mba.json"

with open(collections_mba_json, "r", encoding = "utf-8") as file:
    COLLECTIONS_MBA = json.load(file)

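# Clean each description: drop its first line and last two lines, strip the rest, and rejoin
for collection_name, collection_description in COLLECTIONS_MBA.items():
    description_lines = collection_description.splitlines()[1:-2]
    COLLECTIONS_MBA[collection_name] = " \n ".join(line.strip() for line in description_lines)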
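
The cleanup step above can be checked in isolation. The following is a minimal sketch, assuming each description in the JSON file starts with one padding line and ends with two; the helper name `clean_description` and the sample text are hypothetical and are not part of the project.

```python
def clean_description(description: str) -> str:
    """Mirror the notebook's cleanup: drop the first line and the last two
    lines, strip each remaining line, and join them with " \\n "."""
    kept = description.splitlines()[1:-2]
    return " \n ".join(line.strip() for line in kept)


# Hypothetical description; the real text lives in collections_mba.json.
sample = "\n  Financial accounting basics.  \n  Uses US GAAP and IFRS.\n\n  "
cleaned = clean_description(sample)
print(repr(cleaned))

# Without the assumed padding lines, the same slice discards real content.
unpadded = clean_description("one\ntwo\nthree")
print(repr(unpadded))
```

If a description in the JSON file does not have that padding, the `[1:-2]` slice silently removes content, so it may be worth verifying the file layout before relying on it.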

### Test the RagWorkflow class

In [4]:
# Test the refactored RAG workflow class end to end
from core.helpers.chat_engine_registry import ChatEngineRegistry
from core.src.rag_workflow_final import RagWorkflow

import nest_asyncio
nest_asyncio.apply()

# Initialize the chat engine registry and the RAG workflow
chat_engines = ChatEngineRegistry(chat_llm = chat_llm)

rag_instance = RagWorkflow(
    chat_engine_registry = chat_engines,
    router_llm = router_llm,
    chat_llm = chat_llm,
    embed_model = embed_model
)

In [5]:
# First run: parse and embed the documents of every collection, then return the router retriever
router_retriever = await rag_instance.run(docs_path = docs_path, collections = COLLECTIONS_MBA)

Parsing nodes: 100%|██████████| 337/337 [00:00<00:00, 7157.59it/s]
Generating embeddings: 100%|██████████| 337/337 [00:11<00:00, 28.45it/s]
Parsing nodes: 100%|██████████| 205/205 [00:00<00:00, 358.95it/s]
Generating embeddings: 100%|██████████| 217/217 [00:12<00:00, 17.39it/s]
Parsing nodes: 100%|██████████| 203/203 [00:00<00:00, 4559.07it/s]
Generating embeddings: 100%|██████████| 209/209 [00:05<00:00, 35.55it/s]
Parsing nodes: 100%|██████████| 306/306 [00:00<00:00, 8337.17it/s]
Generating embeddings: 100%|██████████| 306/306 [00:09<00:00, 33.99it/s]
Parsing nodes: 100%|██████████| 292/292 [00:00<00:00, 8085.78it/s]
Generating embeddings: 100%|██████████| 292/292 [00:05<00:00, 54.61it/s]
Parsing nodes: 100%|██████████| 365/365 [00:00<00:00, 7394.32it/s]
Generating embeddings: 100%|██████████| 365/365 [00:13<00:00, 27.43it/s]


In [6]:
# Check which collection the router selects for a marketing question
user_query = "What can you say about behaviour theories used in Marketing?"
retrieved_nodes = await router_retriever.aretrieve(user_query)

2025-12-15 16:42:25,655 - INFO - AFC is enabled with max remote calls: 10.
2025-12-15 16:42:27,040 - INFO - Selecting retriever 3: This choice explicitly discusses the use of behavioral theories to understand how individuals make consumption decisions, detailing internal psychological factors and external influences that shape consumer behavior. It directly addresses the application of behavioral theories in marketing..
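
The log line above shows the router choosing one retriever and explaining its choice against the collection descriptions. As a rough illustration of that selection step only, the sketch below swaps the LLM-based choice for naive word overlap between the query and each description. It is not how the project's router works; `pick_collection`, the collection names, and the descriptions are all hypothetical.

```python
import re


def pick_collection(query: str, descriptions: dict[str, str]) -> str:
    """Return the collection whose description shares the most words with the query."""

    def tokenize(text: str) -> set[str]:
        # Lowercase alphabetic tokens only; punctuation and digits are ignored.
        return set(re.findall(r"[a-z]+", text.lower()))

    query_words = tokenize(query)
    return max(
        descriptions,
        key=lambda name: len(query_words & tokenize(descriptions[name])),
    )


# Hypothetical collection names and descriptions, for illustration only.
toy_collections = {
    "marketing": "Behavioral theories of consumer decisions in marketing.",
    "finance": "Time value of money, risk, and return in finance.",
}
choice = pick_collection("What behavioral theories are used in marketing?", toy_collections)
print(choice)
```

Word overlap breaks down on synonyms and spelling variants (for example "behaviour" versus "behavioral"), which is one reason to delegate the choice to a language model as the notebook does.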


In [7]:
# Let's test the query and memory of the RAG system
user_name = "SomeNewUSer"
user_id = "test_user_id0"
user_query = "What are the differences between Financial and Managerial accounting?"

rag_response = await rag_instance.run(
    router_retriever = router_retriever,
    user_query = user_query,
    user_name = user_name, 
    user_id = user_id
)
print(rag_response)

2025-12-15 16:42:29,955 - INFO - AFC is enabled with max remote calls: 10.
2025-12-15 16:42:30,930 - INFO - Selecting retriever 0: This choice describes financial accounting and its goal of equipping students to read, analyze, and interpret financial accounting information for decision-making. It details the financial statements and the principles used (US GAAP and IFRS). While it doesn't explicitly contrast with managerial accounting, it defines one side of the comparison..
2025-12-15 16:42:30,957 - INFO - AFC is enabled with max remote calls: 10.
2025-12-15 16:42:31 | INFO     | core.helpers.chat_engine_registry:get_or_create_chat_engine:23 - Creating a new chat engine for user test_user_id0
2025-12-15 16:42:31,636 - INFO - AFC is enabled with max remote calls: 10.


SomeNewUSer, the differences between financial and managerial accounting are as follows:

**Financial Accounting:**
*   **Rules:** Prepared under external rules such as IFRS and US-GAAP.
*   **Auditing:** Subject to external audits by firms like KPMG, EY, Deloitte, and PwC.
*   **Users:** Primarily for external stakeholders including shareholders, creditors, tax authorities, labor unions, and employees.
*   **Units of Measure:** Uses monetary units (e.g., Tenge, USD).
*   **Scope:** Considered "external" accounting.
*   **Output:** Produces financial statements (e.g., Balance sheet, 10K-Form).

**Managerial Accounting:**
*   **Rules:** Prepared under internal guidelines.
*   **Auditing:** Not audited.
*   **Users:** Primarily for internal managers, such as c-suite executives (CEO, CFO).
*   **Units of Measure:** Uses both monetary and non-monetary units (e.g., hours per product).
*   **Scope:** Considered "internal" accounting.
*   **Output:** Includes budget reports, among other inter

In [18]:
# Test with a second user ID (same display name) to check that chat engines are kept per user
user_name2 = "SomeNewUSer"
user_id2 = "test_user_id2"
user_query2 = "Do you think that people would have money now or later?"

rag_response2 = await rag_instance.run(
    router_retriever = router_retriever,
    user_query = user_query2,
    user_name = user_name2, 
    user_id = user_id2
)
print(rag_response2)

2025-12-15 16:50:37,712 - INFO - AFC is enabled with max remote calls: 10.
2025-12-15 16:50:38,580 - INFO - Selecting retriever 1: Choice (2) directly addresses the question by stating '1 Time is Money (Impatience Principle) - All else equal, individuals prefer money now to later.' This principle is a foundational concept in finance that directly relates to the preference for having money now versus later..
2025-12-15 16:50:38,601 - INFO - AFC is enabled with max remote calls: 10.
2025-12-15 16:50:39 | INFO     | core.helpers.chat_engine_registry:get_or_create_chat_engine:20 - Using cached engine for the user test_user_id2
2025-12-15 16:50:39,009 - INFO - AFC is enabled with max remote calls: 10.


SomeNewUSer, the provided context does not offer an opinion on whether people would have money now or later. It poses questions about determining the value of money available in the future and how to ascertain that value.
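
The "Creating a new chat engine" and "Using cached engine" log lines suggest that the registry keeps one engine per `user_id`. A minimal sketch of that get-or-create pattern follows, assuming a plain dictionary cache. `ToyChatEngineRegistry` and `engine_factory` are hypothetical stand-ins; only the names `get_or_create_chat_engine` and `chat_engines_cached` are borrowed from the notebook, and the project's actual implementation may differ.

```python
class ToyChatEngineRegistry:
    """Illustrative stand-in, not the project's ChatEngineRegistry."""

    def __init__(self, engine_factory):
        self.engine_factory = engine_factory  # builds a fresh engine for a new user
        self.chat_engines_cached = {}         # maps user_id -> engine

    def get_or_create_chat_engine(self, user_id):
        # Create an engine on first contact, then reuse it on later calls.
        if user_id not in self.chat_engines_cached:
            self.chat_engines_cached[user_id] = self.engine_factory()
        return self.chat_engines_cached[user_id]


# A plain list stands in for an engine with per-user memory.
registry = ToyChatEngineRegistry(engine_factory=list)

first = registry.get_or_create_chat_engine("test_user_id0")
first.append("What are the differences between Financial and Managerial accounting?")

again = registry.get_or_create_chat_engine("test_user_id0")
other = registry.get_or_create_chat_engine("test_user_id2")

print(again is first, other is first, len(again), len(other))
```

Because the cache is keyed by `user_id` rather than by display name, two users who share a name still get separate conversation histories.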


In [20]:
rag_instance.chat_engine_registry.chat_engines_cached["test_user_id2"]._memory.get()

[ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='Name of the user: SomeNewUSer; question: What are the main principles of Finance?')]),
 ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, additional_kwargs={'thought_signatures': [None], 'prompt_tokens': 288, 'completion_tokens': 32, 'total_tokens': 320}, blocks=[TextBlock(block_type='text', text='I am sorry, SomeNewUSer, but the provided context does not contain information about the main principles of Finance. Therefore, I cannot answer your question.')]),
 ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text="\n            You are a helpful academic assistant specialized in MBA courses for Master of Engineering Management (MEM) students.\n\n            Address the user by their name: SomeNewUSer.\n\n            Your task is to answer the user's question using ONLY the information provided in the context below.\n  