### Custom MBA courses RAG project
---

In [1]:
# Go one level up in the directory hierarchy so the project's source packages can be imported
import sys
import os
# Add project root to Python path
project_root = os.path.abspath("..")  # go one level up from notebooks/
sys.path.append(project_root)

In [2]:
# Setup necessary models for routing, chatting and embedding
from core.config.llm_setup import LLMsetups

router_llm = LLMsetups.ROUTER_LLM
chat_llm = LLMsetups.CHAT_LLM
embed_model = LLMsetups.EMBED_MODEL



In [3]:
# Load the collection names and descriptions from my custom JSON file
import json

docs_path = "../documents"
collections_mba_json = docs_path + "/collections_mba.json"

with open(collections_mba_json, "r", encoding = "utf-8") as file:
    COLLECTIONS_MBA = json.load(file)

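# Drop the first line and the last two lines of each description, then strip and re-join the remaining lines
for collection_name, collection_description in COLLECTIONS_MBA.items():
    COLLECTIONS_MBA[collection_name] = " \n ".join(line.strip() for line in collection_description.splitlines()[1:-2])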
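
The cleanup step above can be tried in isolation. The sketch below is a minimal illustration, assuming each description is stored as a multi-line string padded with one leading line and two trailing lines; the sample text and the helper name `clean_description` are hypothetical and are not taken from `collections_mba.json`.

```python
# Hypothetical description: one empty leading line, two content lines,
# then two padding lines (an empty line and an indentation-only line).
raw_description = "\n    Financial Accounting\n    Covers financial statements.\n\n    "


def clean_description(text: str) -> str:
    """Mirror the notebook's cleanup: drop the first and the last two lines, strip the rest."""
    lines = text.splitlines()[1:-2]
    return " \n ".join(line.strip() for line in lines)


cleaned = clean_description(raw_description)
print(repr(cleaned))
```

Note that the `[1:-2]` slice always discards the last two lines, so a description without that trailing padding would lose real content.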

### Test the RagSystem class defined

In [4]:
# Test the refactored, class-based RAG workflows
from core.helpers.chat_engine_registry import ChatEngineRegistry
from core.src.rag_workflow import RagChatWorkflow, RagIngestionWorkflow

# Patch asyncio so that nested event loops are allowed inside the notebook's already-running loop
import nest_asyncio
nest_asyncio.apply()

# Initialize necessary modules and objects
chat_engines = ChatEngineRegistry(chat_llm = chat_llm)

rag_ingestion = RagIngestionWorkflow(
    router_llm = router_llm,
    embed_model = embed_model
)

rag_chat = RagChatWorkflow(
    chat_engine_registry = chat_engines,
    router_llm = router_llm,
    chat_llm = chat_llm
)

In [5]:
# We run this workflow only once to get a router retriever object from our knowledge base
router_retriever = await rag_ingestion.run(docs_path = docs_path, collections = COLLECTIONS_MBA)

Parsing nodes: 100%|██████████| 337/337 [00:00<00:00, 3556.34it/s]
Generating embeddings: 100%|██████████| 337/337 [00:39<00:00,  8.49it/s]
Parsing nodes: 100%|██████████| 205/205 [00:00<00:00, 573.78it/s]
Generating embeddings: 100%|██████████| 217/217 [00:45<00:00,  4.74it/s]
Parsing nodes: 100%|██████████| 203/203 [00:00<00:00, 2720.22it/s]
Generating embeddings: 100%|██████████| 209/209 [00:18<00:00, 11.00it/s]
Parsing nodes: 100%|██████████| 306/306 [00:00<00:00, 3767.26it/s]
Generating embeddings: 100%|██████████| 306/306 [00:30<00:00, 10.08it/s]
Parsing nodes: 100%|██████████| 292/292 [00:00<00:00, 3915.45it/s]
Generating embeddings: 100%|██████████| 292/292 [00:16<00:00, 17.27it/s]
Parsing nodes: 100%|██████████| 365/365 [00:00<00:00, 3348.61it/s]
Generating embeddings: 100%|██████████| 365/365 [00:41<00:00,  8.76it/s]


In [6]:
# Check which collection the router selects for a marketing question
user_query = "What can you say about behaviour theories used in Marketing?"
retrieved_nodes = await router_retriever.aretrieve(user_query)

2025-12-16 17:00:27,708 - INFO - AFC is enabled with max remote calls: 10.
2025-12-16 17:00:28,803 - INFO - Selecting retriever 3: This choice explicitly mentions 'consumer behavior' and states that the course will delve into it by 'applying behavioral theories to understand how individuals make consumption decisions.' It further elaborates on how 'internal psychological factors' and 'external influences' shape these decisions, directly addressing the use of behavior theories in marketing..
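
The log line above shows the router choosing one collection by comparing the query with the collection descriptions. The real selection is made by the router LLM; the sketch below swaps in a plain word-overlap score purely to illustrate the idea of description-based routing. The collection names, descriptions, and function names are all hypothetical.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def select_collection(query: str, descriptions: dict) -> str:
    """Return the collection whose description shares the most words with the query."""
    query_tokens = tokenize(query)
    scores = {
        name: len(query_tokens & tokenize(description))
        for name, description in descriptions.items()
    }
    return max(scores, key=scores.get)


# Hypothetical collections standing in for COLLECTIONS_MBA
toy_collections = {
    "financial_accounting": "Reading and interpreting financial statements for external stakeholders.",
    "consumer_behavior": "Applying behavioral theories to marketing and consumption decisions.",
}

selected = select_collection("Which behavioral theories are used in marketing?", toy_collections)
print(selected)
```

An LLM selector can also match paraphrases and explain its choice, as the logged reasoning shows, which a word-overlap score cannot do.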


In [7]:
# Let's test the query and memory of the RAG system
user_name = "SomeNewUSer"
user_id = "test_user_id0"
user_query = "What are the differences between Financial and Managerial accounting?"

rag_response = await rag_chat.run(
    router_retriever = router_retriever,
    user_query = user_query,
    user_name = user_name, 
    user_id = user_id
)
print(rag_response)

2025-12-16 17:02:18,648 - INFO - AFC is enabled with max remote calls: 10.
2025-12-16 17:02:20,654 - INFO - AFC is enabled with max remote calls: 10.
2025-12-16 17:02:21,567 - INFO - Selecting retriever 0: This choice describes financial accounting, focusing on reading, analyzing, and interpreting financial accounting information for external stakeholders. It details the financial statements and the principles governing their preparation, which are key aspects of financial accounting. While it doesn't explicitly contrast with managerial accounting, it provides a clear definition of one side of the comparison..
2025-12-16 17:02:21,597 - INFO - AFC is enabled with max remote calls: 10.
2025-12-16 17:02:22 | INFO     | core.helpers.chat_engine_registry:get_or_create_chat_engine:23 - Creating a new chat engine for user test_user_id0
2025-12-16 17:02:22,300 - INFO - AFC is enabled with max remote calls: 10.


Hello SomeNewUSer,

The provided context outlines the following differences between financial and managerial accounting:

*   **Rules of Preparation:** Financial accounting is prepared under external rules such as IFRS and US-GAAP, while managerial accounting is prepared under internal guidelines.
*   **Auditing:** Financial accounting is audited by external firms (e.g., KPMG, EY, Deloitte, PwC), whereas managerial accounting is not audited.
*   **Users of Information:** Financial accounting is intended for external parties including shareholders, creditors, tax authorities, labor unions, and employees. Managerial accounting is for internal managers, such as the c-suite, CEO, and CFO.
*   **Units of Measurement:** Financial accounting uses monetary units (e.g., Tenge, USD). Managerial accounting uses both monetary and non-monetary units (e.g., hours per product).
*   **Scope:** Financial accounting is considered "external" accounting, while managerial accounting is considered "internal

In [8]:
# Test a second user ID to check that it gets its own chat engine
user_name2 = "SomeNewUSer"
user_id2 = "test_user_id2"
user_query2 = "Do you think that people would have money now or later?"

rag_response2 = await rag_chat.run(
    router_retriever = router_retriever,
    user_query = user_query2,
    user_name = user_name2, 
    user_id = user_id2
)
print(rag_response2)

2025-12-16 17:03:01,739 - INFO - AFC is enabled with max remote calls: 10.
2025-12-16 17:03:02,604 - INFO - Selecting retriever 1: This choice explicitly mentions '1 Time is Money (Impatience Principle) - All else equal, individuals prefer money now to later.' which directly addresses the question of preferring money now or later..
2025-12-16 17:03:02,632 - INFO - AFC is enabled with max remote calls: 10.
2025-12-16 17:03:03 | INFO     | core.helpers.chat_engine_registry:get_or_create_chat_engine:23 - Creating a new chat engine for user test_user_id2
2025-12-16 17:03:03,371 - INFO - AFC is enabled with max remote calls: 10.


SomeNewUSer, the provided context does not contain information to determine whether people would have money now or later. It only poses questions about the future value of money and how to determine it.
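
The "Creating a new chat engine for user ..." log lines suggest that the registry keeps one chat engine per user ID and builds it on first use. The sketch below shows that get-or-create pattern with a toy engine; it is not the project's `ChatEngineRegistry`. Only the names `get_or_create_chat_engine` and `chat_engines_cached` come from the notebook, and the `Toy*` classes are hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class ToyChatEngine:
    """Hypothetical stand-in for a real chat engine; it only records the messages it receives."""
    history: list = field(default_factory=list)

    def chat(self, message: str) -> str:
        self.history.append(message)
        return f"(reply #{len(self.history)})"


class ToyChatEngineRegistry:
    """Cache one engine per user ID so each user keeps a separate conversation memory."""

    def __init__(self):
        self.chat_engines_cached = {}

    def get_or_create_chat_engine(self, user_id: str) -> ToyChatEngine:
        # Build the engine only on the first request for this user
        if user_id not in self.chat_engines_cached:
            print(f"Creating a new chat engine for user {user_id}")
            self.chat_engines_cached[user_id] = ToyChatEngine()
        return self.chat_engines_cached[user_id]


registry = ToyChatEngineRegistry()
registry.get_or_create_chat_engine("test_user_id0").chat("first question")
registry.get_or_create_chat_engine("test_user_id0").chat("follow-up question")
registry.get_or_create_chat_engine("test_user_id2").chat("another user's question")
```

With this layout, a repeated call for the same user ID reuses the cached engine, so follow-up questions see the earlier history while other users start from an empty one.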


In [10]:
# Inspect the chat memory stored for the second user
rag_chat.chat_engine_registry.chat_engines_cached["test_user_id2"]._memory.get()

[ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text="\n            You are a helpful academic assistant specialized in MBA courses for Master of Engineering Management (MEM) students.\n\n            Address the user by their name: (Note: user's name will be provided to you alongside each user question).\n\n            Your task is to answer the user's question using ONLY the information provided in the context. \n            (Note: The specific context will be provided to you alongside each user question).\n\n            Rules:\n            - Answer the question clearly, accurately, and concisely.\n            - Use ONLY the provided context to construct your answer.\n            - Do NOT use outside knowledge, assumptions, or general world knowledge.\n            - Do NOT invent facts or fill in missing information.\n            - If the context does not contain enough information to answer the question, explicitly state that t