## Semantic Node Splitter

In [1]:
from rag.document_loader import DocumentLoader

# Create loader
doc_loader = DocumentLoader()

# Ingest all papers (adjust path as needed)
result = doc_loader.ingest_directory(
    directory_path="Research Paper",
    document_type="research_paper",
    tenant_id="staging",
    recursive=True
)

print(f"✓ Ingested {result['documents_processed']} documents")
print(f"✓ Created {result['nodes_processed']} searchable chunks")

2025-12-01 14:54:45.615 | INFO     | rag.vector_store:_create_indexes:85 - Created MongoDB indexes for filtering
2025-12-01 14:54:45.617 | INFO     | rag.vector_store:__init__:65 - Initialized VectorStore: db=swastya, collection=user_context_vectors, index=vector_search_index
2025-12-01 14:54:45.618 | INFO     | rag.embedding_service:__init__:43 - Initialized EmbeddingService with model: text-embedding-3-small
2025-12-01 14:54:45.619 | INFO     | rag.embedding_service:__init__:43 - Initialized EmbeddingService with model: text-embedding-3-small
2025-12-01 14:54:45.620 | INFO     | rag.semantic_chunking:__init__:63 - Initialized SemanticChunker with buffer_size=1, threshold=95
2025-12-01 14:54:45.620 | INFO     | 

✓ Ingested 19 documents
✓ Created 72 searchable chunks
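
The ingestion log reports a `SemanticChunker` initialized with `buffer_size=1, threshold=95`, but the notebook does not show how `rag.semantic_chunking` splits text. Assuming those parameters follow the percentile-breakpoint scheme of LlamaIndex's `SemanticSplitterNodeParser` (`buffer_size`, `breakpoint_percentile_threshold`), the idea can be sketched roughly as below. The names `toy_embed`, `cosine_distance`, `percentile`, and `semantic_split` are hypothetical, the bag-of-words vector only stands in for a real embedding model such as `text-embedding-3-small`, and the sentences are made-up toy data.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in embedding: a bag-of-words count vector (NOT a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - (dot / norm if norm else 0.0)


def percentile(values, pct):
    """Linear-interpolation percentile, with pct in [0, 100]."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def semantic_split(sentences, buffer_size=1, threshold=95):
    """Group sentences into chunks, cutting where adjacent windows differ most."""
    if len(sentences) < 2:
        return [" ".join(sentences)] if sentences else []

    # Embed each sentence together with `buffer_size` neighbours on each side.
    windows = [
        " ".join(sentences[max(0, i - buffer_size): i + buffer_size + 1])
        for i in range(len(sentences))
    ]
    vectors = [toy_embed(w) for w in windows]

    # Distance between each window and the next one.
    distances = [cosine_distance(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]

    # A boundary is any gap whose distance exceeds the chosen percentile.
    cutoff = percentile(distances, threshold)

    chunks, current = [], [sentences[0]]
    for sentence, dist in zip(sentences[1:], distances):
        if dist > cutoff:
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
    chunks.append(" ".join(current))
    return chunks


# Toy sentences, invented for illustration only.
sentences = [
    "Ketogenic diets raise urinary ketones.",
    "Ketogenic diets raise uric acid.",
    "Low-fat diets reduce satiety.",
    "Low-fat diets increase hunger.",
]
for chunk in semantic_split(sentences):
    print(chunk)
```

A high percentile such as 95 keeps only the sharpest topic shifts as boundaries, which would be consistent with the fairly small number of chunks per document reported above.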


In [2]:
tenant_id = "staging"
user_id = "research_paper:research_paper"
query = "what are the side effects of a high fat diet?"

In [3]:
from rag.retriever import RAGRetriever

# Initialize retriever
retriever = RAGRetriever()

2025-12-01 14:56:26.954 | INFO     | rag.vector_store:_create_indexes:85 - Created MongoDB indexes for filtering
2025-12-01 14:56:26.955 | INFO     | rag.vector_store:__init__:65 - Initialized VectorStore: db=swastya, collection=user_context_vectors, index=vector_search_index
2025-12-01 14:56:26.956 | INFO     | rag.embedding_service:__init__:43 - Initialized EmbeddingService with model: text-embedding-3-small
2025-12-01 14:56:27.920 | INFO     | rag.retriever:__init__:58 - Initialized RAGRetriever with LlamaIndex components


In [4]:
# Use the retriever's LlamaIndex query engine to retrieve context and generate an answer
query_engine = retriever.as_query_engine(
    user_id=user_id,
    tenant_id=tenant_id
)
response = query_engine.query(query)
print(response)

2025-12-01 14:56:31,465 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-12-01 14:56:34,332 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Some reported side effects of a high-fat diet include bad taste in the mouth, constipation, diarrhea, dizziness, halitosis, headache, insomnia, nausea, thirst, and tiredness, weakness, or fatigue.


### The side effects of a low-fat diet may include potential nutrient deficiencies, particularly in fat-soluble vitamins like Vitamin D, E, A, and K, as well as essential fatty acids. Additionally, some individuals may experience increased hunger and cravings due to reduced satiety from fat intake, leading to potential overeating or difficulty in maintaining weight loss.

### The side effects of a high-fat diet can include an elevated risk of coronary heart disease, potential harmful effects on kidney function, and an increased risk of all-cause mortality, particularly when the diet is based on animal protein sources.

In [5]:
result = retriever.retrieve(
    query=query,
    user_id=user_id,
    tenant_id=tenant_id,
    top_k=5
)

for idx, res in enumerate(result["chunks"]):
    print(f"---------------------------------------- Chunk Start {idx+1} ------------------------------------------")
    print("Source Document: ", res["metadata"].get("filename", "N/A"))
    print("------------------------------------------------------------------------------------------\n")
    print("Text: ", res["text"])
    print(f"---------------------------------------- Chunk End {idx+1} ------------------------------------------")
    print("\n\n")

2025-12-01 14:56:40.496 | INFO     | rag.retriever:retrieve:84 - Retrieving context for query: 'what are the side effects of a high fat diet?...'
2025-12-01 14:56:41,126 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-12-01 14:56:41.861 | INFO     | rag.retriever:retrieve:165 - Retrieved 5 chunks for user research_paper:research_paper


---------------------------------------- Chunk Start 1 ------------------------------------------
Source Document:  Popular-Diets-A-Scientific-Review.pdf
------------------------------------------------------------------------------------------

Text:  A number of different metabolic effects have been re-
ported for high-fat, low-CHO diets. The most common is
ketosis, as measured by increased urinary ketones
(24,57,58,60,63,69,79). Ketogenic diets usually have less
than 20% calories from CHOs (80). Because many of these
are also low calorie, average CHO intake is 50 to 100 g/d.
All popular low-CHO diets recommend,100 g of CHO per
day. Ketogenic diets may cause a significant increase in
blood uric acid concentration (57,60,63,67,78).
Other metabolic effects range from decreased blood
glucose and insulin levels, to altered blood lipid levels
(Table 10). Many of these effects (e.g., decreased LDL
and HDL cholesterol) may be the consequence of weight
loss, rather than diet composition, esp
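
The raw chunk text keeps the PDF's hard line breaks, which makes long printouts hard to scan. A minimal sketch of a more compact view follows, assuming only the result shape used in the loop above (`result["chunks"]`, each with `"text"` and a `"metadata"` dict that may hold `"filename"`). The helper name `format_chunks` and the sample data are hypothetical and not part of the `rag` package.

```python
import textwrap


def format_chunks(result, width=88, preview_chars=300):
    """Return a compact, readable summary of retrieved chunks."""
    blocks = []
    for idx, chunk in enumerate(result.get("chunks", []), start=1):
        source = chunk.get("metadata", {}).get("filename", "N/A")
        # Collapse PDF line breaks so the preview reads as one paragraph.
        text = " ".join(chunk.get("text", "").split())
        # Truncate on a word boundary, then re-wrap to the display width.
        preview = textwrap.shorten(text, width=preview_chars, placeholder=" ...")
        blocks.append(f"[{idx}] {source}\n{textwrap.fill(preview, width=width)}")
    return "\n\n".join(blocks)


# Made-up sample in the same shape as the retriever output.
sample_result = {
    "chunks": [
        {"text": "first line\nsecond line", "metadata": {"filename": "example.pdf"}},
        {"text": "no metadata filename", "metadata": {}},
    ]
}
print(format_chunks(sample_result))
```

With a live retriever, `print(format_chunks(result))` could replace the printing loop. Nothing is changed in the stored chunks, only in how they are displayed.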

In [7]:
tenant_id = "staging"
user_id = "research_paper:research_paper"
query = "Can you explain in detail the potential short-term and long-term side effects of consuming a high-fat diet, including how it may affect different body systems such as the heart, metabolism, digestion, and overall health?"

In [8]:
# Run the longer, more detailed query through the same LlamaIndex query engine
query_engine = retriever.as_query_engine(
    user_id=user_id,
    tenant_id=tenant_id
)
response = query_engine.query(query)
print(response)

2025-12-01 15:01:44,259 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-12-01 15:01:48,913 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Consuming a high-fat diet can lead to various short-term and long-term side effects. In the short term, individuals may experience increased levels of urinary ketones, elevated blood uric acid concentration, and potential adverse effects like bad taste in the mouth, constipation, diarrhea, dizziness, halitosis, headache, insomnia, nausea, thirst, and tiredness. Long-term consumption of a high-fat diet may impact different body systems. It can affect the heart by potentially increasing the risk of coronary heart disease due to elevated levels of LDL cholesterol. Metabolically, high-fat diets may lead to altered blood lipid levels, decreased blood glucose and insulin levels, and changes in blood pressure. In terms of digestion, high-fat diets could potentially impact renal function due to increased calciuria from high protein intake and may pose risks to bone health if not balanced with adequate nutrients like potassium, magnesium, fiber, and vitamins. Overall, the long-term effects of a

In [9]:
result = retriever.retrieve(
    query=query,
    user_id=user_id,
    tenant_id=tenant_id,
    top_k=5
)

for idx, res in enumerate(result["chunks"]):
    print(f"---------------------------------------- Chunk Start {idx+1} ------------------------------------------")
    print("Source Document: ", res["metadata"].get("filename", "N/A"))
    print("------------------------------------------------------------------------------------------\n")
    print("Text: ", res["text"])
    print(f"---------------------------------------- Chunk End {idx+1} ------------------------------------------")
    print("\n\n")

2025-12-01 15:01:56.724 | INFO     | rag.retriever:retrieve:84 - Retrieving context for query: 'Can you explain in detail the potential short-term...'
2025-12-01 15:01:57,284 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-12-01 15:01:57.696 | INFO     | rag.retriever:retrieve:165 - Retrieved 5 chunks for user research_paper:research_paper


---------------------------------------- Chunk Start 1 ------------------------------------------
Source Document:  Popular-Diets-A-Scientific-Review.pdf
------------------------------------------------------------------------------------------

Text:  Results of studies seem impressive but questions about
long-term efficacy and risk reduction remain. Extrapolation
to the general population from motivated individuals (e.g.,
those with coronary heart disease) is questionable. The
independent effects of weight loss, physical activity and
accompanying lifestyle interventions complicate interpreta-
tion (29). The American Heart Association’s Science Ad-
visory recommends persons with insulin-dependent diabetes
mellitus, elevated TG levels, and CHO malabsorption ill-
nesses avoid VLF diets (29).
4. Hunger and Appetite:Compliance
c What is the effect of low-fat and VLF diet on hunger
and appetite?
c What data supports compliance to low-fat and VLF
diets?
Low-Fat Diets
The issue of satiety fo
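
The short query and the longer, more detailed query return a different first chunk from the same document. One way to quantify how much the two result sets overlap is sketched below, assuming each retrieval is stored under its own variable (the notebook reuses `result`) and keeps the `result["chunks"]` shape shown above. The function name `chunk_overlap` and the placeholder chunks are hypothetical.

```python
def chunk_overlap(result_a, result_b):
    """Compare two retrieval results by the exact text of their chunks."""
    texts_a = {chunk["text"] for chunk in result_a["chunks"]}
    texts_b = {chunk["text"] for chunk in result_b["chunks"]}
    shared = texts_a & texts_b
    union = texts_a | texts_b
    return {
        "shared": len(shared),
        "only_a": len(texts_a - texts_b),
        "only_b": len(texts_b - texts_a),
        # Jaccard similarity: shared chunks divided by all distinct chunks.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder results; with a live retriever these would come from
# two retriever.retrieve(...) calls, one per query.
short_query_result = {"chunks": [{"text": "chunk A"}, {"text": "chunk B"}]}
long_query_result = {"chunks": [{"text": "chunk B"}, {"text": "chunk C"}]}

overlap = chunk_overlap(short_query_result, long_query_result)
print(overlap)
```

A low Jaccard value would suggest that query phrasing strongly changes what the retriever surfaces, which is worth checking before comparing the two generated answers.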