# Vedic Language Model (VLM) Demo

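This notebook walks through the basic workflow of the Vedic Language Model (VLM). Several components are not yet implemented, so the tokenization, grammar, and RAG cells below print placeholder results in place of real model output.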

The VLM combines:
1. A core LLM trained on Vedic texts
2. Pāṇinian grammar validation
3. Neuro-symbolic reasoning
4. Retrieval-Augmented Generation (RAG)

In [None]:
# Import required modules
import sys
import torch
sys.path.append('../')  # Add project root to path

from src.vlm.core.model import VLMCore
from src.vlm.core.tokenizer import SanskritTokenizer
from src.vlm.grammar.ashtadhyayi import AshtadhyayiEngine
from src.vlm.rag.retriever import IndicRetriever
from src.vlm.rag.generator import RAGGenerator
from src.utils.config import VLMConfig

## 1. Initialize Components

First, we'll initialize the core components of the VLM system.

In [None]:
# Initialize configuration
config = VLMConfig()

# Initialize tokenizer
tokenizer = SanskritTokenizer()

# Initialize core model
model = VLMCore(config)

# Initialize grammar engine
grammar_engine = AshtadhyayiEngine()

# Initialize retriever
retriever = IndicRetriever()

# Initialize RAG generator
rag_generator = RAGGenerator(model, retriever)

## 2. Tokenization Demo

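This cell shows where Sanskrit-aware tokenization fits into the pipeline. Until the tokenizer is implemented, plain whitespace splitting stands in for it.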

In [None]:
# Sample Sanskrit text
text = "नमस्ते संस्कृत भाषा"  # 'Hello Sanskrit language'

# Tokenize the text
# Placeholder: once SanskritTokenizer is implemented, replace the
# whitespace split below with the commented-out call.
# tokens = tokenizer.tokenize(text)
tokens = text.split()  # Whitespace split only; not Sanskrit-aware

print(f"Text: {text}")
print(f"Tokens: {tokens}")
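
To give a sense of what "Sanskrit-aware" could mean beyond whitespace splitting, here is a minimal sketch that splits a Devanagari word into approximate akṣaras (orthographic syllables). It is an assumption-laden illustration, not the project's `SanskritTokenizer`: the helper name `split_aksharas` is hypothetical, and the rule (attach combining marks to the preceding character, and join a consonant that follows a virama) is a simplification that ignores sandhi and morphology.

```python
import unicodedata

VIRAMA = "\u094d"  # Devanagari sign virama (halant)


def split_aksharas(word: str) -> list[str]:
    """Split a Devanagari word into approximate aksharas (hypothetical helper)."""
    aksharas: list[str] = []
    for ch in word:
        # Vowel signs, anusvara, visarga, and virama are combining marks (Mn/Mc).
        is_mark = unicodedata.category(ch) in ("Mn", "Mc")
        # A character right after a virama belongs to the same conjunct cluster.
        joins_conjunct = bool(aksharas) and aksharas[-1].endswith(VIRAMA)
        if aksharas and (is_mark or joins_conjunct):
            aksharas[-1] += ch
        else:
            aksharas.append(ch)
    return aksharas


for word in "नमस्ते संस्कृत भाषा".split():
    print(word, "->", split_aksharas(word))
```

Splitting on akṣaras keeps conjuncts such as `स्ते` together, which a character-level split would break apart. A real tokenizer would likely combine a step like this with sandhi splitting or a learned subword vocabulary.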

## 3. Grammar Validation Demo

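This cell shows the intended interface for Pāṇinian grammar validation: check whether a sentence is valid, then request a correction for an invalid one. The results are hard-coded until the grammar engine is implemented.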

In [None]:
# Sample texts (valid and invalid according to Pāṇinian grammar)
valid_text = "रामः वनं गच्छति"  # 'Rama goes to the forest'
invalid_text = "रामः वनं गच्छति अहम्"  # Invalid: stray first-person pronoun 'अहम्' ('I') with a third-person verb

# Placeholder: the calls below are the intended API once the grammar engine is implemented
# is_valid1 = grammar_engine.validate(valid_text)
# is_valid2 = grammar_engine.validate(invalid_text)
# corrected_text = grammar_engine.correct(invalid_text)

# Hard-coded placeholder results (not computed by the grammar engine)
is_valid1 = True
is_valid2 = False
corrected_text = "रामः वनं गच्छति"

print(f"Valid text: '{valid_text}' -> {is_valid1}")
print(f"Invalid text: '{invalid_text}' -> {is_valid2}")
print(f"Corrected: '{corrected_text}'")
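
As a rough illustration of the kind of check a validator might perform, the sketch below tests person agreement between pronouns and finite verbs using a tiny hand-written lookup table. Everything here is an assumption for illustration: `PRONOUN_PERSON`, `VERB_PERSON`, and `persons_agree` are hypothetical names, the table covers only a few forms of one verb, and a lookup like this is far simpler than rule-based Pāṇinian derivation.

```python
# Hypothetical mini-lexicon: surface form -> grammatical person (1, 2, or 3).
PRONOUN_PERSON = {"अहम्": 1, "त्वम्": 2, "सः": 3}
VERB_PERSON = {"गच्छामि": 1, "गच्छसि": 2, "गच्छति": 3}


def persons_agree(sentence: str) -> bool:
    """Return False if a known pronoun and a known verb disagree in person."""
    words = sentence.split()
    pronoun_persons = {PRONOUN_PERSON[w] for w in words if w in PRONOUN_PERSON}
    verb_persons = {VERB_PERSON[w] for w in words if w in VERB_PERSON}
    if not pronoun_persons or not verb_persons:
        return True  # Nothing to compare, so this toy check raises no objection
    return pronoun_persons == verb_persons


for sentence in ["रामः वनं गच्छति", "रामः वनं गच्छति अहम्", "अहम् वनं गच्छामि"]:
    print(f"'{sentence}' -> {persons_agree(sentence)}")
```

Under this toy check the second sentence fails because `अहम्` is first person while `गच्छति` is third person. Words missing from the table are ignored, so the check can only flag the forms it knows about.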

## 4. RAG Generation Demo

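This cell shows the intended Retrieval-Augmented Generation interface: pass a query to the generator and receive an answer grounded in retrieved passages. The response is hard-coded until the RAG system is implemented.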

In [None]:
# Sample query
query = "What is the significance of Agni in Rigveda?"

# Placeholder: the call below is the intended API once the RAG system is implemented
# response = rag_generator.generate(query)

# Hard-coded placeholder response (not produced by the model)
response = """In the Rigveda, Agni (fire) holds profound significance as both a deity and a cosmic principle. 
As one of the most frequently invoked deities, Agni serves as the divine messenger between humans and gods, 
carrying sacrificial offerings upward. The Rigveda describes Agni as 'mouth of the gods' (devānām mukham), 
highlighting his role in ritual practice. Beyond his ritual function, Agni represents transformative power, 
purification, and illumination of consciousness."""

print(f"Query: {query}")
print(f"Response: {response}")
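
The retrieve-then-generate flow can be sketched without any model at all. The example below ranks passages by word overlap with the query and assembles a prompt from the best match. It is a minimal sketch under stated assumptions: `overlap_score`, `retrieve`, and `build_prompt` are hypothetical helpers, not the `IndicRetriever` or `RAGGenerator` API, and word overlap stands in for whatever retrieval method the project adopts. The passages are short sentences adapted from this notebook.

```python
import re
from collections import Counter


def words(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def overlap_score(query: str, passage: str) -> int:
    """Count the word occurrences shared by the query and the passage."""
    return sum((Counter(words(query)) & Counter(words(passage))).values())


def retrieve(query: str, passages: list[str], k: int = 1) -> list[str]:
    """Return the k passages with the highest overlap score."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Place the retrieved passages ahead of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"


passages = [
    "Agni serves as the divine messenger between humans and gods.",
    "The VLM combines a core LLM with Pāṇinian grammar validation.",
]
query = "What is the significance of Agni in Rigveda?"
print(build_prompt(query, retrieve(query, passages, k=1)))
```

In a full pipeline the assembled prompt would be passed to the language model, so the answer can draw on the retrieved passage instead of relying only on what the model memorized during training.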

## 5. Next Steps

Future development areas:

1. Complete the Sanskrit tokenizer implementation
2. Implement the Pāṇinian grammar rule engine
3. Train the core model on a Vedic corpus
4. Build the knowledge base for the retriever
5. Integrate the Nyāya reasoning module