In [1]:
import dotenv
import pandas as pd
from llama_index import download_loader
from llama_index.readers import SimpleDirectoryReader, WikipediaReader, SimpleWebPageReader
from llama_index.schema import MetadataMode

# Load environment variables first, in case the project code reads them on import.
dotenv.load_dotenv('../.env')
from projectgurukul.readers import CSVReader, RamayanaCSVReader

SimpleCSVReader = download_loader("SimpleCSVReader")

# Gita
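
The Gita cell that follows splits each `verse_number` value on `', '` to get a chapter and a verse. Judging by the printed output, the values look like `Chapter 1, Verse 1`, but that format is inferred, not confirmed. A minimal sketch of a stricter split, using a hypothetical helper `split_verse_number` that fails loudly on malformed labels instead of raising an opaque unpacking error:

```python
def split_verse_number(label):
    """Split a label such as 'Chapter 1, Verse 1' into (chapter, verse).

    Hypothetical helper; the 'Chapter N, Verse M' format is an assumption
    inferred from the notebook output.
    """
    parts = [part.strip() for part in label.split(",")]
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"unexpected verse_number format: {label!r}")
    return parts[0], parts[1]


chapter, verse = split_verse_number("Chapter 1, Verse 1")
print(chapter, "|", verse)
```

Stripping whitespace after splitting on a bare comma also tolerates labels with irregular spacing around the separator.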

In [3]:
def preprocess(row):
    row['chapter'], row['verse'] = row.verse_number.split(', ')
    return row


reader = CSVReader(
    text_columns=['verse', 'verse_in_sanskrit', 'translation_in_english', 'meaning_in_english'],
    metadata_columns=['chapter'],
    preprocess=preprocess,
)

documents = reader.load_data('../data/gita/data/bhagavad_gita.csv', extra_info={'source': 'Bhagavad Gita'})

In [4]:
len(documents)

701

In [5]:
documents[0].metadata_template = "> {key}: {value}"
print(documents[0].get_content(metadata_mode = MetadataMode.ALL))

> chapter: Chapter 1
> source: Bhagavad Gita

Verse 1

धृतराष्ट्र उवाच |धर्मक्षेत्रे कुरुक्षेत्रे समवेता युयुत्सवः |मामकाः पाण्डवाश्चैव किमकुर्वत सञ्जय ||1||

Dhritarashtra said: O Sanjay, after gathering on the holy field of Kurukshetra, and desiring to fight, what did my sons and the sons of Pandu do?

The two armies had gathered on the battlefield of Kurukshetra, well prepared to fight a war that was inevitable. Still, in this verse, King Dhritarashtra asked Sanjay, what his sons and his brother Pandu’s sons were doing on the battlefield? It was apparent that they would fight, then why did he ask such a question?The blind King Dhritarashtra’s fondness for his own sons had clouded his spiritual wisdom and deviated him from the path of virtue. He had usurped the kingdom of Hastinapur from the rightful heirs; the Pandavas, sons of his brother Pandu. Feeling guilty of the injustice he had done towards his nephews, his conscience worried him about the outcome of this battle.The words dhar

In [4]:
documents[1].get_metadata_str()

'chapter: Chapter 1\nsource: Bhagavad Gita'
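
The effect of `metadata_template` on the metadata header can be pictured with a plain-Python sketch. This is not llama_index's implementation: `render_metadata` is a hypothetical helper, and the default template and newline separator are assumptions read off the two outputs above.

```python
def render_metadata(metadata, template="{key}: {value}", separator="\n"):
    """Format every metadata pair with the template and join the results.

    Hypothetical stand-in for illustration only; the defaults are assumed
    from the notebook output, not taken from the library source.
    """
    return separator.join(
        template.format(key=key, value=value) for key, value in metadata.items()
    )


meta = {"chapter": "Chapter 1", "source": "Bhagavad Gita"}
default_header = render_metadata(meta)
quoted_header = render_metadata(meta, template="> {key}: {value}")
print(quoted_header)
```

Because the template is set per document, only `documents[0]` shows the `> ` prefix, while `documents[1]` keeps the plain `key: value` form.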

In [21]:
wikireader = WikipediaReader()
wikidocs = wikireader.load_data(['ayodhya ram temple'])


In [23]:
print(wikidocs[0].get_content()[:1500])

The Ram Mandir is a Hindu temple that is under construction in Ayodhya, Uttar Pradesh, India. It is located at the site of Ram Janmabhoomi, the hypothesized birthplace of Rama, a principal deity of Hinduism. The site is the former location of the Babri Masjid which was built after the demolition an existing non-Islamic structure. The worship of Hindu god Ram and Sita at the disputed site started when their idols were installed in 1949. In 2019, the Supreme Court of India delivered the verdict to give the disputed land to Hindus for a temple of Ram, while Muslims would be given land elsewhere to construct a mosque. The court referenced a report from the Archaeological Survey of India (ASI) as evidence suggesting the presence of a structure beneath the demolished Babri Masjid, that was found to be non-Islamic.The bhumi pujan (transl. ground breaking ceremony) for the commencement of the construction of Ram Mandir was performed on 5 August 2020, by Prime Minister Narendra Modi. The temple

# Ramayana

In [8]:
df = pd.read_csv("../data/ramayana/data/ayodhyakanda.csv").dropna(how = 'all')
df.head()

Unnamed: 0,content,explanation
0,गच्छता मातुलकुलं भरतेन तदाऽनघ।शत्रुघ्नो नित्यश...,Bharata set out for his maternal uncle's hous...
1,तत्र न्यवसद्भ्रात्रा सह सत्कारसत्कृतः।मातुलेना...,Treated with warm hospitality and fatherly aff...
2,तत्रापि निवसन्तौ तौ तर्प्यमाणौ च कामतः।भ्रातरौ...,Though both the heroic brothers stayed there s...
3,राजाऽपि तौ महातेजा स्सस्मार प्रोषितौ सुतौ।उभौ ...,The glorious king Dasaratha also remembered hi...
4,सर्व एव तु तस्येष्टा श्चत्वारः पुरुषर्षभाः।स्व...,"Dasaratha, a bull among men, loved all his fou..."


In [9]:
df[df.content.isna()]

Unnamed: 0,content,explanation


In [10]:
import re
ids = df.content.map(lambda shloka: re.findall(r'.*(\d+\.\d+\.\d+).*',shloka))
ids

0          [2.1.1]
1          [2.1.2]
2          [2.1.3]
3          [2.1.4]
4          [2.1.5]
           ...    
4052    [2.119.18]
4053    [2.119.19]
4054    [2.119.20]
4055    [2.119.21]
4056    [2.119.22]
Name: content, Length: 4057, dtype: object
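
One caveat about the pattern used above: the leading greedy `.*` can swallow digits that belong to the first number, and wrapping the group in `.*` yields at most one ID per line (the last one). That is harmless while the first component is a single digit, as in the `2.x.y` IDs shown here, but it would truncate a multi-digit prefix. A small sketch on made-up strings (the sample inputs are hypothetical, not rows from the CSV):

```python
import re

wrapped = r'.*(\d+\.\d+\.\d+).*'  # pattern from the cell above
bare = r'\d+\.\d+\.\d+'           # same ID shape without the greedy wrapper

# The greedy prefix consumes the leading '1' of '12'.
truncated = re.findall(wrapped, "id 12.3.4")
intact = re.findall(bare, "id 12.3.4")

# With two IDs on one line, the wrapped pattern keeps only the last.
last_only = re.findall(wrapped, "first 12.3.4 second 2.1.5")
both = re.findall(bare, "first 12.3.4 second 2.1.5")

print(truncated, intact, last_only, both)
```

Whether any row in this dataset holds more than one ID has not been checked here, so switching patterns could change the lengths of the lists in `ids`.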

In [11]:
ids[ids.map(len)<1]

524     []
611     []
2640    []
Name: content, dtype: object

In [13]:
df.iloc[524].content

'क्षुरोपमां नित्यमसत्प्रियंवदां प्रदुष्टभावां स्वकुलोपघातिनीम्।न जीवितुं त्वां विषहेऽमनोरमां दिधक्षमाणां हृदयं सबन्धनम्।।2.12,112।।'
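
Row 524 fails to match because its ID is written `2.12,112`, with a comma where a dot is expected. A tolerant pattern is one possible way to recover such rows. The sketch below assumes the comma is a typo for a dot, which is a guess, and uses a hypothetical helper `extract_ids`. Rows 611 and 2640 were not inspected and may fail for other reasons.

```python
import re

# Accept either '.' or ',' between the three numeric parts (assumption:
# a comma in this position is a typo for a dot).
ID_PATTERN = re.compile(r'(\d+)[.,](\d+)[.,](\d+)')


def extract_ids(shloka):
    """Return every ID found in the text, normalized to dot separators."""
    return ['.'.join(match.groups()) for match in ID_PATTERN.finditer(shloka)]


# Tail of the row shown above, plus a well-formed ID for comparison.
repaired = extract_ids("हृदयं सबन्धनम्।।2.12,112।।")
unchanged = extract_ids("text ।।2.1.1।।")
print(repaired, unchanged)
```

A looser pattern like this could also match unrelated comma-separated numbers in the text, so the recovered IDs are worth spot-checking.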

In [7]:
docs = RamayanaCSVReader().load_data("../data/ramayana/data/kishkindakanda.csv")

In [7]:
len(docs)

77