# Quran & Sentence Transformers

What is the most common verse in the Quran? Sentence transformers can help answer this question and similar ones.

1. Downloaded an English Quran translation from https://tanzil.net/trans/. This notebook uses [this one](https://tanzil.net/trans/en.itani), copied to `data/en.itani.txt`.
2. Encoded all the verses with the [best performing](https://paperswithcode.com/task/semantic-textual-similarity) sentence transformer and saved the embeddings to `data/en_itani.p`. This took about 1 hour on my machine.
3. Calculated the cosine similarity of every pair of verses, which gives a matrix of size (6236, 6236).
4. For each verse, counted the number of verses that it is "very similar" to. Here, "very similar" means `cosine similarity > 0.80`.
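
Steps 3 and 4 can be sketched on a toy example. The snippet below is a minimal illustration in plain Python, not the notebook's actual code: the two-dimensional vectors are made up, the helper names (`cosine_similarity`, `similarity_matrix`, `count_similar`) are hypothetical, and self-matches are excluded on the assumption that a verse should not count as similar to itself.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_matrix(vectors):
    """All pairwise cosine similarities (step 3)."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]

def count_similar(sims, threshold=0.8):
    """For each row, count the other rows above the threshold (step 4)."""
    return [
        sum(1 for j, s in enumerate(row) if j != i and s > threshold)
        for i, row in enumerate(sims)
    ]

# Made-up stand-ins for verse embeddings: three point in a similar
# direction, one is orthogonal to the first.
toy_embeddings = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.8, 0.3],
    [0.0, 1.0],
]

sims = similarity_matrix(toy_embeddings)
counts = count_similar(sims)
print(counts)
```

The first three vectors are each "very similar" to the other two, while the last one has no close neighbours.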

Result:
1. The most repeated verse, with 54 very similar verses, is about God's greatness.
2. The second most repeated verse, with 39 very similar verses, is about God's forgiveness.

In [3]:
import pandas as pd

In [172]:
# Read in the Quran translation.
# The file is pipe-separated (chapter|verse|text); lines starting with '#' are skipped.
df = pd.read_csv('data/en.itani.txt',
                 sep='|', comment='#',
                 names=['chapter', 'verse', 'text'])
df

Unnamed: 0,chapter,verse,text
0,1,1,"In the name of God, the Gracious, the Merciful."
1,1,2,"Praise be to God, Lord of the Worlds."
2,1,3,"The Most Gracious, the Most Merciful."
3,1,4,Master of the Day of Judgment.
4,1,5,"It is You we worship, and upon You we call for..."
...,...,...,...
6231,114,2,The King of mankind.
6232,114,3,The God of mankind.
6233,114,4,From the evil of the sneaky whisperer.
6234,114,5,Who whispers into the hearts of people.


In [12]:
# This cell creates the embeddings.
# You don't need to run it, since the output is available in the repo.
# from simcse import SimCSE
# sentence_model = SimCSE('princeton-nlp/sup-simcse-roberta-large')
# embeddings = sentence_model.encode(df['text'].tolist())
# pd.to_pickle(embeddings,'data/en_itani.p')

100%|██████████| 100/100 [1:03:31<00:00, 38.11s/it]


In [169]:
# read the embeddings
embeddings = pd.read_pickle('data/en_itani.p')

In [62]:
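import torch.nn.functional as F

def cos_sim(a, b):
    # L2-normalize each row, then take dot products: entry (i, j) is the
    # cosine similarity between row i of a and row j of b.
    return F.normalize(a) @ F.normalize(b).t()

sims = cos_sim(embeddings, embeddings)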

In [171]:
# total similarity of each verse to all verses
df['sim_sum'] = sims.sum(dim=1).numpy()

In [81]:
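# Zero the diagonal so a verse is not matched with itself, then take the
# best match per row. Note that fill_diagonal_ modifies `sims` in place.
values, indices = sims.fill_diagonal_(0).max(dim=1)
df['most_sim_ind'] = indices.numpy()
df['most_sim_val'] = values.numpy()
df['most_sim'] = df['most_sim_ind'].map(df['text'])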

In [168]:
# for each verse, count the verses with cosine similarity above 0.8
df['sim_count'] = (sims > 0.8).sum(dim=1).numpy()
for _, row in df.sort_values('sim_count', ascending=False).head().iterrows():
    print(row['text'])
    print('# of similar verses:', row['sim_count'])
    print()

Such is God, your Lord. There is no god except He, the Creator of all things; so worship Him. He is responsible for everything.
# of similar verses: 54

Whoever commits evil, or wrongs his soul, then implores God for forgiveness, will find God Forgiving and Merciful.
# of similar verses: 39

God has promised those who believe and work righteousness: they will have forgiveness and a great reward.
# of similar verses: 34

Except for those who repent, and believe, and do good deeds. These—God will replace their bad deeds with good deeds. God is ever Forgiving and Merciful.
# of similar verses: 34

He is God; besides Whom there is no god; the Sovereign, the Holy, the Peace-Giver, the Faith-Giver, the Overseer, the Almighty, the Omnipotent, the Overwhelming. Glory be to God, beyond what they associate.
# of similar verses: 33
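
The counts above depend on the 0.8 cutoff. As a rough illustration of that sensitivity, the sketch below counts how many entries of one similarity row pass several thresholds. The similarity values are made up for the example and `counts_by_threshold` is a hypothetical helper, not part of the notebook.

```python
def counts_by_threshold(similarities, thresholds):
    """Map each threshold to the number of similarities strictly above it."""
    return {t: sum(1 for s in similarities if s > t) for t in thresholds}

# Made-up similarities between one verse and six others.
toy_row = [0.95, 0.88, 0.82, 0.79, 0.61, 0.40]

result = counts_by_threshold(toy_row, [0.7, 0.8, 0.9])
print(result)
```

A verse sitting just below the cutoff (0.79 here) is dropped entirely, so it may be worth checking whether the ranking of verses is stable across nearby thresholds.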



In [148]:
df.sort_values('sim_count',ascending=False).head()

Unnamed: 0,verse,chapter,text,sim_sum,most_sim_ind,most_sim_val,most_sim,sim_count
890,102,6,"Such is God, your Lord. There is no god except...",1902.067383,4119,0.904311,"God is the Creator of all things, and He is in...",54
602,110,4,"Whoever commits evil, or wrongs his soul, then...",2133.013916,707,0.912778,"But whoever repents after his crime, and refor...",39
677,9,5,God has promised those who believe and work ri...,2309.167969,349,0.874379,And as for those who believe and do good works...,34
2924,70,25,"Except for those who repent, and believe, and ...",2192.769043,2795,0.93972,"Except for those who repent afterwards, and re...",34
5148,23,59,He is God; besides Whom there is no god; the S...,1949.536499,890,0.877157,"Such is God, your Lord. There is no god except...",33


In [167]:
print(df[(sims[890]>.8).numpy()]['text'].tolist())

['To God belong the East and the West. Whichever way you turn, there is God’s presence. God is Omnipresent and Omniscient.', 'And they say, “God has begotten a son.” Be He glorified. Rather, His is everything in the heavens and the earth; all are obedient to Him.', 'Your God is one God. There is no god but He, the Benevolent, the Compassionate.', 'God! There is no god except He, the Living, the Everlasting. Neither slumber overtakes Him, nor sleep. To Him belongs everything in the heavens and everything on earth. Who is he that can intercede with Him except with His permission? He knows what is before them, and what is behind them; and they cannot grasp any of His knowledge, except as He wills. His Throne extends over the heavens and the earth, and their preservation does not burden Him. He is the Most High, the Great.', 'To God belongs everything in the heavens and the earth. Whether you reveal what is within your selves, or conceal it, God will call you to account for it. He forgives

In [160]:
print(df[(sims[602]>.8).numpy()]['text'].tolist())

['Then disperse from where the people disperse, and ask God for forgiveness. God is Most Forgiving, Most Merciful.', "Those who believed, and those who migrated and fought for the sake of God-those look forward to God's mercy. God is Forgiving and Merciful.", 'Say, "If you love God, then follow me, and God will love you, and will forgive you your sins." God is Forgiving and Merciful.', 'He specifies His mercy for whomever He wills. God is Possessor of Sublime Grace.', 'Except those who repent afterwards, and reform; for God is Forgiving and Merciful.', 'And those who, when they commit an indecency or wrong themselves, remember God and ask forgiveness for their sins-and who forgives sins except God? And they do not persist in their wrongdoing while they know.', 'Repentance is available from God for those who commit evil out of ignorance, and then repent soon after. These-God will relent towards them. God is Knowing and Wise.', 'Degrees from Him, and forgiveness, and mercy. God is Forgiv
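
The lists above come out in document order. A possible variation, sketched here under the assumption that one row of the similarity matrix and the verse texts are available as plain Python lists, is to sort the matches by similarity so the closest verses come first. `top_neighbors` and the placeholder texts are hypothetical.

```python
def top_neighbors(similarity_row, texts, threshold=0.8):
    """Return (similarity, text) pairs above the threshold, most similar first."""
    pairs = [(s, t) for s, t in zip(similarity_row, texts) if s > threshold]
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)

# Placeholder texts and made-up similarities for illustration only.
toy_texts = ['verse A', 'verse B', 'verse C', 'verse D']
toy_similarities = [0.83, 0.42, 0.91, 0.80]

neighbors = top_neighbors(toy_similarities, toy_texts)
for similarity, text in neighbors:
    print(f'{similarity:.2f}  {text}')
```

The comparison is strict, matching the `> .8` test used in the cells above, so a value of exactly 0.80 is not included.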