# Quranic Numeric Patterns Analysis

This notebook checks the claimed numeric patterns involving the Bismillah and the words Ism, Allah, Rahman, and Rahim in the Quran:

1. Bismillah = 19 letters
2. Ism (اسم) without contraction => 19 times
3. Allah (الله) => 2698 times (19×142)
   - optionally also counting prefixed forms such as بالله and والله (the "expanded" mode below)
4. Rahman (الرحمن) => 57 times (19×3)
5. Rahim (الرحيم) => 114 times (19×6)
6. Other well-known 19-based patterns:
   - 114 chapters (19×6), 6346 total verses whose cross-sum (digit sum) is 6+3+4+6 = 19, etc.

Depending on your Quran text file (and how morphological variants are handled), you may or may not exactly match the historical claims. This notebook provides a configurable approach to get as close as possible to those reported numbers.
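
The arithmetic behind the claimed values can be checked independently of any text file. The sketch below only tests divisibility by 19 and the digit sum; the names `claimed` and `digit_sum` are illustrative and are not used elsewhere in this notebook.

```python
# Claimed totals copied from the list above.
claimed = {
    "Bismillah letters": 19,
    "Ism": 19,
    "Allah": 2698,
    "Rahman": 57,
    "Rahim": 114,
    "Chapters": 114,
}

def digit_sum(n):
    """Return the sum of the decimal digits of n (the 'cross-sum')."""
    return sum(int(d) for d in str(n))

# Express each claimed value as 19 * quotient + remainder.
for name, value in claimed.items():
    quotient, remainder = divmod(value, 19)
    print(f"{name}: {value} = 19 x {quotient} + {remainder}")

print("Cross-sum of 6346:", digit_sum(6346))
```

This confirms only that the claimed numbers are multiples of 19; whether the text actually produces them is what the rest of the notebook measures.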

In [1]:
import re
from collections import Counter

## Configuration
Set up the configuration flags that control how the text is processed and analyzed.

In [2]:
# Configuration Flags
STRIP_DIACRITICS = True
REMOVE_NON_ARABIC = True
ALLAH_MODE = "both"  # Options: "strict", "expanded", "both"
DEBUG = False

## Regex Patterns
Define the regular expression patterns for matching various words.

In [13]:
# Regex Patterns
PATTERN_ISM = [r"\bاسم\b"]

ALLAH_PATTERNS_STRICT = [r"\bالله\b"]
ALLAH_PATTERNS_EXPANDED = [
    r"\bالله\b",     # standalone
    r"\bاللهم\b",    # vocative
    r"\bبالله\b",    # prefix bi-
    r"\bوالله\b",    # prefix wa-
    r"\bفلله\b",     # prefixes fa- + li-
    r"\bتالله\b"     # oath particle ta-
]

PATTERN_RAHMAN = [r"\bالرحمن\b"]
PATTERN_RAHIM  = [r"\bالرحيم\b"]

# Reference values from the claims above (hardcoded, not measured from the text)
chapters = 114
verses = 6346
cross_sum_verses = sum(map(int, str(verses)))
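
To see why the strict and expanded counts differ, it helps to check how `\b` behaves on prefixed forms. The sample phrase below is hypothetical and is not taken from the text file. Note that the expanded list above has no pattern for the bare form لله, which the last line of the sketch matches separately.

```python
import re

# Hypothetical sample: one standalone form, two prefixed forms, and "لله".
sample = "الله بالله والله لله"

# Arabic letters count as word characters, so there is no word boundary
# between a prefix letter and the rest of the word.
strict = re.findall(r"\bالله\b", sample)
print(len(strict))    # 1

# Prefixed forms need their own patterns.
prefixed = re.findall(r"\b[بو]الله\b", sample)
print(len(prefixed))  # 2

# "لله" has no alef, so neither pattern above can match it.
lillah = re.findall(r"\bلله\b", sample)
print(len(lillah))    # 1
```

Any prefixed form without a pattern of its own is silently left out of the expanded total, so the choice of patterns directly determines how close the count gets to 2698.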

## Helper Functions
Define the functions needed for text processing and analysis.

In [14]:
def load_text_file(filename):
    """Load a text file and return its contents as a single string."""
    with open(filename, "r", encoding="utf-8") as f:
        return f.read()

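def preprocess_text(text):
    """Preprocess the Quranic text based on configuration flags."""
    if REMOVE_NON_ARABIC:
        # Keep only the Arabic block (U+0600-U+06FF) and whitespace. This block also
        # contains Arabic punctuation, digits, and the tatweel (U+0640), which are kept.
        text = re.sub(r"[^\u0600-\u06FF\s]", "", text)

    if STRIP_DIACRITICS:
        # Remove harakat (U+064B-U+0652), superscript alef (U+0670),
        # and Quranic annotation marks (U+06D6-U+06ED).
        diacritics_regex = re.compile(r"[\u064B-\u0652\u0670\u06D6-\u06ED]")
        text = diacritics_regex.sub("", text)

    # Collapse runs of whitespace (including newlines) into single spaces.
    return re.sub(r"\s+", " ", text).strip()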

def count_occurrences_regex(text, patterns):
    """Count total matches across multiple regex patterns."""
    total = 0
    for pat in patterns:
        matches = re.findall(pat, text)
        match_count = len(matches)
        total += match_count
        if DEBUG:
            print(f"[DEBUG] Pattern '{pat}' found {match_count} matches.")
    return total

def count_allah(text):
    """Count 'Allah' occurrences; returns an int for "strict" or "expanded", else a (strict, expanded) tuple."""
    strict_count = count_occurrences_regex(text, ALLAH_PATTERNS_STRICT)
    expanded_count = count_occurrences_regex(text, ALLAH_PATTERNS_EXPANDED)
    
    if ALLAH_MODE == "strict":
        return strict_count
    elif ALLAH_MODE == "expanded":
        return expanded_count
    else:
        return (strict_count, expanded_count)

def verify_bismillah_letters():
    """Count the letters in a hardcoded, unvocalized Bismillah string (not read from the file)."""
    bismillah = "بسم الله الرحمن الرحيم"
    processed_bismillah = preprocess_text(bismillah)
    return sum(1 for c in processed_bismillah if c.isalpha())
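
The effect of the diacritics class in `preprocess_text` can be inspected on a single word. The sketch below uses a hypothetical vocalized spelling of الرحمن that contains a tatweel (U+0640) before the superscript alef. Whether `quran-simple.txt` spells the word this way is an assumption that should be verified against the file.

```python
import re
import unicodedata

# Same character class as in preprocess_text.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670\u06D6-\u06ED]")

# Hypothetical spelling: alef, lam, ra+shadda+fatha, ha+sukun, meem+fatha,
# tatweel, superscript alef, noon+kasra.
sample = ("\u0627\u0644\u0631\u0651\u064E\u062D\u0652\u0645\u064E"
          "\u0640\u0670\u0646\u0650")
bare = "\u0627\u0644\u0631\u062D\u0645\u0646"  # the six letters of the bare word

stripped = DIACRITICS.sub("", sample)
for ch in stripped:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

print(stripped == bare)                        # False: the tatweel is still there
print(stripped.replace("\u0640", "") == bare)  # True
```

If the file does use such a spelling, a pattern for the bare word would not match after preprocessing, because the tatweel is not part of the stripped range.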

## Analysis
Run the analysis on the Quranic text.

In [15]:
# Load and preprocess text
raw_text = load_text_file("quran-simple.txt")
prepped_text = preprocess_text(raw_text)

# Verify Bismillah letter count
bismillah_letter_count = verify_bismillah_letters()
print(f"Bismillah letter count (should be 19): {bismillah_letter_count}")

# Count occurrences
ism_count    = count_occurrences_regex(prepped_text, PATTERN_ISM)
allah_count  = count_allah(prepped_text)
rahman_count = count_occurrences_regex(prepped_text, PATTERN_RAHMAN)
rahim_count  = count_occurrences_regex(prepped_text, PATTERN_RAHIM)

print("\n--- WORD COUNTS ---")
print(f"Ism (اسم) [No pronoun forms]: {ism_count}")
if isinstance(allah_count, tuple):
    strict, expanded = allah_count
    print(f"Allah (الله) [Strict]: {strict}")
    print(f"Allah (الله) [Expanded]: {expanded}")
else:
    print(f"Allah (الله): {allah_count}")

print(f"Rahman (الرحمن): {rahman_count}")
print(f"Rahim (الرحيم): {rahim_count}")

Bismillah letter count (should be 19): 19

--- WORD COUNTS ---
Ism (اسم) [No pronoun forms]: 14
Allah (الله) [Strict]: 2265
Allah (الله) [Expanded]: 2663
Rahman (الرحمن): 0
Rahim (الرحيم): 146
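
When counts differ from the expected values, one diagnostic is to list every character that survives preprocessing but is neither a space nor a basic Arabic letter. The helper `residual_characters` and the sample string below are hypothetical. In the notebook the helper would be applied to `prepped_text`.

```python
from collections import Counter
import unicodedata

def is_basic_arabic_letter(ch):
    """True for U+0621..U+063A and U+0641..U+064A (tatweel U+0640 excluded)."""
    return "\u0621" <= ch <= "\u063A" or "\u0641" <= ch <= "\u064A"

def residual_characters(text):
    """Count characters that are neither spaces nor basic Arabic letters."""
    return Counter(ch for ch in text
                   if ch != " " and not is_basic_arabic_letter(ch))

# Hypothetical sample: a tatweel inside a word plus two Arabic-Indic digits.
sample = "\u0628\u0633\u0645\u0640 \u0627\u0644\u0644\u0647 \u0661\u0662"

for ch, n in residual_characters(sample).most_common():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}: {n}")
```

Any character reported here can break a `\b`-delimited pattern or split a word unexpectedly, so an empty result is a useful precondition before trusting the counts.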


## Known Patterns
Compare the measured counts with the claimed 19-based values.

In [20]:
%pip install tabulate

Defaulting to user installation because normal site-packages is not writeable
Collecting tabulate
  Using cached tabulate-0.9.0-py3-none-any.whl (35 kB)
Installing collected packages: tabulate
Successfully installed tabulate-0.9.0
You should consider upgrading via the '/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.


In [22]:
from tabulate import tabulate

def verify_claims():
    """Print a table comparing each claimed value with the measured one."""
    # In "both" mode allah_count is a (strict, expanded) tuple; the strict count is compared.
    allah_actual = allah_count[0] if isinstance(allah_count, tuple) else allah_count
    claims = [
        ["Bismillah Letters", 19, bismillah_letter_count],
        ["Ism Count", 19, ism_count],
        ["Allah Count", 2698, allah_actual],
        ["Rahman Count", 57, rahman_count],
        ["Rahim Count", 114, rahim_count],
        # The last three compare against the constants defined earlier,
        # not against values measured from the file.
        ["Total Chapters", 114, chapters],
        ["Total Verses", 6346, verses],
        ["Verses Cross Sum", 19, cross_sum_verses]
    ]

    # Build the table rows and count the matching claims.
    table_data = []
    matches = 0
    for name, expected, actual in claims:
        is_match = expected == actual
        matches += is_match
        table_data.append([name, expected, actual, "✓" if is_match else "✗"])

    print(tabulate(table_data,
                   headers=["Pattern", "Expected", "Actual", "Matches"],
                   tablefmt="grid"))
    print(f"\nTotal matches: {matches}/{len(claims)} claims verified")

verify_claims()

+-------------------+------------+----------+-----------+
| Pattern           |   Expected |   Actual | Matches   |
+===================+============+==========+===========+
| Bismillah Letters |         19 |       19 | ✓         |
+-------------------+------------+----------+-----------+
| Ism Count         |         19 |       14 | ✗         |
+-------------------+------------+----------+-----------+
| Allah Count       |       2698 |     2265 | ✗         |
+-------------------+------------+----------+-----------+
| Rahman Count      |         57 |        0 | ✗         |
+-------------------+------------+----------+-----------+
| Rahim Count       |        114 |      146 | ✗         |
+-------------------+------------+----------+-----------+
| Total Chapters    |        114 |      114 | ✓         |
+-------------------+------------+----------+-----------+
| Total Verses      |       6346 |     6346 | ✓         |
+-------------------+------------+----------+-----------+
| Verses Cross Sum  |         19 |       19 | ✓         |
+-------------------+------------+----------+-----------+

Total matches: 4/8 claims verified

In [24]:
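# Diacritic-tolerant regex for "الرحمن".
# Note: when STRIP_DIACRITICS is True, prepped_text no longer contains diacritics,
# so the optional diacritic classes below match nothing on it.
# The range [ء-ي] is U+0621-U+064A, which also includes the tatweel (U+0640).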
REGEX_RAHMAN = (
    r"(?<![ء-ي])"          # negative lookbehind: no Arabic letter before
    r"ا[\u064B-\u0652\u0670\u06D6-\u06ED]*"  # Alef + optional diacritics
    r"ل[\u064B-\u0652\u0670\u06D6-\u06ED]*"  # Lam + optional diacritics
    r"ر[\u064B-\u0652\u0670\u06D6-\u06ED]*"  # Ra + optional diacritics
    r"ح[\u064B-\u0652\u0670\u06D6-\u06ED]*"  # Ha + optional diacritics
    r"م[\u064B-\u0652\u0670\u06D6-\u06ED]*"  # Meem + optional diacritics
    r"ن[\u064B-\u0652\u0670\u06D6-\u06ED]*"  # Noon + optional diacritics
    r"(?![ء-ي])"           # negative lookahead: no Arabic letter after
)

# Regex for "الرحيم"
REGEX_RAHIM = (
    r"(?<![ء-ي])"
    r"ا[\u064B-\u0652\u0670\u06D6-\u06ED]*"
    r"ل[\u064B-\u0652\u0670\u06D6-\u06ED]*"
    r"ر[\u064B-\u0652\u0670\u06D6-\u06ED]*"
    r"ح[\u064B-\u0652\u0670\u06D6-\u06ED]*"
    r"ي[\u064B-\u0652\u0670\u06D6-\u06ED]*"
    r"م[\u064B-\u0652\u0670\u06D6-\u06ED]*"
    r"(?![ء-ي])"
)

PATTERN_RAHMAN = [REGEX_RAHMAN]
PATTERN_RAHIM  = [REGEX_RAHIM]

# Redefinition of the earlier helper, without the DEBUG output.
def count_occurrences_regex(text, patterns):
    """Count total matches across multiple regex patterns."""
    return sum(len(re.findall(pat, text)) for pat in patterns)

rahman_count = count_occurrences_regex(prepped_text, PATTERN_RAHMAN)
rahim_count  = count_occurrences_regex(prepped_text, PATTERN_RAHIM)

print("Rahman count:", rahman_count)
print("Rahim count:", rahim_count)


Rahman count: 0
Rahim count: 146
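
A count of 0 for a word that certainly occurs suggests that the text contains a character the pattern does not allow. One possibility, not confirmed here, is a tatweel (U+0640) inside the word. The sketch below builds a pattern that tolerates tatweel as well as diacritics after each letter. `skeleton_regex`, `MARKS`, and `LETTER` are hypothetical names, and the sample text is constructed for illustration.

```python
import re

# Assumption: tatweel (U+0640) is skipped in the same way as a diacritic.
MARKS = r"[\u0640\u064B-\u0652\u0670\u06D6-\u06ED]*"
# Basic Arabic letters, excluding tatweel, which sits inside U+0621..U+064A.
LETTER = r"[\u0621-\u063A\u0641-\u064A]"

def skeleton_regex(word):
    """Match `word` letter by letter, allowing marks after each letter."""
    body = "".join(re.escape(ch) + MARKS for ch in word)
    return re.compile(rf"(?<!{LETTER}){body}(?!{LETTER})")

RAHMAN = "\u0627\u0644\u0631\u062D\u0645\u0646"
RAHIM = "\u0627\u0644\u0631\u062D\u064A\u0645"

# Constructed sample: Rahman written with a tatweel, followed by Rahim.
sample = "\u0627\u0644\u0631\u062D\u0645\u0640\u0646 " + RAHIM

print(len(skeleton_regex(RAHMAN).findall(sample)))           # 1
print(len(skeleton_regex(RAHIM).findall(sample)))            # 1
print(len(re.findall(r"\b" + RAHMAN + r"\b", sample)))       # 0
```

As with the patterns above, the lookbehind only inspects the immediately preceding character, so this form is intended for text whose diacritics have already been stripped.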


In [25]:
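# Quick check: does the bare string "سورة الرحمن" occur in the raw text?
# This is an exact substring test on the unprocessed text, so it fails if the file
# is vocalized or has no chapter titles; False does not show that الرحمن is absent.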
print("سورة الرحمن" in raw_text)


False
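
An exact substring test on vocalized text fails even when the words are present, because the diacritics sit between the letters. A minimal sketch of a containment check that strips diacritics from both sides first is shown below. `contains_normalized` is a hypothetical helper and the sample phrase is constructed for illustration.

```python
import re

# Same diacritics class as in preprocess_text.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670\u06D6-\u06ED]")

def contains_normalized(haystack, needle):
    """Substring test after removing diacritics from both strings."""
    return DIACRITICS.sub("", needle) in DIACRITICS.sub("", haystack)

# Vocalized "bismi llahi" and the same two words without diacritics.
vocalized = ("\u0628\u0650\u0633\u0652\u0645\u0650 "
             "\u0627\u0644\u0644\u0651\u064E\u0647\u0650")
bare = "\u0628\u0633\u0645 \u0627\u0644\u0644\u0647"

print(bare in vocalized)                     # False
print(contains_normalized(vocalized, bare))  # True
```

In the notebook, the equivalent is to test against `prepped_text` rather than `raw_text`.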
