## Installation

In [10]:
!pip install crewai crewai-tools langchain langchain-community langchain-openai faiss-cpu pandas beautifulsoup4



## Data Loading and Preprocessing

### Data Collection

In [12]:
!pip install -q kaggle

In [13]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [14]:
!mkdir -p ~/.kaggle

In [15]:
!cp /content/drive/MyDrive/CollabData/kaggle_API/kaggle.json ~/.kaggle/kaggle.json

In [16]:
! chmod 600 ~/.kaggle/kaggle.json

In [17]:
! kaggle datasets download fahd09/hadith-dataset

Dataset URL: https://www.kaggle.com/datasets/fahd09/hadith-dataset
License(s): CC0-1.0
Downloading hadith-dataset.zip to /content
  0% 0.00/9.99M [00:00<?, ?B/s]
100% 9.99M/9.99M [00:00<00:00, 281MB/s]


In [18]:
!unzip hadith-dataset

Archive:  hadith-dataset.zip
  inflating: all_hadiths_clean.csv   


In [23]:
# Download the full quran-nlp dataset archive
!kaggle datasets download alizahidraja/quran-nlp

Dataset URL: https://www.kaggle.com/datasets/alizahidraja/quran-nlp
License(s): apache-2.0
Downloading quran-nlp.zip to /content
 75% 166M/222M [00:00<00:00, 844MB/s] 
100% 222M/222M [00:00<00:00, 309MB/s]


In [24]:
!unzip quran-nlp

Archive:  quran-nlp.zip
  inflating: data/hadith/Sanadset 650K Data on Hadith Narrators/books.csv  
  inflating: data/hadith/Sanadset 650K Data on Hadith Narrators/hadith_samples.csv  
  inflating: data/hadith/Sanadset 650K Data on Hadith Narrators/sanadset.csv  
  inflating: data/hadith/Sanadset 650K Data on Hadith Narrators/translated_samples.csv  
  inflating: data/hadith/arabic_hadith/Maliks Muwatta Without_Tashkel.csv  
  inflating: data/hadith/arabic_hadith/Maliks Muwatta.csv  
  inflating: data/hadith/arabic_hadith/Musnad Ahmad ibn Hanbal Without_Tashkel.csv  
  inflating: data/hadith/arabic_hadith/Musnad Ahmad ibn Hanbal.csv  
  inflating: data/hadith/arabic_hadith/Sahih Bukhari Without_Tashkel.csv  
  inflating: data/hadith/arabic_hadith/Sahih Bukhari.csv  
  inflating: data/hadith/arabic_hadith/Sahih Muslim.csv  
  inflating: data/hadith/arabic_hadith/Sahih Muslime Without_Tashkel.csv  
  inflating: data/hadith/arabic_hadith/Sunan Abu Dawud Without_Tashkel.csv  
  inflating: 

### Data Preprocessing

In [25]:
from langchain_community.document_loaders import CSVLoader

def load_and_process_data():
    """Loads data from CSV files into a preliminary document format."""
    print("\n--- Loading and Processing Data ---")
    quran_docs, hadith_docs = [], []
    try:
        quran_loader = CSVLoader(file_path='data/main_df.csv', encoding='utf-8')
        quran_docs = quran_loader.load()
        hadith_loader = CSVLoader(file_path='all_hadiths_clean.csv', encoding='utf-8')
        hadith_docs = hadith_loader.load()
    except FileNotFoundError as e:
        print(f"Error: {e}. Please ensure CSV files are in the correct directory.")
        return None, None
    except RuntimeError as e:
        print(f"Runtime error loading CSV file: {e}")
        return None, None

    print("Data loading complete.")
    return quran_docs, hadith_docs

In [26]:
# Prepare data and retrievers
quran_docs, hadith_docs = load_and_process_data()
if not (quran_docs and hadith_docs):
    # SystemExit stops this cell without shutting down the notebook runtime
    raise SystemExit("Data loading failed. Exiting.")


--- Step 1: Loading and Processing Data ---
Data loading complete.


## Vectorized Data Storing

In [39]:
import os
import time # Import time for delays
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import AzureOpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from tqdm.auto import tqdm # Import tqdm for progress bar

In [45]:
# Prepare data for vectorization
print("\n--- Ingesting Data ---")
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
quran_splits = text_splitter.split_documents(quran_docs)
hadith_splits = text_splitter.split_documents(hadith_docs)

print(f"  - Created {len(quran_splits)} chunks for Quran.")
print(f"  - Created {len(hadith_splits)} chunks for Hadith.")


--- Ingesting Data ---
  - Created 14773 chunks for Quran.
  - Created 50683 chunks for Hadith.
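To make the `chunk_size` and `chunk_overlap` settings concrete, here is a simplified fixed-window sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and sentences); the helper name `sliding_chunks` is hypothetical and only illustrates how consecutive chunks share their boundary characters.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

With the notebook's values (`chunk_size=1000`, `chunk_overlap=200`), each window would advance by 800 characters in this simplified model, so a passage that straddles a chunk boundary still appears whole in at least one chunk as long as it is shorter than the overlap.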


In [54]:
embeddings = AzureOpenAIEmbeddings(
    azure_deployment=os.environ["AZURE_EMBEDDING_DEPLOYMENT_NAME"],
    api_key=os.environ["AZURE_API_KEY"],
    azure_endpoint=os.environ["AZURE_API_BASE"],
    api_version=os.environ["AZURE_API_VERSION"]
)
print("  - Embeddings initialized.")

  - Embeddings initialized.


In [61]:
def vectorize_quran(quran_splits, embeddings):
    """Vectorizes Quran documents and stores them in a FAISS vector store."""
    print("\n  - Vectorizing Quran documents...")
    max_retries = 5
    retry_delay = 60 # seconds
    batch_size = 1000
    quran_vector_store = None
    next_index = 0 # first chunk not yet embedded, so a retry resumes instead of re-adding batches

    for attempt in range(max_retries):
        try:
            print(f"    Vectorizing Quran in batches of {batch_size}...")
            for i in tqdm(range(next_index, len(quran_splits), batch_size)):
                batch = quran_splits[i:i + batch_size]
                if quran_vector_store is None:
                    quran_vector_store = FAISS.from_documents(documents=batch, embedding=embeddings)
                else:
                    # The store already holds its embedding function
                    quran_vector_store.add_documents(batch)
                next_index = i + len(batch)
                print(f"      Processed {next_index}/{len(quran_splits)} Quran chunks.")
                time.sleep(1) # Add a small delay between batches to avoid hitting rate limits

            print("  - Quran vector store created.")
            # as_retriever takes the result count via search_kwargs
            return quran_vector_store.as_retriever(search_kwargs={"k": 5})

        except Exception as e:
            # Check the exception class name too; the message alone may not contain it
            is_rate_limit = "RateLimitError" in type(e).__name__ or "RateLimitError" in str(e)
            if is_rate_limit and attempt < max_retries - 1:
                print(f"Rate limit exceeded. Retrying in {retry_delay} seconds... (Attempt {attempt + 1}/{max_retries})")
                time.sleep(retry_delay)
            else:
                print(f"An error occurred during Quran vector store creation: {e}")
                return None

In [62]:
def vectorize_hadith(hadith_splits, embeddings):
    """Vectorizes Hadith documents and stores them in a FAISS vector store."""
    print("\n  - Vectorizing Hadith documents...")
    max_retries = 5
    retry_delay = 60 # seconds
    batch_size = 500
    hadith_vector_store = None
    next_index = 0 # first chunk not yet embedded, so a retry resumes instead of re-adding batches

    for attempt in range(max_retries):
        try:
            print(f"    Vectorizing Hadith in batches of {batch_size}...")
            for i in tqdm(range(next_index, len(hadith_splits), batch_size)):
                batch = hadith_splits[i:i + batch_size]
                if hadith_vector_store is None:
                    hadith_vector_store = FAISS.from_documents(documents=batch, embedding=embeddings)
                else:
                    # The store already holds its embedding function
                    hadith_vector_store.add_documents(batch)
                next_index = i + len(batch)
                print(f"      Processed {next_index}/{len(hadith_splits)} Hadith chunks.")
                time.sleep(1) # Add a small delay between batches to avoid hitting rate limits

            print("  - Hadith vector store created.")
            # as_retriever takes the result count via search_kwargs
            return hadith_vector_store.as_retriever(search_kwargs={"k": 5})

        except Exception as e:
            # Check the exception class name too; the message alone may not contain it
            is_rate_limit = "RateLimitError" in type(e).__name__ or "RateLimitError" in str(e)
            if is_rate_limit and attempt < max_retries - 1:
                print(f"Rate limit exceeded. Retrying in {retry_delay} seconds... (Attempt {attempt + 1}/{max_retries})")
                time.sleep(retry_delay)
            else:
                print(f"An error occurred during Hadith vector store creation: {e}")
                return None
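
The two functions above wait a fixed 60 seconds between retries. One possible alternative is exponential backoff, sketched below with the standard library only. The helper name `retry_with_backoff` and its parameters are hypothetical and not part of LangChain or FAISS; `sleep` is injectable so the logic can be exercised without actually waiting.

```python
import time


def retry_with_backoff(operation, max_retries=5, base_delay=1.0,
                       sleep=time.sleep, is_retryable=lambda exc: True):
    """Call operation(), doubling the wait after each retryable failure."""
    for attempt in range(max_retries):
        try:
            return operation()
        except Exception as exc:
            # Give up on the last attempt or on errors we should not retry
            if attempt == max_retries - 1 or not is_retryable(exc):
                raise
            sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x, ... base_delay


# Example: treat only exceptions whose class is named RateLimitError as retryable
def looks_like_rate_limit(exc):
    return type(exc).__name__ == "RateLimitError"
```

A batch embedding call could then be wrapped as `retry_with_backoff(lambda: store.add_documents(batch), is_retryable=looks_like_rate_limit)`, assuming `store` and `batch` are defined as in the functions above.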

In [57]:
# Create Quran vector store
quran_retriever = vectorize_quran(quran_splits, embeddings)
if not quran_retriever:
    # SystemExit stops this cell without shutting down the notebook runtime
    raise SystemExit("Quran vector store creation failed. Exiting.")


  - Vectorizing Quran documents...
    Vectorizing Quran in batches of 1000...


  0%|          | 0/15 [00:00<?, ?it/s]

      Processed 1000/14773 Quran chunks.
      Processed 2000/14773 Quran chunks.
      Processed 3000/14773 Quran chunks.
      Processed 4000/14773 Quran chunks.
      Processed 5000/14773 Quran chunks.
      Processed 6000/14773 Quran chunks.
      Processed 7000/14773 Quran chunks.
      Processed 8000/14773 Quran chunks.
      Processed 9000/14773 Quran chunks.
      Processed 10000/14773 Quran chunks.
      Processed 11000/14773 Quran chunks.
      Processed 12000/14773 Quran chunks.
      Processed 13000/14773 Quran chunks.
      Processed 14000/14773 Quran chunks.
      Processed 14773/14773 Quran chunks.
  - Quran vector store created.


In [64]:
# Create Hadith vector store
hadith_retriever = vectorize_hadith(hadith_splits, embeddings)
if not hadith_retriever:
    # SystemExit stops this cell without shutting down the notebook runtime
    raise SystemExit("Hadith vector store creation failed. Exiting.")


  - Vectorizing Hadith documents...
    Vectorizing Hadith in batches of 500...


  0%|          | 0/102 [00:00<?, ?it/s]

      Processed 500/50683 Hadith chunks.
      Processed 1000/50683 Hadith chunks.
      Processed 1500/50683 Hadith chunks.
      Processed 2000/50683 Hadith chunks.
      Processed 2500/50683 Hadith chunks.
      Processed 3000/50683 Hadith chunks.
      Processed 3500/50683 Hadith chunks.
      Processed 4000/50683 Hadith chunks.
      Processed 4500/50683 Hadith chunks.
      Processed 5000/50683 Hadith chunks.
      Processed 5500/50683 Hadith chunks.
      Processed 6000/50683 Hadith chunks.
      Processed 6500/50683 Hadith chunks.
      Processed 7000/50683 Hadith chunks.
      Processed 7500/50683 Hadith chunks.
      Processed 8000/50683 Hadith chunks.
      Processed 8500/50683 Hadith chunks.
      Processed 9000/50683 Hadith chunks.
      Processed 9500/50683 Hadith chunks.
      Processed 10000/50683 Hadith chunks.
      Processed 10500/50683 Hadith chunks.
      Processed 11000/50683 Hadith chunks.
      Processed 11500/50683 Hadith chunks.
      Processed 12000/50683 Hadith chunks.

## Define Tools

In [67]:
from crewai_tools import SerperDevTool
from crewai.tools import BaseTool

class ReligiousTextSearchTool(BaseTool):
    name: str = "Religious Text Search Tool"
    description: str = "Searches Qur'an and Hadith vectorstores for texts relevant to a query."
    quran_retriever: object
    hadith_retriever: object

    def _run(self, query: str) -> str:
        quran_results = self.quran_retriever.invoke(query)
        hadith_results = self.hadith_retriever.invoke(query)
        context = "QURANIC SOURCES:\n" + "\n\n".join([doc.page_content for doc in quran_results])
        context += "\n\nHADITH SOURCES:\n" + "\n\n".join([doc.page_content for doc in hadith_results])
        return context
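
The tool above returns only `page_content`, so the agent cannot tell which file or row a passage came from. Documents produced by `CSVLoader` carry `source` and `row` metadata, and a formatter along the following lines could surface it. This is a minimal sketch: `StubDocument` is a hypothetical stand-in for a LangChain document so the example runs without any installed packages, and `format_results` is an illustrative helper, not part of the notebook.

```python
from dataclasses import dataclass, field


@dataclass
class StubDocument:
    """Hypothetical stand-in exposing the two attributes the formatter reads."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_results(label, docs):
    """Render retrieved documents under a heading, tagging each with its origin."""
    sections = [f"{label}:"]
    for doc in docs:
        source = doc.metadata.get("source", "unknown source")
        row = doc.metadata.get("row", "?")
        sections.append(f"[{source}, row {row}]\n{doc.page_content}")
    return "\n\n".join(sections)


example = [StubDocument("sample passage", {"source": "all_hadiths_clean.csv", "row": 12})]
print(format_results("HADITH SOURCES", example))
```

Inside `_run`, the two `join` expressions could be swapped for calls to such a helper, which would let the synthesis agent cite where each passage originated.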

In [68]:
# Instantiate tools
religious_search_tool = ReligiousTextSearchTool(
    quran_retriever=quran_retriever,
    hadith_retriever=hadith_retriever
)
serper_tool = SerperDevTool()

## Define Agents

In [69]:
from crewai import Agent
from langchain_openai import AzureChatOpenAI

def define_agents(llm, religious_tool, web_tool):
    """Defines the team of AI agents with their roles, goals, and tools."""
    researcher = Agent(
        role='Primary Source Researcher',
        goal='Find foundational texts from the Qur\'an and Hadith relevant to the user\'s query on {topic}.',
        backstory='An expert in Islamic scriptures, skilled at navigating vast digital libraries of religious texts to find the most relevant passages.',
        tools=[religious_tool], llm=llm, verbose=True
    )
    validator = Agent(
        role='Contemporary Validator',
        goal='Find contemporary views, news, and fatwas on {topic} from trusted online sources.',
        backstory='A meticulous researcher who cross-references religious findings with modern-day discourse and scholarly opinions available on the web.',
        tools=[web_tool], llm=llm, verbose=True
    )
    synthesizer = Agent(
        role='Synthesis Agent',
        goal='Craft a comprehensive, balanced, and well-structured answer to the user\'s query on {topic}, integrating primary sources and contemporary views.',
        backstory='A master communicator with deep knowledge of Islamic jurisprudence, skilled at synthesizing complex information into a clear and nuanced response.',
        llm=llm, verbose=True
    )
    return researcher, validator, synthesizer

## Define Task

In [70]:
from crewai import Task

def define_tasks(researcher, validator, synthesizer):
    """Defines the sequence of tasks for the agents to perform."""
    research_task = Task(
        description='Search for primary texts (Qur\'an and Hadith) related to the topic: {topic}.',
        expected_output='A compiled list of relevant verses and hadiths, with full text.',
        agent=researcher
    )
    validation_task = Task(
        description='Search the web for contemporary opinions, articles, and fatwas on the topic: {topic}.',
        expected_output='A summary of key findings from diverse and reliable online sources.',
        agent=validator
    )
    synthesis_task = Task(
        description=(
            'Analyze the provided primary sources and contemporary web findings. '
            'Synthesize them into a single, comprehensive answer that addresses the user\'s query on {topic}. '
            'The answer must be well-structured, citing different viewpoints where applicable. '
        ),
        expected_output='A final, curated answer that is ready to be presented to the user.',
        agent=synthesizer,
        context=[research_task, validation_task]
    )
    return [research_task, validation_task, synthesis_task]

## Assembling the Crew

In [78]:
import os
from google.colab import userdata

# Set environment variables for LiteLLM compatibility with Azure
os.environ["AZURE_API_KEY"] = userdata.get('AZURE_OPENAI_API_KEY')
os.environ["AZURE_API_BASE"] = userdata.get('AZURE_OPENAI_ENDPOINT')
os.environ["AZURE_API_VERSION"] = userdata.get('OPENAI_API_VERSION')
os.environ["AZURE_DEPLOYMENT_ID"] = userdata.get('AZURE_OPENAI_CHAT_DEPLOYMENT_NAME')
os.environ["OPENAI_API_TYPE"] = 'azure' # Keep this to explicitly set the provider type for LiteLLM


os.environ["SERPER_API_KEY"] = userdata.get('SERPER_API_KEY')
print("Secrets loaded successfully.")

Secrets loaded successfully.
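
Several later cells read these values back from `os.environ` (the embeddings cell, for example, also expects `AZURE_EMBEDDING_DEPLOYMENT_NAME`), so a quick check for missing entries can catch a misconfigured secret early. The helper below is a hedged sketch: `missing_settings` and `REQUIRED_SETTINGS` are hypothetical names, and the list simply mirrors the variable names used in this notebook.

```python
import os

# Variable names taken from the cells in this notebook
REQUIRED_SETTINGS = [
    "AZURE_API_KEY",
    "AZURE_API_BASE",
    "AZURE_API_VERSION",
    "AZURE_EMBEDDING_DEPLOYMENT_NAME",
    "SERPER_API_KEY",
]


def missing_settings(env, required=REQUIRED_SETTINGS):
    """Return the names in `required` that are absent or empty in `env`."""
    return [name for name in required if not env.get(name)]


missing = missing_settings(os.environ)
if missing:
    print(f"Missing settings: {', '.join(missing)}")
else:
    print("All required settings are present.")
```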


In [80]:
from crewai import Agent, Task, Crew, Process, LLM
from langchain_openai import AzureChatOpenAI
import os
from google.colab import userdata

# Explicitly get and set the variables
azure_endpoint = userdata.get('AZURE_OPENAI_ENDPOINT')
azure_deployment = userdata.get('AZURE_OPENAI_CHAT_DEPLOYMENT_NAME')
api_key = userdata.get('AZURE_OPENAI_API_KEY')
api_version = userdata.get('OPENAI_API_VERSION')

llm = AzureChatOpenAI(
    azure_endpoint=azure_endpoint,
    azure_deployment=azure_deployment,
    api_key=api_key,
    api_version=api_version,
    model=f"azure/{azure_deployment}"  # "azure/" prefix tells LiteLLM which provider to route to
)

In [81]:
# Define the agents and tasks by calling the factory functions
agents = define_agents(llm, religious_search_tool, serper_tool)
tasks = define_tasks(*agents)

In [83]:
# Assemble the Crew and kickoff the process
islamic_qna_crew = Crew(
    agents=list(agents),
    tasks=tasks,
    process=Process.sequential,
    verbose=True
)

print("\n Kicking off the Crew... ")
result = islamic_qna_crew.kickoff(inputs={'topic': 'the different scholarly views on cryptocurrency'})

print("\n\n FINAL CURATED RESPONSE ")
print("="*50)
print(result)


 Kicking off the Crew... 





 FINAL CURATED RESPONSE 
**The Islamic Scholarly Views on Cryptocurrency: A Comprehensive Analysis**  

The rise of cryptocurrency has sparked extensive debate among Islamic scholars regarding its permissibility under Sharia law. The discussion hinges on whether cryptocurrencies align with the principles of Islamic finance, which prioritize fairness, transparency, and ethical trading. This analysis synthesizes the primary Qur'anic and Hadith-based guidance, as well as contemporary scholarly perspectives, to provide a nuanced overview of the divergent views on cryptocurrency within Islamic jurisprudence.  

---

### **1. Foundational Islamic Principles Relevant to Cryptocurrency**  

Islamic finance is governed by core principles that regulate wealth management, trade, and transactions. These principles include:  

- **Prohibition of Riba (Interest):** Transactions involving interest are strictly forbidden. Cryptocurrencies generally do not charge interest, which some scholars argue m