## Prerequisites

1. Install Ollama Server from [Ollama Download](https://ollama.com/download)
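2. Download the desired [models](https://ollama.com/library). This notebook uses `llama3.2` for generation and `nomic-embed-text` for embeddings (for example, `ollama pull llama3.2` and `ollama pull nomic-embed-text`)
3. Install the Python dependencies in the environment: `pip install -U langchain langchain-core langchain-community langchain-experimental langchain-openai langchain-ollama streamlit pymupdf faiss-cpu python-dotenv` (`pymupdf` backs `PyMuPDFLoader`, `faiss-cpu` backs the `FAISS` vector store, and `python-dotenv` provides `load_dotenv`)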
4. Run this notebook


# Initial Setup

In [1]:
import os
from dotenv import load_dotenv

from langchain_community.document_loaders import UnstructuredPDFLoader, PyMuPDFLoader, PyPDFDirectoryLoader
from langchain_community.document_loaders import DirectoryLoader

from langchain_text_splitters.character import RecursiveCharacterTextSplitter

from langchain_ollama.embeddings import OllamaEmbeddings
from langchain_community.embeddings import HuggingFaceEmbeddings

from langchain_community.vectorstores import FAISS

from langchain_community.llms import Ollama
from langchain_ollama.chat_models import ChatOllama
from langchain_ollama.llms import OllamaLLM

from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain, create_retrieval_chain
from langchain.chains import RetrievalQA

from langchain.prompts import PromptTemplate, ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

from langchain_core.memory import BaseMemory

import textwrap

import time
import streamlit as st

# Indexing

## Document Loader

In [6]:
# Load the documents

# loader = PyPDFDirectoryLoader('./data')
loader = PyMuPDFLoader('./data/380455eng.pdf')
documents = loader.load()

In [7]:
len(documents)

21

## Chunking

In [8]:
# Split the document into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=80,
)

docs = text_splitter.split_documents(documents)
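
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-width windowing with overlap. It is a simplification and not the `RecursiveCharacterTextSplitter` algorithm, which also tries to break on separators such as paragraph and line boundaries. The helper name `sliding_chunks` and the sample string are hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

Each chunk repeats the last three characters of the previous one, which is the role the 80-character overlap plays in the cell above: a sentence cut at a boundary still appears whole in at least one chunk more often.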

### Semantic Chunking

In [None]:
from langchain_experimental.text_splitter import SemanticChunker

# Required argument: the embedding model used to compare neighbouring sentences
embeddings = OllamaEmbeddings(model='nomic-embed-text')

# Chunker
text_splitter = SemanticChunker(
     embeddings, breakpoint_threshold_type="percentile"
)

docs = text_splitter.split_documents(documents)
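
The idea behind the `"percentile"` breakpoint can be sketched in plain Python: embed consecutive sentences, measure the cosine distance between neighbours, and start a new chunk wherever the distance exceeds a chosen percentile of all distances. This is only an illustration under stated assumptions: the two-dimensional vectors are made up by hand rather than produced by `nomic-embed-text`, the helper names are hypothetical, and the library's implementation has details that the sketch omits.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def percentile(values, pct):
    """Linearly interpolated percentile of a non-empty list, pct in [0, 100]."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def semantic_groups(sentences, vectors, pct=95):
    """Group sentences, breaking where neighbour distance exceeds the percentile."""
    if len(sentences) < 2:
        return [list(sentences)]
    distances = [
        cosine_distance(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)
    ]
    threshold = percentile(distances, pct)
    groups, current = [], [sentences[0]]
    for sentence, dist in zip(sentences[1:], distances):
        if dist > threshold:  # large jump in meaning: close the current chunk
            groups.append(current)
            current = []
        current.append(sentence)
    groups.append(current)
    return groups


# Hand-made toy vectors: the first two sentences point one way, the last two another.
sentences = [
    "AI affects education.",
    "AI affects culture.",
    "FAISS stores vectors.",
    "FAISS searches vectors.",
]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
groups = semantic_groups(sentences, vectors)
print(groups)
```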

### Preview Docs

In [10]:
len(docs)

115

In [9]:
docs[0].page_content

'Recommendation on the ethics of artificial intelligence \nPREAMBLE \nThe General Conference of the United Nations Educational, Scientific and Cultural Organization (UNESCO), \nmeeting in Paris from 9 to 24 November 2021, at its 41st session,  \nRecognizing the profound and dynamic positive and negative impacts of artificial intelligence (AI) on societies, \nenvironment, ecosystems and human lives, including the human mind, in part because of the new ways in \nwhich its use influences human thinking, interaction and decision-making and affects education, human, social \nand natural sciences, culture, and communication and information, \nRecalling that, by the terms of its Constitution, UNESCO seeks to contribute to peace and security by \npromoting collaboration among nations through education, the sciences, culture, and communication and \ninformation, in order to further universal respect for justice, for the rule of law and for the human rights and'

In [12]:
docs[1].metadata

{'source': './data/380455eng.pdf',
 'file_path': './data/380455eng.pdf',
 'page': 0,
 'total_pages': 21,
 'format': 'PDF 1.4',
 'title': '',
 'author': 'Mcgrath, Dermot',
 'subject': '',
 'keywords': '',
 'creator': 'Microsoft® Word for Microsoft 365',
 'producer': 'Microsoft® Word for Microsoft 365',
 'creationDate': "D:20220113112627+01'00'",
 'modDate': "D:20220508020546+02'00'",
 'trapped': ''}

## Embedding

In [13]:
# Initialize embeddings
embeddings = OllamaEmbeddings(model='nomic-embed-text')

## Vector Store

In [15]:
# Build the FAISS index, then save it to disk and load it back
db = FAISS.from_documents(docs, embeddings)

# db.save_local('./faiss_db/', index_name='ai_ethics')
# db = FAISS.load_local('./faiss_db/', embeddings=embeddings, index_name='ai_ethics', allow_dangerous_deserialization=True)

# index_name must be identical in save_local and load_local
db.save_local('./faiss_db/', index_name='unesco_ai')
db = FAISS.load_local('./faiss_db/', embeddings=embeddings, index_name='unesco_ai', allow_dangerous_deserialization=True)
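
`save_local` writes two files into the folder, `<index_name>.faiss` and `<index_name>.pkl`, so a misspelled `index_name` at load time points at files that do not exist. A small guard along these lines can surface that before loading. It is a sketch that assumes this two-file layout; `missing_index_files` is a hypothetical helper, and the demo uses empty placeholder files in a temporary directory rather than a real index.

```python
import tempfile
from pathlib import Path


def missing_index_files(folder, index_name):
    """Return the names of the expected FAISS files that are absent from folder."""
    folder = Path(folder)
    expected = [folder / f"{index_name}.faiss", folder / f"{index_name}.pkl"]
    return [path.name for path in expected if not path.is_file()]


with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for the two files a saved index named 'unesco_ai' would consist of.
    for suffix in (".faiss", ".pkl"):
        (Path(tmp) / f"unesco_ai{suffix}").touch()

    ok = missing_index_files(tmp, "unesco_ai")
    typo = missing_index_files(tmp, "usesco_ai")

print("missing for 'unesco_ai':", ok)
print("missing for 'usesco_ai':", typo)
```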

# Retrieval

In [57]:
# Initialize retriever
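# search_type is an argument of as_retriever itself; search_kwargs carries per-search options such as k
retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 10})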
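
Conceptually, a similarity retriever with `k` set returns the `k` stored vectors closest to the query vector. The brute-force sketch below shows that ranking step, assuming Euclidean (L2) distance and hand-made two-dimensional vectors; `top_k` is a hypothetical helper, and a real FAISS index performs the same kind of search with optimized data structures.

```python
import math


def top_k(query_vec, doc_vecs, k):
    """Return the indices of the k vectors nearest to query_vec, closest first."""
    scored = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort()  # smallest distance first; ties fall back to the lower index
    return [idx for _, idx in scored[:k]]


# Toy "chunk embeddings"; index 1 is identical to the query, index 2 is close to it.
doc_vecs = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.2], [5.0, 5.0]]
nearest = top_k([1.0, 1.0], doc_vecs, k=2)
print(nearest)
```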

## Multi Query Retrieval

In [16]:
from langchain.retrievers.multi_query import MultiQueryRetriever

llm = ChatOllama(model="llama3.2:latest")

retriever = MultiQueryRetriever.from_llm(
    retriever=db.as_retriever(), llm=llm
)
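
Multi-query retrieval asks the LLM for several rephrasings of the question, runs the base retriever once per rephrasing, and returns the de-duplicated union of the results. The sketch below mimics that flow with stubs: `fake_rewrite` stands in for the LLM, `fake_retrieve` for the vector store, and the chunk identifiers are invented for illustration.

```python
def unique_union(result_lists):
    """Merge result lists, keeping the first occurrence of each item in order."""
    seen, merged = set(), []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


def multi_query_retrieve(question, rewrite, retrieve):
    """Retrieve once per rewritten query and merge the results."""
    queries = rewrite(question)
    return unique_union(retrieve(query) for query in queries)


# Hypothetical rewritten queries mapped to the chunks each would retrieve.
fake_index = {
    "objectives of AI ethics": ["chunk-3", "chunk-7"],
    "goals of the recommendation": ["chunk-7", "chunk-1"],
    "why define ethical principles for AI": ["chunk-3", "chunk-9"],
}


def fake_rewrite(question):
    # A real setup would ask the LLM; here the rephrasings are fixed.
    return list(fake_index)


def fake_retrieve(query):
    return fake_index[query]


merged = multi_query_retrieve("What are the objectives?", fake_rewrite, fake_retrieve)
print(merged)
```

The union is usually larger than any single result list, which is why this approach can recover chunks that one phrasing of the question would miss, at the cost of extra LLM and retrieval calls.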

# Generation

In [17]:
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

In [18]:
from langchain import hub

# prompt = hub.pull("rlm/rag-prompt")

template = """ You are a helpful AI assistant that can answer questions about AI ethics.

Based on given context, you can answer questions about AI ethics. If it is out of context, say "I am not sure about that."

{context}

Question: {question}

Answer:"""

prompt = PromptTemplate.from_template(template)

llm = ChatOllama(model="llama3.2:latest", temperature=0.2)

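# Same construction as `retriever` from the Multi Query Retrieval cell, but built on the
# temperature-0.2 LLM. Note that the chain below uses `retriever`, not this object.
retriever_from_llm = MultiQueryRetriever.from_llm(
    retriever=db.as_retriever(), llm=llm
)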

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
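
The `|` pipeline above can be read as ordinary function composition: the input question is sent both to the retriever (whose results are joined into one context string) and straight through, the two values fill the prompt template, the model answers, and the parser extracts the text. The plain-Python sketch below traces that data flow with stubs. Every function in it is a hypothetical stand-in, no model is called, and the retrieved strings are invented.

```python
def format_docs(docs):
    """Join retrieved chunk texts into a single context string."""
    return "\n\n".join(docs)


def fake_retriever(question):
    # Stand-in for the vector store lookup; returns chunk texts directly.
    return [
        "AI systems should respect human rights.",
        "AI methods should be proportionate.",
    ]


def fill_prompt(inputs):
    # Stand-in for the prompt template: substitutes context and question.
    return f"{inputs['context']}\n\nQuestion: {inputs['question']}\n\nAnswer:"


def fake_llm(prompt_text):
    # Stand-in for the chat model; returns a message-like dict.
    return {"content": f"(model reply to a prompt of {len(prompt_text)} characters)"}


def parse_output(message):
    # Stand-in for the output parser: keep only the text of the reply.
    return message["content"]


def run_chain(question):
    inputs = {
        "context": format_docs(fake_retriever(question)),  # retriever, then format_docs
        "question": question,                               # passed through unchanged
    }
    return parse_output(fake_llm(fill_prompt(inputs)))


question = "What is AI ethics?"
prompt_text = fill_prompt(
    {"context": format_docs(fake_retriever(question)), "question": question}
)
answer = run_chain(question)
print(prompt_text)
print(answer)
```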

In [54]:
# rag_chain.invoke("What is this document about?")
for chunk in rag_chain.stream("What are the objectives of defining AI ethics?"):
    print(chunk, end="", flush=True)

Based on the provided text, the objectives of defining AI ethics appear to be:

1. To ensure that AI systems do not harm humans and other morally relevant beings.
2. To determine the moral status of AI systems themselves.
3. To assess how AI systems might differ from humans in certain basic respects relevant to our ethical assessment of them.
4. To consider the implications of creating AI systems more intelligent than human.

These objectives suggest that defining AI ethics is crucial for addressing potential risks and challenges associated with the development and deployment of artificial intelligence, while also exploring its potential benefits and applications.

In [55]:
# rag_chain.invoke("What is this document about?")
for chunk in rag_chain.stream("What is the impact of AI on society?"):
    print(chunk, end="", flush=True)

The impact of AI on society can be significant and far-reaching. As mentioned in the provided text, AI systems have the potential to exacerbate existing biases, prejudices, and stereotypes, leading to increased inequality and social unrest. Additionally, AI-powered systems can perpetuate human cognitive biases, manipulate data, and undermine public trust in technology.

The consequences of AI-related ethical problems can include:

* Increased inequality
* Extended litigation processes
* Social uprising
* Profiling and biases against specific groups
* Faking data, stealing passwords, and interfering with other software and machines
* Undermining personal privacy, data protection, fairness, and autonomy

Furthermore, the development of AI systems raises new types of ethical issues, including their impact on decision-making, employment, labor, social interaction, healthcare, education, media, access to information, digital divide, personal data, consumer protection, environment, democracy

In [57]:
for chunk in rag_chain.stream("What factors should be considered when defining AI ethics for software systems?"):
    print(chunk, end="", flush=True)

According to the provided text, the following factors should be considered when defining AI ethics for software systems:

1. **Transparency**: Providing insight into how AI systems work, including their decision-making processes and data sources.
2. **Fairness and non-discrimination**: Ensuring that AI systems do not perpetuate or exacerbate existing social inequalities and biases.
3. **Safety**: Verifying the safety of AI systems to prevent harm to humans, animals, and the environment.
4. **Accountability**: Establishing clear lines of accountability for AI system developers, deployers, and users.
5. **Respect for human rights**: Ensuring that AI systems respect and protect human rights, including the right to privacy, freedom of expression, and non-discrimination.
6. **Cultural sensitivity**: Considering the cultural context in which AI systems will be used and ensuring that they are sensitive to diverse cultures and values.
7. **Data protection**: Protecting personal data and ensuri

In [19]:
for chunk in rag_chain.stream("List down all important criterias to define AI ethics?"):
    print(chunk, end="", flush=True)

Based on the provided text, here are the important criteria to define AI ethics:

1. **Proportionality and Do No Harm**: AI systems should not cause harm or exceed what is necessary to achieve legitimate aims or objectives.
2. **Respect for Human Rights**: AI systems should not infringe upon human rights, including fundamental freedoms.
3. **Contextual Appropriateness**: AI methods chosen should be appropriate and proportional to the context in which they are used.
4. **Rigorous Scientific Foundations**: AI methods should be based on rigorous scientific foundations.
5. **Avoidance of Abuse**: AI systems should not be used for social scoring or mass surveillance purposes.
6. **Protection of Fundamental Freedoms**: AI systems should respect and protect fundamental freedoms, including freedom of speech, assembly, and association.
7. **Promotion of Peace, Inclusiveness, Justice, Equity, and Interconnectedness**: AI systems should promote these values throughout their life cycle.
8. **Avoid