*This notebook demonstrates a Responsible AI tool that answers users' questions about responsible AI practices.*

**Author :** [Satarupa Deb](https://www.linkedin.com/in/satarupa-deb-/)

**GitHub :** https://github.com/Satarupa22-SD


In [2]:
# Install necessary packages
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117 --upgrade
!pip install langchain einops accelerate transformers bitsandbytes scipy
!pip install xformers sentencepiece
!pip install llama-index llama_hub --upgrade
!pip install sentence-transformers
!pip install pypdf2

Looking in indexes: https://download.pytorch.org/whl/cu117
INFO: pip is looking at multiple versions of torch to determine which version is compatible with other requirements. This could take a while.
Collecting torch
  Downloading https://download.pytorch.org/whl/cu117/torch-2.0.1%2Bcu117-cp310-cp310-linux_x86_64.whl (1843.9 MB)
Collecting triton==2.0.0 (from torch)
  Downloading https://download.pytorch.org/whl/triton-2.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (63.3 MB)
Collecting lit (from triton==2.0.0->torch)
  Downloading https://download.pytorch.org/whl/lit-15.0.7.tar.gz (132 kB)
  Prep

In [12]:
!pip install llama-index-llms-huggingface
%pip install llama-index-embeddings-langchain


Collecting llama-index-embeddings-langchain
  Downloading llama_index_embeddings_langchain-0.1.2-py3-none-any.whl (2.5 kB)
Installing collected packages: llama-index-embeddings-langchain
Successfully installed llama-index-embeddings-langchain-0.1.2


In [14]:
import PyPDF2
import torch
from pathlib import Path
from transformers import AutoTokenizer, AutoModelForCausalLM
from llama_index.core import VectorStoreIndex, download_loader, set_global_service_context, ServiceContext
from llama_index.llms.huggingface import HuggingFaceLLM
from langchain.embeddings import HuggingFaceEmbeddings
from llama_index.embeddings.langchain import LangchainEmbedding


In [15]:
# Extract text from PDF
def extract_text_from_pdf(pdf_path):
    """Return the concatenated text of every page in the PDF at pdf_path."""
    with open(pdf_path, 'rb') as file:
        pdf_reader = PyPDF2.PdfReader(file)
        text = ""
        for page in pdf_reader.pages:
            # extract_text() replaces extractText(), which PyPDF2 3.0 removed
            text += page.extract_text()
    return text

In [16]:
# Llama setup
import os

name = "meta-llama/Llama-2-7b-chat-hf"
# Read the Hugging Face access token from the environment instead of hardcoding it
auth_token = os.environ["HF_TOKEN"]
tokenizer = AutoTokenizer.from_pretrained(name, cache_dir='./model/', use_auth_token=auth_token)
model = AutoModelForCausalLM.from_pretrained(name, cache_dir='./model/', use_auth_token=auth_token, torch_dtype=torch.float16, rope_scaling={"type": "dynamic", "factor": 2}, load_in_8bit=True)




tokenizer_config.json:   0%|          | 0.00/1.62k [00:00<?, ?B/s]

tokenizer.model:   0%|          | 0.00/500k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.84M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/414 [00:00<?, ?B/s]



config.json:   0%|          | 0.00/614 [00:00<?, ?B/s]

The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
`low_cpu_mem_usage` was None, now set to True since model is quantized.


model.safetensors.index.json:   0%|          | 0.00/26.8k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/9.98G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/3.50G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/188 [00:00<?, ?B/s]
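
The two weight shards in the download log above (9.98 GB and 3.50 GB) give a rough sense of why the model is loaded in 8-bit. The sketch below is back-of-the-envelope arithmetic only. It assumes the checkpoint stores 2 bytes per parameter and counts weights alone, ignoring activations, the KV cache, and any quantization overhead. `weight_memory_gb` and `N_PARAMS` are hypothetical names introduced for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


# Assumption: parameter count inferred from the shard sizes in the log
# (9.98 GB + 3.50 GB = 13.48 GB, stored at 2 bytes per parameter).
N_PARAMS = 13.48e9 / 2

fp16_gb = weight_memory_gb(N_PARAMS, 16)
int8_gb = weight_memory_gb(N_PARAMS, 8)
print(f"fp16: {fp16_gb:.2f} GB, int8: {int8_gb:.2f} GB")
```

Under these assumptions, 8-bit loading roughly halves the memory needed for the weights, which can help the model fit on a single Colab GPU.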

In [17]:
system_prompt = """<s>[INST] <<SYS>>
You are a helpful, respectful, and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
Your goal is to provide answers relating to the Responsible AI practices that a company should follow.<</SYS>>"""

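# Close the [INST] block opened in system_prompt so the model sees a complete Llama-2 chat turn
query_wrapper_prompt = "{query_str} [/INST]"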

llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=256,
    system_prompt=system_prompt,
    query_wrapper_prompt=query_wrapper_prompt,
    model=model,
    tokenizer=tokenizer
)

embeddings = LangchainEmbedding(
    HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
)

service_context = ServiceContext.from_defaults(
    chunk_size=2048,
    llm=llm,
    embed_model=embeddings
)

set_global_service_context(service_context)

# Load the document and add to index
pdf_path = "/content/Responsible Ai.pdf"
PyMuPDFReader = download_loader("PyMuPDFReader")
loader = PyMuPDFReader()
documents = loader.load(file_path=Path(pdf_path), metadata=True)

# Create an index using the loaded document
index = VectorStoreIndex.from_documents(documents)

# Create a query engine using the index
query_engine = index.as_query_engine()

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


modules.json:   0%|          | 0.00/349 [00:00<?, ?B/s]

config_sentence_transformers.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

README.md:   0%|          | 0.00/10.7k [00:00<?, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/612 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/90.9M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/350 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/466k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/112 [00:00<?, ?B/s]

1_Pooling/config.json:   0%|          | 0.00/190 [00:00<?, ?B/s]
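
Llama-2 chat models are trained on prompts in which the system text sits between `<<SYS>>` and `<</SYS>>`, and the whole turn is wrapped in `[INST] ... [/INST]`. The following is a minimal sketch of how a system prompt and a user question combine into one such turn. `build_llama2_prompt` is a hypothetical helper written for illustration and is not part of llama-index or transformers.

```python
def build_llama2_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama-2 chat prompt (hypothetical helper)."""
    return (
        f"<s>[INST] <<SYS>>\n{system.strip()}\n<</SYS>>\n\n"
        f"{user.strip()} [/INST]"
    )


prompt = build_llama2_prompt(
    "You answer questions about Responsible AI practices.",
    "What are important aspects of responsible AI?",
)
print(prompt)
```

If the tokenizer already prepends a beginning-of-sequence token, the literal `<s>` can likely be dropped from the string.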



In [18]:
# Querying the document - 1
response = query_engine.query("What are important aspects of responsible AI?")
print(response.response)

The `seq_len` argument is deprecated and unused. It will be removed in v4.39.



There are several important aspects of responsible AI that companies should follow to ensure their AI systems are ethical, transparent, and trustworthy. Some of these aspects include:

1. Explanability: AI systems should be able to provide clear and understandable explanations for their decisions and actions, so that users can understand how they work and make informed decisions.
2. Accountability: Companies should be accountable for the actions of their AI systems and be able to explain and justify their decisions and actions.
3. Transparency: AI systems should be transparent about their decision-making processes and the data they use, so that users can trust their decisions and actions.
4. Bias and fairness: AI systems should be designed to avoid bias and ensure fairness in their decision-making processes, so that they do not discriminate against certain groups of users.
5. Security and privacy: AI systems should be designed to protect the security and privacy of users' data and ens

In [19]:
# Querying the document - 2
response = query_engine.query("How can bias be identified and mitigated in AI systems?")
print(response.response)

Bias can be identified and mitigated in AI systems by engaging with social scientists, humanists, and other relevant experts to understand and account for various perspectives. It is important to design AI systems with concrete goals for fairness and inclusion, and to use a variety of approaches to iterate and improve, especially when working in the emerging area of generative AI. Additionally, it is critical to work towards systems that are fair and inclusive for all, and to address the ethical considerations of fairness, equity, and inclusion in AI.
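
Answers like the one above are grounded in passages retrieved from the indexed PDF. Conceptually, a vector index embeds the question, ranks the stored chunks by similarity to it, and hands the best matches to the LLM as context. The toy sketch below illustrates that ranking step with cosine similarity. The three-dimensional vectors are made up, and `cosine` and `top_k` are hypothetical helpers, not llama-index's implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunks, k=2):
    """Return the k chunk texts most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in ranked[:k]]


# Made-up vectors standing in for real sentence embeddings.
chunks = [
    {"text": "Fairness goals and bias testing", "vec": [0.9, 0.1, 0.0]},
    {"text": "Privacy and data minimization", "vec": [0.1, 0.9, 0.1]},
    {"text": "Security review processes", "vec": [0.0, 0.2, 0.9]},
]
query_vec = [0.8, 0.2, 0.1]  # pretend embedding of a bias-related question

context = top_k(query_vec, chunks, k=2)
print(context)
```

In the notebook itself, the embeddings come from `all-MiniLM-L6-v2` and the chunks come from the loaded PDF, so the ranking operates on real text rather than these placeholder vectors.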


In [20]:
# Querying the document - 3
response = query_engine.query("What ethical considerations should be taken into account when deploying AI technologies?")
print(response.response)

When deploying AI technologies, ethical considerations that should be taken into account include respecting the privacy of individuals, ensuring safety and security, and following best-practice processes established for cryptographic and security-critical software. It is essential to consider and address the safety of AI systems before they are widely relied upon in safety-critical applications. Additionally, it is important to conduct iterative user testing to incorporate a diverse set of users' needs in the development cycles, apply the quality engineering principle of poka-yoke, and conduct integration tests to understand how the AI system interacts with other systems and what, if any, feedback loops are created. Finally, it is crucial to offer users transparency and control of their data and to minimize the possibility that ML models reveal underlying data by appropriately applying various techniques in a precise, principled fashion.


In [21]:
# Querying the document - 4
response = query_engine.query("How can transparency and interpretability be ensured in AI decision-making processes?")
print(response.response)


Transparency and interpretability are crucial in AI decision-making processes to ensure that the systems are fair, accountable, and trustworthy. Here are some recommended practices to ensure transparency and interpretability in AI decision-making processes:

1. Plan out your options to pursue interpretability: Identify the level of interpretability required for your model and work closely with domain experts to determine the necessary features.
2. Analyze your training and testing data: Investigate your input data and gather more training data for certain subsets or categories of interest.
3. Design a new model or modify an existing one: Consider designing a new model or modifying an existing one to improve interpretability.
4. Provide appropriate levels of transparency: Balance the need for transparency with the potential risks of abuse or exploitation.
5. Iterate with users in the development cycle: Test and refine your assumptions about user needs and goals through iterative testin

In [22]:
# Querying the document - 5
response = query_engine.query("What regulatory frameworks exist for governing the ethical use of AI?")
print(response.response)

There are several regulatory frameworks that exist for governing the ethical use of AI, including:

1. The European Union's General Data Protection Regulation (GDPR): This regulation provides a comprehensive framework for the protection of personal data and privacy in the EU.
2. The California Consumer Privacy Act (CCPA): This act provides California residents with certain rights regarding their personal information, including the right to know what personal information is being collected, used, and shared, and the right to request deletion of that information.
3. The American Data Privacy and Protection Act (ADPPA): This act provides a comprehensive framework for the protection of personal data and privacy in the US.
4. The Organization for Economic Cooperation and Development (OECD) Principles on Artificial Intelligence: These principles provide a framework for the ethical development and use of AI, including the importance of transparency, accountability, and respect for human right