# Importing Libraries

In [18]:
# updated version
from time import time
import torch
from langchain_community.llms import Ollama
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import Chroma

import warnings
warnings.filterwarnings('ignore')


In [11]:
llm = Ollama(model="llama3")

In [12]:
# Function to test the Ollama model directly
def test_model_ollama(model, message):
    """
    Perform a query with the Ollama model and print the result
    Args:
        model: The Ollama model
        message: The prompt
    Returns:
        str: Question and Answer
    """    
    time_start = time()
    response = model.invoke(message)
    time_end = time()
    total_time = f"{round(time_end-time_start, 3)} sec."
    
    return f"Question: {message}\nAnswer: {response}\nTotal time: {total_time}"
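Because `test_model_ollama` only relies on the model exposing an `invoke` method, the timing logic can be exercised without a running Ollama server. The sketch below is an illustration under assumptions: `EchoModel` and `timed_query` are hypothetical names that are not part of LangChain or this notebook, and `time.perf_counter` is swapped in for `time.time` because it is generally better suited to measuring elapsed intervals.

```python
from time import perf_counter


class EchoModel:
    """Hypothetical stand-in for an LLM: returns the prompt in upper case."""

    def invoke(self, message):
        return message.upper()


def timed_query(model, message):
    """Call model.invoke(message) and return (response, elapsed_seconds)."""
    start = perf_counter()
    response = model.invoke(message)
    elapsed = perf_counter() - start
    return response, round(elapsed, 3)


# Exercise the helper with the stub instead of a real model
answer, seconds = timed_query(EchoModel(), "what is force?")
print(f"Question: what is force?\nAnswer: {answer}\nTotal time: {seconds} sec.")
```

Returning the response and the elapsed time separately, rather than one formatted string, makes it easier to log or compare timings across queries.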

In [13]:
# Helper to color the labeled parts of a response (Question, Answer, ...) for Markdown display
from IPython.display import display, Markdown

def colorize_text(text):
    for word, color in zip(["Reasoning", "Question", "Answer", "Total time"], ["blue", "red", "green", "magenta"]):
        text = text.replace(f"{word}:", f"\n\n**<font color='{color}'>{word}:</font>**")
    return text
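To see what `colorize_text` produces before it is handed to `Markdown`, the helper can be run on a short sample string. The block repeats the function so that it runs on its own, using a dictionary instead of `zip` (an equivalent restructuring); the name `LABEL_COLORS` and the sample text are made up for illustration.

```python
# Label-to-color mapping, same pairs as in the helper above
LABEL_COLORS = {
    "Reasoning": "blue",
    "Question": "red",
    "Answer": "green",
    "Total time": "magenta",
}


def colorize_text(text):
    """Wrap each known label in a bold, colored <font> tag on its own paragraph."""
    for word, color in LABEL_COLORS.items():
        text = text.replace(f"{word}:", f"\n\n**<font color='{color}'>{word}:</font>**")
    return text


sample = "Question: What is force?\nAnswer: A push or a pull."
colored = colorize_text(sample)
print(colored)
```

Labels that do not occur in the text (here `Reasoning:` and `Total time:`) are simply left out; the function does not raise on missing labels.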

In [14]:
# Test model with some queries
response = test_model_ollama(llm, "Please explain what is EU AI Act.")
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** Please explain what is EU AI Act.


**<font color='green'>Answer:</font>** The European Union (EU) Artificial Intelligence (AI) Act, also known as the "Artificial Intelligence Act" or "AI Act", is a proposed legislation aimed at regulating and promoting the development of Artificial Intelligence in the EU. The act aims to ensure that AI systems are trustworthy, transparent, and safe for society.

Here are some key aspects of the EU AI Act:

**Objectives:**

1. **Trustworthiness**: Ensure that AI systems are reliable, transparent, and free from biases.
2. **Safety**: Prevent AI systems from causing harm or damage to individuals or society.
3. **Transparency**: Encourage the development of explainable AI (XAI) and ensure that users understand how AI decisions are made.

**Scope:**

1. The act applies to AI systems used in various sectors, including healthcare, finance, education, transportation, and more.
2. It covers both high-risk and low-risk AI applications.

**Key Provisions:**

1. **Risk-based approach**: The act proposes a risk-based approach to regulate AI systems, considering factors like the potential harm they may cause or the level of human oversight required.
2. **High-Risk Systems**: For high-risk AI applications (e.g., healthcare, transportation), strict rules and guidelines will be established for development, testing, and deployment.
3. **Low-Risk Systems**: For low-risk AI applications (e.g., games, entertainment), a lighter regulatory framework will apply, focusing on transparency and accountability.
4. **Explainability**: The act encourages the development of explainable AI (XAI) to ensure that users understand how AI decisions are made.
5. **Ethics Guidelines**: The EU AI Act will provide ethical guidelines for AI developers, emphasizing values like fairness, transparency, and non-discrimination.

**Implementation:**

1. The EU AI Act is expected to be implemented through a combination of binding regulations, guidelines, and best practices.
2. A dedicated AI authority will be established to oversee the implementation and enforcement of the act.

The European Union AI Act aims to strike a balance between promoting innovation in AI development and ensuring that these systems are safe, transparent, and trustworthy for society. The proposed legislation has been subject to public consultation and is expected to be finalized in 2022-2023.

Would you like me to elaborate on any specific aspect of the EU AI Act?


**<font color='magenta'>Total time:</font>** 131.633 sec.

In [16]:
# Data ingestion using PDF loader
loader = PyPDFLoader("pdf/Physics Classes 9-10.pdf")
documents = loader.load()
#print(documents)

In [19]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
all_splits = text_splitter.split_documents(documents)
#print(all_splits)
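The effect of `chunk_size` and `chunk_overlap` can be illustrated with a much simpler splitter. The sketch below is a plain fixed-size sliding window over characters, not the algorithm `RecursiveCharacterTextSplitter` uses (that class prefers to break on separators such as paragraph and line boundaries); `sliding_window_chunks` is a hypothetical helper, and the sizes are shrunk so the overlap is visible.

```python
def sliding_window_chunks(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


text = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(text, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each chunk starts with the last three characters of the previous one, which is the role `chunk_overlap=100` plays above: a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk.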

In [20]:
device = "cuda" if torch.cuda.is_available() else "cpu"
model_name = "sentence-transformers/all-mpnet-base-v2"
model_kwargs = { "device": device }

embeddings = HuggingFaceEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs
)

print(f"Using device: {device}")

2024-09-16 16:39:30.793773: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-16 16:39:30.896634: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-16 16:39:30.926101: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-16 16:39:31.094475: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


Using device: cpu


In [21]:
print(embeddings)

client=SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
) model_name='sentence-transformers/all-mpnet-base-v2' cache_folder=None model_kwargs={'device': 'cpu'} encode_kwargs={} multi_process=False show_progress=False


In [22]:
# Embed every chunk and store the vectors in a Chroma collection on disk
vectordb = Chroma.from_documents(
    documents=all_splits,
    embedding=embeddings,
    persist_directory="chroma_db",
)
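Conceptually, the vector store answers a query by embedding it and returning the stored chunks whose vectors lie closest to the query vector. The sketch below is a toy illustration of that ranking with cosine similarity on hand-written 3-dimensional vectors; the real embeddings here have 768 dimensions, and Chroma's actual index and distance metric may differ. The names `cosine_similarity`, `top_k`, and the vectors are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k):
    """Return the names of the k stored vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Made-up vectors standing in for chunk embeddings
doc_vecs = {
    "force": [0.9, 0.1, 0.0],
    "electricity": [0.1, 0.9, 0.2],
    "optics": [0.0, 0.2, 0.9],
}
query_vec = [0.2, 0.8, 0.1]  # made-up embedding of an electricity question
ranked = top_k(query_vec, doc_vecs, k=2)
print(ranked)
```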

In [23]:
# Wrap the vector store as a retriever and combine it with the LLM
retriever = vectordb.as_retriever()
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    verbose=True,
)
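The `"stuff"` chain type places all retrieved chunks into a single prompt together with the question. The sketch below shows that idea in plain Python; the prompt wording and the `build_stuff_prompt` helper are assumptions for illustration and are not LangChain's actual template.

```python
def build_stuff_prompt(question, chunks):
    """Concatenate all retrieved chunks into one context block ahead of the question."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up chunks standing in for what the retriever would return
chunks = ["Force is a push or a pull.", "The SI unit of force is the newton."]
prompt = build_stuff_prompt("What is force?", chunks)
print(prompt)
```

Since every chunk goes into one prompt, the total length grows with the number and size of retrieved chunks, so this approach fits best when the combined context stays within the model's context window.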

In [24]:
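def test_rag(qa, query):
    """Run a query through the RetrievalQA chain and print the timed result."""
    time_start = time()
    # `run` is deprecated in recent LangChain releases; `invoke` returns a dict
    # whose "result" key holds the answer text.
    response = qa.invoke({"query": query})["result"]
    time_end = time()
    total_time = f"{round(time_end-time_start, 3)} sec."
    full_response = f"Question: {query}\nAnswer: {response}\nTotal time: {total_time}"
    print(full_response)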

query = "What is force?"
test_rag(qa, query)



> Entering new RetrievalQA chain...

> Finished chain.
Question: What is force?
Answer: According to Newton's first law of motion, a force is "that which acting on a body" - in other words, an external cause that changes or tends to change the state of an object.
Total time: 94.074 sec.


In [25]:
query = "explain me briefly about the electricity formulas"
test_rag(qa, query)



> Entering new RetrievalQA chain...

> Finished chain.
Question: explain me briefly about the electricity formulas
Answer: Based on the provided context, I'll try to help you with that.

From what we've covered so far:

1. **Work done (Energy spent)**: WV Q = VQ joule, where V is the potential difference and Q is the amount of charge.
2. **Electric current**: QIt = (or QI t) = I t (current I flowing through conductor for time t).
3. **Electric power**: Power = Work done = 1WV I t = WV I t Joule, where W is the electric power.

Using Ohm's law, we can express the relationship between electric potential difference (V), current (I), and resistance (R) as:

WV I t = VQ = 2 (or 3.33 in this specific example)

These formulas should give you a brief overview of electricity basics!
Total time: 120.019 sec.
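The formulas in this answer are garbled because the equations in the PDF do not survive text extraction. Reading the retrieved passage (the text around equations 11.11 and 11.12 in the source PDF) together with the standard definitions, the relations it most likely intends are the following; this is a reconstruction, not text produced by the notebook.

```latex
% Work done when charge Q moves through potential difference V,
% with Q = I t for a steady current I flowing for time t
W = VQ, \qquad Q = It \quad\Rightarrow\quad W = VIt \tag{11.11}

% Substituting Ohm's law, V = IR
W = VIt = I^{2}Rt = \frac{V^{2}}{R}\,t \quad \text{joule} \tag{11.12}

% Electric power is the rate at which work is done
P = \frac{W}{t} = VI
```

Here $W$ is the work done (energy converted) in joules, not the power as the model's answer states; the power is $P$, measured in watts.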


In [26]:
# Retrieve the document sources for the query
docs = vectordb.similarity_search(query)
print(f"Query: {query}")
print(f"Retrieved documents: {len(docs)}")
for doc in docs:
    # Document exposes its metadata and text directly; no need to go through to_json()
    print("Source: ", doc.metadata['source'])
    print("Text: ", doc.page_content, "\n")

Query: explain me briefly about the electricity formulas
Retrieved documents: 4
Source:  pdf/Physics Classes 9-10.pdf
Text:  coulomb charge flows through the conductor, the amount of work done =  VQ joule. 
Therefore, energy spent or the amount of energy converted is                                                          
WV Q=  
Again, electric current,                   QIt=  
                                                   or, QI t= 
                                                 ∴  WV I t=                                (11.11) 
Using ohm’s law this relationship can be expressed as below, 
                             ∴    2
2 VWV I tI R t tR== =   Joule                       (11.12) 
Electric power 
The electric appliances which we use in houses or in offices are generally marked with 
the voltage by which it runs and the electric power in watt. We know the rate of work 
done or the rate of energy conversion is called power. Therefore, the rate at which energy is converted 