Kudos to: 

* https://www.kaggle.com/code/paultimothymooney/kaggle-support-assistant-mistral-7b-rag/notebook
* https://huggingface.co/docs/transformers/index
* https://python.langchain.com/docs/get_started/introduction
* https://www.kaggle.com/code/philculliton/talking-papers-with-mistral-7b/
* https://www.kaggle.com/code/gpreda/rag-using-llama-2-langchain-and-chromadb/

In [1]:
!pip install -qU langchain accelerate bitsandbytes transformers chromadb sentence-transformers faiss-gpu

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
cudf 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
cuml 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
dask-cudf 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
apache-beam 2.46.0 requires dill<0.3.2,>=0.3.1.1, but you have dill 0.3.7 which is incompatible.
apache-beam 2.46.0 requires pyarrow<10.0.0,>=3.0.0, but you have pyarrow 11.0.0 which is incompatible.
cudf 23.8.0 requires pandas<1.6.0dev0,>=1.3, but you have pandas 2.0.3 which is incompatible.
cudf 23.8.0 requires protobuf<5,>=4.21, but you have protobuf 3.20.3 which is incompatible.
cuml 23.8.0 requires dask==2023.7.1, but you have dask 2023.12.1 which is incompatible.
cuml 23.8.0 requires distributed==2023.7.1, but you have distributed 2023.12.1 which is incompatible.
dask-cuda 23.8.0 requires da

In [2]:
import os
import numpy as np
import pandas as pd
import transformers
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
)

from langchain.document_loaders import TextLoader, PyPDFLoader
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.llms import HuggingFacePipeline
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from torch import cuda, bfloat16
import torch
from time import time
import warnings
warnings.filterwarnings('ignore')

In [3]:
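class CFG:
    # Local path to the Mistral 7B Instruct weights (Kaggle model input)
    model_path = "/kaggle/input/mistral/pytorch/7b-instruct-v0.1-hf/1"
    # Text-generation settings
    temperature = 0.7
    repetition_penalty = 1.1
    max_new_tokens = 2000
    # Sentence-transformers model used for embeddings (not the LLM)
    model_name = "sentence-transformers/all-mpnet-base-v2"
    # PDF that serves as the retrieval corpus
    rag_data = "/kaggle/input/eu-ai-act-full-text/AIAct_final_four-column21012024.pdf"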

In [4]:
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
    bnb_4bit_use_double_quant=False,
)
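
For a rough sense of why 4-bit loading matters here, the arithmetic below compares the weight footprint at 16-bit and 4-bit precision. The parameter count (about 7.2 billion) is an assumed round figure for a 7B-class model, not a value read from the checkpoint, and the estimate ignores quantization constants, activations, and the KV cache.

```python
# Back-of-the-envelope weight memory for a 7B-class model.
# ASSUMPTION: ~7.2e9 parameters; this is not read from the checkpoint.
ASSUMED_PARAMS = 7.2e9


def weight_gib(n_params: float, bits_per_param: int) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


fp16_gib = weight_gib(ASSUMED_PARAMS, 16)
nf4_gib = weight_gib(ASSUMED_PARAMS, 4)
print(f"16-bit weights: {fp16_gib:.1f} GiB")
print(f"4-bit weights:  {nf4_gib:.1f} GiB")
```

Under these assumptions the 4-bit weights take a quarter of the 16-bit footprint, which is what lets a model of this size fit on a single notebook GPU.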

# Functions

In [5]:
def test_model(tokenizer, pipeline, prompt_to_test):
    """Run the bare LLM (no retrieval) on a prompt; print timing and output."""
    time_1 = time()
    sequences = pipeline(
        prompt_to_test, do_sample=True,
        top_k=10, num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
    )
    time_2 = time()
    print(f"Test inference: {round(time_2-time_1, 3)} sec.")
    for seq in sequences:
        print(f"Result: {seq['generated_text']}")

In [6]:
def test_rag(qa, query):
    """Run the retrieval-augmented chain on a query; print timing and output."""
    print(f"Query: {query}\n")
    time_1 = time()
    result = qa.run(query)
    time_2 = time()
    print(f"Inference time: {round(time_2-time_1, 3)} sec.")
    print("\nResult: ", result)

# Model

In [7]:
model = AutoModelForCausalLM.from_pretrained(
    CFG.model_path, quantization_config=bnb_config, do_sample=True,
)
tokenizer = AutoTokenizer.from_pretrained(CFG.model_path)
# Reuse the EOS token for padding and pad on the right
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"


`low_cpu_mem_usage` was None, now set to True since model is quantized.



In [8]:
text_generation_pipeline = transformers.pipeline(
    model=model,
    tokenizer=tokenizer,
    temperature=CFG.temperature,
    task="text-generation",
    repetition_penalty=CFG.repetition_penalty,
    return_full_text=True,
    max_new_tokens=CFG.max_new_tokens,
)

In [9]:
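# Wrap the transformers pipeline so LangChain can call it as an LLM
llm = HuggingFacePipeline(pipeline=text_generation_pipeline)

# Embedding model used to index and query the PDF chunks
embeddings = HuggingFaceEmbeddings(model_name=CFG.model_name, model_kwargs={"device": "cuda"})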



# The data

In [10]:
# Load the PDF (one document per page), then split it into overlapping chunks
loader = PyPDFLoader(CFG.rag_data)
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=20)
all_splits = text_splitter.split_documents(documents)

In [11]:
print(f'We have created {len(all_splits)} chunks from {len(documents)} pages')

We have created 4925 chunks from 892 pages
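
To make the effect of `chunk_size=500` and `chunk_overlap=20` concrete, here is a minimal sketch of overlapping chunking in plain Python. It is a deliberate simplification: `sliding_chunks` is a hypothetical helper that cuts fixed-size character windows, whereas `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, so its real chunks will differ.

```python
def sliding_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 20) -> list[str]:
    """Cut text into fixed-size character windows that overlap.

    Simplified stand-in for a real splitter: it ignores separators.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Synthetic 1200-character sample, only for illustration
sample = "".join(chr(ord("a") + i % 26) for i in range(1200))
chunks = sliding_chunks(sample)
print(len(chunks), [len(c) for c in chunks])
# The tail of one chunk is repeated at the head of the next
print(chunks[0][-20:] == chunks[1][:20])
```

The overlap means a sentence that straddles a chunk boundary still appears, at least in part, in both neighbouring chunks, at the cost of indexing some text twice.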


# RAG setup

In [12]:
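# Embed every chunk and store the vectors in a local Chroma collection
vectordb = Chroma.from_documents(documents=all_splits, embedding=embeddings, persist_directory="chroma_db")
retriever = vectordb.as_retriever()

# "stuff" places all retrieved chunks into a single prompt for the LLM
qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever, verbose=True
)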
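
As a rough illustration of what the retriever plus a "stuff" chain do, the sketch below ranks chunks by cosine similarity to the query and pastes the best ones into one prompt. Everything here is a stand-in: `toy_embed` uses word counts instead of the sentence-transformers model, `retrieve` and `stuff_prompt` are hypothetical helpers rather than LangChain functions, and the value of `k` is arbitrary. The instruction prefix is the one visible in the chain outputs of this notebook; the `Question:` / `Helpful Answer:` suffix is an assumption, since those outputs are cut off before it.

```python
import math
from collections import Counter

# Instruction prefix as it appears in the chain outputs of this notebook
PROMPT_PREFIX = (
    "Use the following pieces of context to answer the question at the end. "
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer."
)


def toy_embed(text: str) -> Counter:
    """ASSUMPTION: word counts stand in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(chunks: list[str], query: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(toy_embed(c), q), reverse=True)
    return ranked[:k]


def stuff_prompt(context_chunks: list[str], question: str) -> str:
    """Concatenate the retrieved chunks into a single prompt."""
    context = "\n\n".join(context_chunks)
    # ASSUMPTION: the suffix below is not visible in the truncated outputs
    return f"{PROMPT_PREFIX}\n\n{context}\n\nQuestion: {question}\nHelpful Answer:"


toy_chunks = [
    "general purpose AI system means an AI system that can be used in a wide range of applications",
    "providers shall put in place a policy to respect Union copyright law",
    "the Commission may reassess systemic risk",
]
toy_query = "what does the act say about copyright law"
top_chunks = retrieve(toy_chunks, toy_query, k=1)
print(stuff_prompt(top_chunks, toy_query))
```

Because every retrieved chunk is pasted into the prompt verbatim, the quality of the answer depends directly on what the retriever returns and on how much context fits in the model's window.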

# Compare performance

In [13]:
query = "What is general purpose AI?"
test_model(tokenizer, text_generation_pipeline, query)

Test inference: 9.612 sec.
Result: What is general purpose AI?
A: Q: General Purpose AI (GPAI) refers to the development of a single, powerful, and versatile artificial intelligence system that can perform a wide range of tasks and adapt to new situations. This type of AI would be capable of understanding natural language, recognizing patterns and images, and making decisions based on its knowledge and experience. It would also be able to learn from its interactions with humans and other systems, allowing it to improve over time. GPAI has the potential to revolutionize many industries, including healthcare, finance, transportation, and education, by automating routine tasks and enabling machines to make decisions and take actions independently.


In [14]:
test_rag(qa, query)

Query: What is general purpose AI?



> Entering new RetrievalQA chain...

> Finished chain.
Inference time: 5.314 sec.

Result:  Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

AI system model that is trained on 
broad data at scale, is designed for 
generality of output, and can be 
adapted to a wide range of 
distinctive tasks;  
   
G 
 Article 3, first paragraph, point (1d new)   
G 128f   
(1d)  ‘general purpose AI system’ 
means an AI system that can be 
used in and adapted to a wide range  
of applications for which it was not 
intentionally and specifically   
G

safety, public security, fundamental 
rights, or the society as a whole, 
that can be propagated at scale 
across the value chain.  
 G 
 Article 3, first paragraph, point (44e)   
G 175ah      
(44e)   ‘general -purpose AI system’ 
means an AI system which is based 
on a general -purpose

In [15]:
query = "What does the EU AI Act say about open source models?"
test_model(tokenizer, text_generation_pipeline, query)

Test inference: 10.534 sec.
Result: What does the EU AI Act say about open source models?

The EU AI Act says that:

1. Open source models should be allowed to access and use EU data, provided that they comply with the conditions of the AI system's licensing agreement or other contractual arrangements.
2. The use of open source models should not be restricted by any intellectual property rights or trade secrets of the model's creators.
3. Member states are encouraged to promote transparency in the development and deployment of AI systems, including through the use of open source models.
4. The EU AI Ethics Guidelines recommend that open source models be made available for public scrutiny, auditing, and reuse.
5. The EU AI Ethics Guidelines also recommend that open source models be designed and developed in a way that ensures their reliability, accuracy, and security.


In [16]:
test_rag(qa, query)

Query: What does the EU AI Act say about open source models?



> Entering new RetrievalQA chain...

> Finished chain.
Inference time: 7.893 sec.

Result:  Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

Article 2(5g)   
G 125h      
5g.  The obligations laid down in 
this Regulation shall not apply to 
AI systems released under free and 
open source licences unless they 
are placed on the market or put into 
service as high -risk AI systems or 
an AI system that falls under Title 
II and IV.  
 G 
 Article 2(5h)   
G 125i   
5e.  This Regulation shall not apply 
to AI components provided under 
free and open -source licences 
except to the extent they are placed

Text Origin: GSC  
 
 Recital 60j   
G 70v     
(60j)   Providers that place general 
purpose AI models on the EU 
market should ensure compliance 
with the  relevant obligations in this 
Regulat

In [17]:
query = "What does the AI Act say about copyright?"
test_model(tokenizer, text_generation_pipeline, query)

Test inference: 7.514 sec.
Result: What does the AI Act say about copyright?
A: Q: The AI Act says that copyright owners have the right to be attributed as authors of works generated by AI systems. This means that if an AI system generates a work, the copyright owner's name would appear as the author. Additionally, copyright owners have the right to control the use of their works in certain ways, such as through licensing or distribution. However, the AI Act also recognizes that AI systems can create new and original works that are not protected by existing copyright laws, which raises questions about how these works should be treated in terms of ownership and control.


In [18]:
test_rag(qa, query)

Query: What does the AI Act say about copyright?



> Entering new RetrievalQA chain...

> Finished chain.
Inference time: 7.109 sec.

Result:  Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

G 517t     
(i)  enable providers of AI systems 
to have a good understanding of the 
capabilities and limitations of the 
general purpose AI model and to 
comply with their obligations 
pursuant to this Regulation; and  
 G 
 Article 52c(1), point (b)(ii)   
G 517u      
(ii)  contain, at a minimum, the 
elements set out in Annex XY;  
 G 
 Article 52c(1), point (c)   
G 517v      
(c)  put in place a policy to respect 
Union copyright law in particular 
to identify and respect, including

not necessarily reveal substantial 
information on the dataset used for 
the training or fine -tuning of  the 
model and on how thereby the 
respect of copyright law was 
ensured, 

In [19]:
query = "What is the AI Act interpretation of systemic risk?"
test_model(tokenizer, text_generation_pipeline, query)

Test inference: 2.831 sec.
Result: What is the AI Act interpretation of systemic risk?
A: Systemic risk refers to a significant disruption in financial markets and economic activity that poses a threat to the stability of the financial system as a whole, rather than just to individual institutions or markets.


In [20]:
test_rag(qa, query)

Query: What is the AI Act interpretation of systemic risk?



> Entering new RetrievalQA chain...

> Finished chain.
Inference time: 3.37 sec.

Result:  Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

systemic risk

as a general -purpose AI model with 
systemic risk, the Commission 
should take the request into 
account an d may decide to reassess 
whether the general -purpose AI 
model can still be considered to 
present systemic risks.  
 
Text Origin: GSC  
 
 Recital 60o   
G 70ab      
(60o)   It is also necessary to clarify 
a procedure for the classification of 
a general purpose AI model with 
systemic risks. A general purpose 
AI model that meets the applicable G

provider of the general purpose AI 
model with systemic risk offers 
commitments to implement 
mitigation meausres to address a 
systemic risk at Union level, the 
Commission may by deci