# RAGondin
Prototype notebook for building a working RAG (retrieval-augmented generation) pipeline.

In [None]:
import sys
!{sys.executable} -m pip install  numpy pymilvus accelerate bitsandbytes
!{sys.executable} -m pip install  langchain langchain_experimental unstructured pillow_heif unstructured_inference pytesseract unstructured_pytesseract pikepdf timm
!{sys.executable} -m pip install  pypdf pdf2image pdfminer pdfminer-six pypdfium2 pdfplumber
!{sys.executable} -m pip install  rapidocr-onnxruntime
!{sys.executable} -m pip install  torch transformers sentence-transformers qdrant-client
!{sys.executable} -m pip install  ragatouille 

In [None]:
import sys
!{sys.executable} -m pip install langchain_openai

## Data Processing

### Uniformising Data

In [8]:
from langchain_community.document_loaders import (
    CSVLoader,
    DirectoryLoader,
    MergedDataLoader,
    PDFMinerLoader,
    TextLoader,
    UnstructuredHTMLLoader,
    UnstructuredXMLLoader,
)

# Define a dictionary to map file extensions to their respective loaders
loaders = {
    '.pdf': PDFMinerLoader,
    '.xml': UnstructuredXMLLoader,
    '.csv': CSVLoader,
    '.txt': TextLoader,
    '.html': UnstructuredHTMLLoader,
}

# Define a function to create a DirectoryLoader for a specific file type
def create_directory_loader(file_type, directory_path):
    return DirectoryLoader(
        path=directory_path,
        glob=f"**/*{file_type}",
        loader_cls=loaders[file_type],
    )

directory_path = "../../test_data/"
# Create DirectoryLoader instances for each file type
pdf_loader = create_directory_loader('.pdf', directory_path)
xml_loader = create_directory_loader('.xml', directory_path)
csv_loader = create_directory_loader('.csv', directory_path)
txt_loader = create_directory_loader('.txt', directory_path)
html_loader = create_directory_loader('.html', directory_path)

#loader_all = MergedDataLoader(loaders=[pdf_loader, xml_loader,csv_loader,txt_loader,html_loader])

# Load the files
pdf_documents = pdf_loader.load()
xml_documents = xml_loader.load()
csv_documents = csv_loader.load()
txt_documents = txt_loader.load()
html_documents = html_loader.load()

docs = pdf_documents + xml_documents + csv_documents + txt_documents + html_documents
#docs2 = loader_all.load()
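
Before running the loaders, it can help to check which files under the data directory the extension mapping actually covers. The following is a minimal sketch using only the standard library; `SUPPORTED_EXTENSIONS` and `group_files_by_extension` are hypothetical names introduced for illustration and are not part of LangChain.

```python
from collections import defaultdict
from pathlib import Path

# Assumption: mirrors the keys of the `loaders` dictionary used in this notebook.
SUPPORTED_EXTENSIONS = {".pdf", ".xml", ".csv", ".txt", ".html"}


def group_files_by_extension(directory_path):
    """Return (supported, unsupported) dicts mapping a suffix to its file paths."""
    supported = defaultdict(list)
    unsupported = defaultdict(list)
    # rglob("*") walks the directory recursively, like the "**/*" glob pattern
    for path in sorted(Path(directory_path).rglob("*")):
        if not path.is_file():
            continue
        target = supported if path.suffix in SUPPORTED_EXTENSIONS else unsupported
        target[path.suffix].append(path)
    return dict(supported), dict(unsupported)
```

Any suffix that ends up in the second dictionary would be silently skipped by the per-extension loaders.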

### Chunking

In [9]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=100)

chunked_docs = splitter.split_documents(docs)
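
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of fixed-size character windows with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to cut on separators such as paragraph and line breaks); `sliding_window_chunks` is a hypothetical helper that only illustrates the two parameters.

```python
def sliding_window_chunks(text, chunk_size=1024, chunk_overlap=100):
    """Split text into character windows of at most chunk_size that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


print(sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
# → ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last `chunk_overlap` characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when it is short enough.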

In [10]:
print(chunked_docs[10].page_content)

Valeur en Points

Chaque  figurine  possède  une  valeur  en  points, 
qui est citée dans son profil. Cette valeur permet de dé-
terminer l’impact de la figurine à la bataille. Un simple 
milicien Hobbit coûte ainsi 4pts, tandis que Sauron en 
vaut 400 ! Certaines figurines valent beaucoup de points, 
car  elles  sont  capables  d’éliminer  des  douzaines  d’ad-
versaires  en  quelques  tours,  d’autres  sont  plus  utiles 
pour renforcer leurs alliés, etc. 
En cumulant les coûts en points de vos figurines, vous obtenez la valeur totale de votre armée. Cela vous 
permet de disputer des parties équitables : il faut beaucoup de Hobbits pour espérer mettre à bas le Seigneur 
des Ténèbres !

Taille de la Partie


### Embeddings

In [11]:
from langchain_community.vectorstores import Qdrant
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.docstore.document import Document as LangchainDocument


EMBEDDING_MODEL_NAME = "thenlper/gte-small"

embedding_model = HuggingFaceEmbeddings(
    model_name=EMBEDDING_MODEL_NAME,
    multi_process=True,
    model_kwargs={"device": "cuda"},
    encode_kwargs={"normalize_embeddings": True},  # set True for cosine similarity
)
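
The `normalize_embeddings` comment relies on a simple identity: once two vectors are scaled to unit length, their dot product equals their cosine similarity. The sketch below checks this on toy two-dimensional vectors with the standard library only; the helper names are hypothetical and the vectors are not real embeddings.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(dot(vector, vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b, computed from the raw vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [3.0, 4.0], [4.0, 3.0]
# Both expressions give the same value (up to floating-point rounding)
print(cosine_similarity(a, b), dot(l2_normalize(a), l2_normalize(b)))
```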

### Vector Database

In [12]:
from qdrant_client import QdrantClient

host = "10.0.0.177"
port = "6333"
url = f"http://{host}:{port}"
client = QdrantClient(url=url, prefer_grpc=False)

# from_documents is a classmethod: it connects using the keyword arguments it
# receives, so the server URL and collection name must be passed explicitly.
KNOWLEDGE_VECTOR_DATABASE = Qdrant.from_documents(
    chunked_docs, embedding_model, url=url, prefer_grpc=False, collection_name="texts"
)
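
Conceptually, a similarity search against the collection returns the `k` stored vectors closest to the query vector. Qdrant answers such queries with an approximate index (HNSW); the sketch below instead performs an exact brute-force search over an in-memory list, assuming unit-length vectors so that the dot product is the cosine similarity. `top_k` and the toy points are hypothetical.

```python
import heapq


def top_k(query, points, k=3):
    """Return the k best (score, payload) pairs, highest score first.

    points is a list of (payload, vector) tuples with unit-length vectors.
    """
    scored = (
        (sum(q * v for q, v in zip(query, vector)), payload)
        for payload, vector in points
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


points = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.6, 0.8])]
print(top_k([1.0, 0.0], points, k=2))
# → [(1.0, 'a'), (0.6, 'c')]
```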

## LLM Block
This section sets up the LLM that generates the answers.

### LLM Model

In [6]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
from langchain_community.llms import HuggingFacePipeline


READER_MODEL_NAME = "openchat/openchat-3.5-0106"

# 4-bit NF4 quantization with double quantization; compute runs in bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    READER_MODEL_NAME,
    quantization_config=bnb_config,
    low_cpu_mem_usage=True,
)
tokenizer = AutoTokenizer.from_pretrained(READER_MODEL_NAME)

pipe = pipeline(
    model=model,
    tokenizer=tokenizer,
    task="text-generation",
    do_sample=True,
    temperature=0.1,
    repetition_penalty=1.1,
    return_full_text=False,
    max_new_tokens=10000,
)

hf = HuggingFacePipeline(pipeline=pipe)


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
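
The point of loading the model in 4-bit is the reduction in weight memory. The back-of-the-envelope sketch below makes that arithmetic explicit. It is only a rough estimate: the figure of about 7 billion parameters is an assumption about the reader model, and the calculation ignores quantization constants, activations, and the KV cache. `weight_memory_gib` is a hypothetical helper.

```python
def weight_memory_gib(num_parameters, bits_per_weight):
    """Approximate memory taken by the weights alone, in GiB."""
    return num_parameters * bits_per_weight / 8 / 1024**3


ASSUMED_PARAMETERS = 7e9  # assumption: roughly 7 billion parameters

for bits in (16, 4):
    print(f"{bits}-bit weights: {weight_memory_gib(ASSUMED_PARAMETERS, bits):.1f} GiB")
```

Under these assumptions the weights drop from roughly 13 GiB at 16 bits to a little over 3 GiB at 4 bits.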


In [None]:
output = pipe("Quelle est la capitale de la France ?")

### Pipeline for the RAG

#### Prompt config

In [13]:
from langchain.prompts import PromptTemplate
prompt_template = """
<|system|>
Answer the question in french only using the following french context to help:

{context}

</s>
<|user|>
{question}
</s>
<|assistant|>

 """

RAG_PROMPT_TEMPLATE = PromptTemplate(
    input_variables=["context", "question"],
    template=prompt_template,
)
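
The `{context}` and `{question}` slots are filled at query time: the text of each retrieved chunk is joined into one string and substituted for `{context}`. The sketch below shows that substitution with plain string formatting, assuming a blank line as the separator between chunks (the default separator of LangChain's `StuffDocumentsChain`). `stuff_prompt` and the shortened template are hypothetical, used only for illustration.

```python
def stuff_prompt(template, page_contents, question, separator="\n\n"):
    """Join the chunk texts and substitute them into the prompt template."""
    context = separator.join(page_contents)
    return template.format(context=context, question=question)


# Shortened stand-in for the notebook's prompt template
demo_template = "<|system|>\n{context}\n</s>\n<|user|>\n{question}\n</s>\n<|assistant|>\n"

print(stuff_prompt(demo_template, ["First chunk.", "Second chunk."], "A question?"))
```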


### Reranking Option

### Assembling

In [14]:
from transformers import Pipeline
from typing import Optional, Tuple, List
from langchain.chains import SimpleSequentialChain, LLMChain, StuffDocumentsChain
from langchain.retrievers import ContextualCompressionRetriever
from langchain_core.output_parsers import StrOutputParser
from ragatouille import RAGPretrainedModel

RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")

def answer_with_rag(
    question: str,
    llm: HuggingFacePipeline,
    knowledge_index: Qdrant,
    num_retrieved_docs: int = 10,
    num_returned_docs: int = 5,
    rerank: bool = True,
) -> Tuple[str, List[LangchainDocument]]:
    # Gather documents with retriever
    print("=> Retrieving documents...")
    
    retriever = knowledge_index.as_retriever(search_kwargs={"k": num_retrieved_docs})

    if rerank:
        compression_retriever = ContextualCompressionRetriever(base_compressor=RAG.as_langchain_document_compressor(),
                                                           base_retriever=retriever)
        relevant_docs = compression_retriever.invoke(question)
    else:
        relevant_docs = retriever.invoke(question)

    # Keep only the top-ranked documents
    relevant_docs = relevant_docs[:num_returned_docs]

    # Build the "stuff" chain: the text of every retrieved chunk is inserted
    # into the {context} slot of the RAG prompt
    document_prompt = PromptTemplate(
        input_variables=["page_content"],
        template="{page_content}",
    )

    # Use the LLM passed as an argument rather than the global `hf`
    llm_chain = LLMChain(
        llm=llm,
        prompt=RAG_PROMPT_TEMPLATE,
        output_parser=StrOutputParser(),
        verbose=True,
    )
    stuff_chain = StuffDocumentsChain(
        llm_chain=llm_chain,
        document_prompt=document_prompt,
        document_variable_name="context",
        verbose=True,
    )

    # Generate an answer
    print("=> Generating answer...")
    answer = stuff_chain.run(question=question, input_documents=relevant_docs)

    return answer, relevant_docs

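def answer_without_rag(
    question: str,
    llm: Pipeline,
) -> Tuple[str, List[LangchainDocument]]:
    # Baseline: query the raw transformers pipeline with no retrieved context
    answer = llm(question)[0]["generated_text"]

    # No documents are retrieved here, so return an empty list
    return answer, []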
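
The retrieval logic in `answer_with_rag` is a two-stage pattern: a cheap first stage fetches `num_retrieved_docs` candidates, a more expensive reranker reorders them, and only the best `num_returned_docs` reach the prompt. The sketch below reproduces that control flow with toy scoring functions on plain strings. The ColBERT reranker and the vector search are swapped for hypothetical stand-ins (`word_overlap` and a caller-supplied `rerank_score`), so it only illustrates the shape of the pipeline.

```python
def word_overlap(question, document):
    """Toy first-stage score: number of lowercase words shared with the question."""
    return len(set(question.lower().split()) & set(document.lower().split()))


def retrieve_then_rerank(
    question,
    documents,
    first_stage_score,
    rerank_score,
    num_retrieved_docs=10,
    num_returned_docs=5,
):
    """Fetch candidates with a cheap score, reorder them, keep the best few."""
    candidates = sorted(
        documents, key=lambda doc: first_stage_score(question, doc), reverse=True
    )[:num_retrieved_docs]
    reranked = sorted(
        candidates, key=lambda doc: rerank_score(question, doc), reverse=True
    )
    return reranked[:num_returned_docs]
```

A larger `num_retrieved_docs` gives the reranker more candidates to choose from, at the cost of more reranking work per question.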

In [15]:
question = "Quelles sont les 5 phases du jeu MESBG ?"
answer, relevant_docs = answer_with_rag(question, hf, KNOWLEDGE_VECTOR_DATABASE, num_retrieved_docs=10)
print(answer)

=> Retrieving documents...




=> Generating answer...


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
<|system|>
Answer the question in french only using the following french context to help:

Le Jeu

Que ce soit en jouant des scénarios historiques ou bien compétitifs, la manière de jouer est la même.

La partie est séparée est en une succession de tours. Chaque tour est composée de 5 phases :

Initiative : c’est la phase qui détermine quel joueur va jouer en premier pour le tour.

Mouvement : c’est la phase la plus importante où quasiment tout ce décide. Les figurines auront le droit de se déplacer, faire de la magie, de charger, etc. On commencera toujours par le joueur qui a l’initiative. Une fois qu’il a terminé, l’autre joueur pourra bouger ses figurines.

Tir : c’est la phase où les figurines pourront utiliser leurs armes de tir (arc, arbalète, sarbacane ou engin de siège,…). On commence par le joueur qui a l’initiative. E

In [16]:
question2 = "Comment jouer un mumakil?"
answer, relevant_docs = answer_with_rag(question2, hf, KNOWLEDGE_VECTOR_DATABASE, num_retrieved_docs=15)
print(answer)

=> Retrieving documents...




=> Generating answer...


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
<|system|>
Answer the question in french only using the following french context to help:

R : Oui. 

R : Non.

Q : D’autres figurines que le Chef de Guerre des Mûmakil ou le 
Chef  Commandant  Mahûd  peuvent-elles  être  placées  sur  la 
partie fourchue en dehors du Howdah ?

R : Non, les autres figurines doivent être placées dans le Howdah.

Q : Est-ce que l’option des Peintures de Guerre sur le Mûmak de 
Guerre du Harad affecte le Mûmak également ?

R : Oui. 

Q  :  Peut-on  échanger  l’épée  du  Gardien  de  Karnâ  pour  un 
autre  équipement,  et  bénéficier  tout  de  même  de  l’attaque 
supplémentaire  conférée  par  l’équipement  et  la  règle Doubles 
Lames  s’il utilise un coup spécial associé à ce nouvel équipement ?

R  :  L’équipement Doubles Lames  confère  en  fait  au  Gardien  de 
Karnâ  une  seconde  épée.  L