<a href="https://colab.research.google.com/github/nizarmahmoudi/legal-assistant-Simple-RAG-/blob/main/Simple_RAG_Demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1. Installation and Setup

In [1]:
!nvidia-smi

Sat Dec 28 15:50:56 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   42C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [2]:
!pip install pypdf
!pip install -q transformers einops accelerate langchain bitsandbytes
!pip install sentence_transformers
!pip install llama-index==0.9.39
!pip install -U langchain-community



# 2. Data Preparation

In [6]:
import os
from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
from llama_index.llms import HuggingFaceLLM
from llama_index.prompts.prompts import SimpleInputPrompt

Create data directory


In [8]:
!mkdir data

Load documents from the directory

In [7]:
documents = SimpleDirectoryReader("/content/data").load_data()

In [8]:
documents

[Document(id_='83752255-6700-41d3-bc37-0f369281da5a', embedding=None, metadata={'page_label': '1', 'file_name': 'L_exercice_du_commerce_par_des_etrangers_en_Tunisie.pdf', 'file_path': '/content/data/L_exercice_du_commerce_par_des_etrangers_en_Tunisie.pdf', 'file_type': 'application/pdf', 'file_size': 121197, 'creation_date': '2024-12-28', 'last_modified_date': '2024-12-28', 'last_accessed_date': '2024-12-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text=' 1  \n \nL’exercice du commerce par les étrangers en Tunisie  \n \n \n        Par Mahmoud Anis BETTAIEB \n \n \nDans le passé, le commerçant pouvait se \ndéplacer librement et exercer son activité ou \nbon lui semblait.  C’est grâce à cette mobilité et \nnotamment à l’existence des foires médiévales \

# 3. Prompt Definition

In [9]:
system_prompt = """
Vous êtes un assistant juridique expert en droit commercial. Votre objectif est de répondre
de manière précise, claire et complète aux questions concernant les lois et réglementations
commerciales, en tenant compte des informations disponibles.
"""

query_wrapper_prompt = SimpleInputPrompt(
    "<|UTILISATEUR|>{query_str}<|ASSISTANT|>Veuillez fournir une réponse précise et complète :"
)

# 4. Model Configuration

Login to Hugging Face CLI

In [10]:
!huggingface-cli login


    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
    Setting a new token will erase the existing one.
    To log in, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible): 
Add token as git credential? (Y/n) Y
Token is valid (permission: fineG

In [11]:
from transformers import BitsAndBytesConfig
import torch

# Quantization configuration
quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,  # Enable 8-bit quantization
    llm_int8_enable_fp32_cpu_offload=True  # Allow offloading to CPU for FP32 layers
)

# Load Mistral model with updated configuration
llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=256,
    generate_kwargs={
        "temperature": 0.7,
        "top_p": 0.9,
        "do_sample": True,
        "repetition_penalty": 1.2,
    },
    system_prompt=system_prompt,
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name="mistralai/Mistral-7B-v0.1",
    model_name="mistralai/Mistral-7B-v0.1",
    device_map="auto",
    model_kwargs={
        "torch_dtype": torch.float16,
        "quantization_config": quantization_config  # Updated quantization config
    }
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

# 5. Embedding Model Setup

In [13]:
from langchain.embeddings.huggingface import HuggingFaceBgeEmbeddings
from llama_index import ServiceContext
from llama_index.embeddings import LangchainEmbedding

**Initialize embedding model**

In [14]:
embed_model = LangchainEmbedding(
    HuggingFaceBgeEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
)

  HuggingFaceBgeEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")


# 6. Service Context and Index Creation

In [15]:
service_context = ServiceContext.from_defaults(
    chunk_size = 1024,
    llm = llm,
    embed_model = embed_model
)

**Build index from documents**

In [16]:
index = VectorStoreIndex.from_documents(documents , service_context=service_context)

# 7. Query Execution

In [17]:
query_engine = index.as_query_engine()

In [18]:
query = "c'est quoi Les restrictions à l exercice du commerce par les étrangers  ?"
response = query_engine.query(query)
print(response)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


"Les restrictions à l’exercice du commerce par les étrangers sont variées selon le pays et peuvent inclure des limites sur le nombre d’entreprises que les étrangers peuvent détenir, des limitations sur les secteurs où ils peuvent travailler, ainsi que des exigences de capital minimum."
