In [2]:
# Cellule 1 : Installation des librairies nécessaires
!pip install transformers accelerate bitsandbytes


Collecting transformers
  Downloading transformers-4.49.0-py3-none-any.whl.metadata (44 kB)
Collecting accelerate
  Downloading accelerate-1.4.0-py3-none-any.whl.metadata (19 kB)
Collecting bitsandbytes
  Downloading bitsandbytes-0.45.3-py3-none-manylinux_2_24_x86_64.whl.metadata (5.0 kB)
Collecting filelock (from transformers)
  Downloading filelock-3.17.0-py3-none-any.whl.metadata (2.9 kB)
Collecting huggingface-hub<1.0,>=0.26.0 (from transformers)
  Downloading huggingface_hub-0.29.1-py3-none-any.whl.metadata (13 kB)
Collecting tokenizers<0.22,>=0.21 (from transformers)
  Downloading tokenizers-0.21.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Collecting safetensors>=0.4.1 (from transformers)
  Downloading safetensors-0.5.3-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.8 kB)
Collecting torch>=2.0.0 (from accelerate)
  Downloading torch-2.6.0-cp312-cp312-manylinux1_x86_64.whl.metadata (28 kB)
Collecting nvidia-cuda-nvrtc-cu12==12

In [None]:
import huggingface_hub

huggingface_hub.login(token='########', add_to_git_credential=True)

In [15]:
# Cellule 2 : Chargement du modèle et test de génération

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline

# Nom du modèle sur Hugging Face
model_name = "mistralai/Mistral-7B-v0.1"

# Configuration pour la quantification en 4 bits (pour réduire l'empreinte mémoire)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# Chargement du tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Chargement du modèle en 4-bit
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Création d'une pipeline pour la génération de texte
generator = pipeline("text-generation", model=model, tokenizer=tokenizer, device_map="auto")

# Prompt de test
prompt = "Hello, tell me what is artificial intelligence "

# Génération de texte avec quelques paramètres (ajustez max_new_tokens, température, etc. selon vos besoins)
result = generator(
    prompt,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=50
)

# Affichage du résultat
print(result[0]['generated_text'])


Loading checkpoint shards: 100%|██████████| 2/2 [00:05<00:00,  2.78s/it]
Device set to use cuda:0
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Hello, tell me what is artificial intelligence 🙂

Artificial intelligence is an area of computer science that deals with the creation of intelligent machines that work and think like humans.

Artificial intelligence is the simulation of human intelligence processes by machines, especially computer systems. These processes include learning (the acquisition of information and rules for using the information), reasoning (using rules to reach approximate or definite conclusions) and self-correction.

Intelligence is the ability to learn, understand, and solve problems. Artificial intelligence is


# Modèle RAG

In [16]:
!pip install faiss-cpu sentence-transformers
!pip install PyPDF2
!pip install pycryptodome




huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Collecting faiss-cpu
  Downloading faiss_cpu-1.10.0-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (4.4 kB)
Collecting sentence-transformers
  Downloading sentence_transformers-3.4.1-py3-none-any.whl.metadata (10 kB)
Downloading faiss_cpu-1.10.0-cp312-cp312-manylinux_2_28_x86_64.whl (30.7 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m30.7/30.7 MB[0m [31m72.7 MB/s[0m eta [36m0:00:00[0m:00:01[0m00:01[0m
[?25hDownloading sentence_transformers-3.4.1-py3-none-any.whl (275 kB)
Installing collected packages: faiss-cpu, sentence-transformers
Successfully installed faiss-cpu-1.10.0 sentence-transformers-3.4.1


In [1]:
# 2. Extraction du texte du PDF et découpage en chunks
import PyPDF2
import re

# Chemin vers ton PDF
pdf_path = "rules.pdf"


# Extraire le texte de chaque page
pdf_text = ""
with open(pdf_path, "rb") as f:
    reader = PyPDF2.PdfReader(f)
    for page in reader.pages:
        text = page.extract_text()
        if text:
            pdf_text += text + "\n"

# Pour éviter d'avoir des morceaux trop longs, on découpe le texte en chunks 
# en se basant par exemple sur les sauts de ligne multiples ou une taille maximale
def split_text(text, max_length=1000):
    # On découpe d'abord par paragraphes (lignes vides)
    paragraphs = re.split(r'\n\s*\n', text)
    chunks = []
    current_chunk = ""
    for para in paragraphs:
        if len(current_chunk) + len(para) < max_length:
            current_chunk += para + "\n\n"
        else:
            chunks.append(current_chunk.strip())
            current_chunk = para + "\n\n"
    if current_chunk:
        chunks.append(current_chunk.strip())
    return chunks

documents = split_text(pdf_text, max_length=1000)
print(f"Nombre de chunks extraits : {len(documents)}")


unknown widths : 
[0, IndirectObject(582, 0, 140057686764496)]
unknown widths : 
[0, IndirectObject(585, 0, 140057686764496)]
unknown widths : 
[0, IndirectObject(588, 0, 140057686764496)]


Nombre de chunks extraits : 102


In [2]:
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np

# Initialisation du modèle d'embeddings 
embedder = SentenceTransformer('all-MiniLM-L6-v2')

# Calcul des embeddings pour chaque document (page)
doc_embeddings = embedder.encode(documents, convert_to_numpy=True)

# Création de l'index FAISS (index L2)
dimension = doc_embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(doc_embeddings)

def retrieve_documents(query, top_k=2):
    """Récupère les top_k documents les plus pertinents pour une requête donnée."""
    query_embedding = embedder.encode([query], convert_to_numpy=True)
    distances, indices = index.search(query_embedding, top_k)
    retrieved_docs = [documents[i] for i in indices[0]]
    return retrieved_docs


  from .autonotebook import tqdm as notebook_tqdm


In [9]:
def build_prompt(query):
    """
    Construit un prompt en ajoutant en contexte les documents récupérés.
    Exemple de prompt généré :
    
    Contexte:
    [Chunk 1]
    [Chunk 2]
    ...
    
    Question: [ta requête]
    Réponse:
    """
    retrieved_docs = retrieve_documents(query, top_k=3)  # on récupère par exemple les 3 meilleurs chunks
    context = "\n\n".join(retrieved_docs)
    prompt = f"Contexte:\n{context}\n\nQuestion: {query}\nRéponse:"
    return prompt

# Exemple d'utilisation :
query = "What are the dimension of the court ? "
rag_prompt = build_prompt(query)
print("Prompt pour RAG :\n", rag_prompt)


Prompt pour RAG :
 Contexte:
Page 6 of 105 OFFICIAL BASKETBALL RULES 2024 October 2024 
RULE TWO – COURT AND EQUIPMENT Article 2 Court 2.1 Court The court shall have a flat, hard surface free from obstructions (Diagram 1) with dimensions of 28 m in length by 15 m in width measured from the inner edge of the boundary line.  2.2 Floor The floor shall include the court area surrounded by a further boundary lane free from obstructions with a minimum of 2 m in width (Diagram 2). Therefore, the floor shall have dimensions of a minimum of 32 m in length and a minimum of 19 m in width.  2.3 Backcourt A team's backcourt consists of its team's own basket, the inbounds part of the backboard and that part of the court limited by the endline behind its own basket, the sidelines and the centre line.  2.4 Frontcourt A team's frontcourt consists of the opponents' basket, the inbounds part of the backboard and that part of the court limited by the endline behind the opponents' basket, the sidelines and

In [4]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline

# Nom du modèle sur Hugging Face
model_name = "mistralai/Mistral-7B-v0.1"

# Configuration pour la quantification en 4 bits
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# Chargement du tokenizer et du modèle
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Création de la pipeline de génération
generator = pipeline("text-generation", model=model, tokenizer=tokenizer, device_map="auto")

# Génération avec le prompt enrichi
result = generator(
    rag_prompt,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=50
)

print("Réponse générée :\n", result[0]['generated_text'])


Loading checkpoint shards: 100%|██████████| 2/2 [00:04<00:00,  2.46s/it]
Device set to use cuda:0
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Réponse générée :
 Contexte:
Page 6 of 105 OFFICIAL BASKETBALL RULES 2024 October 2024 
RULE TWO – COURT AND EQUIPMENT Article 2 Court 2.1 Court The court shall have a flat, hard surface free from obstructions (Diagram 1) with dimensions of 28 m in length by 15 m in width measured from the inner edge of the boundary line.  2.2 Floor The floor shall include the court area surrounded by a further boundary lane free from obstructions with a minimum of 2 m in width (Diagram 2). Therefore, the floor shall have dimensions of a minimum of 32 m in length and a minimum of 19 m in width.  2.3 Backcourt A team's backcourt consists of its team's own basket, the inbounds part of the backboard and that part of the court limited by the endline behind its own basket, the sidelines and the centre line.  2.4 Frontcourt A team's frontcourt consists of the opponents' basket, the inbounds part of the backboard and that part of the court limited by the endline behind the opponents' basket, the sidelines and

In [22]:
def poser_question_et_repondre():
    # Demande à l'utilisateur de saisir une question
    question = input("Veuillez entrer votre question : ")
    
    # Construction du prompt enrichi en ajoutant le contexte récupéré
    prompt = build_prompt(question)
    print("\n--- Prompt généré pour RAG ---\n")
    print(prompt)
    
    # Génération de la réponse par le modèle
    result = generator(
        prompt,
        max_new_tokens=150,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        top_k=50
    )
    
    # Affichage de la réponse générée
    print("\n--- Réponse générée ---\n")
    print(result[0]['generated_text'])

# Appel de la fonction pour lancer le processus interactif
poser_question_et_repondre()


Veuillez entrer votre question :  What is travelling ? 


Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.



--- Prompt généré pour RAG ---

Contexte:
Page 64 of 105 OFFICIAL BASKETBALL RULES 2024 October 2024 
Violations TRAVELLING  ILLEGAL DRIBBLE: DOUBLE DRIBBLING  ILLEGAL DRIBBLE: CARRYING THE BALL 
     Rotate fists  Patting motion with palm  Half rotation with palm  3 SECONDS  5 SECONDS  8 SECONDS  SHOT CLOCK 
       Wave arm, show 3 fingers  Show 5 fingers  Show 8 fingers  Fingers touch shoulder  BALL RETURNED TO BACKCOURT  DELIBERATE KICK OR BLOCK OF THE BALL  GOALTENDING/ BASKET INTERFERENCE 

 Wave arm front of body  Point to the foot  Rotate finger, extend index finger over the other hand with a circle   
17 18 19 
23 
24 25 21 22 
26 20

Page 98 of 105 OFFICIAL BASKETBALL RULES 2024 October 2024 
Centre line .................................................................................................................................................... 6 Charging ....................................................................................................................

# J'essaie d'améliorer la pertinence des réponses en utilisant rank_bm25 qui se base sur la fréquence et l'importance des mots dans les documents. Elle est robuste pour trouver des correspondances exactes et peut être utile lorsque le vocabulaire technique est important.

In [11]:
!pip install rank_bm25


huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Collecting rank_bm25
  Downloading rank_bm25-0.2.2-py3-none-any.whl.metadata (3.2 kB)
Downloading rank_bm25-0.2.2-py3-none-any.whl (8.6 kB)
Installing collected packages: rank_bm25
Successfully installed rank_bm25-0.2.2


In [12]:
from rank_bm25 import BM25Okapi

# Prétraitement simple : on tokenize chaque document (en minuscules et en splittant sur les espaces)
tokenized_docs = [doc.lower().split() for doc in documents]

# Création de l'instance BM25
bm25 = BM25Okapi(tokenized_docs)


In [13]:
def retrieve_documents_combined(query, top_k=2, alpha=0.5):
    """
    Combine les scores de la recherche dense et BM25.
    alpha est le poids accordé à la recherche dense (entre 0 et 1).
    """
    # Scores dense
    query_embedding = embedder.encode([query], convert_to_numpy=True)
    distances, indices = index.search(query_embedding, len(documents))
    # Inverser la distance pour obtenir une mesure où plus le score est élevé, mieux c'est
    dense_scores = 1 / (1 + distances[0])
    
    # Scores BM25
    tokenized_query = query.lower().split()
    bm25_scores = np.array(bm25.get_scores(tokenized_query))
    # Normalisation BM25
    bm25_scores = (bm25_scores - bm25_scores.min()) / (bm25_scores.max() - bm25_scores.min() + 1e-8)
    
    # Calcul d'un score combiné
    combined_scores = alpha * dense_scores + (1 - alpha) * bm25_scores
    
    # Sélectionne les top_k documents selon le score combiné
    sorted_indices = np.argsort(combined_scores)[::-1][:top_k]
    retrieved_docs = [documents[i] for i in sorted_indices]
    return retrieved_docs


In [19]:
def build_prompt_bm25(query):
    """
    Construit un prompt en ajoutant en contexte les documents récupérés.
    Exemple de prompt généré :
    
    Contexte:
    [Chunk 1]
    [Chunk 2]
    ...
    
    Question: [ta requête]
    Réponse:
    """
    retrieved_docs = retrieve_documents_combined(query, top_k=3)  # on récupère par exemple les 3 meilleurs chunks
    context = "\n\n".join(retrieved_docs)
    prompt = f"Contexte:\n{context}\n\nQuestion: {query}\nRéponse:"
    return prompt


# Exemple d'utilisation :
query = "Travelling"
rag_prompt2 = build_prompt_bm25(query)
print("Prompt pour RAG :\n", rag_prompt2)


Prompt pour RAG :
 Contexte:
Page 30 of 105 OFFICIAL BASKETBALL RULES 2024 October 2024 
A dribble ends when the player touches the ball with both hands simultaneously or permits the ball to come to rest in one or both hands. 24.1.3 A player who accidentally loses and then regains control of a live ball on the court is considered to have fumbled the ball.  24.1.4 The following are not dribbles: • Successive shots for a goal. • Fumbling the ball at the start or at the end of a dribble. • Attempts to gain control of the ball by tapping it from the vicinity of other players. • Tapping the ball from the control of another player. • Deflecting a pass and gaining control of the ball. • Tossing the ball from hand to hand and allowing it to come to rest in one or both hands before touching the court, provided that no travelling violation is committed. • Throwing the ball against the backboard and regaining the control of the ball. 24.2 Rule A player shall not dribble for a second time after th

In [21]:
def poser_question_et_repondre2():
    # Demande à l'utilisateur de saisir une question
    question = input("Veuillez entrer votre question : ")
    
    # Construction du prompt enrichi en ajoutant le contexte récupéré
    prompt = build_prompt_bm25(question)
    print("\n--- Prompt généré pour RAG ---\n")
    print(prompt)
    
    # Génération de la réponse par le modèle
    result = generator(
        prompt,
        max_new_tokens=150,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        top_k=50
    )
    
    # Affichage de la réponse générée
    print("\n--- Réponse générée ---\n")
    print(result[0]['generated_text'])

# Appel de la fonction pour lancer le processus interactif
poser_question_et_repondre2()


Veuillez entrer votre question :  What is travelling ? 


Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.



--- Prompt généré pour RAG ---

Contexte:
Page 30 of 105 OFFICIAL BASKETBALL RULES 2024 October 2024 
A dribble ends when the player touches the ball with both hands simultaneously or permits the ball to come to rest in one or both hands. 24.1.3 A player who accidentally loses and then regains control of a live ball on the court is considered to have fumbled the ball.  24.1.4 The following are not dribbles: • Successive shots for a goal. • Fumbling the ball at the start or at the end of a dribble. • Attempts to gain control of the ball by tapping it from the vicinity of other players. • Tapping the ball from the control of another player. • Deflecting a pass and gaining control of the ball. • Tossing the ball from hand to hand and allowing it to come to rest in one or both hands before touching the court, provided that no travelling violation is committed. • Throwing the ball against the backboard and regaining the control of the ball. 24.2 Rule A player shall not dribble for a second

In [24]:
poser_question_et_repondre2()

Veuillez entrer votre question :  what are the different type of fouls ? 


Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.



--- Prompt généré pour RAG ---

Contexte:
October 2024 OFFICIAL BASKETBALL RULES 2024 Page 71 of 105 B.1 The scoresheet shown in Diagram 9 is the one approved by the FIBA Technical Commission.  B.2 It consists of 1 original and 3 copies, each to be of different coloured paper. The origi-nal on white paper is for FIBA. The first copy on blue paper is for the organising body of the competition, the second copy on pink paper is for the winning team, and the last copy, on yellow paper, is for the losing team. Note: 1. The scorer shall use 2 different coloured pens, RED for the first and third quarter and BLUE or BLACK for the second and fourth quarter. For all overtimes, all entries shall be made in BLUE or BLACK (same colour as for the second and fourth quarter). 2. The scoresheet may be prepared and completed electronically. B.3 At least 40 minutes before the game is scheduled to start, the scorer shall prepare the scoresheet in the following manner: B.3.1 The scorer shall enter the nam