<a href="https://colab.research.google.com/github/rvats20/RAG-LLM-APP/blob/main/Multi_agentic_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# MULTI-AGENTIC RAG WITH HUGGING FACE CODE AGENTS


This notebook accompanies the blog post Multi-agentic RAG with Hugging Face Code Agents. The full code implementation can be found in the GitHub repo https://github.com/GabrieleSgroi/multiagentic_rag.

Code agents work by executing Python code. Run the notebook in an isolated environment and do not allow any unsafe imports or functions. Always supervise execution by inspecting the logs, and interrupt it if the system tries to execute code that may pose risks.
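As an illustration of what "don't allow any unsafe import" can mean in practice, the sketch below statically scans a code snippet for imports outside an allowlist before it is run. The helper name `find_disallowed_imports` and the allowlist contents are hypothetical and are not part of the `multirag` package. A static scan like this is only a pre-check: it does not catch dynamic imports and is no substitute for an isolated environment.

```python
import ast

# Hypothetical allowlist: adjust it to what your agents genuinely need.
ALLOWED_MODULES = {"math", "re", "json"}


def find_disallowed_imports(source, allowed=ALLOWED_MODULES):
    """Return the top-level module names imported by `source` that are not allowed."""
    disallowed = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have no module name; they are reported as "".
            names = [node.module or ""]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root not in allowed:
                disallowed.append(root)
    return disallowed


snippet = "import os\nimport math\nfrom subprocess import run"
print(find_disallowed_imports(snippet))
```

A snippet that yields a non-empty list would be rejected before execution; an empty list only means that no disallowed `import` statement was found.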

Like all systems based on large language models, this one is prone to hallucination. Always check the sources used to generate the answers and don't trust the generated answers blindly.

By default, the agents are powered by the model *Qwen2.5-7B-Instruct*. See the [model card](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) for licence, limitations, and other information.

The default embedding model for similarity retrieval is all-mpnet-base-v2; see the corresponding [model card](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) as well.
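Similarity retrieval ranks passages by how close their embedding vectors are to the embedding of the query. The sketch below shows the idea with hand-made 3-dimensional vectors standing in for real all-mpnet-base-v2 embeddings (which are 768-dimensional). It assumes cosine similarity as the closeness measure; the function names and toy data are hypothetical and this is not the retrieval code used by the repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, passage_vecs, k=2):
    """Return the names of the k passages most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in passage_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy vectors, made up for illustration only.
toy_passages = {
    "plywood density": [0.9, 0.1, 0.0],
    "aircraft plywood": [0.6, 0.6, 0.1],
    "baltic sea": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.0, 0.0]

print(top_k(toy_query, toy_passages))
```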

# Install and Imports


In [None]:
!pip install git+https://github.com/GabrieleSgroi/multiagentic_rag/

Collecting git+https://github.com/GabrieleSgroi/multiagentic_rag/
  Cloning https://github.com/GabrieleSgroi/multiagentic_rag/ to /tmp/pip-req-build-93ad45z9
  Running command git clone --filter=blob:none --quiet https://github.com/GabrieleSgroi/multiagentic_rag/ /tmp/pip-req-build-93ad45z9
  Resolved https://github.com/GabrieleSgroi/multiagentic_rag/ to commit 8bc3b5aa78672f8e715042fc5defdc872ff32687
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting fastapi (from multiagentic-rag==0.0.1)
  Downloading fastapi-0.115.6-py3-none-any.whl.metadata (27 kB)
Collecting bitsandbytes (from multiagentic-rag==0.0.1)
  Downloading bitsandbytes-0.45.0-py3-none-manylinux_2_24_x86_64.whl.metadata (2.9 kB)
Collecting langchain-community (from multiagentic-rag==0.0.1)
  Downloading langchain_community-0.3.13-py3-none-any.whl.metadata (2.9 kB)
Collecting langchain-huggingfa

In [None]:
from transformers import BitsAndBytesConfig, AutoTokenizer, AutoModelForCausalLM
from multirag.multiagentic_system import MultiAgenticRAG
import torch

# Load model


In [None]:
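# 4-bit NF4 quantization with double quantization; computation runs in bfloat16.
nf4_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_quant_type="nf4",
                                bnb_4bit_use_double_quant=True,
                                bnb_4bit_compute_dtype=torch.bfloat16)
# Download the tokenizer that matches the model checkpoint.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct",
                                          trust_remote_code=True)
# Load the quantized model onto the GPU (a CUDA device is required).
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct",
                                              device_map="cuda",
                                              torch_dtype=torch.float16,
                                              trust_remote_code=True,
                                              quantization_config=nf4_config)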
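To get a feel for why 4-bit loading matters, the rough estimate below compares the weight footprint at half precision and at 4 bits. It assumes the four checkpoint shards total about 15.2 GB (3.95, 3.86, 3.86 and 3.56 GB, the sizes reported when this notebook downloaded them) and that the checkpoint stores 2 bytes per parameter. The function name is hypothetical. The 4-bit figure is an optimistic estimate: layers left unquantized, quantization constants, activations, and the generation cache all add to it.

```python
# Shard sizes in GB, as reported when the checkpoint was downloaded.
SHARD_SIZES_GB = [3.95, 3.86, 3.86, 3.56]


def estimate_weight_footprints(shard_sizes_gb, bytes_per_param_stored=2.0):
    """Return (half_precision_gb, approx_params_in_billions, four_bit_gb)."""
    half_precision_gb = sum(shard_sizes_gb)
    # Assumption: the checkpoint stores every parameter in 2 bytes.
    params_billion = half_precision_gb / bytes_per_param_stored
    # 4 bits = 0.5 bytes per parameter, ignoring all overheads.
    four_bit_gb = params_billion * 0.5
    return half_precision_gb, params_billion, four_bit_gb


half_gb, params_b, four_bit_gb = estimate_weight_footprints(SHARD_SIZES_GB)
print(f"half precision: ~{half_gb:.1f} GB")
print(f"parameters:     ~{params_b:.1f} billion")
print(f"4-bit weights:  ~{four_bit_gb:.1f} GB (before overheads)")
```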


In [None]:
multiagentic_rag = MultiAgenticRAG(model=model, tokenizer=tokenizer)


# Multi-agentic RAG

In [None]:
answer = multiagentic_rag("Does birch plywood float in ethanol?")

Does birch plywood float in ethanol?
=== Agent thoughts:
Thought: To determine if birch plywood floats in ethanol, we need to know the density of both birch plywood and ethanol. If the density of birch plywood is less than the density of ethanol, it will float. I will use the `wikipedia_search_agent` tool to find the densities of birch plywood and ethanol.
>>> Agent is executing the code below:
birch_plywood_density_info = wikipedia_search_agent("birch plywood density")
ethanol_density_info = wikipedia_search_agent("ethanol density")
print(birch_plywood_density_info)
prin

In [None]:
answer

'Yes, birch plywood will float in ethanol.'

## Manager agent logs

In [None]:
for s in multiagentic_rag.manager_agent_logs['steps']:
  print("AGENT: ", s['llm_output'])
  print("OBSERVATION: ", s['observation'])

AGENT:  Thought: To determine if birch plywood floats in ethanol, we need to know the density of both birch plywood and ethanol. If the density of birch plywood is less than the density of ethanol, it will float. I will use the `wikipedia_search_agent` tool to find the densities of birch plywood and ethanol.
Code:
```py
birch_plywood_density_info = wikipedia_search_agent("birch plywood density")
ethanol_density_info = wikipedia_search_agent("ethanol density")
print(birch_plywood_density_info)
print(ethanol_density_info)
```<end_action>
OBSERVATION:  Print outputs:
The density of birch plywood is approximately 680 kg/m³.
According to the information extracted from the Wikipedia page 'Alcohol by volume', the density of ethanol is 0.78945 g/mL at 20°C.

AGENT:  Thought: Now that I have the densities, I can compare them to determine if birch plywood will float in ethanol. Birch plywood has a density of 680 kg/m³ and ethanol has a density of 0.78945 g/mL. Since 1 g/mL = 1000 kg/m³, the dens
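The comparison the manager agent sets up can be reproduced by hand. The sketch below converts the ethanol density from g/mL to kg/m³ (1 g/mL = 1000 kg/m³) and checks whether the plywood is less dense than the liquid. The two density values are the ones retrieved in the logs above; the function name is hypothetical, and the check considers density only.

```python
G_PER_ML_TO_KG_PER_M3 = 1000.0  # 1 g/mL = 1000 kg/m³


def floats_in(solid_density_kg_m3, liquid_density_kg_m3):
    """A solid floats if it is less dense than the liquid."""
    return solid_density_kg_m3 < liquid_density_kg_m3


birch_plywood_kg_m3 = 680.0                       # from the "Plywood" page
ethanol_kg_m3 = 0.78945 * G_PER_ML_TO_KG_PER_M3   # 0.78945 g/mL at 20°C

print(f"birch plywood: {birch_plywood_kg_m3:.2f} kg/m³")
print(f"ethanol:       {ethanol_kg_m3:.2f} kg/m³")
print("floats:", floats_in(birch_plywood_kg_m3, ethanol_kg_m3))
```

Since 680 kg/m³ is below roughly 789 kg/m³, the check agrees with the system's final answer.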

## Wikipedia search agent logs

In [None]:
for r in multiagentic_rag.wikipedia_search_agent_logs:
  print("TASK: ", r['task'])
  for s in r['steps']:
    print('AGENT: ', s['llm_output'])
    print('OBSERVATION: ', s['observation'])
  print("---"*10)

TASK:  birch plywood density
AGENT:  Thought: I will use the `search_wikipedia` tool to find information about birch plywood density.
Code:
```py
search_wikipedia('birch plywood density')
```<end_action>
OBSERVATION:  Print outputs:
Last output from code snippet:
Pages found for query 'birch plywood density':
Page: Plywood
Summary: Plywood is a composite material manufactured from thin layers, or "plies", of wood veneer that have been stacked and glued together. It is an engineered wood from the family of manufactured boards, which include plywood, medium-density fibreboard (MDF), oriented strand board (OSB), and particle board (or chipboard).
All plywoods bind resin and wood fibre sheets (cellulose cells are long, strong and thin) to form a composite material. The sheets of wood are stacked such that each layer has its grain set typically (see below) perpendicular to its adjacent layers. This alternation of the grain is called cross-graining and has several important benefits: it redu
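The log loops in this notebook all follow the same shape: an optional `task`, then a list of `steps`, each with an `llm_output` and an `observation`. A small helper can format any of these records in one place. The sketch below is hypothetical (it is not part of `multirag`) and runs on mock data that mirrors the key names seen in the logs. It uses `.get` on the assumption that some steps may lack one of the keys.

```python
def format_agent_run(run):
    """Format one agent log record as TASK / AGENT / OBSERVATION lines."""
    lines = []
    if "task" in run:
        lines.append(f"TASK:  {run['task']}")
    for step in run.get("steps", []):
        lines.append(f"AGENT:  {step.get('llm_output')}")
        lines.append(f"OBSERVATION:  {step.get('observation')}")
    return "\n".join(lines)


# Mock record for illustration; real records come from the agent logs.
mock_run = {
    "task": "birch plywood density",
    "steps": [
        {"llm_output": "(mock agent output)",
         "observation": "(mock observation)"},
    ],
}

print(format_agent_run(mock_run))
```

With the real system, the same helper could be applied to each record in `multiagentic_rag.wikipedia_search_agent_logs`, or directly to `multiagentic_rag.manager_agent_logs`, which has `steps` but no `task`.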

## Page search agent logs

In [None]:
for r in multiagentic_rag.page_search_agent_logs:
  print("TASK: ", r['task'])
  for s in r['steps']:
    print('AGENT: ', s['llm_output'])
    print('OBSERVATION: ', s['observation'])
  print("---"*10)

TASK:  Retrieve information about the query:"birch plywood density" from the Wikipedia page "Plywood".
AGENT:  Thought: I will use the `retrieve_passages` tool to find information about birch plywood density from the Wikipedia page "Plywood".
Code:
```py
result = retrieve_passages("birch plywood density")
print(result)
```<end_action>
OBSERVATION:  Print outputs:
Retrieved passages for query "birch plywood density":
Passage 0: ... . Shear stress, by contrast, is higher in the center of the panel, and at the outer fibres. Within Europe basic plywood can be divided into three main categories: birch plywood (density approx. 680 kg/m3), mixed plywood (density approx. 620 kg/m3) and conifer plywoods (density 460–520 kg/m3). ...
Passage 1: ... . Aircraft grade plywood is made from three or more plies of birch, as thin as 0.40 millimetres (1⁄64 in) thick in total, and is extremely strong and light. ...
Passage 2: ... Baltic Birch plywood is a product of an area around the Baltic Sea. Original