```mermaid
graph TD;
    A[User Input] -->|Input Moderation| B{Is Input Allowed?};
    B -- Yes --> C[Dialog Constraints];
    B -- No --> D[Reject Input];
    C -->|Execute Actions| E[Custom Actions];
    E -->|Retrieve Data| F[Context Retrieval];
    F --> G[Response Generation];
    G -->|Output Validation| H{Is Output Valid?};
    H -- Yes --> I[Send Response];
    H -- No --> J[Modify/Reject Response];




---

#### **Complete Configuration of NeMo Guardrails**

**NeMo Guardrails** allows customization of how an **LLM-based system** manages inputs, dialogues, and outputs. Below is a detailed explanation of each configuration component and its purpose.

---

#### **Configuration Components**

##### **1. Input Moderation**
Controls user inputs to ensure they are safe and appropriate.

- **What does it do?**
  - Blocks offensive language, inappropriate questions, or manipulation attempts.
  - Ensures that only valid queries are processed.

- **Example (Colang format):**
```colang
define user express_insult
  "You are stupid"
  "I want to harm you"

```

define flow handle_insult
  user express_insult
  bot express_calmly_willingness_to_help
```


##### **2. Dialog Constraints**


Structures and manages conversational flows between the user and the bot.

- **What does it do?**
 - Defines how the bot responds to different user intentions.
 - Establishes rules for handling queries, retrieving context, and generating responses.

- **Example (Colang format):**

```colang
define user ask_question
    "What is the capital of France?"
    "What is the weather today?"

define flow handle_general_input
    user ask_question
    $contexts = execute retrieve(query=$last_user_message)
    $answer = execute rag(query=$last_user_message, contexts=$contexts)
    bot $answer
```

##### **3. Output Validation**

Validates generated responses before sending them to the user

- **What does it do**

 - Checks that the response is consistent with the context.
 - Removes false or inappropriate responses.

- **Example (Colang format):**
```colang
define flow validate_response
    $accurate = execute check_facts(evidence_list=$contexts)
    if not $accurate:
        bot remove last message
        bot "I couldn't find a reliable answer to your question."
```

##### **4. config.yml File**

Configures the system's main models and parameters.

- **What does it do?**

 - Defines the LLM model, the engine (OpenAI or others), and parameters such as temperature.

- **Example (Colang format):**

```yaml
models:
- type: main
  engine: openai
  model: gpt-3.5-turbo
  parameters:
    temperature: 0.1
```
##### **5.co (Colang) File**

Defines the rules and interaction flows between the user and the bot.

- **What does it do?**

 - Specifies user expressions, bot responses, and actions to execute.

- **Structure**

 - User Expressions: Identifies user intentions.
 - Bot Responses: Defines how the bot should respond.
 - Flows: Links user inputs to responses and actions.

##### **6. Custom Actions**

Allows integration of custom logic or querying external systems.

- **What are they?**

 -Python functions that extend the bot’s capabilities.

 -Registered in NeMo Guardrails for use in dialogue flows.

- **What do they do?**

 -Retrieve data, generate summaries, or validate facts.

 -Executed via execute.


- **Example (Python format):**
 ```python
from nemoguardrails.actions.actions import action
@action(name="retrieve")
async def retrieve(query: str) -> list:
    # Retrieve relevant context from a database or vector store
    results = index.similarity_search(query, k=5)
    return [doc.text for doc in results]
@action(name="summarize_document")
async def summarize_document(document: str) -> str:
    # Generate a summary for a given document
    prompt = f"Summarize the following document:\n\n{document}"
    return get_completion(prompt, model="gpt-3.5-turbo")
 ```

##### **Action Registration**

Registers actions so they can be used in flows.


- **Example (Python format):**

```python
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_content(yaml_content=yaml_content, colang_content=colang_content)
rails = LLMRails(config)

print("NeMo Guardrails successfully configured.")

#Setup dependencies

In [1]:
!pip install nemoguardrails llama-index openai docling langchain-openai

Collecting nemoguardrails
  Downloading nemoguardrails-0.11.1-py3-none-any.whl.metadata (22 kB)
Collecting llama-index
  Downloading llama_index-0.12.16-py3-none-any.whl.metadata (12 kB)
Collecting docling
  Downloading docling-2.18.0-py3-none-any.whl.metadata (8.4 kB)
Collecting langchain-openai
  Downloading langchain_openai-0.3.4-py3-none-any.whl.metadata (2.3 kB)
Collecting annoy>=1.17.3 (from nemoguardrails)
  Downloading annoy-1.17.3.tar.gz (647 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m647.5/647.5 kB[0m [31m13.4 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting fastapi>=0.103.0 (from nemoguardrails)
  Downloading fastapi-0.115.8-py3-none-any.whl.metadata (27 kB)
Collecting fastembed<0.4.1,>=0.2.2 (from nemoguardrails)
  Downloading fastembed-0.4.0-py3-none-any.whl.metadata (7.7 kB)
Collecting httpx<0.25.0,>=0.24.1 (from nemoguardrails)
  Downloading httpx-0.24.1-py3-none-any.whl.metadata (7.4 kB)
Colle

##Download the document

In [2]:

!wget -O CraneManual.pdf "https://silvatech.pl/wp-content/uploads/2013/07/New-book-Crane-Great-Britain.pdf"


--2025-02-07 08:05:47--  https://silvatech.pl/wp-content/uploads/2013/07/New-book-Crane-Great-Britain.pdf
Resolving silvatech.pl (silvatech.pl)... 79.96.48.114
Connecting to silvatech.pl (silvatech.pl)|79.96.48.114|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5284022 (5.0M) [application/pdf]
Saving to: ‘CraneManual.pdf’


2025-02-07 08:05:49 (3.75 MB/s) - ‘CraneManual.pdf’ saved [5284022/5284022]



## Parser with docling and convert to LlamaIndex nodes (Vector Store Index)

In [3]:
import torch
print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("Torch device count:", torch.cuda.device_count())


2.5.1+cu124
CUDA available: True
Torch device count: 1


In [4]:
import os
os.environ["OPENAI_API_KEY"] = 'YOUR_OPEN_API'

In [5]:
import os
import json
from pathlib import Path
from docling.document_converter import DocumentConverter
from llama_index.core import VectorStoreIndex, Document, Settings
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding

# Global embedding model for LlamaIndex
Settings.embed_model = OpenAIEmbedding()

class CraneManualProcessor:
    def __init__(self, source_pdf):
        """
        Converts the PDF into structured nodes for retrieval.
        """
        self.source_pdf = str(Path(source_pdf).resolve())
        self.base_name = str(Path(source_pdf).with_suffix('').resolve())

    # Step 1: Convert PDF to Markdown using Docling
    def convert_pdf_to_markdown(self):
        """Uses Docling to extract structured content from the PDF."""
        converter = DocumentConverter()
        result = converter.convert(self.source_pdf)
        md_content = result.document.export_to_markdown()

        markdown_path = f"{self.base_name}_content.md"
        with open(markdown_path, "w", encoding="utf-8") as f:
            f.write(md_content)

        print(f"✅ Markdown saved: {markdown_path}")
        return md_content

    # Step 2: Split Markdown by Sections and Subsections
    def split_markdown_by_sections(self, md_content, min_length=1024):
        """
        Splits the markdown content using heading markers (#, ##, ###) while ensuring
        that each section has at least `min_length` characters before further splitting.
        """
        lines = md_content.split("\n")
        sections = []
        current_section = []
        current_text = ""

        for line in lines:
            if line.strip().startswith("#"):  # Detect section headers
                if current_text and len(current_text) >= min_length:
                    sections.append(current_text.strip())  # Save previous section
                    current_text = ""

                current_section = [line]  # Start new section

            else:
                current_section.append(line)

            current_text = "\n".join(current_section)  # Combine lines

        # Ensure last section is added
        if current_text:
            sections.append(current_text.strip())

        print(f"✅ Split document into {len(sections)} sections.")
        return sections

    # Step 3: Convert Sections to LlamaIndex Nodes
    def convert_sections_to_nodes(self, sections):
        """
        Converts each structured section into LlamaIndex document nodes.
        """
        doc_nodes = [Document(text=section) for section in sections]
        print(f"✅ Created {len(doc_nodes)} document nodes for LlamaIndex.")
        return doc_nodes

    # Step 4: Index Nodes into LlamaIndex
    def index_nodes(self, doc_nodes):
        """
        Creates a vector index for efficient retrieval using LlamaIndex.
        """
        # Create an in-memory vector index
        index = VectorStoreIndex.from_documents(doc_nodes)

        # Save the index for later retrieval
        index.storage_context.persist(persist_dir=f"{self.base_name}_index")

        print(f"✅ Indexed {len(doc_nodes)} sections into LlamaIndex.")
        return index

    # Step 5: Run the full processing pipeline
    def process(self):
        md_content = self.convert_pdf_to_markdown()
        sections = self.split_markdown_by_sections(md_content)
        doc_nodes = self.convert_sections_to_nodes(sections)
        index = self.index_nodes(doc_nodes)
        return index

# Example usage
if __name__ == "__main__":
    processor = CraneManualProcessor("CraneManual.pdf")
    index = processor.process()


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].


✅ Markdown saved: /content/CraneManual_content.md
✅ Split document into 16 sections.
✅ Created 16 document nodes for LlamaIndex.
✅ Indexed 16 sections into LlamaIndex.


See the values

In [6]:
nodes = list(index.docstore.docs.values())
print(nodes[9].text)


## 4.6 Technical data, FARMA cranes

|         |   Lifting  capacity,  net:  kNm |   Slewing  torque  kNm |   Slewing  angle   degrees |   Reach  m | Telescopic  stroke  length  m   | Recommended  flow  l/min   |   Working  pressure  bar | Weight kg  incl. 0.12  grapple*/ Weight of  crane   | Weight kg  incl. 0.16  grapple*/ Weight of  crane   | Weight kg  incl. 0.19  grapple*/ Weight of  crane   | Weight kg  incl. 0.20  grapple*/ Weight of  crane   | Weight kg  incl. 0.22  grapple*/ Weight of  crane   |   Lifting force  kg  full length  (excl. grapple  and rotator) |
|---------|---------------------------------|------------------------|----------------------------|------------|---------------------------------|----------------------------|--------------------------|-----------------------------------------------------|-----------------------------------------------------|-----------------------------------------------------|-----------------------------------------------------|-------

##Create the query engine and reranker

In [91]:
from llama_index.core.postprocessor import SentenceTransformerRerank

# Initialize reranker with top 5 candidates
rerank = SentenceTransformerRerank(
    model="cross-encoder/ms-marco-MiniLM-L-2-v2", top_n=5
)

# Create query engine with reranker
query_engine = index.as_query_engine(
    similarity_top_k=10,  # Retrieve top 10 candidates first
    node_postprocessors=[rerank]  # Apply reranking select 5 from 10
)


In [94]:
response=query_engine.query('What are the primary applications of FARMA cranes, and in which industries are they intended to be used?')

In [96]:
response.response

'The primary applications of FARMA cranes include handling round timber, fodder, fertiliser, sand, loose fertiliser, large sacks, and automatic timber transportation in forestry. These cranes are intended to be used in the fields of agriculture and forestry by individuals with knowledge about the handling of agricultural machinery.'

## NeMo GuardRails -->Building Custom Retrieval and RAG Actions

Next, we create NeMo Guardrails custom actions


### RAG Action

In [81]:
from nemoguardrails.actions.actions import action

@action(name="rag")
async def rag(query: str) -> list:
    response = query_engine.query(query)
    return str(response)

In [82]:

@action(name="escalate_to_industrial_system")
async def escalate_to_industrial_system(query: str) -> str:
    """
    Simulates escalation of non-crane-related industrial questions
    to an external industrial system or human expert.

    Args:
        query (str): The user's original question.

    Returns:
        str: Response indicating escalation has been processed.
    """
    # Simulated escalation process
    print(f"[Escalation] Forwarding query to external industrial expert system: {query}")

    # Simulated external system response
    external_response = f"I have forwarded your request: '{query}'. A specialist will assist you shortly."

    return external_response


### Configuring NeMo Guardrails

In [83]:
from nemoguardrails import LLMRails, RailsConfig

yaml_content = """
models:
- type: main
  engine: openai
  model: gpt-3.5-turbo
"""


### Creating the LLMRails Object

In [84]:
from nemoguardrails import LLMRails, RailsConfig
config = RailsConfig.from_content(yaml_content=yaml_content, colang_content="")

rails = LLMRails(config)

# Register our custom actions
#rails.register_action(action=retrieve, name="retrieve")
rails.register_action(action=rag, name="rag")
rails.register_action(escalate_to_industrial_system, "escalate_to_industrial_system")


### Creating Guardrail Rules (Colang)

Let’s define dialog flows and moderation rules:

Catching Offensive or Unsafe Queries
Ignoring Off-Topic Queries
Responding to Crane-Related Questions
In Colang, we define user expressions, bot responses, and flows.

**Example Colang File**

In [85]:
colang_content = """
# --- Welcome Message ---
define flow welcome_message
  bot "Good morning, I am your assistant for crane installation, maintenance, and safety guidelines. Please ask me about setup procedures, troubleshooting, load capacities, or regulatory compliance."

# --- Offensive / Unsafe Queries ---
define user express_insult
  "You are stupid"
  "I want to harm you"
  "Go to hell"
  "This is useless"

define bot respond_offensive
  "I am here to assist with crane-related topics. If you need help with installation or maintenance, please ask."

define flow handle_insult
  user express_insult
  bot respond_offensive

# --- Off-topic Queries ---
define user ask_off_topic
  "How do I bake a cake?"
  "What's the best soccer team?"
  "Which team emerged as the winner in the latest F1 race?"
  "Who won the last Formula 1 race?"
  "Who won the last Formula 1 this year?"
  "Who won the Formula 1 this year?"
  "Who won the the Formula 1 this year?"   // added to catch extra "the"
  "Who won the last F1 race?"              // additional variant
  "Who is the Formula 1 winner?"
  "Who won the latest Formula 1 race?"
  "Who won the last NBA game?"
  "Which team won the most recent NFL matchup?"
  "Who came out on top in the last MLB game?"
  "Which club won the latest English Premier League match?"
  "Who was the victor in the most recent NHL game?"
  "Which team secured the win in the last UEFA Champions League match?"
  "Who won the latest tennis match at the US Open?"
  "latest formula 1 results"
  "formula 1 winner"
  "formula one winner"

define bot refuse_off_topic
  "This assistant is designed for industrial crane operations. Please ask about installation, safety, or troubleshooting."

define flow handle_off_topic
  user ask_off_topic
  bot refuse_off_topic

# --- Crane-Related Q&A ---
define user ask_crane_installation
  "How do I install the ceiling-mounted crane?"
  "What are the steps to assemble the crane?"
  "Can you guide me through the setup process?"

define bot respond_crane_installation
  "Installation of the ceiling-mounted crane involves several steps, including structural assessment, track assembly, and hoist installation. Let me fetch the exact procedure from the manual."

define flow handle_crane_installation
  user ask_crane_installation
  bot respond_crane_installation
  $answer = execute rag(query=$last_user_message)
  bot $answer
  bot "Would you like more details on a specific step, such as electrical connections or load testing?"

# --- Maintenance & Safety ---
define user ask_crane_maintenance
  "How do I perform regular maintenance?"
  "What is the recommended maintenance schedule?"
  "What lubrication should I use?"
  "How often should I inspect the crane?"

define bot respond_crane_maintenance
  "Routine maintenance ensures the crane operates safely and efficiently. Let me retrieve the maintenance guidelines."

define flow handle_crane_maintenance
  user ask_crane_maintenance
  bot respond_crane_maintenance
  $answer = execute rag(query=$last_user_message)
  bot $answer
  bot "Do you need specific details on inspections, lubrication, or troubleshooting common issues?"

# --- Load Capacity & Safety Warnings ---
define user ask_crane_safety
  "What are the safety warnings?"
  "What is the maximum load capacity?"
  "Can I use this crane to lift people?"
  "What safety checks should I do before operating the crane?"

define bot respond_crane_safety
  "Crane safety is critical. Overloading, improper rigging, or ignoring maintenance can cause failures. Let me check the manual for precise safety recommendations."

define flow handle_crane_safety
  user ask_crane_safety
  bot respond_crane_safety
  $answer = execute rag(query=$last_user_message)
  bot $answer
  bot "Would you like additional information on operator training, emergency procedures, or regulatory compliance?"

# --- Troubleshooting & Error Codes ---
define user ask_crane_troubleshooting
  "The crane is not moving"
  "I hear a strange noise from the hoist"
  "The motor is overheating"
  "There is an error code on the control panel"

define bot respond_crane_troubleshooting
  "Let's troubleshoot the issue. Please describe the symptoms in more detail, or I can check the error code list."

define flow handle_crane_troubleshooting
  user ask_crane_troubleshooting
  bot respond_crane_troubleshooting
  $answer = execute rag(query=$last_user_message)
  bot $answer
  bot "If the issue persists, I can provide troubleshooting steps or refer you to a certified technician."

# --- Escalation for Other Industrial Topics ---
define user ask_other_industrial_topic
  "How do I calibrate an industrial sensor?"
  "What is the best way to inspect conveyor belts?"
  "How do I optimize production line efficiency?"
  "What are the best maintenance practices for hydraulic systems?"

define bot escalate_other_topic
  "Your question goes beyond crane operations. I will connect you with a more suitable system or expert."

define flow handle_other_industrial_topic
  user ask_other_industrial_topic
  bot escalate_other_topic
  $escalation_response = execute escalate_to_industrial_system(query=$last_user_message)
  bot $escalation_response

# --- Handling Repetitive / Clarification Requests ---
define user ask_repeat_or_clarify
  "Can you explain that again?"
  "I didn't understand, can you rephrase?"
  "Could you give me more details?"

define flow handle_repeat
  user ask_repeat_or_clarify
  bot "Certainly! Let me simplify the explanation..."
  $answer = execute rag(query=$last_user_message)
  bot $answer

# --- Unrecognized Queries ---
define user ask_unclear_question
  "Hmmm..."
  "I'm not sure how to ask"
  "Can you help me with something else?"

define bot respond_unclear
  "Could you provide more details about what you need help with? I can assist with installation, maintenance, and troubleshooting."

define flow handle_unclear_question
  user ask_unclear_question
  bot respond_unclear

# --- Ending Message ---
define user end_conversation
  "That’s all for now"
  "Thank you"
  "Goodbye"
  "No, I don’t need more help"

define bot say_goodbye
  "You're welcome! If you need more help with crane operations, feel free to return anytime. Have a safe workday."

define flow handle_goodbye
  user end_conversation
  bot say_goodbye

"""

#### Re-initialize Rails with Colang Content

At this point, we can add Colang content for the conversation flows next.

In [86]:
config = RailsConfig.from_content(
    yaml_content=yaml_content,
    colang_content=colang_content
)
rails = LLMRails(config)

# Re-register actions to ensure they're recognized
rails.register_action(rag, "rag")
rails.register_action(escalate_to_industrial_system, "escalate_to_industrial_system")
#print("NeMo Guardrails configured successfully")


###Testing & Demonstration

In [87]:
import nest_asyncio
nest_asyncio.apply()

In [89]:
import asyncio

async def demo_crane_assistant():
    print("\n🟢 TEST 1: Crane-Related Question")
    response = await rails.generate_async(prompt="What are the primary applications of FARMA cranes, and in which industries are they intended to be used?")
    print(f"Bot: {response}\n")

    print("\n🟡 TEST 2: Off-Topic Question")
    response = await rails.generate_async(prompt="What's the best soccer team?")
    print(f"Bot: {response}\n")

    print("\n🟠 TEST 3: Industrial But Non-Crane Question (Escalation)")
    response = await rails.generate_async(prompt="How do I calibrate an industrial sensor?")
    print(f"Bot: {response}\n")

    print("\n🔴 TEST 4: Ending Conversation")
    response = await rails.generate_async(prompt="Thank you")
    print(f"Bot: {response}\n")

try:
    loop = asyncio.get_running_loop()
    task = loop.create_task(demo_crane_assistant())
    await task
except RuntimeError:
    asyncio.run(demo_crane_assistant())



🟢 TEST 1: Crane-Related Question
Bot: FARMA cranes, also known as forestry cranes, are primarily used in the forestry industry for tasks such as loading and unloading timber, moving logs, and assisting in forestry operations. These cranes are designed to be mounted on forestry trucks or trailers, allowing them to operate efficiently in rugged and forested environments.
The primary applications of FARMA cranes include:
1. Loading and unloading timber: FARMA cranes are used to lift logs and timber onto trucks for transportation.
2. Moving logs: These cranes can handle and move logs within a forestry operation or processing facility.
3. Assisting in forestry operations: FARMA cranes support various forestry tasks, such as clearing trees, handling forestry equipment, and assisting in forest management activities.
Industries where FARMA cranes are commonly used include:
1. Forestry industry: FARMA cranes are essential equipment for forestry companies, logging operations, and timber harvest