<h1 style="text-align: center; font-size: 50px;"> 📦 Register Model </h1>

📘 Project Overview:
 This notebook demonstrates a modular architecture for answering natural-language questions
 over one or more feedback documents using only local, open-source models (e.g., llama.cpp).
 The system processes long documents chunk by chunk and synthesizes a final answer through a multi-step LLM workflow.
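
The chunk-by-chunk workflow described above can be sketched roughly as follows. This is a minimal illustration and not the implementation inside `AgenticFeedbackModel`: `split_into_chunks`, `answer_over_chunks`, and the `fake_llm` stub are hypothetical names, and a plain character window stands in for the real text splitter.

```python
from typing import Callable, List


def split_into_chunks(text: str, size: int, overlap: int) -> List[str]:
    """Cut text into character windows of `size` that share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap
    # Stop once the remaining text would add nothing beyond the previous overlap.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def answer_over_chunks(question: str, text: str, llm: Callable[[str], str],
                       size: int = 4096, overlap: int = 512) -> str:
    """Ask the question of each chunk, then ask once more over the partial answers."""
    partials = [
        llm(f"Question: {question}\n\nContext:\n{chunk}")
        for chunk in split_into_chunks(text, size, overlap)
    ]
    return llm(f"Question: {question}\n\nPartial answers:\n" + "\n".join(partials))


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Hypothetical stub: records the prompt instead of calling a model."""
    calls.append(prompt)
    return f"answer {len(calls)}"


final_answer = answer_over_chunks("Who gave feedback?", "abcdefghij", fake_llm, size=4, overlap=1)
print(len(calls), final_answer)
```

With a ten-character text, a window of 4, and an overlap of 1, the stub is called once per chunk and once more for the synthesis step. A real model would replace `fake_llm`.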

# Notebook Overview

- Start Execution
- Define User Constants
- Install and Import Libraries
- Configure Settings
- Verify Assets
- KV Memory
- Load Documents
- MLflow Registration
- Generated Answer
- Message History

# Start Execution

In [1]:
# ─────── Standard Library Imports ───────
import os  # OS-level utilities like path and environment operations
import sys  # Access to interpreter variables and runtime configuration
import time  # Time-related functions

# Extend sys.path to include parent directory for local module resolution
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), "..")))

# ─────── Local Application Imports ───────
from src.utils import (  # Core utilities for logging, LLM interaction, and schema generation
    display_image,
    get_response_from_llm,
    json_schema_from_type,
    log_timing,
    logger,
)

In [2]:
start_time = time.time()  
logger.info("Notebook execution started.")

# Define User Constants

In [3]:
TOPIC: str = "Focus Flow"  
QUESTION: str = "Which poeple provided the feedback?"

# Install and Import Libraries

In [4]:
%%time

%pip install -r ../requirements.txt --quiet

Note: you may need to restart the kernel to use updated packages.
CPU times: user 18.3 ms, sys: 0 ns, total: 18.3 ms
Wall time: 991 ms


In [5]:
from __future__ import annotations  # Enables postponed evaluation of type annotations (PEP 563)

# ─────── Standard Library Imports ───────
import base64  # Encoding and decoding binary data
import functools  # Functional programming utilities like lru_cache, partial, etc.
import json  # JSON serialization and deserialization
import logging  # Logging framework
import multiprocessing  # Parallel execution using subprocesses
import os  # OS-level utilities
import shutil  # File and directory operations
import sys  # Access to runtime environment and system-specific parameters
import time  # Time tracking and delays
import warnings  # Warning control and filtering
from collections import namedtuple  # Lightweight object types
from pathlib import Path  # Object-oriented filesystem paths
from typing import Any, Dict, List, Literal, Optional, TypedDict  # Static typing annotations

# ─────── Third-Party Package Imports ───────
import mlflow  # Model tracking and serving framework
from mlflow.tracking import MlflowClient  # Interface to interact with MLflow tracking server for experiments, runs, and artifacts
import yaml  # YAML file parsing
from IPython.display import HTML, display, Markdown  # Rich output formatting in Jupyter environments
from tqdm import tqdm  # Progress bar for loops

# ─────── LangChain Core & Community Imports ───────
from langchain.docstore.document import Document  # Document abstraction
from langchain.text_splitter import RecursiveCharacterTextSplitter  # Intelligent text splitting
from langchain_community.document_loaders import (  # Document loaders for different file types
    CSVLoader,
    PyPDFLoader,
    TextLoader,
    UnstructuredExcelLoader,
    UnstructuredMarkdownLoader,
    UnstructuredWordDocumentLoader,
)
from langchain_community.llms import LlamaCpp  # Integration for running LlamaCpp locally

# ─────── LangGraph Imports ───────
from langgraph.graph import END, START, StateGraph  # Constructs and controls stateful agent graphs

# ─────── Local Application-Specific Imports ───────
from src.agentic_feedback_model import AgenticFeedbackModel  # Core agent logic for feedback analysis
from src.simple_kv_memory import SimpleKVMemory  # In-memory store for agent state

# Configure Settings

In [6]:
# Suppress Python warnings
warnings.filterwarnings("ignore")

In [7]:
INPUT_PATH: Path = Path("../data/input")  

MEMORY_PATH: Path = Path("../data/memory")   

MODEL_PATH = "/home/jovyan/datafabric/meta-llama3.1-8b-Q8/Meta-Llama-3.1-8B-Instruct-Q8_0.gguf"
CONTEXT_WINDOW = 8192
MAX_TOKENS = CONTEXT_WINDOW // 8
CHUNK_SIZE = CONTEXT_WINDOW // 2
CHUNK_OVERLAP = CHUNK_SIZE // 8  

EXPERIMENT_NAME = "AIStudio-Agentic-Customer-Feedback-Analyzer-with-LangGraph-Experiment"
RUN_NAME = "AIStudio-Agentic-Customer-Feedback-Analyzer-with-LangGraph-Run"
MODEL_NAME = "AIStudio-Agentic-Customer-Feedback-Analyzer-with-LangGraph-Model"
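
The size constants above are all derived from `CONTEXT_WINDOW` by integer division. The sketch below works through the arithmetic and estimates how many chunks a text of a given length would produce. `estimate_chunk_count` is a hypothetical helper that assumes a plain sliding window; a separator-aware splitter can produce a different count, and this notebook does not show whether the sizes are measured in tokens or characters. The run recorded under Message History reports `overlap=256`, so the registered model appears to apply its own overlap rather than `CHUNK_OVERLAP`; that is an observation from the log, not something this notebook confirms.

```python
import math

CONTEXT_WINDOW = 8192
MAX_TOKENS = CONTEXT_WINDOW // 8    # one eighth of the window is reserved for generation
CHUNK_SIZE = CONTEXT_WINDOW // 2    # each chunk may fill half of the window
CHUNK_OVERLAP = CHUNK_SIZE // 8     # neighbouring chunks share one eighth of a chunk


def estimate_chunk_count(length: int, size: int, overlap: int) -> int:
    """Estimate the number of sliding-window chunks for a text of `length` units."""
    if length <= size:
        return 1
    # Every chunk after the first contributes (size - overlap) new units.
    return math.ceil((length - overlap) / (size - overlap))


print(MAX_TOKENS, CHUNK_SIZE, CHUNK_OVERLAP)  # 1024 4096 512
print(estimate_chunk_count(10_000, CHUNK_SIZE, CHUNK_OVERLAP))  # 3
```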

In [8]:
logger.info("Settings configured.")

# Verify Assets

In [9]:
def log_asset_status(asset_path: str, asset_name: str) -> None:
    """
    Logs the status of a given asset based on its existence.

    Parameters:
        asset_path (str): File or directory path to check.
        asset_name (str): Name of the asset for logging context.
    """
    if Path(asset_path).exists():
        logger.info(f"{asset_name} is properly configured.")
    else:
        # A missing asset is a problem worth surfacing, so log it as a warning
        logger.warning(
            f"{asset_name} is not properly configured. Please ensure the required asset "
            "is correctly configured in your AI Studio project according to the README file."
        )

In [10]:
log_asset_status(
    asset_path=INPUT_PATH,
    asset_name="Input Data",
)
log_asset_status(
    asset_path=MODEL_PATH,
    asset_name="LLM",
)
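
The asset check above only logs a message, so a missing model file would surface later, during registration. One possible variation is to collect the missing assets and decide what to do with them in one place. `find_missing_assets` is a hypothetical helper and the temporary directory only stands in for the real asset locations.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Union


def find_missing_assets(assets: Dict[str, Union[str, Path]]) -> List[str]:
    """Return the names of assets whose paths do not exist."""
    return [name for name, path in assets.items() if not Path(path).exists()]


# Demo with a temporary directory standing in for the real asset locations.
demo_dir = Path(tempfile.mkdtemp())
missing = find_missing_assets({
    "Input Data": demo_dir,
    "LLM": demo_dir / "model.gguf",  # never created, so reported as missing
})
print(missing)  # ['LLM']
```

In the notebook, the same call could take `INPUT_PATH` and `MODEL_PATH`, and a non-empty result could raise `FileNotFoundError` before the slower registration step starts.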

# KV Memory

In [11]:
memory: SimpleKVMemory = SimpleKVMemory(MEMORY_PATH)
memory.set('dummy key', 'dummy value')
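
The implementation of `SimpleKVMemory` lives in `src/simple_kv_memory.py` and is not shown here. As a rough mental model only, a persistent key-value store can be as small as a dictionary mirrored to a JSON file. `JsonKVMemory` below is a hypothetical stand-in, and its file layout and method behavior are assumptions rather than the actual class.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Optional


class JsonKVMemory:
    """Hypothetical stand-in: a dictionary persisted to a single JSON file."""

    def __init__(self, file_path: Path) -> None:
        self.file_path = Path(file_path)
        self._data: Dict[str, str] = {}
        if self.file_path.exists():
            self._data = json.loads(self.file_path.read_text(encoding="utf-8"))

    def get(self, key: str) -> Optional[str]:
        """Return the stored value, or None when the key is unknown."""
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        """Store the value and write the whole dictionary back to disk."""
        self._data[key] = value
        self.file_path.parent.mkdir(parents=True, exist_ok=True)
        self.file_path.write_text(
            json.dumps(self._data, ensure_ascii=False, indent=2), encoding="utf-8"
        )


store_file = Path(tempfile.mkdtemp()) / "memory.json"
JsonKVMemory(store_file).set("dummy key", "dummy value")
reloaded = JsonKVMemory(store_file)  # a fresh instance reads the file back
print(reloaded.get("dummy key"))  # dummy value
```

The message history later in this notebook includes a "No cached answer found" entry, which suggests the memory is used as a question-to-answer cache; a store like this would serve that purpose by using the question as the key.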

# Load Documents

In [12]:
logger.info("📂 Scanning directory for documents: %s", INPUT_PATH)

supported_extensions = {
    ".txt": TextLoader,
    ".csv": lambda path: CSVLoader(path, encoding="utf-8", csv_args={"delimiter": ","}),
    ".xlsx": UnstructuredExcelLoader,
    ".docx": UnstructuredWordDocumentLoader,
    ".pdf": PyPDFLoader,
    ".md": UnstructuredMarkdownLoader,
}

all_docs = []

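for file_path in Path(INPUT_PATH).rglob("*"):
    # Skip directories; rglob("*") yields them alongside files
    if not file_path.is_file():
        continue

    # Skip hidden files and anything inside hidden/system folders
    if any(part.startswith(".") and part not in {".", ".."} for part in file_path.parts):
        continue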
    
    ext = file_path.suffix.lower()
    loader_class = supported_extensions.get(ext)
    
    if loader_class:
        try:
            loader = loader_class(str(file_path))
            docs = loader.load()
            all_docs.extend(docs)
            logger.info("✅ Loaded %d docs from %s", len(docs), file_path.name)
        except Exception as e:
            logger.warning("❌ Failed to load %s: %s", file_path.name, e)
    else:
        logger.info("⚠️ Unsupported file type: %s", file_path.name)

short text: "docx test". Defaulting to English.


short text: "md test". Defaulting to English.


short text: "excel test excel test". Defaulting to English.
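
The file-selection logic of the loading loop can be exercised on its own, without any LangChain loaders. The sketch below uses only the standard library; `iter_loadable_files` is a hypothetical helper, and it checks the path relative to the root so that a `..` in the root itself never counts as hidden.

```python
import tempfile
from pathlib import Path
from typing import Iterator, Set


def iter_loadable_files(root: Path, extensions: Set[str]) -> Iterator[Path]:
    """Yield regular files under root with a supported suffix, skipping hidden paths."""
    for path in sorted(root.rglob("*")):
        relative_parts = path.relative_to(root).parts
        if any(part.startswith(".") for part in relative_parts):
            continue  # hidden file, or a file inside a hidden folder
        if path.is_file() and path.suffix.lower() in extensions:
            yield path


# Build a small throwaway tree: one visible file, one hidden copy, one unsupported type.
demo_root = Path(tempfile.mkdtemp())
(demo_root / ".ipynb_checkpoints").mkdir()
(demo_root / "notes.TXT").write_text("visible", encoding="utf-8")
(demo_root / ".ipynb_checkpoints" / "notes.txt").write_text("hidden", encoding="utf-8")
(demo_root / "image.png").write_bytes(b"")

found = [p.name for p in iter_loadable_files(demo_root, {".txt", ".md"})]
print(found)  # ['notes.TXT']
```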


In [13]:
INPUT_TEXT = '\n\n'.join([doc.page_content for doc in all_docs])

# MLflow Registration

In [14]:
# 1. Set MLflow tracking URI and experiment
mlflow.set_tracking_uri(os.getenv("MLFLOW_TRACKING_URI", "/phoenix/mlflow"))
mlflow.set_experiment(experiment_name=EXPERIMENT_NAME)
print(f"Using MLflow tracking URI: {mlflow.get_tracking_uri()}")
print(f"Experiment: {EXPERIMENT_NAME}")

Using MLflow tracking URI: /phoenix/mlflow
Experiment: AIStudio-Agentic-Customer-Feedback-Analyzer-with-LangGraph-Experiment


In [15]:
%%time

# These should point to the actual files you're using for model and memory
MODEL_ARTIFACTS = {
    "model_path": str(MODEL_PATH),
    "memory_path": str(MEMORY_PATH),
}
 
# 2. Start the MLflow run, then log and register the model
with mlflow.start_run(run_name=RUN_NAME) as run:
    print(f"🚀 Started MLflow run: {run.info.run_id}")

    # Log and register the model using the classmethod
    AgenticFeedbackModel.log_model(
        model_name=MODEL_NAME,
        model_artifacts=MODEL_ARTIFACTS
    )

logger.info(f"✅ Model '{MODEL_NAME}' successfully logged and registered.")

2025/08/02 06:43:49 INFO mlflow.models.signature: Inferring model signature from type hints


🚀 Started MLflow run: e015ccbb3a024c3e9ab35a177ab9d238


Downloading artifacts:   0%|          | 0/1 [00:00<?, ?it/s]

Successfully registered model 'AIStudio-Agentic-Customer-Feedback-Analyzer-with-LangGraph-Model'.
Created version '1' of model 'AIStudio-Agentic-Customer-Feedback-Analyzer-with-LangGraph-Model'.


CPU times: user 1 s, sys: 15.7 s, total: 16.7 s
Wall time: 4min 3s


In [16]:
# 3. Retrieve the latest version from the Model Registry
client = MlflowClient()
versions = client.get_latest_versions(MODEL_NAME, stages=["None"])

if not versions:
    raise RuntimeError(f"No registered versions found for model '{MODEL_NAME}'.")
    
latest_version = versions[0].version
model_info = mlflow.models.get_model_info(f"models:/{MODEL_NAME}/{latest_version}")

logger.info(f"Latest registered version of '{MODEL_NAME}': {latest_version}")
logger.info(f"Signature: {model_info.signature}")
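
The cell above takes the first entry returned by the registry. If the lookup is ever changed to list every version of the model, for example through `MlflowClient.search_model_versions`, the newest one has to be picked explicitly, and because MLflow exposes `version` as a string it should be compared numerically. The sketch below illustrates this with a hypothetical `FakeVersion` tuple instead of real registry objects.

```python
from collections import namedtuple
from typing import Sequence

# Hypothetical stand-in for a registry entry; only the fields used here are modelled.
FakeVersion = namedtuple("FakeVersion", ["name", "version"])


def pick_latest(versions: Sequence[FakeVersion]) -> FakeVersion:
    """Return the entry with the highest version number."""
    if not versions:
        raise RuntimeError("No registered versions found.")
    # Compare as integers: as strings, "10" would sort before "9".
    return max(versions, key=lambda entry: int(entry.version))


candidates = [FakeVersion("demo", "9"), FakeVersion("demo", "10"), FakeVersion("demo", "2")]
print(pick_latest(candidates).version)  # 10
```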

In [17]:
%%time

# 4. Load the model from the Model Registry
loaded_model = mlflow.pyfunc.load_model(model_uri=f"models:/{MODEL_NAME}/{latest_version}")
logger.info(f"Successfully loaded model '{MODEL_NAME}' version {latest_version} for inference.")

CPU times: user 1.22 s, sys: 2.34 s, total: 3.56 s
Wall time: 1min 13s


In [18]:
# 5. Run a sample inference using the loaded model
input_payload = [{"topic": TOPIC, "question": QUESTION, "input_text": INPUT_TEXT}]

print("\n=== Running Sample Inference ===")
results = loaded_model.predict(input_payload)
result = results[0]


=== Running Sample Inference ===


🔁 Processing each chunk: 100%|██████████| 2/2 [00:01<00:00,  1.96it/s, group=✅ Chunk 2 response length: 28 chars]


🔁 Processing each grouped chunk answers: 100%|██████████| 1/1 [00:01<00:00,  1.38s/it, group=🧠 Synthesized partial answer (1/1)]



🔚 === Final Answer ===

# 🧠 Synthesized partial answer (1/1)

Since the user's question is about individuals mentioned in the document as providing feedback, and neither Chunk 1 nor Chunk 2 mentions any individuals providing feedback, the final answer is:

**No individuals are mentioned in the document as providing feedback.**
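
Because `INPUT_TEXT` is assembled from whatever files were loaded, it can be worth checking the payload before calling `predict`. The sketch below is one way to do that. `find_payload_problems` and `REQUIRED_FIELDS` are hypothetical names, and the field list is taken from the payload built in this notebook rather than from the model's documented signature.

```python
from typing import Any, Dict, List

# Fields used by the sample payload in this notebook (an assumption about the schema).
REQUIRED_FIELDS = ("topic", "question", "input_text")


def find_payload_problems(payload: List[Dict[str, Any]]) -> List[str]:
    """Describe every record that lacks a required field or leaves it blank."""
    problems = []
    for index, record in enumerate(payload):
        for field in REQUIRED_FIELDS:
            value = record.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"record {index}: '{field}' is missing or empty")
    return problems


good = [{"topic": "Focus Flow", "question": "Who gave feedback?", "input_text": "Some text"}]
bad = [{"topic": "Focus Flow", "question": "  "}]
print(find_payload_problems(good))  # []
print(find_payload_problems(bad))
```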




# Generated Answer

In [19]:
display(Markdown(result.answer))

# 🧠 Synthesized partial answer (1/1)

Since the user's question is about individuals mentioned in the document as providing feedback, and neither Chunk 1 nor Chunk 2 mentions any individuals providing feedback, the final answer is:

**No individuals are mentioned in the document as providing feedback.**

# Message History

In [20]:
print(result.messages)

[
    {
        "role": "developer",
        "content": "User submitted a question."
    },
    {
        "role": "user",
        "content": "Which poeple provided the feedback?"
    },
    {
        "role": "developer",
        "content": "\ud83e\udde0 Relevance check result:"
    },
    {
        "role": "assistant",
        "content": "yes"
    },
    {
        "role": "developer",
        "content": "\ud83e\udded No cached answer found for question: 'Which poeple provided the feedback?'"
    },
    {
        "role": "developer",
        "content": "\u270f\ufe0f Rewritten user question:"
    },
    {
        "role": "assistant",
        "content": "Who are the individuals mentioned in the document as providing feedback?"
    },
    {
        "role": "developer",
        "content": "\ud83e\udde9 Chunked 1 documents into 2 chunks (size=4096, overlap=256)"
    },
    {
        "role": "developer",
        "content": "\ud83e\udde0 Processed 2 chunks for question: 'Who are the individual

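The history alternates between `developer` notes that label a step and `assistant` messages that carry the model's output for that step. A small helper can pair them up for inspection. This sketch assumes the messages are available as a list of dictionaries (if `result.messages` is a JSON string, it would need `json.loads` first); `label_assistant_replies` is a hypothetical name, and the sample data repeats selected entries from the output above with the emoji omitted.

```python
from typing import Dict, List, Tuple

# Selected entries from the history shown above, emoji omitted for brevity.
messages: List[Dict[str, str]] = [
    {"role": "developer", "content": "User submitted a question."},
    {"role": "user", "content": "Which poeple provided the feedback?"},
    {"role": "developer", "content": "Relevance check result:"},
    {"role": "assistant", "content": "yes"},
    {"role": "developer", "content": "Rewritten user question:"},
    {"role": "assistant",
     "content": "Who are the individuals mentioned in the document as providing feedback?"},
]


def label_assistant_replies(history: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Pair each assistant message with the developer note directly before it."""
    pairs = []
    for previous, current in zip(history, history[1:]):
        if current["role"] == "assistant" and previous["role"] == "developer":
            pairs.append((previous["content"], current["content"]))
    return pairs


for label, reply in label_assistant_replies(messages):
    print(f"{label} {reply}")
```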
In [21]:
end_time: float = time.time()
elapsed_time: float = end_time - start_time
elapsed_minutes: int = int(elapsed_time // 60)
elapsed_seconds: float = elapsed_time % 60

logger.info(f"⏱️ Total execution time: {elapsed_minutes}m {elapsed_seconds:.2f}s")
logger.info("✅ Notebook execution completed successfully.")

Built with ❤️ using [**HP AI Studio**](https://hp.com/ai-studio).