<h1 style="text-align: center; font-size: 50px;"> 📦 Register Model </h1>

📘 Project Overview:
This notebook registers a feedback-analysis model with MLflow and runs a sample inference against it. The model answers natural-language questions over one or more feedback documents using only local, open-source models (here, Llama 3.1 8B Instruct served through llama.cpp). It processes long documents chunk by chunk and then synthesizes a final answer in a multi-step LLM workflow.
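The chunk-by-chunk workflow can be sketched in plain Python. This is a minimal illustration, not the packaged model's implementation: `split_into_chunks`, `answer_question`, and the prompt wording are hypothetical, and `llm` is any callable that maps a prompt string to a response string. The default size and overlap are the values reported in this notebook's inference log.

```python
from typing import Callable, List


def split_into_chunks(text: str, size: int, overlap: int) -> List[str]:
    """Slice text into windows of `size` characters that overlap by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop early enough that the last window is never just the overlap region.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def answer_question(
    question: str,
    documents: List[str],
    llm: Callable[[str], str],
    chunk_size: int = 4096,
    chunk_overlap: int = 256,
) -> str:
    """Rewrite the question, answer it per chunk, then synthesize one answer."""
    # Step 1: let the LLM turn the raw question into a clearer prompt.
    prompt = llm(f"Rewrite this question clearly: {question}")
    # Step 2: split every document into overlapping chunks.
    chunks = [
        chunk
        for doc in documents
        for chunk in split_into_chunks(doc, chunk_size, chunk_overlap)
    ]
    # Step 3: answer the question once per chunk (the "map" step).
    partial_answers = [llm(f"{prompt}\n\nContext:\n{chunk}") for chunk in chunks]
    # Steps 4 and 5: collect the chunk-level answers and synthesize (the "reduce" step).
    collected = "\n".join(
        f"Chunk {i}: {answer}" for i, answer in enumerate(partial_answers, start=1)
    )
    return llm(f"Combine these chunk-level answers to '{prompt}':\n{collected}")
```

With N chunks this makes N + 2 LLM calls: one rewrite, N chunk answers, and one synthesis.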

## 🧠 System Architecture

```mermaid
graph TD
    A[Original Question] --> B[LLM generates Prompts]
    B --> C[Split Documents into Chunks]
    C --> D[LLM answers Question per Chunk]
    D --> E[Collect All Chunk-Level Answers]
    E --> F[LLM Synthesizes Final Answer]
```

# Notebook Overview

- Start Execution
- Define User Constants
- Install and Import Libraries
- Configure Settings
- Verify Assets
- KV Memory
- Load Documents
- MLflow Registration
- Generated Answer
- Message History

# Start Execution

In [1]:
import time
import os 
import sys

sys.path.append(os.path.abspath(os.path.join(os.getcwd(), "..")))

from src.utils import display_image, get_response_from_llm, json_schema_from_type, log_timing, logger

In [2]:
start_time = time.time()  
logger.info("Notebook execution started.")

2025-08-02 05:10:37 - INFO - Notebook execution started.


# Define User Constants

In [3]:
TOPIC: str = "Focus Flow"  
QUESTION: str = "Which poeple provided the feedback?"

# Install and Import Libraries

In [4]:
%%time

%pip install -r ../requirements.txt --quiet

Note: you may need to restart the kernel to use updated packages.
CPU times: user 93.8 ms, sys: 32.5 ms, total: 126 ms
Wall time: 9.98 s


In [5]:
from __future__ import annotations

# ─────── Standard Library ───────
import base64
import functools
import json
import logging
import multiprocessing
import os
import shutil
import sys
import time
import warnings
from collections import namedtuple
from pathlib import Path
from typing import Any, Dict, List, Literal, Optional, TypedDict

# ─────── Third-Party: Config & Progress ───────
import yaml
from tqdm import tqdm

# ─────── LangChain Core & Community ───────
from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter

from langchain_community.document_loaders import (
    CSVLoader,
    PyPDFLoader,
    TextLoader,
    UnstructuredExcelLoader,
    UnstructuredMarkdownLoader,
    UnstructuredWordDocumentLoader,
)
from langchain_community.llms import LlamaCpp

# ─────── LangGraph & Messaging ───────
from langgraph.graph import END, START, StateGraph

# ─────── Jupyter Display ───────
from IPython.display import HTML, Markdown, display


import mlflow

from src.simple_kv_memory import SimpleKVMemory
from src.agentic_feedback_model import AgenticFeedbackModel

# Configure Settings

In [6]:
# Suppress Python warnings
warnings.filterwarnings("ignore")

In [7]:
INPUT_PATH: Path = Path("../data/input")  

MEMORY_PATH: Path = Path("../data/memory")   

LLM_PATH = "/home/jovyan/datafabric/meta-llama3.1-8b-Q8/Meta-Llama-3.1-8B-Instruct-Q8_0.gguf"
CONTEXT_WINDOW = 8192
MAX_TOKENS = CONTEXT_WINDOW // 8
CHUNK_SIZE = CONTEXT_WINDOW // 2
CHUNK_OVERLAP = CHUNK_SIZE // 8  

EXPERIMENT_NAME = "AIStudio-Customer-Feedback-Analyzer-Experiment"
RUN_NAME = "AIStudio-Customer-Feedback-Analyzer-Run"
MODEL_NAME = "AIStudioCustomerFeedbackAnalyzerModel"
MODEL_PATH = "AIStudio-Customer-Feedback-Analyzer-Model"
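The budgets above are all derived from `CONTEXT_WINDOW`, so the arithmetic is worth spelling out. The sketch below recomputes them and the remaining headroom. It assumes the chunk size is counted in the same unit as the context window (tokens); if the splitter counts characters instead, a chunk takes up less of the window than shown. The `headroom` name is introduced here for illustration only.

```python
CONTEXT_WINDOW = 8192

MAX_TOKENS = CONTEXT_WINDOW // 8   # 1024: cap on generated output
CHUNK_SIZE = CONTEXT_WINDOW // 2   # 4096: size of one document chunk
CHUNK_OVERLAP = CHUNK_SIZE // 8    # 512: text shared by neighbouring chunks

# What is left for instructions and the question once a chunk and the
# generated answer are accounted for (hypothetical helper value).
headroom = CONTEXT_WINDOW - CHUNK_SIZE - MAX_TOKENS  # 3072

print(MAX_TOKENS, CHUNK_SIZE, CHUNK_OVERLAP, headroom)
```

The inference log in this notebook reports `overlap=256` rather than 512, which suggests the packaged model applies its own chunking settings instead of reading these notebook constants.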

In [8]:
logger.info('Notebook execution started.')

2025-08-02 05:10:50 - INFO - Notebook execution started.


# Verify Assets

In [9]:
def log_asset_status(asset_path: str | Path, asset_name: str) -> None:
    """
    Log whether a required asset exists on disk.

    Parameters:
        asset_path (str | Path): File or directory path to check.
        asset_name (str): Name of the asset for logging context.
    """
    if Path(asset_path).exists():
        logger.info(f"{asset_name} is properly configured.")
    else:
        logger.info(f"{asset_name} is not properly configured. Please ensure the required asset is correctly configured in your AI Studio project according to the README file.")

In [10]:
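log_asset_status(
    asset_path=INPUT_PATH,
    asset_name="Input Data",
)
log_asset_status(
    # Check the GGUF weights on disk (LLM_PATH). MODEL_PATH is only the
    # MLflow artifact path and does not exist as a local file.
    asset_path=LLM_PATH,
    asset_name="LLM",
)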

2025-08-02 05:10:50 - INFO - Input Data is properly configured.
2025-08-02 05:10:50 - INFO - LLM is not properly configured. Please ensure the required asset is correctly configured in your AI Studio project according to the README file.


# KV Memory

In [11]:
memory: SimpleKVMemory = SimpleKVMemory(MEMORY_PATH)
memory.set('dummy key', 'dummy value')
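`SimpleKVMemory` lives in `src.simple_kv_memory` and its implementation is not shown in this notebook. As a rough mental model only, a key-value memory backed by a directory could look like the sketch below. The class name `TinyKVMemory`, the `memory.json` file name, and the `get` method are assumptions made for illustration; only `set(key, value)` and the directory argument appear in the cell above.

```python
import json
from pathlib import Path
from typing import Dict, Optional


class TinyKVMemory:
    """Hypothetical stand-in for SimpleKVMemory: a dict persisted as JSON."""

    def __init__(self, directory: Path) -> None:
        self._file = Path(directory) / "memory.json"
        self._file.parent.mkdir(parents=True, exist_ok=True)
        # Reload earlier entries so answers survive across notebook runs.
        self._data: Dict[str, str] = (
            json.loads(self._file.read_text(encoding="utf-8"))
            if self._file.exists()
            else {}
        )

    def get(self, key: str) -> Optional[str]:
        """Return the stored value, or None on a cache miss."""
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        """Store the value and write the whole mapping back to disk."""
        self._data[key] = value
        self._file.write_text(
            json.dumps(self._data, ensure_ascii=False, indent=2),
            encoding="utf-8",
        )
```

A store like this lets a workflow skip the LLM entirely when the same question has been answered before, which matches the "Cache miss" line in the inference log.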

# Load Documents

In [12]:
logger.info("📂 Scanning directory for documents: %s", INPUT_PATH)

supported_extensions = {
    ".txt": TextLoader,
    ".csv": lambda path: CSVLoader(path, encoding="utf-8", csv_args={"delimiter": ","}),
    ".xlsx": UnstructuredExcelLoader,
    ".docx": UnstructuredWordDocumentLoader,
    ".pdf": PyPDFLoader,
    ".md": UnstructuredMarkdownLoader,
}

all_docs = []

for file_path in Path(INPUT_PATH).rglob("*"):
    # Skip hidden/system folders
    if any(part.startswith(".") and part not in {".", ".."} for part in file_path.parts):
        continue
    
    ext = file_path.suffix.lower()
    loader_class = supported_extensions.get(ext)
    
    if loader_class:
        try:
            loader = loader_class(str(file_path)) if callable(loader_class) else loader_class
            docs = loader.load()
            all_docs.extend(docs)
            logger.info("✅ Loaded %d docs from %s", len(docs), file_path.name)
        except Exception as e:
            logger.warning("❌ Failed to load %s: %s", file_path.name, e)
    else:
        logger.info("⚠️ Unsupported file type: %s", file_path.name)

2025-08-02 05:10:50 - INFO - 📂 Scanning directory for documents: ../data/input
2025-08-02 05:10:50 - INFO - ✅ Loaded 1 docs from csv.csv
short text: "docx test". Defaulting to English.
2025-08-02 05:10:55 - INFO - ✅ Loaded 1 docs from docx.docx
short text: "md test". Defaulting to English.
2025-08-02 05:10:55 - INFO - ✅ Loaded 1 docs from md.md
2025-08-02 05:10:55 - INFO - ✅ Loaded 1 docs from pdf.pdf
2025-08-02 05:10:55 - INFO - ✅ Loaded 1 docs from sample-product-feedback-doc.md
2025-08-02 05:10:55 - INFO - ✅ Loaded 1 docs from txt.txt
short text: "excel test excel test". Defaulting to English.
2025-08-02 05:10:56 - INFO - ✅ Loaded 1 docs from xlsx.xlsx
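The loading loop relies on two small rules: skip anything under a dot-prefixed path component, and pick a loader by lower-cased file suffix. The sketch below isolates both rules so they can be tried without LangChain installed. `LOADER_LABELS`, `is_hidden`, and `loader_label_for` are hypothetical names, and plain strings stand in for the loader classes.

```python
from pathlib import Path
from typing import Optional

# Suffix -> loader label. Strings stand in for the real loader classes
# so the sketch runs without LangChain installed.
LOADER_LABELS = {
    ".txt": "TextLoader",
    ".pdf": "PyPDFLoader",
    ".md": "UnstructuredMarkdownLoader",
}


def is_hidden(path: Path) -> bool:
    """True if any path component starts with a dot, ignoring "." and ".."."""
    return any(
        part.startswith(".") and part not in {".", ".."} for part in path.parts
    )


def loader_label_for(path: Path) -> Optional[str]:
    """Return the loader label for a path, or None if hidden or unsupported."""
    if is_hidden(path):
        return None
    # Lower-casing the suffix makes "Report.PDF" match the ".pdf" entry.
    return LOADER_LABELS.get(path.suffix.lower())


print(loader_label_for(Path("../data/input/Report.PDF")))
print(loader_label_for(Path("../data/input/.ipynb_checkpoints/notes.md")))
print(loader_label_for(Path("../data/input/archive.zip")))
```

Excluding `".."` from the hidden check matters here because the input path itself starts with `../`.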


In [13]:
INPUT_TEXT = '\n\n'.join([doc.page_content for doc in all_docs])

# MLflow Registration

In [14]:
# 1. Set MLflow tracking URI and experiment
mlflow.set_tracking_uri(os.getenv("MLFLOW_TRACKING_URI", "/phoenix/mlflow"))
mlflow.set_experiment(experiment_name=EXPERIMENT_NAME)
print(f"Using MLflow tracking URI: {mlflow.get_tracking_uri()}")
print(f"Experiment: {EXPERIMENT_NAME}")

2025/08/02 05:10:56 INFO mlflow.tracking.fluent: Experiment with name 'AIStudio-Customer-Feedback-Analyzer-Experiment' does not exist. Creating a new experiment.


Using MLflow tracking URI: /phoenix/mlflow
Experiment: AIStudio-Customer-Feedback-Analyzer-Experiment


In [15]:
%%time

# 2. Artifacts to package with the model: the GGUF weights and the KV memory directory
MODEL_ARTIFACTS = {
    "model_path": str(LLM_PATH),
    "memory_path": str(MEMORY_PATH),
}

# Start an MLflow run, then log and register the model
with mlflow.start_run(run_name=RUN_NAME) as run:
    print(f"🚀 Started MLflow run: {run.info.run_id}")

    # Log and register the model using the classmethod
    AgenticFeedbackModel.log_model(
        model_name=MODEL_NAME,
        model_path=MODEL_PATH,
        model_artifacts=MODEL_ARTIFACTS
    )

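    # Register the logged model explicitly. The recorded output shows two
    # versions being created in this run, which suggests log_model() already
    # registers MODEL_NAME, so this call adds a second version.
    model_uri = f"runs:/{run.info.run_id}/{MODEL_PATH}"
    mlflow.register_model(model_uri=model_uri, name=MODEL_NAME)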

logger.info(f"✅ Model '{MODEL_NAME}' successfully logged and registered.")

2025/08/02 05:10:56 INFO mlflow.models.signature: Inferring model signature from type hints


🚀 Started MLflow run: a7b20165a68840458b267295f8c3d29d


Successfully registered model 'AIStudioCustomerFeedbackAnalyzerModel'.
Created version '1' of model 'AIStudioCustomerFeedbackAnalyzerModel'.
Registered model 'AIStudioCustomerFeedbackAnalyzerModel' already exists. Creating a new version of this model...
Created version '2' of model 'AIStudioCustomerFeedbackAnalyzerModel'.
2025-08-02 05:14:58 - INFO - ✅ Model 'AIStudioCustomerFeedbackAnalyzerModel' successfully logged and registered.


CPU times: user 892 ms, sys: 17.1 s, total: 18 s
Wall time: 4min 2s
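This notebook uses two MLflow URI schemes: `runs:/<run_id>/<artifact_path>` points at an artifact logged inside a run, and `models:/<name>/<version>` points at a version in the Model Registry. The helper functions below are hypothetical conveniences that only format those strings, using the run ID and model name from the output above.

```python
from typing import Union


def run_model_uri(run_id: str, artifact_path: str) -> str:
    """URI of a model logged as an artifact of a specific run."""
    return f"runs:/{run_id}/{artifact_path}"


def registry_model_uri(name: str, version: Union[int, str]) -> str:
    """URI of a specific version of a registered model."""
    return f"models:/{name}/{version}"


print(run_model_uri(
    "a7b20165a68840458b267295f8c3d29d",
    "AIStudio-Customer-Feedback-Analyzer-Model",
))
print(registry_model_uri("AIStudioCustomerFeedbackAnalyzerModel", 2))
```

The first form is what `mlflow.register_model` receives. The second is what the later cells pass to `mlflow.models.get_model_info` and `mlflow.pyfunc.load_model`.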


In [16]:
from mlflow.tracking import MlflowClient

# 3. Retrieve the latest version from the Model Registry
client = MlflowClient()
versions = client.get_latest_versions(MODEL_NAME, stages=["None"])

if not versions:
    raise RuntimeError(f"No registered versions found for model '{MODEL_NAME}'.")
    
latest_version = versions[0].version
model_info = mlflow.models.get_model_info(f"models:/{MODEL_NAME}/{latest_version}")

logger.info(f"Latest registered version of '{MODEL_NAME}': {latest_version}")
logger.info(f"Signature: {model_info.signature}")

2025-08-02 05:14:59 - INFO - Latest registered version of 'AIStudioCustomerFeedbackAnalyzerModel': 2
2025-08-02 05:14:59 - INFO - Signature: inputs: 
  [{input_text: string (required), question: string (required), topic: string (required)} (required)]
outputs: 
  [{answer: string (required), messages: string (required)} (required)]
params: 
  None
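The signature above says each input row needs three string fields: `input_text`, `question`, and `topic`. Checking a payload against it before calling `predict` can surface mistakes earlier than a failure inside the model. The sketch below is one way to do that; `REQUIRED_INPUT_FIELDS` and `validate_payload` are hypothetical names and not part of MLflow or the packaged model.

```python
from typing import Any, Dict, List

# Field names taken from the logged model signature.
REQUIRED_INPUT_FIELDS = ("input_text", "question", "topic")


def validate_payload(rows: List[Dict[str, Any]]) -> None:
    """Raise ValueError if any row lacks a required string field."""
    for index, row in enumerate(rows):
        missing = [name for name in REQUIRED_INPUT_FIELDS if name not in row]
        if missing:
            raise ValueError(f"row {index} is missing fields: {missing}")
        not_strings = [
            name for name in REQUIRED_INPUT_FIELDS if not isinstance(row[name], str)
        ]
        if not_strings:
            raise ValueError(f"row {index} has non-string fields: {not_strings}")


# Passes silently: every required field is present and is a string.
validate_payload(
    [{"topic": "Focus Flow", "question": "Who gave feedback?", "input_text": "..."}]
)
```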



In [17]:
%%time

# 4. Load the model from the Model Registry
loaded_model = mlflow.pyfunc.load_model(model_uri=f"models:/{MODEL_NAME}/{latest_version}")
logger.info(f"Successfully loaded model '{MODEL_NAME}' version {latest_version} for inference.")

2025-08-02 05:16:07 - INFO - Successfully loaded model 'AIStudioCustomerFeedbackAnalyzerModel' version 2 for inference.


CPU times: user 1.27 s, sys: 2 s, total: 3.27 s
Wall time: 1min 8s


In [18]:
# 5. Run a sample inference using the loaded model
input_payload = [{"topic": TOPIC, "question": QUESTION, "input_text": INPUT_TEXT}]

print("\n=== Running Sample Inference ===")
results = loaded_model.predict(input_payload)
result = results[0]

2025-08-02 05:16:07 - INFO - 🗣️ Ingested user question: Which poeple provided the feedback?
2025-08-02 05:16:07 - INFO - Function 'ingest_question' took 0.0006 seconds.
--------------------------------------------------------------




=== Running Sample Inference ===


2025-08-02 05:16:19 - INFO - 🧠 Relevance response: yes → Relevant
2025-08-02 05:16:19 - INFO - Function 'check_relevance' took 11.4955 seconds.
--------------------------------------------------------------

2025-08-02 05:16:19 - INFO - 🧭 Cache miss for question: Which poeple provided the feedback?
2025-08-02 05:16:19 - INFO - Function 'check_memory' took 0.0021 seconds.
--------------------------------------------------------------

2025-08-02 05:16:19 - INFO - ✏️ Rewritten user question:
→ Who are the individuals mentioned in the document as providing feedback?
2025-08-02 05:16:19 - INFO - Function 'rewrite_question' took 0.5458 seconds.
--------------------------------------------------------------

2025-08-02 05:16:19 - INFO - 📑 Starting chunking for 1 loaded documents
2025-08-02 05:16:19 - INFO - 🧩 Created 2 total chunks (size=4096, overlap=256)
2025-08-02 05:16:19 - INFO - Function 'create_chunks' took 0.0016 seconds.
--------------------------------------------------------------


🔚 === Final Answer ===

# 🧠 Synthesized partial answer (1/1)

Since the user's question is about individuals mentioned in the document as providing feedback, and neither Chunk 1 nor Chunk 2 mentions any individuals providing feedback, the final answer is:

**No individuals are mentioned in the document as providing feedback.**
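The timing lines in the log (for example `Function 'check_relevance' took 11.4955 seconds.`) have the shape a decorator such as the imported `log_timing` would produce. Its real implementation lives in `src.utils` and is not shown here. A minimal sketch of such a decorator, assuming it simply wraps the call and logs the elapsed time, might be:

```python
import functools
import logging
import time
from typing import Any, Callable

logger = logging.getLogger("timing_sketch")


def log_timing(func: Callable[..., Any]) -> Callable[..., Any]:
    """Log how long each call to `func` takes, even if it raises."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            logger.info("Function '%s' took %.4f seconds.", func.__name__, elapsed)

    return wrapper


@log_timing
def ingest_question(question: str) -> str:
    """Toy node: normalize the incoming question."""
    return question.strip()
```

Per-node timings make the bottleneck easy to spot: in the run above, `check_relevance` (an LLM call) takes about 11.5 seconds, while `check_memory` and `create_chunks` take milliseconds.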




# Generated Answer

In [19]:
display(Markdown(result.answer))

# 🧠 Synthesized partial answer (1/1)

Since the user's question is about individuals mentioned in the document as providing feedback, and neither Chunk 1 nor Chunk 2 mentions any individuals providing feedback, the final answer is:

**No individuals are mentioned in the document as providing feedback.**

# Message History

In [20]:
print(result.messages)

[
    {
        "role": "developer",
        "content": "User submitted a question."
    },
    {
        "role": "user",
        "content": "Which poeple provided the feedback?"
    },
    {
        "role": "developer",
        "content": "\ud83e\udde0 Relevance check result:"
    },
    {
        "role": "assistant",
        "content": "yes"
    },
    {
        "role": "developer",
        "content": "\ud83e\udded No cached answer found for question: 'Which poeple provided the feedback?'"
    },
    {
        "role": "developer",
        "content": "\u270f\ufe0f Rewritten user question:"
    },
    {
        "role": "assistant",
        "content": "Who are the individuals mentioned in the document as providing feedback?"
    },
    {
        "role": "developer",
        "content": "\ud83e\udde9 Chunked 1 documents into 2 chunks (size=4096, overlap=256)"
    },
    {
        "role": "developer",
        "content": "\ud83e\udde0 Processed 2 chunks for question: 'Who are the individual

In [21]:
end_time: float = time.time()
elapsed_time: float = end_time - start_time
elapsed_minutes: int = int(elapsed_time // 60)
elapsed_seconds: float = elapsed_time % 60

logger.info(f"⏱️ Total execution time: {elapsed_minutes}m {elapsed_seconds:.2f}s")
logger.info("✅ Notebook execution completed successfully.")

2025-08-02 05:16:22 - INFO - ⏱️ Total execution time: 5m 44.16s
2025-08-02 05:16:22 - INFO - ✅ Notebook execution completed successfully.


Built with ❤️ using [**HP AI Studio**](https://hp.com/ai-studio).