# Chat Debug Notebook (Ultra Simple)

A minimal loop that calls `ChatService.chat()` with **no history** and a **single user-role message** per call.
Each round is an independent call; the model does not see prior rounds.

## Usage
1. Run the environment cell.
2. Adjust parameters in the configuration cell.
3. Run the execute cell to print N independent answers.

Optional: to disable Arabic enforcement, set `CHAT_STRICT_LANGUAGE_ENFORCEMENT=0` (see the optional cell at the end) and then rerun the environment and execute cells.


In [30]:
# Minimal environment / imports (simplified)
import sys
from pathlib import Path

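# Locate the backend root heuristically: prefer ./backend under the current
# directory; otherwise fall back to the parent directory if it contains src/.
REPO_ROOT = Path.cwd()
BACKEND_ROOT = REPO_ROOT / 'backend'
if not BACKEND_ROOT.exists():
    alt = Path().resolve().parent
    if (alt / 'src').exists():
        BACKEND_ROOT = alt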

if str(BACKEND_ROOT) not in sys.path:
    sys.path.insert(0, str(BACKEND_ROOT))
SRC_PATH = BACKEND_ROOT / 'src'
if SRC_PATH.exists() and str(SRC_PATH) not in sys.path:
    sys.path.insert(0, str(SRC_PATH))

from beautyai_inference.services.inference.chat_service import ChatService

# Create the chat service and try to load the default lightweight model
chat_service = ChatService()
try:
    chat_service.load_default_model_from_config()
except Exception as exc:
    # Non-fatal: the user may load a model elsewhere, but report why loading failed
    print(f"Default model not loaded: {exc!r}")

2025-08-17 05:18:35,390 | INFO | beautyai_inference.services.inference.content_filter_service | Content filter initialized - Allowed: 252 medical keywords, Forbidden: 103 inappropriate keywords
2025-08-17 05:18:35,391 | INFO | beautyai_inference.services.inference.chat_service | Loading fastest model for 24/7 service: qwen3-unsloth-q4ks (unsloth/Qwen3-14B-GGUF)
2025-08-17 05:18:35,391 | INFO | beautyai_inference.core.model_manager | Stopped keep-alive timer for model 'qwen3-unsloth-q4ks'
2025-08-17 05:18:35,391 | INFO | beautyai_inference.core.model_manager | Started keep-alive timer for model 'qwen3-unsloth-q4ks' (will unload after 60 minutes of inactivity)
2025-08-17 05:18:35,392 | INFO | beautyai_inference.services.inference.chat_service | ✅ Fastest model loaded successfully and ready on GPU: qwen3-unsloth-q4ks


Using already loaded model 'qwen3-unsloth-q4ks'.


In [31]:
# Very simple helper: send the same user message N times and print each reply.
# - No accumulated history
# - Only the user role is sent each round
# - The helper itself does no logging and stores no assistant turns

def run_chat(message: str, language: str, rounds: int = 1) -> None:
    """Call chat_service.chat() `rounds` times with the same message.

    Reads max_new_tokens, temperature and top_p from notebook globals,
    so the configuration cell must be run before this function is called.
    """
    for i in range(rounds):
        print(f"\n=== ROUND {i+1} ===")
        result = chat_service.chat(
            message=message,
            conversation_history=None,  # no history
            max_length=max_new_tokens,
            language=language,
            thinking_mode=False,
            temperature=temperature,
            top_p=top_p,
        )
        print(f"USER : {message}")
        print(f"AI   : {result.get('response')}")
    print(f"\nCompleted {rounds} rounds.")
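
The helper above takes `max_new_tokens`, `temperature` and `top_p` from notebook globals, so it only works once the configuration cell has run. A minimal sketch of a variant that takes them as arguments and returns the replies instead of printing them might look like this. It assumes `chat()` accepts the keyword arguments used above and returns a dict with a `response` key; `StubChatService` and `run_chat_collect` are hypothetical names used only to keep the example self-contained.

```python
from typing import Any, Dict, List, Optional


class StubChatService:
    """Hypothetical stand-in for ChatService so the sketch runs without a model."""

    def chat(self, message: str, conversation_history: Optional[list] = None,
             **kwargs: Any) -> Dict[str, Any]:
        # Echo the input; a real service would return generated text here.
        return {"response": f"echo: {message}"}


def run_chat_collect(service: Any, message: str, language: str, rounds: int = 1,
                     max_new_tokens: int = 192, temperature: float = 0.0,
                     top_p: float = 0.95) -> List[Optional[str]]:
    """Send the same message `rounds` times with no history; return each reply."""
    replies: List[Optional[str]] = []
    for _ in range(rounds):
        result = service.chat(
            message=message,
            conversation_history=None,  # each call is independent
            max_length=max_new_tokens,
            language=language,
            thinking_mode=False,
            temperature=temperature,
            top_p=top_p,
        )
        replies.append(result.get("response"))
    return replies


replies = run_chat_collect(StubChatService(), "hello", language="en", rounds=3)
print(replies)
```

With the real service, pass `chat_service` in place of the stub; the returned list can then be compared across rounds.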

In [35]:
# Configuration
message = '/no_think ما هي الجاذبية؟'
response_language = 'ar'   # 'auto', 'ar', 'en', 'es', 'fr', 'de'
temperature = 0.0
top_p = 0.95
max_new_tokens = 192
rounds = 3  # number of repeated independent calls

In [36]:
# Execute simple chat rounds
run_chat(message=message, language=response_language, rounds=rounds)

2025-08-17 05:19:43,342 | INFO | beautyai_inference.core.model_manager | Stopped keep-alive timer for model 'qwen3-unsloth-q4ks'
2025-08-17 05:19:43,343 | INFO | beautyai_inference.core.model_manager | Started keep-alive timer for model 'qwen3-unsloth-q4ks' (will unload after 60 minutes of inactivity)
2025-08-17 05:19:43,343 | INFO | beautyai_inference.services.inference.chat_service | Using persistent default model: qwen3-unsloth-q4ks
2025-08-17 05:19:43,343 | INFO | beautyai_inference.services.inference.chat_service | 🌍 Using specified language: ar
2025-08-17 05:19:43,344 | INFO | beautyai_inference.services.inference.chat_service | Generating response for language: ar



=== ROUND 1 ===


2025-08-17 05:19:44,513 | INFO | beautyai_inference.services.inference.chat_service | [chat][ar_enforce] original_len=243 arabic_chars=197 latin_chars=0 ratio_ar=0.811 min_ratio=0.100 min_ar_chars=8 soft_mode=True
2025-08-17 05:19:44,514 | INFO | beautyai_inference.utils.language_detection | Language detection result: ar (confidence: 1.585)
2025-08-17 05:19:44,514 | INFO | beautyai_inference.services.inference.chat_service | [chat][ar_enforce] output_language_detected=ar conf=1.585
2025-08-17 05:19:44,514 | INFO | beautyai_inference.services.inference.chat_service | Generated response (first 100 chars): ١
الجاذبية هي قوة طبيعية تجذب الأجسام نحو بعضها البعض، وتُعتبر من القوى الأساسية في الكون. على سطح ا
2025-08-17 05:19:44,514 | INFO | beautyai_inference.core.model_manager | Stopped keep-alive timer for model 'qwen3-unsloth-q4ks'
2025-08-17 05:19:44,515 | INFO | beautyai_inference.core.model_manager | Started keep-alive timer for model 'qwen3-unsloth-q4ks' (will unload after 60 minutes 

USER : /no_think ما هي الجاذبية؟
AI   : ١
الجاذبية هي قوة طبيعية تجذب الأجسام نحو بعضها البعض، وتُعتبر من القوى الأساسية في الكون. على سطح الأرض، تظهر الجاذبية كقوة تدفع الأشياء نحو الأسفل، وهي ما يسمى بالوزن. هذه القوة تؤثر على كل شيء في الكون، من الصخور والنجوم إلى البشر والمجرات.

=== ROUND 2 ===


2025-08-17 05:19:45,652 | INFO | beautyai_inference.services.inference.chat_service | [chat][ar_enforce] original_len=243 arabic_chars=197 latin_chars=0 ratio_ar=0.811 min_ratio=0.100 min_ar_chars=8 soft_mode=True
2025-08-17 05:19:45,652 | INFO | beautyai_inference.utils.language_detection | Language detection result: ar (confidence: 1.585)
2025-08-17 05:19:45,653 | INFO | beautyai_inference.services.inference.chat_service | [chat][ar_enforce] output_language_detected=ar conf=1.585
2025-08-17 05:19:45,653 | INFO | beautyai_inference.services.inference.chat_service | Generated response (first 100 chars): ١
الجاذبية هي قوة طبيعية تجذب الأجسام نحو بعضها البعض، وتُعتبر من القوى الأساسية في الكون. على سطح ا
2025-08-17 05:19:45,653 | INFO | beautyai_inference.core.model_manager | Stopped keep-alive timer for model 'qwen3-unsloth-q4ks'
2025-08-17 05:19:45,654 | INFO | beautyai_inference.core.model_manager | Started keep-alive timer for model 'qwen3-unsloth-q4ks' (will unload after 60 minutes 

USER : /no_think ما هي الجاذبية؟
AI   : ١
الجاذبية هي قوة طبيعية تجذب الأجسام نحو بعضها البعض، وتُعتبر من القوى الأساسية في الكون. على سطح الأرض، تظهر الجاذبية كقوة تدفع الأشياء نحو الأسفل، وهي ما يسمى بالوزن. هذه القوة تؤثر على كل شيء في الكون، من الصخور والنجوم إلى البشر والمجرات.

=== ROUND 3 ===


2025-08-17 05:19:46,791 | INFO | beautyai_inference.services.inference.chat_service | [chat][ar_enforce] original_len=243 arabic_chars=197 latin_chars=0 ratio_ar=0.811 min_ratio=0.100 min_ar_chars=8 soft_mode=True
2025-08-17 05:19:46,792 | INFO | beautyai_inference.utils.language_detection | Language detection result: ar (confidence: 1.585)
2025-08-17 05:19:46,792 | INFO | beautyai_inference.services.inference.chat_service | [chat][ar_enforce] output_language_detected=ar conf=1.585
2025-08-17 05:19:46,792 | INFO | beautyai_inference.services.inference.chat_service | Generated response (first 100 chars): ١
الجاذبية هي قوة طبيعية تجذب الأجسام نحو بعضها البعض، وتُعتبر من القوى الأساسية في الكون. على سطح ا


USER : /no_think ما هي الجاذبية؟
AI   : ١
الجاذبية هي قوة طبيعية تجذب الأجسام نحو بعضها البعض، وتُعتبر من القوى الأساسية في الكون. على سطح الأرض، تظهر الجاذبية كقوة تدفع الأشياء نحو الأسفل، وهي ما يسمى بالوزن. هذه القوة تؤثر على كل شيء في الكون، من الصخور والنجوم إلى البشر والمجرات.

Completed 3 rounds.
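
Each round's `[chat][ar_enforce]` log line above carries a timestamp, so per-round latency can be estimated from the log itself. The sketch below parses the three timestamps copied from that output and computes the gap between consecutive rounds. The helper name `parse_log_time` is illustrative, and the timestamp format is inferred from the log lines shown rather than from the logger configuration.

```python
from datetime import datetime

# Format inferred from lines such as "2025-08-17 05:19:44,513 | INFO | ..."
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(stamp: str) -> datetime:
    """Parse the leading timestamp field of a log line."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


# Timestamps of the first [chat][ar_enforce] line in rounds 1, 2 and 3.
round_stamps = [
    "2025-08-17 05:19:44,513",
    "2025-08-17 05:19:45,652",
    "2025-08-17 05:19:46,791",
]

times = [parse_log_time(s) for s in round_stamps]
# Seconds elapsed between consecutive rounds.
gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
print(gaps)
```

For this run the gaps come out at roughly 1.14 seconds per round.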


In [None]:
# Optional: disable Arabic enforcement (uncomment if needed)
# import os; os.environ['CHAT_STRICT_LANGUAGE_ENFORCEMENT'] = '0'
# Then rerun the environment + execute cells.
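
For context when deciding whether to disable enforcement: the `[chat][ar_enforce]` log fields in the output above (`original_len`, `arabic_chars`, `latin_chars`, `ratio_ar`, `min_ratio`, `min_ar_chars`) suggest a character-ratio check. The sketch below is one plausible reading of those fields, not the service's actual implementation; the function names are hypothetical, and the real code may count characters differently (for example by ignoring whitespace or covering more Unicode blocks).

```python
def arabic_stats(text: str) -> dict:
    """Count Arabic-block characters and ASCII letters, plus the Arabic share."""
    arabic = sum(1 for ch in text if "\u0600" <= ch <= "\u06FF")
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    total = len(text)
    ratio = arabic / total if total else 0.0
    return {
        "original_len": total,
        "arabic_chars": arabic,
        "latin_chars": latin,
        "ratio_ar": ratio,
    }


def passes_arabic_check(text: str, min_ratio: float = 0.100,
                        min_ar_chars: int = 8) -> bool:
    """Assumed rule: enough Arabic characters both in share and in count."""
    stats = arabic_stats(text)
    return stats["ratio_ar"] >= min_ratio and stats["arabic_chars"] >= min_ar_chars


# The logged ratio is consistent with arabic_chars / original_len.
print(round(197 / 243, 3))

print(passes_arabic_check("مرحبا بالعالم"))
print(passes_arabic_check("hello world"))
```

The thresholds default to the `min_ratio` and `min_ar_chars` values shown in the log. What `CHAT_STRICT_LANGUAGE_ENFORCEMENT=0` changes inside the service is not shown in this notebook.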