# Fridge Chatbot Prototype (Local Tiny Model)
This notebook demonstrates a proof-of-concept AI chatbot that maintains a fridge state and can provide dinner suggestions. Uses a tiny local model (`flan-t5-small`) for CPU inference.

## 1. Install Dependencies

In [1]:
!pip install transformers torch sentencepiece
!pip install gpt4all
!pip install transformers accelerate bitsandbytes>0.37.0
!pip install --upgrade accelerate
!pip install huggingface_hub[hf_xet]






[notice] A new release of pip is available: 25.1.1 -> 25.2
[notice] To update, run: python.exe -m pip install --upgrade pip





[notice] A new release of pip is available: 25.1.1 -> 25.2
[notice] To update, run: python.exe -m pip install --upgrade pip

[notice] A new release of pip is available: 25.1.1 -> 25.2
[notice] To update, run: python.exe -m pip install --upgrade pip





[notice] A new release of pip is available: 25.1.1 -> 25.2
[notice] To update, run: python.exe -m pip install --upgrade pip





[notice] A new release of pip is available: 25.1.1 -> 25.2
[notice] To update, run: python.exe -m pip install --upgrade pip


## 2. Import Libraries

In [47]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from transformers import AutoModelForCausalLM
import torch
from pathlib import Path
from gpt4all import GPT4All
from transformers import GenerationConfig

## 3. Load Model

In [37]:
def resolve_model_dir(rel_path: str) -> Path:
    p = Path(rel_path)
    try:
        return p.resolve(strict=True)
    except FileNotFoundError:
        # Try relative to the notebook/script directory if needed, or raise a clear error
        raise FileNotFoundError(
            f"Local model directory not found at '{rel_path}'. "
            f"Run from your project root or pass an absolute path."
        )


print(Path().resolve())  # current working directory
print((Path("src/models/gpt4all-j-bnb-4bit-smashed")).resolve(strict=False))

local_model_dir = Path(r"D:\Cook-Bot\src\models\gpt4all-j-bnb-4bit-smashed").resolve(strict=True)

required_files = ["config.json", "model.safetensors"]
missing = [f for f in required_files if not (local_model_dir / f).exists()]
if missing:
    raise FileNotFoundError(
        f"Missing required file(s) in '{local_model_dir}': {missing}"
    )

use_bitsandbytes = False
try:
    import bitsandbytes as _  # noqa: F401
    use_bitsandbytes = True
except Exception:
    use_bitsandbytes = False

if not use_bitsandbytes:
    raise RuntimeError(
        "The local model appears to be 4-bit bitsandbytes-quantized. "
        "Load will fail without 'bitsandbytes'. Either:\n"
        "  - Use a non-quantized GPT-J checkpoint in the same folder, or\n"
        "  - Add 'bitsandbytes' to your environment."
    )


# Load tokenizer from HF repo (requires internet)
tokenizer = AutoTokenizer.from_pretrained("nomic-ai/gpt4all-j")

# Model load with offload to CPU
model = AutoModelForCausalLM.from_pretrained(
    local_model_dir,
    device_map=None,
    load_in_4bit=True,
    llm_int8_enable_fp32_cpu_offload=True,  # <-- allow CPU offload for modules that don't fit
    trust_remote_code=True
)

D:\Cook-Bot\src\models
D:\Cook-Bot\src\models\src\models\gpt4all-j-bnb-4bit-smashed


The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.


## 4. Initialize Fridge State and Conversation Memory

In [39]:
# Fridge inventory
fridge_state = {
    "eggs": 4,
    "milk": "500ml",
    "tomatoes": 3,
    "cheese": "200g"
}

# Conversation history (for sliding window memory)
conversation_history = []
SLIDING_WINDOW = 3  # last N messages

## 5. Helper Functions
### 5.1 Update Fridge State (simple rule-based)

In [40]:

def update_fridge(user_input: str):
    msg = None
    user_input_lower = user_input.lower()
    if "eat" in user_input_lower or "used" in user_input_lower:
        if "egg" in user_input_lower and fridge_state.get("eggs", 0) > 0:
            # try to detect number
            import re
            match = re.search(r'(\d+)', user_input_lower)
            qty = int(match.group(1)) if match else 1
            fridge_state["eggs"] = max(fridge_state["eggs"] - qty, 0)
            msg = f"Removed {qty} egg(s). You now have {fridge_state['eggs']} eggs."
    if "add" in user_input_lower or "buy" in user_input_lower:
        if "egg" in user_input_lower:
            import re
            match = re.search(r'(\d+)', user_input_lower)
            qty = int(match.group(1)) if match else 1
            fridge_state["eggs"] += qty
            msg = f"Added {qty} egg(s). You now have {fridge_state['eggs']} eggs."
    return msg

### 5.2 Build Prompt for Model

In [41]:

def build_prompt(user_input: str):
    recent_history = conversation_history[-SLIDING_WINDOW:]
    history_text = "\n".join([f"{m['role'].capitalize()}: {m['content']}" for m in recent_history])

    fridge_lines = "\n".join([f"- {k}: {v}" for k, v in fridge_state.items()])

    prompt = f"""
System: You are a smart kitchen assistant. Your goals:
1. Suggest dinner using only the fridge items below.
2. Update fridge state if the user mentions consuming or adding items.
3. Be concise and helpful.

Fridge Inventory:
{fridge_lines}

Conversation History:
{history_text}

User: {user_input}
Assistant:"""
    return prompt.strip()

### 5.3 Chat Function

In [42]:

def chat(user_input: str, max_new_tokens: int = 128):
    # Update fridge
    fridge_update_msg = update_fridge(user_input)

    # Build prompt
    prompt = build_prompt(user_input)

    # Encode prompt
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"]

    # Generate output
    with torch.no_grad():
        output_ids = model.generate(
            input_ids,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.7,
            pad_token_id=tokenizer.eos_token_id
        )

    # Decode reply (skip the prompt portion)
    reply = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True).strip()

    # Update conversation history
    conversation_history.append({"role": "user", "content": user_input})
    conversation_history.append({"role": "assistant", "content": reply})

    # Include fridge feedback
    if fridge_update_msg:
        reply = f"{reply}\n\n[Fridge Update]: {fridge_update_msg}"

    return reply


In [48]:
def chat_streaming(user_input, max_new_tokens=128):
    # Update fridge state first
    fridge_update_msg = update_fridge(user_input)

    # Build prompt
    prompt = build_prompt(user_input)

    # Tokenize prompt
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Configure generation
    gen_config = GenerationConfig(
        max_new_tokens=max_new_tokens,
        temperature=0.7,
        do_sample=True,
        top_p=0.9
    )

    # Generate token-by-token
    output_ids = model.generate(
        **inputs,
        generation_config=gen_config
    )

    # Decode tokens as text
    reply_text = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

    # Append to conversation history
    conversation_history.append({"role": "user", "content": user_input})
    conversation_history.append({"role": "assistant", "content": reply_text})

    # Combine reply + fridge update feedback
    if fridge_update_msg:
        reply_text = f"{reply_text}\n\n[Fridge Update]: {fridge_update_msg}"

    return reply_text

## 6. Demo Conversation

In [43]:

print(chat("Hi! What's in my fridge?"))
print(chat("I ate 2 eggs."))
print(chat("Can you suggest something for dinner?"))
print(chat("I bought more cheese."))
print(chat("What can I cook tonight?"))

The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


KeyboardInterrupt: 

In [None]:

print(chat_streaming("Hi! What's in my fridge?"))

`generation_config` default values have been modified to match model-specific defaults: {'use_cache': False, 'bos_token_id': 50256, 'eos_token_id': 50256}. If this is not desired, please set these values explicitly.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
