# 08 - Streamlit App Preparation

I prepare the trained model and helper functions for building the SleepSense Streamlit app.  
This notebook lets me test sample predictions and confirm everything works smoothly before deployment.


In [22]:
# I import required libraries and set file paths
import os, joblib
import pandas as pd
import numpy as np

RIDGE_MODEL = "../models/ridge_model.pkl"
FEATURES_PATH = "../data/processed/sleepsense_features.csv"
os.makedirs("../app", exist_ok=True)

print("imports ready and folder structure confirmed")


imports ready and folder structure confirmed


#### ***Inference: I make sure my environment and paths are set correctly.***

In [23]:
# I load my final Ridge Regression model and the processed dataset
model = joblib.load(RIDGE_MODEL)
df = pd.read_csv(FEATURES_PATH)

print("model and dataset loaded successfully")
print("dataset shape:", df.shape)


model and dataset loaded successfully
dataset shape: (12000, 45)


#### ***Inference: Both model and dataset are ready for testing.***

## ***Extract features used in model***

In [24]:
# I identify which features the Ridge model was trained on
feature_cols = [
    'sleep_deficit','digital_fatigue','env_stress','lifestyle_balance',
    'late_snack_effect','fatigue_env_interaction','is_metro',
    'avg_sleep_hours','screen_time_hours','stress_level',
    'physical_activity_min','age','family_size'
]

feature_cols = [f for f in feature_cols if f in df.columns]
print("features used in app prediction:", feature_cols)


features used in app prediction: ['sleep_deficit', 'digital_fatigue', 'env_stress', 'lifestyle_balance', 'late_snack_effect', 'fatigue_env_interaction', 'is_metro', 'avg_sleep_hours', 'screen_time_hours', 'stress_level', 'physical_activity_min', 'age', 'family_size']


#### ***Inference: These are the exact columns my app will need for user input.***
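The list comprehension in the cell above silently drops any name that is not in `df.columns`, so a missing column would leave the app with fewer inputs than the model was trained on. A minimal sketch of an explicit check follows; `find_missing_features` is a hypothetical helper and the two lists are made-up examples, not the real dataset.

```python
def find_missing_features(required, available):
    """Return the required feature names that are absent from `available`, in order."""
    available_set = set(available)
    return [name for name in required if name not in available_set]


# Illustrative lists only; in the notebook these would be feature_cols and df.columns
required = ["sleep_deficit", "digital_fatigue", "stress_level", "age"]
available = ["age", "stress_level", "sleep_deficit", "city"]

missing = find_missing_features(required, available)
if missing:
    print("missing features:", missing)  # missing features: ['digital_fatigue']
```

Raising an error when `missing` is non-empty is one option; printing a warning is another. Which one fits depends on how strict the app should be.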

## ***Creating helper function for prediction***

In [25]:
# I define a function that accepts a dictionary of inputs and returns predicted sleep quality
def predict_sleep_quality(input_data: dict):
    """
    I take input_data (a dictionary), convert it to a DataFrame,
    ensure all required columns exist, and return predicted sleep quality score.
    """
    # make dataframe from single record
    x_input = pd.DataFrame([input_data])
    
    # add any missing columns with 0 (in case city or category not provided)
    for col in feature_cols:
        if col not in x_input.columns:
            x_input[col] = 0
    
    # reorder columns to match training
    x_input = x_input[feature_cols]
    
    # predict sleep quality
    prediction = model.predict(x_input)[0]
    return round(prediction, 2)


### ***Comment:***
* ***This function is what the Streamlit app will use for real-time prediction.***

* ***Inference: I can now pass user-like data and instantly get predicted sleep quality.***
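The fill-and-reorder step in the function can be illustrated without pandas. The sketch below aligns one record to a fixed feature order and also reports which features were defaulted and which input keys were ignored. `align_record` and its return shape are assumptions made for illustration; they are not part of the notebook.

```python
def align_record(record, feature_order, fill_value=0.0):
    """Return (row, defaulted, ignored) for one input record.

    row       : values in feature_order, with fill_value for absent features
    defaulted : features that were absent from the record
    ignored   : record keys that the model does not use
    """
    row = [record.get(name, fill_value) for name in feature_order]
    defaulted = [name for name in feature_order if name not in record]
    ignored = [key for key in record if key not in feature_order]
    return row, defaulted, ignored


# Shortened, illustrative feature list and record
feature_order = ["sleep_deficit", "stress_level", "age"]
record = {"age": 28, "stress_level": 5, "work_hours": 9}

row, defaulted, ignored = align_record(record, feature_order)
print(row)        # [0.0, 5, 28]
print(defaulted)  # ['sleep_deficit']
print(ignored)    # ['work_hours']
```

Surfacing `defaulted` in the app could help a user notice that a prediction was made with some inputs silently set to 0.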

## ***Test sample prediction (example)***

In [26]:
# I test the model with a custom example input
sample_input = {
    "age": 28,
    "family_size": 4,
    "work_hours": 9,
    "avg_sleep_hours": 7,
    "screen_time_hours": 5,
    "tea_cups": 2,
    "coffee_cups": 1,
    "late_snack": 1,
    "spice_intake": 2,
    "religious_freq": 3,
    "festival_freq": 1,
    "physical_activity_min": 40,
    "bedtime_variability": 2,
    "stress_level": 5,
    "city_noise_dB": 60,
    "light_pollution_index": 70,
    "air_quality_index": 90,
    "sleep_deficit": -0.5,
    "digital_fatigue": 0.8,
    "env_stress": 0.7,
    "lifestyle_balance": -0.2,
    "late_snack_effect": 0.3,
    "fatigue_env_interaction": 0.5,
    "is_metro": 1
}

pred_value = predict_sleep_quality(sample_input)
print("Predicted Sleep Quality Score:", pred_value)


Predicted Sleep Quality Score: 15.67


#### ***Inference: The function returns a score for one full data record; keys that are not in `feature_cols` (for example `work_hours`) are ignored.***

## ***Create helper for clean prediction display***

#### ***Inference: My app will use `display_prediction` from the helper module in the next section to show friendly text instead of just numbers.***
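The friendly text depends first on which label key a score falls under. In the helper module in the next section, scores below 18 use `critical`, scores from 18 up to 75 use `poor`, and scores of 75 or more use `excellent`. A compact, table-driven sketch of that lookup is shown here; `band_key` is a hypothetical name, and treating a score of exactly 0 as valid is an assumption of this sketch.

```python
from bisect import bisect_right

# Cut points between the three label keys, read off the module's if/elif ladder
BOUNDS = [18.0, 75.0]
KEYS = ["critical", "poor", "excellent"]


def band_key(score):
    """Return the label key for a 0-100 score, or None if it is out of range."""
    if not 0.0 <= score <= 100.0:  # also rejects NaN, since comparisons with NaN are False
        return None
    # bisect_right counts how many cut points are <= score
    return KEYS[bisect_right(BOUNDS, score)]


print(band_key(15.67))  # critical
print(band_key(18))     # poor
print(band_key(75))     # excellent
print(band_key(-1))     # None
```

This only picks the label key. The per-band detail messages would still need their own table if the whole ladder were rewritten this way.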

## ***Saving helper functions for Streamlit use***

In [29]:
# file: app/predict_helper.py
# encoding: utf-8
"""
SleepSense helper module:
- predict_sleep_quality(input_dict) -> float score (0-100)
- display_prediction(pred_value, language='English', verbose=False) -> message (localized)

Save this file as app/predict_helper.py (UTF-8).
"""

import os
from pathlib import Path
import pandas as pd
import joblib

# ✅ FIXED MODEL PATH HANDLING — works in both Notebook and Streamlit
try:
    # if running as script or in Streamlit, __file__ exists
    current_dir = Path(__file__).resolve().parent
except NameError:
    # if running in Jupyter / interactive shell, fallback to current working directory
    current_dir = Path(os.getcwd())

# Define model path safely
MODEL_PATH = current_dir.parent / "models" / "ridge_model.pkl"

# print info (optional for debug)
print(f"✅ Model path resolved to: {MODEL_PATH}")

# --- Load the model safely ---
if MODEL_PATH.exists():
    model = joblib.load(MODEL_PATH)
    print("✅ Model loaded successfully!")
else:
    model = None
    print("⚠️ Model file not found — please ensure ridge_model.pkl exists in /models/")


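# Feature columns in training order (same list as feature_cols in the notebook)
FEATURE_COLS = [
    'sleep_deficit', 'digital_fatigue', 'env_stress', 'lifestyle_balance',
    'late_snack_effect', 'fatigue_env_interaction', 'is_metro',
    'avg_sleep_hours', 'screen_time_hours', 'stress_level',
    'physical_activity_min', 'age', 'family_size'
]


def predict_sleep_quality(input_data: dict) -> float:
    """
    Convert input dict to DataFrame, ensure required columns, predict with the loaded model,
    and return a rounded float score (0-100).
    - input_data: dict with any subset of features. Missing features are filled with 0.
    """
    # fail early with a clear message if the model file was not found at import time
    if model is None:
        raise RuntimeError(f"Model not loaded; expected file at {MODEL_PATH}")

    # create single-row dataframe
    x = pd.DataFrame([input_data])

    # ensure all feature columns exist
    for col in FEATURE_COLS:
        if col not in x.columns:
            x[col] = 0

    # reorder columns to match model
    x = x[FEATURE_COLS]

    # coerce every column to numeric; values that cannot be parsed become 0
    x = x.apply(pd.to_numeric, errors='coerce').fillna(0)

    # prediction (uses the module-level `model` loaded above)
    pred = model.predict(x)[0]

    # constrain to 0-100 and round
    pred = float(pred)
    pred = max(0.0, min(100.0, pred))
    return round(pred, 2)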


def display_prediction(pred_value: float, language: str = 'English', verbose: bool = False) -> str:
    """
    Return multilingual, human-friendly sleep interpretation for pred_value (0-100).
    - language: full language name string (e.g., 'Hindi', 'Tamil', 'Bengali', 'English', etc.)
    - verbose: if True, prints internal debug lines (useful during development)
    """

    # small safety convert
    try:
        score = float(pred_value)
    except Exception:
        score = -9999.0

    if verbose:
        print(f"🧮 Predicted Sleep Quality Score: {score}")
        print(f"🌐 Language requested: {language}")

    # --- translations: full-language names (expandable) ---
    translations = {
        "English": {"excellent": "Excellent Sleep Quality", "poor": "Poor Sleep Quality",
                    "critical": "Critical Sleep Condition", "rest": "Take proper rest and reduce stress."},
        "Hindi": {"excellent": "उत्कृष्ट नींद गुणवत्ता", "poor": "कमज़ोर नींद गुणवत्ता",
                  "critical": "गंभीर नींद स्थिति", "rest": "अच्छी नींद लें और तनाव कम करें।"},
        "Bengali": {"excellent": "চমৎকার ঘুমের মান", "poor": "দুর্বল ঘুমের মান",
                    "critical": "গুরুতর ঘুমের অবস্থা", "rest": "বিশ্রাম নিন এবং চাপ কমান।"},
        "Tamil": {"excellent": "சிறந்த உறக்கத் தரம்", "poor": "மோசமான உறக்கத் தரம்",
                  "critical": "கடுமையான உறக்க நிலை", "rest": "நன்றாக உறங்குங்கள் மற்றும் மனஅழுத்தத்தை குறையுங்கள்."},
        "Telugu": {"excellent": "అద్భుతమైన నిద్ర నాణ్యత", "poor": "పేద నిద్ర నాణ్యత",
                   "critical": "తీవ్రమైన నిద్ర స్థితి", "rest": "సరైన విశ్రాంతి తీసుకోండి మరియు ఒత్తిడిని తగ్గించండి."},
        "Kannada": {"excellent": "ಅತ್ಯುತ್ತಮ ನಿದ್ರೆ ಗುಣಮಟ್ಟ", "poor": "ಕೆಟ್ಟ ನಿದ್ರೆ ಗುಣಮಟ್ಟ",
                    "critical": "ಗಂಭೀರ ನಿದ್ರೆ ಸ್ಥಿತಿ", "rest": "ಒಳ್ಳೆಯ ನಿದ್ರೆ ಮಾಡಿ ಮತ್ತು ಒತ್ತಡವನ್ನು ಕಡಿಮೆ ಮಾಡಿ."},
        "Malayalam": {"excellent": "മികച്ച ഉറക്ക ഗുണമേന്മ", "poor": "ദുര്‍ബലമായ ഉറക്ക ഗുണമേന്മ",
                      "critical": "ഗൗരവമുള്ള ഉറക്ക അവസ്ഥ", "rest": "ശ്രദ്ധയോടെ വിശ്രമിക്കുകയും സമ്മർദ്ദം കുറയ്ക്കുകയും ചെയ്യുക."},
        "Marathi": {"excellent": "उत्कृष्ट झोप गुणवत्ता", "poor": "कमी झोप गुणवत्ता",
                    "critical": "गंभीर झोप स्थिती", "rest": "चांगली झोप घ्या आणि तणाव कमी करा."},
        "Punjabi": {"excellent": "ਉਤਕ੍ਰਿਸ਼ਟ ਨੀਂਦ ਗੁਣਵੱਤਾ", "poor": "ਖਰਾਬ ਨੀਂਦ ਗੁਣਵੱਤਾ",
                    "critical": "ਗੰਭੀਰ ਨੀਂਦ ਦੀ ਸਥਿਤੀ", "rest": "ਚੰਗੀ ਨੀਂਦ ਕਰੋ ਅਤੇ ਤਣਾਅ ਘਟਾਓ।"},
        "Gujarati": {"excellent": "ઉત્કૃષ્ટ નિંદ્રાની ગુણવત્તા", "poor": "નબળી નિંદ્રાની ગુણવત્તા",
                     "critical": "ગંભીર નિંદ્રા સ્થિતિ", "rest": "સારી ઊંઘ લો અને તણાવ ઘટાડો."},
        "Odia": {"excellent": "ଉତ୍କୃଷ୍ଟ ନିଦ୍ରା ଗୁଣତା", "poor": "ଦୁର୍ବଳ ନିଦ୍ରା ଗୁଣତା",
                 "critical": "ଗୁରୁତର ନିଦ୍ରା ଅବସ୍ଥା", "rest": "ଭଲ ସୁଇପାରିବେ ଏବଂ ଚିନ୍ତା କମାନ୍ତୁ।"},
        "Assamese": {"excellent": "উত্তম ঘুমৰ গুণমান", "poor": "দুৰ্বল ঘুমৰ গুণমান",
                     "critical": "গভীৰ ঘুমৰ অৱস্থা", "rest": "ভালকৈ বিশ্রাম লওক আৰু চাপ কমাওক।"},
        "Nepali": {"excellent": "उत्कृष्ट निद्रा गुणस्तर", "poor": "कमजोर निद्रा गुणस्तर",
                   "critical": "गंभीर निद्रा स्थिति", "rest": "राम्ररी निद्रा लिनुहोस् र तनाव घटाउनुहोस्।"},
        "Urdu": {"excellent": "بہترین نیند کا معیار", "poor": "ناقص نیند کا معیار",
                 "critical": "سنگین نیند کی حالت", "rest": "اچھی نیند لیں اور دباؤ کم کریں۔"},
        "Sindhi": {"excellent": "بهترين ننڊ جو معيار", "poor": "ڪمزور ننڊ جو معيار",
                   "critical": "سنگين ننڊ حالت", "rest": "سٺي ننڊ وٺو ۽ دٻاءُ گهٽايو."},
        "Dogri": {"excellent": "उत्तम नींद गुणवत्ता", "poor": "कमज़ोर नींद गुणवत्ता",
                  "critical": "गंभीर नींद स्थिति", "rest": "अच्छी नींद लें और तनाव कम करें।"},
        "Maithili": {"excellent": "उत्तम नींद गुणवत्ता", "poor": "कमज़ोर नींद गुणवत्ता",
                     "critical": "गंभीर नींद स्थिति", "rest": "अच्छे से आराम करें और तनाव घटाएं।"},
        "Konkani": {"excellent": "उत्कृष्ट झोप गुणवत्ता", "poor": "खराब झोप गुणवत्ता",
                    "critical": "गंभीर झोप स्थिती", "rest": "योग्य विश्रांती घ्या आणि ताण कमी करा."},
        "Manipuri": {"excellent": "ꯑꯦꯠꯇꯥꯝꯁꯤꯡ ꯅꯤꯗꯔꯥ ꯄꯥꯡꯅꯥꯡ", "poor": "ꯋꯤꯡꯈꯥꯡ ꯅꯤꯗꯔꯥ ꯄꯥꯡꯅꯥꯡ",
                    "critical": "ꯒꯔꯤꯕ ꯅꯤꯗꯔꯥ ꯄꯥꯡꯅꯥꯡ", "rest": "ꯆꯥꯎꯕ ꯅꯤꯗꯔꯥ ꯄꯥꯡꯅꯥꯡ ꯃꯇꯝꯗꯤ ꯃꯇꯩ ꯍꯣꯡꯅꯥ ꯈꯪꯗꯤ."},
        "Sanskrit": {"excellent": "उत्तमा निद्रा", "poor": "नीचा निद्रा",
                     "critical": "गंभीर निद्रा", "rest": "विश्रामं कुरुत व तनावं न्यूनं कुरुत।"},
        "Bodo": {"excellent": "उत्तम निंदर गुनमान", "poor": "कमजोर निंदर गुनमान",
                 "critical": "गंभीर निंदर स्थिति", "rest": "भाल सुइ आ तनाव गोर हो।"},
        "Bhojpuri": {"excellent": "बढ़िया नींद गुणवत्ता", "poor": "खराब नींद गुणवत्ता",
                     "critical": "गंभीर नींद स्थिति", "rest": "अच्छे से आराम करीं, तनाव आ मोबाइल टाइम घटाईं।"}
    }

    # fallback
    lang_dict = translations.get(language, translations["English"])

    # Choose message based on score (30-level logic)
    # messages append localized 'rest' advice from lang_dict
    rest_text = lang_dict.get('rest', translations['English']['rest'])

    # Note: keep conditions exactly as designed for interpretability
    if score >= 99:
        msg = f"💎 {lang_dict['excellent']} ({score}) — Elite sleeper, perfect balance! {rest_text}"
    elif 95 <= score < 99:
        msg = f"🌕 {lang_dict['excellent']} ({score}) — Perfect lifestyle balance. {rest_text}"
    elif 92 <= score < 95:
        msg = f"✨ {lang_dict['excellent']} ({score}) — Near perfect, slight improvement possible. {rest_text}"
    elif 88 <= score < 92:
        msg = f"🌙 {lang_dict['excellent']} ({score}) — Calm routine and minimal stress. {rest_text}"
    elif 85 <= score < 88:
        msg = f"💚 {lang_dict['excellent']} ({score}) — Stable routine with minor fatigue. {rest_text}"
    elif 80 <= score < 85:
        msg = f"🙂 {lang_dict['excellent']} ({score}) — Healthy sleep pattern, keep it up. {rest_text}"
    elif 75 <= score < 80:
        msg = f"😴 {lang_dict['excellent']} ({score}) — Mild irregularity in sleep hours. {rest_text}"
    elif 70 <= score < 75:
        msg = f"🟩 {lang_dict['poor']} ({score}) — Moderate lifestyle balance. {rest_text}"
    elif 66 <= score < 70:
        msg = f"🟢 {lang_dict['poor']} ({score}) — Manageable sleep, occasional stress. {rest_text}"
    elif 63 <= score < 66:
        msg = f"😐 {lang_dict['poor']} ({score}) — Average pattern, needs improvement. {rest_text}"
    elif 60 <= score < 63:
        msg = f"🟡 {lang_dict['poor']} ({score}) — Irregular sleep time, fatigue visible. {rest_text}"
    elif 56 <= score < 60:
        msg = f"🟠 {lang_dict['poor']} ({score}) — Body fatigue and rest gap detected. {rest_text}"
    elif 52 <= score < 56:
        msg = f"⚠️ {lang_dict['poor']} ({score}) — Minor imbalance due to stress. {rest_text}"
    elif 48 <= score < 52:
        msg = f"🔴 {lang_dict['poor']} ({score}) — Regular disturbance noticed. {rest_text}"
    elif 45 <= score < 48:
        msg = f"🚧 {lang_dict['poor']} ({score}) — Inconsistent sleep or overwork. {rest_text}"
    elif 42 <= score < 45:
        msg = f"☁️ {lang_dict['poor']} ({score}) — Urban fatigue and noise impact. {rest_text}"
    elif 38 <= score < 42:
        msg = f"💤 {lang_dict['poor']} ({score}) — Restless pattern, low deep sleep. {rest_text}"
    elif 35 <= score < 38:
        msg = f"⚙️ {lang_dict['poor']} ({score}) — Late-night workload affecting rest. {rest_text}"
    elif 30 <= score < 35:
        msg = f"🔥 {lang_dict['poor']} ({score}) — Stress dominated routine. {rest_text}"
    elif 28 <= score < 30:
        msg = f"🧠 {lang_dict['poor']} ({score}) — Overthinking before bed. {rest_text}"
    elif 25 <= score < 28:
        msg = f"📱 {lang_dict['poor']} ({score}) — Excessive screen use before sleep. {rest_text}"
    elif 22 <= score < 25:
        msg = f"💼 {lang_dict['poor']} ({score}) — Overwork causing low rest. {rest_text}"
    elif 18 <= score < 22:
        msg = f"🥱 {lang_dict['poor']} ({score}) — Fatigue and incomplete rest. {rest_text}"
    elif 15 <= score < 18:
        msg = f"🚨 {lang_dict['critical']} ({score}) — Exhaustion and anxiety possible. {rest_text}"
    elif 12 <= score < 15:
        msg = f"💀 {lang_dict['critical']} ({score}) — Severe sleep deprivation stage. {rest_text}"
    elif 8 <= score < 12:
        msg = f"🩸 {lang_dict['critical']} ({score}) — Signs of chronic insomnia. {rest_text}"
    elif 5 <= score < 8:
        msg = f"💤 {lang_dict['critical']} ({score}) — Almost no restful sleep. {rest_text}"
    elif 2 <= score < 5:
        msg = f"🕯️ {lang_dict['critical']} ({score}) — Body energy collapse detected. {rest_text}"
    elif 0 <= score < 2:  # include 0, since predict_sleep_quality clamps scores to 0.0
        msg = f"⚰️ {lang_dict['critical']} ({score}) — Extreme lack of sleep. {rest_text}"
    else:
        msg = f"❌ Invalid Sleep Score ({pred_value}) — Please check input."

    if verbose:
        print("→ Final message:", msg)

    return msg


✅ Model path resolved to: c:\Users\ASUS\Desktop\(SUPERVISED-1)ML_\project\Sleep Sense Predictor\models\ridge_model.pkl
✅ Model loaded successfully!


## ***Inference: Now my Streamlit app can import this helper file directly for live predictions.***
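The path handling in the module assumes that `models/` sits exactly one level above the directory the code runs from, which holds for `app/` and for a notebook folder next to `models/`, but not for every working directory. One possible alternative, sketched here under that caveat, is to walk upward through parent folders until the file is found. `find_model_file` is a hypothetical helper, not something the module defines.

```python
from pathlib import Path
from typing import Optional


def find_model_file(start: Path, relative: str = "models/ridge_model.pkl") -> Optional[Path]:
    """Walk upward from `start` and return the first existing `relative` path, else None."""
    start = start.resolve()
    for base in [start, *start.parents]:
        candidate = base / relative
        if candidate.is_file():
            return candidate
    return None


found = find_model_file(Path.cwd())
print(found if found else "model file not found")
```

Searching upward is more forgiving than a fixed `parent` hop, at the cost of possibly picking up a same-named file from an unrelated parent project.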

## ***Summary***

- The Ridge model loads and returns a prediction for the sample input (15.67).  
- I created helper functions to handle input and display results clearly.  
- The same functions are saved in `predict_helper.py` for the Streamlit app.  
- Everything is ready to move into **`streamlit_app.py`** to build the interactive user interface.
