# 05: Rush Index + user personalization

This notebook builds the **service intelligence** on top of the trained model (notebook 04).

## What we do
1. Load the trained model + feature dataset
2. Compute the **rush probability** `p(rush)` for each window
3. Define the **Rush Index** (share of time spent rushing)
4. Add a **personalized threshold** (user calibration)
5. Generate **adapted feedback**
6. Prepare output suitable for a UI / Streamlit demo

➡️ This step covers:
- *Adaptation mechanisms*
- *User measurement*
- *User-adapted communications*


## 0) Settings
Set `DATA_DIR` and `TAG` to the same values as in notebook 04.


In [1]:

from pathlib import Path
import numpy as np
import pandas as pd
import joblib

DATA_DIR = Path(r"/Users/pikakriznar/Documents/1_letnik_MAG/UPK/Projekti/Razpoznava_hitenja_projekt/data/wisdm+smartphone+and+smartwatch+activity+and+biometrics+dataset/wisdm-dataset")
TAG = "5s_50pct_purity80"

FEATURES_PATH = DATA_DIR / "prepared" / f"features_{TAG}.parquet"
assert FEATURES_PATH.exists(), FEATURES_PATH
MODEL_PATH = DATA_DIR / "models" / f"logisticregression_{TAG}.joblib"
assert MODEL_PATH.exists(), MODEL_PATH

print("Features:", FEATURES_PATH)
print("Model:", MODEL_PATH)


Features: /Users/pikakriznar/Documents/1_letnik_MAG/UPK/Projekti/Razpoznava_hitenja_projekt/data/wisdm+smartphone+and+smartwatch+activity+and+biometrics+dataset/wisdm-dataset/prepared/features_5s_50pct_purity80.parquet
Model: /Users/pikakriznar/Documents/1_letnik_MAG/UPK/Projekti/Razpoznava_hitenja_projekt/data/wisdm+smartphone+and+smartwatch+activity+and+biometrics+dataset/wisdm-dataset/models/logisticregression_5s_50pct_purity80.joblib


## 1) Load the data and the model


In [2]:

df = pd.read_parquet(FEATURES_PATH)
model = joblib.load(MODEL_PATH)

df.head()


,x_mean,x_std,x_min,x_max,y_mean,y_std,y_min,y_max,z_mean,z_std,...,mag_mean,mag_std,mag_min,mag_max,fft_energy_0p5_4Hz,fft_peak_freq,label,subject_id,start_ts,end_ts
0,-1.707408,3.754923,-15.795654,6.010147,8.229062,7.848689,-6.569107,19.613052,-0.78184,3.395726,...,10.86549,6.357471,0.977497,24.557747,165638.568201,2.6,1,1600,251987600000000.0,251992600000000.0
1,-1.651616,3.388637,-15.795654,5.757797,7.963412,7.724643,-5.96492,19.480194,-1.096644,3.409408,...,10.583551,6.173481,1.076254,24.557747,151214.180529,2.6,1,1600,251990100000000.0,251995100000000.0
2,-1.751988,3.813674,-18.615417,5.757797,8.187352,7.860366,-6.113724,19.552063,-1.152061,3.43003,...,10.916729,6.35209,1.37642,29.975824,157862.622344,2.6,1,1600,251992700000000.0,251997600000000.0
3,-1.633039,3.76753,-18.615417,4.595642,8.463151,7.699601,-6.113724,19.552063,-0.927556,3.625126,...,11.069862,6.263381,1.37642,29.975824,152685.547626,2.6,1,1600,251995200000000.0,252000200000000.0
4,-1.141853,3.581151,-16.669891,8.13707,8.295156,7.697258,-7.046631,19.512024,-0.976565,3.780738,...,10.909266,6.198503,1.080016,27.168623,147820.085065,2.6,1,1600,251997700000000.0,252002700000000.0


## 2) Computing the rush probability p(rush)
We use `predict_proba` of the trained model and take the column for the positive class (`label` = 1).


In [3]:

FEATURE_COLS = [c for c in df.columns if c not in ["label", "subject_id", "start_ts", "end_ts"]]

X = df[FEATURE_COLS].to_numpy()

df["p_rush"] = model.predict_proba(X)[:, 1]
df[["p_rush", "label"]].head()


,p_rush,label
0,0.99902,1
1,0.998507,1
2,0.99886,1
3,0.998079,1
4,0.998642,1
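
The `[:, 1]` index above relies on the second column of `predict_proba` being the rush class. A minimal sketch of a more explicit lookup, assuming the loaded model follows the scikit-learn convention that the columns are ordered like `model.classes_`. The names `positive_class_proba` and `TinyModel` are hypothetical and used only for illustration:

```python
import numpy as np

def positive_class_proba(model, X, positive_label=1):
    """Return p(positive_label) by locating its column via model.classes_."""
    classes = list(model.classes_)
    col = classes.index(positive_label)  # raises ValueError if the label is absent
    return model.predict_proba(X)[:, col]

class TinyModel:
    """Stand-in for a fitted classifier (illustration only, not the real model)."""
    classes_ = np.array([0, 1])

    def predict_proba(self, X):
        # Logistic function of the first feature, returned as [p(0), p(1)].
        p1 = 1.0 / (1.0 + np.exp(-np.asarray(X, dtype=float)[:, 0]))
        return np.column_stack([1.0 - p1, p1])

X_demo = np.array([[-2.0], [0.0], [2.0]])
p_demo = positive_class_proba(TinyModel(), X_demo)
print(p_demo.round(3))
```

With the real model, the same helper could be called as `positive_class_proba(model, X)`, provided the model exposes `classes_`.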


## 3) Rush Index definition
Rush Index = percentage of windows in which the user is classified as rushing (`p_rush` at or above the threshold).

Default:
- global threshold = 0.5


In [4]:

GLOBAL_THRESHOLD = 0.5

df["rush_global"] = (df["p_rush"] >= GLOBAL_THRESHOLD).astype(int)

def rush_index(series):
    return 100.0 * series.mean()

# example for a single user
example_subject = df["subject_id"].iloc[0]
ri_example = rush_index(df[df["subject_id"] == example_subject]["rush_global"])

ri_example


50.35971223021583
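
As a worked example of the definition, here is a minimal sketch on made-up probabilities (the values in `p_demo` are invented for illustration, not taken from the dataset). With a threshold of 0.5, three of the five windows are flagged, so the index is 60%:

```python
import numpy as np

p_demo = np.array([0.10, 0.45, 0.50, 0.80, 0.95])  # invented p(rush) values
threshold = 0.5

flags = (p_demo >= threshold).astype(int)   # 1 = window counted as rushing
rush_index_demo = 100.0 * flags.mean()      # percentage of flagged windows

print(flags, rush_index_demo)
```

Note that a window with `p_rush` exactly equal to the threshold counts as rushing, because the comparison is `>=`.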

## 4) Personalized threshold (user calibration)
In an initial phase we observe the user and adjust the threshold so that
their *normal walking* is not constantly flagged as rushing.

We use the distribution of `p_rush` during the calibration period.


In [5]:

def personalized_threshold(p_rush_values, method="quantile", q=0.9):
    if method == "quantile":
        return float(np.quantile(p_rush_values, q))
    elif method == "mean_std":
        return float(p_rush_values.mean() + 0.5 * p_rush_values.std())
    else:
        raise ValueError("Unknown method")

# simulated calibration for one user (here computed over all of that user's windows)
user_df = df[df["subject_id"] == example_subject]
user_threshold = personalized_threshold(user_df["p_rush"], q=0.9)

user_threshold


0.9993722089673296
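
The cell above computes the threshold from all of the user's windows. A minimal sketch of a separate calibration period, assuming windows can be ordered by `start_ts` and that the first `n_calib` windows represent normal activity. The names `calibrate_then_flag` and `n_calib`, as well as the synthetic data, are hypothetical and only illustrate the idea:

```python
import numpy as np
import pandas as pd

def calibrate_then_flag(user_df, n_calib=50, q=0.9):
    """Fit the threshold on the first n_calib windows, flag only the later ones."""
    ordered = user_df.sort_values("start_ts")
    calib, rest = ordered.iloc[:n_calib], ordered.iloc[n_calib:]
    thr = float(np.quantile(calib["p_rush"], q))
    flags = (rest["p_rush"] >= thr).astype(int)
    return thr, flags

# Synthetic user: 50 calm windows followed by 50 windows with higher p(rush).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "start_ts": np.arange(100),
    "p_rush": np.concatenate([rng.uniform(0.0, 0.4, 50), rng.uniform(0.6, 1.0, 50)]),
})

thr, flags = calibrate_then_flag(demo)
print(f"threshold={thr:.3f}, rush index after calibration={100.0 * flags.mean():.1f}%")
```

Because the threshold is fitted on the calibration windows only, the share of later windows that exceed it is no longer tied to `1 - q`.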

## 5) Rush Index with personalization


In [6]:

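# Each threshold is the 0.9 quantile of that user's own p_rush over all their windows,
# so roughly 10% of each user's windows are flagged by construction.
df["rush_personal"] = 0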

for sid, g in df.groupby("subject_id"):
    thr = personalized_threshold(g["p_rush"], q=0.9)
    df.loc[g.index, "rush_personal"] = (g["p_rush"] >= thr).astype(int)

ri_global = rush_index(df["rush_global"])
ri_personal = rush_index(df["rush_personal"])

ri_global, ri_personal


(49.80595084087969, 10.090556274256144)

## 6) Adapted feedback (rule-based)
Simple rules that can be used in the application.


In [7]:

def generate_feedback(rush_index_value):
    if rush_index_value > 40:
        return "You rush often. A short break or a calmer pace might help."
    elif rush_index_value > 20:
        return "You rush occasionally. Try to spread your obligations more evenly."
    else:
        return "Your pace is mostly calm. Great!"

generate_feedback(ri_personal)


'Your pace is mostly calm. Great!'
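
The cut-offs 40 and 20 are hard-coded in the function. A minimal sketch of a table-driven variant that keeps them in one place, which may be convenient if the rules are tuned later. `FEEDBACK_RULES` and `feedback_from_rules` are hypothetical names, and the messages are shortened placeholders:

```python
# (lower bound, message) pairs, checked from the highest bound down.
FEEDBACK_RULES = [
    (40.0, "often rushing"),
    (20.0, "occasionally rushing"),
    (float("-inf"), "mostly calm"),
]

def feedback_from_rules(rush_index_value, rules=FEEDBACK_RULES):
    """Return the message of the first rule whose bound is strictly exceeded."""
    for lower_bound, message in rules:
        if rush_index_value > lower_bound:
            return message
    # Reached only when no comparison succeeds, e.g. for NaN input.
    raise ValueError("rush_index_value must be a number")

for value in (10.09, 20.0, 35.0, 49.8):
    print(value, "->", feedback_from_rules(value))
```

The strict `>` comparison mirrors the original rules, so a value of exactly 20 still falls into the calmest category.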

## 7) Preparing the output for UI / Streamlit
We aggregate the data to the 'user day' level. Here each `subject_id` is treated as one such unit, so the summary has one row per user.


In [8]:

summary = (
    df.groupby("subject_id")
      .agg(
          rush_index_global=("rush_global", rush_index),
          rush_index_personal=("rush_personal", rush_index),
          mean_p_rush=("p_rush", "mean"),
          n_windows=("p_rush", "size")
      )
)

summary.head()


subject_id,rush_index_global,rush_index_personal,mean_p_rush,n_windows
1600,50.359712,10.071942,0.50226,139
1601,50.0,10.11236,0.499654,178
1602,49.640288,10.071942,0.497446,139
1603,49.438202,10.11236,0.504126,178
1604,49.640288,10.071942,0.488079,139
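
The summary above has one row per `subject_id`. If recordings span several days, the same named aggregation can be grouped by user and day. A minimal sketch, assuming a `day` column exists (it is not part of the feature dataset shown here, and the toy values are invented):

```python
import pandas as pd

def rush_index(series):
    return 100.0 * series.mean()

toy = pd.DataFrame({
    "subject_id": [1, 1, 1, 1, 2, 2],
    "day": ["d1", "d1", "d2", "d2", "d1", "d1"],   # hypothetical day label
    "rush_personal": [1, 0, 0, 0, 1, 1],
    "p_rush": [0.9, 0.2, 0.1, 0.3, 0.8, 0.7],
})

daily = (
    toy.groupby(["subject_id", "day"])
       .agg(
           rush_index_personal=("rush_personal", rush_index),
           mean_p_rush=("p_rush", "mean"),
           n_windows=("p_rush", "size"),
       )
       .reset_index()   # flat columns are easier to pass to a UI table
)
print(daily)
```

How a `day` value would be derived from `start_ts` depends on the timestamp units, which are not specified here.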


## 8) What you have after this notebook
- a continuous **Rush Score** (`p_rush`)
- a binary status (global vs. personalized)
- a Rush Index at the user level
- a basic personalization + feedback mechanism

➡️ Next step: **Streamlit demo / app simulation**.
