
# FitPulse Health Anomaly Detection from Fitness Devices  
## Milestone 2: Feature Extraction and Modeling

### Objective
The goal of this milestone is to extract meaningful features from preprocessed fitness data,
model temporal trends, and identify behavioral patterns using clustering techniques.
This milestone forms the foundation for anomaly detection in later stages.


In [None]:

!pip install prophet tsfresh


In [None]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from scipy.stats import skew, kurtosis
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, DBSCAN
from sklearn.decomposition import PCA

from prophet import Prophet



## Load Dataset
Upload heartrate.csv, steps.csv, and sleep.csv to Colab before running this section.
Only heartrate.csv is read below; it must contain a `date` column and a `value` column.


In [None]:

hr = pd.read_csv("heartrate.csv")
hr['date'] = pd.to_datetime(hr['date'])
hr = hr.sort_values('date')
hr.head()



## Feature Extraction
Summary statistics (mean, standard deviation, minimum, maximum, skewness, and kurtosis) are computed from the heart rate time series.


In [None]:

def extract_features(series):
    return {
        "mean": series.mean(),
        "std": series.std(),
        "min": series.min(),
        "max": series.max(),
        "skewness": skew(series),
        "kurtosis": kurtosis(series)
    }

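# Build one feature row per calendar day so the clustering and PCA cells below
# get several samples. Assumes multiple readings per day; adjust the grouping
# key for sparser data. Days with too few or constant readings are dropped.
daily = hr.groupby(hr["date"].dt.date)["value"]
rows = {day: extract_features(g) for day, g in daily if len(g) > 3}
feature_df = pd.DataFrame.from_dict(rows, orient="index").dropna()
feature_df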
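
Two behaviors of the SciPy functions used above are easy to miss: `kurtosis` returns excess (Fisher) kurtosis by default, so normally distributed data scores near 0 rather than 3, and both `skew` and `kurtosis` return NaN when the input contains NaN unless `nan_policy="omit"` is passed. The sketch below illustrates both on synthetic values; the sample data is made up and is not FitPulse data.

```python
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(0)
# Synthetic stand-in for heart rate readings (illustrative values only).
sample = rng.normal(loc=70, scale=8, size=5000)

# Default is Fisher's definition: normal data gives a value near 0.
excess_kurt = float(kurtosis(sample))
# Pearson's definition is the same quantity plus 3.
pearson_kurt = float(kurtosis(sample, fisher=False))

# A single missing reading makes the default result NaN...
with_gap = np.append(sample, np.nan)
skew_propagate = float(skew(with_gap))
# ...unless missing values are explicitly omitted.
skew_omit = float(skew(with_gap, nan_policy="omit"))

print(excess_kurt, pearson_kurt, skew_propagate, skew_omit)
```

If the heart rate column can contain gaps, dropping them before calling `extract_features` (or passing `nan_policy="omit"`) keeps the feature table free of NaN values.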



## Trend Modeling using Prophet
Prophet fits an additive model with trend and seasonality components to the heart rate series and forecasts 30 days beyond the last observation.


In [None]:

prophet_df = hr.rename(columns={"date": "ds", "value": "y"})

model = Prophet()
model.fit(prophet_df)

future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)

model.plot(forecast)
plt.show()



## Residual Analysis
The residual is the observed value minus Prophet's prediction (`yhat`). Large residuals mark readings that deviate from the expected pattern.


In [None]:

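# Match predictions to observations by timestamp; positional slicing can misalign
# rows because hr was sorted without resetting its index.
# "trend" holds Prophet's full in-sample prediction (yhat), not only its trend component.
prophet_df["trend"] = prophet_df["ds"].map(forecast.set_index("ds")["yhat"])
prophet_df["residual"] = prophet_df["y"] - prophet_df["trend"]
prophet_df.head()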
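
One common way to turn residuals into candidate anomalies is to flag points whose residual is unusually large on a robust z-score scale. The sketch below is an assumption about how a later milestone might use the residuals, not part of this notebook's pipeline; the synthetic data, the `flag_outliers` helper, and the threshold of 3.5 are all hypothetical.

```python
import numpy as np
import pandas as pd

def flag_outliers(residual: pd.Series, threshold: float = 3.5) -> pd.Series:
    """Flag residuals whose robust z-score exceeds the threshold.

    Uses the median and the median absolute deviation (MAD), which are less
    affected by the outliers themselves than the mean and standard deviation.
    """
    median = residual.median()
    mad = (residual - median).abs().median()
    if mad == 0:
        # No spread at all: nothing can be called an outlier on this scale.
        return pd.Series(False, index=residual.index)
    # 1.4826 rescales the MAD to match the standard deviation for normal data.
    robust_z = (residual - median) / (1.4826 * mad)
    return robust_z.abs() > threshold

# Synthetic residuals with two injected spikes (illustrative only).
rng = np.random.default_rng(1)
residual = pd.Series(rng.normal(0, 2, size=200))
residual.iloc[[50, 120]] = [25.0, -30.0]

flags = flag_outliers(residual)
print(residual[flags])
```

Applied to this notebook, the same helper could be called on `prophet_df["residual"]`; the threshold would need tuning against real data.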



## Clustering Behavioral Patterns
KMeans clustering is applied to identify behavior groups.


In [None]:

scaler = StandardScaler()
X_scaled = scaler.fit_transform(feature_df)

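# KMeans needs at least n_clusters rows in feature_df. n_init is set explicitly
# because its default differs between scikit-learn versions.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)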
labels = kmeans.fit_predict(X_scaled)
labels
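
`n_clusters=2` is fixed above. One way to sanity-check that choice is to compare silhouette scores across several values of k. The sketch below does this on a synthetic feature matrix (three well-separated blobs from `make_blobs`, with hypothetical centers), so its result says nothing about the FitPulse data.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a per-window feature table (illustrative only).
centers = [[0, 0, 0], [8, 8, 0], [0, 8, 8]]
X, _ = make_blobs(n_samples=150, centers=centers, cluster_std=1.0, random_state=42)
X_scaled = StandardScaler().fit_transform(X)

scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X_scaled)
    # Silhouette ranges from -1 to 1; higher means tighter, better-separated clusters.
    scores[k] = silhouette_score(X_scaled, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The silhouette score is only defined when k is at least 2 and smaller than the number of samples, so the range of k has to be capped for small feature tables.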



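## Dimensionality Reduction using PCA
PCA projects the scaled features onto two principal components so the cluster assignments can be plotted in 2D.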


In [None]:

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

plt.scatter(X_pca[:, 0], X_pca[:, 1], c=labels)
plt.xlabel("PC1")
plt.ylabel("PC2")
plt.title("Behavioral Clustering Visualization")
plt.show()
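
A 2D scatter is only a faithful picture if the first two components carry most of the variance. A minimal sketch of how to check this with `explained_variance_ratio_` follows; the feature matrix is synthetic (six columns generated from two latent factors), not the notebook's features.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic features: six columns driven by two latent factors plus small noise.
latent = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.05 * rng.normal(size=(100, 6))

X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2).fit(X_scaled)

# Fraction of total variance captured by each retained component.
ratios = pca.explained_variance_ratio_
retained = ratios.sum()
print(ratios, retained)
```

When the retained fraction is low, clusters that overlap in the plot may still be well separated in the full feature space.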



## Key Observations
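- Summary statistics (mean, spread, extremes, skewness, kurtosis) give a compact description of the heart rate series
- Prophet models trend and seasonality, and the residuals from its fit highlight deviations from the expected pattern
- KMeans groups the feature vectors into two clusters, a starting point for separating typical from atypical patterns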

## Conclusion
This milestone successfully demonstrates feature extraction, trend modeling,
and clustering for fitness time-series data.
