# FEATURE ENGINEERING

## A FEW CONCEPTS

### What do timestamps and inertial signals mean?


Inertial signals are measurements from sensors such as accelerometers and gyroscopes that capture motion, specifically linear acceleration and angular velocity. In the UCI HAR dataset, these are body acceleration, gyroscope angular velocity, and total acceleration, each along the X, Y, and Z axes. That gives a total of 9 signals.

Timestamps refer to the sequence of measurements taken over time. The recordings are divided into windows of 2.56 seconds, sampled at 50 Hz, so each window contains 2.56 × 50 = 128 time steps. For each sample you therefore have 128 values for each of the 9 signals, forming a matrix of shape (128, 9) per sample.
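
The window length follows directly from the sampling rate. As a quick check (a minimal sketch; the constant names are illustrative and not part of the dataset):

```python
# Window arithmetic for the UCI HAR inertial signals.
SAMPLING_RATE_HZ = 50   # samples per second
WINDOW_SECONDS = 2.56   # duration of one window
N_SIGNALS = 9           # 3 signal groups x 3 axes

# round() guards against floating-point error in 2.56 * 50
steps_per_window = round(WINDOW_SECONDS * SAMPLING_RATE_HZ)
sample_shape = (steps_per_window, N_SIGNALS)

print(steps_per_window)  # 128
print(sample_shape)      # (128, 9)
```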


### Why this shape?


The shape `(10299, 128, 9)` is **correct** for the UCI HAR Dataset **Inertial Signals** once the train and test windows are stacked together. Here is what each dimension means:

* `10299`: Total number of samples (train + test combined: 7352 train + 2947 test)
* `128`: Each sample is a time window of 128 readings (time steps)
* `9`: There are **9 inertial signal features**:
  * Body Acc: `body_acc_x/y/z`
  * Body Gyro: `body_gyro_x/y/z`
  * Total Acc: `total_acc_x/y/z`
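
The order of the third axis depends on how the nine files are stacked. The sketch below uses small random arrays in place of the real files to show that sorting the file names alphabetically yields body acceleration, then gyroscope, then total acceleration. The file names follow the dataset's `<signal>_<axis>_train.txt` pattern; the data and the reduced sample count are made up for illustration.

```python
import numpy as np

# Build the nine signal names in a deliberately scrambled group order.
signal_names = [f"{group}_{axis}"
                for group in ("total_acc", "body_gyro", "body_acc")
                for axis in "xyz"]
filenames = sorted(f"{name}_train.txt" for name in signal_names)

rng = np.random.default_rng(0)
n_samples, n_steps = 5, 128  # 5 stand-in windows instead of 7352
# One (samples, time) array per file, in sorted file order.
per_signal = [rng.normal(size=(n_samples, n_steps)) for _ in filenames]

# Stacking on a new last axis gives (samples, time, signals).
stacked = np.stack(per_signal, axis=-1)

print(stacked.shape)  # (5, 128, 9)
print([f.replace("_train.txt", "") for f in filenames])
```

Channel `k` of `stacked` is the `k`-th file in sorted order, so channel 0 is `body_acc_x` and channel 8 is `total_acc_z`.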

### 📊 Why `train_data` has 561 features

The 561 features come from **hand-crafted feature extraction** on the raw signal data.

⚙️ Step-by-step:

1. **Raw data shape** (after stacking `Inertial Signals`):

   * Shape = `(7352, 128, 9)` for training
   * Raw signal from accelerometer + gyroscope in time windows.

2. **Feature extraction process** (already done in the dataset's `X_train.txt` file):

   * From each 128-sample × 9-signal window → extract statistical features:

     * e.g., mean, std, energy, entropy, correlation, FFT coefficients, etc.
     * This is done per signal.
   * In total, **561 features per sample** are engineered this way.

3. **Final train dataset used in classification:**

   * Shape = `(7352, 561)`
   * Each row is a sample (1 time window)
   * Each column is a feature

📁 Files involved:

* `Inertial Signals/` → Raw 128×9 time series
* `X_train.txt` → Preprocessed data with 561 features
* `y_train.txt` → Corresponding activity labels

If you are working from the raw Inertial Signals, you will have to **re-implement feature extraction** to get the 561 features, or use `X_train.txt` directly.



# Concatenating the signal files to create raw data

#### Data form?


⚠️ Only consider feature selection if:
- You're doing benchmarking with traditional ML models.
- You want to reduce model complexity for ultra-low power devices (like wearable deployment).
- You're doing multisensor fusion and need to discard irrelevant channels.

✅ Option 1: Use the provided X_train.txt and X_test.txt
- Ready to use, already feature-extracted.
- Shape: (7352, 561) for train.
- Good if you want to compare with traditional ML or get quick results with dense/deep networks.

✅ Option 2: Use raw Inertial Signals/ and build custom data
- You get the full time series signals: body_acc_x_train.txt, body_gyro_y_test.txt, etc.
- Shape: (7352, 128) per signal
- Combine the 9 signals to form shape: (7352, 128, 9)
- Ideal for deep learning: LSTM, CNN-LSTM, GRU, Transformers.

🔥 Recommended for deep learning:
- 💡 Use raw signals from Inertial Signals/ to build a 3D array of shape (samples, timesteps=128, features=9)

#### Loading the files

In [21]:
import pandas as pd
import numpy as np
import os

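def load_signals(signal_dir):
    # Sorted file order fixes the channel order:
    # body_acc_x/y/z, body_gyro_x/y/z, total_acc_x/y/z
    filenames = sorted(os.listdir(signal_dir))
    # sep=r"\s+" replaces the deprecated delim_whitespace=True
    signal_data = [pd.read_csv(os.path.join(signal_dir, f), sep=r"\s+", header=None)
                   for f in filenames]
    return np.stack(signal_data, axis=-1)  # shape: (samples, time, features)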

x_train_raw = load_signals("/Users/priyam/paper_recreation/UCI_HAR_FEATURE_ANALYSIS_MODELLING/UCI HAR Dataset/train/Inertial Signals 1")

x_test_raw = load_signals("/Users/priyam/paper_recreation/UCI_HAR_FEATURE_ANALYSIS_MODELLING/UCI HAR Dataset/test/Inertial Signals")
# Combine train and test data
x_raw = np.concatenate((x_train_raw, x_test_raw), axis=0)
y_train = pd.read_csv("/Users/priyam/paper_recreation/UCI_HAR_FEATURE_ANALYSIS_MODELLING/UCI HAR Dataset/train/y_train.CSV", header=None).values.flatten()
y_test = pd.read_csv("/Users/priyam/paper_recreation/UCI_HAR_FEATURE_ANALYSIS_MODELLING/UCI HAR Dataset/test/y_test.CSV", header=None).values.flatten()
# Combine train and test labels
y = np.concatenate((y_train, y_test), axis=0)
# convert x_raw to a DataFrame without flattening
#x = pd.DataFrame(x_raw)  # shape: (samples, time * features)



In [22]:
x_raw.shape  # Check the shape of the raw data


(10299, 128, 9)

In [23]:
y.shape 

(10299,)

## CONVERTING INTO TENSOR 

### NORMALIZE 

In [24]:
from sklearn.preprocessing import StandardScaler

def normalize_signals(x_raw):
    num_samples, time_steps, num_features = x_raw.shape
    x_reshaped = x_raw.reshape(-1, num_features)  # (samples * time_steps, features)
    scaler = StandardScaler()
    x_scaled = scaler.fit_transform(x_reshaped).reshape(num_samples, time_steps, num_features)
    return x_scaled

x = normalize_signals(x_raw)  # shape: (samples, 128, 9)
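
`normalize_signals` fits the scaler on every window, including those that later end up in the test split. If that leakage matters for your evaluation, one alternative is to compute the per-channel statistics on the training windows only and reuse them for the test windows. The sketch below does this with plain NumPy on random stand-in data. `fit_channel_stats` and `apply_channel_stats` are hypothetical helper names, not part of the notebook.

```python
import numpy as np

def fit_channel_stats(x_train):
    """Per-channel mean and std computed from training windows only."""
    flat = x_train.reshape(-1, x_train.shape[-1])  # (samples * time, channels)
    mean = flat.mean(axis=0)
    std = flat.std(axis=0)   # population std (ddof=0), as StandardScaler uses
    std[std == 0] = 1.0      # leave constant channels unscaled
    return mean, std

def apply_channel_stats(x, mean, std):
    """Standardize any split with previously fitted statistics."""
    return (x - mean) / std  # broadcasts over (samples, time, channels)

# Random stand-in data with the same layout as the real windows.
rng = np.random.default_rng(42)
x_train_demo = rng.normal(loc=3.0, scale=2.0, size=(80, 128, 9))
x_test_demo = rng.normal(loc=3.0, scale=2.0, size=(20, 128, 9))

mean, std = fit_channel_stats(x_train_demo)
x_train_scaled = apply_channel_stats(x_train_demo, mean, std)
x_test_scaled = apply_channel_stats(x_test_demo, mean, std)
```

With this arrangement the split has to happen before scaling, which reverses the order used in this notebook.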


### ONE HOT ENCODING OF LABELS 

In [27]:
from sklearn.preprocessing import OneHotEncoder

# Convert y from 1-6 to 0-5 (zero-indexed)
y = y - 1

encoder = OneHotEncoder(sparse_output=False)
y_encoded = encoder.fit_transform(y.reshape(-1, 1))  # shape: (samples, 6)


##### Why do we use one-hot encoding?

The `y` labels are one-hot encoded because:

- The labels are categorical (0–5 for the 6 activities).
- Many ML models, neural networks in particular, do not handle categorical labels directly.
- One-hot encoding converts a label such as 2 into [0, 0, 1, 0, 0, 0], so no artificial ordering between activities is implied.
- This matters most for classification with neural networks, where the target vector lines up with the softmax output.
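
As a small illustration of the mapping, the following sketch builds the same encoding with NumPy instead of `OneHotEncoder` and then recovers the integer labels. The three labels are made-up examples.

```python
import numpy as np

labels = np.array([2, 0, 5])         # zero-indexed activity labels
n_classes = 6

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(n_classes)[labels]
print(one_hot[0])                    # [0. 0. 1. 0. 0. 0.]

# argmax inverts the encoding, as done later when evaluating predictions.
recovered = one_hot.argmax(axis=1)
print(recovered)                     # [2 0 5]
```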
        

### TRAIN TEST SPLIT FINAL CONVERSION INTO TENSOR 

In [28]:
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y_encoded, test_size=0.2, random_state=42)


# FEATURE EXTRACTION FROM RAW_FILE 

In [2]:
import numpy as np
import pandas as pd
from scipy.stats import skew, kurtosis, entropy
from scipy.fft import fft

def extract_features_from_window(window):
    features = []
    for i, signal in enumerate(window.T):  # window.T has shape (9, 128); iterate over the 9 signals
        # Time-domain features
        features.append(np.mean(signal))
        features.append(np.std(signal))
        features.append(np.min(signal))
        features.append(np.max(signal))
        features.append(np.median(signal))
        features.append(skew(signal))
        features.append(kurtosis(signal))

        # Frequency-domain features (full two-sided FFT; index 0 is the DC component)
        fft_vals = np.abs(fft(signal))
        fft_norm = fft_vals / (np.sum(fft_vals) + 1e-12)  # avoid division by zero
        features.append(np.sum(fft_vals**2))              # Energy: sum of squared FFT magnitudes
        features.append(entropy(fft_norm + 1e-12))        # Entropy of the normalized magnitudes
        features.append(np.mean(fft_vals))                 # "Mean power": mean of the FFT magnitudes
        features.append(np.argmax(fft_vals))               # Index of the largest FFT magnitude (0 = DC)
    return features

def extract_features_from_all_windows(x_raw):
    feature_names = []
    signals = ['body_acc_x', 'body_acc_y', 'body_acc_z',
               'body_gyro_x', 'body_gyro_y', 'body_gyro_z',
               'total_acc_x', 'total_acc_y', 'total_acc_z']
    stats = ['mean', 'std', 'min', 'max', 'median', 'skew', 'kurtosis',
             'energy', 'entropy', 'mean_power', 'max_freq_idx']
    
    for sig in signals:
        for stat in stats:
            feature_names.append(f"{sig}_{stat}")
    
    all_features = []
    for window in x_raw:
        feats = extract_features_from_window(window)
        all_features.append(feats)
        
    df_features = pd.DataFrame(all_features, columns=feature_names)
    return df_features

# Usage:
# x_features_df = extract_features_from_all_windows(x_raw)
# print(x_features_df.shape)  # (samples, 9*11=99 features)
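
Because `fft` is applied to the raw window, index 0 of the spectrum is the DC component (the sum of the samples), so `np.argmax(fft_vals)` returns 0 whenever the signal mean dominates. A possible variant, shown here as a sketch rather than the notebook's method, removes the mean and uses the one-sided real FFT to report the dominant frequency in Hz. `dominant_frequency_hz` is a hypothetical helper, the 50 Hz rate comes from the dataset description, and the test signal is synthetic.

```python
import numpy as np

def dominant_frequency_hz(signal, sampling_rate_hz=50.0):
    """Frequency (Hz) of the largest non-DC spectral peak in one window."""
    signal = np.asarray(signal, dtype=float)
    centered = signal - signal.mean()           # remove the DC offset
    magnitudes = np.abs(np.fft.rfft(centered))  # one-sided spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sampling_rate_hz)
    return float(freqs[np.argmax(magnitudes)])

# Synthetic window: constant offset of 1.0 plus a small 5 Hz oscillation.
t = np.arange(128) / 50.0
test_signal = 1.0 + 0.1 * np.sin(2 * np.pi * 5.0 * t)

dc_index = int(np.argmax(np.abs(np.fft.fft(test_signal))))  # offset wins: index 0
dom = dominant_frequency_hz(test_signal)                    # close to 5 Hz
```

The result is only accurate to the bin spacing of 50 / 128 ≈ 0.39 Hz, so a 5 Hz oscillation is reported at the nearest bin rather than at exactly 5.0.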


In [3]:
df_features = extract_features_from_all_windows(x_raw)

In [4]:
df_features.head(2)

Unnamed: 0,body_acc_x_mean,body_acc_x_std,body_acc_x_min,body_acc_x_max,body_acc_x_median,body_acc_x_skew,body_acc_x_kurtosis,body_acc_x_energy,body_acc_x_entropy,body_acc_x_mean_power,...,total_acc_z_std,total_acc_z_min,total_acc_z_max,total_acc_z_median,total_acc_z_skew,total_acc_z_kurtosis,total_acc_z_energy,total_acc_z_entropy,total_acc_z_mean_power,total_acc_z_max_freq_idx
0,0.002269,0.002941,-0.004294,0.01081,0.002025,0.481111,-0.395797,0.226044,4.338396,0.024127,...,0.00397,0.088742,0.109485,0.099841,0.071125,0.4938,163.220498,1.654016,0.132356,0
1,0.000174,0.001981,-0.006706,0.005251,0.00011,-0.480776,1.472747,0.064786,4.462213,0.016851,...,0.004918,0.0811,0.105788,0.097748,-1.084209,1.257869,154.361101,1.850264,0.135705,0


## PREPROCESSING

In [11]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler


In [12]:
# missing_values = df_features.isnull().sum()
print(df_features.isnull().sum().sum())  # Should be 0


0


# If any missing values are found, impute them with the column mean
if df_features.isnull().values.any():
    df_features = df_features.fillna(df_features.mean())


In [17]:
X = df_features
X.head(2)




### STANDARDIZE FEATURES

In [None]:
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Convert back to DataFrame with feature names
X_scaled_df = pd.DataFrame(X_scaled, columns=X.columns)

In [19]:
X_scaled_df.head(2)

Unnamed: 0,body_acc_x_mean,body_acc_x_std,body_acc_x_min,body_acc_x_max,body_acc_x_median,body_acc_x_skew,body_acc_x_kurtosis,body_acc_x_energy,body_acc_x_entropy,body_acc_x_mean_power,...,total_acc_z_std,total_acc_z_min,total_acc_z_max,total_acc_z_median,total_acc_z_skew,total_acc_z_kurtosis,total_acc_z_energy,total_acc_z_entropy,total_acc_z_mean_power,total_acc_z_max_freq_idx
0,0.210534,-0.883335,0.918871,-0.868773,0.611526,0.356362,-0.3394,-0.706303,0.782367,-0.884585,...,-0.913413,0.40467,-0.444325,0.039661,0.299175,0.052158,-0.563109,-0.730411,-1.203022,-0.287719
1,0.060208,-0.890098,0.908664,-0.884263,0.567732,-1.17913,0.580954,-0.706492,1.339885,-0.893425,...,-0.900186,0.388502,-0.456851,0.033242,-1.522574,0.556553,-0.56574,-0.593033,-1.19572,-0.287719


In [21]:
X_CLEAN = X_scaled_df
X_CLEAN.head(2)



## SAVING THE FILES 

In [24]:
import os
import pandas as pd

# Define base path
base_path = "/Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/DATA_EXTRACTED_WITH_99_FEATURES"
os.makedirs(base_path, exist_ok=True)

# Save cleaned feature data
X_CLEAN.to_csv(os.path.join(base_path, "X_CLEAN.csv"), index=False)

# Save labels
y_df = pd.DataFrame(y, columns=['activity'])
y_df.to_csv(os.path.join(base_path, "Y_CLEAN.csv"), index=False)

# Save feature names
feature_names = df_features.columns.tolist()
with open(os.path.join(base_path, "feature_names.txt"), 'w') as f:
    for name in feature_names:
        f.write(f"{name}\n")

# Also save feature names in CSV
feature_names_df = pd.DataFrame(feature_names, columns=['feature_name'])
feature_names_df.to_csv(os.path.join(base_path, "feature_names.csv"), index=False)

# Save shape info
shape_info = {
    'features_shape': df_features.shape,
    'labels_shape': y_df.shape
}
shape_info_df = pd.DataFrame([shape_info])
shape_info_df.to_csv(os.path.join(base_path, "shape_info.csv"), index=False)

print(" All files saved under:", base_path)


 All files saved under: /Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/DATA_EXTRACTED_WITH_99_FEATURES


In [38]:
readme_text = """
FOLDER STRUCTURE AND CONTENTS
============================================================

/stage1_BASELINE/
├── DATA_EXTRACTED_WITH_99_FEATURES/  # Contains cleaned and preprocessed data
│   ├── feature_names.csv
│   ├── feature_names.txt
│   ├── shape_info.csv
│   ├── X_CLEAN.csv
│   └── Y_CLEAN.csv
├── OUTPUT/
│   ├── <MODEL_NAME>_RESULTS/  # e.g., LinearSVC_RESULTS/
│   │   ├── classification_report.txt
│   │   └── confusion_matrix.png
│   └── model_comparison.txt
└── README.txt
------------------------------------------------------------

Folder: DATA_EXTRACTED_WITH_99_FEATURES
------------------------------------------------------------
This folder contains the cleaned and preprocessed data extracted from the **UCI HAR Dataset** using raw inertial signal files.

✔️ SOURCE:
-----------
The raw data was taken from the Inertial Signals directory of the original UCI HAR dataset. These signals include 9 sensor signals:
- body_acc_x
- body_acc_y
- body_acc_z
- body_gyro_x
- body_gyro_y
- body_gyro_z
- total_acc_x
- total_acc_y
- total_acc_z

Each signal was recorded over 128 time steps for every activity window/sample.

✔️ PIPELINE:
------------
1. **Raw Data Loaded**  
   Inertial signal files from `train` and `test` directories were loaded and concatenated to form the full raw dataset.

2. **Raw Files Saved**  
   Raw signal data was saved as `X_RAW.csv` and labels as `Y_RAW.csv` for reference.

3. **Feature Extraction**  
   From each window (i.e., one sample of 128 time steps), the following statistical and frequency-domain features were extracted:
   
   For each signal (total 9), the following 11 features were computed:
   - Mean
   - Standard Deviation
   - Minimum
   - Maximum
   - Median
   - Skewness
   - Kurtosis
   - Energy (Sum of squares of FFT)
   - Entropy (of normalized FFT)
   - Mean Power (mean of FFT magnitudes)
   - Max Frequency Index (argmax of FFT)

   👉 This results in **99 features total** (9 signals × 11 features each).

4. **Cleaned Data Saved**  
   The extracted features were saved as:
   - `X_CLEAN.csv`: Cleaned feature matrix (shape: [samples, 99])
   - `Y_CLEAN.csv`: Corresponding labels for each sample
   - `feature_names.csv`: Names of the 99 features
   - `feature_names.txt`: Plain text list of all features
   - `shape_info.csv`: Shape of the final feature and label datasets

✔️ OUTPUT FILES:
----------------
- `X_CLEAN.csv`: Extracted feature dataset
- `Y_CLEAN.csv`: Activity labels
- `feature_names.csv`: Feature names in CSV
- `feature_names.txt`: Feature names in plain text
- `shape_info.csv`: Dataset shapes

📌 This dataset is now ready for use in machine learning pipelines for Human Activity Recognition (HAR).

Author: Priyam Pandey  
Date: [24 TH MAY 2025]

---------------------
MODEL TRAINING AND EVALUATION
------------------------------------------------------------
* Loaded `X_CLEAN.csv` and `Y_CLEAN.csv` from the given base path

* Split data into training and testing sets

* Defined 12 classification models:
  `[LinearSVC, GradientBoosting, ExtraTrees, Bagging, ANN, RandomForest, CART, GaussianNB, DecisionTree, AdaBoost, KNN, LogisticRegression]`

* Trained each model using a loop

* Calculated metrics: Accuracy, F1 Score, Recall, Precision
* Saved classification report (`.txt`) and confusion matrix (`.png`) for each model in a separate folder named `<MODEL_NAME>_RESULTS`
* Compiled all model scores into a comparison table, saved as `model_comparison.txt` in the base path
------------------------------------------------------------
Folder Structure:

/stage1_BASELINE/
│
├── X_CLEAN.csv
├── Y_CLEAN.csv
"""
base_path = "/Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE"
with open(os.path.join(base_path, "README.txt"), "w") as f:
    f.write(readme_text)
print("README file created at:", os.path.join(base_path, "README.txt"))
# Save the README file in the base path

README file created at: /Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE/README.txt


### IMPORT LIBRARIES


In [28]:
import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.metrics import (
    accuracy_score, f1_score, recall_score, precision_score,
    classification_report, confusion_matrix
)

# ML models
from sklearn.svm import LinearSVC
from sklearn.ensemble import (
    GradientBoostingClassifier, ExtraTreesClassifier, BaggingClassifier,
    RandomForestClassifier, AdaBoostClassifier
)
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier


### LOADING DATA AND SPLITTING IT 

In [33]:
# Define base path
base_path1 = "/Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE/DATA_EXTRACTED_WITH_99_FEATURES"

# Load clean data
X = pd.read_csv(os.path.join(base_path1, "X_CLEAN.csv"))
y = pd.read_csv(os.path.join(base_path1, "Y_CLEAN.csv")).values.ravel()

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


### DEFINING MODEL DIRECTORY


In [34]:
base_path= "/Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE/OUTPUT1"

In [35]:
# Dictionary of models
models = {
    "Linear_SVC": LinearSVC(max_iter=10000),
    "Gradient_Boosting": GradientBoostingClassifier(),
    "Extra_Trees": ExtraTreesClassifier(),
    "Bagged_Decision_Trees": BaggingClassifier(),
    "ANN": MLPClassifier(max_iter=1000),
    "Random_Forest": RandomForestClassifier(),
    "CART": DecisionTreeClassifier(),  # Same as Decision Tree
    "Gaussian_Naive_Bayes": GaussianNB(),
    "Decision_Tree": DecisionTreeClassifier(),
    "AdaBoost": AdaBoostClassifier(),
    "KNN": KNeighborsClassifier(),
    "Logistic_Regression": LogisticRegression(max_iter=10000)
}


### 🔁 Train, Evaluate, Save Report & Confusion Matrix

In [36]:
# Store results
results = []

# Loop through each model
for model_name, model in models.items():
    print(f"Training {model_name}...")
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # Metrics
    acc = accuracy_score(y_test, y_pred)
    # zero_division=0 scores never-predicted classes as 0 without emitting warnings
    f1 = f1_score(y_test, y_pred, average='weighted', zero_division=0)
    recall = recall_score(y_test, y_pred, average='weighted', zero_division=0)
    precision = precision_score(y_test, y_pred, average='weighted', zero_division=0)
    
    results.append([model_name, acc, f1, recall, precision])

    # Classification report & confusion matrix
    report = classification_report(y_test, y_pred, zero_division=0)
    cm = confusion_matrix(y_test, y_pred)

    # Save in model-specific folder
    model_folder = os.path.join(base_path, f"{model_name}_RESULTS")
    os.makedirs(model_folder, exist_ok=True)

    with open(os.path.join(model_folder, f"{model_name}_classification_report.txt"), "w") as f:
        f.write(report)

    plt.figure(figsize=(8, 6))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
    plt.title(f"Confusion Matrix - {model_name}")
    plt.xlabel("Predicted")
    plt.ylabel("Actual")
    plt.savefig(os.path.join(model_folder, f"{model_name}_confusion_matrix.png"))
    plt.close()
    print(f"{model_name} completed. Results saved in {model_folder}")
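
`average='weighted'` computes each metric per class and then averages the per-class values with weights proportional to each class's support (its number of true samples). As an illustration, the sketch below reproduces the weighted F1 from a confusion matrix with NumPy, treating undefined ratios as 0. `weighted_f1_from_confusion` is a hypothetical helper and the 2×2 matrix is a made-up example.

```python
import numpy as np

def weighted_f1_from_confusion(cm):
    """Support-weighted F1 from a confusion matrix (rows = actual, cols = predicted)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    support = cm.sum(axis=1)    # true samples per class
    predicted = cm.sum(axis=0)  # predicted samples per class
    # Undefined ratios (0 / 0) are set to 0.
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, support, out=np.zeros_like(tp), where=support > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return float(np.sum(f1 * support) / support.sum())

# Class 0: precision 2/3, recall 1   -> F1 = 0.8
# Class 1: precision 1,   recall 1/2 -> F1 = 2/3
demo_cm = [[2, 0],
           [1, 1]]
demo_f1 = weighted_f1_from_confusion(demo_cm)  # (0.8 * 2 + (2/3) * 2) / 4
```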

Training Linear_SVC...
Linear_SVC completed. Results saved in /Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE/OUTPUT1/Linear_SVC_RESULTS
Training Gradient_Boosting...
Gradient_Boosting completed. Results saved in /Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE/OUTPUT1/Gradient_Boosting_RESULTS
Training Extra_Trees...
Extra_Trees completed. Results saved in /Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE/OUTPUT1/Extra_Trees_RESULTS
Training Bagged_Decision_Trees...
Bagged_Decision_Trees completed. Results saved in /Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE/OUTPUT1/Bagged_Decision_Trees_RESULTS
Training ANN...
ANN completed. Results saved in /Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE/OUTPUT1/ANN_RESULTS
Training Random_Forest...
Random_Forest completed. Results saved in /Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE/OUTPUT1/Random_Forest_RESULTS
...

AdaBoost completed. Results saved in /Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE/OUTPUT1/AdaBoost_RESULTS
Training KNN...
KNN completed. Results saved in /Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE/OUTPUT1/KNN_RESULTS
Training Logistic_Regression...
Logistic_Regression completed. Results saved in /Users/priyam/paper_recreation/HAR MODEL_OPTIMIZATION/stage1_BASELINE/OUTPUT1/Logistic_Regression_RESULTS


### SAVE COMPARISON TABLE 

In [37]:
# Create and sort comparison DataFrame
comparison_df = pd.DataFrame(results, columns=["Model", "Accuracy", "F1 Score", "Recall", "Precision"])
comparison_df.sort_values(by="F1 Score", ascending=False, inplace=True)

# Save to TXT
comparison_txt_path = os.path.join(base_path, "model_comparison.txt")
with open(comparison_txt_path, "w") as f:
    f.write(comparison_df.to_string(index=False))

print("✅ All models trained and results saved successfully.")


✅ All models trained and results saved successfully.


# MODEL TRAINING

In [19]:
!pip install tensorflow 



In [20]:
import os
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tensorflow.keras.callbacks import EarlyStopping

# Replace with your desired base path to save results
base_path = "/Users/priyam/paper_recreation/UCI_HAR_FEATURE_ANALYSIS_MODELLING/deep_learning_models_results"

os.makedirs(base_path, exist_ok=True)



### DEFINING THE MODELS

In [32]:
def get_lstm(input_shape, n_classes):
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.LSTM(64),
        layers.Dense(64, activation='relu'),
        layers.Dense(n_classes, activation='softmax')
    ])
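
class Attention(tf.keras.layers.Layer):
    """Attention pooling: softmax-weighted sum of the sequence over time."""
    def __init__(self, **kwargs):
        # Forward name/dtype/etc. to the base Layer so a saved model can be reloaded
        super().__init__(**kwargs)
        self.score_dense = tf.keras.layers.Dense(1)

    def call(self, inputs):
        # inputs: (batch, time, units); one scalar score per time step
        score = tf.nn.softmax(self.score_dense(inputs), axis=1)
        context = tf.reduce_sum(inputs * score, axis=1)  # (batch, units)
        return context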

def get_attention_lstm(input_shape, n_classes):
    inputs = layers.Input(shape=input_shape)
    x = layers.LSTM(64, return_sequences=True)(inputs)
    x = Attention()(x)
    x = layers.Dense(64, activation='relu')(x)
    outputs = layers.Dense(n_classes, activation='softmax')(x)
    return models.Model(inputs, outputs)


def get_bilstm(input_shape, n_classes):
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Bidirectional(layers.LSTM(64)),
        layers.Dense(64, activation='relu'),
        layers.Dense(n_classes, activation='softmax')
    ])

def get_cnn_lstm(input_shape, n_classes):
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv1D(64, 3, activation='relu'),
        layers.MaxPooling1D(2),
        layers.LSTM(64),
        layers.Dense(64, activation='relu'),
        layers.Dense(n_classes, activation='softmax')
    ])

def get_gru(input_shape, n_classes):
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.GRU(64),
        layers.Dense(64, activation='relu'),
        layers.Dense(n_classes, activation='softmax')
    ])

def get_transformer(input_shape, n_classes):
    inputs = layers.Input(shape=input_shape)
    x = layers.Dense(64)(inputs)
    x = layers.MultiHeadAttention(num_heads=2, key_dim=32)(x, x)
    x = layers.GlobalAveragePooling1D()(x)
    x = layers.Dense(64, activation='relu')(x)
    outputs = layers.Dense(n_classes, activation='softmax')(x)
    return models.Model(inputs, outputs)
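
The `Attention` layer defined above scores every time step with a single dense unit, turns the scores into weights with a softmax over the time axis, and returns the weighted sum of the LSTM outputs. The NumPy sketch below mirrors that computation outside Keras so the shapes are easy to follow. `attention_pool`, the weight vector `w`, and the random input are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def attention_pool(inputs, w, b=0.0):
    """Softmax-over-time pooling of a (batch, time, features) array."""
    scores = inputs @ w + b                              # (batch, time, 1)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # sums to 1 over time
    context = (inputs * weights).sum(axis=1)             # (batch, features)
    return context, weights

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 128, 64))  # stand-in for LSTM outputs

# With all-zero score weights every time step gets weight 1/128,
# so the pooled context equals the plain average over time.
context, weights = attention_pool(h, w=np.zeros((64, 1)))
print(context.shape)  # (4, 64)
```

In the trained layer the scores are learned, so informative time steps receive larger weights than the uniform case shown here.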


In [33]:
# NOTE: model_factories, input_shape and n_classes are not defined in the cells shown.
# The definitions below are an assumed reconstruction from the factory functions
# above and the model names that appear in the training log.
model_factories = {
    "LSTM": get_lstm,
    "Attention_LSTM": get_attention_lstm,
    "BiLSTM": get_bilstm,
    "CNN_LSTM": get_cnn_lstm,
    "GRU": get_gru,
    "Transformer": get_transformer,  # key name assumed; not visible in the log
}
input_shape = x_train.shape[1:]  # (128, 9)
n_classes = y_train.shape[1]     # 6 one-hot columns

for model_name, model_func in model_factories.items():
    print(f"\n🟡 Training {model_name}")
    
    model = model_func(input_shape, n_classes)
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    # Callbacks
    early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

    history = model.fit(
        x_train, y_train,
        epochs=15,
        batch_size=64,
        validation_split=0.1,
        callbacks=[early_stop],
        verbose=1
    )

    # 🔍 Evaluate: convert softmax outputs and one-hot targets back to class indices
    y_pred = model.predict(x_test)
    y_pred_classes = np.argmax(y_pred, axis=1)
    y_true = np.argmax(y_test, axis=1)

    # 📁 Save folder
    model_dir = os.path.join(base_path, model_name)
    os.makedirs(model_dir, exist_ok=True)

    # 📊 Save classification report as CSV
    report = classification_report(y_true, y_pred_classes, output_dict=True)
    report_df = pd.DataFrame(report).transpose()
    report_df.to_csv(os.path.join(model_dir, "classification_report.csv"))
    
    # 📸 Also save classification report as PNG
    fig, ax = plt.subplots(figsize=(10, len(report_df)*0.5 + 1))
    ax.axis('off')
    table = ax.table(cellText=np.round(report_df.values, 2),
                    colLabels=report_df.columns,
                    rowLabels=report_df.index,
                    loc='center',
                    cellLoc='center')
    table.auto_set_font_size(False)
    table.set_fontsize(10)
    table.scale(1, 1.5)
    
    plt.title(f"{model_name} Classification Report", fontsize=12)
    plt.tight_layout()
    plt.savefig(os.path.join(model_dir, "classification_report.png"))
    plt.close()
    
    
    # 🧩 Save confusion matrix
    cm = confusion_matrix(y_true, y_pred_classes)
    plt.figure(figsize=(6, 5))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
    plt.title(f"{model_name} Confusion Matrix")
    plt.xlabel("Predicted")
    plt.ylabel("True")
    plt.tight_layout()
    plt.savefig(os.path.join(model_dir, "confusion_matrix.png"))
    plt.close()

    # 💾 Save model
    model.save(os.path.join(model_dir, f"{model_name}.h5"))
    print(f"✅ {model_name} training complete. Results saved in: {model_dir}")
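
With `patience=3`, training stops once `val_loss` has failed to improve for three consecutive epochs, and `restore_best_weights=True` rolls the model back to the best epoch seen. The pure-Python sketch below is a simplified model of that rule (it ignores `min_delta` and the callback's other options). `simulate_early_stopping` is a hypothetical helper and the loss values are illustrative, not taken from the runs.

```python
def simulate_early_stopping(val_losses, patience=3):
    """Return (epoch at which training stops, epoch whose weights are kept).

    Epochs are 1-indexed. If patience is never exhausted, training runs
    through the last epoch in val_losses.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

# Best loss at epoch 2, then three epochs without improvement:
# training stops after epoch 5 and the epoch 6 value is never reached.
stopped_at, kept_epoch = simulate_early_stopping([0.50, 0.30, 0.31, 0.32, 0.33, 0.20])
print(stopped_at, kept_epoch)  # 5 2
```

This is why a run can end well before the 15 epochs requested in `model.fit`.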



üü° Training LSTM
Epoch 1/15
[1m116/116[0m [32m‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ[0m[37m[0m [1m5s[0m 36ms/step - accuracy: 0.5330 - loss: 1.2663 - val_accuracy: 0.8811 - val_loss: 0.3775
Epoch 2/15
[1m116/116[0m [32m‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ[0m[37m[0m [1m4s[0m 31ms/step - accuracy: 0.9140 - loss: 0.2657 - val_accuracy: 0.9053 - val_loss: 0.2478
Epoch 3/15
[1m116/116[0m [32m‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ[0m[37m[0m [1m4s[0m 30ms/step - accuracy: 0.9405 - loss: 0.1715 - val_accuracy: 0.9126 - val_loss: 0.2251
Epoch 4/15
[1m116/116[0m [32m‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ[0m[37m[0m [1m4s[0m 31ms/step - accuracy: 0.9364 - loss: 0.1589 - val_accuracy: 0.9320 - val_loss: 0.1721
Epoch 5/15
[1m116/116[0m [32m‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ[0m[37m[0m [1m4s[0m 31ms/step - accuracy: 0.9419 - loss: 0.1385 - val_a



‚úÖ LSTM training complete. Results saved in: /Users/priyam/paper_recreation/UCI_HAR_FEATURE_ANALYSIS_MODELLING/deep_learning_models_results/LSTM

üü° Training Attention_LSTM
Epoch 1/15
[1m116/116[0m [32m‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ[0m[37m[0m [1m5s[0m 34ms/step - accuracy: 0.6229 - loss: 1.1440 - val_accuracy: 0.9248 - val_loss: 0.2138
Epoch 2/15
[1m116/116[0m [32m‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ[0m[37m[0m [1m4s[0m 33ms/step - accuracy: 0.9289 - loss: 0.1977 - val_accuracy: 0.9478 - val_loss: 0.1365
Epoch 3/15
[1m116/116[0m [32m‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ[0m[37m[0m [1m4s[0m 33ms/step - accuracy: 0.9393 - loss: 0.1421 - val_accuracy: 0.9430 - val_loss: 0.1382
Epoch 4/15
[1m116/116[0m [32m‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ[0m[37m[0m [1m4s[0m 34ms/step - accuracy: 0.9424 - loss: 0.1362 - val_accuracy: 0.9296 - val_loss: 0.1411
Epoch 5/15




‚úÖ Attention_LSTM training complete. Results saved in: /Users/priyam/paper_recreation/UCI_HAR_FEATURE_ANALYSIS_MODELLING/deep_learning_models_results/Attention_LSTM

🟡 Training BiLSTM
Epoch 1/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 6s 40ms/step - accuracy: 0.5855 - loss: 1.1343 - val_accuracy: 0.9248 - val_loss: 0.2244
Epoch 2/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 7s 59ms/step - accuracy: 0.9141 - loss: 0.2303 - val_accuracy: 0.9345 - val_loss: 0.1602
Epoch 3/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 7s 60ms/step - accuracy: 0.9397 - loss: 0.1507 - val_accuracy: 0.9454 - val_loss: 0.1373
Epoch 4/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 7s 61ms/step - accuracy: 0.9483 - loss: 0.1251 - val_accuracy: 0.9405 - val_loss: 0.1422

✅ BiLSTM training complete. Results saved in: /Users/priyam/paper_recreation/UCI_HAR_FEATURE_ANALYSIS_MODELLING/deep_learning_models_results/BiLSTM

🟡 Training CNN_LSTM
Epoch 1/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 3s 24ms/step - accuracy: 0.6313 - loss: 1.0112 - val_accuracy: 0.8968 - val_loss: 0.2791
Epoch 2/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 3s 23ms/step - accuracy: 0.9142 - loss: 0.2308 - val_accuracy: 0.9114 - val_loss: 0.2216
Epoch 3/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 3s 25ms/step - accuracy: 0.9431 - loss: 0.1506 - val_accuracy: 0.9333 - val_loss: 0.1503
Epoch 4/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 3s 25ms/step - accuracy: 0.9475 - loss: 0.1236 - val_accuracy: 0.9466 - val_loss: 0.1306
Epoch 5/15

✅ CNN_LSTM training complete. Results saved in: /Users/priyam/paper_recreation/UCI_HAR_FEATURE_ANALYSIS_MODELLING/deep_learning_models_results/CNN_LSTM

🟡 Training GRU
Epoch 1/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.5048 - loss: 1.3255 - val_accuracy: 0.7925 - val_loss: 0.5509
Epoch 2/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 4s 30ms/step - accuracy: 0.8414 - loss: 0.4366 - val_accuracy: 0.8883 - val_loss: 0.2977
Epoch 3/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 4s 33ms/step - accuracy: 0.8823 - loss: 0.2950 - val_accuracy: 0.9248 - val_loss: 0.1950
Epoch 4/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 4s 31ms/step - accuracy: 0.9315 - loss: 0.1690 - val_accuracy: 0.9357 - val_loss: 0.1633
Epoch 5/15

✅ GRU training complete. Results saved in: /Users/priyam/paper_recreation/UCI_HAR_FEATURE_ANALYSIS_MODELLING/deep_learning_models_results/GRU

🟡 Training Transformer
Epoch 1/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.5570 - loss: 1.1594 - val_accuracy: 0.8677 - val_loss: 0.3338
Epoch 2/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 5s 39ms/step - accuracy: 0.8715 - loss: 0.3206 - val_accuracy: 0.9041 - val_loss: 0.2417
Epoch 3/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 5s 40ms/step - accuracy: 0.9031 - loss: 0.2544 - val_accuracy: 0.9150 - val_loss: 0.2032
Epoch 4/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 5s 40ms/step - accuracy: 0.9124 - loss: 0.2198 - val_accuracy: 0.9260 - val_loss: 0.1845
Epoch 5/15

✅ Transformer training complete. Results saved in: /Users/priyam/paper_recreation/UCI_HAR_FEATURE_ANALYSIS_MODELLING/deep_learning_models_results/Transformer
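
The training loop in this notebook passes `EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)` to `model.fit`, which is one reason a run can end before epoch 15. The sketch below mimics that patience rule in plain Python so the logged numbers can be traced by hand. `early_stopping_trace` is a hypothetical helper, not part of Keras, and it ignores options such as `min_delta`. The first list holds the four `val_loss` values logged for Attention_LSTM above; the fifth value in the second call is invented purely for illustration.

```python
def early_stopping_trace(val_losses, patience=3):
    """Return (best_epoch, stop_epoch) using 1-based epoch numbers.

    stop_epoch is None if patience is never exhausted within the
    given sequence of validation losses.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, epoch
    return best_epoch, None


# val_loss values copied from the Attention_LSTM log (epochs 1 to 4).
logged = [0.2138, 0.1365, 0.1382, 0.1411]
print(early_stopping_trace(logged))           # (2, None)

# Hypothetical fifth epoch with no improvement: patience runs out.
print(early_stopping_trace(logged + [0.15]))  # (2, 5)
```

With `restore_best_weights=True`, the weights kept in the second case would be those from epoch 2, the epoch with the lowest validation loss.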


### Preparing the dataset and model dictionary

In [30]:
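from sklearn.model_selection import train_test_split

# x must already be normalized and y_encoded already one-hot encoded.
# Hold out 20% of the windows for testing; the fixed seed keeps the split reproducible.
x_train, x_test, y_train, y_test = train_test_split(x, y_encoded, test_size=0.2, random_state=42)

input_shape = x_train.shape[1:]  # per-sample shape, e.g. (128, 9) for the raw inertial signals
n_classes = y_train.shape[1]     # width of the one-hot label matrix

# Map each model name to its builder function (assumed to be defined in earlier cells).
model_factories = {
    "LSTM": get_lstm,
    "Attention_LSTM": get_attention_lstm,
    "BiLSTM": get_bilstm,
    "CNN_LSTM": get_cnn_lstm,
    "GRU": get_gru,
    "Transformer": get_transformer,
}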
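
As a sanity check on the split sizes, the arithmetic below reproduces the `116/116` step count shown in the training logs. It is a sketch that assumes `x` holds all 10299 windows described earlier, that `train_test_split` rounds the test share up, and that Keras takes the validation share from the end of the training array and rounds the fitted part down. The `batch_size=64` and `validation_split=0.1` values are the ones passed to `model.fit` in the training loop.

```python
import math

n_samples = 10299        # assumed: train + test windows combined
test_size = 0.2          # from train_test_split
validation_split = 0.1   # from model.fit
batch_size = 64          # from model.fit

n_test = math.ceil(n_samples * test_size)   # test share is rounded up
n_train = n_samples - n_test
n_fit = math.floor(n_train * (1.0 - validation_split))  # samples used for weight updates
n_val = n_train - n_fit                     # samples held out for val_loss
steps_per_epoch = math.ceil(n_fit / batch_size)

print(n_train, n_test, n_fit, n_val, steps_per_epoch)
# 8239 2060 7415 824 116
```

Note that the validation data is carved out of the training split, so the 2060 test windows stay untouched until `model.predict(x_test)`.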


### Train, save, and evaluate models


In [31]:
import os

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.metrics import classification_report, confusion_matrix
from tensorflow.keras.callbacks import EarlyStopping

# base_path (the root results directory) is assumed to be defined in an earlier cell.
for model_name, model_func in model_factories.items():
    print(f"\n🟡 Training {model_name}")

    # Build a fresh model for this architecture and compile it.
    model = model_func(input_shape, n_classes)
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    # Stop once val_loss has not improved for 3 epochs and roll back to the best weights.
    early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

    history = model.fit(
        x_train, y_train,
        epochs=15,
        batch_size=64,
        validation_split=0.1,
        callbacks=[early_stop],
        verbose=1
    )

    # 🔍 Evaluate on the held-out test set (one-hot vectors -> class indices)
    y_pred = model.predict(x_test)
    y_pred_classes = np.argmax(y_pred, axis=1)
    y_true = np.argmax(y_test, axis=1)

    # 📁 One results folder per model
    model_dir = os.path.join(base_path, model_name)
    os.makedirs(model_dir, exist_ok=True)

    # 📊 Save classification report as CSV
    report = classification_report(y_true, y_pred_classes, output_dict=True)
    report_df = pd.DataFrame(report).transpose()
    report_df.to_csv(os.path.join(model_dir, "classification_report.csv"))

    # 📸 Also save the classification report as a PNG table
    fig, ax = plt.subplots(figsize=(10, len(report_df) * 0.5 + 1))
    ax.axis('off')
    table = ax.table(cellText=np.round(report_df.values, 2),
                     colLabels=report_df.columns,
                     rowLabels=report_df.index,
                     loc='center',
                     cellLoc='center')
    table.auto_set_font_size(False)
    table.set_fontsize(10)
    table.scale(1, 1.5)

    plt.title(f"{model_name} Classification Report", fontsize=12)
    plt.tight_layout()
    plt.savefig(os.path.join(model_dir, "classification_report.png"))
    plt.close()

    # 🧩 Save confusion matrix as a heatmap
    cm = confusion_matrix(y_true, y_pred_classes)
    plt.figure(figsize=(6, 5))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
    plt.title(f"{model_name} Confusion Matrix")
    plt.xlabel("Predicted")
    plt.ylabel("True")
    plt.tight_layout()
    plt.savefig(os.path.join(model_dir, "confusion_matrix.png"))
    plt.close()

    # 💾 Save the trained model in HDF5 format
    model.save(os.path.join(model_dir, f"{model_name}.h5"))
    print(f"✅ {model_name} training complete. Results saved in: {model_dir}")



🟡 Training LSTM
Epoch 1/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 5s 32ms/step - accuracy: 0.5133 - loss: 1.2729 - val_accuracy: 0.8422 - val_loss: 0.3887
Epoch 2/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 4s 32ms/step - accuracy: 0.8685 - loss: 0.3384 - val_accuracy: 0.9284 - val_loss: 0.2258
Epoch 3/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 3s 30ms/step - accuracy: 0.9335 - loss: 0.1874 - val_accuracy: 0.9393 - val_loss: 0.1678
Epoch 4/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 4s 30ms/step - accuracy: 0.9501 - loss: 0.1380 - val_accuracy: 0.9430 - val_loss: 0.1568
Epoch 5/15
116/116 ━━━━━━━━━━━━━━━━━━━━ 4s 30ms/step - accuracy: 0.9497 - loss: 0.1353 - val_a

✅ LSTM training complete. Results saved in: /Users/priyam/paper_recreation/UCI_HAR_FEATURE_ANALYSIS_MODELLING/deep_learning_models_results/LSTM

🟡 Training Attention_LSTM
Epoch 1/15

ValueError: Exception encountered when calling Attention.call().

tf.function only supports singleton tf.Variables created on the first call. Make sure the tf.Variable is only created once or created outside tf.function. See https://www.tensorflow.org/guide/function#creating_tfvariables for more information.

Arguments received by Attention.call():
  • inputs=tf.Tensor(shape=(None, 128, 64), dtype=float32)
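
This `ValueError` typically appears when a custom layer creates its weights (or sub-layers) inside `call()`, so a new `tf.Variable` would be created each time the function is traced. The notebook's own `Attention` layer is not shown in this section, so the following is only a sketch of the usual fix under that assumption: create the weights once in `build()` and only use them in `call()`. The class name `SimpleAttention`, its weight names, and the tanh/softmax scoring are hypothetical and stand in for whatever the real layer does.

```python
import tensorflow as tf


class SimpleAttention(tf.keras.layers.Layer):
    """Hypothetical attention pooling over the time axis."""

    def build(self, input_shape):
        n_features = int(input_shape[-1])
        # Weights are created exactly once here, never inside call().
        self.w = self.add_weight(
            name="att_weight", shape=(n_features, 1),
            initializer="glorot_uniform", trainable=True,
        )
        self.b = self.add_weight(
            name="att_bias", shape=(1,),
            initializer="zeros", trainable=True,
        )
        super().build(input_shape)

    def call(self, inputs):
        # inputs: (batch, time_steps, features), e.g. (None, 128, 64)
        scores = tf.tanh(tf.tensordot(inputs, self.w, axes=1) + self.b)
        weights = tf.nn.softmax(scores, axis=1)          # normalize over time steps
        return tf.reduce_sum(inputs * weights, axis=1)   # (batch, features)


# Shape check on a dummy batch matching the shape in the error message.
layer = SimpleAttention()
pooled = layer(tf.zeros((2, 128, 64)))
print(pooled.shape)  # (2, 64)
```

If the existing layer builds `Dense` sub-layers inside `call()`, the same idea applies: move their construction to `__init__` or `build()` so that repeated calls reuse the same variables.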

In [None]:
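# Rebuild and display the most recent classification report.
# y_true, y_pred_classes and model_dir keep the values from the last model
# that completed evaluation in the loop, so this also rewrites that model's CSV.
report = classification_report(y_true, y_pred_classes, output_dict=True)
report_df = pd.DataFrame(report).transpose()
report_df.to_csv(os.path.join(model_dir, "classification_report.csv"))
print(report_df)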