### Task

Given accelerometer data from a mobile phone, classify the activity a person is performing: walking, standing, running, or climbing stairs. Use the SVM and Random Forest algorithms from the scikit-learn library. The raw accelerometer readings can serve as features directly, but preparing the dataset and computing time-domain features first can improve the models' performance.
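To make "time-domain features" concrete, here is a minimal sketch that summarizes non-overlapping windows of readings with the mean, standard deviation, minimum, and maximum of each axis. The column names `x`, `y`, `z`, the window length of 30 rows, and the helper name `time_domain_features` are assumptions for illustration; the dataset's real column names and sampling rate are not given here.

```python
import numpy as np
import pandas as pd

def time_domain_features(df, window=30):
    """Summarize each non-overlapping window of `window` rows, per axis."""
    # Assign every row to a window index: 0, 0, ..., 1, 1, ...
    groups = np.arange(len(df)) // window
    # Mean, std, min, and max of each axis within each window
    feats = df.groupby(groups).agg(['mean', 'std', 'min', 'max'])
    # Flatten the (column, statistic) MultiIndex into names like 'x_mean'
    feats.columns = [f'{col}_{stat}' for col, stat in feats.columns]
    return feats

# Synthetic stand-in for one recording (hypothetical column names)
rng = np.random.default_rng(0)
sample = pd.DataFrame(rng.normal(size=(90, 3)), columns=['x', 'y', 'z'])
features = time_domain_features(sample, window=30)
print(features.shape)  # (3, 12): 3 windows, 4 statistics for each of 3 axes
```

Each output row then describes a short stretch of movement rather than a single instant, which is usually what distinguishes activities such as walking and climbing stairs.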

In [1]:
import glob
import os

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

### 1. Define Data Loading Function

In [2]:
# Function to load data from root/<activity>/*.csv, labeling rows by folder name
def load_data(root):
    frames = []
    for activity in os.listdir(root):
        activity_path = os.path.join(root, activity)
        # Skip stray files; only sub-directories represent activities
        if not os.path.isdir(activity_path):
            continue
        for file_path in glob.glob(os.path.join(activity_path, '*.csv')):
            df = pd.read_csv(file_path)
            df['activity'] = activity
            frames.append(df)
    # Concatenate once at the end instead of growing a DataFrame inside the loop
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()

### 2. Define Data Preprocessing Function

In [3]:
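# Function to preprocess data: split off the label and add one derived feature
def preprocess_data(data):
    X = data.drop('activity', axis=1)
    # Despite its name, 'time_mean' is the mean across the sensor columns of a
    # single row (axis=1), not a statistic computed over a time window
    X['time_mean'] = X.mean(axis=1)
    y = data['activity']
    return X, y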

### 3. Define Model Training and Evaluation Function

In [4]:
# Function to train and evaluate a model
def train_and_evaluate_model(X_train, X_test, y_train, y_test, model):
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    report = classification_report(y_test, y_pred)
    return accuracy, report

### 4. Load and Preprocess Data

In [5]:
root = 'data'
data = load_data(root)
X, y = preprocess_data(data)

### 5. Split Data into Training, Validation, and Test Sets

In [6]:
# Hold out 30% of the rows, then halve that part: 70% train, 15% validation, 15% test
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=42)
X_validation, X_test, y_validation, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)
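The classes here are imbalanced (for example, `stairs` has far fewer rows than `running`), so a plain random split can leave the rare class under- or over-represented in a subset. A possible variation, not what this notebook ran, is to pass `stratify` to `train_test_split`. The sketch below uses synthetic labels (`X_demo`, `y_demo` are made up) to show that the class share is preserved.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced labels standing in for the activity column
y_demo = np.array(['running'] * 80 + ['stairs'] * 20)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)

# stratify=y_demo keeps the class proportions roughly equal in both parts
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

# Share of the rare class in each part
print((y_tr == 'stairs').mean(), (y_te == 'stairs').mean())
```

For the two-stage split used in this notebook, the second call would need `stratify=y_temp` as well.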


### 6. Create Machine Learning Models

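In [7]:
# Both models use scikit-learn's default hyperparameters (SVC: RBF kernel, C=1.0).
# RandomForestClassifier has no random_state here, so its scores vary slightly between runs.
svm_model = SVC()
rfc_model = RandomForestClassifier()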
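An SVM with an RBF kernel is sensitive to feature scale, while a Random Forest is not. One common variation, shown here only as a sketch and not as the configuration this notebook ran, is to standardize the features inside a pipeline and weight classes by inverse frequency. The data below is synthetic (`make_classification`), and the name `scaled_svm` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the accelerometer features
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=0)

# The scaler is fitted on the training data only, then applied before the SVC
scaled_svm = make_pipeline(StandardScaler(), SVC(class_weight='balanced'))
scaled_svm.fit(X_demo, y_demo)

# Mean accuracy on the data it was fitted on (illustration only)
score = scaled_svm.score(X_demo, y_demo)
print(score)
```

Because the pipeline exposes `fit` and `predict`, it could be passed to `train_and_evaluate_model` in place of `svm_model`; whether it improves the `stairs` class on this dataset would need to be checked.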


### 7. Train and Evaluate Models

In [8]:
svm_accuracy, svm_report = train_and_evaluate_model(X_train, X_validation, y_train, y_validation, svm_model)
rfc_accuracy, rfc_report = train_and_evaluate_model(X_train, X_validation, y_train, y_validation, rfc_model)


### 8. Display Results

In [9]:
print("Accuracy (SVM):", svm_accuracy)
print("SVM Results:")
print(svm_report)

Accuracy (SVM): 0.896248151587056
SVM Results:
              precision    recall  f1-score   support

        idle       0.95      0.99      0.97      4585
     running       0.93      0.90      0.92     15421
      stairs       1.00      0.00      0.00       686
     walking       0.81      0.91      0.86      8387

    accuracy                           0.90     29079
   macro avg       0.92      0.70      0.69     29079
weighted avg       0.90      0.90      0.89     29079
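The gap between the macro and weighted averages in the SVM report comes from the `stairs` class, whose recall rounds to 0.00. The short calculation below recomputes both averages from the rounded recall and support values printed in the table above, so the results match the report only to two decimals.

```python
# Per-class recall and support, copied from the SVM report above
recall = {'idle': 0.99, 'running': 0.90, 'stairs': 0.00, 'walking': 0.91}
support = {'idle': 4585, 'running': 15421, 'stairs': 686, 'walking': 8387}

# Macro average: every class counts equally, so 'stairs' pulls it down
macro = sum(recall.values()) / len(recall)

# Weighted average: classes count in proportion to their support
weighted = sum(recall[c] * support[c] for c in recall) / sum(support.values())

print(round(macro, 2), round(weighted, 2))  # 0.7 0.9
```

The weighted average (and the overall accuracy) hides the failure on `stairs` because that class makes up only 686 of the 29079 validation rows.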



In [10]:
print("Accuracy (Random Forest):", rfc_accuracy)
print("Random Forest Results:")
print(rfc_report)

Accuracy (Random Forest): 0.9995529419856254
Random Forest Results:
              precision    recall  f1-score   support

        idle       1.00      1.00      1.00      4585
     running       1.00      1.00      1.00     15421
      stairs       1.00      0.99      0.99       686
     walking       1.00      1.00      1.00      8387

    accuracy                           1.00     29079
   macro avg       1.00      1.00      1.00     29079
weighted avg       1.00      1.00      1.00     29079
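A near-perfect score is worth double-checking. `load_data` concatenates the rows of every CSV file and the split is done per row, so rows from the same recording end up in both the training and validation sets, which may inflate the result. One way to test this, sketched here on synthetic data, is to split by recording with `GroupShuffleSplit`. The `groups` array is hypothetical; the notebook does not currently record which file each row came from.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical setup: 6 recordings of 10 rows each; `groups` is the recording id per row
groups = np.repeat(np.arange(6), 10)
X_demo = np.zeros((len(groups), 3))
y_demo = np.repeat(['walking', 'running', 'idle'], 20)

# Split whole recordings, so no recording contributes rows to both parts
splitter = GroupShuffleSplit(n_splits=1, test_size=0.33, random_state=42)
train_idx, val_idx = next(splitter.split(X_demo, y_demo, groups=groups))

# Recordings that appear on both sides of the split
shared = set(groups[train_idx]) & set(groups[val_idx])
print(shared)  # set()
```

The held-out `X_test` and `y_test` from step 5 are also never used above; scoring the chosen model on them once, at the very end, would give a final estimate.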

