### [Dataset](https://archive.ics.uci.edu/dataset/256/daily+and+sports+activities)

1. The dataset comprises motion sensor data for 19 daily and sports activities, recorded with five Xsens MTx units worn on the torso, arms, and legs.
2. Each of the 19 activities was performed by eight subjects (4 female, 4 male, aged between 20 and 30) for 5 minutes.
3. The subjects were asked to perform the activities in their own style and were not restricted in how they performed them.
4. The activities were performed at the Bilkent University Sports Hall, in the Electrical and Electronics Engineering Building, and in a flat outdoor area on campus.
5. Each 5-min signal is divided into 5-sec segments, giving 60 segments per subject and 480 (= 60 x 8) segments per activity.
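
The segment counts follow directly from the recording length. A minimal sketch of that arithmetic, using only the figures quoted in the list above (the constant names are chosen here for illustration):

```python
# Figures taken from the dataset description; the names are illustrative.
RECORDING_SECONDS = 5 * 60   # each subject performs an activity for 5 minutes
SEGMENT_SECONDS = 5          # each segment covers 5 seconds
N_SUBJECTS = 8

segments_per_subject = RECORDING_SECONDS // SEGMENT_SECONDS
segments_per_activity = segments_per_subject * N_SUBJECTS

print(segments_per_subject)    # 60
print(segments_per_activity)   # 480
```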

![Activities](https://miro.medium.com/v2/resize:fit:3840/format:webp/1*mFwX5XN57l3UjuncAD5CNA.jpeg)

The 19 activities are:
* sitting (A1)
* standing (A2)
* lying on back and on right side (A3 and A4)
* ascending and descending stairs (A5 and A6)
* standing still in an elevator (A7)
* moving around in an elevator (A8)
* walking in a parking lot (A9)
* walking on a treadmill at 4 km/h, in flat and 15 deg inclined positions (A10 and A11)
* running on a treadmill at 8 km/h (A12)
* exercising on a stepper (A13)
* exercising on a cross trainer (A14)
* cycling on an exercise bike in horizontal and vertical positions (A15 and A16)
* rowing (A17)
* jumping (A18)
* playing basketball (A19)

Data volume:

- 19 activities (a) (in the order given above)
-  8 subjects   (p)
- 60 segments   (s)
-  5 units on torso (T), right arm (RA), left arm (LA), right leg (RL), left leg (LL)
-  9 sensor readings on each unit (x,y,z accelerometers, x,y,z gyroscopes, x,y,z magnetometers)

File structure:
* Folders a01, a02, ..., a19 contain data recorded from the 19 activities.
* For each activity, the subfolders p1, p2, ..., p8 contain data from each of the 8 subjects.
* In each subfolder, there are 60 text files s01, s02, ..., s60, one for each segment.
* In each text file, there are 5 units x 9 sensors = 45 columns and 5 sec x 25 Hz = 125 rows.
* Each column contains the 125 samples of data acquired from one of the sensors of one of the units over a period of 5 sec.
* Each row contains data acquired from all of the 45 sensor axes at a particular sampling instant separated by commas.

1. columns  1-9  correspond to the sensors in unit 1 (T), 
2. columns 10-18 correspond to the sensors in unit 2 (RA), 
3. columns 19-27 correspond to the sensors in unit 3 (LA), 
4. columns 28-36 correspond to the sensors in unit 4 (RL), 
5. columns 37-45 correspond to the sensors in unit 5 (LL). 
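
The column layout above can be generated rather than typed by hand. The sketch below builds the 45 column names from the unit and sensor order given in the lists above, and maps a 1-based column number back to its unit and reading. The `describe_column` helper is hypothetical and the `unit_axis+sensor` naming pattern (for example `T_xacc`) is one possible convention.

```python
from itertools import product

UNITS = ["T", "RA", "LA", "RL", "LL"]   # unit order from the list above
SENSORS = ["acc", "gyro", "mag"]        # sensor order within each unit
AXES = ["x", "y", "z"]                  # axis order within each sensor

# Column names in file order: unit first, then sensor type, then axis.
column_names = [
    f"{unit}_{axis}{sensor}"
    for unit, sensor, axis in product(UNITS, SENSORS, AXES)
]

def describe_column(col_number):
    """Map a 1-based column number (1-45) to its (unit, reading) pair."""
    unit, reading = column_names[col_number - 1].split("_")
    return unit, reading

print(len(column_names))     # 45
print(describe_column(1))    # ('T', 'xacc')
print(describe_column(10))   # ('RA', 'xacc')
print(describe_column(45))   # ('LL', 'zmag')
```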

In [1]:
import os
import numpy as np
import pandas as pd

In [2]:
activities = sorted(os.listdir("data"))  # os.listdir returns entries in arbitrary order
print(activities)

['a01', 'a02', 'a03', 'a04', 'a05', 'a06', 'a07', 'a08', 'a09', 'a10', 'a11', 'a12', 'a13', 'a14', 'a15', 'a16', 'a17', 'a18', 'a19']


In [3]:
subjects = sorted(os.listdir(f"data/{activities[0]}"))  # sort for a stable p1..p8 order
print(subjects)

['p1', 'p2', 'p3', 'p4', 'p5', 'p6', 'p7', 'p8']


In [4]:
segments = sorted(os.listdir(f"data/{activities[0]}/{subjects[0]}"))  # sort for a stable s01..s60 order
print(segments)

['s01.txt', 's02.txt', 's03.txt', 's04.txt', 's05.txt', 's06.txt', 's07.txt', 's08.txt', 's09.txt', 's10.txt', 's11.txt', 's12.txt', 's13.txt', 's14.txt', 's15.txt', 's16.txt', 's17.txt', 's18.txt', 's19.txt', 's20.txt', 's21.txt', 's22.txt', 's23.txt', 's24.txt', 's25.txt', 's26.txt', 's27.txt', 's28.txt', 's29.txt', 's30.txt', 's31.txt', 's32.txt', 's33.txt', 's34.txt', 's35.txt', 's36.txt', 's37.txt', 's38.txt', 's39.txt', 's40.txt', 's41.txt', 's42.txt', 's43.txt', 's44.txt', 's45.txt', 's46.txt', 's47.txt', 's48.txt', 's49.txt', 's50.txt', 's51.txt', 's52.txt', 's53.txt', 's54.txt', 's55.txt', 's56.txt', 's57.txt', 's58.txt', 's59.txt', 's60.txt']


In [5]:
header=["T_xacc", "T_yacc", "T_zacc", "T_xgyro", "T_ygyro", "T_zgyro", "T_xmag", "T_ymag", "T_zmag",
        "RA_xacc","RA_yacc","RA_zacc","RA_xgyro","RA_ygyro","RA_zgyro","RA_xmag","RA_ymag","RA_zmag",
        "LA_xacc","LA_yacc","LA_zacc","LA_xgyro","LA_ygyro","LA_zgyro","LA_xmag","LA_ymag","LA_zmag",
        "RL_xacc","RL_yacc","RL_zacc","RL_xgyro","RL_ygyro","RL_zgyro","RL_xmag","RL_ymag","RL_zmag",
        "LL_xacc","LL_yacc","LL_zacc","LL_xgyro","LL_ygyro","LL_zgyro","LL_xmag","LL_ymag","LL_zmag",]

In [6]:
df = pd.read_csv(f"data/{activities[0]}/{subjects[0]}/{segments[0]}",names=header)
df["Segment"]=segments[0]
df["Subject"]=subjects[0]
df["Activity"]=activities[0]

In [7]:
df

Unnamed: 0,T_xacc,T_yacc,T_zacc,T_xgyro,T_ygyro,T_zgyro,T_xmag,T_ymag,T_zmag,RA_xacc,...,LL_zacc,LL_xgyro,LL_ygyro,LL_zgyro,LL_xmag,LL_ymag,LL_zmag,Segment,Subject,Activity
0,8.1305,1.0349,5.4217,-0.009461,0.001915,-0.003424,-0.78712,-0.069654,0.15730,0.70097,...,2.6220,-0.000232,-0.012092,-0.004457,0.74017,0.30053,-0.057730,s01.txt,p1,a01
1,8.1305,1.0202,5.3843,-0.009368,0.023485,0.001953,-0.78717,-0.068275,0.15890,0.71829,...,2.6218,-0.014784,-0.016477,0.002789,0.73937,0.30183,-0.057514,s01.txt,p1,a01
2,8.1604,1.0201,5.3622,0.015046,0.014330,0.000204,-0.78664,-0.068277,0.15879,0.69849,...,2.6366,-0.012770,0.005717,-0.007918,0.73955,0.30052,-0.057219,s01.txt,p1,a01
3,8.1603,1.0052,5.3770,0.006892,0.018045,0.005649,-0.78529,-0.069849,0.15912,0.72799,...,2.6070,-0.005725,0.009620,0.006555,0.74029,0.30184,-0.057750,s01.txt,p1,a01
4,8.1605,1.0275,5.3473,0.008811,0.030433,-0.005346,-0.78742,-0.068796,0.15916,0.71572,...,2.6218,-0.003929,-0.008371,0.002816,0.73845,0.30090,-0.057527,s01.txt,p1,a01
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
120,7.9515,1.1242,5.6378,-0.010269,0.023425,-0.009024,-0.79241,-0.069590,0.13582,0.64127,...,2.6100,-0.000123,-0.002476,-0.021531,0.73889,0.30092,-0.057689,s01.txt,p1,a01
121,7.9442,1.1466,5.6080,0.006786,0.001938,0.002946,-0.79034,-0.069965,0.13456,0.61924,...,2.6247,0.001349,0.006134,0.004760,0.73996,0.30132,-0.057530,s01.txt,p1,a01
122,7.9517,1.1466,5.6081,0.000527,0.023588,0.010141,-0.79174,-0.069147,0.13343,0.61443,...,2.6247,-0.005735,-0.001302,-0.007031,0.73945,0.30342,-0.056789,s01.txt,p1,a01
123,7.9743,1.1542,5.5038,0.025818,0.005417,0.006603,-0.79166,-0.070216,0.13478,0.60929,...,2.6246,-0.020267,0.000585,0.000255,0.74030,0.30027,-0.056704,s01.txt,p1,a01


In [8]:
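# Read every segment file and tag its rows with segment, subject, and activity.
# The subject and segment lists from a01/p1 are reused for all folders,
# which relies on every activity folder having the same layout.
df_list = []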
for activity in activities:
    for subject in subjects:
        for segment in segments:
            temp_df=pd.read_csv(f"data/{activity}/{subject}/{segment}",names=header)
            temp_df["Segment"]=segment
            temp_df["Subject"]=subject
            temp_df["Activity"]=activity
            df_list.append(temp_df)
print(len(df_list))

9120
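
The count printed above can be checked against the dataset layout described earlier. A small sketch, assuming only the figures from the description (19 activities, 8 subjects, 60 segments, 125 rows per file):

```python
N_ACTIVITIES, N_SUBJECTS, N_SEGMENTS = 19, 8, 60
ROWS_PER_SEGMENT = 5 * 25   # 5 seconds sampled at 25 Hz

expected_files = N_ACTIVITIES * N_SUBJECTS * N_SEGMENTS
expected_rows = expected_files * ROWS_PER_SEGMENT

print(expected_files)   # 9120
print(expected_rows)    # 1140000
```

If every file has the expected 125 rows, the concatenated frame should therefore hold 1,140,000 rows.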


In [9]:
iot_df = pd.concat(df_list, ignore_index=True, sort=False)

In [10]:
iot_df.head()

Unnamed: 0,T_xacc,T_yacc,T_zacc,T_xgyro,T_ygyro,T_zgyro,T_xmag,T_ymag,T_zmag,RA_xacc,...,LL_zacc,LL_xgyro,LL_ygyro,LL_zgyro,LL_xmag,LL_ymag,LL_zmag,Segment,Subject,Activity
0,8.1305,1.0349,5.4217,-0.009461,0.001915,-0.003424,-0.78712,-0.069654,0.1573,0.70097,...,2.622,-0.000232,-0.012092,-0.004457,0.74017,0.30053,-0.05773,s01.txt,p1,a01
1,8.1305,1.0202,5.3843,-0.009368,0.023485,0.001953,-0.78717,-0.068275,0.1589,0.71829,...,2.6218,-0.014784,-0.016477,0.002789,0.73937,0.30183,-0.057514,s01.txt,p1,a01
2,8.1604,1.0201,5.3622,0.015046,0.01433,0.000204,-0.78664,-0.068277,0.15879,0.69849,...,2.6366,-0.01277,0.005717,-0.007918,0.73955,0.30052,-0.057219,s01.txt,p1,a01
3,8.1603,1.0052,5.377,0.006892,0.018045,0.005649,-0.78529,-0.069849,0.15912,0.72799,...,2.607,-0.005725,0.00962,0.006555,0.74029,0.30184,-0.05775,s01.txt,p1,a01
4,8.1605,1.0275,5.3473,0.008811,0.030433,-0.005346,-0.78742,-0.068796,0.15916,0.71572,...,2.6218,-0.003929,-0.008371,0.002816,0.73845,0.3009,-0.057527,s01.txt,p1,a01


In [11]:
iot_df.to_csv("iot_sport_activities.csv",sep=',', header=True, index=False)

In [12]:
df = pd.read_csv("iot_sport_activities.csv")
df

Unnamed: 0,T_xacc,T_yacc,T_zacc,T_xgyro,T_ygyro,T_zgyro,T_xmag,T_ymag,T_zmag,RA_xacc,...,LL_zacc,LL_xgyro,LL_ygyro,LL_zgyro,LL_xmag,LL_ymag,LL_zmag,Segment,Subject,Activity
0,8.13050,1.03490,5.42170,-0.009461,0.001915,-0.003424,-0.78712,-0.069654,0.157300,0.70097,...,2.6220,-0.000232,-0.012092,-0.004457,0.74017,0.30053,-0.057730,s01.txt,p1,a01
1,8.13050,1.02020,5.38430,-0.009368,0.023485,0.001953,-0.78717,-0.068275,0.158900,0.71829,...,2.6218,-0.014784,-0.016477,0.002789,0.73937,0.30183,-0.057514,s01.txt,p1,a01
2,8.16040,1.02010,5.36220,0.015046,0.014330,0.000204,-0.78664,-0.068277,0.158790,0.69849,...,2.6366,-0.012770,0.005717,-0.007918,0.73955,0.30052,-0.057219,s01.txt,p1,a01
3,8.16030,1.00520,5.37700,0.006892,0.018045,0.005649,-0.78529,-0.069849,0.159120,0.72799,...,2.6070,-0.005725,0.009620,0.006555,0.74029,0.30184,-0.057750,s01.txt,p1,a01
4,8.16050,1.02750,5.34730,0.008811,0.030433,-0.005346,-0.78742,-0.068796,0.159160,0.71572,...,2.6218,-0.003929,-0.008371,0.002816,0.73845,0.30090,-0.057527,s01.txt,p1,a01
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1139995,16.00800,-2.01660,-0.58220,2.027100,1.656800,0.584410,-0.73195,-0.476070,-0.013494,16.43100,...,-4.5931,-0.230600,0.180890,-2.082300,0.56876,0.39409,0.518170,s60.txt,p8,a19
1139996,8.28230,-0.69936,0.48698,2.887900,1.603900,-0.020417,-0.73055,-0.472470,-0.012385,7.01620,...,-4.1113,1.817200,0.312510,-1.021600,0.53822,0.43745,0.504010,s60.txt,p8,a19
1139997,2.71210,0.49967,0.84053,1.996400,1.465800,-0.072605,-0.72533,-0.478630,-0.012810,-4.55400,...,1.2942,1.842100,0.349400,-0.282080,0.51752,0.47280,0.489250,s60.txt,p8,a19
1139998,2.03080,-0.71349,-0.11264,1.766100,1.010300,-0.102120,-0.71933,-0.482240,-0.011469,-6.85690,...,-12.3640,-0.150260,1.563400,-0.368450,0.50440,0.51029,0.446480,s60.txt,p8,a19


In [13]:
x=df.iloc[:,:-3]  # keep the 45 sensor columns; drop Segment, Subject, Activity
x.head()

Unnamed: 0,T_xacc,T_yacc,T_zacc,T_xgyro,T_ygyro,T_zgyro,T_xmag,T_ymag,T_zmag,RA_xacc,...,RL_zmag,LL_xacc,LL_yacc,LL_zacc,LL_xgyro,LL_ygyro,LL_zgyro,LL_xmag,LL_ymag,LL_zmag
0,8.1305,1.0349,5.4217,-0.009461,0.001915,-0.003424,-0.78712,-0.069654,0.1573,0.70097,...,-0.036453,-2.8071,-9.0812,2.622,-0.000232,-0.012092,-0.004457,0.74017,0.30053,-0.05773
1,8.1305,1.0202,5.3843,-0.009368,0.023485,0.001953,-0.78717,-0.068275,0.1589,0.71829,...,-0.034005,-2.8146,-9.0737,2.6218,-0.014784,-0.016477,0.002789,0.73937,0.30183,-0.057514
2,8.1604,1.0201,5.3622,0.015046,0.01433,0.000204,-0.78664,-0.068277,0.15879,0.69849,...,-0.036489,-2.8221,-9.0886,2.6366,-0.01277,0.005717,-0.007918,0.73955,0.30052,-0.057219
3,8.1603,1.0052,5.377,0.006892,0.018045,0.005649,-0.78529,-0.069849,0.15912,0.72799,...,-0.036151,-2.8071,-9.0811,2.607,-0.005725,0.00962,0.006555,0.74029,0.30184,-0.05775
4,8.1605,1.0275,5.3473,0.008811,0.030433,-0.005346,-0.78742,-0.068796,0.15916,0.71572,...,-0.033807,-2.8146,-9.0737,2.6218,-0.003929,-0.008371,0.002816,0.73845,0.3009,-0.057527


In [14]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaled_data = scaler.fit_transform(x)  # returns a NumPy array, so column names are lost

X = pd.DataFrame(scaled_data)  # columns are relabelled 0..44
X.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,35,36,37,38,39,40,41,42,43,44
0,0.064693,0.703743,0.749763,-0.008394,-0.017047,-0.000361,-0.529864,-0.385927,1.151055,-0.611479,...,-0.291049,0.777418,-0.989491,0.929184,0.017744,-0.082635,-0.001729,0.741803,0.012987,-0.342024
1,0.064693,0.698139,0.739193,-0.008277,0.014167,0.016941,-0.530004,-0.381876,1.15534,-0.608504,...,-0.28428,0.776128,-0.988226,0.929125,0.000985,-0.091704,0.004524,0.739699,0.016374,-0.34142
2,0.069997,0.698101,0.732947,0.022471,0.000919,0.011313,-0.528516,-0.381882,1.155045,-0.611905,...,-0.291149,0.774837,-0.99074,0.933488,0.003304,-0.045804,-0.004715,0.740172,0.012961,-0.340594
3,0.069979,0.69242,0.73713,0.012202,0.006295,0.028834,-0.524725,-0.3865,1.155929,-0.606837,...,-0.290214,0.777418,-0.989475,0.924763,0.011418,-0.037732,0.007773,0.742118,0.0164,-0.34208
4,0.070015,0.700922,0.728736,0.014619,0.024221,-0.006546,-0.530706,-0.383406,1.156036,-0.608945,...,-0.283733,0.776128,-0.988226,0.929125,0.013486,-0.07494,0.004547,0.737279,0.013951,-0.341456
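
The cell above fits `StandardScaler` on every row before the train/test split, so the rows that later form the test set also contribute to the mean and variance used for scaling. A common alternative is to fit the scaler on the training portion only. The sketch below shows this on synthetic data; the array shapes, the `make_pipeline` arrangement, and the `_demo` names are assumptions for illustration and not part of the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 45 sensor columns and the activity labels.
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=5.0, scale=2.0, size=(200, 45))
y_demo = rng.integers(0, 3, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42
)

# The pipeline fits the scaler on X_tr only, then reuses those
# statistics when transforming X_te inside predict().
model = make_pipeline(StandardScaler(), DecisionTreeClassifier(random_state=0))
model.fit(X_tr, y_tr)
pred = model.predict(X_te)

fitted_scaler = model.named_steps["standardscaler"]
print(fitted_scaler.mean_.shape)   # (45,)
print(pred.shape)                  # (50,)
```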


In [15]:
y=df.iloc[:,-1]
y.head()

0    a01
1    a01
2    a01
3    a01
4    a01
Name: Activity, dtype: object

In [16]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
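
`train_test_split` shuffles individual rows, so samples from the same 5-second segment and the same subject can land in both the training and the test set, which may inflate the test scores. If the goal is to estimate performance on unseen subjects, one option is a group-aware split. The sketch below uses `GroupShuffleSplit` on a small synthetic frame; the feature values and the `f1`..`f3` column names are made up, and only the `Subject` and `Activity` column names mirror the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Toy frame: 8 subjects, 10 rows each, 3 made-up feature columns.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(80, 3)), columns=["f1", "f2", "f3"])
demo["Subject"] = np.repeat([f"p{i}" for i in range(1, 9)], 10)
demo["Activity"] = rng.choice(["a01", "a02"], size=80)

# Hold out 25% of the subjects (2 of 8) rather than 25% of the rows.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=42)
train_idx, test_idx = next(
    splitter.split(demo[["f1", "f2", "f3"]], demo["Activity"], groups=demo["Subject"])
)

train_subjects = set(demo.iloc[train_idx]["Subject"])
test_subjects = set(demo.iloc[test_idx]["Subject"])
print(sorted(test_subjects))
print(train_subjects & test_subjects)   # set(): no subject appears in both
```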

In [17]:
from sklearn.linear_model import LogisticRegression

# Create and train the logistic regression model
# Note: with the default max_iter=100 the solver stops before converging (see the warning below)
lr = LogisticRegression()
lr.fit(X_train, y_train)

# Make predictions on the test set
lr_pred = lr.predict(X_test)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
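
The message above is scikit-learn's convergence warning: the solver reached its default limit of 100 iterations. Following the message's own suggestion, one option is to raise `max_iter`. The sketch below shows the idea on synthetic data from `make_classification`; the value 1000 is an illustrative assumption, not a tuned setting for this dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Small synthetic 3-class problem standing in for the sensor data.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)

lr_demo = LogisticRegression(max_iter=1000)
lr_demo.fit(X_demo, y_demo)

# n_iter_ reports how many iterations the solver actually used;
# a value below max_iter indicates it stopped before the limit.
print(lr_demo.n_iter_.max() < 1000)
```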


In [18]:
from sklearn.tree import DecisionTreeClassifier

# Create and train the decision tree model
dt = DecisionTreeClassifier()
dt.fit(X_train, y_train)

# Make predictions on the test set
dt_pred = dt.predict(X_test)

In [None]:
from sklearn.ensemble import RandomForestClassifier

# Create and train the random forest model
rf = RandomForestClassifier()
rf.fit(X_train, y_train)

# Make predictions on the test set
rf_pred = rf.predict(X_test)

In [None]:
from sklearn.svm import SVC

# Create and train the SVM model
svc = SVC()
svc.fit(X_train, y_train)

# Make predictions on the test set
svc_pred = svc.predict(X_test)

In [None]:
from sklearn.naive_bayes import GaussianNB

# Create and train the Naive Bayes model
nb = GaussianNB()
nb.fit(X_train, y_train)

# Make predictions on the test set
nb_pred = nb.predict(X_test)

In [None]:
from sklearn.neighbors import KNeighborsClassifier

# Create and train the k-NN model
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)

# Make predictions on the test set
knn_pred = knn.predict(X_test)

In [None]:
from sklearn.neural_network import MLPClassifier

# Create and train the neural network model
nn = MLPClassifier()
nn.fit(X_train, y_train)

# Make predictions on the test set
nn_pred = nn.predict(X_test)

In [None]:
from sklearn.metrics import accuracy_score


lr_accuracy = accuracy_score(y_test, lr_pred)
print(f"Logistic Regression Classifier: {lr_accuracy:.2f}")

dt_accuracy = accuracy_score(y_test, dt_pred)
print(f"Decision Trees Classifier: {dt_accuracy:.2f}")

rf_accuracy = accuracy_score(y_test, rf_pred)
print(f"Random Forest Classifier: {rf_accuracy:.2f}")

svc_accuracy = accuracy_score(y_test, svc_pred)
print(f"Support Vector Machine Classifier: {svc_accuracy:.2f}")

nb_accuracy = accuracy_score(y_test, nb_pred)
print(f"Naive Bayes Classifier: {nb_accuracy:.2f}")

knn_accuracy = accuracy_score(y_test, knn_pred)
print(f"k-Nearest Neighbors Classifier: {knn_accuracy:.2f}")

nn_accuracy = accuracy_score(y_test, nn_pred)
print(f"Neural Networks (MLP) Classifier: {nn_accuracy:.2f}")

Logistic Regression Classifier: 0.86
Decision Trees Classifier: 0.99
Random Forest Classifier: 1.00
Support Vector Machine Classifier: 0.99
Naive Bayes Classifier: 0.88
k-Nearest Neighbors Classifier: 0.99
Neural Networks (MLP) Classifier: 0.99
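
Overall accuracy does not show which activities are confused with each other. A per-class view can be obtained with `confusion_matrix` and `classification_report`. The sketch below uses hand-written labels as stand-ins for `y_test` and one model's predictions, so the numbers say nothing about the models above.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hand-written stand-ins for y_test and one model's predictions.
y_true_demo = ["a01", "a01", "a02", "a02", "a03", "a03"]
y_pred_demo = ["a01", "a02", "a02", "a02", "a03", "a01"]

labels = ["a01", "a02", "a03"]
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=labels)
acc = accuracy_score(y_true_demo, y_pred_demo)

print(cm)             # rows are true classes, columns are predicted classes
print(f"{acc:.2f}")   # 0.67
print(classification_report(y_true_demo, y_pred_demo, labels=labels))
```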
