# Neural Network Classification task - Room occupancy

The goal of this task is to predict room occupancy from Temperature, Humidity, Light and CO2 measurements using neural networks in Keras. Ground-truth occupancy was obtained from time-stamped pictures taken every minute.

## Data source
[http://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+](http://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+)

## Feature description
* **Date** - time stamp in the following format: year-month-day hour:minute:second
* **Temperature** - temperature in degrees Celsius
* **Relative Humidity** - relative humidity in % (column `Humidity`)
* **Light** - light intensity in Lux
* **CO2** - amount of CO2 in the air, measured in ppm
* **Humidity Ratio** - humidity ratio derived from temperature and relative humidity, in kg water vapor per kg air (column `HumidityRatio`)
* **Occupancy** - a target binary value, 0 for not occupied, 1 for occupied status
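
To make the derived **Humidity Ratio** feature more concrete, here is a minimal sketch of how such a value could be computed from temperature and relative humidity. The dataset does not state which formula was used, so everything here is an assumption: the Magnus approximation for saturation vapor pressure, standard sea-level pressure of 1013.25 hPa, and the helper names `saturation_vapor_pressure_hpa` and `humidity_ratio` are illustrative only.

```python
import math

def saturation_vapor_pressure_hpa(temp_c):
    """Magnus approximation of saturation vapor pressure over water, in hPa (assumed formula)."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))

def humidity_ratio(temp_c, rel_humidity_pct, pressure_hpa=1013.25):
    """Approximate humidity ratio in kg of water vapor per kg of dry air.

    pressure_hpa defaults to standard sea-level pressure, which is an assumption.
    """
    # Partial pressure of water vapor from relative humidity
    p_w = rel_humidity_pct / 100.0 * saturation_vapor_pressure_hpa(temp_c)
    # 0.622 is the ratio of the molar masses of water vapor and dry air
    return 0.622 * p_w / (pressure_hpa - p_w)

w = humidity_ratio(23.0, 27.0)
print("Humidity ratio: {:.5f}".format(w))
```

Because this feature is a function of two other features, it is strongly correlated with them; it is still included in the model inputs below.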

In [None]:
import pandas as pd
data = pd.read_csv('https://raw.githubusercontent.com/mlcollege/introduction-to-ml/master/data/occupancy.csv', sep=',')
data.head()

## Neural Network Classifier
Implement a neural network classifier based on all numerical features.

### Data preparation

In [None]:
from sklearn.model_selection import train_test_split

# Prepare the data by separating features and target variable
X_all = data[['Temperature', 'Humidity', 'Light', 'CO2', 'HumidityRatio']]
y_all = data['Occupancy']

# Split data into training (90%) and test (10%) sets
X_train, X_test, y_train, y_test = train_test_split(
    X_all,
    y_all,
    random_state=1,
    test_size=0.1)

print('Train size: {}'.format(len(X_train)))
print('Test size: {}'.format(len(X_test)))

Standardize the features

In [None]:
from sklearn.preprocessing import StandardScaler

# Standardize features: fit scaler on training data and apply to both train and test
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
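
The reason for fitting the scaler on the training data only is that the test set should not influence any statistic the model depends on. The following is a minimal pure-Python sketch of the same idea for a single feature; the helper names `fit_standardizer` and `standardize` and the sample values are made up for illustration, and the sketch is not how `StandardScaler` is implemented internally.

```python
from statistics import fmean, pstdev

def fit_standardizer(values):
    """Compute the mean and population standard deviation of one feature."""
    mean = fmean(values)
    std = pstdev(values)
    # Avoid division by zero for constant features
    return mean, (std if std > 0 else 1.0)

def standardize(values, mean, std):
    """Apply (x - mean) / std using previously fitted statistics."""
    return [(v - mean) / std for v in values]

# Hypothetical temperature readings
train_temps = [20.0, 21.0, 22.0, 23.0]
test_temps = [21.5, 24.0]

# Statistics come from the training data only
mean, std = fit_standardizer(train_temps)
train_scaled = standardize(train_temps, mean, std)
test_scaled = standardize(test_temps, mean, std)

print(train_scaled)
print(test_scaled)
```

After this transformation the training values have zero mean and unit variance, while the test values generally do not, which is expected.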

Since the target values are binary, we don't need to one-hot encode them.

In [None]:
print(y_test[:5])

### Training a classifier

Design and train a classification model. Use the [binary crossentropy](https://keras.io/losses/) loss function and a sigmoid output activation. Experiment with various architectures and [optimizers](https://keras.io/optimizers/).
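
To show what the sigmoid output and the binary crossentropy loss compute, here is a small pure-Python sketch. It is an illustration of the formulas only, not the Keras implementation; the function names, the clipping constant `eps`, and the sample logits and labels are assumptions chosen for the example.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities so that log(0) never occurs
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical raw outputs of the last Dense layer and their true labels
logits = [2.0, -1.0, 0.0]
labels = [1, 0, 1]

probs = [sigmoid(z) for z in logits]
loss = binary_crossentropy(labels, probs)
print("Probabilities:", probs)
print("Loss: {:.4f}".format(loss))
```

The loss is small when confident predictions match the labels and grows quickly when a confident prediction is wrong.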

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout

# Design a neural network for binary classification
model = Sequential()

# Add a hidden layer with 10 units and tanh activation
model.add(Dense(10))
model.add(Activation('tanh'))

# Add output layer with sigmoid activation for binary classification
model.add(Dense(1))
model.add(Activation('sigmoid'))

Compile the model

In [None]:
# Compile the model with binary crossentropy loss and adam optimizer
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

Train the model

In [None]:
# Train the model, using the test set as validation data
model.fit(X_train, y_train,
          batch_size=128, epochs=10, verbose=1,
          validation_data=(X_test, y_test))

### Evaluate the model

Predict target values and convert probabilities to binary values.

In [None]:
from numpy import int32

# Make predictions on the test set; predict() returns probabilities with shape (n, 1)
y_pred = model.predict(X_test)
# Threshold at 0.5 and flatten to a 1-D array of 0/1 labels to match y_test
y_pred = (y_pred >= 0.5).astype(int32).ravel()

Print evaluation metrics

In [None]:
# Evaluate the model on the test set
from sklearn import metrics
from sklearn.metrics import accuracy_score

print("Test accuracy: {:.4f}".format(accuracy_score(y_test, y_pred)))
print()
print(metrics.classification_report(y_test, y_pred, digits=4))
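
The classification report lists precision, recall and F1 score per class. As a rough sketch of where these numbers come from, the following pure-Python example computes them for the positive class from the confusion-matrix counts. The helper name `binary_metrics` and the sample labels are hypothetical and are not taken from the dataset.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical true labels and thresholded predictions
sample_true = [1, 0, 1, 1, 0, 0, 1, 0]
sample_pred = [1, 0, 0, 1, 0, 1, 1, 0]

result = binary_metrics(sample_true, sample_pred)
for name, value in result.items():
    print("{}: {:.4f}".format(name, value))
```

When one class is much more frequent than the other, accuracy alone can look good even if the minority class is rarely detected, so precision and recall are worth checking as well.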

In [None]:
# Evaluate the model on the training set to compare against the test results
y_pred = model.predict(X_train)
# Threshold at 0.5 and flatten to a 1-D array of 0/1 labels to match y_train
y_pred = (y_pred >= 0.5).astype(int32).ravel()

print("Train accuracy: {:.4f}".format(accuracy_score(y_train, y_pred)))
print()
print(metrics.classification_report(y_train, y_pred, digits=4))