# Neural Network Classification task - Room occupancy

The goal of this task is to predict room occupancy from Temperature, Humidity, Light and CO2 measurements using neural networks in Keras. Ground-truth occupancy was obtained from time-stamped pictures taken every minute.

## Data source
[http://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+](http://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+)

## Feature description
* **Date** - timestamp in the format `year-month-day hour:minute:second`
* **Temperature** - temperature in degrees Celsius
* **Relative Humidity** - relative humidity in % (column `Humidity` in the data)
* **Light** - light intensity in lux
* **CO2** - CO2 concentration in the air, measured in ppm
* **Humidity Ratio** - humidity ratio derived from temperature and relative humidity, in kg water vapor per kg air (column `HumidityRatio` in the data)
* **Occupancy** - binary target value: 0 for not occupied, 1 for occupied

In [None]:
import pandas as pd
data = pd.read_csv('https://raw.githubusercontent.com/mlcollege/introduction-to-ml/master/data/occupancy.csv', sep=',')
data.head()

## Neural Network Classifier
Implement a neural network classifier based on all numerical features.

### Data preparation

In [None]:
from sklearn.model_selection import train_test_split

# Prepare the data by separating features and target variable
X_all = data[['Temperature', 'Humidity', 'Light', 'CO2', 'HumidityRatio']]
y_all = data['Occupancy']

# Split data into training (90%) and test (10%) sets
X_train, X_test, y_train, y_test = train_test_split(
    X_all,
    y_all,
    random_state=1,
    test_size=0.1)

print('Train size: {}'.format(len(X_train)))
print('Test size: {}'.format(len(X_test)))

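Standardize the features so that each one has zero mean and unit variance. The scaler is fitted on the training set only and then applied to both sets.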

In [None]:
from sklearn.preprocessing import StandardScaler

# Standardize features: fit scaler on training data and apply to both train and test
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
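
To make the transformation above concrete, here is a minimal pure-Python sketch of what standardization computes for a single feature column. The function name `standardize` and the temperature values are made up for illustration; the one detail taken from scikit-learn is that `StandardScaler` uses the population standard deviation (`ddof=0`).

```python
from statistics import fmean, pstdev

def standardize(train_col, test_col):
    """Scale both columns with the mean and std of the training column only."""
    mean = fmean(train_col)
    std = pstdev(train_col)            # population std (ddof=0)
    scale = std if std > 0 else 1.0    # avoid division by zero for constant columns
    train_z = [(x - mean) / scale for x in train_col]
    test_z = [(x - mean) / scale for x in test_col]
    return train_z, test_z

# Illustrative values, not taken from the dataset
train_temp = [20.0, 21.0, 22.0, 23.0]
test_temp = [21.5, 24.0]

train_z, test_z = standardize(train_temp, test_temp)
print(train_z)
print(test_z)
```

The test column is scaled with the training statistics, so its mean and variance will generally not be exactly 0 and 1. That is expected: it keeps information about the test set out of the preprocessing step.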

Since the target values are binary (0 or 1), they can be used directly and do not need to be one-hot encoded.

In [None]:
print(y_test[:5])

### Training a classifier

Design and train a classification model. Use the [binary crossentropy](https://keras.io/losses/) loss function and a sigmoid activation on the output layer. Experiment with various architectures and [optimizers](https://keras.io/optimizers/).
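
As a reminder of what these two pieces compute, the following is a small pure-Python sketch of the sigmoid function and the mean binary crossentropy. It is not Keras code; the helper names, the clipping constant `eps`, and the example values are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)   # clip to keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Illustrative raw outputs of a final layer and their true labels
scores = [2.0, -1.0, 0.0]
labels = [1, 0, 1]

probs = [sigmoid(z) for z in scores]
loss = binary_crossentropy(labels, probs)
print(probs)
print(loss)
```

The loss is small when the predicted probability is close to the true label and grows quickly for confident but wrong predictions, which is why it pairs naturally with a single sigmoid output unit.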

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout

# TODO: Design a neural network architecture
# 1. Create a Sequential model
# 2. Add Dense layers with different activation functions
# 3. Use tanh or relu for hidden layers
# 4. Use sigmoid for output layer (binary classification)
# 5. Experiment with different architectures

Compile the model

In [None]:
# TODO: Compile the model
# Use binary_crossentropy loss (suitable for binary classification)
# Try different optimizers (adam, sgd, rmsprop)
# Include accuracy as a metric

Train the model

In [None]:
# TODO: Train the model
# Use model.fit() with appropriate parameters:
# - batch_size: typically 32, 64, 128
# - epochs: number of passes through training data (10-50)
# - validation_data: pass the test set as (X_test, y_test) to monitor held-out loss during training
# - verbose: set to 1 to see training progress
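
The design, compile and train steps above could look roughly like the sketch below. It is one possible starting point, not a reference solution: the layer sizes (16 and 8), the `adam` optimizer, the 5 epochs, and the synthetic arrays `X_demo` and `y_demo` are all assumptions made so that the snippet runs on its own. In the notebook you would pass `X_train`, `y_train` and `(X_test, y_test)` instead.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Synthetic stand-in for five standardized features and a binary target
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5)).astype("float32")
y_demo = (X_demo[:, 2] > 0).astype("float32")
X_fit, y_fit = X_demo[:160], y_demo[:160]
X_val, y_val = X_demo[160:], y_demo[160:]

# 1. Design: two hidden layers and a single sigmoid output unit
model = Sequential([
    Input(shape=(5,)),
    Dense(16, activation="relu"),
    Dense(8, activation="relu"),
    Dense(1, activation="sigmoid"),
])

# 2. Compile: binary crossentropy loss, accuracy as an extra metric
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

# 3. Train: held-out data is only monitored, not trained on
history = model.fit(X_fit, y_fit,
                    batch_size=32,
                    epochs=5,
                    validation_data=(X_val, y_val),
                    verbose=0)

print(history.history["loss"])
print(history.history["val_loss"])
```

Comparing the training and validation loss per epoch is a quick way to notice overfitting while experimenting with other architectures and optimizers.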

### Evaluate the model

Predict target values and convert probabilities to binary values.

In [None]:
# TODO: Make predictions on test set
# 1. Use model.predict() to get probability predictions
# 2. Convert probabilities to binary values (threshold = 0.5)
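
The thresholding step can be sketched in plain Python as follows. The helper name `to_binary` and the probability values are hypothetical, and treating exactly 0.5 as "not occupied" is a choice made here, not something the task prescribes.

```python
def to_binary(probabilities, threshold=0.5):
    """Label a sample 1 when its predicted probability exceeds the threshold."""
    return [1 if p > threshold else 0 for p in probabilities]

# Illustrative probabilities, as a flat list
probs = [0.03, 0.48, 0.50, 0.51, 0.97]
preds = to_binary(probs)
print(preds)  # [0, 0, 0, 1, 1]
```

`model.predict()` returns a NumPy array of shape `(n_samples, 1)` for a single-unit output layer, so in the notebook the same idea is usually written as a vectorized comparison such as `(probs > 0.5).astype(int).ravel()`.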


Print the evaluation metrics for both the test set and the training set.

In [None]:
# TODO: Print evaluation metrics
# - Calculate and print accuracy score
# - Print detailed classification report with precision, recall, F1-score
# - Evaluate on both test and training sets
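
To show what the requested metrics measure, here is a minimal pure-Python sketch that derives accuracy, precision, recall and F1 for the positive class from confusion-matrix counts. The function name `binary_metrics` and the label lists are made up for illustration. In the notebook, `accuracy_score` and `classification_report` from `sklearn.metrics` report these values, among others.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for the positive class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Illustrative labels and predictions, not model output
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print("{}: {:.2f}".format(name, value))
```

Running the same evaluation on the training set and on the test set makes it easy to spot a large gap between the two, which would suggest overfitting.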