This notebook is my report for Occupancy Prediction. In this project, I concentrate on the logical flow of creating new features. The dataset is from the UCI Machine Learning Repository.
The goal of this project is to predict whether a room is occupied or not based on environmental measurements such as temperature, light, and CO2.
The occupancy data is divided into three files: one for training and two for testing. In the training file, the column 'Occupancy' shows whether one or more people are in the room (0 for not occupied, 1 for occupied). Occupancy was measured every minute, and the training file covers a week's worth of data.
Because this dataset is a time series, it is important to discover how each feature changes over time.
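As a rough illustration of looking for a pattern over time, the sketch below averages a feature by hour of day. It runs on synthetic data, not the real files. The column names `Light` and `Occupancy`, the office-hours rule, and the light levels are all assumptions made only for this example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the training file: one reading per minute for one day.
# Column names and values are assumptions, not taken from the real dataset.
idx = pd.date_range("2024-01-01 00:00", periods=24 * 60, freq="min")
occupied = ((idx.hour >= 9) & (idx.hour < 18)).astype(int)  # pretend office hours
rng = np.random.default_rng(0)
light = np.where(occupied == 1, 450.0, 0.0) + rng.normal(0, 5, len(idx))
df = pd.DataFrame({"Light": light, "Occupancy": occupied}, index=idx)

# Average each column by hour of day to expose the daily pattern.
hourly = df.groupby(df.index.hour).mean()
print(hourly.loc[[3, 12, 20]])
```

On the real data, the same grouping could be applied to each measured feature to see which ones follow the occupancy schedule.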
Some continuous features take the same values in both cases (Occupancy=0 and Occupancy=1). As a result, the model cannot tell which class these values belong to. To help the model separate them, I created some additional features. I explain this part in detail in section 3.3.4 of the notebook.
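To make the overlap problem concrete, here is a minimal sketch on a tiny hand-made table. It is not necessarily the approach taken in section 3.3.4. The column names `date` and `Temperature`, the values, and the choice of an `hour` feature are assumptions for illustration only.

```python
import pandas as pd

# Tiny hand-made example; values and column names are illustrative assumptions.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            [
                "2024-01-01 03:00",
                "2024-01-01 10:00",
                "2024-01-01 11:00",
                "2024-01-01 22:00",
            ]
        ),
        "Temperature": [21.0, 21.0, 22.5, 20.0],
        "Occupancy": [0, 1, 1, 0],
    }
)

# Values of Temperature that occur under both labels are ambiguous on their own.
labels_per_value = df.groupby("Temperature")["Occupancy"].nunique()
ambiguous = labels_per_value[labels_per_value > 1].index.tolist()
print(ambiguous)  # [21.0]

# A time-derived feature (hypothetical choice) separates the two ambiguous rows.
df["hour"] = df["date"].dt.hour
labels_per_pair = df.groupby(["Temperature", "hour"])["Occupancy"].nunique()
print(int(labels_per_pair.max()))  # 1
```

In this toy table, a temperature of 21.0 appears with both labels, but once the hour is added, every (Temperature, hour) pair maps to a single label.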