# Occupancy Detection Data Set (UCI Machine Learning Repository)
**Paper: Accurate occupancy detection of an office room from light, temperature, humidity and CO2 measurements using statistical learning models. Luis M. Candanedo, Véronique Feldheim. Energy and Buildings. Volume 112, 15 January 2016, Pages 28-39.**

#### Abstract:
Experimental data used for binary classification (room occupancy) from Temperature, Humidity, Light and CO2. Ground-truth occupancy was obtained from time-stamped pictures that were taken every minute.

**Based on this paper, this notebook builds a neural network (MLP) that derives a scalar value and uses it to predict Occupancy.**

In [1]:
import os
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPClassifier

In [2]:
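def dir_parser(files=None):
    # List the current directory at call time when no file list is given
    if files is None:
        files = os.listdir('.')
    # Build a new list of the .txt entries; removing items from a list
    # while iterating over it would skip entries
    return [file for file in files if file.split(".")[-1] == "txt"]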
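
`os.listdir` returns entries in arbitrary order, so code that indexes into the result depends on the file system. A minimal sketch of an order-stable alternative follows. The helper name `list_txt_files` and the file names in the demonstration are hypothetical and not part of the original notebook.

```python
from pathlib import Path
import tempfile


def list_txt_files(directory="."):
    """Return the names of the .txt files in `directory`, sorted alphabetically."""
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix == ".txt"
    )


# Demonstration in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.txt", "notes.md", "a.txt", "c.csv"):
        (Path(tmp) / name).touch()
    found = list_txt_files(tmp)
    print(found)
```

Sorting makes the index of each file reproducible across machines, at the cost of tying the indices to the file names.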

In [3]:
# Read a data set from a CSV-formatted text file
def dataset_read(filePath):
    dataset = pd.read_csv(filePath)
    # Uncomment to inspect the column names
    #print(list(dataset))

    return dataset

In [4]:
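# Load the two test sets and the training set.
# The indices below depend on the order returned by os.listdir, which is
# arbitrary, so check it with the print below before relying on it.
#print(dir_parser())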

testset_2 = dataset_read(dir_parser()[0])
#print(testset_2.head())
testset_1 = dataset_read(dir_parser()[1])
#print(testset_1.head())
train = dataset_read(dir_parser()[2])
#print(train)

In [5]:
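# The hidden units use the logistic (sigmoid) activation.
# hidden_layer_sizes=(6, 1) defines two hidden layers: 6 neurons, then 1 neuron
# that reduces the signal to a single scalar value.
# (A single hidden layer of 6 neurons would be written (6,).)
# The solver is lbfgs, an optimizer from the family of quasi-Newton methods.
# lbfgs computes the gradient over the whole data set at every iteration and
# approximates curvature from recent gradients, so on small data sets it can
# converge faster and perform better than sgd.
# On large data sets it is less efficient.

classifier = MLPClassifier(solver='lbfgs', activation='logistic',
                           hidden_layer_sizes=(6, 1), random_state=42)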
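
In scikit-learn, each entry of `hidden_layer_sizes` is the neuron count of one hidden layer, so the length of the tuple sets the number of hidden layers. The sketch below compares `(6, 1)` with `(6,)` by inspecting the fitted weight matrices. It uses synthetic data rather than the occupancy files, and `layer_shapes` is a hypothetical helper.

```python
import warnings

import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data: 200 samples, 5 features, binary labels.
rng = np.random.RandomState(0)
X = rng.rand(200, 5)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)


def layer_shapes(hidden_layer_sizes):
    """Fit a small MLP and return (total layer count, weight matrix shapes)."""
    clf = MLPClassifier(solver="lbfgs", activation="logistic",
                        hidden_layer_sizes=hidden_layer_sizes,
                        random_state=42, max_iter=50)
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # silence possible convergence warnings
        clf.fit(X, y)
    return clf.n_layers_, [w.shape for w in clf.coefs_]


two_hidden = layer_shapes((6, 1))  # input -> 6 -> 1 -> output
one_hidden = layer_shapes((6,))    # input -> 6 -> output
print(two_hidden)
print(one_hidden)
```

`n_layers_` counts the input and output layers as well, so `(6, 1)` gives four layers and `(6,)` gives three.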

In [6]:
train_feat = train[["Temperature","Humidity","Light","CO2","HumidityRatio"]]
train_label = train["Occupancy"]
#print(train_feat.head())

In [7]:
test2_feat = testset_2[["Temperature","Humidity","Light","CO2","HumidityRatio"]]
test2_label = testset_2["Occupancy"]

In [8]:
test1_feat = testset_1[["Temperature","Humidity","Light","CO2","HumidityRatio"]]
test1_label = testset_1["Occupancy"]
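
The three cells above repeat the same column selection for the training set and both test sets. A minimal sketch of a shared helper is shown here. The names `FEATURES`, `LABEL`, and `split_features_label` are hypothetical, and the values in the demonstration frame are made up.

```python
import pandas as pd

FEATURES = ["Temperature", "Humidity", "Light", "CO2", "HumidityRatio"]
LABEL = "Occupancy"


def split_features_label(frame):
    """Return (feature columns, label column) of an occupancy DataFrame."""
    return frame[FEATURES], frame[LABEL]


# Tiny made-up frame, used only to show the call.
demo = pd.DataFrame({
    "Temperature": [23.1, 21.0],
    "Humidity": [27.2, 25.5],
    "Light": [426.0, 0.0],
    "CO2": [721.0, 440.0],
    "HumidityRatio": [0.0048, 0.0039],
    "Occupancy": [1, 0],
})
demo_feat, demo_label = split_features_label(demo)
print(demo_feat.shape, demo_label.tolist())
```

In the notebook this would be called as `train_feat, train_label = split_features_label(train)`, and likewise for the two test sets.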

In [9]:
classifier.fit(train_feat,train_label)

MLPClassifier(activation='logistic', alpha=0.0001, batch_size='auto',
       beta_1=0.9, beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(6, 1), learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=42, shuffle=True,
       solver='lbfgs', tol=0.0001, validation_fraction=0.1, verbose=False,
       warm_start=False)
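
The model above is fit on unscaled features. scikit-learn's documentation notes that multi-layer perceptrons are sensitive to feature scaling, so one option worth trying is to standardize the inputs inside a pipeline. The sketch below is an assumption-laden illustration on synthetic data with deliberately mismatched scales. It is not the notebook's method, and no claim is made about its effect on the scores reported here.

```python
import warnings

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features on very different scales (made-up values).
rng = np.random.RandomState(0)
X = np.column_stack([
    rng.normal(21.0, 1.0, 300),      # roughly temperature-like
    rng.normal(450.0, 300.0, 300),   # roughly light-like
    rng.normal(0.004, 0.001, 300),   # roughly humidity-ratio-like
])
y = (X[:, 1] > 450.0).astype(int)

# The scaler is fit on the training data only and reused at prediction time.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(solver="lbfgs", activation="logistic",
                  hidden_layer_sizes=(6, 1), random_state=42),
)
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence possible convergence warnings
    model.fit(X, y)

accuracy = model.score(X, y)
print(accuracy)
```

Because the scaler lives inside the pipeline, `model.score(test1_feat, test1_label)` would apply the training-set scaling to the test data automatically.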

In [10]:
# Accuracy of the trained model on the two test samples
print("test1 sample")
print(classifier.score(test1_feat, test1_label))
print("test2 sample")
print(classifier.score(test2_feat,test2_label))

test1 sample
0.9789868667917448
test2 sample
0.9437038556193601
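
`classifier.score` returns mean accuracy, which can hide how the errors split between the occupied and unoccupied classes. A minimal sketch of a per-class breakdown follows. The labels are hypothetical; in the notebook they would be `test1_label` and `classifier.predict(test1_feat)`.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical ground truth and predictions (1 = occupied).
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 0])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
precision = precision_score(y_true, y_pred)  # TP / (TP + FP) for class 1
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) for class 1

print(cm)
print(precision, recall)
```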
