<h2 align="center">Room Occupancy Detection Using Sensor Data</h2>

### Importing Libraries

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd
import warnings
import numpy as np
import yellowbrick as yb

warnings.simplefilter('ignore')

Yellowbrick is a visual API that extends the scikit-learn API with visual analysis and diagnostic tools. It also wraps matplotlib to create publication-ready figures and interactive data explorations while still allowing developers fine-grained control of figures. For users, Yellowbrick can help evaluate the performance, stability, and predictive value of machine learning models and assist in diagnosing problems throughout the machine learning workflow. By visualizing the model selection process, data scientists can steer towards final, explainable models and avoid pitfalls and traps.

The library is a diagnostic visualization platform built around a new core object, the Visualizer. Visualizers allow visual models to be fit and transformed as part of the scikit-learn pipeline process, providing visual diagnostics throughout the transformation of high-dimensional data. Yellowbrick serves several audiences:

* For data scientists, Yellowbrick can help evaluate the stability and predictive value of machine learning models and improve the speed of the experimental workflow.
* For data engineers, Yellowbrick provides visual tools for monitoring model performance in real-world applications.
* For users of models, Yellowbrick provides visual interpretation of the behavior of the model in high-dimensional feature space.
* For teachers and students, Yellowbrick is a framework for teaching and understanding a large variety of algorithms and methods.

## Importance of Visualization

In [None]:
# Anscombe's quartet: four small datasets with nearly identical summary statistics
x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5])
y1 = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])
y2 = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])
y3 = np.array([7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73])
x4 = np.array([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8])
y4 = np.array([6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89])


In [None]:
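# Verify that the four datasets share nearly identical summary statistics
pairs = (x, y1), (x, y2), (x, y3), (x4, y4)
for xi, yi in pairs:  # xi, yi avoid overwriting the x array defined above
    print('mean=%1.2f, std=%1.2f, r=%1.2f' % (np.mean(yi), np.std(yi),
          np.corrcoef(xi, yi)[0][1]))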

### All four datasets share nearly the same mean, standard deviation, and correlation. Let's see what the plots say.

In [None]:
# Visualize Anscombe's quartet with Yellowbrick's built-in helper
g = yb.anscombe()
plt.show()
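As an additional check that is not part of the original notebook, one can also fit a least-squares line to each dataset. The sketch below repeats the quartet's values so that it runs on its own, and uses `np.polyfit` for the fit. Each fit should come out close to y = 3 + 0.5x, even though the four scatter plots look nothing alike.

```python
import numpy as np

# Anscombe's quartet (same values as in the data cell above)
x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5])
y1 = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])
y2 = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])
y3 = np.array([7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73])
x4 = np.array([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8])
y4 = np.array([6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89])

quartet = [(x, y1), (x, y2), (x, y3), (x4, y4)]

fits = []
for xi, yi in quartet:
    # A degree-1 polynomial fit returns (slope, intercept)
    slope, intercept = np.polyfit(xi, yi, deg=1)
    fits.append((slope, intercept))
    print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

Identical fitted lines on visibly different data are the reason to plot before trusting a summary number.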

<h2 align="center">Feature Analysis</h2>

In [None]:
# Load the classification data set
data = pd.read_csv('../input/occupancy.csv')
data.head()

In [None]:
features = ["temperature", "relative humidity", "light", "C02", "humidity"]
classes = ['unoccupied', 'occupied']

In [None]:
X = data[features]
y = data.occupancy

### Feature Analysis - RadViz

In [None]:
from yellowbrick.features.radviz import RadViz

In [None]:
# Instantiate the visualizer
visualizer = RadViz(classes=classes, features=features, size=(900, 900))

# Fit the data to the visualizer
visualizer.fit(X, y)

# Transform the data
visualizer.transform(X)

# Draw/show/poof the data
visualizer.poof()
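To make the RadViz picture easier to read, here is a minimal sketch of the usual RadViz projection, written with plain NumPy. It assumes the textbook formulation: each feature is min-max scaled to [0, 1], features are placed as anchors evenly around the unit circle, and each sample is drawn at the weighted average of the anchors. The function name `radviz_project` and the toy array are illustrative, and Yellowbrick's own implementation may differ in details.

```python
import numpy as np

def radviz_project(X):
    """Project rows of X (n_samples, n_features) into the unit disc."""
    X = np.asarray(X, dtype=float)

    # Min-max scale every feature to [0, 1]
    mins = X.min(axis=0)
    span = X.max(axis=0) - mins
    span[span == 0] = 1.0  # constant columns would otherwise divide by zero
    scaled = (X - mins) / span

    # One anchor per feature, evenly spaced on the unit circle
    n_features = X.shape[1]
    angles = 2 * np.pi * np.arange(n_features) / n_features
    anchors = np.column_stack([np.cos(angles), np.sin(angles)])

    # Each sample is the average of the anchors, weighted by its scaled values
    weights = scaled.sum(axis=1, keepdims=True)
    weights[weights == 0] = 1.0  # all-zero rows stay at the origin
    return scaled @ anchors / weights

# Toy input (hypothetical values, three features)
toy = np.array([
    [1.0, 0.0, 0.0],   # dominated by feature 0: lands on anchor 0
    [0.0, 1.0, 0.0],   # dominated by feature 1: lands on anchor 1
    [1.0, 1.0, 1.0],   # all features equal: lands at the centre
    [0.5, 0.2, 0.9],
])
points = radviz_project(toy)
print(points)
```

A sample is pulled towards the anchors of the features on which it has high values, so classes that separate in the plot differ in which features dominate them.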

###  Feature Analysis - Parallel Coordinates Plot

In [None]:
from yellowbrick.features.pcoords import ParallelCoordinates

In [None]:
# Instantiate the visualizer
visualizer = ParallelCoordinates(
    classes=classes, 
    features=features, 
    normalize='standard', 
    sample = 0.1,
    size=(800, 600)
)

# Fit the data to the visualizer
visualizer.fit(X, y)

# Transform the data
visualizer.transform(X)

# Draw/show/poof the data
visualizer.poof()

* Unoccupied rooms tend to have lower light readings, while occupied rooms have higher light readings.
* Rooms with lower temperature are more often unoccupied than rooms with higher temperature.
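Observations like these can be cross-checked numerically by comparing per-class feature means. The sketch below shows the idea with pandas on a small frame of made-up readings (the values are illustrative only and are not taken from `occupancy.csv`); the column names follow the ones used in this notebook.

```python
import pandas as pd

# Made-up readings for illustration only, not values from occupancy.csv
toy = pd.DataFrame({
    "temperature": [20.5, 20.7, 21.0, 22.4, 22.8, 23.1],
    "light":       [0.0, 5.0, 12.0, 430.0, 465.0, 510.0],
    "occupancy":   [0, 0, 0, 1, 1, 1],
})

# Mean of each feature within each class (0 = unoccupied, 1 = occupied)
class_means = toy.groupby("occupancy")[["temperature", "light"]].mean()
print(class_means)
```

On the real data, replacing `toy` with `data` should give the corresponding table for the sensor readings.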


### Feature Analysis - Rank Features

In [None]:
#Instantiate the visualizer with the Covariance ranking algorithm
from yellowbrick.features.rankd import Rank2D
visualizer = Rank2D(features=features, algorithm='covariance')

visualizer.fit(X, y)                # Fit the data to the visualizer
visualizer.transform(X)             # Transform the data
visualizer.poof()                   # Draw/show/poof the data

In [None]:
# Instantiate the visualizer with the Pearson ranking algorithm
visualizer = Rank2D(features=features, algorithm='pearson')

visualizer.fit(X, y)                # Fit the data to the visualizer
visualizer.transform(X)             # Transform the data
visualizer.poof()                   # Draw/show/poof the data
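The two rankings above answer slightly different questions. Covariance depends on the units of each feature, so features measured on a large scale dominate the plot, while the Pearson coefficient is the covariance divided by both standard deviations and always lies in [-1, 1]. The following sketch illustrates that relationship with NumPy on synthetic data; the arrays `a` and `b` are hypothetical and unrelated to the occupancy dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two related variables; b lives on a much larger scale than a
a = rng.normal(size=200)
b = 1000.0 * (a + rng.normal(scale=0.5, size=200))

cov = np.cov(a, b)        # 2x2 covariance matrix (scale dependent)
corr = np.corrcoef(a, b)  # 2x2 Pearson correlation matrix (scale free)

# Pearson correlation is covariance divided by the product of standard deviations
stds = np.sqrt(np.diag(cov))
corr_from_cov = cov / np.outer(stds, stds)

print("covariance :", cov[0, 1])
print("pearson    :", corr[0, 1])
```

This is why the covariance heatmap can look very different from the Pearson one on the same features.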

### Feature Analysis - Manifold Visualization

In [None]:
from yellowbrick.features.manifold import Manifold

visualizer = Manifold(manifold='isomap', target='discrete', classes=classes, size=(800, 600))
visualizer.fit_transform(X,y)
visualizer.poof()

###  ROC/AUC Plots

In [None]:
# Create the train and test data
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

In [None]:
from yellowbrick.classifier import ROCAUC
from sklearn.linear_model import LogisticRegression

# Instantiate the classification model and visualizer
visualizer = ROCAUC(LogisticRegression(), size=(800, 600))

# Fit the training data to the visualizer
visualizer.fit(X_train, y_train)
# Evaluate the model on the test data
visualizer.score(X_test, y_test)  
# Draw/show/poof the data
g = visualizer.poof() 
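The area under the ROC curve has a convenient interpretation: it is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes AUC directly from that definition using NumPy. The helper `roc_auc` and the four scores are illustrative assumptions, not output from the model above.

```python
import numpy as np

def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]

    # Compare every positive score with every negative score
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted scores
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(y_true, scores)
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly -> 0.75
```

The pairwise comparison is quadratic in the number of samples, so it is meant for understanding rather than for large datasets.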

### Classification Report and Confusion Matrix

In [None]:
from yellowbrick.classifier import ClassificationReport

# Instantiate the classification model and visualizer
visualizer = ClassificationReport(LogisticRegression(), classes=classes, support=True)

visualizer.fit(X_train, y_train)  # Fit the visualizer and the model
visualizer.score(X_test, y_test)  # Evaluate the model on the test data
g = visualizer.poof()             # Draw/show/poof the data


In [None]:
from yellowbrick.classifier import ConfusionMatrix

# The ConfusionMatrix visualizer takes a model. Here we pass the ClassificationReport
# visualizer fitted above, which wraps the trained LogisticRegression.
cm = ConfusionMatrix(visualizer, classes=[0, 1])

# To create the confusion matrix, we need some test data. score() runs predict() on the data
# and then builds the confusion matrix with scikit-learn.
cm.score(X_test, y_test)

#How did we do?
cm.poof()
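The numbers in the classification report can be derived from the confusion matrix. As a small sketch of that link, the code below builds a 2x2 confusion matrix by hand and computes precision, recall, and F1 for the positive class. The helper `binary_confusion` and the label arrays are hypothetical examples, not results from this notebook.

```python
import numpy as np

def binary_confusion(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]]: rows are true labels, columns are predictions."""
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Hypothetical true labels and predictions
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]

cm = binary_confusion(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()

precision = tp / (tp + fp)   # of the predicted positives, how many were right
recall = tp / (tp + fn)      # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(cm)
print(f"precision={precision:.3f}, recall={recall:.3f}, f1={f1:.3f}")
```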

### Cross Validation Scores

In [None]:
from sklearn.model_selection import StratifiedKFold
from yellowbrick.model_selection import CVScores

In [None]:
# Create a new figure and axes
_, ax = plt.subplots()

# Create a cross-validation strategy
cv = StratifiedKFold(8)

# Create the CV score visualizer
oz = CVScores(
    LogisticRegression(), ax=ax, cv=cv, 
    scoring='f1_weighted', size=(800,600)
)

oz.fit(X, y)
oz.poof()
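Stratification means that every fold keeps roughly the same class proportions as the full dataset, which matters when one class is much rarer than the other. The sketch below illustrates the idea with NumPy by dealing the shuffled indices of each class round-robin into the folds. The function `stratified_folds` is a hypothetical helper for illustration; scikit-learn's `StratifiedKFold` uses its own assignment procedure.

```python
import numpy as np

def stratified_folds(y, n_splits, seed=0):
    """Assign each sample a fold id so that class proportions match across folds."""
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(y), dtype=int)
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        rng.shuffle(idx)
        # Deal this class's samples round-robin into the folds
        fold_of[idx] = np.arange(len(idx)) % n_splits
    return fold_of

# Hypothetical imbalanced target: 75% class 0, 25% class 1
y = np.array([0] * 60 + [1] * 20)
folds = stratified_folds(y, n_splits=4)

for k in range(4):
    in_fold = y[folds == k]
    print(f"fold {k}: size={len(in_fold)}, share of class 1={in_fold.mean():.2f}")
```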

### Evaluating Class Balance

In [None]:
from yellowbrick.classifier import ClassBalance

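# Instantiate the visualizer (ClassBalance works on the target labels; no model is needed)
visualizer = ClassBalance(labels=classes)

# Compare the class distribution of the training and test targets
visualizer.fit(y_train, y_test)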
visualizer.poof()
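The same information can be read off numerically by counting labels. The following sketch computes the share of each class with NumPy; the helper `class_shares` and the 77/23 split are made up for illustration and are not the proportions of the occupancy data.

```python
import numpy as np

def class_shares(y):
    """Return {label: fraction of samples with that label}."""
    labels, counts = np.unique(y, return_counts=True)
    return dict(zip(labels.tolist(), (counts / counts.sum()).tolist()))

# Hypothetical target vector: 77 unoccupied, 23 occupied
y_toy = np.array([0] * 77 + [1] * 23)

shares = class_shares(y_toy)
print(shares)

# A classifier that always predicts the majority class reaches this accuracy,
# which is a useful baseline when classes are imbalanced
baseline_accuracy = max(shares.values())
print(baseline_accuracy)
```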

###  Discrimination Threshold for Logistic Regression

In [None]:
from yellowbrick.classifier import DiscriminationThreshold


visualizer = DiscriminationThreshold(LogisticRegression(), size=(800,600))

visualizer.fit(X_train, y_train)
visualizer.poof()
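The discrimination threshold is the probability at or above which a sample is labeled as the positive class. Raising it usually trades recall for precision. As a rough sketch of what the plot summarizes, the code below sweeps a few thresholds over hypothetical predicted probabilities; the helper `precision_recall_at` and the data are illustrative assumptions, and the convention of reporting precision 1.0 when nothing is predicted positive is a choice made here.

```python
import numpy as np

def precision_recall_at(y_true, proba, threshold):
    """Precision and recall of the positive class at a given threshold."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(proba) >= threshold).astype(int)
    tp = int(((y_pred == 1) & (y_true == 1)).sum())
    fp = int(((y_pred == 1) & (y_true == 0)).sum())
    fn = int(((y_pred == 0) & (y_true == 1)).sum())
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention when nothing is predicted positive
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical labels and predicted probabilities of the positive class
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
proba = [0.05, 0.20, 0.40, 0.45, 0.60, 0.70, 0.85, 0.95]

results = {}
for t in (0.3, 0.5, 0.8):
    results[t] = precision_recall_at(y_true, proba, t)
    print(f"threshold={t}: precision={results[t][0]:.2f}, recall={results[t][1]:.2f}")
```

In this toy example precision rises and recall falls as the threshold increases, which is the trade-off the visualizer plots across the full range of thresholds.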