# Description

This notebook trains an Elastic-Net regression model on normalized building measurements and uses it to predict a survey score for an unseen building.

## Import Libraries

In [1]:
import pandas as pd # library for data manipulation (dataframes)
import numpy as np # library for computations
import matplotlib.pyplot as plt # library for visualization
from sklearn.linear_model import ElasticNet  # elastic-net regression model
from sklearn.linear_model import ElasticNetCV  # elastic-net with built-in cross-validation
import xlsxwriter as xlsw # library for exporting excel files

## Definition for Importing Data

In [2]:
def import_data(file_location, sheet):
    """This function returns the data table read from one sheet of an xlsx file.
    The top-left header cell of the table must be '#'; that column becomes the row index.
    """
    # read data table from file
    values = pd.read_excel(file_location, sheet_name=sheet)
    # use the '#' column as the row index
    values.set_index('#', inplace=True)
    return values
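
To show the table layout `import_data` expects without needing an Excel file, the sketch below builds a small DataFrame by hand and applies the same `set_index('#')` step. The two rows are copied from the first columns of the training table printed further down; the variable name `raw` is only illustrative.

```python
import pandas as pd

# Two rows in the layout a sheet is expected to have: the first
# column header is '#', the remaining columns are measurements.
raw = pd.DataFrame({
    '#': ['building 01', 'building 02'],
    'Type01Area': [0.3268, 0.5315],
    'Type01Integration': [0.8130, 1.0000],
})

# Same step as in import_data: use the '#' column as the row index.
values = raw.set_index('#')

# Rows can now be addressed by building name.
print(values.loc['building 02', 'Type01Area'])
```

If a sheet has no '#' column, `set_index('#')` raises a `KeyError`, which is why the docstring requires it.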

## Import Training Data

Load the training data table into a variable (the 'x' values).
This set holds quantitative data for a group of units.
In this example the units are buildings, and the data for each
building are NORMALIZED measurements such as areas, connectivity
between spaces, visibility inside spaces, etc.
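
The notebook does not say how the measurements were normalized. Every training column printed further down contains a 1.0, so one plausible scheme is dividing each column by its maximum. The sketch below illustrates that assumption only; the raw values and column names are hypothetical and not taken from the example files.

```python
import pandas as pd

# Hypothetical raw measurements (not from the source data).
raw = pd.DataFrame(
    {'Area': [120.0, 200.0, 80.0], 'Integration': [1.5, 0.9, 1.2]},
    index=['building A', 'building B', 'building C'],
)

# Assumed scheme: divide each column by its own maximum,
# so the largest value in every column becomes 1.0.
normalized = raw / raw.max()

print(normalized.round(4))
```

Whatever scheme is used, the test building has to be scaled with the same per-column factors as the training set for the coefficients to be comparable.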

In [3]:
x_values = import_data('quant_data_example.xlsx', 'trainning_set')

Print data table.

In [4]:
x_values

#,Type01Area,Type01Integration,Type01Entropy,Type01Control,Type01Choice,Type01IsoArea,Type01IsoPerim,Type01IsoOclu,Type02Area,Type02Integration,...,Type05IsoPerim,Type05IsoOclu,Type06Area,Type06Integration,Type06Entropy,Type06Control,Type06Choice,Type06IsoArea,Type06IsoPerim,Type06IsoOclu
building 01,0.3268,0.813,0.6042,0.2189,0.7927,1.0,1.0,0.6138,0.225,0.8849,...,0.1151,0.0071,0.0562,0.6846,0.7077,0.5101,0.6811,0.0252,0.0578,0.0074
building 02,0.5315,1.0,0.2517,0.3264,0.2048,0.1912,0.3008,1.0,0.573,0.6173,...,0.0088,0.0017,0.1638,0.7126,0.3571,0.5217,0.2142,0.004,0.0132,0.002
building 03,1.0,0.8907,0.8582,0.5267,0.756,0.4752,0.5688,0.3545,0.4487,0.5113,...,0.1483,0.1327,0.438,0.9911,0.7883,0.971,0.7387,0.0573,0.1119,0.0369
building 04,0.6187,0.6493,0.6915,0.2182,0.2445,0.2112,0.2746,0.0178,0.599,0.6483,...,0.1135,0.0924,0.4273,0.8558,0.7782,0.5072,0.8292,0.0356,0.0802,0.0125
building 05,0.3755,0.412,0.4309,0.2839,0.243,0.1067,0.1492,0.0008,1.0,1.0,...,0.5091,0.9143,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
building 06,0.1771,0.7367,1.0,1.0,1.0,0.4612,0.6002,0.2352,0.0387,0.3131,...,1.0,1.0,0.023,0.456,0.5149,0.2319,0.2972,0.004,0.0129,0.0


## Import Labels

Load the label table into a variable (the 'y' values).
These are the labels against which the quantitative data
will be trained. In this case they are scores from 1 to 5
for the different buildings, extracted from a survey using
a phenomenological approach.
The scores are split into separate fields such as
orientation, circulation, hierarchy of spaces, etc.

In [5]:
y_values = import_data('labels_example.xlsx', 'trainning_labels')

Print data table.

In [6]:
y_values

#,Hierarchy,Circulation,Efficiency,Proportion,Kitchen,Outdoor,Orientation,Overall
building 01,2.56,2.44,2.56,2.44,2.56,2.33,3.11,2.44
building 02,3.56,3.78,3.78,3.89,3.33,2.89,3.56,3.56
building 03,3.0,3.11,3.11,3.11,3.33,2.22,2.89,2.78
building 04,2.33,2.67,3.0,3.22,3.22,2.0,2.89,2.56
building 05,3.78,4.0,3.22,3.33,3.44,3.78,3.11,3.89
building 06,3.56,3.44,3.44,3.67,3.89,4.11,3.67,3.56


## Import Testing Data

Read the testing data table from the Excel file.

In [7]:
testing_data = import_data('quant_data_example.xlsx', 'testing_set')

Print data table.

In [8]:
testing_data

#,Type01Area,Type01Integration,Type01Entropy,Type01Control,Type01Choice,Type01IsoArea,Type01IsoPerim,Type01IsoOclu,Type02Area,Type02Integration,...,Type05IsoPerim,Type05IsoOclu,Type06Area,Type06Integration,Type06Entropy,Type06Control,Type06Choice,Type06IsoArea,Type06IsoPerim,Type06IsoOclu
test building,0.5516,0.7514,0.9693,0.3516,0.4267,0.1355,0.1869,0.0066,0.763,0.6228,...,0.5629,0.0428,0.6003,0.7838,0.8933,1,1,0.0265,0.0635,0.0003


# ML Models (Elastic Net)

## Elastic Net Definition

In [9]:
def elastic_net(x, y, cross_validation):
    """This function returns a trained model from a data set ('x') based on a label ('y').
    The variable 'cross_validation' defines the number of folds k for cross validation.
    The regularization strength alpha is selected by cross validation and the model is then refit with it.
    """
    trainned_model_cross = ElasticNetCV(cv=cross_validation, random_state=0).fit(x, y)
    trainned_model = ElasticNet(alpha = trainned_model_cross.alpha_).fit(x, y)
    return trainned_model
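
The two-step pattern used in `elastic_net` (cross-validate to pick `alpha`, then refit a plain `ElasticNet`) can be tried on data where the answer is known. The sketch below uses synthetic data, which is an assumption made purely for illustration: 40 samples, 5 features, of which only the first two influence `y`. Both estimators keep scikit-learn's default `l1_ratio` of 0.5, as in the function above.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, ElasticNetCV

# Synthetic data (hypothetical): 40 samples, 5 features, 2 informative.
rng = np.random.default_rng(0)
X = rng.random((40, 5))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.05 * rng.standard_normal(40)

# Step 1: k-fold cross-validation (k=5 here) over a grid of alpha values.
cv_model = ElasticNetCV(cv=5, random_state=0).fit(X, y)

# Step 2: refit a plain ElasticNet with the selected alpha.
final_model = ElasticNet(alpha=cv_model.alpha_).fit(X, y)

print('selected alpha:', cv_model.alpha_)
print('coefficients:', np.round(final_model.coef_, 4))
```

With only six buildings, `cv=6` in the notebook amounts to leave-one-out cross-validation, so each fold's error is measured on a single building.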

## Prediction

Taking the label "Hierarchy" from the example labels, let's train the model and predict the score of the test building for "Hierarchy".

In [10]:
trainned_model = elastic_net(x_values, y_values.Hierarchy, 6)

Print the coefficients of the trained model, rounded to four decimals.

In [11]:
np.round(trainned_model.coef_, 4)

array([-0., -0., -0.,  0., -0., -0., -0.,  0.,  0., -0., -0., -0., -0.,
        0.,  0.,  0., -0., -0., -0.,  0., -0., -0., -0.,  0., -0., -0.,
       -0., -0., -0., -0., -0., -0.,  0.,  0., -0.,  0.,  0.,  0.,  0.,
        0.,  0., -0., -0.,  0., -0.,  0.,  0.,  0.])

Predict the score of the test building.

In [12]:
trainned_model.predict(testing_data)

array([3.13166667])
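
The predicted value coincides with the mean of the six "Hierarchy" training scores. That is what a linear model returns when all of its coefficients are zero: only the intercept remains. The sketch below reproduces the effect under stated assumptions: the `y` values are copied from the labels table, while the features are random placeholders and `alpha=10.0` is an arbitrarily large value chosen to force full shrinkage. Neither comes from the notebook.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# "Hierarchy" scores for buildings 01-06, copied from the labels table.
y = np.array([2.56, 3.56, 3.0, 2.33, 3.78, 3.56])

# Hypothetical features: random values in [0, 1], not the real measurements.
rng = np.random.default_rng(0)
X = rng.random((6, 4))

# A deliberately large alpha (an assumption) shrinks every coefficient to zero.
model = ElasticNet(alpha=10.0).fit(X, y)

print(model.coef_)                        # all zeros
print(model.intercept_, y.mean())         # intercept equals the label mean
print(model.predict(rng.random((1, 4))))  # same value for any input
```

A model in this state returns the same score for every building, so the coefficients are worth inspecting before interpreting a prediction.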

# Conclusion

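After training the model for the specific label "Hierarchy", the predicted score for the test building is 3.13 out of 5.00. The coefficients printed above all round to zero, so this value is the model's intercept, which equals the mean of the six training scores (3.1317), rather than a result driven by the test building's measurements.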