# Homework 3
## CS 344
## Elizabeth Koning

### Problem 1

A neural network can compute the XOR function only if it has more than one layer and uses non-linear activation functions.

With a single-layer network, it is impossible. A neuron computes a weighted sum of its inputs and passes the result through an activation function, such as a threshold. With XOR, there is no single threshold that includes both true cases and neither false case. If the threshold is low enough to include both of the cases we want to be true, it gives the same results as OR, so the (True, True) case is included as well. However, we cannot raise it higher, because then we exclude both of the cases that we want to result in True.

In short, XOR is not linearly separable, so no linear model can learn the XOR function.
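
As a quick sanity check on that claim, the sketch below brute-forces a small grid of weights for a single threshold neuron. The grid range, the step size, and the helper names (`single_unit`, `find_weights`) are illustrative assumptions, not something from class. A grid search is not a proof, but it shows AND and OR being found easily while XOR never is.

```python
import itertools

# The four possible inputs and the target outputs for each gate.
INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
TARGETS = {
    "AND": [0, 0, 0, 1],
    "OR": [0, 1, 1, 1],
    "XOR": [0, 1, 1, 0],
}


def single_unit(w1, w2, b, x1, x2):
    """One threshold neuron: outputs 1 when the weighted sum is positive."""
    return int(w1 * x1 + w2 * x2 + b > 0)


def find_weights(target, grid):
    """Return the first (w1, w2, b) on the grid that reproduces target, or None."""
    for w1, w2, b in itertools.product(grid, repeat=3):
        outputs = [single_unit(w1, w2, b, x1, x2) for x1, x2 in INPUTS]
        if outputs == target:
            return (w1, w2, b)
    return None


# Hypothetical search grid: weights and bias from -2 to 2 in steps of 0.25.
grid = [i / 4 for i in range(-8, 9)]
results = {name: find_weights(target, grid) for name, target in TARGETS.items()}

for name, weights in results.items():
    print(name, "->", "no solution on this grid" if weights is None else weights)
```

The empty result for XOR is expected on any grid. With weights w1, w2 and bias b, XOR needs b ≤ 0, w1 + b > 0, w2 + b > 0, and w1 + w2 + b ≤ 0. Adding the two middle inequalities gives w1 + w2 + 2b > 0, so w1 + w2 + b > -b ≥ 0, which contradicts the last condition.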

In class, we saw two ways to do it: training a multi-layer network with backpropagation, or building a multi-layer network by hand.

With a multi-layer network using a step function, we can combine AND and OR to get XOR. It would look like:

![](https://lh3.googleusercontent.com/v1x38cgc3h1f10qMWvha68YlTgC5951PVFgXTkcjao4jDhQLt0FrWKk_FXW9porL-uEcpXFU9HYYd6A_Ou-SZ_VlLPSq8m5RPan6-LPIcXZ2ajF-slgBZCb_t8LVRZPZHp8GHP3EUwulyCJy6kaoz_usLWs5Z52QkCoh2sVuA9BATc2O_AH8Ys5PGZsJ2-984qhEfXSQYezQY-Ex6xmL_WBNvijNAvPmsns_LFNBMukvoydQoUe7PO68EPdhwz5jpgiHCezpxGrwXxrXuqeWOneny_lNc6yB-LkjbmBaF6FWBSkgioILIqn0LXTq9xF_LYLDc1COwF63ax-GMR1LutuuPPPEXh9hR0X-EGc2P128Muox6SnVFqDfEeKuXRw0UlWbeUC9yE5GreFGSD3Ac9WTWVt-BMzQyoKo3DSVdMnPzD2AM1EWufTde2aCeyDUvpaKJhlNqmNUbWCTiqSASRXCkkYLaxrrZpJ4nD9GKmUiUEIzqcXP1gbxwJj0WhF68yJz8lB-8cJlZcRwwEwkMNFoABsKntI2MWaaEQGPJ9hekVAKCMpnK9ZWdO1qsUL0B97w5SsoAXruOpvKXL5lTm7n158Hj7Pmx-A3DGLVA82i9Ug4g0R_t98va4pAi1FWdFBHs5i9hkpTfMkcTC8R1fGfJPfuMQTGy7M---X7uIhVmXwRNCf0uceEy7fPl4L3p-yA1M-pCqRMro1skNxc6059=w720-h959-no)
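
The same idea can be written out in a few lines of code. This is a minimal sketch, not a transcription of the figure: the specific weights and thresholds (0.5 and 1.5 for the hidden units, +1 and -1 into the output) are assumptions chosen so that the hidden layer computes OR and AND and the output fires for "OR but not AND".

```python
def step(z):
    """Step activation: 1 when the input is positive, otherwise 0."""
    return 1 if z > 0 else 0


def xor_network(x1, x2):
    """Two-layer step-function network computing XOR from OR and AND units."""
    # Hidden layer (assumed weights of 1 on each input, with different thresholds).
    h_or = step(x1 + x2 - 0.5)   # fires when at least one input is 1
    h_and = step(x1 + x2 - 1.5)  # fires only when both inputs are 1
    # Output unit: OR counts for +1, AND counts for -1, threshold 0.5.
    return step(h_or - h_and - 0.5)


truth_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
for inputs, output in truth_table.items():
    print(inputs, "->", output)
```

The negative weight on the AND unit is what removes the (True, True) case that a single OR-style threshold cannot exclude.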

### Problem 2

First, we need to import the applicable packages and load the data set.

In [42]:
import pandas as pd
import numpy as np
from keras.datasets import boston_housing

(x_train, y_train), (x_test, y_test) = boston_housing.load_data()

a. We can compute the dimensions of the data structures.

In [43]:
print("x_train shape: ", x_train.shape)
print("y_train shape: ", y_train.shape)
print("x_test shape:  ", x_test.shape)
print("y_test shape:  ", y_test.shape)

x_train shape:  (404, 13)
y_train shape:  (404,)
x_test shape:   (102, 13)
y_test shape:   (102,)


b. Next, we can create the testing, training, and validation sets. We already have testing and training sets, so we just need to create a validation set.

It could work well to have testing and validation datasets of the same size. Halving the testing set would leave us with only 51 examples for testing and 51 for validation, so I'll take the validation examples from the training data instead.

According to the Keras documentation, some of its methods can do the training/validation split themselves while fitting the model. Depending on what we are doing, we could instead tell Keras to use 20% of our training data for validation.
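
In Keras, this option is the `validation_split` argument of `model.fit`, which, as far as I understand the documentation, holds out the last fraction of the samples before any shuffling. The sketch below imitates that behavior with plain NumPy so the resulting sizes are easy to see. The arrays are random placeholders with the same shapes as the original training data, and the names `x_all`, `x_fit`, and `x_holdout` are made up for this example.

```python
import numpy as np

# Placeholder arrays with the same shapes as the original training data.
rng = np.random.default_rng(0)
x_all = rng.normal(size=(404, 13))
y_all = rng.normal(size=(404,))

# Hold out the last 20% of the rows, without shuffling first.
validation_split = 0.2
split_at = int(len(x_all) * (1 - validation_split))

x_fit, x_holdout = x_all[:split_at], x_all[split_at:]
y_fit, y_holdout = y_all[:split_at], y_all[split_at:]

print("fit shape:     ", x_fit.shape, y_fit.shape)
print("holdout shape: ", x_holdout.shape, y_holdout.shape)
```

With 404 training rows this gives 323 rows for fitting and 81 for validation, a slightly smaller validation set than the 102 rows used in the manual split.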

In [44]:
x_val = x_train[0:102]
x_train = x_train[102:]
y_val = y_train[0:102]
y_train = y_train[102:]

print("x_train shape: ", x_train.shape)
print("y_train shape: ", y_train.shape)
print("x_val   shape: ", x_val.shape)
print("y_val   shape: ", y_val.shape)
print("x_test  shape: ", x_test.shape)
print("y_test  shape: ", y_test.shape)

print(x_test[1])

x_train shape:  (302, 13)
y_train shape:  (302,)
x_val shape:    (102, 13)
y_val shape:    (102,)
x_test shape:   (102, 13)
y_test shape:   (102,)
[1.2329e-01 0.0000e+00 1.0010e+01 0.0000e+00 5.4700e-01 5.9130e+00
 9.2900e+01 2.3534e+00 6.0000e+00 4.3200e+02 1.7800e+01 3.9495e+02
 1.6210e+01]


c. Next, we can create a synthetic feature that could be useful.

From the Keras documentation, it is not clear what the features we already have are. I found a list here: https://www.kaggle.com/c/boston-housing

One potentially useful synthetic feature would be multiplying "rad" (index of accessibility to radial highways) by "dis" (weighted mean of distances to five Boston employment centres) to get a more complete picture of the commuting and travel situation. This would serve as an "accessibility index" in a way.

One synthetic feature I considered including was whether a home was built before or after lead paint became illegal. However, according to this (http://lib.stat.cmu.edu/datasets/boston), the data is from 1978, the same year that lead paint became illegal. This also would be less helpful in Boston than in other parts of the country because the area is so much older.

I used this source to set up the dataframe with named columns: https://towardsdatascience.com/linear-regression-on-boston-housing-dataset-f409b7e4a155

In [6]:
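from sklearn.datasets import load_boston  # removed in scikit-learn 1.2, so this needs an older version
import pandas as pd

boston_data = load_boston()
boston = pd.DataFrame(boston_data.data, columns=boston_data.feature_names) # does not include MEDV, the target

boston['travel'] = boston['RAD'] * boston['DIS']  # synthetic feature: highway access index times distance to employment centres

boston.head()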

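| | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT | travel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00632 | 18.0 | 2.31 | 0.0 | 0.538 | 6.575 | 65.2 | 4.09 | 1.0 | 296.0 | 15.3 | 396.9 | 4.98 | 4.09 |
| 1 | 0.02731 | 0.0 | 7.07 | 0.0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2.0 | 242.0 | 17.8 | 396.9 | 9.14 | 9.9342 |
| 2 | 0.02729 | 0.0 | 7.07 | 0.0 | 0.469 | 7.185 | 61.1 | 4.9671 | 2.0 | 242.0 | 17.8 | 392.83 | 4.03 | 9.9342 |
| 3 | 0.03237 | 0.0 | 2.18 | 0.0 | 0.458 | 6.998 | 45.8 | 6.0622 | 3.0 | 222.0 | 18.7 | 394.63 | 2.94 | 18.1866 |
| 4 | 0.06905 | 0.0 | 2.18 | 0.0 | 0.458 | 7.147 | 54.2 | 6.0622 | 3.0 | 222.0 | 18.7 | 396.9 | 5.33 | 18.1866 |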
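
The same feature could also be added directly to the Keras arrays, which have no column names. This is a sketch under one assumption: that the Keras arrays keep the column order of the CMU listing, putting DIS at index 7 and RAD at index 8. The `x_test[1]` row printed earlier is consistent with that (2.3534 and 6.0 in those positions), but I have not confirmed it in the Keras documentation. The helper name `add_travel_feature` is made up for this example.

```python
import numpy as np

# Assumed column positions, following the order of the CMU listing.
DIS_COL, RAD_COL = 7, 8


def add_travel_feature(x):
    """Return a copy of x with RAD * DIS appended as the last column."""
    travel = x[:, RAD_COL] * x[:, DIS_COL]
    return np.column_stack([x, travel])


# The x_test[1] row printed earlier, used here as a one-row example.
row = np.array([[1.2329e-01, 0.0, 1.0010e+01, 0.0, 5.4700e-01, 5.9130e+00,
                 9.2900e+01, 2.3534e+00, 6.0, 4.3200e+02, 1.7800e+01,
                 3.9495e+02, 1.6210e+01]])

augmented = add_travel_feature(row)
print(augmented.shape)
print(augmented[0, -1])
```

Applying the same function to `x_train`, `x_val`, and `x_test` would keep the three sets consistent, each gaining a fourteenth column.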
