# Homework 3 - Regression



## 1. Restaurant data

---

Before splitting on price, the data is evenly split between yes and no answers (6 of each), so the entropy at this point is
\begin{align*}
Entropy(v)=&-\sum_{i=1}^{n} v_i \cdot  log_2(v_i) \\
Entropy(start)=&-[0.5\cdot log_2(0.5)+0.5\cdot log_2(0.5)] \\
=& 1
\end{align*}

Next we find the remainder after splitting on price. The yes and no counts in the three price categories are:

| Price 	| Yes 	| No 	|
|---------|-------|-----|
| \$     	| 3   	| 4  	|
| \$\$    	| 2   	| 0  	|
| \$\$\$   	| 1   	| 2  	|


\begin{align*}
Remainder(price)=&\sum_{i=1}^{d} \frac{p_i+n_i}{p+n} \cdot Entropy(\frac{p_i}{p_i+n_i},\frac{n_i}{p_i+n_i})\\
=&  -\frac{7}{12}\left ( \frac{3}{7}log_2\left(\frac{3}{7}\right) + \frac{4}{7}log_2\left(\frac{4}{7}\right) \right )\\
-&  \frac{2}{12}\left ( \frac{2}{2}log_2\left(\frac{2}{2}\right) + \frac{0}{2}log_2\left(\frac{0}{2}\right) \right )\\
-&  \frac{3}{12}\left ( \frac{1}{3}log_2\left(\frac{1}{3}\right) + \frac{2}{3}log_2\left(\frac{2}{3}\right) \right )\\
=& \frac{7}{12} (0.985228) + \frac{2}{12}(0) + \frac{3}{12} (0.918296)\\
=& 0.80429
\end{align*}

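The $\frac{0}{2}log_2\left(\frac{0}{2}\right)$ term above is taken to be 0, following the usual convention that $0 \cdot log_2(0) = 0$.

To find the information gain: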

\begin{align*}
Information \text{ } Gain =& Entropy - Remainder\\
=& 1 - 0.80\\
=& 0.20 \text{ bits}
\end{align*}

Splitting on price therefore gives more information than the *type* question, which yielded a gain of 0 bits, but less than the *patrons* question, which yielded a gain of 0.54 bits.
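
The arithmetic above can be checked numerically. The following is a minimal sketch that uses only the yes/no counts from the price table; the helper names `entropy` and `remainder` are chosen here for illustration and are not part of the assignment.

```python
from math import log2

def entropy(counts):
    """Entropy in bits of a class distribution given as raw counts."""
    total = sum(counts)
    # Zero counts are skipped, which applies the convention 0 * log2(0) = 0.
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def remainder(splits):
    """Weighted average entropy of the subsets produced by a split."""
    total = sum(sum(counts) for counts in splits)
    return sum(sum(counts) / total * entropy(counts) for counts in splits)

# (yes, no) counts for the $, $$ and $$$ rows of the price table
price_splits = [(3, 4), (2, 0), (1, 2)]

start = entropy((6, 6))        # 6 yes and 6 no before splitting
rem = remainder(price_splits)
gain = start - rem

print(f"entropy={start:.4f} remainder={rem:.4f} gain={gain:.4f}")
```

Rounded to two decimals, the printed gain agrees with the 0.20 bits derived by hand.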




## 2. XOR Neural Network
---

It is possible to construct a neural network that computes the XOR function. To keep the network simple, we can take the logical approach: break XOR down into simpler gates and pass each gate only the values it needs.

To do this, we can construct a network that follows the layout below. As we saw, simple perceptrons can learn AND, OR, and NAND, which are all the functions we need here, so the network consists of trained perceptrons linked together accordingly.

![3 gate xor](https://upload.wikimedia.org/wikipedia/commons/a/a2/254px_3gate_XOR.jpg) (Wikipedia)

The resulting truth table is as follows:

| A 	| B   	| NAND 	| OR 	| AND(NAND,OR) 	| Goal (XOR) 	|
|---	|-----	|------	|----	|--------------	|------------	|
| 0 	| 0   	| 1    	| 0  	| 0            	| 0          	|
| 1 	| 0   	| 1    	| 1  	| 1            	| 1          	|
| 0 	| 1   	| 1    	| 1  	| 1            	| 1          	|
| 1 	| 1   	| 0    	| 1  	| 0            	| 0          	|

This network is simpler, although it does not follow the standard conventions for neural networks. Because it is built from already-trained perceptrons, it needs no further training to compute the XOR function.
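
The three-gate layout can be sketched in a few lines of Python. The weights and biases below are hand-picked values that happen to realise NAND, OR, and AND with a step activation; they are an assumption standing in for whatever a trained perceptron would actually learn, and the helper name `perceptron` is hypothetical.

```python
def perceptron(weights, bias):
    """Build a step-activation unit that outputs 1 when w . x + b > 0."""
    def unit(*inputs):
        activation = sum(w * x for w, x in zip(weights, inputs)) + bias
        return int(activation > 0)
    return unit

# Hand-picked parameters (assumed, not trained) for each gate
nand_gate = perceptron((-1, -1), 1.5)
or_gate = perceptron((1, 1), -0.5)
and_gate = perceptron((1, 1), -1.5)

def xor(a, b):
    """XOR as AND(NAND(a, b), OR(a, b)), matching the truth table above."""
    return and_gate(nand_gate(a, b), or_gate(a, b))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, nand_gate(a, b), or_gate(a, b), xor(a, b))
```

Each printed row gives the two inputs, the two intermediate gate outputs, and the final XOR value, so it can be compared against the truth table column by column.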

## 3. Boston Housing Dataset
---



In [0]:
import numpy as np
import pandas as pd
import keras
keras.__version__

from keras.datasets import boston_housing
(train_set, train_target), (test_set, test_target) = boston_housing.load_data()

a. Compute dimensions of data structures

In [53]:
# ndim is the number of axes: 1 for the target vectors, 2 for the feature matrices
print('testing target dimensions:', test_target.ndim)
print('testing data dimensions:', test_set.ndim)
print('training target dimensions:', train_target.ndim)
print('training data dimensions:', train_set.ndim)



testing target dimensions: 1       
testing data dimensions: 2       
training target dimensions: 1       
training data dimensions: 2 
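
Note that `ndim` reports only the number of axes, while `shape` gives the size along each axis. The sketch below shows the difference on stand-in arrays; the sizes (404 rows of 13 features) are an assumption chosen to resemble the training split, and the `_demo` names are hypothetical so they do not clash with the real variables.

```python
import numpy as np

# Stand-in arrays; the sizes are assumed, not read from the dataset.
train_set_demo = np.zeros((404, 13))
train_target_demo = np.zeros(404)

# ndim counts axes, shape lists the length of each axis
print('data:   ndim =', train_set_demo.ndim, ' shape =', train_set_demo.shape)
print('target: ndim =', train_target_demo.ndim, ' shape =', train_target_demo.shape)
```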

c. Create new synthetic feature

I created a synthetic feature: the ratio of the proportion of residential land zoned for lots over 25,000 sq ft to the proportion of non-retail business acres per town. This could be useful if there is a correlation between businesses and large residential lots (such as apartments) that affects housing prices. Unfortunately, many of the values are zero, which may limit how useful the feature is for predicting prices.

In [56]:
# Synthetic feature: column 1 (large-lot residential land) divided by
# column 2 (non-retail business acres), one single-element row per sample
synthetic_feature = [[row[1] / row[2]] for row in train_set]
train_set = np.append(train_set, synthetic_feature, axis=1)

# Repeat for the test set so both sets have the same columns
synthetic_feature = [[row[1] / row[2]] for row in test_set]
test_set = np.append(test_set, synthetic_feature, axis=1)

print(synthetic_feature)

[[0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [14.55026455026455], [5.144032921810699], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [3.7542662116040955], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [3.7542662116040955], [21.978021978021978], [0.0], [0.0], [44.554455445544555], [0.0], [2.059308072487644], [0.0], [0.0], [6.240249609984399], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [39.800995024875625], [12.681159420289855], [0.0], [0.0], [0.0], [0.0], [0.0], [15.137614678899082], [0.0], [0.0], [0.0], [73.77049180327869], [0.0], [0.0], [0.0], [13.08139534883721], [0.0], [0.0], [5.037783375314861], [6.240249609984399], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [21.978021978021978], [0.0], [9.868421052631579], [0.0], [0.0], [0.0], [3.7542662116040955], [0.0], [0.0], [5.582922824302135], [1.5883100381194408], [2.059308072487644], [0.0], [6.006006006006006], [0.0], [0.0], [5.037783375314861], [3.7542662116040955], [0.0], [0.0], [1.5883100381194408], [0.0], [0.0], [0.0], [0.0], [0.0], 
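
The same ratio can also be computed without an explicit Python loop. The following is a minimal sketch on a small made-up array, with an added guard for a zero denominator; the guard, the helper name `add_ratio_feature`, and the toy values are assumptions for illustration and were not part of the original cell.

```python
import numpy as np

def add_ratio_feature(data, num_col=1, den_col=2):
    """Append data[:, num_col] / data[:, den_col] as a new last column."""
    numerator = data[:, num_col]
    denominator = data[:, den_col]
    # Where the denominator is 0, keep the ratio at 0 instead of inf or nan.
    ratio = np.divide(
        numerator,
        denominator,
        out=np.zeros_like(numerator, dtype=float),
        where=denominator != 0,
    )
    return np.column_stack([data, ratio])

# Toy rows: the columns stand in for the first three features
toy = np.array([
    [0.1, 0.0, 8.14],
    [0.2, 12.5, 7.87],
    [0.3, 80.0, 1.52],
])
augmented = add_ratio_feature(toy)
print(augmented.shape)
print(augmented[:, -1])
```

Wrapping the step in a function means the training and test sets go through exactly the same transformation.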

b. Construct testing, training and validation sets

This holds out the first 100 rows of the training set as the validation set. I did this after creating the synthetic feature so that the feature would not have to be added to all three sets separately; it is already present in each.

In [0]:
# Hold out the first 100 samples for validation; the rest stay in training
val_set = train_set[:100]
train_set = train_set[100:]
val_target = train_target[:100]
train_target = train_target[100:]
print("Validation set count:", len(val_set))
print("Training set count:", len(train_set))
print("Validation target count:", len(val_target))
print("Training target count:", len(train_target))
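
Slicing off the first 100 rows relies on the rows already being in random order. Where that cannot be assumed, one option is to shuffle the indices before splitting. The sketch below does this on stand-in arrays; the function name `split_validation`, the seed, and the array sizes are all assumptions for illustration.

```python
import numpy as np

def split_validation(data, targets, n_val, seed=0):
    """Shuffle rows with a fixed seed, then hold out n_val rows for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(data))
    val_idx, train_idx = order[:n_val], order[n_val:]
    # Index data and targets with the same indices so rows stay paired
    return (data[train_idx], targets[train_idx]), (data[val_idx], targets[val_idx])

# Stand-in data (assumed sizes): 404 rows with 14 columns
demo_data = np.arange(404 * 14, dtype=float).reshape(404, 14)
demo_targets = np.arange(404, dtype=float)

(demo_train, demo_train_y), (demo_val, demo_val_y) = split_validation(
    demo_data, demo_targets, n_val=100
)
print("Validation:", demo_val.shape, demo_val_y.shape)
print("Training:", demo_train.shape, demo_train_y.shape)
```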
