<a href="https://colab.research.google.com/github/hikmatfarhat-ndu/CSC645/blob/master/9MixedData.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Keras Functional API

In this exercise we use the functional API provided by Keras to build a model with multiple inputs and outputs, which cannot be done with the Sequential model.
The data describes house prices. The features are the number of bedrooms, the number of bathrooms, the size, and the zipcode. There are also 4 images per house; since this is a proof-of-concept exercise, we use only one of them (the frontal view).

In [None]:
import tensorflow as tf
import pandas as pd
import cv2
import numpy as np
from tensorflow.keras.layers import Dense,Input,concatenate,Flatten
from tensorflow.keras.models import Model





### The Data
The data can be found in the GitHub repository cloned below.

In [None]:
!git clone https://github.com/emanhamed/Houses-dataset


### Read the non-image features

In [None]:
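# sep=r"\s+" splits on any whitespace; it replaces delim_whitespace=True, which is deprecated in recent pandas
df=pd.read_csv("Houses-dataset/Houses Dataset/HousesInfo.txt",header=None,sep=r"\s+",
               names=["bedrooms","bathrooms","size","zipcode","price"])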

Display the first 10 rows of the data.

In [None]:
df[0:10]

### Preprocessing the data
First we remove the houses whose zipcode has fewer than 20 entries.
We use pandas to count the number of houses per zipcode.

In [None]:
zipcodes=df['zipcode'].value_counts().keys().tolist()
counts=df['zipcode'].value_counts().tolist()

Remove all the entries whose zipcode contains fewer than 20 houses.

In [None]:
for count,zipcode in zip(counts,zipcodes):
  if count<20:
    idx=df[df['zipcode']==zipcode].index
    df.drop(idx,inplace=True)
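
The same filtering can also be written without an explicit loop. The sketch below is an alternative, not the method used in this notebook. It runs on a small made-up DataFrame (`toy`) with a threshold of 2 instead of 20 so that the effect is easy to see. Note that the original index is preserved, which matters here because the images are later loaded by index.

```python
import pandas as pd

# Hypothetical toy data: zipcode 11111 appears 3 times, 22222 once, 33333 twice.
toy = pd.DataFrame({
    "zipcode": [11111, 11111, 22222, 33333, 11111, 33333],
    "price":   [100, 120, 90, 200, 110, 210],
})

min_count = 2  # the notebook uses 20; 2 is only for this toy example

# For every row, count how many rows share its zipcode.
per_row_count = toy.groupby("zipcode")["zipcode"].transform("count")

# Keep only the rows whose zipcode is frequent enough (index labels are unchanged).
filtered = toy[per_row_count >= min_count]
print(filtered)
```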

Display the first 10 entries after the removal

In [None]:
df[0:10]

### Importing the images
Next we use OpenCV to read the images and resize them to 48x48.

In [None]:

image_list=[]
prefix="Houses-dataset/Houses Dataset/"
suffix="_frontal.jpg"
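# NOTE: this assumes the image file numbers match the DataFrame index;
# adjust (e.g. idx+1) if the files turn out to be numbered differently
for idx in df.index.tolist():
  path=prefix+str(idx)+suffix
  img=cv2.imread(path)
  # cv2.imread returns None instead of raising when a file cannot be read
  if img is None:
    raise FileNotFoundError(path)
  img=cv2.resize(img,(48,48))
  image_list.append(img)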

Create the image dataset

In [None]:

images=np.stack(image_list)

In [None]:
images.shape

For better and faster convergence we rescale the pixel values to the range [0, 1].

In [None]:
images=images/255.
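
As a quick check of what this rescaling does, the sketch below applies it to a random stand-in array with the same layout as `images`. The name `fake_images` and its contents are made up for illustration; they are not the real data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the stacked images: 5 images, 48x48 pixels, 3 channels, stored as uint8.
fake_images = rng.integers(0, 256, size=(5, 48, 48, 3), dtype=np.uint8)

# Dividing by a float promotes the uint8 array to float64.
scaled = fake_images / 255.0

print(scaled.dtype, scaled.min(), scaled.max())
```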

### Creating the features dataset
Again, for better and faster convergence, we rescale the data. The number of bedrooms, the number of bathrooms, and the size are each divided by their maximal value, and so are the prices. Since zipcodes are categorical, we use one-hot encoding for them.

In [None]:
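values=df.values
prices=values[:,4]
max_price=prices.max()
prices=prices/max_price

# divide each numeric feature by its maximal value
bedrooms=values[:,0]
bedrooms=bedrooms/bedrooms.max()
bathrooms=values[:,1]
bathrooms=bathrooms/bathrooms.max()
size=values[:,2]
max_size=size.max()
size=size/max_size
# map each distinct zipcode to an index 0..n-1, then one-hot encode the indices
# (tf.one_hot needs integer indices, and df.values holds floats)
unique_zipcodes,zipcode_idx=np.unique(values[:,3].astype(int),return_inverse=True)
zipcodes=tf.one_hot(zipcode_idx,len(unique_zipcodes)).numpy()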
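
One-hot encoding turns a categorical column into one 0/1 column per distinct category. As an illustration only, the sketch below does this with `pd.get_dummies` on a few made-up zipcodes (`toy_zipcodes` is hypothetical). It is an alternative way to obtain the encoding, not the code used in this notebook.

```python
import pandas as pd

# Hypothetical zipcodes: three distinct values, one of them repeated.
toy_zipcodes = pd.Series([92276, 93510, 92276, 94501], name="zipcode")

# One column per distinct zipcode (columns are sorted), with 1.0 marking the row's zipcode.
one_hot = pd.get_dummies(toy_zipcodes, dtype=float)
print(one_hot)

# Plain NumPy array, ready to be stacked with the other features.
encoded = one_hot.to_numpy()
```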

In [None]:
nsamples=bedrooms.shape[0]
bathrooms=bathrooms.reshape(nsamples,1)
bedrooms=bedrooms.reshape(nsamples,1)
size=size.reshape(nsamples,1)
features=np.hstack([bedrooms,bathrooms,size,zipcodes])
features.shape

Split the data into train and test sets.

In [None]:
train_images=images[0:300]
test_images=images[300:nsamples]
train_features=features[0:300]
test_features=features[300:nsamples]
train_prices=prices[0:300]
test_prices=prices[300:nsamples]
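
The split above takes the first 300 rows in file order. If the rows happen to be ordered in some way, a shuffled split may be preferable. The sketch below shows one way to do that with a shared random permutation, so that features and prices stay aligned. The `fake_*` arrays, the seed, and the split size are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(42)

n = 10
# Stand-in data: row i of the features is [2i, 2i+1] and its price is 10*i.
fake_features = np.arange(n * 2).reshape(n, 2)
fake_prices = np.arange(n) * 10.0

# One permutation of the row numbers, used for every array.
perm = rng.permutation(n)
n_train = 7
train_idx, test_idx = perm[:n_train], perm[n_train:]

train_x, test_x = fake_features[train_idx], fake_features[test_idx]
train_y, test_y = fake_prices[train_idx], fake_prices[test_idx]

print(train_x.shape, test_x.shape)
```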


### Create a Model for features

In [None]:
def get_features_model():
  # shape must be a tuple: (n,) with a trailing comma, not (n)
  features_input=Input(shape=(train_features.shape[1],),name="features_input")
  features_layers=Dense(32,activation="relu")(features_input)
  features_layers=Dense(16,activation="relu")(features_layers)
  features_output=Dense(1,activation="linear",name="features_output")(features_layers)
  model=Model(inputs=features_input,outputs=features_output)
  return model


### Create a model for images

Convolutional networks work best with images, but we are not looking to optimize the results here, just to show how to use Keras for models with multiple inputs and outputs.

In [None]:
def get_image_model():
  image_input=Input(shape=(48,48,3),name="image_input")
  flatten=Flatten()(image_input)
  image_layers=Dense(64,activation="relu")(flatten)
  image_layers=Dense(32,activation="relu")(image_layers)
  image_layers=Dense(16,activation="relu")(image_layers)
  image_output=Dense(1,activation="linear",name='image_output')(image_layers)
  model=Model(inputs=image_input,outputs=image_output)
  return model

In [None]:
features_model=get_features_model()
image_model=get_image_model()


### Create the combined model

We use the concatenate layer provided by Keras to combine both models.
__Note__: the output of features_model does not appear in the plot below, but it is listed among the outputs of the model (see `model.outputs`).


In [None]:
both=concatenate([features_model.output,image_model.output])
both=Dense(10,activation='relu')(both)
both=Dense(1,activation='linear',name='both_outputs')(both)
model=Model(inputs=[features_model.input,image_model.input],outputs=[both,features_model.output])
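
To see the order of the inputs and the number of outputs without relying on a plot, the model can be inspected directly. The sketch below builds a much smaller two-input, two-output model with the same wiring. The layer sizes and the name `toy_model` are hypothetical, chosen only to keep the example short.

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input, Flatten, concatenate
from tensorflow.keras.models import Model

# Small stand-in for the features branch.
feat_in = Input(shape=(3,), name="features_input")
feat_out = Dense(1, name="features_output")(feat_in)

# Small stand-in for the image branch.
img_in = Input(shape=(4, 4, 3), name="image_input")
img_out = Dense(1, name="image_output")(Flatten()(img_in))

# Combine the two branch outputs and add a final prediction.
merged = concatenate([feat_out, img_out])
both_out = Dense(1, name="both_outputs")(merged)

toy_model = Model(inputs=[feat_in, img_in], outputs=[both_out, feat_out])

# The inputs keep the order given to Model(...): features first, then images.
input_shapes = [tuple(t.shape) for t in toy_model.inputs]
print(input_shapes)
print(len(toy_model.outputs))
```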


### Plot the model

__IMPORTANT__: the graph below is laid out by graphviz, so the image model appearing on the left does __NOT__ reflect the order of the inputs in Keras. If you check the inputs of the model, you will see that the first input is the features input, not the images.

In [None]:
tf.keras.utils.plot_model(model,show_shapes=True)

__NOTE__: for simplicity we use the same loss function for __both__ outputs, but it is possible to specify a different loss for each output. Also, since the target (price) is always positive, we can safely choose mean_absolute_percentage_error, which divides by the true value.

In [None]:
opt=tf.keras.optimizers.Adam()
model.compile(optimizer=opt,loss='mean_absolute_percentage_error')
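
As an illustration of per-output losses, the sketch below compiles a tiny two-output model with one loss per output, given as a list in the same order as the outputs. Everything here is an assumption made for the example: the model, the names `out_a` and `out_b`, the choice of mean squared error for the second output, the loss weights, and the random data.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

# Tiny model with one input and two outputs.
inp = Input(shape=(3,), name="toy_input")
hidden = Dense(4, activation="relu")(inp)
out_a = Dense(1, name="out_a")(hidden)
out_b = Dense(1, name="out_b")(hidden)
toy_model = Model(inputs=inp, outputs=[out_a, out_b])

# One loss (and one weight) per output, in the order of the outputs list.
toy_model.compile(
    optimizer="adam",
    loss=["mean_absolute_percentage_error", "mean_squared_error"],
    loss_weights=[1.0, 0.5],
)

# Random stand-in data; targets are kept away from zero for the percentage error.
rng = np.random.default_rng(0)
x = rng.random((16, 3)).astype("float32")
y = (rng.random((16, 1)) + 0.5).astype("float32")

# One target array per output.
toy_model.fit(x, [y, y], epochs=1, verbose=0)
preds = toy_model.predict(x, verbose=0)
print(len(preds), preds[0].shape)
```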


In [None]:
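# the model has two outputs, so we pass one target per output (both predict the price)
model.fit(x=[train_features,train_images],y=[train_prices,train_prices],epochs=100)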


In [None]:
# one target per output, as in training
model.evaluate([test_features,test_images],[test_prices,test_prices])


In [None]:
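# predict returns one array per output: index 0 is the combined output, index 1 the features-only output
predict=np.squeeze(model.predict([test_features[0:6],test_images[0:6]]))
print(predict[0]*max_price)
print(test_prices[0:6]*max_price)
print(predict[1]*max_price)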

In [None]:
# percentage error per house for the combined output (0) and the features-only output (1)
print(100*np.abs(test_prices[0:6]-predict[0])/test_prices[0:6])
print(100*np.abs(test_prices[0:6]-predict[1])/test_prices[0:6])
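
The percentages printed above are the per-house terms of the mean absolute percentage error. The sketch below computes that error by hand on made-up prices (`true_prices` and `pred_prices` are hypothetical). It also shows that dividing both arrays by the same constant leaves the error unchanged, which is why training on prices divided by `max_price` still gives percentages that can be read directly.

```python
import numpy as np

# Hypothetical true and predicted prices in dollars.
true_prices = np.array([200_000.0, 400_000.0, 500_000.0])
pred_prices = np.array([220_000.0, 360_000.0, 500_000.0])

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * np.mean(np.abs(y_true - y_pred) / np.abs(y_true))

# Error on the raw prices: the per-house errors are 10%, 10% and 0%.
error_dollars = mape(true_prices, pred_prices)

# Same error after dividing everything by the maximal price.
scale = true_prices.max()
error_scaled = mape(true_prices / scale, pred_prices / scale)

print(error_dollars, error_scaled)
```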