## <center>Autoencoder Exercise</center>

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

from sklearn.preprocessing import MinMaxScaler

### The Data

The table gives the average consumption of 17 types of food, in grams per person per week, for each of the four countries in the UK.
It shows some interesting variation across food types, but the overall differences between countries are hard to spot at a glance.


In [None]:
df = pd.read_csv(
    'C:/Users/Lenovo/Desktop/Python/Deep Learning/Data Sets/TensorFlow/UK_foods.csv',
    index_col='Unnamed: 0')

df

In [None]:
df.info()

**TASK: Transpose the DataFrame so that the columns are now the index.**

In [None]:
df = df.T

**TASK: Create a heatmap from the DataFrame. Does any country really stand out as different than the others? It should be tricky to tell just from the image. Do any two countries appear to be very similar?**

In [None]:
sns.heatmap(data=df)

**TASK: Create an encoder. Our goal will be to reduce the dimensions from 17 --> 2 and see if any countries stand out as very different. In the solutions we built one that went 17 --> 8 --> 4 --> 2**

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD 

In [None]:
encoder = Sequential()
# Only the first layer needs input_shape; later layers infer it from the previous layer's output
encoder.add(Dense(units=8, activation='relu', input_shape=[17]))
encoder.add(Dense(units=4, activation='relu'))
encoder.add(Dense(units=2, activation='relu'))
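
As a rough sanity check on the 17 --> 8 --> 4 --> 2 encoder, the number of trainable parameters can be worked out by hand: a fully connected layer with a bias term has `inputs * units + units` parameters. The helper name `dense_params` below is invented for illustration and is not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters in a fully connected layer with a bias term."""
    return n_inputs * n_units + n_units

layer_sizes = [17, 8, 4, 2]
# Pair each layer's input size with its output size: (17, 8), (8, 4), (4, 2)
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)  # [144, 36, 10] 190
```

Once the encoder is built, `encoder.count_params()` should report the same total.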

**TASK: Create a decoder. In the solutions we built one that went 2-->4-->8-->17**

In [None]:
decoder = Sequential()
# Mirror of the encoder: expand the 2-dimensional code back to 17 features
decoder.add(Dense(units=4, activation='relu', input_shape=[2]))
decoder.add(Dense(units=8, activation='relu'))
decoder.add(Dense(units=17, activation='relu'))

**TASK: Combine the encoder and decoder to be an autoencoder and compile the model.**

In [None]:
autoencoder = Sequential([encoder,decoder])
autoencoder.compile(loss="mse", optimizer=SGD(learning_rate=1.5))
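
For reference, the `"mse"` loss compares each reconstructed value with its input and averages the squared differences. A minimal NumPy sketch of that arithmetic, using made-up numbers (the arrays `x` and `x_hat` are illustrative and are not taken from the dataset):

```python
import numpy as np

# Two "samples" with three features each, and an imperfect reconstruction
x = np.array([[0.0, 0.5, 1.0],
              [1.0, 0.0, 0.5]])
x_hat = np.array([[0.0, 0.5, 0.5],
                  [1.0, 0.5, 0.5]])

# Mean of the squared element-wise errors
mse = np.mean((x - x_hat) ** 2)
print(mse)
```

Two of the six entries are off by 0.5, so the result is `(0.25 + 0.25) / 6`. For an autoencoder the input is also the target, which is why the same array is later passed twice to `fit`.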

**TASK: Create a MinMaxScaler to scale the data. Make sure to scale the transposed DataFrame, since we really have 17 feature columns and only 4 rows (one per country).**

In [None]:
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(df)

In [None]:
scaled_data
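
With its default settings, `MinMaxScaler` rescales each column independently to the range [0, 1] using `(x - min) / (max - min)`. The sketch below reproduces that arithmetic by hand on a toy array (`toy` is made-up data, not the food table):

```python
import numpy as np

# 4 rows (think: countries) x 2 columns (think: foods), values invented
toy = np.array([[100.0, 10.0],
                [150.0, 30.0],
                [200.0, 20.0],
                [120.0, 40.0]])

col_min = toy.min(axis=0)  # per-column minimum
col_max = toy.max(axis=0)  # per-column maximum
scaled_by_hand = (toy - col_min) / (col_max - col_min)
print(scaled_by_hand)
```

Each column ends up with a 0 at its smallest value and a 1 at its largest. This is why the orientation matters: the scaling is applied per column, so the foods need to be the columns.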

**TASK: Fit the autoencoder to the scaled data for 15 epochs.**

In [None]:
autoencoder.fit(scaled_data,scaled_data,epochs=15)

**TASK: Run the scaled data through only the encoder and predict the reduced-dimensionality output. Note: You will most likely get different results than us due to random initializations.**

In [None]:
encoder_pred = encoder.predict(scaled_data) # Dimension reduction, use only encoder
encoder_pred

**TASK: Join the encoded two-dimensional data with the original countries index. Triple check the index order to make sure it's joined correctly. There are many ways to do this with pandas.**

In [None]:
indexes = df.index
indexes

df1 = pd.DataFrame(data=encoder_pred, index=indexes, columns=["C1", "C2"])
df1.reset_index(inplace=True)
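
One way to double-check the join is to confirm that the new frame's index matches the original row order exactly. The sketch below uses a stand-in array in place of `encoder_pred` and an assumed set of country labels, so the values themselves are meaningless:

```python
import numpy as np
import pandas as pd

# Assumed labels and a stand-in for encoder_pred (4 rows x 2 encoded dimensions)
countries = pd.Index(["England", "Wales", "Scotland", "N.Ireland"])
fake_pred = np.arange(8, dtype=float).reshape(4, 2)

check = pd.DataFrame(fake_pred, index=countries, columns=["C1", "C2"])

# Row i of the prediction array must carry label i of the original index
assert check.index.equals(countries)
assert check.loc["Scotland", "C1"] == fake_pred[2, 0]
```

Because `predict` returns rows in the same order as its input, passing the original index directly to the `DataFrame` constructor keeps labels and rows aligned.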

**TASK: Now plot out these results in a scatterplot, labeled by their respective country. You should see N. Ireland further away from the other points (but not necessarily to the left or the right, could be centered further away from the others).**

In [None]:
sns.scatterplot(x='C1',y='C2',data=df1,hue='index')

Once we go back and look at the data in the table, this makes sense: the Northern Irish eat far more grams of fresh potatoes and far fewer of fresh fruits, cheese, fish and alcoholic drinks. It's a good sign that the structure we've visualized reflects a big fact of real-world geography: Northern Ireland is the only one of the four countries not on the island of Great Britain. (If you're confused about the differences among England, the UK and Great Britain, see this [video](https://www.youtube.com/watch?v=rNu8XDBSn10).)
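
A quick way to check this kind of claim against the table is to compare one country's row with the mean of the other rows. A minimal sketch, assuming a frame shaped like the transposed `df` (countries as rows, foods as columns); the function name `deviation_from_others` and the toy numbers are invented for illustration and are not the real consumption figures:

```python
import pandas as pd

def deviation_from_others(frame, row_label):
    """Relative difference between one row and the mean of all other rows."""
    others_mean = frame.drop(index=row_label).mean()
    return ((frame.loc[row_label] - others_mean) / others_mean).sort_values()

# Toy frame: row "z" eats twice as much food_a and half as much food_b
toy = pd.DataFrame(
    {"food_a": [10.0, 10.0, 10.0, 20.0],
     "food_b": [40.0, 40.0, 40.0, 20.0],
     "food_c": [5.0, 5.0, 5.0, 5.0]},
    index=["w", "x", "y", "z"],
)

rel = deviation_from_others(toy, "z")
print(rel)
```

On the real data, a call such as `deviation_from_others(df, 'N.Ireland')` would list the foods from most under-consumed to most over-consumed, assuming that is how the row is labeled in the CSV.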