# Hello TensorFlow 

In this notebook, you will create your first neural network with Tensorflow. We will be using our familiar penguins dataset for this assignment. This notebook is loosely based off of [Tensorflow's beginner notebook](https://www.tensorflow.org/tutorials/quickstart/beginner).

## 1. Set up TensorFlow

1. Make sure that tensorflow is properly installed via pip in a venv before running this cell. 
2. Activate your venv: Select kernal -> select another kernal -> python environments -> dl-env (or whatever you named your environment)
2. Run this cell to import TensorFlow in python and ensure it works properly

In [41]:
# Run this cell to import libraries and check that tensorflow is properly installed
import pandas as pd
import numpy as np
import math
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
import plotly.express as px
import tensorflow as tf

print("TensorFlow version:", tf.__version__)

TensorFlow version: 2.20.0


# Load your data

Load your penguins dataset. You will want to make sure you hand NA values, encode strings as numbers, and split your data. In this lab we will be predicting body_mass_g rather than species, so make sure to set y to be body_mass_g.

In [None]:
# TODO Load penguins.csv
data = pd.read_csv("classifcation_and_seqs_aln.csv")

#TODO Handle NA Values
data = data.dropna()


# TODO encode string data using LabelEncoder
encoder = LabelEncoder()
data["species"] = encoder.fit_transform(data["species"])
data["island"] = encoder.fit_transform(data["island"])
data["sex"] = encoder.fit_transform(data["sex"])

#TODO Select your features. Select body_mass_g as your "target" (y) and everything else as X
y = data["species"]
X = data[["island", "culmen_length_mm", "culmen_depth_mm", "flipper_length_mm", "body_mass_g", "sex"]]

# TODO : Split the data into testing and training data. Use a 20% split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.8, random_state = 42)


In [43]:
# run this to see if you implemented the above block correctly
assert math.isclose(len(X_train), .8*len(data), rel_tol=1), f"\033[91mExpected {.8*len(data)} but got {len(X_train)}\033[0m"
assert math.isclose(len(X_test), .2*len(data), rel_tol=1), f"\033[91mExpected {.2*len(data)} but got {len(X_test)}\033[0m"

## Build a machine learning model

Let's now built our first neural network with tensorflow. Using tensorflow, we can define each layer of our neural network, specifying what types of activation function we want to use and how many neurons there should be in each layer. 

In [44]:
# TODO create a neural network with tensorflow
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(6, input_shape=[6]),
    tf.keras.layers.Dense(10, activation='sigmoid'),
    tf.keras.layers.Dense(15, activation='relu'),
    tf.keras.layers.Softmax()
])


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



Before you start training, configure and compile the model using Keras `Model.compile`. Set the [`optimizer`](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers) class to `adam`, and set the `loss` to the `mse`.

In [45]:
# TODO set your learning rate
lr = 0.004

#TODO Compile your model with a selected optimizer and loss function
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
    metrics=['accuracy']
)

## Train and evaluate your model

Use the `Model.fit` method to adjust your model parameters and minimize the loss. Creating the variable history to store the training output will allow us to graph our loss later on. 

In [46]:
# TODO: fit your model with X_train and Y_train
history = model.fit(X_train, y_train, epochs=1000)

Epoch 1/1000


[1m3/3[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 8ms/step - accuracy: 0.0000e+00 - loss: 2.7556  
Epoch 2/1000
[1m3/3[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - accuracy: 0.0000e+00 - loss: 2.6983 
Epoch 3/1000
[1m3/3[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 11ms/step - accuracy: 0.0000e+00 - loss: 2.6434
Epoch 4/1000
[1m3/3[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 7ms/step - accuracy: 0.0000e+00 - loss: 2.5914 
Epoch 5/1000
[1m3/3[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - accuracy: 0.0000e+00 - loss: 2.5395 
Epoch 6/1000
[1m3/3[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 7ms/step - accuracy: 0.3788 - loss: 2.4923 
Epoch 7/1000
[1m3/3[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - accuracy: 0.3788 - loss: 2.4489 
Epoch 8/1000
[1m3/3[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 19ms/step - accuracy: 0.3788 - loss: 2.4086
Epoch 9/1000
[1m3/3[0m [32m━━━━━━━━━━

In [47]:
#Run this cell to graph your loss
df = pd.DataFrame(history.history)['loss']
px.scatter(df).show()

# Make predictions
Now run the below code block to see your average error. Your *minimum* goal is to get within an average error of 100.

In [48]:
# TODO generate some predictions using Y_test
predictions = model.predict(X_test)

[1m1/9[0m [32m━━[0m[37m━━━━━━━━━━━━━━━━━━[0m [1m0s[0m 53ms/step

[1m9/9[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step 


Now run the below code block to see your average error.

In [49]:
# Run this cell to calcuate your mean average error based on Y_test
# error = y_test.squeeze() - predictions.ravel()
# print("Your average error is: ", error.mean())

In [50]:
if abs(error.mean()) > 100:
    print("\033[91mYour model should be a bit more accurate\033[0m")
else:
    print("\033[92mYour model is accurate enough!\033[0m")

NameError: name 'error' is not defined