<a href="https://colab.research.google.com/github/princeKike27/Zoo-Animal-Classification/blob/main/Zoo_Animal_Classifcation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Zoo Animal Classification

- Using NumPy, I'm going to build a Neural Network that classifies animals from the São Paulo Zoo

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Zoo Data

- The model uses the following Features:
  - *hair* ⟶ 1 if the animal has hair
  - *feathers* ⟶ 1 if the animal has feathers
  - *eggs* ⟶ 1 if the animal lays eggs
  - *milk* ⟶ 1 if the animal feeds milk to its offspring
  - *airborne* ⟶ 1 if the animal flies
  - *aquatic* ⟶ 1 if the animal lives in water
  - *predator* ⟶ 1 if the animal hunts other animals
  - *toothed* ⟶ 1 if the animal has teeth
  - *backbone* ⟶ 1 if the animal has a backbone
  - *breathes* ⟶ 1 if the animal breathes air
  - *venomous* ⟶ 1 if the animal is venomous
  - *fins* ⟶ 1 if the animal has fins
  - *legs* ⟶ number of legs
  - *tail* ⟶ 1 if the animal has a tail
  - *domestic* ⟶ 1 if it is a domestic animal
  - *catsize*



In [1]:
# import modules
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# set plot style
sns.set()

In [2]:
# load dataset
zoo_df = pd.read_csv('https://raw.githubusercontent.com/princeKike27/Zoo-Animal-Classification/main/zoo.csv', sep=';')

zoo_df

Unnamed: 0,animal_name,hair,feathers,eggs,milk,airborne,aquatic,predator,toothed,backbone,breathes,venomous,fins,legs,tail,domestic,catsize,class_type
0,aardvark,1,0,0,1,0,0,1,1,1,1,0,0,4,0,0,1,1
1,antelope,1,0,0,1,0,0,0,1,1,1,0,0,4,1,0,1,1
2,bass,0,0,1,0,0,1,1,1,1,0,0,1,0,1,0,0,4
3,bear,1,0,0,1,0,0,1,1,1,1,0,0,4,0,0,1,1
4,boar,1,0,0,1,0,0,1,1,1,1,0,0,4,1,0,1,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
209,vespa,0,0,1,0,1,0,1,0,0,1,1,0,6,0,0,0,6
210,bicho-pau,0,0,1,0,0,0,0,0,0,1,0,0,6,0,0,0,7
211,caracol-da-mata-atlantica,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,7
212,caranguejeira,1,0,1,0,0,0,1,0,0,1,1,0,8,0,0,0,7


In [13]:
# normalize legs >> min-max scale to the range [0, 1]
zoo_df['legs'] = (zoo_df.legs - np.min(zoo_df.legs)) / (np.max(zoo_df.legs) - np.min(zoo_df.legs))

zoo_df.iloc[:, 1:]

Unnamed: 0,hair,feathers,eggs,milk,airborne,aquatic,predator,toothed,backbone,breathes,venomous,fins,legs,tail,domestic,catsize,class_type
0,1,0,0,1,0,0,1,1,1,1,0,0,0.50,0,0,1,1
1,1,0,0,1,0,0,0,1,1,1,0,0,0.50,1,0,1,1
2,0,0,1,0,0,1,1,1,1,0,0,1,0.00,1,0,0,4
3,1,0,0,1,0,0,1,1,1,1,0,0,0.50,0,0,1,1
4,1,0,0,1,0,0,1,1,1,1,0,0,0.50,1,0,1,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
209,0,0,1,0,1,0,1,0,0,1,1,0,0.75,0,0,0,6
210,0,0,1,0,0,0,0,0,0,1,0,0,0.75,0,0,0,7
211,0,0,1,0,0,0,0,0,0,1,0,0,0.00,0,0,0,7
212,1,0,1,0,0,0,1,0,0,1,1,0,1.00,0,0,0,7
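
The rescaling of *legs* above is min-max scaling, $(x - x_{min}) / (x_{max} - x_{min})$. As a quick check, here is a minimal sketch on a hypothetical array of leg counts; the values are illustrative and assume the column spans 0 to 8 legs, which matches the scaled values shown in the output (4 legs ⟶ 0.50, 6 legs ⟶ 0.75, 8 legs ⟶ 1.00).

```python
import numpy as np

# Hypothetical leg counts (assumed range 0 to 8, not read from zoo.csv)
legs = np.array([0, 2, 4, 5, 6, 8], dtype=float)

# Min-max scaling: (x - min) / (max - min)
legs_scaled = (legs - legs.min()) / (legs.max() - legs.min())

print(legs_scaled)
```

Scaling keeps *legs* on the same 0 to 1 range as the binary features, so no single input dominates because of its units.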


In [12]:
# create zoo array
zoo_array = np.array(zoo_df.iloc[:, 1:])

# check shape
print(f'Zoo array Shape: {zoo_array.shape}', '\n')

zoo_array

Zoo array Shape: (214, 17) 



array([[1., 0., 0., ..., 0., 1., 1.],
       [1., 0., 0., ..., 0., 1., 1.],
       [0., 0., 1., ..., 0., 0., 4.],
       ...,
       [0., 0., 1., ..., 0., 0., 7.],
       [1., 0., 1., ..., 0., 0., 7.],
       [1., 0., 1., ..., 0., 0., 7.]])

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)

## Train Test Split

- We shuffle the rows of the Zoo array randomly and then split it as follows:

  - Train ⟶ 70% of data ⟶ 150 examples
  - Test ⟶ 30% of data ⟶ 64 examples
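
The split can be sketched as follows. This is a minimal illustration on a random stand-in array with the same shape as `zoo_array`, not the notebook's data. It uses a seeded `np.random.default_rng` generator so the shuffle is reproducible between runs; the seed value is an arbitrary assumption.

```python
import numpy as np

# Seeded generator; the seed value 27 is an arbitrary assumption
rng = np.random.default_rng(seed=27)

# Hypothetical stand-in for zoo_array: 214 rows, 17 columns
data = rng.integers(0, 2, size=(214, 17)).astype(float)

# Shuffle the rows in place, then split at row 150
rng.shuffle(data)
train, test = data[:150], data[150:]

print(train.shape, test.shape)
print(f'train fraction: {train.shape[0] / data.shape[0]:.3f}')
```

Shuffling the full array before slicing keeps each row's features and label together, so `X` and `Y` stay aligned after the split.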

In [14]:
# shuffle randomly
np.random.shuffle(zoo_array)

zoo_array

array([[0., 0., 1., ..., 0., 1., 3.],
       [0., 0., 1., ..., 0., 0., 3.],
       [0., 0., 1., ..., 0., 0., 4.],
       ...,
       [1., 0., 0., ..., 0., 1., 1.],
       [1., 0., 1., ..., 0., 0., 6.],
       [0., 1., 1., ..., 1., 1., 2.]])

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

### $X$ Inputs

In [16]:
# X inputs array
X = zoo_array[:, 0:16]

# check shape
print(f'X Shape: {X.shape}', '\n')

X

X Shape: (214, 16) 



array([[0., 0., 1., ..., 1., 0., 1.],
       [0., 0., 1., ..., 1., 0., 0.],
       [0., 0., 1., ..., 1., 0., 0.],
       ...,
       [1., 0., 0., ..., 1., 0., 1.],
       [1., 0., 1., ..., 0., 0., 0.],
       [0., 1., 1., ..., 1., 1., 1.]])

In [17]:
# X_train >> 70% of data
X_train = X[:150, :]

# check shape
print(f'X_train Shape: {X_train.shape}', '\n')

X_train

X_train Shape: (150, 16) 



array([[0., 0., 1., ..., 1., 0., 1.],
       [0., 0., 1., ..., 1., 0., 0.],
       [0., 0., 1., ..., 1., 0., 0.],
       ...,
       [0., 1., 1., ..., 1., 0., 1.],
       [1., 0., 0., ..., 1., 1., 0.],
       [1., 0., 0., ..., 1., 0., 0.]])

In [18]:
# X_test >> 30% of data
X_test = X[150:, :]

# check shape
print(f'X_test Shape: {X_test.shape}', '\n')

X_test

X_test Shape: (64, 16) 



array([[1., 0., 0., ..., 1., 0., 0.],
       [0., 0., 1., ..., 0., 0., 0.],
       [0., 0., 1., ..., 1., 0., 0.],
       ...,
       [1., 0., 0., ..., 1., 0., 1.],
       [1., 0., 1., ..., 0., 0., 0.],
       [0., 1., 1., ..., 1., 1., 1.]])

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

### $X$ Transpose

- Inputs $X$ need to be Transposed so that each column holds one example (shape: features × examples) before they are fed to the Neural Network
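
To see why the transposed layout is convenient, here is a minimal sketch that assumes the common convention where a layer computes $Z = WX + b$ with one example per column. The layer size `n_units`, the weights `W` and the bias `b` are hypothetical; the notebook has not defined the network at this point.

```python
import numpy as np

rng = np.random.default_rng(0)

# n_units is a hypothetical layer size chosen for illustration
n_features, n_examples, n_units = 16, 150, 10

X_T = rng.random((n_features, n_examples))      # stand-in for X_train_T
W = rng.standard_normal((n_units, n_features))  # hypothetical weights
b = np.zeros((n_units, 1))                      # hypothetical bias

# (10, 16) @ (16, 150) -> (10, 150); b broadcasts across the 150 columns
Z = W @ X_T + b
print(Z.shape)
```

With the features on the rows, the inner dimensions of `W @ X_T` line up and every column of `Z` corresponds to one animal.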

In [19]:
# X_train Transpose
X_train_T = X_train.T 

# check shape
print(f'X_train Transpose Shape: {X_train_T.shape}', '\n')

X_train_T

X_train Transpose Shape: (16, 150) 



array([[0., 0., 0., ..., 0., 1., 1.],
       [0., 0., 0., ..., 1., 0., 0.],
       [1., 1., 1., ..., 1., 0., 0.],
       ...,
       [1., 1., 1., ..., 1., 1., 1.],
       [0., 0., 0., ..., 0., 1., 0.],
       [1., 0., 0., ..., 1., 0., 0.]])

In [20]:
# X_test Transpose
X_test_T = X_test.T

# check shape
print(f'X_test Transpose Shape: {X_test_T.shape}', '\n')

X_test_T

X_test Transpose Shape: (16, 64) 



array([[1., 0., 0., ..., 1., 1., 0.],
       [0., 0., 0., ..., 0., 0., 1.],
       [0., 1., 1., ..., 0., 1., 1.],
       ...,
       [1., 0., 1., ..., 1., 0., 1.],
       [0., 0., 0., ..., 0., 0., 1.],
       [0., 0., 0., ..., 1., 0., 1.]])

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

### $Y$ Labels

- Each animal belongs to one of the following classes:

  - $1 ⟶$ Mammal
  - $2 ⟶$ Bird
  - $3 ⟶$ Reptile
  - $4 ⟶$ Fish
  - $5 ⟶$ Amphibian
  - $6 ⟶$ Bug
  - $7 ⟶$ Invertebrate
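
For readable output it can help to map the integer labels back to class names. The sketch below copies the mapping from the list above; the names `CLASS_NAMES` and `label_names` are hypothetical helpers, not part of the notebook.

```python
import numpy as np

# Class ids and names, copied from the list above
CLASS_NAMES = {
    1: 'Mammal', 2: 'Bird', 3: 'Reptile', 4: 'Fish',
    5: 'Amphibian', 6: 'Bug', 7: 'Invertebrate',
}

def label_names(labels):
    """Map an iterable of integer class ids to their names."""
    return [CLASS_NAMES[int(label)] for label in labels]

print(label_names(np.array([3, 3, 4, 6, 2])))
```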

In [31]:
# Y labels array
Y = zoo_array[:, -1].astype(int)

# check shape
print(f'Y Shape: {Y.shape}', '\n')

Y

Y Shape: (214,) 



array([3, 3, 4, 6, 2, 2, 7, 4, 2, 2, 3, 1, 3, 4, 7, 2, 1, 1, 2, 3, 4, 3,
       4, 5, 3, 2, 1, 4, 5, 4, 1, 4, 3, 1, 6, 3, 2, 5, 2, 5, 1, 1, 3, 6,
       2, 1, 3, 6, 7, 5, 5, 1, 7, 2, 3, 3, 5, 6, 1, 4, 2, 4, 1, 7, 3, 2,
       2, 4, 2, 5, 5, 3, 1, 2, 2, 7, 5, 2, 6, 1, 1, 4, 1, 2, 3, 7, 1, 1,
       4, 5, 7, 6, 3, 1, 4, 2, 7, 2, 1, 1, 1, 1, 1, 1, 1, 6, 2, 7, 4, 1,
       2, 1, 4, 4, 3, 3, 7, 2, 2, 1, 4, 3, 1, 6, 7, 6, 2, 6, 1, 2, 7, 1,
       6, 2, 2, 7, 2, 1, 6, 3, 5, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 7, 4, 1,
       1, 2, 6, 6, 3, 1, 3, 6, 2, 2, 1, 2, 6, 5, 1, 4, 1, 7, 1, 6, 1, 3,
       5, 4, 7, 1, 7, 1, 7, 6, 7, 6, 4, 2, 2, 4, 4, 1, 1, 4, 6, 1, 6, 1,
       6, 4, 4, 6, 1, 5, 2, 2, 1, 3, 1, 4, 5, 1, 6, 2])

In [32]:
# Y_train >> 70% of data
Y_train = Y[:150]

# check shape
print(f'Y_train Shape: {Y_train.shape}', '\n')

Y_train

Y_train Shape: (150,) 



array([3, 3, 4, 6, 2, 2, 7, 4, 2, 2, 3, 1, 3, 4, 7, 2, 1, 1, 2, 3, 4, 3,
       4, 5, 3, 2, 1, 4, 5, 4, 1, 4, 3, 1, 6, 3, 2, 5, 2, 5, 1, 1, 3, 6,
       2, 1, 3, 6, 7, 5, 5, 1, 7, 2, 3, 3, 5, 6, 1, 4, 2, 4, 1, 7, 3, 2,
       2, 4, 2, 5, 5, 3, 1, 2, 2, 7, 5, 2, 6, 1, 1, 4, 1, 2, 3, 7, 1, 1,
       4, 5, 7, 6, 3, 1, 4, 2, 7, 2, 1, 1, 1, 1, 1, 1, 1, 6, 2, 7, 4, 1,
       2, 1, 4, 4, 3, 3, 7, 2, 2, 1, 4, 3, 1, 6, 7, 6, 2, 6, 1, 2, 7, 1,
       6, 2, 2, 7, 2, 1, 6, 3, 5, 1, 1, 1, 1, 1, 1, 2, 1, 1])

In [33]:
# Y_test >> 30% of data
Y_test = Y[150:]

# check shape
print(f'Y_test Shape: {Y_test.shape}', '\n')

Y_test

Y_test Shape: (64,) 



array([1, 7, 4, 1, 1, 2, 6, 6, 3, 1, 3, 6, 2, 2, 1, 2, 6, 5, 1, 4, 1, 7,
       1, 6, 1, 3, 5, 4, 7, 1, 7, 1, 7, 6, 7, 6, 4, 2, 2, 4, 4, 1, 1, 4,
       6, 1, 6, 1, 6, 4, 4, 6, 1, 5, 2, 2, 1, 3, 1, 4, 5, 1, 6, 2])
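
Because the split is a plain slice of the shuffled array, nothing guarantees that every class appears in both sets in similar proportions. One way to check is to count the examples per class. The sketch below uses short hypothetical label vectors, and `class_counts` is an illustrative helper rather than part of the notebook.

```python
import numpy as np

# Hypothetical label vectors (illustrative values, not the notebook's arrays)
y_train_demo = np.array([1, 1, 2, 3, 4, 6, 7, 1, 2, 5])
y_test_demo = np.array([1, 2, 2, 6, 7])

def class_counts(y, n_classes=7):
    """Count how many examples fall in each class 1..n_classes."""
    # Index 0 of bincount is dropped because labels start at 1
    return np.bincount(y, minlength=n_classes + 1)[1:]

print('train:', class_counts(y_train_demo))
print('test: ', class_counts(y_test_demo))
```

A zero in either count vector would mean that class is missing from that split.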

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

### One Hot Encode $Y$

- We need to One Hot Encode $Y$ so it can be used in the Neural Network

  - For example, if $y_i = 3$ (Reptile), then:

    - $y = [0, 0, 1, 0, 0, 0, 0]$
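
The example above can be reproduced for a single label with a short sketch. It indexes a row of the identity matrix from `np.eye`, which is an alternative to filling a zeros array and is shown here only for illustration.

```python
import numpy as np

n_classes = 7
y_i = 3  # Reptile

# Row y_i - 1 of the identity matrix is the one-hot vector for class y_i
y_vec = np.eye(n_classes)[y_i - 1]
print(y_vec)
```

The `- 1` offset is needed because the class labels start at 1 while NumPy indices start at 0.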

In [36]:
# create zeros array of shape >> Y_train.size x 7
hot_array = np.zeros((Y_train.size, 7))

# check shape
print(f'hot_array Shape: {hot_array.shape}', '\n')

# check first 10 rows
hot_array[:10, :]

hot_array Shape: (150, 7) 



array([[0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])

In [41]:
# for each row >> place a 1 at position y - 1
hot_array[np.arange(Y_train.size), Y_train - 1] = 1

# check shape
print(f'hot_array Shape: {hot_array.shape}', '\n')

# check first 10 rows
print(hot_array[:10, :], '\n')

# first 10 labels
print('First 10 labels:')
print(Y_train[:10])

hot_array Shape: (150, 7) 

[[0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 0. 1. 0.]
 [0. 1. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 1. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0.]] 

First 10 labels:
[3 3 4 6 2 2 7 4 2 2]


In [49]:
# create function to one hot encode Y
def one_hot_Y(y):

  # zeros array of shape >> y.size x 7
  y_hot_array = np.zeros((y.size, 7))
  # for each row >> place 1 at position y - 1
  y_hot_array[np.arange(y.size), y - 1] = 1

  return y_hot_array

In [50]:
# test function
y_hot_test = one_hot_Y(Y_train)

# check shape
print(f'y_hot_test Shape: {y_hot_test.shape}', '\n')

# check first 10 rows
y_hot_test[:10, :]

y_hot_test Shape: (150, 7) 



array([[0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0.],
       [0., 1., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1.],
       [0., 0., 0., 1., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0.]])
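
A simple sanity check for the encoding is a round trip: decoding the one-hot rows with `np.argmax` should give back the original labels. The sketch below re-implements the same idea as `one_hot_Y` under the hypothetical names `one_hot` and `decode`, so that it runs on its own.

```python
import numpy as np

def one_hot(y, n_classes=7):
    """One-hot encode 1-based integer labels (same idea as one_hot_Y)."""
    encoded = np.zeros((y.size, n_classes))
    encoded[np.arange(y.size), y - 1] = 1
    return encoded

def decode(encoded):
    """Recover 1-based labels from one-hot rows."""
    return np.argmax(encoded, axis=1) + 1

labels = np.array([3, 3, 4, 6, 2, 2, 7, 4, 2, 2])  # first 10 labels of Y_train
encoded = one_hot(labels)

assert np.array_equal(decode(encoded), labels)  # round trip is lossless
assert np.all(encoded.sum(axis=1) == 1)         # exactly one 1 per row
print(decode(encoded))
```

The same `argmax` step is typically how a predicted class is read off a vector of output scores.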

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)

## Neural Network Model

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)