# AI in Full Bloom: Classifying Iris Flowers with Code

After the workshop, you may be wondering how **you** could build your own AIs and use them to solve problems you care about. Fear not! I, Isita, have prepared this *lovely* (if I do say so myself) Google Colab notebook, with a tutorial for an Artificial Neural Network and a Decision Tree classifier. The code runs on a public dataset of Iris flowers, which is small and simple enough for beginners. For your own projects, you can find datasets about other problems, like pollution or energy usage.

# Let Us Now Embark on Our Journey!







## The Iris Dataset

Before we get started with using scikit-learn, we need to decide what machine learning task we want to accomplish with it. 

In this workshop, we'll do classification with the **Artificial Neural Network** algorithm, built with Keras, while scikit-learn handles the data preparation. We will also try the **Decision Tree** algorithm in this notebook. We'll use a popular sample dataset, often referred to as "the Iris dataset", that is widely used to demonstrate machine learning algorithms ([link to dataset here](https://archive.ics.uci.edu/ml/datasets/Iris)).

There are different species of [Iris](https://en.wikipedia.org/wiki/Iris_(plant)) flowers. Our goal is to train a machine learning system that can take a new Iris flower and predict which Iris species it is.

![alt text](https://cdn.pixabay.com/photo/2015/05/26/13/57/flower-784688__340.jpg)

This dataset has:

* 150 examples
* 3 classes: setosa, versicolor, and virginica
* 4 features: sepal length, sepal width, petal length, petal width
* 50 examples for each class

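This dataset can be imported directly from scikit-learn, the Python module, so we do not have to upload a huge file. (In this part of the notebook we read the data from a small CSV file on the web instead; the Decision Tree part uses scikit-learn's built-in copy.)

Let's load the dataset. Note that scikit-learn is called `sklearn` in Python code.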
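
As a quick check of the numbers listed above, here is a minimal sketch that loads scikit-learn's bundled copy of the dataset with `load_iris` and counts the examples, features, and classes. The variable names are only for illustration; the notebook itself reads the data from a CSV file in the next cell.

```python
import numpy as np
from sklearn.datasets import load_iris

# Load the copy of the Iris dataset that ships with scikit-learn
iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)                  # (number of examples, number of features)
print(list(iris.target_names))  # names of the three classes
print(np.bincount(y))           # number of examples in each class
```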

# Importing and Cleaning Up the Dataset

Here we import all the libraries we will need and load the dataset.

In [None]:
# Import required libraries
import keras  # library for neural networks
import pandas as pd  # loading data in table form, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns  # visualisation
import matplotlib.pyplot as plt  # visualisation
import numpy as np  # linear algebra
from sklearn.preprocessing import normalize  # scikit-learn preprocessing helper

# Load the dataset, which contains the data points (sepal length, petal length, etc.)
# and the corresponding labels (type of iris)
iris_dataset = pd.read_csv("https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv")
print("success")

# This is a debug statement to make sure we loaded the dataset correctly.
# It is commented out for normal runs; uncomment it to inspect the table.
# print(iris_dataset)

success


In [None]:
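# Replace each species name with a number: setosa=0, versicolor=1, virginica=2
iris_dataset.loc[iris_dataset["species"]=="setosa","species"]=0
iris_dataset.loc[iris_dataset["species"]=="versicolor","species"]=1
iris_dataset.loc[iris_dataset["species"]=="virginica","species"]=2
# Store the labels as integers rather than generic Python objects
iris_dataset["species"]=iris_dataset["species"].astype(int)
iris_dataset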

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
0,5.1,3.5,1.4,0.2,0
1,4.9,3.0,1.4,0.2,0
2,4.7,3.2,1.3,0.2,0
3,4.6,3.1,1.5,0.2,0
4,5.0,3.6,1.4,0.2,0
...,...,...,...,...,...
145,6.7,3.0,5.2,2.3,2
146,6.3,2.5,5.0,1.9,2
147,6.5,3.0,5.2,2.0,2
148,6.2,3.4,5.4,2.3,2


## Visualizing the Data

To get a sense of how our data is distributed, we can plot it against pairs of features, such as sepal length and petal width. From the graphs, we can also see the general characteristics of each type of iris.

In [None]:
# Recent seaborn versions require x and y to be passed as keyword arguments
sns.lmplot(x='sepal_length', y='sepal_width',
           data=iris_dataset,
           fit_reg=False,
           hue="species",
           scatter_kws={"marker": "D",
                        "s": 50})
plt.title('SepalLength vs SepalWidth')

sns.lmplot(x='petal_length', y='petal_width',
           data=iris_dataset,
           fit_reg=False,
           hue="species",
           scatter_kws={"marker": "D",
                        "s": 50})
plt.title('PetalLength vs PetalWidth')

sns.lmplot(x='sepal_length', y='petal_length',
           data=iris_dataset,
           fit_reg=False,
           hue="species",
           scatter_kws={"marker": "D",
                        "s": 50})
plt.title('SepalLength vs PetalLength')

sns.lmplot(x='sepal_width', y='petal_width',
           data=iris_dataset,
           fit_reg=False,
           hue="species",
           scatter_kws={"marker": "D",
                        "s": 50})
plt.title('SepalWidth vs PetalWidth')
plt.show()

Notice that each species cluster sits in a somewhat distinct section of the graph. From this we can find general rules for guessing the type of iris ourselves. Example: if my flower has a small sepal length and a small petal length, what type might it be? (Look at the graphs.)

Answer: It's probably species 0, or 'setosa', because in the third graph, 'Sepal Length' versus 'Petal Length', the blue dots are clustered at the lower left, nearest the origin.
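
To back up that reading of the graphs with numbers, here is a small sketch that compares petal lengths across species. It assumes scikit-learn's bundled copy of the dataset (`load_iris`) rather than the CSV used above; in that copy, column index 2 is petal length.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
petal_length = iris.data[:, 2]   # column 2 is petal length (cm)
species = iris.target            # 0 = setosa, 1 = versicolor, 2 = virginica

# Smallest and largest petal length for each species
for label, name in enumerate(iris.target_names):
    values = petal_length[species == label]
    print(name, values.min(), values.max())

# True if every setosa petal is shorter than every petal of the other two species
setosa_is_separate = petal_length[species == 0].max() < petal_length[species != 0].min()
print(setosa_is_separate)
```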

## ANN: Splitting up the training and test sets

Remember that in ANN classification, a type of supervised machine learning, we must use a training set to teach our model how to correctly classify future examples. We also use a test set to measure how good our model is.

The first step that we'll do is break up the Iris dataset into training set and test set:

In [4]:
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from keras.utils import to_categorical  # np_utils is not available in newer Keras versions


# Break the dataset up into the examples (X) and their labels (y)
X = iris_dataset.iloc[:, 0:4].values
y = iris_dataset.iloc[:, 4].values.astype(int)

# Rescale each example (row) so that it has unit length
scaler = preprocessing.Normalizer().fit(X)
X = scaler.transform(X)
# X=normalize(X,axis=0)

# Split up the X and y datasets randomly into train and test sets
# 20% of the dataset will be used for the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=31)

# Change the labels to one-hot vectors
# [0]--->[1 0 0]
# [1]--->[0 1 0]
# [2]--->[0 0 1]
y_train = to_categorical(y_train, num_classes=3)
y_test = to_categorical(y_test, num_classes=3)



Our training set contains all of the correct targets (classes) for our flower examples, along with the four features of each flower. We'll need all of this information to teach our classifier how to predict a class given a new set of four features.

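**Note: The split itself is reproducible because we pass `random_state=31`; change or remove that argument to get a different random split. The network's starting weights are still random, so the accuracy can differ from run to run.**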
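
The two preprocessing steps above can be reproduced by hand. This is a sketch with NumPy only, using three made-up flowers: `Normalizer` (with its default L2 norm) divides each row by its length, and one-hot encoding turns a label into a row with a single 1.

```python
import numpy as np

# Three made-up flowers (sepal length, sepal width, petal length, petal width)
X = np.array([[5.1, 3.5, 1.4, 0.2],
              [6.3, 2.5, 5.0, 1.9],
              [5.9, 3.0, 4.2, 1.5]])
y = np.array([0, 2, 1])

# Row-wise L2 normalization: divide every row by its Euclidean length
row_lengths = np.linalg.norm(X, axis=1, keepdims=True)
X_normalized = X / row_lengths

# One-hot encoding: pick row k of the 3x3 identity matrix for label k
y_one_hot = np.eye(3)[y]

print(np.linalg.norm(X_normalized, axis=1))  # every row now has length 1
print(y_one_hot)
```

Note that this rescales each flower on its own, which is different from rescaling each feature column across all flowers.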

## ANN: Building our Network

Now that we've split our data into a training and test set, it's time to build our ANN.

As you may already know, neural networks look like this:


![alt text](https://www.researchgate.net/profile/Facundo_Bre/publication/321259051/figure/fig1/AS:614329250496529@1523478915726/Artificial-neural-network-architecture-ANN-i-h-1-h-2-h-n-o.png)


Here we create the neural network framework: how many layers it has, which activation functions it uses, and so on. You can experiment with the numbers as much as you want, but you might want to research "activation functions" before changing those.

We choose most of the parameters ourselves, except for the weights associated with each node. Those weights are learned by the network through training later on.
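
If you want a feel for the two activation functions used in the next cell before experimenting, here is a minimal NumPy sketch of them. These are the standard textbook definitions, not Keras's internal code, and the scores are made up.

```python
import numpy as np

def relu(z):
    # Keeps positive values and replaces negative values with 0
    return np.maximum(z, 0.0)

def softmax(z):
    # Turns a vector of scores into probabilities that add up to 1
    shifted = z - np.max(z)   # subtracting the max avoids overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum()

scores = np.array([2.0, -1.0, 0.5])
print(relu(scores))
print(softmax(scores))
```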

In [5]:
import keras
from keras.models import Sequential 
from keras.layers import Dense,Activation,Dropout 


# Initialising the ANN
model = Sequential()

# Adding the input layer and the first hidden layer
model.add(Dense(1000,input_dim=4,activation='relu'))
# Adding the second hidden layer
model.add(Dense(50,activation='relu'))

#Protects against overfitting
model.add(Dropout(0.2))

# Adding the output layer
model.add(Dense(3,activation='softmax'))

# Compiling the ANN
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])




## ANN: Training our Network

We have built the Neural Network and now we need to train it with our training data. The training process adjusts the weights of the nodes to match the relations between the features and labels of the training set.
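
To make "adjusts the weights" concrete, here is a toy sketch of a single weight update for a one-layer softmax classifier on one made-up flower. It uses plain gradient descent with an assumed learning rate of 0.5; the notebook's network has more layers and uses the Adam optimizer, so treat this only as an illustration of the idea.

```python
import numpy as np

def softmax(z):
    exp = np.exp(z - np.max(z))
    return exp / exp.sum()

def cross_entropy(probabilities, one_hot_label):
    # Loss is small when the true class gets a high probability
    return -np.sum(one_hot_label * np.log(probabilities))

x = np.array([0.80, 0.55, 0.22, 0.03])   # one made-up normalized flower
label = np.array([1.0, 0.0, 0.0])        # its one-hot label (class 0)
W = np.zeros((3, 4))                     # weights: 3 classes x 4 features
learning_rate = 0.5                      # assumed value for this sketch

loss_before = cross_entropy(softmax(W @ x), label)

# Gradient of the loss with respect to W for softmax + cross-entropy
gradient = np.outer(softmax(W @ x) - label, x)
W = W - learning_rate * gradient         # one gradient-descent step

loss_after = cross_entropy(softmax(W @ x), label)
print(loss_before, loss_after)           # the loss goes down after the update
```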

In [6]:
# Fitting the ANN to the Training set
model.fit(X_train,y_train,validation_data=(X_test,y_test),batch_size=20,epochs=10,verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f02087bbbb0>
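
As a rough sanity check on the training settings, this sketch works out how many weight updates the `fit` call above performs. It assumes the 80/20 split from earlier leaves 120 of the 150 flowers for training, and that Keras runs one update per batch.

```python
import math

n_examples = 150
n_test = 30          # 20% of 150 flowers are held out for testing
batch_size = 20
epochs = 10

n_train = n_examples - n_test                        # flowers left for training
batches_per_epoch = math.ceil(n_train / batch_size)  # a last batch may be smaller
total_updates = batches_per_epoch * epochs

print(n_train, batches_per_epoch, total_updates)
```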

## ANN: Prediction Accuracy
Now let's predict some values using the ANN we just created and trained. The network's weights were set during training, based on the patterns in the training data. We now use the trained network to predict the species of the data points in the test set.

In [7]:
# Predicting the Test set results: one row of 3 class probabilities per flower
prediction = model.predict(X_test)
length = len(prediction)

# Pick the most probable class for each flower and compare it with the true class
y_label = np.argmax(y_test, axis=1)
predict_label = np.argmax(prediction, axis=1)

# How many times the prediction matched / how many test cases there are
accuracy = np.sum(y_label == predict_label) / length * 100
print("Accuracy of the dataset", accuracy)

Accuracy of the dataset 60.0
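
The accuracy calculation above can be tried on a tiny hand-made example. In this sketch the probabilities are invented; each row stands for one flower and each column for one species.

```python
import numpy as np

# Made-up softmax outputs for four flowers (each row sums to 1)
prediction = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.5, 0.4],
                       [0.2, 0.3, 0.5],
                       [0.4, 0.35, 0.25]])
# True labels as one-hot vectors: classes 0, 1, 2, 1
y_test = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

predict_label = np.argmax(prediction, axis=1)  # most probable class per flower
y_label = np.argmax(y_test, axis=1)            # turn one-hot rows back into class numbers

accuracy = np.mean(predict_label == y_label) * 100
print(predict_label, y_label, accuracy)
```

The last flower has no probability above 0.5, which is why the class is picked with `argmax` rather than with a fixed threshold.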


## ANN: Tuning the Parameters

Artificial Neural Networks offer many places to experiment. The network structure can be altered, which has a significant impact on performance: hidden layers can be added or deleted, and the number of nodes in each can be changed. You can also change the training parameters, such as the number of epochs and the batch size, to see what gives the best accuracy.

Tuning both structural and training parameters is an essential step in developing a neural-network-based solution for any problem.

While this variability of parameters means ANNs can be tuned to a high degree of accuracy, it is hard for humans to find the same patterns that an ANN discerns and to understand its choices. That is why AI is **sometimes** called a ["Black Box Technique"](https://towardsdatascience.com/machine-learning-how-black-is-this-black-box-f11e4031fdf).

---
Let's try adding another hidden layer and see if it improves our accuracy.

### Modify existing model to match Triton server

In [8]:
# Wrap each example in two extra axes so the arrays have shape (n_examples, 1, 1, 4)
X_test = np.array([[[i]] for i in X_test])
X_train = np.array([[[i]] for i in X_train])
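
The list comprehension above wraps every example in two extra axes. As a sketch of what that does to the array shape, the made-up array below goes from `(3, 4)` to `(3, 1, 1, 4)`, and `reshape` gives the same result.

```python
import numpy as np

X = np.arange(12, dtype=float).reshape(3, 4)   # three made-up examples with 4 features

wrapped = np.array([[[row]] for row in X])     # same pattern as in the notebook
reshaped = X.reshape(-1, 1, 1, 4)              # equivalent, without a Python loop

print(X.shape, wrapped.shape, reshaped.shape)
print(np.array_equal(wrapped, reshaped))
```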

In [9]:
import keras
from keras.models import Sequential 
from keras.layers import Dense,Activation,Dropout,Input,Flatten


# Initialising the ANN
model = Sequential()
model.add(Input(shape=(1,1,4)))
model.add(Flatten())
# Adding the first hidden layer
model.add(Dense(1000,activation='relu'))
# Adding the second hidden layer
model.add(Dense(50,activation='relu'))

# Adding the new third hidden layer
model.add(Dense(300,activation='relu'))
#Protects against overfitting
model.add(Dropout(0.2))

# Adding the output layer
model.add(Dense(3,activation='softmax'))

# Compiling the ANN
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])

# Fitting the ANN to the Training set
model.fit(X_train,y_train,validation_data=(X_test,y_test),batch_size=20,epochs=10,verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f01e40de520>

In [10]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten (Flatten)           (None, 4)                 0         
                                                                 
 dense_3 (Dense)             (None, 1000)              5000      
                                                                 
 dense_4 (Dense)             (None, 50)                50050     
                                                                 
 dense_5 (Dense)             (None, 300)               15300     
                                                                 
 dropout_1 (Dropout)         (None, 300)               0         
                                                                 
 dense_6 (Dense)             (None, 3)                 903       
                                                                 
Total params: 71,253
Trainable params: 71,253
Non-trai
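
The parameter counts in the summary can be checked by hand: a `Dense` layer has one weight per input-output pair plus one bias per output. A small sketch, using the layer sizes from the model above:

```python
# (number of inputs, number of outputs) for each Dense layer in the model
dense_layers = [(4, 1000), (1000, 50), (50, 300), (300, 3)]

def dense_params(n_inputs, n_outputs):
    # weights (inputs x outputs) plus one bias for every output node
    return n_inputs * n_outputs + n_outputs

counts = [dense_params(n_in, n_out) for n_in, n_out in dense_layers]
print(counts)       # [5000, 50050, 15300, 903]
print(sum(counts))  # 71253
```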

In [11]:
# Predicting the Test set results: one row of 3 class probabilities per flower
prediction = model.predict(X_test)
length = len(prediction)

# Pick the most probable class for each flower and compare it with the true class
y_label = np.argmax(y_test, axis=1)
predict_label = np.argmax(prediction, axis=1)

# How many times the prediction matched / how many test cases there are
accuracy = np.sum(y_label == predict_label) / length * 100
print("Accuracy of the dataset", accuracy)

Accuracy of the dataset 93.33333333333333


In [12]:
y_label

array([1, 2, 0, 1, 2, 0, 2, 1, 0, 0, 2, 1, 2, 0, 2, 1, 1, 1, 2, 0, 2, 2,
       0, 2, 1, 0, 1, 1, 1, 1])

In [13]:
X_test

array([[[[0.70631892, 0.37838513, 0.5675777 , 0.18919257]]],


       [[[0.69417747, 0.30370264, 0.60740528, 0.2386235 ]]],


       [[[0.79594782, 0.55370283, 0.24224499, 0.03460643]]],


       [[[0.73446047, 0.37367287, 0.5411814 , 0.16750853]]],


       [[[0.71718148, 0.31640359, 0.58007326, 0.22148252]]],


       [[[0.82210585, 0.51381615, 0.23978087, 0.05138162]]],


       [[[0.73122464, 0.31338199, 0.56873028, 0.20892133]]],


       [[[0.73239618, 0.38547167, 0.53966034, 0.15418867]]],


       [[[0.77381111, 0.59732787, 0.2036345 , 0.05430253]]],


       [[[0.78591858, 0.57017622, 0.23115252, 0.06164067]]],


       [[[0.69385414, 0.29574111, 0.63698085, 0.15924521]]],


       [[[0.71524936, 0.40530797, 0.53643702, 0.19073316]]],


       [[[0.71066905, 0.35533453, 0.56853524, 0.21320072]]],


       [[[0.8025126 , 0.55989251, 0.20529392, 0.01866308]]],


       [[[0.69299099, 0.34199555, 0.60299216, 0.19799743]]],


       [[[0.70779525, 0.31850786, 0.60162596, 0.1887454

### Exporting model

In [16]:
model.save("model.h5")

# That's all Folks!
 
Thank you so much for going through this tutorial. I am confident you will now be able to use AI to change the world and save our planet!

If you have any more questions email me at : isitatalukdar@gmail.com

---

## Continue the Learning!

Now that you have one classification technique down, let's try another: **Decision Trees**. We will also discuss a way to analyze accuracy called a **Confusion Matrix**.


## Decision Trees: Importing and Cleaning up the Data
We already loaded the data in the beginning. Let's load it again, just in case you are starting to run the code from here.

In [None]:
#Import required libraries 
import pandas as pd #loading data in table form  
import numpy as np # linear algebra
from sklearn.tree import DecisionTreeClassifier #Creating the Decision Tree
from sklearn import tree #Visualizing the Decision Tree
import graphviz #Visualizing the Decision Tree
from sklearn.metrics import confusion_matrix #Confusion Matrix
import matplotlib.pyplot as plt #visualization
from sklearn.datasets import load_iris

# Load the dataset, which contains the examples and their labels
iris_dataset = load_iris()

## Decision Tree: Splitting up the training and test sets

Remember that in classification, which is a type of supervised machine learning, we must use a training set to teach our model how to correctly classify future examples. We also use a test set to test how good our model is.

The first step that we'll do is break up the Iris dataset into training set and test set:

In [None]:
from sklearn.model_selection import train_test_split

# Break the dataset up into the examples (X) and their labels (y)
X, y = iris_dataset.data, iris_dataset.target

# Split up the X and y datasets into train and test sets
# 25% of the dataset will be used for the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=31)


Our training set contains all of the correct targets (classes) for our flower examples, along with the four features of each flower. We'll need all of this information to teach our classifier how to predict a class given a new set of four features.

**Note: Because `random_state=31` is fixed, the split is the same on every run. Change or remove that argument to get a different random split.**
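
Here is a small sketch of how `random_state` affects the split. It reruns `train_test_split` on scikit-learn's bundled Iris data; the seed 7 is an arbitrary value chosen only for comparison.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Same random_state twice: the two splits are identical
_, X_test_a, _, y_test_a = train_test_split(X, y, test_size=0.25, random_state=31)
_, X_test_b, _, y_test_b = train_test_split(X, y, test_size=0.25, random_state=31)
print(np.array_equal(X_test_a, X_test_b))

# A different random_state (7 is arbitrary) shuffles the flowers differently
_, X_test_c, _, y_test_c = train_test_split(X, y, test_size=0.25, random_state=7)
print(np.array_equal(X_test_a, X_test_c))

print(len(X) - len(X_test_a), len(X_test_a))  # training and test set sizes
```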

## Decision Tree: Training our Classifier

Decision Trees are a well-known classifier, so there is already a library with a function to make one. You can always create one from scratch, but using the library function is less tedious. Starting from scratch makes sense when you need to customize more advanced aspects of the classifier, which is not the case here.

In [None]:
# sklearn classifiers built in
# We're going to import the decision tree classifier
from sklearn.tree import DecisionTreeClassifier

# Initialize the classifier with a max_depth of 5
classifier = DecisionTreeClassifier(max_depth=5)

# Fit the classifier to the training set
classifier = classifier.fit(X_train, y_train)

## Decision Tree: Visualizing the Classifier

We can use some of the libraries to visualize our decision tree right here, instead of using separate software. One of the major advantages of a Decision Tree is that we can see visually how it makes its decisions. ANNs do not offer this **transparency**, and making them more transparent is an important open research question.

For example, machine learning is often used in government agencies today. If a model makes a decision that can affect whether someone gets health care, the agency should be able to justify *why* the system made the decision that it did.

In [None]:
from sklearn import tree
import graphviz

dot_data = tree.export_graphviz(classifier, out_file=None, impurity=False) 
graph = graphviz.Source(dot_data) 
#Displays Graph
graph

_If you change `random_state` or the test-set size, the split changes, and your decision tree may look different from another student's decision tree._
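
If `graphviz` is not installed, scikit-learn can also print the tree as plain text with `export_text`. The sketch below retrains a small tree on the bundled Iris data so that it runs on its own; fixing `random_state=0` on the classifier is an assumption made here to keep the printout stable, and is not part of the notebook's code.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=31)

classifier = DecisionTreeClassifier(max_depth=5, random_state=0)
classifier.fit(X_train, y_train)

# Print one line per split or leaf, using feature names instead of X[0]..X[3]
tree_text = export_text(classifier, feature_names=list(iris.feature_names))
print(tree_text)
```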

Here's an example of what the top of the decision tree visualization may look like:

![alt text](https://i.imgur.com/SFktGyk.png)

It takes some practice to read the visualization of the decision tree, but it's actually quite informative. The root node of the decision tree starts with the line `X[3] <= 0.75`. This is the condition that splits the tree. In this case, it's saying that we should look at the feature with index 3 (petal width) and see if it's less than or equal to 0.75. If this is true, we will follow the tree to the left child. If this is false, we will follow the tree to the right child.

The `samples = 112` line means that at this node, we still have 112 samples to look at.

The `[39, 34, 39]` line tells us that of these 112 samples, 39 are the zeroth class (setosa), 34 are the first class (versicolor), and 39 are the second class (virginica).

After the first split, we'll see that we did really well! All 39 samples of setosa are correctly classified in the left child of the root.

We can follow the visualization for the rest of the decision tree to see what feature it splits on at each node.
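
One common way to measure how good a split is, and the default criterion in `DecisionTreeClassifier`, is Gini impurity, where 0 means a node holds a single class. As a sketch, we can compute it for the example root node above. The child counts are inferred from the text (all 39 setosa go left, everything else goes right), not read from the image.

```python
def gini(class_counts):
    # Gini impurity: 1 minus the sum of squared class proportions
    total = sum(class_counts)
    return 1.0 - sum((count / total) ** 2 for count in class_counts)

root = [39, 34, 39]         # setosa, versicolor, virginica at the root
left_child = [39, 0, 0]     # petal width <= 0.75: only setosa
right_child = [0, 34, 39]   # everything else (inferred from the text)

n_left, n_right = sum(left_child), sum(right_child)
weighted_after_split = (
    n_left * gini(left_child) + n_right * gini(right_child)
) / (n_left + n_right)

print(round(gini(root), 3))            # impurity before the split
print(round(gini(left_child), 3))      # a pure node has impurity 0
print(round(weighted_after_split, 3))  # impurity after the split is lower
```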

---
## Decision Tree: Testing
We've trained our decision tree and visualized it, but we have not yet tested it to see how well it does. This is where the test set comes in -- the test set is a set of correctly labelled examples that we have withheld from the decision tree, so we can test to see if the predictions made by the decision tree match the correct labels.

With `sklearn`, it's really easy to generate our predicted labels for the test set:



In [None]:
# Create a list of predicted classes for each of the examples in the test set
y_predict = classifier.predict(X_test)

In order to find the accuracy of our classifier on the test set, we use the function `score()`, which takes two parameters: (1) the data of the test set, and (2) the correct labels of the test set.

It will automatically compare our predicted label with the correct label to compute the accuracy.

In [None]:
accuracy = classifier.score(X_test, y_test)
print(accuracy)
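
For a classifier, `score()` returns the fraction of test examples whose predicted label matches the correct label. Here is a self-contained sketch that checks this on the bundled Iris data; the classifier's `random_state=0` is an assumption added here for repeatability.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=31)

classifier = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)
y_predict = classifier.predict(X_test)

# Fraction of predictions that match the correct labels
manual_accuracy = np.mean(y_predict == y_test)
print(manual_accuracy, classifier.score(X_test, y_test))
```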

## Congrats!

Now you've made a decision tree as well! But we are missing something... Sure, we know the accuracy of the classifier, but what about the distribution of answers? False positives? False negatives? That kind of information is valuable for analyzing errors, and we can see it through a **confusion matrix**.

---
## Confusion Matrix

Trust me, it's not as confusing as it sounds. Here we will display a confusion matrix to analyze our predictions.

We'll use the familiar `matplotlib` library to accomplish this visualization, but we'll also use a library called `seaborn` to make our visualization look a bit nicer:

In [None]:
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Use a new name for the result so we do not overwrite the confusion_matrix function
cm = confusion_matrix(y_test, y_predict)
cm_df = pd.DataFrame(
    cm,
    index=['setosa', 'versicolor', 'virginica'],
    columns=['setosa (pred)', 'versicolor (pred)', 'virginica (pred)'])
plt.figure(figsize=(10, 7))
sns.heatmap(cm_df, annot=True)

Our confusion matrix in this case is a 3x3 table, because there are 3 different possible classes for each flower. The columns tell us what class we predicted, whereas the rows tell us what the actual class is.
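
To see how the rows and columns are filled in, here is a minimal sketch that builds a confusion matrix by hand for six made-up flowers. The labels are invented for illustration.

```python
import numpy as np

y_true = np.array([0, 1, 1, 2, 2, 2])   # actual classes (made up)
y_pred = np.array([0, 1, 2, 2, 2, 1])   # predicted classes (made up)

matrix = np.zeros((3, 3), dtype=int)
for actual, predicted in zip(y_true, y_pred):
    matrix[actual, predicted] += 1      # row = actual class, column = predicted class

print(matrix)
```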

*If you split the data set differently, your confusion matrix might look different from this example.*

The following is an example of what the confusion matrix might look like:

![alt text](https://i.imgur.com/vsKCEKx.png)

The first row covers the flowers that are actually setosa and shows what our decision tree predicted for them. In the example screenshot, there were 11 setosa flowers, and they were all correctly labelled setosa.

The second row is more interesting. It tells us that there were 16 versicolor plants, but only 14 were classified correctly. The remaining two were predicted to be virginica, which was incorrect.

Finally, the last row shows that our decision tree classified all 11 virginica plants correctly.

For this particular example above, there were 38 test examples, and 36 were classified correctly, for an accuracy of 94.7%. The confusion matrix helps us visualize the performance of our decision tree and in addition to the accuracy number itself, it gives us the added information of which type of flower we tended to classify incorrectly.
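
The numbers in this walkthrough can be recomputed from the matrix itself. The sketch below types in the counts described above (it does not read the screenshot) and derives the accuracy along with per-class recall and precision, two standard ways of summarizing false negatives and false positives.

```python
import numpy as np

# Rows = actual class, columns = predicted class (setosa, versicolor, virginica)
cm = np.array([[11,  0,  0],
               [ 0, 14,  2],
               [ 0,  0, 11]])

accuracy = np.trace(cm) / cm.sum()        # correct predictions lie on the diagonal
recall = np.diag(cm) / cm.sum(axis=1)     # share of each actual class that was found
precision = np.diag(cm) / cm.sum(axis=0)  # share of each predicted class that was right

print(round(accuracy * 100, 1))   # 94.7
print(recall)
print(precision)
```

Here versicolor has the lowest recall (two versicolor flowers were missed), and virginica has the lowest precision (two of its predictions were really versicolor).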

---
# That's all Folks!

Thank you so much for going through this tutorial. I am confident you will now be able to use AI to change the world and save our planet!

If you have any more questions email me at : isitatalukdar@gmail.com

