<a href="https://colab.research.google.com/github/subhan215/deep-learning-assignment/blob/main/Flowers_Recognition_Assignment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Assignment: Flowers Recognition <br>
Dataset Description:<br>

This dataset contains 4341 images of flowers.<br>
The images were collected from Flickr, Google Images, and Yandex Images.<br>
You can use this dataset to recognize plants from the photo.<br>

Attribute Information:<br>
The pictures are divided into five classes: chamomile, tulip, rose, sunflower, dandelion.<br>
For each class there are about 800 photos. The photos are not high resolution, about 320x240 pixels.<br>
<b>Also explore how to resize images in TensorFlow, then resize all the images to the same size.</b><br>
This is a Multiclass Classification Problem.<br>
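
As a starting point for the resizing task, here is a minimal sketch using `tf.image.resize`. The helper name `resize_batch`, the 224x224 target size, and the random stand-in images are assumptions for illustration only; real photos would be loaded from disk first.

```python
import numpy as np
import tensorflow as tf

def resize_batch(images, size=(224, 224)):
    """Resize a list of HxWxC arrays to one common size and stack them."""
    # tf.image.resize accepts a single HxWxC image and returns a float32 tensor
    resized = [tf.image.resize(img, size).numpy() for img in images]
    return np.stack(resized)

# two stand-in "photos" with different shapes (random pixels, not real data)
rng = np.random.default_rng(0)
imgs = [
    rng.random((240, 320, 3)).astype("float32"),
    rng.random((200, 300, 3)).astype("float32"),
]
batch = resize_batch(imgs)
print(batch.shape)  # (2, 224, 224, 3)
```

Once every image has the same shape, the whole set can be stacked into a single array, which is what the model expects as input.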




WORKFLOW : <br>
Load Data <br>
Split the data into training and test sets in a 60:40 ratio.<br>
Encode labels.<br>
Create Model<br>
Compilation Step (Note: It's a multiclass classification problem, so select the loss and metrics accordingly)<br>
Train the Model.<br>
If the model overfits, tune it by changing the number of units, layers, or epochs, or by adding a dropout layer or a regularizer as needed.<br>
Prediction accuracy should be > 85%.<br>
Evaluation Step<br>
Prediction<br>
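
For the compilation step, the right loss depends on how the labels are encoded. The NumPy sketch below is an illustration with made-up probabilities, not the Keras implementation. It shows that the sparse form (integer labels) and the one-hot form of categorical cross-entropy give the same per-sample loss. The function names mirror the Keras loss names only for readability.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs):
    # y_true: integer class ids, probs: (n, k) rows that sum to 1
    return -np.log(probs[np.arange(len(y_true)), y_true])

def categorical_crossentropy(y_onehot, probs):
    # y_onehot: (n, k) one-hot rows
    return -np.sum(y_onehot * np.log(probs), axis=1)

# made-up predicted probabilities for two samples over five classes
probs = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.20, 0.20, 0.20, 0.20, 0.20],
])
y_int = np.array([0, 3])        # integer labels
y_onehot = np.eye(5)[y_int]     # the same labels, one-hot encoded

sparse_loss = sparse_categorical_crossentropy(y_int, probs)
dense_loss = categorical_crossentropy(y_onehot, probs)
print(np.allclose(sparse_loss, dense_loss))  # True
```

With integer labels, use `sparse_categorical_crossentropy`. With one-hot labels, use `categorical_crossentropy`. A uniform prediction over five classes gives a loss of ln(5), roughly 1.61, which is a useful reference value when reading training logs.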




Data : <br>
https://drive.google.com/file/d/1-OX6wn5gA-bJpjPNfSyaYQLz-A-AB_uj/view?usp=sharing

In [1]:
# mount google drive
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [63]:
# load dependencies
import cv2
import os
import glob
import random
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
from pathlib import Path

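# use the Keras bundled with TensorFlow throughout, so layers and optimizers come from the same package
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten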

import tensorflow as tf
from tensorflow.keras.optimizers import Adam, RMSprop

1. Create a variable data_path and assign the flowers directory to it.
2. Create a variable categories and assign the list of all classes to it.
3. Create a function that builds the input and output.
4. In the same function, convert each RGB image to grayscale and resize all images to 224.
5. Store the images in input and the labels in output.
6. Convert input to a NumPy array and scale it down by dividing by 255.
7. Use Keras to_categorical to one-hot encode the output.
8. Follow the same preprocessing steps from here.
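
Steps 4, 6, and 7 above can be sketched in plain NumPy as follows. This is an illustration under stated assumptions: the images are random stand-ins already resized to 224x224, `preprocess` is a hypothetical helper, the grayscale weights are the commonly used luma weights, and `np.eye` is used here to build the same one-hot matrix that `to_categorical` would produce from integer labels.

```python
import numpy as np

# Commonly used luma weights for RGB -> grayscale (an assumption here;
# check the library you use for its exact values).
RGB_WEIGHTS = np.array([0.2989, 0.5870, 0.1140], dtype=np.float32)

def preprocess(images, labels, categories):
    """images: (N, H, W, 3) uint8 array, already resized to a common size."""
    gray = images.astype(np.float32) @ RGB_WEIGHTS        # (N, H, W)
    x = gray[..., np.newaxis] / 255.0                      # (N, H, W, 1), values in [0, 1]
    ids = np.array([categories.index(lbl) for lbl in labels])
    y = np.eye(len(categories), dtype=np.float32)[ids]     # one-hot rows
    return x, y

categories = ["daisy", "dandelion", "rose", "sunflower", "tulip"]
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 224, 224, 3), dtype=np.uint8)  # stand-in data
labels = ["rose", "tulip", "daisy", "rose"]

x, y = preprocess(images, labels, categories)
print(x.shape, y.shape)  # (4, 224, 224, 1) (4, 5)
```

One-hot labels like `y` pair with the `categorical_crossentropy` loss; integer labels pair with `sparse_categorical_crossentropy`.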


In [4]:
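# accessing the data folder and its class subfolders
data_path = Path("/content/drive/MyDrive/flowers")
# materialise the listing (glob returns a one-shot generator) and keep only directories
dirs = sorted(p for p in data_path.glob('*') if p.is_dir())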

In [5]:
# list of [image, label] pairs
data = []
img_size = 256

def make_data():
  for folder_dir in dirs:
      # the folder name is the class label
      label = folder_dir.name
      print(label)

      for image_path in folder_dir.glob('*.jpg'):
        # load as RGB and resize to img_size x img_size
        img = image.load_img(image_path, target_size=(img_size,img_size))
        img_array = image.img_to_array(img)
        # collapse the 3 colour channels into 1 and store as a NumPy array
        gray_img = tf.image.rgb_to_grayscale(img_array).numpy()
        data.append([gray_img, label])

make_data()

daisy
dandelion
rose
sunflower
tulip


In [6]:
len(data)

4323

In [7]:
# function to separate features and labels from the data
def load_data():
  np.random.shuffle(data)

  features = []
  labels = []
  for img, label in data:
    features.append(img)
    labels.append(label)

  # converting lists into numpy arrays
  features = np.array(features, dtype=np.float32)
  labels = np.array(labels)

  return [features, labels]

In [8]:
# unpacking features, labels from load_data
(features, labels) = load_data()

In [9]:
print(len(features), len(labels))

4323 4323


In [12]:
features.shape

(4323, 256, 256, 1)

In [13]:
labels.shape

(4323,)

In [None]:
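# drawing some random flower images
def draw_flower():
  fig,ax=plt.subplots(5,2)
  fig.set_size_inches(15,15)
  # drop the trailing channel axis once: (N, 256, 256, 1) -> (N, 256, 256)
  seq_img = features.squeeze(axis=-1)

  for i in range(5):
    for j in range(2):
      # random.randint includes both ends, so the upper bound is len - 1
      l = random.randint(0, len(labels) - 1)
      ax[i,j].imshow(seq_img[l], cmap='gray')
      ax[i,j].set_title(labels[l])
  plt.tight_layout()

draw_flower()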

In [14]:
# splitting data into 60% training set and 40% test set
from sklearn.model_selection import train_test_split

train_data, test_data, train_labels, test_labels = train_test_split(
                                                            features,
                                                            labels,
                                                            test_size = 0.40,
                                                            random_state=42)

In [15]:
print(train_data.shape, test_data.shape)

(2593, 256, 256, 1) (1730, 256, 256, 1)


In [16]:
print(train_labels.shape, test_labels.shape)

(2593,) (1730,)


In [17]:
# reshaping and scaling train/test data
image_size = train_data.shape[1]

# reshape
train_data = train_data.reshape((-1,image_size*image_size))
test_data = test_data.reshape((-1,image_size*image_size))

# scale down
train_data = train_data/255.0
test_data = test_data/255.0

In [18]:
print(train_data.shape, test_data.shape)

(2593, 65536) (1730, 65536)


In [19]:
# encoding labels
from sklearn.preprocessing import LabelEncoder
 
# creating encoder
encoder = LabelEncoder()
# fit on the training labels only, then reuse the same mapping for the test labels
train_labels = encoder.fit_transform(train_labels)
test_labels = encoder.transform(test_labels)

In [20]:
print(train_labels.shape, test_labels.shape)

(2593,) (1730,)
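
The encoder should be fitted once, on the training labels, and then reused. The toy example below uses made-up labels rather than the notebook's data. It shows how refitting on a test set that happens to lack a class silently changes the label-to-integer mapping.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# toy labels for illustration only
train = np.array(["rose", "tulip", "daisy", "rose"])
test = np.array(["tulip", "daisy"])

enc = LabelEncoder()
train_ids = enc.fit_transform(train)   # learns classes in sorted order: daisy, rose, tulip
test_ids = enc.transform(test)         # reuses the same mapping
print(list(enc.classes_))              # ['daisy', 'rose', 'tulip']
print(train_ids.tolist(), test_ids.tolist())  # [1, 2, 0, 1] [2, 0]

# refitting on the test labels alone yields a different mapping
refit_ids = LabelEncoder().fit_transform(test)
print(refit_ids.tolist())              # [1, 0]
```

In the refit case "tulip" becomes 1 instead of 2, so the test labels would no longer line up with what the model was trained on. With all five classes present in both splits the two approaches happen to agree, but `transform` is the safe choice.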


In [86]:
# baseline model definition
def baseline_model():
  model = Sequential([
      Dense(150, activation='relu', input_shape=(train_data.shape[-1],)),
      Dense(100, activation='relu'),
      Dense(50, activation='relu'),  # hidden layer: softmax belongs only on the output layer
      Dense(5, activation='softmax')
  ])


  # compile the model
  model.compile(optimizer=Adam(0.01), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

  return model

In [87]:
model = baseline_model()
model.summary()

Model: "sequential_16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_60 (Dense)             (None, 150)               9830550   
_________________________________________________________________
dense_61 (Dense)             (None, 100)               15100     
_________________________________________________________________
dense_62 (Dense)             (None, 50)                5050      
_________________________________________________________________
dense_63 (Dense)             (None, 5)                 255       
Total params: 9,850,955
Trainable params: 9,850,955
Non-trainable params: 0
_________________________________________________________________
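
The parameter counts in the summary can be checked by hand: a Dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. A small sketch of that arithmetic follows; the helper `dense_params` is hypothetical and written only for this check.

```python
def dense_params(n_in, n_out):
    # weights + biases of a fully connected layer
    return n_in * n_out + n_out

# flattened 256x256 grayscale input, followed by the four Dense layers
layer_sizes = [256 * 256, 150, 100, 50, 5]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer)  # [9830550, 15100, 5050, 255]
print(total)      # 9850955
```

Almost all of the parameters sit in the first layer, because every one of the 65,536 input pixels connects to each of the 150 units.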


In [88]:
# train model
model.fit(train_data, train_labels, epochs=70, batch_size=138)

Epoch 1/70
...
Epoch 70/70


<tensorflow.python.keras.callbacks.History at 0x7faa56066390>
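
The workflow suggests adding dropout or a regularizer if the model overfits. One way to do that is sketched below. The function name `regularized_model`, the layer sizes, the dropout rate, the L2 weight, and the learning rate are all illustrative assumptions, not tuned values.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def regularized_model(input_dim, n_classes=5, drop_rate=0.3, l2_weight=1e-4):
    """Dense classifier with L2 weight decay and dropout between hidden layers."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        layers.Dense(150, activation="relu",
                     kernel_regularizer=regularizers.l2(l2_weight)),
        layers.Dropout(drop_rate),
        layers.Dense(100, activation="relu",
                     kernel_regularizer=regularizers.l2(l2_weight)),
        layers.Dropout(drop_rate),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-3),
        loss="sparse_categorical_crossentropy",  # integer-encoded labels
        metrics=["accuracy"],
    )
    return model

# small input size here only to keep the sketch quick to build
sketch_model = regularized_model(input_dim=64)
print(sketch_model.output_shape)  # (None, 5)
```

To see whether overfitting is actually happening, pass `validation_split` (for example 0.2) to `fit` and compare training accuracy with validation accuracy across epochs.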

In [89]:
score = model.evaluate(test_data, test_labels)



In [90]:
pred = model.predict(test_data)

In [91]:
pred

array([[0.16851658, 0.23409915, 0.1830721 , 0.17832024, 0.23599185],
       [0.16851658, 0.23409915, 0.1830721 , 0.17832024, 0.23599185],
       [0.16851658, 0.23409915, 0.1830721 , 0.17832024, 0.23599185],
       ...,
       [0.16851658, 0.23409915, 0.1830721 , 0.17832024, 0.23599185],
       [0.16851658, 0.23409915, 0.1830721 , 0.17832024, 0.23599185],
       [0.16851658, 0.23409915, 0.1830721 , 0.17832026, 0.23599185]],
      dtype=float32)
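
Each row of `pred` is a probability distribution over the five classes. The sketch below turns such rows into class names and checks whether every row is the same. It assumes the class order is alphabetical, which is how `LabelEncoder` sorts the folder names printed earlier, and it copies one row from the output above as stand-in data.

```python
import numpy as np

# assumed class order: alphabetical, as LabelEncoder would assign it
class_names = np.array(["daisy", "dandelion", "rose", "sunflower", "tulip"])

# stand-in predictions: the row shown in the output above, repeated
row = [0.16851658, 0.23409915, 0.1830721, 0.17832024, 0.23599185]
pred_sample = np.array([row, row, row], dtype=np.float32)

pred_ids = pred_sample.argmax(axis=1)     # most probable class per row
pred_names = class_names[pred_ids]
is_constant = bool(np.allclose(pred_sample, pred_sample[0]))

print(pred_names.tolist())  # ['tulip', 'tulip', 'tulip']
print(is_constant)          # True
```

When every row is identical, the output does not depend on the input image, which suggests the network has not learned anything useful. Likely things to revisit are the hidden-layer activations, the learning rate, and the architecture itself.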