# Create Dataset

### In this notebook, we create a tensor dataset from raw images. This step is **optional**: the final version of our dataset, **'tensor_dataset.zip'**, can be downloaded from the link below:
https://drive.google.com/file/d/1s2lu2OLRx3OuS1x14vE6_KjDYiMUKZ9S/view?usp=sharing



### To build the dataset from scratch, download **'image_dataset.zip'**, which contains all the collected images, from the link:

https://drive.google.com/file/d/1N1nmLW7VMgETgG_lKayE4s_TBjfqeOgl/view?usp=sharing

and run the following cells. 

## In any case, the file **'tensor_dataset.zip'** must exist under the directory `./DS1200007_1200004_1200012/Dataset` so that the other notebooks can train, evaluate, etc. their models.
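
Before running the other notebooks, it can help to confirm that the archive is where they expect it. The sketch below is a minimal check, assuming Drive is mounted at `/content/drive` (the mount point used in this notebook); the helper name `dataset_archive_exists` is hypothetical and not part of the project.

```python
import os

def dataset_archive_exists(drive_root, archive_name="tensor_dataset.zip"):
    """Return True if the archive is in the project's Dataset directory."""
    archive_path = os.path.join(
        drive_root, "DS1200007_1200004_1200012", "Dataset", archive_name
    )
    return os.path.isfile(archive_path)

# Assumed mount point; adjust if your Drive is mounted elsewhere.
print(dataset_archive_exists("/content/drive/MyDrive"))
```

If this prints `False`, either download the archive into that directory or build it with the cells below.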




In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
src_dir = '/content/drive/MyDrive/'

In [None]:
!unzip "$src_dir/DS1200007_1200004_1200012/Dataset/image_dataset.zip" -d image_dataset/

In [None]:
import cv2
import tensorflow as tf
import tensorflow_datasets as tfds
import numpy as np
import os

In [None]:
my_classes_abr=['bicycle','bridge','bus','car', 'chimney', 'cross','hydrant','motor', 'palm', 'light','boat']
NUM_CLASSES=len(my_classes_abr)
directory="./image_dataset/"

In [None]:
data=[]
labels=[]
my_images = os.listdir(directory)
for img in my_images:
    # Multi-hot label: 1.0 for every class name that appears in the file name
    img_label=[]
    for cl in my_classes_abr:
        if cl in img.lower():
            img_label.append(1.0)
        else:
            img_label.append(0.0)
    imgg = cv2.imread(os.path.join(directory, img))
    if imgg is None:
        # cv2.imread returns None for files it cannot decode; skip them
        continue
    imgg = cv2.cvtColor(imgg, cv2.COLOR_BGR2RGB)  # OpenCV loads images as BGR
    # Resize unless the image is already 100x100 (check both height and width)
    if imgg.shape[:2] != (100, 100):
        imgg = cv2.resize(imgg, dsize=(100, 100), interpolation=cv2.INTER_NEAREST)
    imgg = imgg.astype(np.float32)/255.  # scale pixel values to [0, 1]
    data.append(imgg)
    labels.append(np.array(img_label))
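
The labelling rule used in the loop above can be tried in isolation. This is a small sketch of the same substring test, using the notebook's class list; the helper `multi_hot_from_filename` and the example file names are hypothetical. Note that the test is a plain substring match, so a class keyword that appears inside an unrelated word in a file name (for example `car` in `carpet`) would also set that class.

```python
MY_CLASSES = ['bicycle', 'bridge', 'bus', 'car', 'chimney', 'cross',
              'hydrant', 'motor', 'palm', 'light', 'boat']

def multi_hot_from_filename(filename, classes=MY_CLASSES):
    """Return one 0.0/1.0 entry per class, set when the class name occurs in the file name."""
    lowered = filename.lower()
    return [1.0 if cl in lowered else 0.0 for cl in classes]

# Hypothetical file names, for illustration only.
print(multi_hot_from_filename("Bridge_Car_017.png"))
print(multi_hot_from_filename("carpet_03.jpg"))
```

The first call sets the `bridge` and `car` positions, the same pattern as the label printed further down in this notebook. The second call shows the false positive described above.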


*   Create two tensors: one holding the images (pixel values) and the other holding the corresponding labels
*   Combine them into a single dataset

In [None]:
tf_labels=tf.convert_to_tensor(labels,dtype=np.float32)
tf_data=tf.convert_to_tensor(data)
dataset=tf.data.Dataset.from_tensor_slices((tf_data, tf_labels))

This function turns each element of the dataset into a dictionary with two key-value pairs: `'image'` holds the pixel values and `'label'` holds the corresponding label vector.

In [None]:
def process_image(image,label):    
    image=tf.reshape(image,[image.shape[0],image.shape[1],image.shape[2]]) # 100,100,3
    label=tf.reshape(label,[NUM_CLASSES])
    features = {'image': image, 'label': label}    
    return features

In [None]:
AUTOTUNE = tf.data.experimental.AUTOTUNE
dataset=dataset.map(process_image, num_parallel_calls=AUTOTUNE).shuffle(12345) 



In [None]:
for f in dataset.take(1):
  print (type(f),'\n')
  print (f['label'],'\n')
  print (f['image'])

<class 'dict'> 

tf.Tensor([0. 1. 0. 1. 0. 0. 0. 0. 0. 0. 0.], shape=(11,), dtype=float32) 

tf.Tensor(
[[[0.18431373 0.25882354 0.1764706 ]
  [0.21176471 0.29803923 0.20392157]
  [0.24313726 0.30588236 0.21568628]
  ...
  [0.23529412 0.2627451  0.19215687]
  [0.24705882 0.28627452 0.21960784]
  [0.3137255  0.36078432 0.29803923]]

 [[0.24313726 0.28627452 0.22745098]
  [0.21568628 0.26666668 0.20784314]
  [0.1764706  0.23529412 0.16470589]
  ...
  [0.24705882 0.28627452 0.23529412]
  [0.22745098 0.28627452 0.22745098]
  [0.24313726 0.3137255  0.25882354]]

 [[0.26666668 0.2627451  0.22352941]
  [0.28627452 0.29411766 0.25882354]
  [0.20392157 0.23921569 0.1882353 ]
  ...
  [0.27450982 0.31764707 0.2784314 ]
  [0.28235295 0.34901962 0.3019608 ]
  [0.25490198 0.34509805 0.2901961 ]]

 ...

 [[0.2784314  0.3529412  0.32941177]
  [0.32156864 0.3882353  0.38431373]
  [0.30588236 0.36078432 0.3764706 ]
  ...
  [0.22352941 0.3647059  0.44705883]
  [0.21960784 0.34901962 0.4392157 ]
  [0.2039

Save the dataset to disk

In [None]:
tf.data.experimental.save(dataset, "./tensor_dataset")

Load the dataset back from disk, passing the element specification it was saved with

In [None]:
new_dataset = tf.data.experimental.load("./tensor_dataset",{'image': tf.TensorSpec(shape=(100, 100, 3), dtype=tf.float32, name=None),
 'label': tf.TensorSpec(shape=(11,), dtype=tf.float32, name=None)}) 
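
The element specification given to the load call has to match the structure that was saved. Rather than typing it by hand, it can be read from a dataset with the same structure through its `element_spec` property. A minimal sketch on a toy dataset of zeros (the toy tensors and the name `toy_dataset` are illustrative, not part of the project):

```python
import tensorflow as tf

NUM_CLASSES = 11

# Toy stand-ins for the real image and label tensors: 4 blank 100x100 RGB images.
images = tf.zeros([4, 100, 100, 3], dtype=tf.float32)
labels = tf.zeros([4, NUM_CLASSES], dtype=tf.float32)

# Same structure as in this notebook: each element is a dict with 'image' and 'label'.
toy_dataset = tf.data.Dataset.from_tensor_slices((images, labels)).map(
    lambda image, label: {'image': image, 'label': label}
)

spec = toy_dataset.element_spec
print(spec)
```

In this notebook, the equivalent would be to pass `dataset.element_spec` as the second argument of the load call, assuming `dataset` is still defined in the session.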

Create the archive **'tensor_dataset.zip'** from the saved dataset directory

In [None]:
!zip -r tensor_dataset.zip ./tensor_dataset
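
The archive is written to the current working directory, while the other notebooks expect it under the project's `Dataset` directory. One way to place it there is a plain file copy. The sketch below assumes the `src_dir` value defined earlier in this notebook; the helper name `copy_archive` is hypothetical.

```python
import os
import shutil

def copy_archive(archive_path, dataset_dir):
    """Copy the zip archive into dataset_dir, creating the directory if needed."""
    os.makedirs(dataset_dir, exist_ok=True)
    return shutil.copy(archive_path, dataset_dir)

# Assumed locations, based on the paths used in this notebook.
src_dir = '/content/drive/MyDrive/'
dataset_dir = os.path.join(src_dir, 'DS1200007_1200004_1200012', 'Dataset')

# Only copy when the archive was actually created in the working directory.
if os.path.isfile('tensor_dataset.zip'):
    print(copy_archive('tensor_dataset.zip', dataset_dir))
```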