# Assignment 4: CNN

## Description

Implement a Convolutional Neural Network (CNN) classifier to predict whether a given icon image is real or fake.

- You are not required to use Colab in this assignment, but you have to **submit your source code**.

## Dataset

- https://lab.djosix.com/icons.zip
- 64x64 RGB jpg images


```
real/           (10000 images)
    0000.jpg
    0001.jpg
    ...
    9999.jpg
fake/           (10000 images)
    0000.jpg
    0001.jpg
    ...
    9999.jpg
unknown/        (5350 images, testing set)
    0000.jpg
    0001.jpg
    ...
    5349.jpg
```

- Training set
  - 20000 icons in `real/` and `fake/`
  - You should predict 1 for icons in `real/` and 0 for icons in `fake/`
- Testing set
  - 5350 icons in `unknown/`
  - Your score depends on the **accuracy** on this testing set,  
    so you must submit a prediction for every icon in `unknown/` (5350 predictions in total, see below).
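
Because the score is the accuracy on the testing set, it may help to see how that metric is computed. The sketch below is only an illustration: the `accuracy` helper and the toy values are hypothetical, and the true labels of `unknown/` are not part of the dataset.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels (1 = real, 0 = fake)."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)

# Toy example with made-up values: 3 of the 4 predictions are correct.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```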


## Submission

Please upload **2 files** to E3. (`XXXXXXX` is your student ID)

1. **`XXXXXXX_4_result.json`**  
   This file contains your model's predictions for the testing set.  
   You must generate this file with the function `save_predictions()`.
2. **`XXXXXXX_4_source.zip`**  
   Zip your source code into this archive.
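
One possible way to build the source archive from Python is sketched below, using the standard `zipfile` module. The helper name `zip_sources` and the example file name are assumptions made for illustration; any tool that produces a valid zip file works just as well.

```python
import zipfile
from pathlib import Path

def zip_sources(student_id, source_paths, out_dir="."):
    """Write the given files into '<student_id>_4_source.zip' and return its path."""
    archive = Path(out_dir) / f"{student_id}_4_source.zip"
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in map(Path, source_paths):
            # Store each file under its base name, without the directory prefix.
            zf.write(path, arcname=path.name)
    return archive

# Hypothetical usage (replace with your own ID and file names):
# zip_sources("XXXXXXX", ["hw4.ipynb"])
```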


## Hints

- **Deep Learning Libraries**: You can use any deep learning frameworks (PyTorch, TensorFlow, ...).
- **How to implement**: There are many beginner-friendly CNN examples online, e.g. on the official websites of the libraries above. Experiment with them and their model architectures to obtain high accuracy on the testing set.
- **GPU/TPU**: Colab provides free GPUs/TPUs to speed up training; please refer to [this page in `pytut.pdf` on E3](https://i.imgur.com/VsrUh7I.png).


In [2]:
from google.colab import drive
drive.mount('/content/drive')
# Upload the dataset to Google Drive and split real/ and fake/ into a train set and a test set:
# 80% of the real/fake images go to the train set; the rest form the test set.

Mounted at /content/drive
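
The comments above describe an 80/20 split, but the code that performed it is not shown. A minimal sketch of one way to choose the files is given below; `split_names`, its `seed` parameter, and the shuffling strategy are assumptions, not the method actually used for this notebook.

```python
import random

def split_names(names, train_fraction=0.8, seed=0):
    """Shuffle file names reproducibly and split them into (train, test) lists."""
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# File names as they appear in real/ or fake/ (10000 images per class).
names = [f"{i:04d}.jpg" for i in range(10000)]
train, test = split_names(names)
print(len(train), len(test))  # 8000 2000
```

Applying this to both classes gives 16000 training and 4000 held-out images, which matches the counts reported by the data generators later in the notebook.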


In [4]:
import os
import shutil
from google.colab import drive
shutil.copytree('/content/drive/My Drive/icons', 'content/icons')
%cd content/icons
os.listdir()

/content/content/icons


['train', '0816183_hw4.h5', 'content', 'unknown', 'test', '0816183']

### Include this in your code to generate result file

In [5]:
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Dropout

In [6]:
classifier=Sequential()
classifier.add(Conv2D(32,(3,3),input_shape=(64,64,3),activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2,2),strides=2))
classifier.add(Conv2D(32,(3,3),activation='relu'))

classifier.add(Conv2D(64,(3,3),activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2,2),strides=2))
classifier.add(Conv2D(32,(3,3),activation='relu'))

classifier.add(Conv2D(128,(3,3),activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2,2),strides=2))

classifier.add(Flatten())
classifier.add(Dense(units=512,activation='relu'))
classifier.add(Dense(units=1,activation='sigmoid'))
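# Only beta_2 differs from the Keras defaults. epsilon=None, decay=0.0 and amsgrad=False
# were default values, and some newer Keras releases no longer accept `decay`.
adam = keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.99)
classifier.compile(optimizer=adam, loss='binary_crossentropy', metrics=['accuracy'])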
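
To see how the feature map shrinks through this stack, the sketch below traces the spatial size layer by layer. It assumes the Keras defaults that the cell relies on (unpadded `'valid'` convolutions with stride 1); the helper names are made up for illustration.

```python
def conv_out(size, kernel=3):
    """Output width of a 'valid' (unpadded), stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool=2, stride=2):
    """Output width of a 'valid' max-pooling layer."""
    return (size - pool) // stride + 1

# (layer kind, number of filters) in the order the layers are added to `classifier`
layers = [("conv", 32), ("pool", None), ("conv", 32), ("conv", 64),
          ("pool", None), ("conv", 32), ("conv", 128), ("pool", None)]

size, channels = 64, 3
for kind, filters in layers:
    if kind == "conv":
        size, channels = conv_out(size), filters
    else:
        size = pool_out(size)
    print(f"{kind}: {size}x{size}x{channels}")

flat = size * size * channels
print("flatten:", flat)                             # 2048
print("dense(512) parameters:", flat * 512 + 512)   # 1049088
```

The final feature map is 4x4x128, so the first dense layer holds most of the model's parameters.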

In [7]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.1,
                                   zoom_range=0.1,
                                   horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
unknown_datagen= ImageDataGenerator(rescale=1./255)
#Training Set
train_set = train_datagen.flow_from_directory('train',
                                             target_size=(64,64),
                                             batch_size=32,
                                             class_mode='binary')
#Validation Set
test_set = test_datagen.flow_from_directory('test',
                                           target_size=(64,64),
                                           batch_size = 32,
                                           class_mode='binary',
                                           shuffle=False)
# Test set (unlabeled; no ground truth available)
unknown_set = unknown_datagen.flow_from_directory('unknown',
                                            target_size=(64,64),
                                            batch_size=32,
                                            shuffle=False)

Found 16000 images belonging to 2 classes.
Found 4000 images belonging to 2 classes.
Found 5350 images belonging to 1 classes.
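
According to the Keras documentation, `flow_from_directory` infers class labels from the subdirectory names and assigns indices in alphanumeric order. The sketch below mimics that rule to show why `fake` maps to 0 and `real` to 1, as the assignment requires; it is an approximation for illustration, not Keras code.

```python
def infer_class_indices(subdirectories):
    """Mimic an alphanumeric class-to-index mapping (illustrative approximation)."""
    return {name: index for index, name in enumerate(sorted(subdirectories))}

print(infer_class_indices(["real", "fake"]))  # {'fake': 0, 'real': 1}
```

On the real generator, the mapping can be checked with `train_set.class_indices`.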


In [8]:
%tensorflow_version 2.x
import tensorflow as tf
import timeit
def gpu():
  with tf.device('/device:GPU:0'):
    random_image_gpu = tf.random.normal((100, 100, 100, 3))
    net_gpu = tf.keras.layers.Conv2D(128,7)(random_image_gpu)
    return tf.math.reduce_sum(net_gpu)
gpu()
gpu_time = timeit.timeit('gpu()', number=10, setup="from __main__ import gpu")
print(gpu_time)

0.032737292000092566


In [69]:
%%capture
import os
from tensorflow.keras.models import load_model
classifier=load_model("0816183_hw4.h5")
# Load the previously trained model, which was saved to Drive, with load_model
with tf.device('/device:GPU:0'):
  classifier.fit_generator(train_set,
                         epochs=2000,validation_data=test_set,
                         steps_per_epoch=6250,
                         validation_steps=10
                         )
  x1=classifier.evaluate_generator(train_set)
  x2=classifier.evaluate_generator(test_set)
  predict=classifier.predict(unknown_set)
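
The step counts in this cell are worth relating to the dataset size. The arithmetic below is a sketch under the batch size of 32 and the image counts reported by the generators; `steps_for` is a hypothetical helper.

```python
import math

def steps_for(num_images, batch_size):
    """Number of batches needed to see every image exactly once."""
    return math.ceil(num_images / batch_size)

print(steps_for(16000, 32))         # 500 batches cover the training set once
print(steps_for(4000, 32))          # 125 batches cover the validation set once
print(6250 / steps_for(16000, 32))  # 12.5 passes over the training data per "epoch"
print(10 * 32)                      # 320 validation images seen with validation_steps=10
```

Because the validation generator is created with `shuffle=False`, those 320 images are read in directory order and so probably all come from a single class. The per-epoch validation accuracy is therefore likely less representative than the full `evaluate_generator(test_set)` result.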




In [70]:
from tensorflow.keras.models import load_model
#classifier.save("0816183/0816183_hw4.h5")
with tf.device('/device:GPU:0'):
  #classifier=load_model("0816183_hw4.h5")
  #x1=classifier.evaluate_generator(train_set)
  #x2=classifier.evaluate_generator(test_set)
  print("accuracy of train set:")
  print(x1[1])
  print("loss rate of train set:")
  print(x1[0])
  print("accuracy of test set:")
  print(x2[1])
  print("loss rate of test set:")
  print(x2[0])
  predict=classifier.predict_generator(unknown_set)
  print(predict)
  print(len(predict))
  ans=list()
  count1=0
  count2=0
  for i in range(5350):
    if predict[i] >= 0.5:
      ans.append(1)
      count1=count1+1
    else:
      ans.append(0)
      count2=count2+1
  print(count1)
  print(count2)
  print(ans)

accuracy of train set:
0.9962499737739563
loss rate of train set:
0.011012166738510132
accuracy of test set:
0.9884999990463257
loss rate of test set:
0.025791797786951065




[[9.2808453e-27]
 [1.0000000e+00]
 [1.1934629e-26]
 ...
 [1.0000000e+00]
 [4.2311976e-09]
 [1.0000000e+00]]
5350
2723
2627
[0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 
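
The thresholding loop above can also be written as a single expression. The sketch below assumes the model output is a sequence of one-element rows holding P(real), as in the printed array; `to_labels` is a hypothetical helper. Converting to plain Python `int` matters because `json.dump` cannot serialize NumPy integer types.

```python
def to_labels(probabilities, threshold=0.5):
    """Convert rows holding P(real) into plain Python ints (1 = real, 0 = fake)."""
    return [int(float(row[0]) >= threshold) for row in probabilities]

# Made-up probabilities shaped like the (N, 1) model output.
print(to_labels([[9.3e-27], [1.0], [4.2e-09], [0.5]]))  # [0, 1, 0, 1]
```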

In [71]:
import json

def save_predictions(student_id, predictions):
  # Please use this function to generate 'XXXXXXX_4_result.json'
  # `predictions` is a list of int (0 or 1; fake=0 and real=1)
  # For example, `predictions[0]` is the prediction for "unknown/0000.jpg":
  # it is 1 if your model thinks the icon is real, else 0 (fake).

  assert isinstance(student_id, str)
  assert isinstance(predictions, list)
  assert len(predictions) == 5350

  for y in predictions:
    assert y in (0, 1)

  with open('{}_4_result.json'.format(student_id), 'w') as f:
    json.dump(predictions, f)


In [72]:
save_predictions('0816183',ans)
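
Before uploading, it can be reassuring to read the file back and repeat the checks that `save_predictions()` performs. The sketch below is one way to do that; `check_result_file` is a hypothetical helper, not part of the assignment template.

```python
import json

def check_result_file(path, expected_count=5350):
    """Reload a result file, verify its length and values, and return the list."""
    with open(path) as f:
        predictions = json.load(f)
    assert isinstance(predictions, list)
    assert len(predictions) == expected_count
    assert all(y in (0, 1) for y in predictions)
    return predictions

# Hypothetical usage after the cell above has run:
# check_result_file('0816183_4_result.json')
```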