# Korean Character Recognition with Convnets

---
## Notice
This post was delayed by illness, travel, moving back to College Park, and so on. It is being updated incrementally, and a reasonably complete version will be done by Sunday, with suggestions for possible future updates. The main cause of the delay was the size of the data, which is discussed extensively in the Preparing the Data section.

Rather than continue to wait for this to be in perfect condition, I am going to update it each day for the next few days.

---
## Introduction 
This semester, I am studying abroad at Yonsei University in South Korea. 

![Yonsei's Campus](./images/yonsei_campus.JPG)

I spend two hours per day in Korean class, so I wanted to make at least one post related to Korean. I figured using convnets to recognize Korean characters would be fun, and it's also quite a challenge. There are 10 digits for MNIST and 26 letters for the English alphabet, but the Korean alphabet allows 11,172 possible character combinations. In reality, however, only 2,350 characters are frequently used ([source](https://ko.m.wikipedia.org/wiki/%ED%95%9C%EA%B8%80_%EC%9D%8C%EC%A0%88)).

> (Translated from Korean) These characters can be expressed in all combinations of Korean characters, but KS X 1001 Korean complete encoding only contains 2,350 characters which are frequently used, so the remaining 8,822 characters cannot be expressed. The recently used extended completion code and Unicode series support all 11,172 characters.

In this post, we will use the PHD08 Korean characters dataset, which contains 2,187 samples of each of the 2,350 Korean character classes for a total of 5,139,450 data samples. Uncompressed, the dataset is 7.52 GB and it's in a... "unique" format, so we'll get to spend an entire section reformatting it 👍

---
## A Brief Introduction to the Korean Alphabet

### Introduction 
Knowledge of the Korean alphabet is not really necessary to train a convnet to recognize Korean characters. After all, once the data is properly formatted, all we have to do is feed it to Keras and let it tell us our loss and accuracy. That being said, the Korean alphabet is pretty cool, and it may be way easier to learn than you think – especially compared to other Asian languages. For example, Japanese has three scripts: kanji, which contains 2,136 frequently used Chinese characters; hiragana, which contains 46 basic symbols; and katakana, which also contains 46. For Chinese, there are over 50,000 unique characters, although a Chinese speaker only needs to know about 8,000.

Now you may be thinking, "Japanese has 2,228, Chinese has 8,000, and Korean has 2,350... Korean doesn't seem any easier." But the 2,350 number is entirely misleading. Korean is written phonetically, just like English! There are 19 consonants and 21 vowels, meaning there are only 40 letters (a few more than English, which has 26). The reason there are 2,350 characters is that the letters are combined into syllable blocks. Each syllable block contains 2, 3, or 4 letters, which gives 11,172 possible blocks, of which 2,350 are in common use. Let's look at an example.

### Example
<img src="./images/hello.png" alt="안녕하세요 Spelling" style="width: 300px;"/>

The image above is the word 안녕하세요 (an-nyeong-ha-se-yo) which is the most common greeting in Korean – equivalent to "Hello" in English. The word is 5 syllables long, so it contains 5 different syllable blocks (characters): (1) 안 an, (2) 녕 nyeong, (3) 하 ha, (4) 세 se, (5) 요 yo. Let's break each of these syllable blocks into their individual letters.

+ 안 "an"
    + ㅇ Consonant. At the beginning of a syllable it makes no sound (all syllables must start with consonants)
    + ㅏ Vowel.  'aw' sound
    + ㄴ Consonant. 'n' sound, points **n**orth and east. 
+ 녕 "nyeong"
    + ㄴ Consonant. 'n' sound, points **n**orth and east. 
    + ㅕ Vowel. 'yeo' sound
    + ㅇ Consonant. At the end of a syllable this makes an 'ng' sound, like the 'ng' in 'ing' in English
+ 하 "ha" 
    + ㅎ Consonant. 'h' sound, looks like a man in a **h**at
    + ㅏ Vowel. 'aw' sound 
+ 세 "se"
    + ㅅ Consonant, 's' sound, looks like a person doing a **s**plit
    + ㅔ Vowel. 'eh' sound like the end of the word 'say' 
+ 요 "yo"
    + ㅇ Consonant. Silent at the beginning of a syllable. 
    + ㅛ Vowel. 'yo' sound. 
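Because every syllable block is built from an initial consonant, a vowel, and an optional final consonant, Unicode lays the precomposed blocks out arithmetically. The sketch below is an aside rather than part of the recognition pipeline. The helper names `compose` and `decompose` are made up for illustration, and the tables assume the standard Unicode ordering of 19 initials, 21 vowels, and 28 finals (27 final consonants plus "no final").

```python
# Letters in the order Unicode uses to lay out precomposed syllable blocks.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"  # 19 initial consonants
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"  # 21 vowels
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # "no final" + 27 finals

SYLLABLE_BASE = 0xAC00  # code point of 가, the first syllable block


def compose(initial, medial, final=0):
    """Build a syllable block from indices into the three tables."""
    offset = (initial * len(MEDIALS) + medial) * len(FINALS) + final
    return chr(SYLLABLE_BASE + offset)


def decompose(block):
    """Split one syllable block into (initial, vowel, final) letters."""
    offset = ord(block) - SYLLABLE_BASE
    initial, rest = divmod(offset, len(MEDIALS) * len(FINALS))
    medial, final = divmod(rest, len(FINALS))
    return INITIALS[initial], MEDIALS[medial], FINALS[final]


# Every initial/vowel/final combination is a distinct block.
total_blocks = len(INITIALS) * len(MEDIALS) * len(FINALS)
print("possible syllable blocks:", total_blocks)

for block in "안녕하세요":
    print(block, decompose(block))
```

The product of the three table sizes is where the 11,172 figure comes from; the 2,350 classes in the dataset are just the commonly used subset.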
    
### Korean History
The Korean alphabet, which is called Hangeul (한글), was created in the 15th century by the 4th king of the Joseon dynasty – King Sejong the Great. Prior to the invention of Hangeul, Korean was written using Chinese characters. Due to the large number of characters required, literacy among common people was low. King Sejong and his linguists created the script to promote literacy among commoners and to establish a cultural identity for Korea through its own script. The script was finished in 1443 and published in 1446. A book, called the Hunminjeongeum (훈민정음), was also published to teach the new script. The Hunminjeongeum was published on October 9, 1446, and October 9 is now a commemorative holiday known as Hangeul Day (한글날) in South Korea. By the way, here is a fun, little-known fact: in North Korea the holiday is celebrated on January 15th and is referred to as Choseongeul Day. January 15th is the day that Hangeul was invented; October 9th is the day that Hangeul was proclaimed via the Hunminjeongeum.

In 2009, the following sculpture of King Sejong was raised in Gwanghwamun Plaza in Seoul.
![Statue of King Sejong](./images/KingSejong.jpg)

---
## Preparing the Data
The biggest challenge with this project was getting the data into a usable form. As you will see below, the data was in an unusual file format and zipped with a non-standard archive tool. Simply unzipping the data was a challenge. Once I got the data unzipped, I had to parse it. From there I had a number of decisions. Each one is explained below. 


### Option 1: Reformat the data as images, then upload to Google Drive
My first idea was to parse the data, organize it into training/validation/testing directories, and save 2d numpy arrays as images. This was my preferred solution as it makes using data augmentation simple and the images are in an easy to understand format. The problem was that the final dataset contained 5,139,450 images (2,350 classes * 2,187 samples per class) and was 24GB. The code took hours to run, and when it was finished I realized the fatal flaw to this approach... Google Drive file read/writes are S L O W. I expected uploading 24GB in the form of 5,139,450 images to take a while, but I didn't expect the estimated time-to-upload to be 41 days (actually it said 999 hours which appears to be the max estimate). 

However, I am quite proud of the code for this solution. It did work well, and this data format is significantly more convenient. If you have your own deep learning machine, then I would suggest this approach. Alternatively, if you don't mind paying for storage, I would recommend uploading the files to AWS or Paperspace and doing the image parsing there. 

#### Downloading the Data
You can download the data from its [original provider](http://cv.jbnu.ac.kr/index.php?mid=notice&document_srl=189) or use this direct [Dropbox link](https://www.dropbox.com/s/69cwkkqt4m1xl55/phd08.alz?dl=0). 

#### Unzipping the Data
The data is in a proprietary `.alz` format. If you're on Windows, you can unzip it using ALZip. If you are on Mac, I recommend using Unarchiver. If you are asked for the encoding format when unzipping, select "(MS, DOC) Korean", which should show an output file like 가.txt.

![You may not like it, but this is what peak proprietary formats look like](./images/unzip.jpg)

#### Inspecting the Data
Go ahead and look at the output files. You'll notice that there are 2,350 `.txt` files – one for each Korean character. Inside the text files are 2,187 samples of the character represented in the following format: `sample id`, `dimensions` (rows then columns), `binary representation`. If you squint hard enough at the example below, you should see that it resembles the character 가. 

```
s_0_0_0_0_1
22 29
00000000000000000000011100000
00000000000000000000011100000
11111111111111100000011100000
11111111111111100000011100000
00000000000011100000011100000
00000000000001100000011000000
00000000000011100000011100000
00000000000011100000011000000
00000000000001100000011100000
00000000000001100000011111111
00000000000011100000011111111
00000000000011100000011100000
00000000000011100000011000000
00000000000000000000011100000
00000000000000000000011100000
00000000000000000000011000000
00000000000000000000011000000
00000000000000000000011000000
00000000000000000000011000000
00000000000000000000011000000
00000000000000000000011100000
00000000000000000000011000000
```
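If squinting doesn't work, a few lines of Python can redraw a record with more contrast. This is a minimal sketch that assumes the record layout described above (an id line, a `rows cols` line, then one string of 0s and 1s per row). The tiny 4x5 record and the `parse_record` and `render` helpers are made up for illustration.

```python
import numpy as np

# A made-up miniature record in the same layout as the PHD08 text files.
record = """s_demo
4 5
11110
00010
00011
00010"""


def parse_record(text):
    """Parse one record into (sample_id, 2D uint8 array)."""
    lines = text.strip().splitlines()
    sample_id = lines[0]
    rows, cols = (int(n) for n in lines[1].split())
    pixels = np.array(
        [[int(c) for c in row] for row in lines[2:2 + rows]], dtype=np.uint8
    )
    if pixels.shape != (rows, cols):
        raise ValueError("header does not match pixel data")
    return sample_id, pixels


def render(pixels):
    """Draw 1s as filled blocks and 0s as dots so the glyph stands out."""
    return "\n".join("".join("█" if p else "·" for p in row) for row in pixels)


sample_id, pixels = parse_record(record)
print(sample_id, pixels.shape)
print(render(pixels))
```

Pasting a real record from one of the `.txt` files into `record` should work the same way, as long as the header matches the number of rows and columns that follow.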

#### What We Need To Do
We need to complete the following steps:
1. Write a file parser to turn each sample into a 2D numpy array
2. Write a function to save all of the 2D numpy arrays as images 
3. Iterate through all the files and save the images to train/validation/test folders

#### Parsing Files Into Numpy Arrays
The following code parses a file and turns each sample into a numpy array.

In [63]:
import re
import numpy as np 

def load_images(file):
    """Return all the characters as a list of 2D numpy arrays"""
    # Precompile match patterns for reuse 
    image_dimensions_regex = re.compile(r'^(\d+) (\d+)$')
    image_binary_regex = re.compile(r'^(\d+)$')

    # Return all images from file 
    images = [] 

    # Open the file 
    with open(file, 'r') as file:
        sample_id = 0 
        image = None 
        for line in file: 
            # Blank Lines: Add image to list 
            if line == '\n':
                if image is not None:
                    images.append(image)
                    image = None
                continue 

            # Sample IDs: Increment the sample number 
            if '_' in line:
                sample_id += 1
                continue 

            # Image Dimensions: Create numpy array 
            dims = re.match(image_dimensions_regex, line)
            if dims:
                rows = int(dims.group(1))
                cols = int(dims.group(2))
                row_index = iter(range(rows))
                image = np.zeros((rows, cols))

            # Binary Image: Add each row to array
            row_data = re.match(image_binary_regex, line)
            if row_data:
                row = next(row_index)
                data = [int(c) for c in list(row_data.group(1))]
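                image[row] = data

    # Keep the final sample if the file does not end with a blank line
    if image is not None:
        images.append(image)

    return images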

Given a list of images and an output directory, this code will save each image as a jpeg. Since our images were 0 or 1 before, we multiply them by 255. This means we will have to scale them when we load them into our convnet.

In [64]:
import os
from PIL import Image

def save_images(images, output_dir):
    """Saves an array of numpy images to the specified output directory"""
    for (idx, image) in enumerate(images):
        img = Image.fromarray(image*255.).convert("L")
        img_name = str(idx) + '.jpg'
        output_path = os.path.join(output_dir, img_name)
        img.save(output_path)

Finally, this script creates all the output directories, iterates through all the files, and saves the images accordingly. You should change the source directory and the target directory based on your information. Also, we use 1,187 images for training, 500 images for validation, and the remaining (500) images for testing. You can adjust this if you want as well. 

NOTE: This took 2 hours and 10 minutes to run on my MacBook Pro 👎🤕👎

In [68]:
import sys

# ------------------------------------------------------------
# UPDATE THESE  
# ------------------------------------------------------------
phd08_source_dir = '/Users/jtbergman/Datasets/phd08/'
phd08_target_dir = '/users/jtbergman/Datasets/phd08formatted'

# Paths for train, test, validation directories 
train_dir = os.path.join(phd08_target_dir, 'train')
test_dir = os.path.join(phd08_target_dir, 'test')
val_dir = os.path.join(phd08_target_dir, 'validation')

# Train / Val / Test split
train_split = 1187 
val_split = 500

# ------------------------------------------------------------
# DON'T CHANGE   
# ------------------------------------------------------------

def mkdir(directories):
    """Create directories if they don't already exist."""
    if not isinstance(directories, list):
        directories = [directories]
    for d in directories:
        # exist_ok avoids an error when the directory is already there
        os.makedirs(d, exist_ok=True)

mkdir([phd08_target_dir, train_dir, test_dir, val_dir])

def output_directories_for_file(file):
    """Return output directories for a character's images."""
    filename = os.path.basename(file)
    character = filename.split('.')[0]
    train_out = os.path.join(train_dir, character) 
    test_out = os.path.join(test_dir, character)
    val_out = os.path.join(val_dir, character)
    mkdir([train_out, test_out, val_out])
    return train_out, test_out, val_out 

# Iterate over all the character files and save them to train/validation/test
for file in sorted(os.listdir(phd08_source_dir)):
    if not file.endswith('.txt'):
        continue  # skip stray files such as .DS_Store
    train, test, val = output_directories_for_file(file)
    images = load_images(os.path.join(phd08_source_dir, file))
    save_images(images[:train_split], train)
    save_images(images[train_split:train_split+val_split], val)
    save_images(images[train_split+val_split:], test)

### Option 2: Upload the files to Google Drive, then convert to Images
After realizing it was going to be impossible to upload the dataset to Google Drive, I decided to upload the text files to Google Drive and convert them into images on Google Drive. I thought the input/output would be faster with Colaboratory. Uploading all 2,350 text files was estimated to take something like 16 hours, but really only took 3. Once they were uploaded, I ran the same code as above and simply changed the input and output directories to the ones on Google Drive. 

As mentioned in the previous section, processing the data only took 2 hours and 10 minutes on my MacBook. But my MacBook has a solid-state drive and 16GB of RAM. I'm not sure what Google Drive has, but I'm sure it's not as fast. The file reading and writing is the bottleneck here. After running the code for several hours it appeared that many of the images were not even showing up. I had to wait almost 24 hours for images to stop showing up so I could completely delete them from my account. 

I recommend you don't even try this. Although, as mentioned in the previous section, this may be a viable solution on Paperspace and AWS where I imagine the read/writes are much faster. Just beware the output is 24 GB (+7GB for the files), so you'll need to be prepared for storage and the GPU time to convert the images. 

### Option 3: Skip the image conversion, just load the data as numpy arrays
At this point, I already had the files uploaded to Google Drive. I knew this was my only hope of using the data in Colaboratory, so I had to find a way to make it work. I decided instead of parsing the files and converting them to images, I would just load the files directly into training, validation, and testing numpy arrays. 

The problem with this is that the images need to be 150x150 to be used by most CNN architectures easily available in Keras. Furthermore, the images are supposed to be in RGB format, so we have to duplicate the same image 3 times. If we were to load all of our data into a single numpy array it would have the shape `(5139450, 150, 150, 3)`... that's big. In fact, you can only parse 45 of the 2,350 files before you use the 15GB available to you on Google Colaboratory. If you just loaded the training data you could load 90 files – almost 4% of our data 😭 
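To put a number on "big", here is the back-of-the-envelope arithmetic as a sketch. The `array_bytes` helper is hypothetical, and the real footprint depends on the dtype you end up with (for example, dividing a `uint8` array by `255.0` promotes it to `float64`).

```python
import numpy as np


def array_bytes(shape, dtype):
    """Bytes needed to hold a dense array of the given shape and dtype."""
    return int(np.prod(shape, dtype=np.int64)) * np.dtype(dtype).itemsize


full_shape = (5_139_450, 150, 150, 3)  # every sample, resized and copied to 3 channels
one_file_shape = (2_187, 150, 150, 3)  # all samples of a single character

for name, shape in [("full dataset", full_shape), ("one character", one_file_shape)]:
    for dtype in (np.uint8, np.float32, np.float64):
        gb = array_bytes(shape, dtype) / 1e9
        print(f"{name:14s} {np.dtype(dtype).name:8s} {gb:12.2f} GB")
```

Even at one byte per value, the full array is far beyond what a Colaboratory session can hold, which is why the remaining options focus on shrinking what gets stored per sample.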

To make matters worse, each file was taking about 10 seconds with the reformatting which comes out to about 6.5 hours. When you run a colab file for too long, Google will kill it and wipe all your variables. Thus, even if you could load the data Google would almost certainly cut you off and erase your progress. 

### Option 4: Load, train, repeat
I realized if I wanted to use Colaboratory I was going to have to find a creative solution. 

My first improvement was to reduce the amount of data stored per image. Currently the images are `(150, 150, 3)`, which corresponds to 67,500 values. I decided to use a pre-trained model (which will be shown below) and feed each image through the convolutional base when it was first loaded. Then, instead of storing a `(150, 150, 3)` image I could store a `(1, 3, 3, 2048)` or a `(1, 2048)` tensor, depending on whether or not I used max-pooling on the final layer. These correspond to 18,432 or 2,048 values respectively – a significant reduction from 67,500. 
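The size difference is easy to see with a stand-in tensor. This sketch uses random numbers in place of real convolutional-base outputs and takes the `(1, 3, 3, 2048)` shape from the text above. Global max pooling is written out with plain numpy here rather than a Keras layer, so treat it as an illustration of the shapes only.

```python
import numpy as np

rng = np.random.default_rng(0)

image = np.zeros((150, 150, 3))            # one resized 3-channel input
feature_map = rng.random((1, 3, 3, 2048))  # stand-in for the conv base output

# Global max pooling: keep the largest activation per channel,
# collapsing the 3x3 spatial grid.
pooled = feature_map.max(axis=(1, 2))

print("input values:   ", image.size)
print("unpooled values:", feature_map.size)
print("pooled values:  ", pooled.size, pooled.shape)
```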

My second idea was to only load a few samples from each class per training iteration. For instance, I would load 25 randomly selected images from each class, feed each through the convolutional base when loaded, then train a simple model for 20 epochs on this dataset. After that, I would save the model, load 25 more images from each class, then train the model again. This was the solution I ended up using. As you'll see, it's not perfect... but it "works".

First, I defined a dictionary to map each filename to a class label `(0...2349)`. I also reversed the dictionary to map each label to its filename. 

In [None]:
import os 

# Map each filename to a label (sorted so the mapping is the same on every run)
filename_to_label = {} 
for (idx,filename) in enumerate(sorted(os.listdir(phd08_source_dir))):
  filename_to_label[filename] = idx

# Map each label to a filename 
label_to_filename = {v: k for k,v in filename_to_label.items()}

Next, I implemented a modified version of the `load_images` function which takes a filename and a set of sample ids to keep. The time and space improvements here come from limiting the number of samples being loaded, only allocating a numpy array for the specified samples, and stopping early after the last sample is loaded. Most important for our model is the last line before the return, `features = conv_base.predict(np.array(images)/255.0)`. This is where we get the reduction from 67,500 values to either 18,432 or 2,048 depending on the conv_base.

In [None]:
import re
import numpy as np 
from PIL import Image

img_size = 150

def load_features(conv_base, file, samples):
    """Return conv_base features for the requested samples of one character file.

    Sample ids are 0-based, matching the ids produced by get_sample_ids.
    """
    # Precompile match patterns for reuse 
    image_dimensions_regex = re.compile(r'^(\d+) (\d+)$')
    image_binary_regex = re.compile(r'^(\d+)$')

    # Use a set of plain ints for fast membership tests
    samples = set(int(s) for s in samples)
    max_sample_id = max(samples)

    # Resized images for the requested samples
    images = [] 

    def add_image(image):
        """Resize a 0/1 image to img_size x img_size and copy it into 3 channels."""
        # scipy.misc.imresize was removed in SciPy 1.3, so resize with Pillow instead
        pil_image = Image.fromarray((image * 255).astype(np.uint8))
        resized = np.array(pil_image.resize((img_size, img_size), Image.BILINEAR))
        images.append(np.stack((resized,)*3, axis=-1))

    # Open the file 
    with open(file, 'r') as f:
        sample_id = -1  # becomes 0 at the first sample id line
        image = None 
        for line in f: 
            # Blank Lines: Add image to list 
            if line == '\n':
                if image is not None:
                    add_image(image)
                    image = None
                continue 

            # Sample IDs: Increment the sample number 
            if '_' in line:
                sample_id += 1
                continue 

            # Stop early once we are past the last requested sample
            if sample_id > max_sample_id:
                break

            if sample_id in samples: 
                # Image Dimensions: Create numpy array 
                dims = re.match(image_dimensions_regex, line)
                if dims:
                    rows = int(dims.group(1))
                    cols = int(dims.group(2))
                    row_index = iter(range(rows))
                    image = np.zeros((rows, cols))

                # Binary Image: Add each row to array
                row_data = re.match(image_binary_regex, line)
                if row_data:
                    row = next(row_index)
                    data = [int(c) for c in list(row_data.group(1))]
                    image[row] = data

        # Keep the final sample if the file does not end with a blank line
        if image is not None:
            add_image(image)

    features = conv_base.predict(np.array(images)/255.0)
    return features

Before we define the massive function to load a few images per iteration and train our model, we are going to write a small generator to give us sample ids to load. Each time this generator is advanced, it returns a batch of sample ids that have not been returned before. We will load those samples from every class. There's no particular reason to use the same sample ids for every class on an iteration, but there's no harm in it either. I decided to use random sample ids rather than loading images in order, in case neighboring sample ids contain similar-looking characters – I'm not sure whether that's the case, but it did appear that way to me. 

One drawback is that the sample_range is usually set to 550 so that we can stop after the 550th sample id... this means if we test our model using data outside this range it may be a slightly different distribution. 
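The non-repeating property comes from slicing a single random permutation into consecutive chunks. The sketch below checks that idea in isolation. `chunked_permutation` is a hypothetical stand-in written for this check, and the numbers (550 ids, 25 per batch, 20 iterations) are the values used elsewhere in this post.

```python
import numpy as np


def chunked_permutation(sample_range, num_samples, num_iterations, seed=None):
    """Yield disjoint sets of sample ids drawn from range(sample_range)."""
    if num_samples * num_iterations > sample_range:
        raise ValueError("not enough ids to fill every iteration without repeats")
    order = np.random.default_rng(seed).permutation(sample_range)
    for i in range(num_iterations):
        yield set(order[i * num_samples:(i + 1) * num_samples].tolist())


seen = set()
for batch in chunked_permutation(550, 25, 20, seed=0):
    assert len(batch) == 25        # every batch is full
    assert seen.isdisjoint(batch)  # and shares no id with earlier batches
    seen |= batch

print(len(seen), "unique ids used out of 550")
```

The guard at the top matters: if `num_samples * num_iterations` exceeds the range, the later slices come back short or empty instead of raising an error.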

In [None]:
import numpy as np

def get_sample_ids(sample_range, num_samples, num_iterations):
  """Generate non-repeating sample ids in a given range.
  
  Arguments:
    - sample_range    The possible range of values
    - num_samples     The number of sample ids to return per call 
    - num_iterations  How many times the generator can be called
  """
  i = 0
  samples = np.random.permutation(sample_range)
  while i < num_iterations:
    yield samples[i*num_samples : (i+1)*num_samples]
    i += 1

Finally, here is the function that loads the samples, trains our model, and repeats the extremely long process. 

This line `print('\rLoaded classes:', loaded, end='', flush=True)` is a nifty trick to print the number of classes loaded using only one output line (it updates the same line during each print call). This allows us to maintain our sanity that our code is still running (albeit slowly) without spamming our output with prints. 

If you want to further preserve your sanity, you may want to change `verbose=0` in `hist = model.fit(features, labels, epochs=epochs, verbose=0)` to either `verbose=1` or `verbose=2` otherwise you won't see anything while your model is training. I guess `verbose=2` would probably be better as it just prints one update per epoch.

In [None]:
from keras.utils import to_categorical

def train_model(model, conv_base, feature_dims, samples_per_class=25, number_of_iterations=20, epochs=20):
    """Trains our model using a small number of samples from each class.

    This helper function will train our model. Suppose we use the default 
    number of iterations, 20, and the default number of samples per class, 25. 
    Then our model will select 25 images from each of the 2,350 classes and 
    train the model on them for 20 epochs. It will then repeat this process for a
    total of 20 iterations. After each iteration it will display the accuracy 
    and the predictions of the model on a set of examples.

    Arguments
    - model                 The model we're training 
    - conv_base             The conv_base to use to extract features from the samples
    - feature_dims          The dimension of the inputs to the model/outputs of conv base when reshaped
    - samples_per_class     The number of samples from each of 2,350 classes
    - number_of_iterations  The number of times we sample and retrain 
    - epochs                Epochs per iteration 
    """

    phd08_source_dir = '/content/drive/My Drive/Colab Notebooks/Datasets/phd08/'
    num_classes = 2350
    training_sample_range = 550 
    sample_count = num_classes * samples_per_class
    accuracy_history = [] 
    first_ten_prediction_history = []
    unique_ten_prediction_history = []

    i = 1
    for samples in get_sample_ids(training_sample_range, samples_per_class, number_of_iterations):
        print('STARTING ITERATION', i)
        print('===================' + len(str(i)) * '=')
        print('')

        # -----------------------------------------------------------
        # Preallocate space for the labels and the extracted features
        # -----------------------------------------------------------
        features = np.zeros((sample_count, feature_dims))
        labels = np.zeros((sample_count))

        # ----------------------------------------------------
        # Perform feature extraction for the specified samples 
        # ----------------------------------------------------
        loaded = 0
        for file in os.listdir(phd08_source_dir):
            extracted_features = load_features(conv_base, os.path.join(phd08_source_dir,file), samples)
            extracted_features = np.reshape(extracted_features, (extracted_features.shape[0], feature_dims))
            features[loaded * samples_per_class : (loaded+1) * samples_per_class] = extracted_features
            labels[loaded * samples_per_class : (loaded+1) * samples_per_class] = np.array([filename_to_label[file]] * len(extracted_features))
            loaded += 1
            print('\rLoaded classes:', loaded, end='', flush=True)
        labels = to_categorical(labels)
        print('\n')

        # ----------------------------------------------------
        # Fit to the loaded samples 
        # ----------------------------------------------------
        hist = model.fit(features, labels, epochs=epochs, verbose=0)
        accuracy = hist.history['acc']
        accuracy_history += accuracy
        print('Accuracy:', accuracy[-1], end='\n\n')

        # ----------------------------------------------------
        # Check predictions on first 10 samples
        # ----------------------------------------------------
        print('Predictions on first 10 samples')
        print('-------------------------------')
        for sample_number in range(10):
            actual_class_id = np.argmax(labels[sample_number])
            actual_class_name = label_to_filename[actual_class_id].split('.')[0]
            print('Actual Class:', actual_class_name)
            predicted_class_id = model.predict_classes(np.expand_dims(features[sample_number], axis=0))[0]
            predicted_class_name = label_to_filename[predicted_class_id].split('.')[0]
            print('Predicted Class:', predicted_class_name)
            first_ten_prediction_history.append((actual_class_name, predicted_class_name))
            print('')
        print('')

        # ----------------------------------------------------
        # Check predictions on 10 unique samples
        # ----------------------------------------------------
        print("Predictions on 10 unique samples")
        print('--------------------------------')
        # Step by samples_per_class so each sample comes from a different class
        for sample_number in range(0, 10 * samples_per_class, samples_per_class):
            actual_class_id = np.argmax(labels[sample_number])
            actual_class_name = label_to_filename[actual_class_id].split('.')[0]
            print('Actual Class:', actual_class_name)
            predicted_class_id = model.predict_classes(np.expand_dims(features[sample_number], axis=0))[0]
            predicted_class_name = label_to_filename[predicted_class_id].split('.')[0]
            print('Predicted Class:', predicted_class_name)
            unique_ten_prediction_history.append((actual_class_name, predicted_class_name))
            print('')

        print('')
        model.save('/content/drive/My Drive/Colab Notebooks/phd08_model.h5')
        print('Model Saved')

        print('\n\n')
        i += 1

    return accuracy_history, first_ten_prediction_history, unique_ten_prediction_history

My initial thought with this function was, "it takes so long to load data, I might as well train a lot on it." I found that with 20 epochs I could get really high accuracy on those samples, but there was pretty extreme overfitting. The training takes a non-trivial amount of time itself, so it may be better to just do 1-3 epochs and get on to loading your next batch of samples. 

---
## Architecture Decisions

### Introduction
This post, and the usage of the PHD08 dataset, was inspired by [_Variations of AlexNet and GoogLeNet to Improve Korean Character Recognition Performance_](http://jips-k.org/q.jips?cp=pp&pn=537). The authors trained two networks, KCR-AlexNet and KCR-GoogLeNet, for Korean Character Recognition (hence the KCR prefix). 

The KCR-AlexNet architecture is described as follows.
> The overall architecture of KCR-AlexNet is the same as AlexNet, but KCR-AlexNet uses a 56×56-pixel input data size for Korean character images, which is smaller than AlexNet’s input data size of 256×256 for natural images. In addition, while the output layer of the existing AlexNet only has 1,000 nodes for classifying ILSVRC’s classes, KCR-AlexNet needs 2,350 nodes at the output layer to classify PHD08’s 2,350 Korean character classes.

The KCR-GoogLeNet architecture is described as follows. 
> The biggest difference between GoogLeNet and KCR-GoogLeNet is that GoogLeNet uses nine inception modules and KCR-GoogLeNet uses only three inception modules. This is because GoogLeNet’s purpose is to classify the nature of a 256×256×3 image size and KCR-GoogLeNet’s purpose is to classify small Korean characters of size 56×56×1. The KCR-GoogLeNet architecture is shown in Fig. 3 and the detail size of each layer, including the inception modules, is introduced in Table 1.

### Our Model 
We are going to add to the mix by training KCR-ResNet50. That is, we'll be using the ResNet50 architecture (provided by Keras) for Korean Character Recognition. The paper resizes its inputs to 56x56x1, but the code below feeds ResNet50 200x200x3 inputs, since the pre-trained ImageNet weights expect three-channel images. We will also have to replace the top layers with a 2,350-dimension softmax classifier. Although the ImageNet images aren't very similar to Korean characters, we are still going to use the ImageNet weights. 

### Training Approach 
1. We will train a small FCNN using the outputs of ResNet. 
2. We will add the FCNN to the ResNet model. 
3. We will "fine tune" some of the convolutional base. 

In [35]:
from keras import layers
from keras.applications import ResNet50
from keras.models import Model

resnet = ResNet50(include_top=False, weights='imagenet', input_shape=(200, 200, 3))
output = layers.AveragePooling2D(pool_size=(7, 7), padding='valid')(resnet.output)
resnet_base = Model(resnet.input, output)
resnet_base.summary()



__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_15 (InputLayer)           (None, 200, 200, 3)  0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 206, 206, 3)  0           input_15[0][0]                   
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 100, 100, 64) 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 100, 100, 64) 256         conv1[0][0]                      
__________________________________________________________________________________________________
... (summary output truncated)

---
## Training KCR-ResNet50

### Training a Dense Network
First, we will train a dense network using the outputs of ResNet50.

In [36]:
from keras import models 
from keras import layers 
from keras import optimizers

# Define the model 
resnet_top = models.Sequential() 
resnet_top.add(layers.Dense(2350, activation='softmax', input_dim=1*1*2048))

# Compile the Model 
resnet_top.compile(optimizer=optimizers.RMSprop(lr=2e-5),
                  loss='categorical_crossentropy',  # one-hot labels over 2,350 classes
                  metrics=['acc'])

# Train the Model 
# history = resnet_top.fit(train_images, train_labels,
#                          epochs=30,
#                          batch_size=32,
#                          validation_data=(val_images, val_labels))
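For intuition about what this classifier head is asked to do, here is a small numpy sketch (not the Keras implementation) of a softmax output scored against a one-hot label with categorical cross-entropy. The logits are random stand-ins, the class id is arbitrary, and the helper names are made up.

```python
import numpy as np


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def categorical_crossentropy(one_hot, probs, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -np.sum(one_hot * np.log(probs + eps), axis=-1)


num_classes = 2350
rng = np.random.default_rng(0)

logits = rng.normal(size=(1, num_classes))   # stand-in for the dense layer's raw scores
true_class = 42                              # arbitrary class id for the demo
one_hot = np.eye(num_classes)[[true_class]]  # shape (1, 2350)

probs = softmax(logits)
loss = categorical_crossentropy(one_hot, probs)
chance_loss = np.log(num_classes)            # loss of a uniform guess

print("loss:", float(loss[0]), "| uniform-guess loss:", float(chance_loss))
```

A uniform guess over 2,350 classes gives a loss of ln(2350) ≈ 7.76, so a freshly initialized classifier should start somewhere near that value and move down from there.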

### Combining ResNet with our Dense Network
Next, we stack the ResNet50 base, a flatten layer, and the dense classifier into a single model.

In [37]:
kcr_resnet = models.Sequential() 
kcr_resnet.add(resnet_base)
kcr_resnet.add(layers.Flatten())
kcr_resnet.add(resnet_top)

### Fine Tuning our Model 
To fine-tune, we freeze every layer before `res5c_branch2a` and leave that layer and everything after it trainable.

In [38]:
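resnet_base.trainable = True

# Unfreeze this layer and everything after it; freeze everything before it.
# The flag is kept separate from the layer name: a non-empty string is always
# truthy, so reusing one variable for both would leave every layer trainable.
first_trainable_layer = 'res5c_branch2a'
set_trainable = False
for layer in resnet_base.layers:
    if layer.name == first_trainable_layer:
        set_trainable = True
    layer.trainable = set_trainable

# Recompile the model after changing trainable flags so the change takes effect.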
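The freeze-by-name pattern is easy to get subtly wrong, so it can help to test the logic on stand-in objects before touching a real model. In this sketch `FakeLayer` and `freeze_before` are hypothetical helpers, and the layer list is an abbreviated, made-up sequence; only `res5c_branch2a` comes from the code above.

```python
from dataclasses import dataclass


@dataclass
class FakeLayer:
    """Minimal stand-in for a Keras layer: just a name and a trainable flag."""
    name: str
    trainable: bool = True


def freeze_before(layers, first_trainable_name):
    """Freeze every layer that comes before `first_trainable_name`."""
    unfreeze = False  # boolean flag, kept separate from the layer name
    for layer in layers:
        if layer.name == first_trainable_name:
            unfreeze = True
        layer.trainable = unfreeze
    return layers


names = ["conv1", "res4f_branch2c", "res5b_branch2c", "res5c_branch2a", "res5c_branch2b"]
layers = freeze_before([FakeLayer(n) for n in names], "res5c_branch2a")

for layer in layers:
    print(f"{layer.name:16s} trainable={layer.trainable}")
```

On a real model, the same check amounts to printing `layer.name` and `layer.trainable` for each entry of `resnet_base.layers` after the loop has run.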