# Machine learning on mobile
The goal is to get a trained image classifier working on your phone. The classifier should detect whether an image shows a cat or a dog, and report as a percentage how confident it is in each class.

Steps needed to complete this project:

1. Gather data for training a machine learning model and clean up the data.
2. Train an image classifier using Keras.
3. Create a RESTful API that we can send a request to and that uses our trained model.
4. Build a mobile application that can take a picture and classify how much cat or dog a picture is.

This tutorial is inspired by this Keras example.
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html

# Gather data for training a machine learning model

In this tutorial, we will present a few simple yet effective methods that you can use to build a powerful image classifier, using only very few training examples --just a few hundred or thousand pictures from each class you want to be able to recognize.

We will go over the following options:

* training a small network from scratch (as a baseline)
* using the bottleneck features of a pre-trained network
* fine-tuning the top layers of a pre-trained network

This will lead us to cover the following Keras features:

* fit_generator for training a Keras model using Python data generators
* ImageDataGenerator for real-time data augmentation
* layer freezing and model fine-tuning
* ...and more.

To acquire a few hundred or a few thousand training images belonging to the classes you are interested in, one possibility would be to use the Flickr API to download pictures matching a given tag, under a friendly license.

In our examples we will use two sets of pictures, which we got from Kaggle: 1000 cats and 1000 dogs (although the original dataset had 12,500 cats and 12,500 dogs, we just took the first 1000 images for each class). We also use 400 additional samples from each class as validation data, to evaluate our models.

That is very few examples to learn from, for a classification problem that is far from simple. So this is a challenging machine learning problem, but it is also a realistic one: in a lot of real-world use cases, even small-scale data collection can be extremely expensive or sometimes near-impossible (e.g. in medical imaging). Being able to make the most out of very little data is a key skill of a competent data scientist.

The data can be downloaded from Kaggle:
https://www.kaggle.com/c/dogs-vs-cats

Because we are awesome, we also stored the data on our FloydHub server:
https://www.floydhub.com/viewer/data/xLHRt9d9UnipGEEr4FkWdF/

Step 1.
Download the data from Kaggle and sort it into two folders: Train, containing 1000 cat images and 1000 dog images, and Validation, containing 400 images of each class.
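
The sorting in Step 1 could be done by hand, but a short script is less error-prone. The following is a minimal sketch that uses only the standard library. It assumes the Kaggle naming scheme `cat.<index>.jpg` / `dog.<index>.jpg`; the function name `split_by_class` and the `train/<class>` and `validation/<class>` layout are illustrative choices, not something the dataset prescribes.

```python
import os
import shutil


def split_by_class(source_dir, target_dir, classes=("cat", "dog"),
                   n_train=1000, n_validation=400):
    """Copy the first n_train images of each class to target_dir/train/<class>
    and the next n_validation images to target_dir/validation/<class>."""
    counts = {}
    for name in classes:
        # Assumes files are named "<class>.<index>.jpg"; sort by the numeric index.
        files = sorted(
            (f for f in os.listdir(source_dir) if f.startswith(name + ".")),
            key=lambda f: int(f.split(".")[1]),
        )
        subsets = {
            "train": files[:n_train],
            "validation": files[n_train:n_train + n_validation],
        }
        for subset, subset_files in subsets.items():
            out_dir = os.path.join(target_dir, subset, name)
            os.makedirs(out_dir, exist_ok=True)
            for f in subset_files:
                shutil.copy(os.path.join(source_dir, f), os.path.join(out_dir, f))
            counts[(subset, name)] = len(subset_files)
    return counts
```

Calling `split_by_class('data/train', 'data/sorted')` would copy the files and return how many images ended up in each folder; both paths here are placeholders.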
    

In [None]:
# preprocessing
import os
import errno
import itertools
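import scipy.misc  # imread/imresize/imsave were deprecated in SciPy 1.0 and later removed; this cell needs an older SciPy with Pillow installed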
import numpy as np

curr_dir = os.getcwd()
proj_dir = os.path.normpath(os.path.join(curr_dir))
image_dir = 'data/train'
input_filepath = os.path.normpath(os.path.join(proj_dir, image_dir))

validation_dir = 'data/processed/validation'
train_dir = 'data/processed/train'

class_dict = {'dog': 0, 'cat': 1}

def put_classes_into_separate_folders(parent_dir, images):
    make_sure_path_exists(parent_dir)

def make_sure_path_exists(path):
    try:
        os.makedirs(path)
    except OSError as exception:
        if exception.errno != errno.EEXIST:
            raise


train_shuffled_images = []
test_shuffled_images = []
def create_train_validation_shuffle_images():
    """Split the file names of every class into a training and a validation list."""
    test_percentage = 0.2
    train_files = []
    test_files = []
    for key in class_dict:
        class_images = [x for x in os.listdir(input_filepath) if x.startswith(key + ".")]
        k = int(len(class_images) * test_percentage)
        # extend instead of assign, so earlier classes are not overwritten
        test_files.extend(class_images[0:k])
        train_files.extend(class_images[k:])
    return train_files, test_files

train_images = []
validation_images = []
def resize_of_images():
    train_images = [scipy.misc.imresize(scipy.misc.imread(input_filepath + '/' + image), (100,50)) for image in train_shuffled_images]
    validation_images = [scipy.misc.imresize(scipy.misc.imread(input_filepath +
        '/' + image), (100,50)) for image in test_shuffled_images]
    return train_images, validation_images

def put_class_images_in_folders(save_dir, image_files, class_feature):
    # Assumes image_files is ordered by class and holds the same number of
    # images for every class, in the order given by class_feature.
    make_sure_path_exists(save_dir)
    per_class = max(1, len(image_files) // len(class_feature))

    for index, image in enumerate(image_files):
        # zero-based index, so the last image of a class stays in its own class
        class_counter = min(index // per_class, len(class_feature) - 1)

        class_dir = os.path.join(save_dir, class_feature[class_counter])
        make_sure_path_exists(class_dir)

        # random 8-digit suffix to build the file name
        suffix = "".join(map(str, np.random.randint(0, 9, 8)))
        save_image = os.path.join(class_dir, '{}_{}.jpeg'.format(class_feature[class_counter], suffix))

        scipy.misc.imsave(save_image, image)


if __name__ == '__main__':
    train_shuffled_images, test_shuffled_images = create_train_validation_shuffle_images()
    train_images, test_images = resize_of_images()
    # class names, in the same order the images were collected
    keys = list(class_dict)
    print(input_filepath)
    # write the resized images to the processed train and validation folders
    put_class_images_in_folders(os.path.join(proj_dir, train_dir), train_images, keys)
    put_class_images_in_folders(os.path.join(proj_dir, validation_dir), test_images, keys)
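
The preprocessing cell relies on `scipy.misc.imread`, `imresize` and `imsave`, which are no longer available in recent SciPy releases. One possible replacement is Pillow. The sketch below is an assumption about how the resize step could be ported, not the original method; `resize_image` is a hypothetical helper name. Note that Pillow takes the size as (width, height), while `scipy.misc.imresize` took (height, width), so the tutorial's `(100, 50)` becomes `(50, 100)` here.

```python
import os
from PIL import Image


def resize_image(src_path, dst_path, height=100, width=50):
    """Resize one image file and save the result to dst_path."""
    with Image.open(src_path) as img:
        # Pillow expects (width, height), the reverse of imresize's (height, width).
        resized = img.convert("RGB").resize((width, height))
    os.makedirs(os.path.dirname(dst_path) or ".", exist_ok=True)
    resized.save(dst_path)
    return resized.size
```

The function returns the new size as Pillow reports it, so with the default arguments it returns `(50, 100)`.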

# Train an image classifier in Keras

<img src="files/datasteps.png" />


In [8]:
import keras # machine learning
import numpy as np # math
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from parser import load_data # data loading (project-local helper, assumed to return a batch generator)

# link to floydhub data
floydhub = 'https://www.floydhub.com/viewer/data/8t4tSpRFEsqQ9uxsXfqu3i/train/'

# input size, assumed here to match the (100, 50) resize in the preprocessing step
img_width, img_height = 100, 50

# Step 1 - Collect Data
training_data = load_data('data/training')
validation_data = load_data('data/validation')

# Step 2 - Build Model (Keras 1 layer syntax)
# three convolution blocks: convolution, ReLU, 2x2 max pooling
model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(img_width, img_height, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# classifier head: a single sigmoid unit for the two classes
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))


model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

# Step 3 - Train Model (Keras 1 argument names)
model.fit_generator(
        training_data,
        samples_per_epoch=2048,
        nb_epoch=30,
        validation_data=validation_data,
        nb_val_samples=832)
model.save_weights('models/simple_CNN.h5')
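
The `fit_generator` call above uses Keras 1 argument names. Keras 2 renamed them: `samples_per_epoch` became `steps_per_epoch`, `nb_epoch` became `epochs`, and `nb_val_samples` became `validation_steps`. The two step arguments count batches rather than samples. The sketch below shows the conversion arithmetic. The batch size of 32 is an assumption, since the tutorial does not say what batch size the data generator uses, and `to_steps` is a hypothetical helper.

```python
def to_steps(num_samples, batch_size):
    """Number of batches needed to see num_samples once (a partial last batch counts)."""
    return -(-num_samples // batch_size)  # ceiling division


batch_size = 32  # assumed; not stated in the tutorial

keras2_args = {
    "steps_per_epoch": to_steps(2048, batch_size),   # replaces samples_per_epoch=2048
    "epochs": 30,                                    # replaces nb_epoch=30
    "validation_steps": to_steps(832, batch_size),   # replaces nb_val_samples=832
}
print(keras2_args)  # {'steps_per_epoch': 64, 'epochs': 30, 'validation_steps': 26}
```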


# Create a RESTful API to serve the trained Keras model and host it on Heroku

# Building an AI web app
Machine learning is its own paradigm, and it is complicated to learn. However, even once you have learned everything about machine learning, you still need to figure out a way to get your model up and working online.

There are a few different ways of doing that.

The easiest is to use a small Python library, while larger frameworks can be helpful for full web applications.



# Flask mini framework
<img src="files/flask.png" />
Flask is a microframework for creating web servers. With it you can create a web application that serves your trained machine learning models through an API.


# Trained Keras model
To be able to serve a model, we first need a trained model whose predictions can be returned by an API call. In our example we train a simple classifier that takes an image and returns the neural network's best guess of what the image depicts.

In [None]:

# function for prediction
import numpy as np
from keras.models import model_from_json

# Placeholder messages: the original leaves these two names undefined.
# Which class is 0 and which is 1 depends on how the training labels were assigned.
predictzero = "Predicted class 0"
predictone = "Predicted class 1"

def predict(data):
    # load json and create model
    with open('task.json', 'r') as json_file:
        loaded_model_json = json_file.read()
    loaded_model = model_from_json(loaded_model_json)
    # load weights into new model
    loaded_model.load_weights("task.h5")
    print("Loaded model from disk")
    # run the model on the input
    print("Predicting the value")
    prediction = loaded_model.predict(np.swapaxes(np.array(data), 0, 1))
    print(prediction)

    # a sigmoid output above 0.5 counts as class 1, anything else as class 0;
    # this can definitely be improved for making better predictions
    if prediction[0][0] > 0.5:
        print(predictone)
    else:
        print(predictzero)
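
The goal stated at the start is to report how much cat or dog a picture is, which a single sigmoid output already encodes. Here is a minimal sketch of that conversion. It assumes the label mapping from the preprocessing step (dog is 0, cat is 1), so the model output is read as the probability of "cat"; `to_percentages` is a hypothetical helper, not part of Keras.

```python
def to_percentages(probability, class_one="cat", class_zero="dog"):
    """Turn a sigmoid output in [0, 1] into a percentage for each class."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return {
        class_one: round(probability * 100, 1),
        class_zero: round((1 - probability) * 100, 1),
    }


print(to_percentages(0.75))  # {'cat': 75.0, 'dog': 25.0}
```

If the training labels were assigned the other way around, swap the two class names.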

In [None]:
# simple code for a python API using flask
#!flask/bin/python
from flask import Flask, jsonify

app = Flask(__name__)

tasks = [
    {
        'id': 1,
        'title': u'Buy groceries',
        'description': u'Milk, Cheese, Pizza, Fruit, Tylenol',
        'done': False
    },
    {
        'id': 2,
        'title': u'Learn Python',
        'description': u'Need to find a good Python tutorial on the web',
        'done': False
    }
]


from flask import abort

# getting an individual task
@app.route('/todo/api/v1.0/tasks/<int:task_id>', methods=['GET'])
def get_task(task_id):
    task = [task for task in tasks if task['id'] == task_id]
    if len(task) == 0:
        abort(404)
    return jsonify({'task': task[0]})

# getting all the tasks
@app.route('/todo/api/v1.0/tasks', methods=['GET'])
def get_tasks():
    return jsonify({'tasks': tasks})

from flask import make_response

# error handling
@app.errorhandler(404)
def not_found(error):
    return make_response(jsonify({'error': 'Not found'}), 404)

from flask import request

@app.route('/todo/api/v1.0/tasks', methods=['POST'])
def create_task():
    if not request.json or 'title' not in request.json:
        abort(400)
    task = {
        # start again at 1 if every task has been deleted
        'id': tasks[-1]['id'] + 1 if tasks else 1,
        'title': request.json['title'],
        'description': request.json.get('description', ""),
        'done': False
    }
    tasks.append(task)
    return jsonify({'task': task}), 201

@app.route('/todo/api/v1.0/tasks/<int:task_id>', methods=['PUT'])
def update_task(task_id):
    task = [task for task in tasks if task['id'] == task_id]
    if len(task) == 0:
        abort(404)
    if not request.json:
        abort(400)
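    # JSON strings arrive as str on Python 3 (the Python 2 version of this check used unicode)
    if 'title' in request.json and not isinstance(request.json['title'], str):
        abort(400)
    if 'description' in request.json and not isinstance(request.json['description'], str):
        abort(400)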
    if 'done' in request.json and type(request.json['done']) is not bool:
        abort(400)
    task[0]['title'] = request.json.get('title', task[0]['title'])
    task[0]['description'] = request.json.get('description', task[0]['description'])
    task[0]['done'] = request.json.get('done', task[0]['done'])
    return jsonify({'task': task[0]})

@app.route('/todo/api/v1.0/tasks/<int:task_id>', methods=['DELETE'])
def delete_task(task_id):
    task = [task for task in tasks if task['id'] == task_id]
    if len(task) == 0:
        abort(404)
    tasks.remove(task[0])
    return jsonify({'result': True})

if __name__ == '__main__':
    app.run(debug=True)
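
The to-do routes above show the general Flask pattern of routes, JSON in, and JSON out. A minimal sketch of the same pattern applied to a classifier might look like the following. The route path, the `pixels` field, and the `classify` stand-in are all hypothetical. In a real service, `classify` would call the loaded Keras model instead of computing a fake score.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def classify(pixels):
    """Stand-in for the trained model: returns a fake probability in [0, 1].

    Here it is just the mean brightness of a 2D list of 0-255 values.
    Replace the body with a call to the loaded model's predict().
    """
    flat = [value for row in pixels for value in row]
    return sum(flat) / (255.0 * len(flat)) if flat else 0.5


@app.route('/classify/api/v1.0/predict', methods=['POST'])
def predict_endpoint():
    payload = request.get_json(silent=True)
    if not payload or 'pixels' not in payload:
        return jsonify({'error': 'Expected JSON with a "pixels" field'}), 400
    probability = classify(payload['pixels'])
    return jsonify({'probability': probability})
```

To try it locally, add the same `app.run(debug=True)` call used in the to-do example and POST a JSON body such as `{"pixels": [[0, 255], [255, 0]]}` to the route.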


# Build a mobile application that can take a picture and classify how much cat or dog a picture is
<img src="files/reactnativeexpo.png" />


There are a lot of moving parts that need to fit together in order to get a machine learning model working on a mobile phone. Each domain is hard on its own, and getting them all to work together is tricky. Still, we should try to make everything work.

In order to do that we will use React Native, a framework made by Facebook that lets us use JavaScript to build native applications for iOS and Android.

https://facebook.github.io/react-native/

## Getting started with React Native
Start by installing the Create React Native App command-line tool from the terminal:
npm install -g create-react-native-app

Secondly, download Expo on your phone. Expo is an integrated environment for running your app.
https://expo.io/

By using Expo we can get an app running on any phone that has the Expo app installed.

You can even use it live on the web.
https://snack.expo.io
