# DOPE
---

With `dope`, our goal is to make all existing standard machine learning frameworks (say sklearn, surpriselib, PyTorch, TensorFlow, etc.) interoperable. That is, one can develop and train a model, say, Linear Regression in sklearn, and score it using a TensorFlow server.

In this tutorial, we walk through an example demonstrating one such scenario.
 

## Usage
---
Setting the context (terminology used):  
1) Primal model - The base model provided by the user. For example, the primal model in the scenario demonstrated below is the `LogisticRegression()` class instance from sklearn.  
2) dope - The `dope` function converts your primal model to its DNN equivalent. `dope` also ensures that the functional and behavioural aspects of your primal model are retained when it's "dope"d.
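
As a rough illustration of what a "DNN equivalent" can mean for logistic regression: the model's probability is a sigmoid applied to a weighted sum, which is the same computation as a dense layer with one unit and a sigmoid activation (the training log later in this tutorial reports such a single-unit sigmoid layer). The sketch below is not `dope`'s internal code; the helper names and the parameter values are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_regression_proba(x, coef, intercept):
    """Classic formulation: P(y=1 | x) = sigmoid(coef . x + intercept)."""
    z = sum(c * xi for c, xi in zip(coef, x)) + intercept
    return sigmoid(z)


def dense_layer(x, weights, biases, activation):
    """Fully connected layer: one output per (weight vector, bias) pair."""
    return [
        activation(sum(w * xi for w, xi in zip(unit_weights, x)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


# Hypothetical parameters for a 4-feature binary problem (not fitted values)
coef = [0.4, -1.2, 2.1, 0.9]
intercept = -0.3
sample = [5.1, 3.5, 1.4, 0.2]

p_primal = logistic_regression_proba(sample, coef, intercept)
# The same parameters loaded into a one-unit dense layer with sigmoid activation
p_dnn = dense_layer(sample, weights=[coef], biases=[intercept], activation=sigmoid)[0]

print(p_primal, p_dnn)
```

Both calls return the same probability, which is the sense in which a single-layer network can stand in for the primal model.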


*Note - Using `dope` is straightforward for anyone with a working understanding of basic sklearn and Keras functionality.*

## Step 1: Loading and preprocessing dataset
---
In this example, we will use the iris dataset. The primal model used here is sklearn's Logistic Regression class. The `dope` function converts sklearn's Logistic Regression model to its neural network equivalent.

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import datasets

iris = datasets.load_iris()

X = iris.data
Y = iris.target

# Split the data into train and test batches
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.60, random_state=0)
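
Note that `test_size=0.60` holds out the larger share of the data: the iris dataset has 150 samples, so 90 go to the test set and 60 remain for training. The sketch below reproduces only those sizes with a seeded shuffle in plain Python; `holdout_split` is a hypothetical helper and does not replicate sklearn's exact shuffling.

```python
import math
import random


def holdout_split(n_samples, test_size, seed=0):
    """Shuffle sample indices and hold out a `test_size` fraction of them."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Round the test count up, as sklearn does for a float test_size
    n_test = math.ceil(n_samples * test_size)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = holdout_split(150, test_size=0.60)
print(len(train_idx), len(test_idx))
```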

## Step 2: Instantiate the primal model
---
Instantiate the model you wish to convert into a neural network. Here, we use sklearn's logistic regression.

In [2]:
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()

## Step 3: "Dope" your primal model!
---
Pass your primal model to the `dope` function to convert it to its DNN equivalent.

In [3]:
from mlsquare import dope
m = dope(model)

Using TensorFlow backend.
2019-11-07 16:21:56,983	INFO node.py:423 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-11-07_16-21-56_789/logs.
2019-11-07 16:21:57,092	INFO services.py:363 -- Waiting for redis server at 127.0.0.1:18579 to respond...
2019-11-07 16:21:57,202	INFO services.py:363 -- Waiting for redis server at 127.0.0.1:30532 to respond...
2019-11-07 16:21:57,204	INFO services.py:760 -- Starting Redis shard with 20.0 GB max memory.
2019-11-07 16:21:57,224	INFO services.py:1384 -- Starting the Plasma object store with 1.0 GB memory using /dev/shm.
Transpiling your model to it's Deep Neural Network equivalent...


---
__Note - The log messages about the Redis server are part of the optimization process that `dope` runs. The details will be covered in upcoming tutorials (yet to be published). So fret not! These messages can be safely ignored.__

## Step 4: Voila! You have successfully Doped your model
---
Once you have run the `dope` function on your primal model, the returned model (the variable `m` here) behaves like any other sklearn model. The only difference is that it is not a standard sklearn model but a DNN equivalent of the model you provided.
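
To make the idea of an sklearn-style interface over a network concrete, here is a minimal sketch, assuming a single sigmoid unit trained by stochastic gradient descent. `TinySigmoidNet` is a hypothetical class written for this illustration; it is not part of `mlsquare` and says nothing about how the doped model is implemented.

```python
import math


class TinySigmoidNet:
    """Hypothetical stand-in: one sigmoid unit behind fit/predict/score."""

    def __init__(self, learning_rate=0.1, epochs=500):
        self.learning_rate = learning_rate
        self.epochs = epochs
        self.weights = []
        self.bias = 0.0

    def _forward(self, row):
        z = sum(w * v for w, v in zip(self.weights, row)) + self.bias
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, X, y):
        self.weights = [0.0] * len(X[0])
        self.bias = 0.0
        for _ in range(self.epochs):
            for row, target in zip(X, y):
                # Gradient of binary cross-entropy w.r.t. the pre-activation
                error = self._forward(row) - target
                self.weights = [
                    w - self.learning_rate * error * v
                    for w, v in zip(self.weights, row)
                ]
                self.bias -= self.learning_rate * error
        return self

    def predict(self, X):
        return [1 if self._forward(row) >= 0.5 else 0 for row in X]

    def score(self, X, y):
        """Mean accuracy, mirroring the sklearn classifier convention."""
        predictions = self.predict(X)
        return sum(p == t for p, t in zip(predictions, y)) / len(y)


# Toy, linearly separable data (made up for the example)
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

net = TinySigmoidNet().fit(X, y)
print(net.predict([[0.0], [3.0]]), net.score(X, y))
```

The caller only sees `fit`, `predict`, and `score`, so code written against sklearn estimators does not need to know that a network sits underneath.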

The methods below demonstrate how closely a "dope'd" model resembles an sklearn model.

In [4]:
## Fit your model ##
m.fit(x_train, y_train)

2019-11-07 16:22:01,058	INFO tune.py:60 -- Tip: to resume incomplete experiments, pass resume='prompt' or resume=True to run()
2019-11-07 16:22:01,058	INFO tune.py:211 -- Starting a new experiment.


== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/4 CPUs, 0/0 GPUs
Memory usage on this node: 5.5/8.2 GB





== Status ==
Using FIFO scheduling algorithm.
Resources requested: 4/4 CPUs, 0/0 GPUs
Memory usage on this node: 5.6/8.2 GB
Result logdir: /home/shakkeel/ray_results/experiment_name
Number of trials: 1 ({'RUNNING': 1})
RUNNING trials:
 - train_model_0:	RUNNING

[2m[36m(pid=874)[0m Using TensorFlow backend.
[2m[36m(pid=874)[0m 2019-11-07 16:22:05,605	ERROR worker.py:1412 -- Calling ray.init() again after it has already been called.
[2m[36m(pid=874)[0m Instructions for updating:
[2m[36m(pid=874)[0m Colocations handled automatically by placer.
[2m[36m(pid=874)[0m Instructions for updating:
[2m[36m(pid=874)[0m Use tf.cast instead.
[2m[36m(pid=874)[0m 2019-11-07 16:22:05.961197: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[2m[36m(pid=874)[0m 2019-11-07 16:22:05.964389: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2304000000 Hz
[2m[36m(pi

2019-11-07 16:22:06,545	INFO ray_trial_executor.py:178 -- Destroying actor for trial train_model_0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.


Result for train_model_0:
  checkpoint: 'weights_tune_{''layer_1.units'': 1, ''layer_1.l1'': 0, ''layer_1.l2'':
    0, ''layer_1.activation'': ''sigmoid'', ''optimizer'': ''adam'', ''loss'': ''binary_crossentropy''}.h5'
  date: 2019-11-07_16-22-06
  done: false
  experiment_id: 64b24926e4c24e0487a82cb412ada368
  hostname: shakkeel-TUF-GAMING-FX504GD-FX80GD
  iterations_since_restore: 1
  mean_accuracy: 0.21666667262713116
  node_ip: 192.168.1.4
  pid: 874
  time_since_restore: 0.6463437080383301
  time_this_iter_s: 0.6463437080383301
  time_total_s: 0.6463437080383301
  timestamp: 1573123926
  timesteps_since_restore: 0
  training_iteration: 1
  
[2m[36m(pid=874)[0m 




== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/4 CPUs, 0/0 GPUs
Memory usage on this node: 5.5/8.2 GB
Result logdir: /home/shakkeel/ray_results/experiment_name
Number of trials: 1 ({'TERMINATED': 1})
TERMINATED trials:
 - train_model_0:	TERMINATED, [4 CPUs, 0 GPUs], [pid=874], 0 s, 1 iter, 0.217 acc

Creating model...
Instructions for updating:
Colocations handled automatically by placer.
Loading from /home/shakkeel/ray_results/experiment_name/train_model_0_2019-11-07_16-22-01b80k77v4/weights_tune_{'layer_1.units': 1, 'layer_1.l1': 0, 'layer_1.l2': 0, 'layer_1.activation': 'sigmoid', 'optimizer': 'adam', 'loss': 'binary_crossentropy'}.h5


<keras.engine.sequential.Sequential at 0x7f479f845b00>
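
The checkpoint filename in the log above embeds the network configuration that was tried as a Python-style dictionary. As a small sketch, the snippet below pulls that dictionary out of the filename so the settings are easier to read; `params_from_checkpoint` is a hypothetical helper, and the filename string is copied from the log.

```python
import ast

# Checkpoint filename as it appears in the training log
checkpoint = (
    "weights_tune_{'layer_1.units': 1, 'layer_1.l1': 0, 'layer_1.l2': 0, "
    "'layer_1.activation': 'sigmoid', 'optimizer': 'adam', "
    "'loss': 'binary_crossentropy'}.h5"
)


def params_from_checkpoint(name):
    """Extract the {...} portion of the filename and parse it as a dict."""
    start = name.index("{")
    end = name.rindex("}") + 1
    return ast.literal_eval(name[start:end])


params = params_from_checkpoint(checkpoint)
for key, value in params.items():
    print(f"{key}: {value}")
```

Read this way, the log describes a single layer with one unit, a sigmoid activation, no L1 or L2 regularization, the Adam optimizer, and a binary cross-entropy loss.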

In [5]:
## Score your model ##
m.score(x_test, y_test)



[0.10076400770081415, 0.3666666699780358]
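
The output above is a two-element list rather than a single number. Since the fitted object is a Keras `Sequential` model, a plausible reading is the Keras convention of loss first, followed by the compiled metric. The `[loss, accuracy]` order is an assumption here, not something this tutorial states. Under that assumption, the values can be unpacked as follows:

```python
# Values copied from the output above; the [loss, accuracy] order is assumed
score = [0.10076400770081415, 0.3666666699780358]
loss, accuracy = score

# 90 is the test-set size implied by test_size=0.60 on the 150 iris samples
n_test = 90
n_correct = round(accuracy * n_test)

print(f"loss={loss:.4f}, accuracy={accuracy:.2%} ({n_correct}/{n_test} correct)")
```

That the second value corresponds to a whole number of correct predictions out of 90 is consistent with reading it as accuracy.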

In [6]:
## Save your model ##
m.save('demo_path')

The maximum opset needed by this model is only 7.


*Note - The `save` method expects a single argument, the filename. The saved model is written to the directory you run your script from. By default, it is saved in three formats: h5, ONNX, and a serialized pickle file.*
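
To check what was written, one option is to list the files whose names start with the name passed to `save`. This is a minimal sketch: `find_saved_artifacts` is a hypothetical helper, and it assumes the saved files share the `demo_path` prefix, which this tutorial does not spell out.

```python
from pathlib import Path


def find_saved_artifacts(stem, directory="."):
    """Return sorted names of files in `directory` that start with `stem`."""
    return sorted(
        path.name for path in Path(directory).glob(f"{stem}*") if path.is_file()
    )


# Assumption: saved files are named after the argument passed to m.save()
print(find_saved_artifacts("demo_path"))
```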