#### NOTE: Before you begin this tutorial, complete [Upload the Gesture Demo Project to SensiML Cloud](Getting%20Started%20-%20Tutorial%200%20-%20Project%20Upload.ipynb) using the Data Capture Lab.

# Building a Knowledgepack for Gesture Recognition

In this tutorial we walk you through setting up **Automated Pipelines with Widgets**.

The data you are going to use was collected from multiple subjects wearing a device with 6 sensor channels (Accelerometer x, y, z and Gyroscope x, y, z) and formatted using the Data Capture Lab. The goal is to use Automated Pipelines to build a model that can classify what type of activity the subjects were performing.

By the end of this tutorial you should be able to:
* Query data by using the **query widget**
* Create models by using the **automated pipelines widget**
* Understand the quality of the model
* Download the binary/library files by using the **model builder widget**


### Initialize a KB project

The data for this project has already been labeled and uploaded to the project "Gesture Demo". To access the data, first connect to the knowledgebuilder cloud service.

In [3]:
import pandas as pd
import numpy as np

from sensiml import SensiML
from sensiml.widgets import QueryWidget, AutoSenseWidget, DownloadWidget


# Connect to the cloud service and select the project that holds the gesture data
dsk = SensiML()
dsk.project = 'Gesture Demo'

### Initialize a pipeline

The next step is to initialize a pipeline space to work in. The work you do in the pipeline will be stored in KB Cloud so that you can share pipelines with collaborators and come back to stored work in the future.

In [2]:
dsk.pipeline = 'Gesture Pipeline'

### Query data by using query widget
In this tutorial we will build a query against data that was uploaded through the Data Capture Lab.

#### Query
* Query Name: The name for the query. You will also use this name to retrieve the query in the future.
* Segmenter: The name of the segmenter used to create the segments in the DCL.
* Label Column: The column that holds the gesture classification.
* Metadata Columns: Additional information about your data set that is useful for separating out individual data streams. In this example we have SegmentID, Gesture and Subject. SegmentID is the ID number of the segment. Subject identifies the individual user. Gesture provides the ground truth about what type of activity the user was performing.
* Sensor Columns: The data columns that you would like to include. In our case, these columns are the sensor data from the device:
    * 'AccelerometerZ'
    * 'AccelerometerY'
    * 'AccelerometerX'
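
As a rough illustration of how these fields fit together, the sketch below writes them down as a plain Python dictionary and runs a few sanity checks. It does not call the SensiML API: the query_spec dictionary, the check_query_spec helper, the query name "gesture_query", and the segmenter placeholder are all hypothetical, and the rule that a column should not be both sensor data and metadata is an assumption made for the example. The column names are the ones listed above.

```python
# Hypothetical description of the query fields; this is NOT a SensiML API call.
query_spec = {
    "name": "gesture_query",                       # hypothetical query name
    "segmenter": "<segmenter name from the DCL>",  # placeholder, fill in your own
    "label_column": "Gesture",
    "metadata_columns": ["SegmentID", "Gesture", "Subject"],
    "sensor_columns": ["AccelerometerZ", "AccelerometerY", "AccelerometerX"],
}


def check_query_spec(spec):
    """Return a list of problems found in the spec; an empty list means none were found."""
    problems = []
    for key in ("name", "segmenter", "label_column"):
        if not spec.get(key):
            problems.append("missing value for '%s'" % key)
    if not spec.get("sensor_columns"):
        problems.append("at least one sensor column is required")
    # Assumption for this sketch: a column is either sensor data or metadata, not both.
    overlap = set(spec.get("sensor_columns", [])) & set(spec.get("metadata_columns", []))
    if overlap:
        problems.append("columns used as both sensor and metadata: %s" % sorted(overlap))
    return problems


print(check_query_spec(query_spec))
```

Writing the fields down like this before opening the widget can make it easier to spot a missing segmenter name or a mistyped column.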


In [4]:
query_widget = QueryWidget(dsk)
query_widget.create_widget()

### Create models by using automated pipelines widget

Automated pipelines help you find a good set of features and pipeline parameters without having to write as much code. In order to use automation, you need to:
1. Initialize your pipeline with a query or data file and a segmenter (explained in the previous section)
2. Choose a pipeline seed (i.e. a template pipeline)

#### Pipeline Seeds
Pipeline seeds are pre-defined pipeline configurations stored on the server that populate the functions and parameters of your DSK pipeline. Instead of piecing together your own pipeline with function calls and other code, you can use a pre-set pipeline seed to run feature generators, selectors, transforms, and model generation algorithms based on a common pattern from the database.


##### Guidelines for Picking a Seed
- Basic Features - choose this if:
     - You are wondering where to start
     - You want execution to be as quick as possible
     - You want simple, easy-to-interpret features
- Advanced Features - choose this if:
     - You tried "Basic Features" and didn't get a good model
     - You don't mind if execution takes a while
     - You want the best possible features, even if they are complex 
- Downsampled Features - choose this if:
     - You are creating a gesture recognition application
- Histogram Features - choose this if:
     - You are creating a motor vibration application
- Custom Seed - choose this if:
     - You tried the other seeds and didn't get a good model
     - You want to build your own pipeline and use the genetic algorithm to find the best number of features, best number of neurons, and other model-related parameters
- No Feature Generation - choose this if:
     - You do not want to generate any features, only test the ones you have made offline (Note: the resulting knowledgepacks will not have a feature extraction algorithm, so they will not operate on a device; this seed is intended for testing only)
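
The guidelines above can be read as a small decision procedure. The helper below is a hypothetical sketch of that reading: suggest_seed and its arguments are not part of the SensiML client, and the order in which the rules are checked is an assumption, since the guidelines do not rank them.

```python
def suggest_seed(application=None, tried_basic=False, tried_other_seeds=False,
                 features_made_offline=False):
    """Suggest a pipeline seed name from the guidelines (hypothetical helper).

    application: "gesture", "motor vibration", or None if neither applies.
    """
    if features_made_offline:
        # Only testing features that were computed outside the pipeline.
        return "No Feature Generation"
    if tried_other_seeds:
        # The other seeds did not give a good model.
        return "Custom Seed"
    if application == "gesture":
        return "Downsampled Features"
    if application == "motor vibration":
        return "Histogram Features"
    if tried_basic:
        # "Basic Features" did not give a good model.
        return "Advanced Features"
    # Default starting point: quick to run and easy to interpret.
    return "Basic Features"


print(suggest_seed(application="gesture"))
```

For the gesture data in this tutorial, this reading of the guidelines points to the "Downsampled Features" seed.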


In [6]:
auto_widget = AutoSenseWidget(dsk)
auto_widget.create_widget()

### Understand the quality of the model

Results of the automated pipelines are saved in auto_widget.summary and auto_widget.results.

auto_widget.summary holds two reports:
- execution_summary: a dataframe that holds information (cached, name, runtime, step #, type) about each step in the pipeline
- fitness_summary: a dataframe that holds the fitness scores of the pipelines that were present at the final iteration. The first row in the dataframe has the highest fitness score.

In [66]:
auto_widget.summary['execution_summary']

| | cached | name | runtime | step # | type |
|---|---|---|---|---|---|
| 0 | False | test1 | 1.211877 | 0 | query |
| 1 | False | Strip | 1.230282 | 1 | transform |
| 2 | False | generatorset | 1.574082 | 2 | generatorset |


In [67]:
auto_widget.summary['fitness_summary']

| | accuracy | best_model | f1_score | features | fitness | flash | iteration | knowledgepack | latency | neurons | pipeline | positive_predictive_rate | precision | sensitivity | specificity | sram | stack |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 88.888889 | Fold 0 | 83.651436 | 12.0 | 2.194459 | 2970.0 | 3 | 67549d5a-599d-4e29-b4b5-b0944c73c5c5 | 156546375.0 | 10.0 | [{"outputs": ["temp.raw"], "type": "query", "n... | 92.261905 | 92.261905 | 81.898148 | 95.673077 | 1392.0 | 210.0 |
| 1 | 88.888889 | Fold 0 | 83.651436 | 12.0 | 2.194459 | 2970.0 | 4 | d9aa8a48-ca02-4dc5-ba03-c7cba0a14eed | 156546375.0 | 10.0 | [{"outputs": ["temp.raw"], "type": "query", "n... | 92.261905 | 92.261905 | 81.898148 | 95.673077 | 1392.0 | 210.0 |
| 2 | 88.888889 | Fold 0 | 83.651436 | 12.0 | 2.194459 | 2970.0 | 4 | b288bfbd-ef60-467d-ba0b-c918b862c75d | 156546375.0 | 10.0 | [{"outputs": ["temp.raw"], "type": "query", "n... | 92.261905 | 92.261905 | 81.898148 | 95.673077 | 1392.0 | 210.0 |
| 3 | 88.888889 | Fold 0 | 83.651436 | 12.0 | 2.194459 | 2970.0 | 5 | 5e562931-98af-468e-ba89-805184b757eb | 156546375.0 | 10.0 | [{"outputs": ["temp.raw"], "type": "query", "n... | 92.261905 | 92.261905 | 81.898148 | 95.673077 | 1392.0 | 210.0 |
| 4 | 100.0 | Fold 0 | 100.0 | 16.0 | 2.143307 | 2970.0 | 4 | 2adac429-49a2-4fad-b2f3-b40b60362ff3 | 156546375.0 | 81.0 | [{"outputs": ["temp.raw"], "type": "query", "n... | 100.0 | 100.0 | 100.0 | 100.0 | 1392.0 | 210.0 |
| 5 | 76.54321 | Fold 0 | 82.80704 | 12.0 | 2.116697 | 2970.0 | 2 | | 156546375.0 | 10.0 | [{"outputs": ["temp.raw"], "type": "query", "n... | 77.883675 | 77.883675 | 88.425926 | 90.315842 | 1392.0 | 210.0 |
| 6 | 76.54321 | Fold 0 | 82.80704 | 12.0 | 2.116697 | 2970.0 | 5 | | 156546375.0 | 10.0 | [{"outputs": ["temp.raw"], "type": "query", "n... | 77.883675 | 77.883675 | 88.425926 | 90.315842 | 1392.0 | 210.0 |
| 7 | 100.0 | Fold 0, Iteration 0 | 100.0 | 103.0 | 1.941732 | 2970.0 | 5 | | 156551250.0 | 80.0 | [{"outputs": ["temp.raw"], "type": "query", "n... | 100.0 | 100.0 | 100.0 | 100.0 | 1392.0 | 210.0 |
| 8 | 72.839506 | Fold 0 | 65.19767 | 9.0 | 1.917904 | 2970.0 | 5 | | 156544750.0 | 10.0 | [{"outputs": ["temp.raw"], "type": "query", "n... | 72.234735 | 72.234735 | 64.305556 | 88.577361 | 1392.0 | 210.0 |
| 9 | 72.839506 | Fold 0 | 65.19767 | 9.0 | 1.917904 | 2970.0 | 5 | | 156544750.0 | 10.0 | [{"outputs": ["temp.raw"], "type": "query", "n... | 72.234735 | 72.234735 | 64.305556 | 88.577361 | 1392.0 | 210.0 |
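
To make the difference between fitness and accuracy concrete, the sketch below copies a few columns of rows 0, 4, 5 and 7 from the fitness summary above into plain Python dictionaries and picks a model in two ways. It is only an illustration: the rows list and the best_with_knowledgepack helper are hypothetical and are not part of the widget or of auto_widget.summary.

```python
# A few columns of rows 0, 4, 5 and 7 of the fitness summary shown above.
rows = [
    {"row": 0, "accuracy": 88.888889, "fitness": 2.194459, "features": 12, "neurons": 10,
     "knowledgepack": "67549d5a-599d-4e29-b4b5-b0944c73c5c5"},
    {"row": 4, "accuracy": 100.0, "fitness": 2.143307, "features": 16, "neurons": 81,
     "knowledgepack": "2adac429-49a2-4fad-b2f3-b40b60362ff3"},
    {"row": 5, "accuracy": 76.54321, "fitness": 2.116697, "features": 12, "neurons": 10,
     "knowledgepack": None},  # no knowledgepack id listed for this row
    {"row": 7, "accuracy": 100.0, "fitness": 1.941732, "features": 103, "neurons": 80,
     "knowledgepack": None},
]


def best_with_knowledgepack(rows):
    """Return the highest-fitness row that has a knowledgepack id, or None."""
    candidates = [r for r in rows if r["knowledgepack"]]
    return max(candidates, key=lambda r: r["fitness"]) if candidates else None


best = best_with_knowledgepack(rows)
# Highest accuracy, with fitness used only to break ties.
most_accurate = max(rows, key=lambda r: (r["accuracy"], r["fitness"]))

print("best fitness: row %d, %d neurons" % (best["row"], best["neurons"]))
print("best accuracy: row %d, %d neurons" % (most_accurate["row"], most_accurate["neurons"]))
```

With these rows the highest-fitness entry is row 0, even though row 4 reaches 100% accuracy. Row 4 uses 81 neurons and 16 features against 10 neurons and 12 features for row 0, which suggests (the table alone does not confirm it) that the fitness score also accounts for model size.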


#### Confusion Matrix of the model that has the highest fitness score

In [55]:
results = auto_widget.results
# Find the configuration whose first model is ranked best ('Rank 0') and print it
for c in results.configurations:
    if 'Rank 0' in c.models[0]._index:
        print(c.models[0])
        break

MODEL INDEX: Rank 0, Fold 0
ACCURACY: 88.9
NEURONS: 10
CONFUSION MATRIX:
                   A         D         M         U       UNK       UNC   Support   Sens(%)
         A      25.0       2.0       0.0       0.0       0.0       0.0      27.0      92.6
         D       0.0      24.0       0.0       0.0       0.0       0.0      24.0     100.0
         M       5.0       1.0       4.0       0.0       0.0       0.0      10.0      40.0
         U       0.0       1.0       0.0      19.0       0.0       0.0      20.0      95.0

     Total        30        28         4        19         0         0        81          

PosPred(%)      83.3      85.7     100.0     100.0                        Acc(%)      88.9



### Download the Knowledgepack in Library and Binary Form
Finally, let's download our knowledgepack in library and binary form.

* The first step is to save the knowledgepack. We will save the model that has the highest fitness score and name it BestModel.

In [68]:
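# Row 0 of the fitness summary has the highest fitness score
knowledgepack = auto_widget.summary['fitness_summary'].loc[0, 'knowledgepack']
print('knowledgepack id: {}'.format(knowledgepack))
# c is the 'Rank 0' configuration found by the loop in the previous cell
c.models[0].knowledgepack.save('BestModel')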

knowledgepack id: 67549d5a-599d-4e29-b4b5-b0944c73c5c5
Knowledgepack name updated.


* The widget below will help you download your binary/library file.

Follow the steps:
1. Select Knowledgepack: Select the knowledgepack you saved above, the one named "BestModel"
2. Select your target platform: Select your device
3. Debug: Set this to True if you need to debug the code on the platform
4. Test Data: None (use this if you want to upload a test file to the device, which will be used instead of the device sensor)
5. Download Type: binary/library
6. Quark Application: Select LED if you want to see the outputs through the LEDs

In [7]:
kp = DownloadWidget(dsk)
kp.create_widget()

No Knowledgepacks stored for this project on the cloud.
