## Snow Leopard Detection in 5 minutes with the Azure Machine Learning Workbench
<img src="https://amlbigdemo.blob.core.windows.net/public/connect/SLT.png" width="600"/>

#### There are only 4,000-8,000 snow leopards left in the wild. They face constant threats from:

<img src="https://amlbigdemo.blob.core.windows.net/public/connect/threats.png" width="800"/>

#### Because they are so rare, we know very little about their behavior, habitat, or population dynamics.
#### To protect these creatures and their habitat we need reliable data.

#### The Snow Leopard Trust has been monitoring leopards using a remote camera system for ~9 years.

<img src="https://amlbigdemo.blob.core.windows.net/public/connect/camera_trap.jpg" width="500"/>

#### They have gathered over 1.3 million images, but 90% of them are goats.

<img src="https://amlbigdemo.blob.core.windows.net/public/connect/trap_images.png" width="1900"/>

#### It would take more than 20,000 hours to manually find all of the leopards in the dataset.

#### We will build our detection system with Microsoft's new open-source, distributed deep-learning library: 

<a href="https://github.com/Azure/mmlspark"><img src="https://amlbigdemo.blob.core.windows.net/public/connect/mmlspark.png" width="600"/></a>

In [6]:
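import os

import matplotlib.pyplot as plt  # used for the confusion matrix figure
import mmlspark as mml
import pyspark
from pyspark.ml import Pipeline, PipelineModel
from pyspark.ml.classification import LogisticRegression
from pyspark.sql.functions import broadcast  # used for the per-camera joins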

#### First we load the data and split it into training and test sets by camera, so that no camera contributes images to both sets

In [8]:
wasbRoot = "wasb://amlperf2container1@amlbigdemo.blob.core.windows.net/"
images = spark.read.parquet(wasbRoot + "datasets/labelledSnowLeopardData.parquet")
camerasTrain, camerasTest = images.select("camera").distinct().randomSplit([.8, .2], seed=1)
train = images.join(broadcast(camerasTrain), "camera").cache()
test = images.join(broadcast(camerasTest), "camera").cache()
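
The reason for splitting on `camera` rather than on individual images is that frames from one camera share a background, so a per-image split would let near-duplicate scenes leak into the test set. The following is a minimal plain-Python sketch of the same idea, not the Spark code above. The helper `split_by_group` and the toy records are hypothetical, and unlike `randomSplit` it takes an exact fraction of the groups.

```python
import random


def split_by_group(records, group_key, test_fraction=0.2, seed=1):
    """Split records so that every group lands entirely in train or in test."""
    groups = sorted({r[group_key] for r in records})
    rng = random.Random(seed)
    rng.shuffle(groups)
    # Hold out a fixed share of the groups (at least one) for testing.
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [r for r in records if r[group_key] not in test_groups]
    test = [r for r in records if r[group_key] in test_groups]
    return train, test


# Toy data: 50 images spread evenly over 5 hypothetical cameras.
records = [{"camera": "cam%d" % (i % 5), "image_id": i} for i in range(50)]
train_records, test_records = split_by_group(records, "camera")

train_cameras = {r["camera"] for r in train_records}
test_cameras = {r["camera"] for r in test_records}
# No camera appears on both sides of the split.
print(train_cameras.isdisjoint(test_cameras))
```

Evaluating on cameras the model has never seen gives a more honest estimate of how it will behave on newly deployed camera traps.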

#### We use transfer learning to create a snow leopard detector that doesn't require millions of labelled training examples

<img src="https://amlbigdemo.blob.core.windows.net/public/connect/network_2.png" width="900"/>
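
As a rough illustration of the idea in the diagram, the sketch below keeps a feature extractor frozen and trains only a small logistic regression on top of its outputs. It is a toy in plain Python, not MMLSpark or ResNet50: `frozen_featurizer`, the synthetic "images", and the training loop are all assumptions made for the example.

```python
import math
import random


def frozen_featurizer(pixels):
    """Stand-in for a pretrained, truncated network; it is never updated."""
    mean = sum(pixels) / len(pixels)
    spread = max(pixels) - min(pixels)
    return [mean, spread]


def train_logistic_head(features, labels, epochs=200, lr=0.5):
    """Fit logistic regression on the fixed features with plain SGD."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            error = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x):
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0


# Synthetic data: class 1 "images" are bright, class 0 "images" are dark.
rng = random.Random(0)
bright = [[rng.uniform(0.6, 1.0) for _ in range(16)] for _ in range(10)]
dark = [[rng.uniform(0.0, 0.4) for _ in range(16)] for _ in range(10)]
data = [(img, 1) for img in bright] + [(img, 0) for img in dark]
rng.shuffle(data)
toy_images = [img for img, _ in data]
toy_labels = [label for _, label in data]

# Only the small head is trained; the featurizer stays fixed throughout.
toy_features = [frozen_featurizer(img) for img in toy_images]
weights, bias = train_logistic_head(toy_features, toy_labels)
accuracy = sum(
    predict(weights, bias, f) == y for f, y in zip(toy_features, toy_labels)
) / len(toy_labels)
print(accuracy)
```

Because the expensive part of the model is reused as-is, only a handful of parameters are learned, which is why a modest number of labelled examples can be enough.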

In [10]:
# Download the model from our repository of trained models
network = mml.ModelDownloader(spark, wasbRoot + "Models/").downloadByName("ResNet50")

# Create the truncated network by cutting off the last two output layers
featurizer = mml.ImageFeaturizer(inputCol="image", outputCol="features", cutOutputLayers=2).setModel(network)

# Add the logistic regression
classifier = LogisticRegression(featuresCol = "features", labelCol="label",
                                predictionCol="pred", probabilityCol="prob")
dc = mml.DropColumns(cols=["image", "features"]) 

# Stitch it all together into a single pipeline
model = Pipeline(stages=[featurizer, classifier, dc])

In [11]:
fitModel = model.fit(train)
predictions = fitModel.transform(test).cache()    

In [12]:
fig = plt.figure(figsize=(4.5, 4.5))
mml.plot.confusionMatrix(predictions, 'label', 'pred', ['No Leopard', 'Leopard'])
display(fig)
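
For readers who want numbers rather than a plot, the four cells of a binary confusion matrix can be tallied directly and turned into precision and recall. This is a minimal sketch in plain Python with made-up labels and predictions; the helper names are hypothetical and the values are not results from the snow leopard model.

```python
def confusion_counts(labels, preds):
    """Tally the four cells of a binary confusion matrix (1 = leopard)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, p in zip(labels, preds):
        if y == 1 and p == 1:
            counts["tp"] += 1
        elif y == 0 and p == 1:
            counts["fp"] += 1
        elif y == 0 and p == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1  # y == 1 and p == 0 for binary inputs
    return counts


def precision_recall(counts):
    """Precision and recall for the positive class, guarding empty cells."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Illustrative values only.
example_labels = [1, 0, 1, 1, 0, 0, 1, 0]
example_preds = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(example_labels, example_preds)
precision, recall = precision_recall(counts)
print(counts, precision, recall)
```

With a dataset where most images contain no leopard, recall on the leopard class is usually the number to watch, since a missed leopard is costlier than an extra image to review.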

In [13]:
# Stream from a Kafka topic
brokers = ["10.0.0.25:9092", "10.0.0.26:9092", "10.0.0.27:9092", "10.0.0.16:9092"]
imageStream = spark.streamFromKafka("images-aml", brokers, test.drop("label").schema, 200)

# Classify the incoming data using our trained model
predictionStream = fitModel.transform(imageStream)\
    .drop('rawPrediction').drop('pred')\
    .withColumn("prob", mml.get_value_at("prob", 1))

# Stream the results to PowerBI 
q1 = predictionStream.streamToPowerBI(os.environ["POWER_BI_URL"], 10000).start()

####  We can stream from Apache Kafka, through our trained network, and out to PowerBI to visualize the results 
<img src="https://amlbigdemo.blob.core.windows.net/public/connect/streaming_3.png" width="700"/>

#### We can now view the PowerBI dashboard and explore our results! 

<img src="https://amlbigdemo.blob.core.windows.net/public/connect/SnowLeopardPowerBI.gif" width="700"/>

In [16]:
fitModel.write().overwrite().save(wasbRoot + "Models/snowLeopardClassifier.mml")

#### For low-latency scoring, we can also deploy to a Kubernetes cluster using the `az ml` command-line tool:

```bash
az ml service create realtime \
    -f score.py \
    --model-file snowLeopardClassifier.mml \
    -n leopard_service \
    -r spark-py
```

#### We can then call this endpoint from a website:


<img src="https://amlbigdemo.blob.core.windows.net/public/connect/SnowLeopardWebsite.png" width="700"/>
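
A client call to such an endpoint might look roughly like the sketch below, which only builds the HTTP request and does not send it. Everything specific here is an assumption: the URL, the `image` field, the base64 encoding, and the bearer-token header are placeholders, and the real request format depends on what `score.py` expects.

```python
import base64
import json
import urllib.request


def build_scoring_request(image_bytes, url, api_key=None):
    """Build (but do not send) a JSON POST request carrying one image."""
    # Hypothetical payload schema: a single base64-encoded image.
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = "Bearer " + api_key
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Placeholder bytes and URL, for illustration only.
fake_image = b"not-a-real-image"
request = build_scoring_request(fake_image, "http://localhost:8080/score")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` once the real service URL and payload format are known.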

In [20]:
q1.isActive  # a property on PySpark's StreamingQuery, not a method

In [21]:
q1.stop()

#### Where to go for more information:

- [MMLSpark Github page](https://github.com/Azure/mmlspark)

- [Azure Machine Learning Workbench](https://docs.microsoft.com/en-us/azure/machine-learning/preview/quickstart-installation)

- [Snow Leopard Blog](https://blogs.technet.microsoft.com/machinelearning/2017/06/27/saving-snow-leopards-with-deep-learning-and-computer-vision-on-spark/)

- [Snow Leopard Trust](https://www.snowleopard.org/)

- White paper coming soon!