# Continuous Machine Learning over Streaming Data

Streaming technology lets you ingest data as it is generated, process it on the fly, and run real-time analytics that can trigger actions. AWS offers a range of streaming tools as part of the [Amazon Kinesis](https://aws.amazon.com/kinesis/) family of services.

## _Kinesis Data Firehose vs. Kinesis Data Streams_

### Kinesis Data Firehose
* Amazon Kinesis Data Firehose is the easiest way to load streaming data into data stores and analytics tools. 
* It can capture, transform, and load streaming data into Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, and Splunk, enabling near real-time analytics with the business intelligence tools and dashboards you already use.
* It is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration. It can also batch, compress, and encrypt the data before loading it, minimizing the amount of storage used at the destination and increasing security.
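
As a rough illustration of the producer side, the sketch below sends one review to a delivery stream using the boto3 Firehose `put_record` call. The helper name `send_review_to_firehose`, the stream name, and the JSON-lines record layout are assumptions made for this example, not something the notebooks prescribe.

```python
import json


def send_review_to_firehose(firehose_client, stream_name, review):
    """Send one review (a dict) to a Firehose delivery stream as a JSON line."""
    # Firehose concatenates records at the destination, so a trailing newline
    # keeps each review on its own line in the delivered object.
    payload = (json.dumps(review) + "\n").encode("utf-8")
    return firehose_client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": payload},
    )


# Example usage (needs AWS credentials and an existing delivery stream;
# the stream name below is hypothetical):
#
#   import boto3
#   firehose = boto3.client("firehose")
#   send_review_to_firehose(
#       firehose,
#       "my-review-delivery-stream",
#       {"review_id": "R1", "review_body": "Great product!"},
#   )
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before pointing it at a real stream.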

### Kinesis Data Streams
* Amazon Kinesis Data Streams enables you to build custom applications that process or analyze streaming data for specialized needs. 
* You can continuously add various types of data such as clickstreams, application logs, and social media to an Amazon Kinesis data stream from hundreds of thousands of sources. 
* Within seconds, the data will be available for your Amazon Kinesis Applications to read and process from the stream.
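
To make the producer and consumer roles concrete, here is a minimal sketch built on the boto3 Kinesis calls `put_record`, `get_shard_iterator`, and `get_records`. The helper names, the JSON encoding, and reading a single batch from one shard are simplifying assumptions; a production consumer would typically loop over shards and follow `NextShardIterator`.

```python
import json


def put_review(kinesis_client, stream_name, review, partition_key):
    """Write one review (a dict) to a Kinesis data stream."""
    data = json.dumps(review).encode("utf-8")
    # The partition key determines which shard receives the record.
    return kinesis_client.put_record(
        StreamName=stream_name,
        Data=data,
        PartitionKey=partition_key,
    )


def read_batch(kinesis_client, stream_name, shard_id, limit=100):
    """Read one batch of reviews from the oldest available record of a shard."""
    iterator = kinesis_client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]
    response = kinesis_client.get_records(ShardIterator=iterator, Limit=limit)
    # Each record's Data field holds the raw bytes written by the producer.
    return [json.loads(record["Data"]) for record in response["Records"]]
```

With a real client, created for instance with `boto3.client("kinesis")`, the same two functions would talk to an existing stream.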

## In the following notebooks, we show how to start implementing continuous machine learning with the Kinesis streaming services.

* Create a **Kinesis Data Firehose** delivery stream to receive live customer review data, and write the streaming data to S3.
* Invoke a **SageMaker Endpoint** to predict the `star_rating` of streaming data (incoming reviews).
* Analyze streaming data with **Kinesis Data Analytics** (calculate the average star rating, approximate counts, and detect anomalies).
* Use **Kinesis Data Streams** to deliver streaming data to custom consumer applications

# Use Case 1: 
# Invoke a SageMaker Endpoint from Kinesis to receive a `star_rating` prediction 

## _Transform Data in Kinesis Data Firehose delivery stream_

<img src="img/kinesis_firehose_transform.png" width="90%" align="left">
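
The transformation step can be sketched as a Lambda-style function that follows the Firehose data transformation contract: each incoming record carries a `recordId` and base64-encoded `data`, and each returned record echoes the `recordId` with a `result` and re-encoded `data`. Everything else here is an assumption for illustration: the function names, treating the record payload as raw review text, and the JSON request and response format of the endpoint.

```python
import base64
import json


def transform_records(event, predict):
    """Enrich each Firehose record with a predicted star rating."""
    output = []
    for record in event["records"]:
        # Assumption: the record payload is the raw review text.
        review_body = base64.b64decode(record["data"]).decode("utf-8").strip()
        star_rating = predict(review_body)
        enriched = json.dumps(
            {"review_body": review_body, "star_rating": star_rating}
        ) + "\n"
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(enriched.encode("utf-8")).decode("utf-8"),
        })
    return {"records": output}


def make_sagemaker_predictor(runtime_client, endpoint_name):
    """Build a predict function backed by a SageMaker endpoint.

    Assumption: the endpoint accepts and returns JSON; adjust the payload
    to whatever the deployed model actually expects.
    """
    def predict(review_body):
        response = runtime_client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="application/json",
            Body=json.dumps({"features": [review_body]}),
        )
        return json.loads(response["Body"].read())
    return predict
```

In a Lambda handler, `predict` could be created once with `make_sagemaker_predictor(boto3.client("sagemaker-runtime"), endpoint_name)` and reused across invocations.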

## _Preprocess streaming data in Kinesis Data Analytics_

<img src="img/kinesis-analytics-transformed_data.png" width="90%" align="left">

# Use Case 2: 
# Analyze Streaming Data with Kinesis Data Analytics

## _Calculating AVG Star Rating_

<img src="img/use_case_1_analytics.png" width="80%" align="left">
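
Kinesis Data Analytics expresses this kind of aggregation in SQL over a window of the stream. To show what such a query computes, the plain-Python sketch below averages star ratings over fixed, non-overlapping (tumbling) windows. The function name, the `(timestamp, star_rating)` event shape, and the choice of a tumbling window are assumptions for illustration.

```python
from collections import defaultdict


def tumbling_window_average(events, window_seconds):
    """Average star ratings per fixed-size time window.

    events: iterable of (timestamp_seconds, star_rating) pairs.
    Returns a dict mapping each window's start time to its average rating.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, star_rating in events:
        # Every event belongs to exactly one window, identified by its start.
        window_start = int(timestamp // window_seconds) * window_seconds
        sums[window_start] += star_rating
        counts[window_start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}
```

For example, with 5-second windows the events `(0, 5)`, `(1, 3)`, and `(5, 4)` fall into two windows starting at 0 and 5, each with an average rating of 4.0.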

## _Detect Anomalies in Streaming Data_

<img src="img/use_case_2_anomaly.png" width="70%" align="left">
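
Kinesis Data Analytics provides a built-in `RANDOM_CUT_FOREST` function that assigns an anomaly score to each record. The sketch below is not that algorithm. It substitutes a much simpler running z-score, only to illustrate the idea of scoring each incoming value against what the stream has shown so far. The class name and any threshold applied to the score are assumptions.

```python
import math


class RunningZScore:
    """Score each value by its distance from the running mean, in standard deviations."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford's method)

    def score(self, value):
        """Return the z-score of value against earlier values, then fold it in."""
        if self.count >= 2:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0:
                z = abs(value - self.mean) / std
            else:
                # All earlier values were identical: any change is maximally unusual.
                z = 0.0 if value == self.mean else float("inf")
        else:
            z = 0.0  # not enough history to judge
        # Welford's online update of mean and variance.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return z
```

A stream of ratings alternating between 4 and 5 keeps scores low, while a sudden rating of 1 scores several standard deviations away and could be flagged against a chosen threshold.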

## _Calculate Approximate Counts of Streaming Data_

<img src="img/use_case_3_count.png" width="80%" align="left">
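
Approximate counting trades a small error for bounded memory, which matters when a stream never ends. The sketch below uses linear counting, a simple technique chosen here only for illustration and not necessarily what Kinesis Data Analytics uses internally. The function name and the bitmap size are assumptions.

```python
import hashlib
import math


def approximate_distinct_count(items, num_bits=8192):
    """Estimate the number of distinct items with a fixed-size bitmap."""
    bitmap = bytearray(num_bits)  # one byte per slot, for clarity over compactness
    for item in items:
        # A stable hash keeps the estimate reproducible across runs.
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % num_bits
        bitmap[index] = 1
    zero_slots = num_bits - sum(bitmap)
    if zero_slots == 0:
        raise ValueError("bitmap is full; increase num_bits")
    # Linear counting estimate: n is roughly -m * ln(fraction of empty slots).
    return -num_bits * math.log(zero_slots / num_bits)
```

Memory use depends only on `num_bits`, not on the number of items seen, and the estimate degrades as the bitmap fills up.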

# Use Case 3: 
# Implement Incremental Model Training with Streaming Data using Multi-Armed Bandit models

<img src="img/use_case_4_bandit.png" width="90%" align="left">
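
A multi-armed bandit learns incrementally: each reward observed on the stream immediately updates the estimate for the arm that was chosen. The sketch below uses epsilon-greedy, one of the simplest bandit strategies, as an illustration. The class name, the choice of epsilon-greedy, and the idea that each arm stands for a model variant are assumptions, not the method used in the notebooks.

```python
import random


class EpsilonGreedyBandit:
    """Pick the best-looking arm most of the time, and explore occasionally."""

    def __init__(self, num_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * num_arms    # times each arm was updated
        self.values = [0.0] * num_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental mean: no need to store past rewards.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

In a streaming setting, `select_arm` would choose which variant serves an incoming review, and `update` would be called once feedback for that prediction arrives.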

In [None]:
%%javascript
Jupyter.notebook.save_checkpoint();
Jupyter.notebook.session.delete();