# Cosmos DB IoT Data Generator

This notebook simulates IoT devices, loading data into a Cosmos DB container. If you activate the Analytics Store feature, you can also perform reporting and analytics with [Azure Synapse](https://azure.microsoft.com/en-us/services/synapse-analytics/). You can customize the code as necessary, changing the units, measures, number of devices, and outlier behavior. Tips and best practices will be shared in the code comments and **"Did you know?"** sections. 

## Business Scenario

The hypothetical scenario is a power plant, where IoT devices monitor [steam turbines](https://en.wikipedia.org/wiki/Steam_turbine). The code will create realistic revolutions per minute (RPM) and megawatts (MW) data for each turbine. There is one device for each turbine. Each of the measured units, RPM and MW, has a base value and a variation. The process will create one data point per second per unit per turbine (or device). 

&nbsp;

<img src="https://cosmosnotebooksdata.blob.core.windows.net/notebookdata/iot-ai-notebook-1.png" alt="Built-in nteract " width="50%"/>

&nbsp;


There will be one outlier per minute, at a random second. In those situations, RPM values will go up and MW output will go down, because of the circuit protection of the system. The idea is to see the data varying at the same time, but in opposite directions. Suggested analytics scenarios are [Predictive Maintenance](https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/predictive-maintenance-playbook) and [Anomaly Detection](https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/apps-anomaly-detection-api).


## Technical Information

+ The data will be uploaded directly to a Cosmos DB container.
+ We will create a database and a container, with 400 RU/s. In Cosmos DB, each write uses about 5 RUs for each 1 KB. 
+ Because of the 400 RU/s, the number of simulated devices is limited: the code below accepts up to 40 IoT devices at the same time. Each one of them will create 2 data points (RPM and MW) per second. For more details, check the **3.RequestUnit** notebook.
+ We will create 2 containers, both partitioned by deviceId.
+ Each measured unit (RPM and MW) has a base value and a variation percentage, that can be positive or negative.
+ The code runs for the number of minutes you want or until you stop it. Minimum value is 1 minute.
+ There will be 1 random outlier per minute of execution. For runs of 2 or more minutes, the outliers may or may not happen in sequence.
+ All devices have outliers at the same time, but with different variations.
+ Outliers will have a bigger base value and a bigger variation.
+ This code is not production grade; it was created as a demo and tutorial. In a production scenario, this notebook code should be enriched with error handling, [global data distribution](https://docs.microsoft.com/en-us/azure/cosmos-db/distribute-data-globally), etc. 
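
The throughput reasoning in the list above can be checked with a little arithmetic. The sketch below is only an estimate: it assumes the rule of thumb of about 5 RUs per write stated above, and the function names are made up for this illustration. Real request charges depend on document size and indexing policy.

```python
def estimated_ru_per_second(num_devices, points_per_device=2, ru_per_write=5):
    """Rough write cost per second: every device writes one document per unit per second."""
    return num_devices * points_per_device * ru_per_write


def max_devices(provisioned_ru=400, points_per_device=2, ru_per_write=5):
    """Largest device count whose estimated write cost fits in the provisioned throughput."""
    return provisioned_ru // (points_per_device * ru_per_write)


print(estimated_ru_per_second(20))  # 200
print(max_devices())                # 40
```

Under these assumptions, 20 devices use about half of the provisioned 400 RU/s, and 40 devices use all of it, leaving no headroom for queries that run while the generator is writing.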

&nbsp;


>**Did you know?** Cosmos DB is a great fit for IoT workloads. Click [here](https://docs.microsoft.com/en-us/azure/cosmos-db/use-cases) to learn more about Cosmos DB recommended use cases.


## 1st Step - Initialization 

We will start by creating, if necessary, the database and the container. To connect to the service, you can use our built-in instance of ```cosmos_client```. This is a ready to use instance of [CosmosClient](https://docs.microsoft.com/python/api/azure-cosmos/azure.cosmos.cosmos_client.cosmosclient?view=azure-python) from our Python SDK. It already has the context of this account baked in.

&nbsp;

We will also create:
+ The function that generates the elementary IoT values
+ The main function, which calls the first function and saves the data in a Cosmos DB container

&nbsp;

>**Did you know?** If you run the ```dir()``` Python command, you will see which libraries are loaded. ```json``` is pre-loaded.

&nbsp;

>**Did you know?** The word "value" is a reserved keyword, which would impact your queries. That's why we store the values in a property called ***measureValue***.

In [1]:
import random
import datetime
import time
import azure.cosmos as cosmos
import azure.cosmos.exceptions as exceptions
from azure.cosmos.partition_key import PartitionKey
import uuid



#  Initialization
db_name = "AnalyticsDb"
container_name = "IotData"
partition_key_value = "/deviceId"

# Key Objects Creation
database_client = cosmos_client.create_database_if_not_exists(db_name)
print(database_client, 'ok')

container_client = database_client.create_container_if_not_exists(id=container_name, partition_key=PartitionKey(path=partition_key_value),offer_throughput=400)
print(container_client, 'ok')

###########################################################################################
# The function that creates and returns IoT values
###########################################################################################
def retunrIotValues(deviceId, measureType, unitSymbol, unit, baseValue, variationPercentage,isOutlier,outlierSignal):
    if (isOutlier == 0):
        value = random.randint(int(baseValue - (baseValue * (variationPercentage)/100)), int(baseValue + (baseValue * (variationPercentage)/100 )))
    else: #Outlier!
        if (outlierSignal == 'Positive'):
            value = random.randint(int(baseValue), int(baseValue + (baseValue * (variationPercentage)/100 )))
        else:
            baseValue = int(baseValue/2) # Negative outlier: halve the base value, which compensates for the 30% increase applied in the main function.
            value = random.randint(int(baseValue - (baseValue * (variationPercentage)/100)), int(baseValue))

    docId = str(uuid.uuid4())

    iotData = {
    'id' : docId,
    'dateTime' : datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
    'deviceId' : deviceId,
    'measureType' : measureType,
    'unitSymbol' : unitSymbol,
    'unit' : unit,
    'measureValue' : value
    }

    return iotData


print('Functions retunrIotValues ok')

###########################################################################################
# The Main function
###########################################################################################
def iotSimulator(numDevs=40, minutes=1, printOutput=0): 

    # Initial validations
    
    if (numDevs > 40):
        print ("First parameter: Too many devices, maximum = 40")
        return
    
    if (minutes < 1):
        print ("Second parameter: Minimum minutes = 1")
        return

    if (printOutput != 1) and (printOutput != 0):
        print ("Third parameter: Only 1 (yes) and 0 (no) are accepted for terminal output printing")
        return

    # Devices list, starting with dev-1
    devPrefix = 'dev-'
    devicesList = [ ] 
    for i in range(1,numDevs+1):
        deviceId = devPrefix + str(i)
        devicesList.append(deviceId)

    # Units list
    # This is a Power Plant Scenario. Change as you want
    # layout is: measureType, unitSymbol, unit, baseValue, variationPercentage, outlierSignal
    # reference: https://www.mhps.com/products/steamturbines/lineup/thermal-power/1200/
    unitList = [('Rotation Speed','RPM','Revolutions per Minute',3000,10, 'Positive'), ('Output','MW','MegaWatts',1500,10,'Negative')]

    
    # How many measures based on the number per minutes?
    numberMeasures = int(minutes)*60 
    accNumberMeasures = 0
    
    # Outliers
    accOutliers = 0
    ouliersSet = set()
    while len(ouliersSet) < int(minutes):
        outlier = random.randint(1, int(minutes)*60)
        ouliersSet.add(outlier)
    print ('Starting the process. The outlier(s) will happen at: ',ouliersSet)
    
    # Create IoT Values based on a base value and a variation. Every device will return 1 value per unit per second.
    # Data is printed and saved into a Cosmos DB Container
    # Data modeling: We could have one document per device. But this approach is addressed in the data modeling notebook.
    while (accNumberMeasures <= numberMeasures):
        time.sleep(1)
        for deviceId in devicesList:
            for unit in unitList:
                if (accNumberMeasures in ouliersSet):  # If yes, time for outliers, 30% bigger!
                    iotData = retunrIotValues (deviceId,unit[0],unit[1],unit[2],unit[3]*1.3,unit[4]*1.3,1,unit[5])
                    print('Outlier:',deviceId,unit[2])
                else: # Regular measure
                    iotData = retunrIotValues (deviceId,unit[0],unit[1],unit[2],unit[3],unit[4],0,unit[5])
                container_client.create_item(body=iotData)
                if (printOutput == 1):
                    print(iotData)
        accNumberMeasures +=1 

        
print ('Function iotSimulator ok')


<DatabaseProxy [dbs/AnalyticsDb]> ok
<ContainerProxy [dbs/AnalyticsDb/colls/IotData]> ok
Functions retunrIotValues ok
Function iotSimulator ok
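
To make the value ranges concrete, the sketch below reproduces only the bounds that `retunrIotValues` passes to `random.randint`, without touching Cosmos DB. The helper name `value_range` is hypothetical; the arithmetic follows the function defined above, and the outlier calls use the base value and variation already multiplied by 1.3, as the main function does.

```python
def value_range(base_value, variation_pct, is_outlier=False, outlier_signal="Positive"):
    """Return the (low, high) bounds that random.randint would draw from."""
    spread = base_value * variation_pct / 100
    if not is_outlier:
        # Regular measure: base value plus or minus the variation percentage
        return int(base_value - spread), int(base_value + spread)
    if outlier_signal == "Positive":
        # Positive outlier: never below the (already increased) base value
        return int(base_value), int(base_value + spread)
    # Negative outlier: the base value is halved first, then only the lower side is used
    halved = int(base_value / 2)
    return int(halved - halved * variation_pct / 100), halved


print(value_range(3000, 10))                    # (2700, 3300) regular RPM
print(value_range(1500, 10))                    # (1350, 1650) regular MW
print(value_range(3900, 13, True, "Positive"))  # (3900, 4407) RPM outlier
print(value_range(1950, 13, True, "Negative"))  # (848, 975)   MW outlier
```

The outlier ranges do not overlap with the regular ones, which is why the outliers stand out clearly in the charts later in this notebook.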


## Step 2 - Running the Data Generator

+ 1st parameter is the number of devices.
+ 2nd parameter is the number of minutes. Minimum is 1 minute.
+ 3rd parameter is a flag, 1 (yes) or 0 (no), that controls whether the IoT data is printed to your terminal. It is a good idea to turn it off for executions longer than 5 minutes.

In [2]:
iotSimulator (2,1,1)

Starting the process. The outlier(s) will happen at:  {43}
{'id': '92a4a27d-eb2c-45a1-9291-b6c1fe4f1987', 'dateTime': '2020-04-09 18:53:23', 'deviceId': 'dev-1', 'measureType': 'Rotation Speed', 'unitSymbol': 'RPM', 'unit': 'Revolutions per Minute', 'measureValue': 2744}
{'id': 'c0da7759-0f6a-4b12-9bd1-8d12f50f1544', 'dateTime': '2020-04-09 18:53:23', 'deviceId': 'dev-1', 'measureType': 'Output', 'unitSymbol': 'MW', 'unit': 'MegaWatts', 'measureValue': 1498}
{'id': '9420bb35-8365-4de8-91a5-9a0ada2d1e26', 'dateTime': '2020-04-09 18:53:23', 'deviceId': 'dev-2', 'measureType': 'Rotation Speed', 'unitSymbol': 'RPM', 'unit': 'Revolutions per Minute', 'measureValue': 2972}
{'id': 'af815b27-0ca0-470c-a7a0-90ecbcd82ac3', 'dateTime': '2020-04-09 18:53:23', 'deviceId': 'dev-2', 'measureType': 'Output', 'unitSymbol': 'MW', 'unit': 'MegaWatts', 'measureValue': 1593}
{'id': '68ee0bae-3db1-4f72-8e92-50e331ee59fc', 'dateTime': '2020-04-09 18:53:24', 'deviceId': 'dev-1', 'measureType': 'Rotation Speed

KeyboardInterrupt: 
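
The first line of the output above lists the seconds at which outliers will happen. The simulator picks as many distinct seconds as there are minutes, anywhere in the whole run, so two outliers can land in the same minute. The sketch below shows an equivalent way to draw that schedule with `random.sample`; the function name and the `seed` parameter are assumptions added for this illustration, to make runs repeatable.

```python
import random


def outlier_schedule(minutes, seed=None):
    """Pick `minutes` distinct seconds between 1 and minutes*60, like the simulator's set."""
    rng = random.Random(seed)  # a private generator, so the global random state is untouched
    total_seconds = int(minutes) * 60
    return set(rng.sample(range(1, total_seconds + 1), int(minutes)))


schedule = outlier_schedule(3, seed=42)
print(sorted(schedule))  # three distinct seconds between 1 and 180
```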

## Step 3 - Analytics

Let's analyze the data using multiple statistics functions and visualizations. Everything starts with a query that loads 1000 metrics into a data frame. Why 1000? Because it is enough for our objectives with this notebook, but you can change it to any number up to 8640, which is the maximum number of data points that the Anomaly Detector API accepts in one request. 

Now we will:

&nbsp;

+ Use [Cosmos DB SQL API](https://docs.microsoft.com/en-us/azure/cosmos-db/sql-query-getting-started) to load data from the database into a data frame.
+ Visualize and manipulate the data with [Pandas](https://pandas.pydata.org/) and [matplotlib](https://matplotlib.org/) Python libraries.
+ Apply AI to our data, using the [Anomaly Detector API](https://azure.microsoft.com/en-us/services/cognitive-services/anomaly-detector) to extract hidden insights.
+ Load the results back to Cosmos DB, to persist the data and be able to create Power BI dashboards.

&nbsp;

>**Did you know?** You can set the default database and container context for all new queries using
 ```%database {database_id}``` and ```%container {container_id}``` syntax. Each of them needs to be in its own cell, as does the ```%%sql``` magic command.
 
 &nbsp;

>**Did you know?** The visualization and data preparation steps below are executed for only one device, the first one (dev-1). This notebook is intended to teach how to use all these capabilities, and we can't add markdown in the middle of Python cells, so a loop would not let us explain each step as the code runs. Everything starts with the query below; the following cells then filter the data by deviceId. In a "production scenario", all devices would be loaded and the rest of the code would run in a loop.

In [None]:
%database AnalyticsDb

In [None]:
%container IotData

In [None]:
%%sql  --output df_IotData
SELECT top 1000 c.dateTime, c.unitSymbol, c.measureValue, c.deviceId FROM c
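
If you prefer to load the data for a single device from Python instead of the `%%sql` magic, the sketch below shows one possible way with the SDK's `query_items` method and a parameterized filter. The function name `load_device_readings` is hypothetical, and `container` is assumed to be a container client such as the `container_client` created in the first step.

```python
def load_device_readings(container, device_id, limit=1000):
    """Return up to `limit` readings for one device as a list of dicts."""
    query = (
        f"SELECT TOP {int(limit)} c.dateTime, c.unitSymbol, c.measureValue, c.deviceId "
        "FROM c WHERE c.deviceId = @deviceId"
    )
    items = container.query_items(
        query=query,
        parameters=[{"name": "@deviceId", "value": device_id}],
        partition_key=device_id,  # deviceId is the partition key, so this stays in one partition
    )
    return list(items)


# Example (requires a live container client):
# rows = load_device_readings(container_client, "dev-1")
```

Filtering on the partition key keeps the query inside a single logical partition, which usually costs fewer RUs than scanning all devices and filtering in the data frame.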

## Step 4 - Built-in Data Visualizations

We'll run queries and use the built-in [nteract data explorer](https://blog.nteract.io/designing-the-nteract-data-explorer-f4476d53f897) to allow data visualizations. After the following code lines, all cells will be enabled with nteract data explorer, with multiple visualization options on the right side of the result sets. For more information, check the **2.Visualization** notebook.

&nbsp;

``pd.options.display.html.table_schema = True``

``pd.options.display.max_rows = None``  

&nbsp;

You will notice that, despite the great chart options, we can't use all the possibilities because the values of different units are mixed in the same column. To fix it, we will filter the data per unit. Change it as you want and get comfortable with the visualization options. One suggestion is the **line chart**, where you can use multiple extra visualizations like the **stacked area chart**. 

&nbsp;

>**Did you know?** After we run the nteract options, only the last data output command in a cell will have its data shown. If you want more than one data output command, you'll need one cell for each.



In [None]:
import pandas as pd

# Turning on the visualization options
pd.options.display.html.table_schema = True
pd.options.display.max_rows = None

# Example: let's see what we loaded from Cosmos Db
# df_IotData.head(10)
df_IotData[(df_IotData.unitSymbol == "RPM") & (df_IotData.deviceId == "dev-1")]


## Step 5 - Data Preparation and Advanced data Visualization

But filtering is not ideal, and many other data manipulations will be necessary to reach our objective. That's why we will do some data preparation in the next steps. We need to pivot the data to have one column for each metric, which avoids the mistake of comparing different measures. This is also a required process for data science projects, called featurization: each column can be used as an input for a machine learning model, which is why the table must be pivoted. 

&nbsp;

We also need to remove the rows with missing data. This can happen when one metric is created in one second and the other in the next second, as you can see in the image below, a real snapshot of the data we just created using this generator.

&nbsp;

<img src="https://cosmosnotebooksdata.blob.core.windows.net/notebookdata/iot-ai-notebook-2.PNG" alt="Built-in nteract " width="75%"/>

&nbsp;


>**Did you know?** Data preparation is a [key phase](https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/lifecycle-modeling) of data science projects and may include [data engineering](https://docs.microsoft.com/en-us/learn/certifications/roles/data-engineer), [data featurization](https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/data-transformation-learning-with-counts), [data cleaning (or cleansing)](https://en.wikipedia.org/wiki/Data_cleansing), and more. It may take up to 80% of the whole project!




In [None]:
import pandas as pd
import matplotlib.pyplot as plt

###########################################################################################
# Data Preparation
###########################################################################################

# Let's keep df_IotData as is, so we will never change it. All changes will happen in the new data frame.
# Pivoting the original data for 1 device into a new one, with one column per unit.
new_df = df_IotData[(df_IotData.deviceId == "dev-1")]
new_df=new_df.pivot(index='dateTime', columns = 'unitSymbol' , values =  'measureValue')

# Removing lines with missing data. 
new_df=new_df.dropna()

# The pivot will create a df where the dateTime column is the index. 
# But for the anomaly detection, we need a column with that content called timestamp
new_df['timestamp']=new_df.index

# We also need a sequential id, to join the dataframes in the end of the process
new_df['index']=list(range(len(new_df)))

# Now let's make the sequential the index of the dataframe
new_df.set_index('index',inplace=True)


###########################################################################################
# Plotting the Data
###########################################################################################
# One chart per unit.
# If you change or add the units, from now on you need to customize the code.
new_df.plot(y='MW', x= 'timestamp', color='green',figsize=(20,5), label = 'Output MW')
plt.title('MW TimeSeries')
new_df.plot(y='RPM', x= 'timestamp', color='red', figsize=(20,5), label = 'RPM')
plt.title('RPM TimeSeries')
plt.legend(loc = 'best')
plt.show()
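
To see what the pivot and `dropna` steps above do, here is a dependency-free sketch of the same idea on three toy readings (illustrative values, not real generator output). It is not how pandas implements `pivot`; it only mirrors the result: one row per `dateTime`, one entry per unit, and rows with a missing unit removed. The function name is hypothetical.

```python
def pivot_complete_rows(readings):
    """Group readings by dateTime with one entry per unitSymbol, dropping incomplete rows."""
    units = sorted({r["unitSymbol"] for r in readings})
    by_time = {}
    for r in readings:
        by_time.setdefault(r["dateTime"], {})[r["unitSymbol"]] = r["measureValue"]
    # Keep only the timestamps that have a value for every unit (the dropna step)
    return {ts: row for ts, row in sorted(by_time.items()) if all(u in row for u in units)}


toy_readings = [
    {"dateTime": "2020-04-09 18:53:23", "unitSymbol": "RPM", "measureValue": 2744},
    {"dateTime": "2020-04-09 18:53:23", "unitSymbol": "MW", "measureValue": 1498},
    {"dateTime": "2020-04-09 18:53:24", "unitSymbol": "RPM", "measureValue": 2972},  # MW is missing
]

print(pivot_complete_rows(toy_readings))
# {'2020-04-09 18:53:23': {'RPM': 2744, 'MW': 1498}}
```

The second timestamp is dropped because it has no MW value, which is exactly the situation shown in the snapshot image above.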


## Step 6 - Anomaly Detection

The charts above show the outliers very clearly. But can we predict them? Are there variation margins? Let's use AI to do it, leveraging the [Microsoft Azure Anomaly Detector API](https://docs.microsoft.com/en-us/azure/cognitive-services/anomaly-detector/quickstarts/detect-data-anomalies-python?tabs=linux).

&nbsp;

You must have a [Cognitive Services API account](https://docs.microsoft.com/en-us/azure/cognitive-services/cognitive-services-apis-create-account?tabs=multiservice%2Clinux) with access to the Anomaly Detector API. You can get your subscription key from the Azure portal after creating your account.

&nbsp;

The API accepts JSON documents with 2 elements per data point, ```timestamp``` and ```value```. These names are mandatory and we can't mix the units for detection. That's why we create 2 customized data frames in the code below, one for each unit, right before calling the anomaly detection function. We should also avoid mixing devices in the same analysis, since they may behave differently.
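
As an illustration of that request format, the sketch below builds the body with `json.dumps` instead of string concatenation. The function name is hypothetical. It also converts the generator's `dateTime` strings to ISO 8601 with a `Z` suffix, which assumes the readings are in UTC and that the service expects that format; check both assumptions against your own data and the API documentation.

```python
import json
from datetime import datetime


def build_request_body(rows, granularity="secondly"):
    """Build the JSON request body from (dateTime string, value) pairs."""
    series = []
    for timestamp, value in rows:
        parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        series.append({
            "timestamp": parsed.strftime("%Y-%m-%dT%H:%M:%SZ"),  # assumption: readings are UTC
            "value": value,
        })
    return json.dumps({"granularity": granularity, "series": series})


body = build_request_body([("2020-04-09 18:53:23", 2744), ("2020-04-09 18:53:24", 2972)])
print(body)
```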

&nbsp;

The output includes, for each unit, the expected value, the lower and upper margin values, and whether the point was an anomaly or not. With the margins, you can monitor in real time if a value is getting close to them. An anomaly is a value that crosses them. 
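
The margin rule described above can be sketched as a small function. This is a simplified illustration of the rule as this notebook uses it (expected value minus the lower margin, expected value plus the upper margin), not the service's internal logic, and the function name and sample numbers are made up.

```python
def classify(value, expected, lower_margin, upper_margin):
    """Compare a reading with the band around the expected value."""
    lower_bound = expected - lower_margin
    upper_bound = expected + upper_margin
    if value < lower_bound:
        return "negative anomaly"
    if value > upper_bound:
        return "positive anomaly"
    return "normal"


print(classify(3050, expected=3000, lower_margin=300, upper_margin=300))  # normal
print(classify(4100, expected=3000, lower_margin=300, upper_margin=300))  # positive anomaly
print(classify(900, expected=1500, lower_margin=150, upper_margin=150))   # negative anomaly
```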

&nbsp;

**DON'T FORGET TO ADD YOUR ANOMALY DETECTOR API KEY TO THE CODE BELOW!! JUST SEARCH FOR 'paste-your-key-here' AND REPLACE THE TEXT WITH YOUR KEY, THAT YOU CAN GET/COPY FROM THE AZURE PORTAL.**

&nbsp;

>**Did you know?** The Anomaly Detector API has 2 endpoints, for 2 different capabilities: real-time and batch detection. Real time will detect anomalies for every 10 data points. The batch endpoint will analyze up to 8640 data points, and since we already have the data, that's what we will use.

&nbsp;

>**Did you know?** An Azure Cognitive Services account gives you access to multiple cognitive services, like the Text Analytics and Computer Vision APIs.

&nbsp;

>**Did you know?** "Secondly" granularity is not documented yet, but it works and that's exactly what we need for this analysis.


In [None]:
import http.client, urllib.request, urllib.parse, urllib.error, base64

###########################################################################################
# The Anomaly Detection Function
###########################################################################################
def anomalyDetector (myJson):
    headers = {
        # Request headers
        'Content-Type': 'application/json',
        'Ocp-Apim-Subscription-Key': 'paste-your-key-here', # <<---- Paste your key here!!!!
    }

    params = urllib.parse.urlencode({
    })
   
    
    conn = None
    try:
        conn = http.client.HTTPSConnection('westus2.api.cognitive.microsoft.com')
        conn.request("POST", "/anomalydetector/v1.0/timeseries/entire/detect?%s" % params, myJson, headers)
        response = conn.getresponse()
        data = response.read()
        return data
    except Exception as e:
        # Not every exception has errno/strerror, so print the exception itself
        print("Anomaly Detector request failed:", e)
    finally:
        # Always release the connection, even when the request fails
        if conn is not None:
            conn.close()
 

print ('Function anomalyDetector() ok')
###########################################################################################
# The main function, that will process per device
###########################################################################################

# Let's use the same data frame of the first analysis to get the device ids.
devicesArray = df_IotData.deviceId.unique()

# Creating Output dataframe
output_df = pd.DataFrame()

# Let's process per device
for deviceId in devicesArray:
    print ('Processing device: ',deviceId)
    
    # Same data preparation we had for data visualization
    new_df = df_IotData[(df_IotData.deviceId == deviceId)]
    # The pivot below will remove deviceId column. We will add it back later.
    new_df=new_df.pivot(index='dateTime', columns = 'unitSymbol' , values =  'measureValue')
    new_df=new_df.dropna()
    
    # The pivot will create a df where the dateTime column is the index. 
    # But for the anomaly detection, we need a column with that content called timestamp
    new_df['timestamp']=new_df.index

    # We also need a sequential id, to join the dataframes in the end of the process
    new_df['index']=list(range(len(new_df)))

    # Now let's make the sequential the index of the dataframe
    new_df.set_index('index',inplace=True)

    #### RPM
    # Creating a dataframe
    # rename() returns a new dataframe, which avoids the pandas SettingWithCopyWarning
    new_df_RPM=new_df[['timestamp','RPM']].rename(columns={"RPM":"value"})
    # Creating json string
    myJson= '{"granularity":"secondly","series": '
    myJson = myJson+ new_df_RPM.to_json(orient = 'records')+'}'
    # Running and saving the results in a new dataframe
    rpmAnomalies= anomalyDetector (myJson)
    new_df_RPM_results = pd.read_json(rpmAnomalies)
    
    ### MW
    # Creating a dataframe
    # rename() returns a new dataframe, which avoids the pandas SettingWithCopyWarning
    new_df_MW=new_df[['timestamp','MW']].rename(columns={"MW":"value"})
    # Creating json string
    myJson= '{"granularity":"secondly","series": '
    myJson = myJson+ new_df_MW.to_json(orient = 'records')+'}'
    # Running and saving the results in a new dataframe
    mwAnomalies= anomalyDetector (myJson)
    new_df_MW_results = pd.read_json(mwAnomalies)

    # Checking the results. Number of rows should be exactly the same.
    rpmCount = new_df_RPM_results.shape[0]  # gives number of row count
    mwCount = new_df_MW_results.shape[0]  # gives number of row count

    if (rpmCount != mwCount):
        print('Something went wrong, total rows in the results should be the same.')
        print ('rpmCount: ',rpmCount)
        print ('mwCount: ',mwCount)
    else:
        # Preparing the margin values for plotting
        new_df_RPM_results['lowerMargins'] = new_df_RPM_results['expectedValues'] - new_df_RPM_results['lowerMargins']
        new_df_RPM_results['upperMargins'] = new_df_RPM_results['expectedValues'] + new_df_RPM_results['upperMargins']
        new_df_MW_results['lowerMargins'] = new_df_MW_results['expectedValues'] - new_df_MW_results['lowerMargins']
        new_df_MW_results['upperMargins'] = new_df_MW_results['expectedValues'] + new_df_MW_results['upperMargins']
        # Merging original (we need the timestamp/datetime) + results
        new_df_RPM = new_df_RPM.join(new_df_RPM_results,lsuffix='Original',rsuffix='Results')
        new_df_MW = new_df_MW.join(new_df_MW_results,lsuffix='Original',rsuffix='Results')
        # Merging in temp dataframe just to add the deviceId
        temp_df = new_df_RPM.join(new_df_MW,lsuffix='RPM',rsuffix='MW')
        temp_df['deviceId'] = deviceId
        if (output_df.size == 0): #First execution
            output_df = temp_df
        else:
            # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
            output_df = pd.concat([output_df, temp_df])
    print ('Rows processed: ',mwCount,' for device ',deviceId)        

## Step 7 - Final Data Visualization

Now let's use the built-in visualizations to analyze the data. The suggestion is to test all options, so you can analyze all data we created using data preparation and AI.

&nbsp;


>**Did you know?** All properties returned by the Anomaly Detector API are listed [here](https://docs.microsoft.com/en-us/dotnet/api/microsoft.azure.cognitiveservices.anomalydetector.models.entiredetectresponse?view=azure-dotnet-preview). You may want to check this information to create better visualizations.

### Final Data Visualization

You should be able to see interesting line charts, like the image below. Compare all 4 columns for each unit separately; that's why we pivoted the data. Notice that in many situations the original value is beyond the margins. The expected value is the baseline, and you can also check how it compares with the original value.

&nbsp;

<img src="https://cosmosnotebooksdata.blob.core.windows.net/notebookdata/iot-ai-notebook-3.PNG" alt="Built-in nteract " width="100%"/>


In [None]:
output_df.head(100)


## Step 8 - Loading the Metadata into Cosmos DB

Now let's load the results into Cosmos DB. **But why?** For two reasons:

&nbsp;

1. **Metadata:** Now we have the expected values, per device and per unit: the values between the lower and the upper margins. You can read them from Cosmos DB and detect anomalies in real time or in batch. This method is cheaper and faster than calling the Anomaly Detector API for every single data point. The API should be used from time to time to refresh the metadata because of seasonality and other factors. The API can also be used to cross-validate the detected anomalies.

&nbsp;

2. **Advanced Analytics/AI:** This data can be queried not only from the Cosmos DB transactional store, but also from our brand new [Analytics Store](https://azure.microsoft.com/en-us/updates/new-analytics-storage-for-azure-cosmos-db-is-now-in-preview/). You can use both for Power BI reports, more machine learning analysis, predictions, etc.

&nbsp;

>**Did you know?** Cosmos DB guarantees less than 10-ms latencies for both reads (indexed) and writes at the 99th percentile, all around the world. For more information, click [here](https://docs.microsoft.com/en-us/azure/cosmos-db/introduction#guaranteed-low-latency-at-99th-percentile-worldwide).

### The loading Function

Now let's create a container, if necessary, and load the Anomalies Detection metadata. 

&nbsp;

**THIS FUNCTION DOESN'T TEST IF THE SAME DATA WAS UPLOADED BEFORE.**

In [None]:
#  Initialization
db_name = "AnalyticsDb"
container_name = "AdMetadata"
partition_key_value = "/deviceId" # Assuming that in the future you will have multiple devices. Also, data will probably be analyzed per device.

# Key Objects Creation

database_client = cosmos_client.create_database_if_not_exists(db_name)
print(database_client, 'ok')

container_client = database_client.create_container_if_not_exists(id=container_name, partition_key=PartitionKey(path=partition_key_value),offer_throughput=400)
print(container_client, 'ok')

for i in range(0, len(output_df)):
    
    if (i % 10 == 0 ) and (i > 0): #let's report the progress for each 10 docs uploaded
        print ('Number of docs uploaded: ', i)
   
    adMetaData = {
    'id' : str(uuid.uuid4()),
    'deviceId' :output_df.iloc[i]['deviceId'], 
    'timestamp' : output_df.iloc[i]['timestampRPM'], #Doesn't matter, for this column RPM = MW
    'originalMWValue' : int(output_df.iloc[i]['valueMW']),
    'expectedValuesMW' : int(output_df.iloc[i]['expectedValuesMW']), # Keeping 'values', plural as the original name.
    'isAnomalyMW' : str(output_df.iloc[i]['isAnomalyMW']).lower(),
    'isNegativeAnomalyMW' : str(output_df.iloc[i]['isNegativeAnomalyMW']).lower(),
    'isPositiveAnomalyMW' : str(output_df.iloc[i]['isPositiveAnomalyMW']).lower(),
    'lowerMarginsMW' : int(output_df.iloc[i]['lowerMarginsMW']),
    'upperMarginsMW' : int(output_df.iloc[i]['upperMarginsMW']),
    'periodMW' : int(output_df.iloc[i]['periodMW']),
    'originalRPMValue' : int(output_df.iloc[i]['valueRPM']), # Cast to int, like the MW values, so the value is JSON serializable
    'expectedValuesRPM' : int(output_df.iloc[i]['expectedValuesRPM']), # Keeping 'values', plural as the original name.
    'isAnomalyRPM' : str(output_df.iloc[i]['isAnomalyRPM']).lower(),
    'isNegativeAnomalyRPM' : str(output_df.iloc[i]['isNegativeAnomalyRPM']).lower(),
    'isPositiveAnomalyRPM' : str(output_df.iloc[i]['isPositiveAnomalyRPM']).lower(),
    'lowerMarginsRPM' : int(output_df.iloc[i]['lowerMarginsRPM']),
    'upperMarginsRPM' : int(output_df.iloc[i]['upperMarginsRPM']),
    'periodRPM' : int(output_df.iloc[i]['periodRPM'])
    }
    container_client.create_item(body=adMetaData)
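
Because the loading loop above creates a new random `id` for every document, running it twice stores the same metadata twice. One possible way to make the load repeatable, sketched below under the assumption that a device has at most one metadata document per timestamp, is to derive the `id` from the device and timestamp and then call the SDK's `upsert_item` instead of `create_item`. The function name `metadata_id` is hypothetical.

```python
import uuid


def metadata_id(device_id, timestamp):
    """Deterministic document id: the same device and timestamp always give the same id."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"{device_id}|{timestamp}"))


first = metadata_id("dev-1", "2020-04-09 18:53:23")
second = metadata_id("dev-1", "2020-04-09 18:53:23")
print(first == second)  # True

# In the loading loop, the document could then be written with:
# adMetaData['id'] = metadata_id(adMetaData['deviceId'], adMetaData['timestamp'])
# container_client.upsert_item(body=adMetaData)
```

With a deterministic `id`, a second run replaces the existing documents instead of adding duplicates, so the count query in the next step returns the same number after every run.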

## Step 9 - Checking the Metadata Container

Let's finish with a query to count the number of documents we uploaded to the container.

In [None]:
%container AdMetadata

In [None]:
%database AnalyticsDb

In [None]:
%%sql 
SELECT count(1) FROM c 

## Next Steps

Suggested next steps are:

+ Use Power BI to create dashboards on top of BOTH containers we created. Their combination is very powerful.
+ Try our other Sample Notebooks.
+ Send us your feedback or contribution in our [GitHub repo](https://github.com/Azure-Samples/cosmos-notebooks).