# PyMongo
The first step when working with PyMongo is to create a MongoClient to the running mongod instance.

Make sure you have a MongoDB instance running - see [https://www.mongodb.com/docs/manual/administration/install-community/](https://www.mongodb.com/docs/manual/administration/install-community/)

In [None]:
try:
    from pymongo import MongoClient
except ImportError:
    !pip install pymongo
    from pymongo import MongoClient

try:
    import psutil
except ImportError:
    !pip install psutil
    import psutil

Because the server is running locally, you can create the client in several equivalent ways.

In [None]:
client = MongoClient()
# same as 
#  client = MongoClient('localhost', 27017)
# or 
#  client = MongoClient('mongodb://localhost:27017/')

client

## Databases 
A single instance of MongoDB can support multiple independent databases. When working with PyMongo you access databases using **attribute style access** on MongoClient instances.

So, the next line will "connect" to the `sensorsDB` database (or create it if it does not exist).

This also means that you have to be very careful with the naming: a misspelled name does not raise an error, it silently refers to a different database.
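One way to guard against such typos is to check the name against the databases the server already reports. The sketch below is only an illustration: `existing_database` is a hypothetical helper, not part of PyMongo, and it assumes `client` is a `MongoClient`, relying on its `list_database_names()` method and on dictionary-style access (`client[name]`). It only suits databases that already contain data, since the server does not list empty ones.

```python
def existing_database(client, name):
    """Return client[name], but only if the server already lists that database.

    Hypothetical helper for illustration; `client` is expected to be a
    pymongo.MongoClient (or anything with the same two operations).
    """
    names = client.list_database_names()
    if name not in names:
        raise KeyError(f"unknown database {name!r}; server has: {sorted(names)}")
    # Dictionary-style access is equivalent to attribute-style access.
    return client[name]
```

Once `sensorsDB` holds data, `existing_database(client, 'sensorsDB')` returns the database, while a misspelling such as `'sensorDB'` raises a `KeyError` instead of silently creating a new database.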

In [None]:
db = client.sensorsDB

## Collections 
A collection is a group of documents stored in MongoDB, and can be thought of as roughly the equivalent of a table in a relational database. Getting a collection in PyMongo works the same as getting a database.

In [None]:
sensors_location = db.sensors_locations

An important note about collections (and databases) in MongoDB is that they are created lazily - none of the above commands have actually performed any operations on the MongoDB server. Collections and databases are created when the first document is inserted into them.

# Insert documents

To **insert a document** into a collection we can use the `insert_one()` method.

In [None]:
data = {
        'location_name': 'Prometheus Server', 
        'description' : 'Prometheus Server @ lab. 163 / ISE /UAlg',
        'sensor': [ 
                    {
                        'sensor_name' : 'cpu_sensor', 
                        'unit' : 'percent'
                    },
                    {
                        'sensor_name' : 'mem_sensor', 
                        'unit' : 'percent'
                    }
             ]
       }

In [None]:
x = sensors_location.insert_one(data)
x

In [None]:
location_id = x.inserted_id
location_id

Let us see what is in the `sensors_locations` collection (referenced by the `sensors_location` variable):

In [None]:
from pprint import pprint

for doc in sensors_location.find():
    pprint(doc)

Now we can insert one document for each sensor reading, this time into the `sensors_readings` collection.

In [None]:
import datetime
import psutil

for _ in range(200):
    # create the document
    data = {
           'sensor' : {'location_id': location_id, 
                       'sensor_name' : 'cpu_sensor' 
                      },
            'value' : psutil.cpu_percent(interval=0.01),
            'units' : 'percent',
            'timestamp' : datetime.datetime.utcnow()
           }
    # send the document to the database
    res = db.sensors_readings.insert_one(data)
    print('.', end='')   

Let us store the last `_id` for later.

In [None]:
_id = res.inserted_id
_id

To list all inserted readings

In [None]:
list(db.sensors_readings.find())

We can also list the inserted readings, sorted by value and then by timestamp (both in descending order).

In [None]:
list(
    db.sensors_readings.find().sort([
        ('value',-1),
        ('timestamp', -1)]
    )
)
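To make the compound sort concrete, the same ordering can be reproduced client-side on plain dictionaries. This is a minimal sketch with made-up readings (the values and timestamps are assumptions for illustration, not data from the database): the first key decides the order, and the second key only breaks ties.

```python
import datetime

# Made-up readings; two of them share the same value.
readings = [
    {'value': 2.0, 'timestamp': datetime.datetime(2021, 4, 12, 18, 0, 0)},
    {'value': 7.5, 'timestamp': datetime.datetime(2021, 4, 12, 18, 0, 1)},
    {'value': 7.5, 'timestamp': datetime.datetime(2021, 4, 12, 18, 0, 2)},
]

# Descending on both keys: highest value first, ties broken by newest timestamp.
ordered = sorted(
    readings,
    key=lambda d: (d['value'], d['timestamp']),
    reverse=True,
)

for doc in ordered:
    print(doc['value'], doc['timestamp'].isoformat())
```

Sorting on the server, as in the cell above, is still preferable for real collections because only the ordered results travel over the network.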

Given the `ObjectId` (we stored it earlier), it is possible to get one specific document.

In [None]:
list(db.sensors_readings.find({'_id': _id}))

## Embedding of information I
In this approach, a single document contains **multiple sensors with a single read** each, together with the embedded location info.

In [None]:
import datetime
import psutil

for _ in range(200):
    data = {
        'location_name': 'Prometheus Server', 
        'description' : 'Prometheus Server @ lab. 163 / ISE /UAlg',
        'sensors' : [ 
               {
                   'sensor_name' : 'mem_sensor', 
                   'value' : psutil.virtual_memory().percent,
                   'units' : 'percent'
               },
               {
                   'sensor_name' : 'cpu_sensor', 
                   'value' : psutil.cpu_percent(interval=0.01),
                   'units' : 'percent'
               }
           ],
        'timestamp' : datetime.datetime.utcnow()
    }
    db.sensors_readings.insert_one(data)
    print('.', end='')

Get the two most recent inserts

In [None]:
pprint(list(db.sensors_readings.find().sort(
    [('timestamp', -1)]).limit(2)))

## Embedding of information II
A single document contains multiple sensors - and multiple reads.

In [None]:
data = {
    'location_name': 'Prometheus Server', 
    'description' : 'Prometheus Server @ lab. 163 / ISE /UAlg',
    'sensors' : [ 
           {
               'sensor_name' : 'mem_sensor', 
               'values' :[] ,
               'units' : 'percent'
           },
           {
               'sensor_name' : 'cpu_sensor', 
               'values' : [],
               'units' : 'percent'
           }
       ],
}

# get the readings id to later add values to the readings
readings_id = db.sensors_readings.insert_one(data).inserted_id

However, in this implementation the **full document is uploaded each time a new read is made**.

In [None]:
for _ in range(100):
    mem = psutil.virtual_memory().percent
    cpu = psutil.cpu_percent(interval=0.01)

    # update the data
    data['sensors'][0]['values'].append({'value': mem, 'timestamp' : datetime.datetime.utcnow()})
    data['sensors'][1]['values'].append({'value': cpu, 'timestamp' : datetime.datetime.utcnow()})
    # update the database, sending the full document again!!
    db.sensors_readings.update_one(
        {'_id': readings_id}, 
        {'$set': data}
    )
    
    print('.', end='')
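A rough way to see why resending the full document becomes costly is to add up the payload sizes over the same number of iterations as the loop above. The sketch below is an approximation under stated assumptions: it uses the length of a JSON string as a stand-in for the BSON actually sent, a simplified single-sensor document, and made-up readings.

```python
import datetime
import json


def payload_size(doc):
    # JSON length is only a rough proxy for the BSON size sent to the server.
    return len(json.dumps(doc, default=str))


doc = {'sensor_name': 'cpu_sensor', 'units': 'percent', 'values': []}
full_total = 0         # size sent when the whole document is resent each time
incremental_total = 0  # size sent when only the new reading is sent

for _ in range(100):
    reading = {
        'value': 5.0,
        'timestamp': datetime.datetime(2021, 4, 12, 18, 0, 0),
    }
    doc['values'].append(reading)
    full_total += payload_size(doc)          # grows with every iteration
    incremental_total += payload_size(reading)  # stays constant per iteration

print(full_total, incremental_total)
```

Because the document grows by one reading per iteration, the total sent with full resends grows roughly quadratically with the number of readings, while sending only the new reading grows linearly.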

The most recently inserted document is

In [None]:
x = list(db.sensors_readings.find().sort([('_id', -1)]).limit(1))
x

To get a single value (here, the first memory reading):

In [None]:
x[0]['sensors'][0]['values'][0]['value']

## Embedding of information III
As previously, a single document contains multiple sensors and multiple reads. But now, the document is **updated in the database each time a new read is made**, sending only the new values.

In [None]:
import datetime
import psutil

data = {
    'location_name': 'Prometheus Server', 
    'description' : 'Prometheus Server @ lab. 163 / ISE /UAlg', 
    'sensors' : [ 
           {
               'sensor_name' : 'mem_sensor', 
               'values' : [],
               'units' : 'percent'
           },
           {
               'sensor_name' : 'cpu_sensor', 
               'values' : [],
               'units' : 'percent'
           }
       ]
}

readings_id = db.sensors_readings.insert_one(data).inserted_id

At this point, a first document has been inserted with no sensor values. Its `_id` was stored, and in the following cell each new reading is appended (pushed) to the corresponding array in that document.

In [None]:
for _ in range(200):
    mem = psutil.virtual_memory().percent
    cpu = psutil.cpu_percent(interval=0.1)
    
    # update the database, sending only the update!!
    db.sensors_readings.update_one(
        {'_id': readings_id}, 
        {
            '$push': {
                'sensors.0.values': {'value': mem, 'timestamp' : datetime.datetime.utcnow()},
                'sensors.1.values': {'value': cpu, 'timestamp' : datetime.datetime.utcnow()}        
            }
        }
    )    
    
    print('.', end='')
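The update document used in the loop above can also be built programmatically, which helps when the number of sensors changes. This is a minimal sketch: `build_push_update` is a hypothetical helper (not a PyMongo function), and the indexes are assumed to match the order of the `sensors` array in the document inserted earlier.

```python
import datetime


def build_push_update(values_by_index, timestamp):
    """Build a `$push` update from {sensor_index: value} pairs.

    Hypothetical helper: each key is a position in the `sensors` array,
    addressed with dot notation such as 'sensors.0.values'.
    """
    return {
        '$push': {
            f'sensors.{index}.values': {'value': value, 'timestamp': timestamp}
            for index, value in values_by_index.items()
        }
    }


# Made-up values: index 0 is mem_sensor, index 1 is cpu_sensor.
update = build_push_update(
    {0: 41.2, 1: 3.7},
    datetime.datetime(2021, 4, 12, 18, 58, 13),
)
print(update)
```

The result can be passed as the second argument of `db.sensors_readings.update_one({'_id': readings_id}, update)`. Taking a single `timestamp` argument means all sensors read in the same iteration share one timestamp, which differs slightly from the loop above, where `utcnow()` is called once per sensor.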

The most recently inserted document is

In [None]:
pprint(list(db.sensors_readings.find().sort([('_id', -1)]).limit(1)))

# Getting Documents
To get a single document, use `find_one()`.

In [None]:
db.sensors_readings.find_one()

Find one reading from "Prometheus Server"

In [None]:
db.sensors_readings.find_one({'location_name':'Prometheus Server'})

Get the `ObjectId` of one reading in the `sensors_readings` collection

In [None]:
obj_id = db.sensors_readings.find_one({'location_name':'Prometheus Server'})["_id"]
obj_id

Querying By ObjectId

In [None]:
from bson.objectid import ObjectId

db.sensors_readings.find_one({'_id': obj_id})  # obj_id is the ObjectId fetched in the previous cell

Use a projection, i.e., select which fields are returned (the `_id` field is included by default)

In [None]:
db.sensors_readings.find_one(
    {'_id': obj_id},
    {'sensors':1}
)
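Projections can also exclude `_id` and reach into embedded documents with dot notation. The following is a minimal sketch, assuming documents shaped like the ones inserted in the embedding sections; `sensor_summary` is a hypothetical helper, and `collection` is expected to be a PyMongo collection such as `db.sensors_readings`.

```python
def sensor_summary(collection, doc_id):
    """Fetch only the sensor names and units of one document.

    Hypothetical helper for illustration; it passes a projection as the
    second argument of find_one().
    """
    projection = {
        '_id': 0,                  # _id is returned unless explicitly excluded
        'sensors.sensor_name': 1,  # dot notation reaches into the embedded array
        'sensors.units': 1,
    }
    return collection.find_one({'_id': doc_id}, projection)
```

For example, `sensor_summary(db.sensors_readings, obj_id)` would return the `sensors` array without the potentially long `values` lists, which keeps the response small.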

# Bulk Insert
In addition to inserting a single document, we can also perform bulk insert operations, by passing a list as the first argument to insert_many(). This will insert each document in the list, sending only a single command to the server.

The result of insert_many() holds multiple ObjectId instances, one for each inserted document.

In [None]:
new_posts = [{
                'sensor': {'location_id': ObjectId('5a95821bdc936e0cfc7c7d96'),
                'sensor_name': 'cpu_sensor'},
                'timestamp': datetime.datetime(2018, 2, 27, 16, 7, 18, 289000),
                'units': 'percent',
                'value': 4.5
            },
             {
                'sensor': {'location_id': ObjectId('5a95821bdc936e0cfc7c7d96'),
                'sensor_name': 'cpu_sensor'},
                'timestamp': datetime.datetime(2018, 2, 27, 16, 7, 18, 289000),
                'units': 'percent',
                'value': 4.5
             }
            ]
result = db.sensors_readings.insert_many(new_posts)

And get the ids of the inserted documents

In [None]:
result.inserted_ids
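In practice the list passed to `insert_many()` is usually generated rather than typed by hand. The sketch below builds such a list in the same shape as the documents above; `make_readings` is a hypothetical helper, and the placeholder id, values, and start time are made up for illustration (in the notebook, the stored `location_id` would be passed instead).

```python
import datetime


def make_readings(location_id, values, start, step_seconds=1):
    """Build one CPU reading document per value (hypothetical helper)."""
    return [
        {
            'sensor': {'location_id': location_id, 'sensor_name': 'cpu_sensor'},
            'value': value,
            'units': 'percent',
            # Space the readings evenly, starting at `start`.
            'timestamp': start + datetime.timedelta(seconds=i * step_seconds),
        }
        for i, value in enumerate(values)
    ]


docs = make_readings(
    'placeholder-id',  # stand-in for a real ObjectId
    [4.5, 5.0, 3.2],
    datetime.datetime(2018, 2, 27, 16, 7, 18),
)
print(len(docs))
```

The resulting list can then be sent in one call with `db.sensors_readings.insert_many(docs)`.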

# Querying for More Than One Document

To get more than a single document as the result of a query we use the find() method. find() returns a Cursor instance, which allows us to iterate over all matching documents. For example, we can iterate over every document in the `sensors_readings` collection:

In [None]:
for doc in db.sensors_readings.find():
    pprint(doc)

You can also order the output and limit it:

In [None]:
for doc in db.sensors_readings.find().sort([('_id',1)]).limit(2):
    pprint(doc)

## Counting
If we just want to know how many documents match a query we can call `count_documents()` instead of running a full query. With an empty filter, we get a count of all of the documents in a collection:

In [None]:
db.sensors_readings.count_documents({})

In [None]:
db.sensors_readings.count_documents({'location_name':'Prometheus Server'})

## Range Queries
MongoDB supports many different types of advanced queries.

As an example, let's perform a query where we limit results to readings with a specific timestamp (update the datetime to match one of the previously inserted documents):

In [None]:
date = datetime.datetime(2021, 4, 12, 18, 58, 13, 611000)

for doc in db.sensors_readings.find({'sensors.values.timestamp': date}):
    print(doc)

And now, limit results to readings newer than a certain date (`$gt` matches timestamps after `date`):


In [None]:
for doc in db.sensors_readings.find({'sensors.values.timestamp': {'$gt': date}}):
    print(doc)
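Lower and upper bounds can be combined in one filter to select a time window. This is a minimal sketch: `time_window` is a hypothetical helper that only builds the filter dictionary, and the dates are made up.

```python
import datetime


def time_window(field, start, end):
    """Build a filter matching start <= field < end (hypothetical helper)."""
    return {field: {'$gte': start, '$lt': end}}


start = datetime.datetime(2021, 4, 12, 18, 0, 0)
query = time_window('timestamp', start, start + datetime.timedelta(hours=1))
print(query)
```

The filter can be passed to `db.sensors_readings.find(query)` for documents with a top-level `timestamp` field. For timestamps nested inside arrays, such as `sensors.values.timestamp`, the two bounds may be satisfied by different array elements, so `$elemMatch` is usually the safer choice there.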

All readings with a `value` lower than 10

In [None]:
for doc in db.sensors_readings.find({'value': {'$lt': 10}}):
    print(doc)

All documents of the following type that contain at least one CPU reading with `value` lower than 10

![./images/doc_example.png](./images/doc_example.png)

In [None]:
for doc in db.sensors_readings.find({'sensors.1.values.value': {'$lt': 10}}):
    print(doc)