## Lesson 8 - Getting Started with MongoDB





### Table of Contents

* [What is MongoDB?](#WhatsMongoDB)
* [Install PyMongo package](#PyMongo)
* [Check MongoDB startup logs](#StartupLogs)
* [Insert data into MongoDB / Query Collection](#InsertAndQueryData)
* [Create Collection](#CreateCollection)
* [List Collections](#ListCollections)
* [Drop Collection](#DropCollection)
* [Running Commands](#RunningCommands)
* [PyMongo cursor](#PyMongoCursor)
* [PyMongo read all data](#PyMongoReadAllData)
* [PyMongo count documents](#PyMongoCountDocuments)
* [PyMongo filters](#PyMongoFilters)
* [PyMongo projections](#PyMongoProjections)
* [PyMongo sorting documents](#PyMongoSorting)
* [PyMongo aggregations](#PyMongoAggregations)
* [PyMongo limit data output](#PyMongoLimitData)


<a id="WhatsMongoDB"></a>
## What is MongoDB?

MongoDB is a cross-platform, document-oriented NoSQL database and one of the most popular databases available. It is developed by MongoDB Inc. The Community Server is free to use; since late 2018 its source code has been published under the Server Side Public License (SSPL).

A record in MongoDB is a document, which is a data structure composed of field and value pairs. MongoDB documents are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents. MongoDB stores documents in collections. Collections are analogous to tables in relational databases and documents to rows.
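
To make the document model concrete, here is a minimal sketch that uses a plain Python dictionary as a stand-in for a MongoDB document. The field names (`engine`, `colors`, `owners`) and their values are made up for illustration and do not appear elsewhere in this lesson.

```python
import json

# Hypothetical document: field/value pairs whose values may be
# an embedded document, an array, or an array of documents.
car = {
    "name": "Audi",
    "price": 52642,
    "engine": {"type": "petrol", "cylinders": 4},  # embedded document
    "colors": ["black", "white"],                  # array
    "owners": [                                    # array of documents
        {"name": "Alice", "since": 2018},
        {"name": "Bob", "since": 2020},
    ],
}

# A collection groups documents, much like a table groups rows.
collection = [car, {"name": "Skoda", "price": 9000}]

# The document maps directly onto a JSON object.
doc_json = json.dumps(car, indent=2)
print(doc_json)

# Nested values are reached by walking the structure.
print(car["engine"]["type"])
print(car["owners"][1]["name"])
```

Note that the two documents in `collection` do not share the same fields; unlike rows in a relational table, documents in one collection are not required to have an identical structure.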

A cursor is a reference to the result set of a query. Clients can iterate through a cursor to retrieve results. By default, cursors time out after ten minutes of inactivity.
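
The idea of iterating through a cursor can be sketched without a database. The `ListCursor` class below is a hypothetical in-memory stand-in written for this lesson; it is not part of PyMongo, and a real cursor fetches its results from the server in batches rather than holding them in a list.

```python
class ListCursor:
    """Hypothetical in-memory stand-in for a database cursor."""

    def __init__(self, documents):
        self._documents = list(documents)
        self._position = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Hand out one document at a time until the results are exhausted.
        if self._position >= len(self._documents):
            raise StopIteration
        document = self._documents[self._position]
        self._position += 1
        return document

    def rewind(self):
        """Start iteration again from the first result."""
        self._position = 0


cursor = ListCursor([{"name": "Audi"}, {"name": "Skoda"}, {"name": "Volvo"}])

first = next(cursor)   # retrieves the first document
rest = list(cursor)    # consumes the remaining documents
cursor.rewind()
again = next(cursor)   # the first document once more

print(first, rest, again)
```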

<a id="PyMongo"></a>
## Install PyMongo package

PyMongo is the Python driver for working with MongoDB.

- Installing PyMongo

The following command installs PyMongo with pip.

```
$ python -m pip install pymongo
```

<a id="StartupLogs"></a>
## Connect to MongoDB and check the startup logs

In [3]:
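# -*- coding: utf-8 -*-
import pymongo as mg

# Connect to the local MongoDB server on the default port
client = mg.MongoClient('mongodb://127.0.0.1:27017')
# The 'local' database keeps a startup_log collection with one document per server start
db = client['local']
result = db['startup_log'].find({})
lst = list(result)

# Print the time of each recorded startup
for x in lst:
    print(x['startTime'])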

2018-03-01 06:32:11
2018-03-01 08:28:04
2018-03-08 02:45:01
2018-03-23 04:02:58
2018-03-23 08:26:58
2018-05-08 04:22:35
2018-05-17 08:54:49
2018-05-24 03:17:27
2018-05-30 02:27:39
2018-05-30 03:43:44
2018-05-30 08:22:01
2018-06-04 02:03:18
2018-06-11 07:44:25
2018-06-11 08:17:38
2018-07-19 03:28:07
2018-07-19 07:00:42
2018-07-19 09:16:59
2018-07-30 06:44:28
2018-08-22 03:49:40
2018-08-24 08:16:05
2018-09-04 07:25:48
2018-09-05 00:38:40
2018-09-07 00:16:53
2018-10-11 06:38:49
2018-10-22 04:40:08
2018-10-22 09:43:30
2018-10-26 01:24:40
2019-01-25 00:31:03
2019-01-25 04:05:40
2019-04-12 00:54:19
2019-05-02 03:01:46
2019-05-02 03:29:46
2019-05-03 01:42:59
2019-05-06 01:02:07
2019-07-08 00:35:44
2019-07-08 01:17:36
2019-07-09 04:35:54
2019-07-09 09:10:31
2019-07-29 09:57:26
2019-08-08 06:41:51
2019-08-12 06:51:39
2019-08-13 00:34:35
2019-09-16 04:04:42
2019-10-01 00:34:19
2019-10-07 06:29:51
2019-10-15 23:56:40
2019-10-31 00:57:09
2019-11-04 02:30:51
2019-11-22 03:10:37
2019-12-18 06:36:46


<a id="InsertAndQueryData"></a>
## Insert data into MongoDB / Query collection data

In [5]:
# -*- coding: utf-8 -*-
import pymongo as mg
import datetime

client = mg.MongoClient('mongodb://127.0.0.1:27017')
db = client['uuutest']  # database name: uuutest

# 'post' is the collection name; it is created on the first insert
insert_result = db.post.insert_one(
{
    "title": 'Headline of the day',
    "body": 'blah blah blah...',
    "datetime": datetime.datetime.now()
})

# Read back every document in the collection
result = db['post'].find({})
lst = list(result)

for x in lst:
    print(x['title'])
    print(x['body'])
    print(str(x['datetime']))

Headline of the day
blah blah blah...
2020-11-09 15:25:41.825000


<a id="CreateCollection"></a>
## PyMongo create collection

In [9]:
from pymongo import MongoClient

cars = [ {'name': 'Audi', 'price': 52642},
    {'name': 'Mercedes', 'price': 57127},
    {'name': 'Skoda', 'price': 9000},
    {'name': 'Volvo', 'price': 29000},
    {'name': 'Bentley', 'price': 350000},
    {'name': 'Citroen', 'price': 21000},
    {'name': 'Hummer', 'price': 41400},
    {'name': 'Volkswagen', 'price': 21600} ]

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    db.cars.insert_many(cars)

<a id="ListCollections"></a>
## PyMongo list collections

In [14]:
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    print(db.list_collection_names())

[]




<a id="DropCollection"></a>
## PyMongo drop collection

In [15]:
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    db.cars.drop()

<a id="RunningCommands"></a>
## PyMongo running commands

We can issue commands to MongoDB with command(). The serverStatus command returns the status of the MongoDB server.


In [17]:
from pymongo import MongoClient
from pprint import pprint

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    status = db.command("serverStatus")
    pprint(status)

 'connections': {'available': 999994, 'current': 6, 'totalCreated': 83},
 'extra_info': {'availPageFileMB': 54135,
                'note': 'fields vary by platform',
                'page_faults': 73127,
                'ramMB': 32703,
                'totalPageFileMB': 65471,
                'usagePageFileMB': 168},
 'globalLock': {'activeClients': {'readers': 0, 'total': 15, 'writers': 0},
                'currentQueue': {'readers': 0, 'total': 0, 'writers': 0},
                'totalTime': 13940402000},
 'host': 'G502VS',
 'localTime': datetime.datetime(2020, 11, 9, 7, 34, 52, 950000),
 'locks': {'Collection': {'acquireCount': {'r': 144358, 'w': 429}},
           'Database': {'acquireCount': {'R': 28,
                                         'W': 106,
                                         'r': 144358,
                                         'w': 428}},
           'Global': {'acquireCount': {'W': 11, 'r': 303276, 'w': 534}}},
 'logicalSessionRecordCache': {'activeSessionsCount': 

The example prints a lengthy server status document; only part of the output is shown above.

The dbstats command returns statistics that reflect the storage usage of a single database.

In [18]:
from pymongo import MongoClient
from pprint import pprint

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    print(db.list_collection_names())
    status = db.command("dbstats")
    pprint(status)

[]
{'avgObjSize': 0.0,
 'collections': 0,
 'dataSize': 0.0,
 'db': 'testdb',
 'fsTotalSize': 240054693888.0,
 'fsUsedSize': 177930924032.0,
 'indexSize': 0.0,
 'indexes': 0,
 'numExtents': 0,
 'objects': 0,
 'ok': 1.0,
 'storageSize': 0.0,
 'views': 0}


  


The example prints the database statistics of testdb.

<a id="PyMongoCursor"></a>
## PyMongo cursor
The find() method returns a PyMongo cursor, which is a reference to the result set of a query.

In [25]:
from pymongo import MongoClient

cars = [ {'name': 'Audi', 'price': 52642},
    {'name': 'Mercedes', 'price': 57127},
    {'name': 'Skoda', 'price': 9000},
    {'name': 'Volvo', 'price': 29000},
    {'name': 'Bentley', 'price': 350000},
    {'name': 'Citroen', 'price': 21000},
    {'name': 'Hummer', 'price': 41400},
    {'name': 'Volkswagen', 'price': 21600} ]

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb

    # Recreate the collection so the example starts from a known state
    db.cars.drop()
    db.cars.insert_many(cars)

    # Walk through the collection
    cursor = db.cars.find()

    print(next(cursor))
    print(next(cursor))
    print(next(cursor))

    # rewind() resets the cursor so iteration starts over
    cursor.rewind()
    print(next(cursor))
    print(next(cursor))
    print(next(cursor))

    # list() consumes the documents that are left
    print(list(cursor))

{'_id': ObjectId('5fa8f252fd26e8d8c599f484'), 'name': 'Audi', 'price': 52642}
{'_id': ObjectId('5fa8f252fd26e8d8c599f485'), 'name': 'Mercedes', 'price': 57127}
{'_id': ObjectId('5fa8f252fd26e8d8c599f486'), 'name': 'Skoda', 'price': 9000}
{'_id': ObjectId('5fa8f252fd26e8d8c599f484'), 'name': 'Audi', 'price': 52642}
{'_id': ObjectId('5fa8f252fd26e8d8c599f485'), 'name': 'Mercedes', 'price': 57127}
{'_id': ObjectId('5fa8f252fd26e8d8c599f486'), 'name': 'Skoda', 'price': 9000}
[{'_id': ObjectId('5fa8f252fd26e8d8c599f487'), 'name': 'Volvo', 'price': 29000}, {'_id': ObjectId('5fa8f252fd26e8d8c599f488'), 'name': 'Bentley', 'price': 350000}, {'_id': ObjectId('5fa8f252fd26e8d8c599f489'), 'name': 'Citroen', 'price': 21000}, {'_id': ObjectId('5fa8f252fd26e8d8c599f48a'), 'name': 'Hummer', 'price': 41400}, {'_id': ObjectId('5fa8f252fd26e8d8c599f48b'), 'name': 'Volkswagen', 'price': 21600}]


<a id="PyMongoReadAllData"></a>
## PyMongo read all data

In the following example, we read all records from the collection. We use a Python for loop to traverse the returned cursor.

In [27]:
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    cars = db.cars.find()
    for car in cars:
        print('{0} {1}'.format(car['name'], 
            car['price']))

Audi 52642
Mercedes 57127
Skoda 9000
Volvo 29000
Bentley 350000
Citroen 21000
Hummer 41400
Volkswagen 21600


<a id="PyMongoCountDocuments"></a>
## PyMongo count documents

The number of documents is retrieved with the count_documents() method. It takes a filter as its argument; an empty filter counts every document in the collection.

In [28]:
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    n_cars = db.cars.count_documents({})
    print("There are {} cars".format(n_cars))

There are 8 cars




<a id="PyMongoFilters"></a>
## PyMongo filters

The first parameter of find() and find_one() is a filter. The filter is a condition that every returned document must match.

The example prints the names of cars whose price is greater than 50000.

```
expensive_cars = db.cars.find({'price': {'$gt': 50000}})
```

In [2]:
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    expensive_cars = db.cars.find({'price': {'$gt': 50000}})
    for ecar in expensive_cars:
        print(ecar['name'])

Audi
Mercedes
Bentley
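
To show what the filter means, the following sketch evaluates the same query against plain Python dictionaries. The `matches` function and the `OPERATORS` table are hypothetical helpers written for this lesson; they model only a handful of comparison operators and simplify how MongoDB treats missing fields.

```python
import operator

# Only a few comparison operators are modelled in this sketch.
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$eq": operator.eq,
}


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        if field not in document:
            return False
        value = document[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {'$gt': 50000}
            for op, operand in condition.items():
                if not OPERATORS[op](value, operand):
                    return False
        elif value != condition:
            # Plain form, e.g. {'name': 'Audi'}, means equality
            return False
    return True


cars = [
    {"name": "Audi", "price": 52642},
    {"name": "Mercedes", "price": 57127},
    {"name": "Skoda", "price": 9000},
    {"name": "Volvo", "price": 29000},
    {"name": "Bentley", "price": 350000},
    {"name": "Citroen", "price": 21000},
    {"name": "Hummer", "price": 41400},
    {"name": "Volkswagen", "price": 21600},
]

expensive = [car["name"] for car in cars if matches(car, {"price": {"$gt": 50000}})]
print(expensive)  # ['Audi', 'Mercedes', 'Bentley']
```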


<a id="PyMongoProjections"></a>
## PyMongo projections

With projections, we can select specific fields from the returned documents. The projection is passed as the second argument of the find() method.

The example prints the _id and name fields of the documents.

```
cars = db.cars.find({}, {'_id': 1, 'name':1})
```

In [3]:
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    cars = db.cars.find({}, {'_id': 1, 'name':1})
    for car in cars:
        print(car)

{'_id': ObjectId('5fa8f252fd26e8d8c599f484'), 'name': 'Audi'}
{'_id': ObjectId('5fa8f252fd26e8d8c599f485'), 'name': 'Mercedes'}
{'_id': ObjectId('5fa8f252fd26e8d8c599f486'), 'name': 'Skoda'}
{'_id': ObjectId('5fa8f252fd26e8d8c599f487'), 'name': 'Volvo'}
{'_id': ObjectId('5fa8f252fd26e8d8c599f488'), 'name': 'Bentley'}
{'_id': ObjectId('5fa8f252fd26e8d8c599f489'), 'name': 'Citroen'}
{'_id': ObjectId('5fa8f252fd26e8d8c599f48a'), 'name': 'Hummer'}
{'_id': ObjectId('5fa8f252fd26e8d8c599f48b'), 'name': 'Volkswagen'}
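
The effect of a projection can be sketched on a plain dictionary. The `project` function below is a hypothetical helper that models only the inclusion form used in this example; the `_id` field is kept unless the projection sets it to 0. The integer `_id` is a placeholder for a real ObjectId.

```python
def project(document, projection):
    """Return a copy of the document with only the projected fields (inclusion form)."""
    include_id = bool(projection.get("_id", 1))
    wanted = {field for field, flag in projection.items() if flag and field != "_id"}
    return {
        field: value
        for field, value in document.items()
        if field in wanted or (field == "_id" and include_id)
    }


car = {"_id": 1, "name": "Audi", "price": 52642}  # placeholder _id for illustration

with_id = project(car, {"_id": 1, "name": 1})
without_id = project(car, {"_id": 0, "name": 1})

print(with_id)     # {'_id': 1, 'name': 'Audi'}
print(without_id)  # {'name': 'Audi'}
```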


<a id="PyMongoSorting"></a>
## PyMongo sorting documents

We can sort documents with sort().

The example sorts records by price in descending order.

In [4]:
from pymongo import MongoClient, DESCENDING

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    cars = db.cars.find().sort("price", DESCENDING)
    for car in cars:
        print('{0} {1}'.format(car['name'], 
            car['price']))

Bentley 350000
Mercedes 57127
Audi 52642
Hummer 41400
Volvo 29000
Volkswagen 21600
Citroen 21000
Skoda 9000
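
For comparison, the same ordering can be reproduced on a plain list with Python's built-in sorted(). The `sort_documents` helper is hypothetical; the two direction constants are redefined locally with the same integer values that PyMongo uses for ASCENDING and DESCENDING.

```python
ASCENDING, DESCENDING = 1, -1  # same integer values as the PyMongo constants


def sort_documents(documents, key, direction=ASCENDING):
    """Return the documents ordered by one field, ascending or descending."""
    return sorted(documents, key=lambda doc: doc[key], reverse=(direction == DESCENDING))


cars = [
    {"name": "Audi", "price": 52642},
    {"name": "Mercedes", "price": 57127},
    {"name": "Skoda", "price": 9000},
    {"name": "Volvo", "price": 29000},
    {"name": "Bentley", "price": 350000},
    {"name": "Citroen", "price": 21000},
    {"name": "Hummer", "price": 41400},
    {"name": "Volkswagen", "price": 21600},
]

sorted_names = [car["name"] for car in sort_documents(cars, "price", DESCENDING)]
for name in sorted_names:
    print(name)
```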


<a id="PyMongoAggregations"></a>
## PyMongo aggregations

Aggregations calculate aggregate values for the data in a collection.

In [5]:
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    agr = [ {'$group': {'_id': 1, 'all': { '$sum': '$price' } } } ]
    val = list(db.cars.aggregate(agr))
    print('The sum of prices is {}'.format(val[0]['all']))

The sum of prices is 581769


The example calculates the sum of all car prices.

`agr = [ {'$group': {'_id': 1, 'all': { '$sum': '$price' } } } ]`

The `$sum` operator calculates and returns the sum of numeric values. The `$group` operator groups input documents by a specified identifier expression and applies the accumulator expression(s), if specified, to each group.

```
val = list(db.cars.aggregate(agr))
```

The aggregate() method applies the aggregation operation on the cars collection.

The output shows that the sum of all prices is 581769.

We can use the `$match` operator to select specific cars to aggregate.

In [7]:
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    agr = [{ '$match': {'$or': [ { 'name': "Audi" }, { 'name': "Volvo" }] }}, 
           { '$group': {'_id': 1, 'sum2cars': { '$sum': "$price" } }}]
    val = list(db.cars.aggregate(agr))
    print('The sum of prices of two cars is {}'.format(val[0]['sum2cars']))

The sum of prices of two cars is 81642


The example calculates the sum of prices of Audi and Volvo cars.

```
agr = [{ '$match': {'$or': [ { 'name': "Audi" }, { 'name': "Volvo" }] }},
       { '$group': {'_id': 1, 'sum2cars': { '$sum': "$price" } }}]
```
    
The expression uses `$match`, `$or`, `$group`, and `$sum` operators to do the task.
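
As a rough illustration of what the two pipeline stages do, the sketch below performs the same steps by hand on a plain Python list. It is a simplified model of this one pipeline, not a general aggregation engine.

```python
from collections import defaultdict

cars = [
    {"name": "Audi", "price": 52642},
    {"name": "Mercedes", "price": 57127},
    {"name": "Skoda", "price": 9000},
    {"name": "Volvo", "price": 29000},
    {"name": "Bentley", "price": 350000},
    {"name": "Citroen", "price": 21000},
    {"name": "Hummer", "price": 41400},
    {"name": "Volkswagen", "price": 21600},
]

# Stage 1, like $match with $or: keep documents that satisfy either condition.
matched = [car for car in cars if car["name"] == "Audi" or car["name"] == "Volvo"]

# Stage 2, like $group with $sum: the constant _id of 1 puts every
# matched document into a single group, and the prices are added up.
totals = defaultdict(int)
for car in matched:
    totals[1] += car["price"]

result = [{"_id": group_id, "sum2cars": total} for group_id, total in totals.items()]
print(result[0]["sum2cars"])  # 81642
```

Filtering first keeps the grouping stage small, which is why `$match` is placed at the start of the pipeline.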

<a id="PyMongoLimitData"></a>
## PyMongo limit data output

The limit() query option specifies the maximum number of documents to be returned, and the skip() option specifies how many documents to skip from the start of the result set.

The example reads from the cars collection, skips the first two documents, and limits the output to three documents.

```
cars = db.cars.find().skip(2).limit(3)
```

In [8]:
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')

with client:
    db = client.testdb
    cars = db.cars.find().skip(2).limit(3)
    for car in cars:
        print('{0}: {1}'.format(car['name'], car['price']))

Skoda: 9000
Volvo: 29000
Bentley: 350000
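
On a plain Python sequence, skipping and limiting corresponds to taking a slice. The `skip_limit` helper below is hypothetical and assumes a positive limit; it does not model any special cases of the real query options.

```python
from itertools import islice


def skip_limit(documents, skip, limit):
    """Drop the first `skip` items, then return at most `limit` of the rest."""
    return list(islice(documents, skip, skip + limit))


# Car names in insertion order, as in the cars collection.
names = ["Audi", "Mercedes", "Skoda", "Volvo", "Bentley", "Citroen", "Hummer", "Volkswagen"]

page = skip_limit(names, 2, 3)
print(page)  # ['Skoda', 'Volvo', 'Bentley']
```

Combining the two options in this way is a common basis for paging through results, with skip set to the page number times the page size.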


## Back up and restore a MongoDB database

```
mongodump -h 127.0.0.1 --port 27017 --db ptt -o ptt
```

<img src="images/mongodb_backup_ptt.png">

```
mongorestore -h 127.0.0.1 --port 27017 --db ptt ptt --drop
```
<img src="images/mongodb_restore.png">
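
If the backup needs to be started from a Python script, the two commands can be assembled as argument lists first. This is a sketch only: the helper names `mongodump_args` and `mongorestore_args` are made up for this lesson, and they use just the options shown above. Actually running the commands requires the MongoDB command-line tools to be installed and on the PATH.

```python
import shlex


def mongodump_args(host, port, db, out_dir):
    """Build the mongodump command line used in this lesson."""
    return ["mongodump", "-h", host, "--port", str(port), "--db", db, "-o", out_dir]


def mongorestore_args(host, port, db, dump_dir, drop=True):
    """Build the mongorestore command line used in this lesson."""
    args = ["mongorestore", "-h", host, "--port", str(port), "--db", db, dump_dir]
    if drop:
        # --drop removes each collection from the target before restoring it
        args.append("--drop")
    return args


dump_cmd = mongodump_args("127.0.0.1", 27017, "ptt", "ptt")
restore_cmd = mongorestore_args("127.0.0.1", 27017, "ptt", "ptt")

print(shlex.join(dump_cmd))
print(shlex.join(restore_cmd))

# To execute a command, pass the list to subprocess.run, for example:
#   import subprocess
#   subprocess.run(dump_cmd, check=True)
```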