# MongoDB and Python

MongoDB is a NoSQL database that supports high-performance, document-oriented storage and queries, as well as sharding and replication.

Terminology:

- A **document** is a single JSON-like object stored in MongoDB.
- A **collection** is a repository of documents, which may have one or more indexes on it.
- A **database** is a group of collections and their indexes.
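To make the terminology concrete, here is a minimal sketch of how the three levels nest, using plain Python dicts and lists as stand-ins. The names `school` and `roster` and the sample fields are hypothetical, and nothing here talks to a MongoDB server.

```python
import json

# A document: a single JSON-like object (here, a plain dict).
document = {
    'name': 'Ada Example',           # hypothetical sample data
    'email': 'ada@example.com',
    'tags': ['student', 'remote'],   # values may be arrays...
    'address': {'city': 'Atlanta'},  # ...or nested objects
}

# A collection: a stand-in list of documents (a real collection also carries indexes).
roster = [document]

# A database: a stand-in mapping from collection names to collections.
school = {'roster': roster}

# Documents round-trip through JSON, which is why they are called "JSON-like".
# (MongoDB itself stores BSON, which adds types such as dates and ObjectIds.)
round_tripped = json.loads(json.dumps(school['roster'][0]))
print(round_tripped == document)
```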


To get started, we'll install the `pymongo` driver and the `dnspython` module, which lets us connect to MongoDB using `mongodb+srv://` URLs:

In [1]:
!pip install pymongo dnspython



## Connecting and accessing databases and collections

In [2]:
import pymongo
password = 'dFoIbycqCCbLkcQc'
cli = pymongo.MongoClient(f'mongodb+srv://class:{password}@eht-6ypgo.mongodb.net/class?retryWrites=true')
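Hard-coding the password works for a class exercise, but a password containing characters such as `@` or `/` would break the URL. The sketch below shows one way to assemble the same kind of URI with percent-encoded credentials read from the environment. The helper name `build_srv_uri`, the variable name `MONGO_PASSWORD`, and the host `example.mongodb.net` are all assumptions made for illustration.

```python
import os
from urllib.parse import quote_plus


def build_srv_uri(user, password, host, database):
    """Assemble a mongodb+srv:// URI, percent-encoding the credentials."""
    return (
        f'mongodb+srv://{quote_plus(user)}:{quote_plus(password)}'
        f'@{host}/{database}?retryWrites=true'
    )


# MONGO_PASSWORD is a hypothetical variable name; the fallback is a dummy value.
password = os.environ.get('MONGO_PASSWORD', 'p@ss/word')

# example.mongodb.net is a placeholder host, not a real cluster.
uri = build_srv_uri('class', password, 'example.mongodb.net', 'class')
```

The resulting `uri` string can be passed to `pymongo.MongoClient(uri)` in place of the f-string above.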

In [3]:
cli

MongoClient(host=['eht-shard-00-00-6ypgo.mongodb.net:27017', 'eht-shard-00-02-6ypgo.mongodb.net:27017', 'eht-shard-00-01-6ypgo.mongodb.net:27017'], document_class=dict, tz_aware=False, connect=True, authsource='admin', replicaset='eht-shard-0', ssl=True, retrywrites=True)

In [4]:
db = cli['class']
db

Database(MongoClient(host=['eht-shard-00-00-6ypgo.mongodb.net:27017', 'eht-shard-00-02-6ypgo.mongodb.net:27017', 'eht-shard-00-01-6ypgo.mongodb.net:27017'], document_class=dict, tz_aware=False, connect=True, authsource='admin', replicaset='eht-shard-0', ssl=True, retrywrites=True), 'class')

In [5]:
db.roster

Collection(Database(MongoClient(host=['eht-shard-00-00-6ypgo.mongodb.net:27017', 'eht-shard-00-02-6ypgo.mongodb.net:27017', 'eht-shard-00-01-6ypgo.mongodb.net:27017'], document_class=dict, tz_aware=False, connect=True, authsource='admin', replicaset='eht-shard-0', ssl=True, retrywrites=True), 'class'), 'roster')

## Inserting data

[Additional documentation: insert_one](http://api.mongodb.com/python/current/api/pymongo/collection.html#pymongo.collection.Collection.insert_one)

In [6]:
db.roster.insert_one({
    'name': 'Rick Copeland',
    'email': 'rick@arborian.com',
    'role': 'Instructor',
})

<pymongo.results.InsertOneResult at 0x10a529d48>

In [8]:
_.inserted_id

ObjectId('5c9417d5e6ed88a7fa1f0ffb')
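The `_id` that MongoDB generates is not random noise: the first four bytes of an ObjectId hold its creation time as seconds since the Unix epoch. As a rough illustration, the sketch below decodes that timestamp from the hex string using only the standard library. The helper name `objectid_created_at` is made up for this example; with `pymongo` installed, the `generation_time` attribute of an `ObjectId` should report the same instant.

```python
from datetime import datetime, timezone


def objectid_created_at(oid_hex):
    """Decode the creation time stored in the first 4 bytes of an ObjectId."""
    # The first 8 hex digits are a big-endian count of seconds since the epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The hex string is the inserted_id shown above.
created = objectid_created_at('5c9417d5e6ed88a7fa1f0ffb')
print(created.isoformat())
```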

## Querying data

[Additional documentation: query operators](https://docs.mongodb.com/manual/reference/operator/query/)

[Additional documentation: find](http://api.mongodb.com/python/current/api/pymongo/collection.html#pymongo.collection.Collection.find)

[Additional documentation: find_one](http://api.mongodb.com/python/current/api/pymongo/collection.html#pymongo.collection.Collection.find_one)

In [9]:
for item in db.roster.find({'name': 'Rick Copeland'}):
    print(item)

{'_id': ObjectId('5c89443bdc1bebd71b97f972'), 'name': 'Rick Copeland', 'email': 'rick@arborian.com', 'role': 'Instructor'}
{'_id': ObjectId('5c9417d5e6ed88a7fa1f0ffb'), 'name': 'Rick Copeland', 'email': 'rick@arborian.com', 'role': 'Instructor'}
{'_id': ObjectId('5c94186466cedabe39360da5'), 'name': 'Rick Copeland', 'email': 'rick@arborian.com', 'role': 'Instructor'}


In [10]:
doc = db.roster.find_one()
doc

{'_id': ObjectId('5c89443bdc1bebd71b97f972'),
 'name': 'Rick Copeland',
 'email': 'rick@arborian.com',
 'role': 'Instructor'}

In [11]:
import re
db.roster.find_one({'role': re.compile('^Ins')})

{'_id': ObjectId('5c89443bdc1bebd71b97f972'),
 'name': 'Rick Copeland',
 'email': 'rick@arborian.com',
 'role': 'Instructor'}
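Passing a compiled pattern turns the query into a regular-expression match against the field's string value. To build intuition for which values `^Ins` selects, the sketch below applies the same pattern locally to a few hypothetical roster entries. It is a plain-Python imitation, not how the server evaluates the query, and the helper `field_matches` and the sample data are invented for illustration.

```python
import re

# Hypothetical sample documents; only 'role' matters for this query.
sample_roster = [
    {'name': 'A', 'role': 'Instructor'},
    {'name': 'B', 'role': 'Inspector'},
    {'name': 'C', 'role': 'Student'},
    {'name': 'D', 'role': 'Lead Instructor'},
    {'name': 'E'},  # no 'role' field at all
]

pattern = re.compile('^Ins')


def field_matches(doc, field, pattern):
    """Return True if the field exists, is a string, and the pattern is found in it."""
    value = doc.get(field)
    return isinstance(value, str) and pattern.search(value) is not None


selected = [d['name'] for d in sample_roster if field_matches(d, 'role', pattern)]
print(selected)
```

Because the pattern is anchored with `^`, 'Lead Instructor' is not selected even though it contains 'Ins'.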

## Updating data

[Additional documentation: update operators](https://docs.mongodb.com/manual/reference/operator/update/)

[Additional documentation: replace](http://api.mongodb.com/python/current/api/pymongo/collection.html#pymongo.collection.Collection.replace_one)

[Additional documentation: update](http://api.mongodb.com/python/current/api/pymongo/collection.html#pymongo.collection.Collection.update_one)

In [12]:
doc['email'] = 'rick446@arborian.com'
db.roster.replace_one(
    {'_id': doc['_id']},
    doc
)

<pymongo.results.UpdateResult at 0x10a9bb788>

In [13]:
doc = db.roster.find_one()
doc

{'_id': ObjectId('5c9417d5e6ed88a7fa1f0ffb'),
 'name': 'Rick Copeland',
 'email': 'rick@arborian.com',
 'role': 'Instructor'}

In [14]:
db.roster.update_one(
    {'_id': doc['_id']},
    {'$set': {'email': 'rick@arborian.com'}}
)

<pymongo.results.UpdateResult at 0x10a99b908>

In [15]:
doc = db.roster.find_one()
doc

{'_id': ObjectId('5c9417d5e6ed88a7fa1f0ffb'),
 'name': 'Rick Copeland',
 'email': 'rick@arborian.com',
 'role': 'Instructor'}
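The two calls above differ in scope: `replace_one` swaps the whole document body (keeping `_id`), while `update_one` with `$set` changes only the named fields. The sketch below models that difference on plain dicts. The functions `simulate_replace` and `simulate_set` and the sample values are hypothetical stand-ins, not part of PyMongo.

```python
def simulate_replace(stored, replacement):
    """Model of replace_one: every field is replaced, but _id is kept."""
    new_doc = {'_id': stored['_id']}
    new_doc.update({k: v for k, v in replacement.items() if k != '_id'})
    return new_doc


def simulate_set(stored, fields):
    """Model of update_one with $set: only the named fields change."""
    new_doc = dict(stored)
    new_doc.update(fields)
    return new_doc


# Hypothetical stored document.
stored = {'_id': 1, 'name': 'Rick Copeland',
          'email': 'old@example.com', 'role': 'Instructor'}

replaced = simulate_replace(stored, {'email': 'new@example.com'})
patched = simulate_set(stored, {'email': 'new@example.com'})

print(replaced)  # name and role are gone
print(patched)   # name and role survive
```

This is why the replace example earlier modifies the full `doc` and sends it back, whereas the `$set` example sends only the changed field.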

## Atomic find/modify

In [19]:
coll = db.roster
doc = coll.find_one_and_update(
    {'name': 'Rick Copeland'},
    {'$inc': {'classes': 1}},
    return_document=pymongo.ReturnDocument.AFTER
)
doc

{'_id': ObjectId('5c9417d5e6ed88a7fa1f0ffb'),
 'name': 'Rick Copeland',
 'email': 'rick@arborian.com',
 'role': 'Instructor',
 'classes': 3}
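Two details of the call above are easy to miss: `$inc` creates the field if it does not exist yet, and `return_document` controls whether you get the document as it was before or after the change. The sketch below models both behaviours on an in-memory list. `simulate_find_one_and_inc` is a hypothetical single-threaded stand-in; the real operation performs the match and the update as one atomic step on the server.

```python
import copy


def simulate_find_one_and_inc(docs, query, field, amount, return_after=True):
    """Find the first doc whose fields equal the query, add amount to field."""
    for doc in docs:
        if all(doc.get(k) == v for k, v in query.items()):
            before = copy.deepcopy(doc)
            # A missing field is treated as 0, so it is created on first use.
            doc[field] = doc.get(field, 0) + amount
            return copy.deepcopy(doc) if return_after else before
    return None  # no document matched


# Hypothetical collection contents; note there is no 'classes' field yet.
docs = [{'name': 'Rick Copeland', 'role': 'Instructor'}]

after = simulate_find_one_and_inc(docs, {'name': 'Rick Copeland'}, 'classes', 1)
before = simulate_find_one_and_inc(docs, {'name': 'Rick Copeland'}, 'classes', 1,
                                   return_after=False)
missing = simulate_find_one_and_inc(docs, {'name': 'Nobody'}, 'classes', 1)
```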

## Delete

In [20]:
import re

res = coll.delete_one({'name': re.compile(r'^Ri')})
res

<pymongo.results.DeleteResult at 0x10a9bb108>

In [21]:
coll.delete_many({'name': re.compile(r'^Ri')})

<pymongo.results.DeleteResult at 0x10aa543c8>

In [22]:
res.deleted_count

1

In [23]:
list(coll.find())

[]

Open [PyMongo Lab](./pymongo-lab.ipynb)

In [24]:
coll = db.stock

In [25]:
data = [
    ("2014-01-02", "F", 12.089),
    ("2014-01-02", "TSLA", 150.1),
    ("2014-01-02", "IBM", 157.6001),
    ("2014-01-02", "AAPL", 72.7741),
    ("2014-01-03", "F", 12.1438),
    ("2014-01-03", "TSLA", 149.56),
    ("2014-01-03", "IBM", 158.543),
    ("2014-01-03", "AAPL", 71.1756),
    ("2014-01-06", "F", 12.1986),
    ("2014-01-06", "TSLA", 147.0),
    ("2014-01-06", "IBM", 157.9993),
    ("2014-01-06", "AAPL", 71.5637),
    ("2014-01-07", "F", 12.042),
    ("2014-01-07", "TSLA", 149.36),
    ("2014-01-07", "IBM", 161.1508),
    ("2014-01-07", "AAPL", 71.0516),
    ("2014-01-08", "F", 12.1673),
    ("2014-01-08", "TSLA", 151.28),
    ("2014-01-08", "IBM", 159.6728),
    ("2014-01-08", "AAPL", 71.5019),
]

In [26]:
docs = [
    {'date': date, 'symbol': symbol, 'price': price}
    for date, symbol, price in data
]

In [27]:
coll.insert_many(docs)


<pymongo.results.InsertManyResult at 0x10aa54ac8>

In [28]:
coll.find_one({'symbol': 'TSLA', 'date': '2014-05-04'})

In [29]:
coll.delete_many({})

<pymongo.results.DeleteResult at 0x10aa560c8>

In [30]:
import pandas as pd
dat = pd.read_csv('data/closing-prices.csv', index_col=0)

In [31]:
from datetime import datetime
data = []
for (date, symbol), price in dat.stack().items():
    data.append({
        'date': datetime.strptime(date, '%Y-%m-%d'),
        'symbol': symbol,
        'price': price,
    })
coll.insert_many(data)

<pymongo.results.InsertManyResult at 0x117695cc8>
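The loop above relies on `dat.stack()` turning a wide table (one row per date, one column per symbol) into one `(date, symbol) -> price` entry per cell. The layout of `closing-prices.csv` is not shown, so the sketch below assumes that shape and reproduces the same reshaping with the standard `csv` module. The inline sample reuses a few prices from the earlier list, and the header name `date` is an assumption.

```python
import csv
import io
from datetime import datetime

# Assumed wide layout: a date column followed by one column per symbol.
sample_csv = """date,F,TSLA
2014-01-02,12.089,150.1
2014-01-03,12.1438,149.56
"""

docs = []
for row in csv.DictReader(io.StringIO(sample_csv)):
    # Remove the date from the row so only symbol columns remain.
    date = datetime.strptime(row.pop('date'), '%Y-%m-%d')
    for symbol, price in row.items():
        docs.append({'date': date, 'symbol': symbol, 'price': float(price)})

for doc in docs:
    print(doc)
```

Each input row with two symbol columns yields two documents, ready for `insert_many`.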

In [33]:
coll.find_one({
    'symbol': 'TSLA',
    'date': datetime(2016, 6, 3)
})

{'_id': ObjectId('5c941c3ce6ed88a7fa1f1bbc'),
 'date': datetime.datetime(2016, 6, 3, 0, 0),
 'symbol': 'TSLA',
 'price': 218.99}
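Storing dates as `datetime` objects rather than strings means queries can use comparison operators such as `$gte` and `$lt` on them. As a sketch, the hypothetical helper `month_range_filter` below builds a filter for one symbol over one calendar month; the resulting dict could be passed to `coll.find(...)`.

```python
from datetime import datetime


def month_range_filter(symbol, year, month):
    """Build a query filter covering [first of month, first of next month)."""
    start = datetime(year, month, 1)
    # Roll over to January of the next year after December.
    end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
    return {'symbol': symbol, 'date': {'$gte': start, '$lt': end}}


query = month_range_filter('TSLA', 2016, 6)
print(query)
```

Using a half-open range (`$gte` start, `$lt` end) avoids having to know how many days the month has.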