In [None]:
!pip install pymongo

In [18]:
from pymongo import MongoClient
import datetime

In [19]:
client = MongoClient()

The code above connects to the default host and port (`localhost` and `27017`). You can also specify the host and port explicitly:

In [20]:
client = MongoClient(host="localhost", port=27017)
# OR
client = MongoClient("mongodb://localhost:27017")

client

MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True)
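
The URI form can carry more than the host and port. As a rough sketch, the snippet below builds a connection string that includes credentials. The user name and password are made-up placeholders, not values from this notebook, and special characters in them are percent-encoded before they go into the URI. The snippet only builds the string, so it runs without a server.

```python
from urllib.parse import quote_plus

# Hypothetical credentials, used only for illustration.
username = quote_plus("app_user")
password = quote_plus("p@ss/word")  # "@" and "/" must be percent-encoded

uri = f"mongodb://{username}:{password}@localhost:27017"
print(uri)
```

The resulting string can then be passed to `MongoClient(uri)` in the same way as the plain URI above.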

**Getting a Database**
Once you have a connected instance of MongoClient, you can access any database managed by the specified MongoDB server.
To select the database you want to use, you can use dot notation or dictionary-style access:

In [None]:
db = client.test_database
# OR 
db = client["test_database"]
# Dictionary-style access is handy when the name of your database isn’t a valid Python identifier.

db
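
To illustrate when dictionary-style access is needed, here is a small sketch that checks whether a name is a valid Python identifier. The helper `access_style` and the sample names are hypothetical and exist only for this illustration.

```python
def access_style(name):
    """Suggest how to reach a database called `name` from a client."""
    if name.isidentifier():
        return f"client.{name}"
    # Hyphens, leading digits, spaces etc. rule out dot notation.
    return f'client["{name}"]'


for name in ["test_database", "test-database", "2024_logs"]:
    print(access_style(name))
```

Only the first name works with dot notation; the other two need the bracket form.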

A collection is accessed in the same way as a database. In the cell below, **collection** is an instance of **Collection** and represents a physical collection of documents in your database. You can later insert documents into it by calling `.insert_one()` on it with a document as an argument.

In [None]:
collection = db.test_collection
# OR
collection = db['test_collection']
# Dictionary-style access is handy when the name of your collection isn’t a valid Python identifier.

collection

**Sample Document**
The following example shows the structure of a document for a blog site. A document is a set of comma-separated key-value pairs, which PyMongo represents as a Python dictionary.

In [None]:
post = {"author": "Mike",
        "text": "My first blog post!",
        "tags": ["mongodb", "python", "pymongo"],
        "date": datetime.datetime.utcnow()}

In [None]:
post_collection = db.posts
post_id = post_collection.insert_one(post).inserted_id
post_id

In [None]:
import pprint
pprint.pprint(post_collection.find_one())

In [None]:
pprint.pprint(post_collection.find_one({"author": "Mike"}))

In [None]:
post_collection.find_one({"author": "Eliot"})
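
When no document matches, `find_one()` returns `None`, so the result should be checked before it is indexed. The sketch below shows that pattern. So that it runs without a server, it uses `InMemoryCollection`, a hypothetical stand-in that imitates `find_one()` for simple equality filters only; with a live connection you would pass `post_collection` instead.

```python
class InMemoryCollection:
    """Hypothetical stand-in: mimics find_one() for equality filters only."""

    def __init__(self, docs):
        self._docs = list(docs)

    def find_one(self, query=None):
        query = query or {}
        for doc in self._docs:
            if all(doc.get(key) == value for key, value in query.items()):
                return doc
        return None  # same result PyMongo gives when nothing matches


def text_by_author(collection, author):
    """Return the post text for an author, or None if no post matches."""
    doc = collection.find_one({"author": author})
    return doc["text"] if doc is not None else None


posts = InMemoryCollection([{"author": "Mike", "text": "My first blog post!"}])
print(text_by_author(posts, "Mike"))   # My first blog post!
print(text_by_author(posts, "Eliot"))  # None
```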

In [None]:
pprint.pprint(post_collection.find_one({"_id": post_id}))

## Inserting Many Documents
`insert_many()` inserts multiple documents into a collection in a single call. It takes a list of dictionaries, one for each document that you want to insert.


In [None]:
post_2 = {"author": "Leo",
        "text": "Fasting 14-10",
        "tags": ["python", "pymongo", "django"],
        "date": datetime.datetime.utcnow()}

post_3 = {"author": "Jack",
        "text": "Fastest Car",
        "tags": ["mongodb", "python", "pyspark"],
        "date": datetime.datetime.utcnow()}

In [None]:
new_result = post_collection.insert_many([post_2, post_3])

for i in new_result.inserted_ids:
    pprint.pprint(post_collection.find_one({"_id": i}))

This is faster and more straightforward than calling `.insert_one()` multiple times. The call to `.insert_many()` takes an iterable of documents and inserts them into the `posts` collection in your `test_database` database.
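
Because `.insert_many()` accepts any iterable of documents, the list can be built programmatically. The following is a minimal sketch, assuming the raw data arrives as `(author, text)` tuples; the name `rows` is hypothetical, and the timezone-aware timestamp is a choice made for this example rather than something the cells above use.

```python
import datetime

# Hypothetical raw input: one (author, text) tuple per post.
rows = [("Leo", "Fasting 14-10"), ("Jack", "Fastest Car")]

docs = [
    {
        "author": author,
        "text": text,
        "date": datetime.datetime.now(datetime.timezone.utc),
    }
    for author, text in rows
]
print(len(docs))  # 2

# With a live connection, the whole list goes in with one call:
# post_collection.insert_many(docs)
```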

## Querying for More Than One Document


To retrieve documents from a collection, you can use `.find()`. Without arguments, `.find()` returns a Cursor object that yields the documents in the collection on demand:

In [None]:
for doc in post_collection.find():
    print(doc)
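
Like `.find_one()`, `.find()` also accepts a filter document. With a live connection, `post_collection.find({"tags": "django"})` would return only the posts whose `tags` array contains `"django"`. The sketch below imitates that matching rule in plain Python so it can run without a server; `matches` is a hypothetical helper that covers simple equality filters only, not the full query language.

```python
def matches(doc, query):
    """Rough imitation of an equality filter such as {"tags": "django"}."""
    for key, expected in query.items():
        actual = doc.get(key)
        if isinstance(actual, list) and not isinstance(expected, list):
            # A scalar in the filter matches any element of an array field.
            if expected not in actual:
                return False
        elif actual != expected:
            return False
    return True


posts = [
    {"author": "Mike", "tags": ["mongodb", "python", "pymongo"]},
    {"author": "Leo", "tags": ["python", "pymongo", "django"]},
    {"author": "Jack", "tags": ["mongodb", "python", "pyspark"]},
]

django_authors = [p["author"] for p in posts if matches(p, {"tags": "django"})]
mongodb_authors = [p["author"] for p in posts if matches(p, {"tags": "mongodb"})]
print(django_authors)   # ['Leo']
print(mongodb_authors)  # ['Mike', 'Jack']
```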

## Counting

If we just want to know how many documents match a query, we can perform a `count_documents()` operation instead of a full query. An empty filter `{}` matches every document, so this returns a count of all of the documents in a collection:

In [None]:
post_collection.count_documents({})

## Aggregation

MongoDB offers several ways to perform aggregations. The examples here use the aggregation framework, in which documents pass through a pipeline of stages.

Create a sample collection named `inventory` with the following document:

In [None]:
db.inventory.insert_one({"_id" : 2, "item" : "ABC1", "sizes": [ "S", "M", "L"]})

The following aggregation uses the $unwind stage to output a document for each element in the sizes array:

In [None]:
result = db.inventory.aggregate( [ { "$unwind": "$sizes" } ] )
print(list(result))
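
To make the effect of `$unwind` concrete, here is a plain-Python sketch of what the stage does to the sample document. The `unwind` function is a hypothetical imitation written for this illustration, not part of PyMongo, and it covers only the stage's default behavior.

```python
def unwind(docs, field):
    """Yield one copy of each document per element of the array in `field`."""
    for doc in docs:
        values = doc.get(field)
        if values is None:
            continue  # missing or null field: the document is dropped
        if not isinstance(values, list):
            values = [values]  # a non-array value counts as one element
        for value in values:
            yield {**doc, field: value}


inventory = [{"_id": 2, "item": "ABC1", "sizes": ["S", "M", "L"]}]
unwound = list(unwind(inventory, "sizes"))
for doc in unwound:
    print(doc)
```

The single input document becomes three documents, identical except that `sizes` holds one element each.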

In [None]:
db.inventory.insert_many([{"x": 1, "tags": ["dog", "cat"]},
                          {"x": 2, "tags": ["cat"]},
                          {"x": 2, "tags": ["mouse", "cat", "dog"]},
                          {"x": 3, "tags": []}])


In [None]:
result = db.inventory.aggregate( [ {"$unwind": "$tags"}, {"$group": {"_id": "$tags", "count": {"$sum": 1}}} ] )
print(list(result))
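
The pipeline above counts how often each tag occurs. As a cross-check, the same counts can be computed in plain Python with `collections.Counter`. This is only a sketch of what the `$unwind` and `$group` stages compute together, using the four tagged documents inserted above; the server does not work this way internally.

```python
from collections import Counter

docs = [
    {"x": 1, "tags": ["dog", "cat"]},
    {"x": 2, "tags": ["cat"]},
    {"x": 2, "tags": ["mouse", "cat", "dog"]},
    {"x": 3, "tags": []},
]

# Flatten every tags array (the $unwind step), then count each tag ($group).
counts = Counter(tag for doc in docs for tag in doc.get("tags", []))
grouped = [{"_id": tag, "count": n} for tag, n in counts.items()]
print(grouped)
```

The order of the groups is not meaningful at this point, which is why a `$sort` stage is usually added.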

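The order of keys matters in stages such as `$sort`. Plain dictionaries preserve insertion order only from Python 3.7 onward, so use `SON` or `collections.OrderedDict` where explicit ordering is required:

In [None]:
from bson.son import SON
pipeline = [
    {"$unwind": "$tags"},
    {"$group": {"_id": "$tags", "count": {"$sum": 1}}},
    # SON keeps the sort keys in the given order: count first, then _id
    {"$sort": SON([("count", -1), ("_id", -1)])}
]
# The tagged documents were inserted into db.inventory, so aggregate there
result = db.inventory.aggregate(pipeline)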

In [None]:
print(list(result))
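
The sort stage orders the groups by `count` in descending order and uses `_id`, also descending, to break ties. A plain-Python sketch of that ordering, applied to the tag counts from the sample documents, looks like this:

```python
grouped = [
    {"_id": "dog", "count": 2},
    {"_id": "cat", "count": 3},
    {"_id": "mouse", "count": 1},
]

# Sort by count first, then by _id; reverse=True makes both descending.
ordered = sorted(grouped, key=lambda d: (d["count"], d["_id"]), reverse=True)
for doc in ordered:
    print(doc)
```

Swapping the two entries in the key tuple would sort by tag name first, which is why the key order in `$sort` has to be preserved.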

In [31]:
# Clean up: remove every document from test_collection in a single call
db.test_collection.delete_many({})