# Intro to Pymongo

In [1]:
%pylab inline

from pymongo import MongoClient

Populating the interactive namespace from numpy and matplotlib


## Establishing a connection to the server

A `MongoClient` instance establishes a connection with a MongoDB server. If we call it with no arguments, it defaults to the local instance at `localhost:27017` (if one is running).

In [2]:
client = MongoClient()

In [3]:
client

MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True)

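In MongoDB, there are three levels of explicit organization: "databases" (dbs), which contain one or more "collections", which in turn contain one or more "documents". Documents can contain sub-documents, but these are not stored separately, so they are reached through their parent document (for example, with dot notation on the field path in a query).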

We use dot notation to direct our client to the appropriate db and collection. If we direct the client to a db or collection that does not exist, it is created automatically the first time we add content to it through pymongo.

Let's say we want to create a db called `my_database` with a collection `my_collection`. We can assign our db and collection to variables in our Python namespace as follows:

In [4]:
db = client.my_database
collection = db.my_collection

In [5]:
db

Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), u'my_database')

In [6]:
collection

Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), u'my_database'), u'my_collection')

Often, it is simplest to establish this with a one-liner.

In [7]:
collection = MongoClient().my_database.my_collection
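Dot notation only works when the database and collection names are valid Python attribute names. As a minimal sketch, the helper below uses the equivalent bracket notation, which also handles names containing characters such as hyphens. The name `get_collection` is ours for illustration and is not part of pymongo.

```python
def get_collection(client, db_name, collection_name):
    """Return a collection handle using bracket notation.

    `client` is assumed to be a pymongo MongoClient (or anything that
    supports the same client[db_name][collection_name] lookups).
    """
    # client[db_name] is equivalent to client.<db_name>, and
    # db[collection_name] is equivalent to db.<collection_name>
    return client[db_name][collection_name]
```

With a client from the cells above, `get_collection(client, 'my_database', 'my_collection')` should give the same collection as `client.my_database.my_collection`.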

Let's check in our terminal to see if our db and/or collection have been created. We do that in the following way:

Type `mongo` and hit `return` -- this should enter you into the mongo shell

Type `show dbs` -- this should generate a list of databases

You should see a database called `admin` and perhaps some others, but not `my_database` -- this is because it is not created until some content (documents) is inserted into it. Let's insert a document.  

## Inserting a document

In [8]:
# create a dictionary that will serve as our first document 
doc = {'a':20, 'b':'cheezits'}

In [9]:
doc

{'a': 20, 'b': 'cheezits'}

In [10]:
collection.insert_one(doc)

<pymongo.results.InsertOneResult at 0x112fc9f00>
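The `InsertOneResult` printed above is more useful when we keep a reference to it. The sketch below wraps the insert in a small helper (the name `insert_and_report` is hypothetical) and reads the result's `inserted_id` and `acknowledged` attributes. It assumes `collection` is a pymongo collection like the one created earlier.

```python
def insert_and_report(collection, document):
    """Insert one document and return (inserted_id, acknowledged)."""
    result = collection.insert_one(document)
    # inserted_id is the '_id' of the stored document;
    # acknowledged is True when the server confirmed the write
    return result.inserted_id, result.acknowledged
```

In the notebook, the same information is available by writing `result = collection.insert_one(doc)` in the cell above and then inspecting `result.inserted_id`.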

Let's go back to terminal and look at our dbs again using `show dbs`

Now we should see `my_database` in the list! 

To switch to it, type `use my_database`

And to view its collections, type `show collections`

To count the documents in the collection, type `db.my_collection.count()`

There is one document -- hopefully it's ours. Let's switch back to Python and insert one more.



In [11]:
doc2 = {'name':'joe', 'favorite animal':'doggy'}
collection.insert_one(doc2)

<pymongo.results.InsertOneResult at 0x112fc9f50>

If we type the command `db.my_collection.count()` in the terminal one more time, we should see there are now two documents. 
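The same checks can also be made from Python without leaving the notebook. The helper below is a sketch, not the only way to do it: `summarize_database` is a name we made up, and it assumes PyMongo 3.7 or later, where `list_database_names`, `list_collection_names`, and `count_documents` are available.

```python
def summarize_database(client, db_name):
    """Return database names, collection names, and per-collection counts."""
    db = client[db_name]  # same as client.my_database for db_name='my_database'
    collection_names = db.list_collection_names()  # like `show collections`
    return {
        'databases': client.list_database_names(),  # like `show dbs`
        'collections': collection_names,
        # an empty filter {} matches every document in the collection
        'counts': dict((name, db[name].count_documents({}))
                       for name in collection_names),
    }
```

Calling `summarize_database(client, 'my_database')` at this point should report `my_collection` with a count of two, provided nothing else has been written to it.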

Notice that the contents of these two documents DON'T MATCH! MongoDB does not require documents in a collection to have the same fields, so you must take care in constructing and modifying documents. It is generally best if all documents across a collection share the same "schema", i.e. the set of keys to which values are assigned.
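One way to spot mismatched documents before inserting them is to group them by their set of keys. This is a plain-Python sketch that needs no database; `schema_groups` is a hypothetical helper, and it only compares top-level keys.

```python
def schema_groups(documents, ignore=('_id',)):
    """Group documents by their set of top-level keys."""
    groups = {}
    for document in documents:
        # frozenset makes the key set hashable and order-independent
        keys = frozenset(k for k in document if k not in ignore)
        groups.setdefault(keys, []).append(document)
    return groups

docs = [{'a': 20, 'b': 'cheezits'},
        {'name': 'joe', 'favorite animal': 'doggy'}]
print(len(schema_groups(docs)))  # 2 distinct key sets, so the schemas differ
```

A collection with a consistent schema would produce a single group.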

Also, if we look at our two dictionaries, we will notice they have been modified.

In [12]:
print doc, doc2

{'a': 20, 'b': 'cheezits', '_id': ObjectId('5760a1fcc4fc3347764a0cbc')} {'favorite animal': 'doggy', 'name': 'joe', '_id': ObjectId('5760a1fcc4fc3347764a0cbd')}


Each document now contains a field named `'_id'` with an `ObjectId('SomeLongString')`. This was generated automatically when the document was inserted: if a document has no `'_id'`, pymongo adds one to our local dictionary before sending it to MongoDB, which is why the dictionaries changed.

The `'_id'` field is special in MongoDB: it is the unique identifier of a document within its collection. When an arbitrary identifier is acceptable, it makes sense to let it be generated automatically, which is also fast. However, sometimes it makes sense to assign it ourselves, for instance when another system or database contains unique identifiers that must be kept in correspondence.

Let's create and insert one more document, this time specifying our own `'_id'`:

In [13]:
doc3 = {'_id':'turtle', 'color':'green'}
collection.insert_one(doc3)

<pymongo.results.InsertOneResult at 0x112fc9eb0>

## Querying

In [14]:
print(collection.find_one({'_id':'turtle'}))

{u'color': u'green', u'_id': u'turtle'}
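`find_one` returns a single document (or `None` if nothing matches). To retrieve every match, pymongo's `find` returns a cursor that we can iterate over. The helper below is a minimal sketch under that assumption; `find_matching` is our own name, and it loads all results into memory, which is only sensible for small collections.

```python
def find_matching(collection, query=None):
    """Return every document in `collection` that matches `query` as a list."""
    # an empty filter {} matches all documents
    cursor = collection.find(query or {})
    return list(cursor)
```

If only the three documents above have been inserted, `find_matching(collection, {'color': 'green'})` should return a one-element list holding the turtle document, and `find_matching(collection)` should return all three.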
