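Start by connecting to a MongoDB instance (with no arguments, MongoClient connects to localhost on port 27017) and dropping the test collection so the example begins from a clean state.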

In [1]:
import pymongo

conn = pymongo.MongoClient()
db = conn.test
coll = db.objects
coll.drop()

Insert a couple of documents that represent themselves naturally in JSON.

In [2]:
import datetime

coll.insert_one({'a': datetime.datetime.now(), 'b': 1.0})
coll.insert_one({'items': [1, 2, 3, 'd']})

<pymongo.results.InsertOneResult at 0x10489b708>

But what about more complex objects (classes and instances of those classes)?

In [3]:
class MyObject:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'{self.__class__.__name__}({self.x}, {self.y})'


ob1 = MyObject(1, 2)
ob2 = MyObject(0, 5)

In [4]:
coll.insert_one(ob1)

TypeError: document must be an instance of dict, bson.son.SON, bson.raw_bson.RawBSONDocument, or a type that inherits from collections.MutableMapping
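
The error arises because insert_one accepts only mapping types. A naive workaround, sketched here with a hypothetical stand-in class `Point` (not part of the original notebook), is to pass the instance's attribute dictionary from `vars()`. That would satisfy the type check, but the resulting document no longer records which class it came from, so it cannot be turned back into an object automatically.

```python
class Point:
    """Hypothetical stand-in for a user-defined class such as MyObject."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# vars() returns the instance's attribute dictionary; copying it yields a
# plain dict, which is a mapping type.
naive_doc = dict(vars(Point(1, 2)))

# The class name is gone: nothing in the document says it was a Point.
print(naive_doc)  # {'x': 1, 'y': 2}
```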

In [5]:
import jaraco.modb

jaraco.modb.encode(ob1)

{'py/object': '__main__.MyObject', 'x': 1, 'y': 2}

Because a MyObject instance has no natural representation in JSON, it is serialized as a dictionary with a special key, 'py/object', which signals to the decoder that the document is a jsonpickled Python object. As long as the system doing the decoding provides `__main__.MyObject` with a compatible interface, the object decodes cleanly.
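
To make the tagging idea concrete, here is a minimal sketch of an encoder and decoder that use a 'py/object' key in the same spirit. It is not how jaraco.modb or jsonpickle are implemented: the names `Point`, `REGISTRY`, `encode_sketch`, and `decode_sketch` are hypothetical, and the class is looked up in an explicit registry rather than imported by its dotted path.

```python
class Point:
    """Hypothetical stand-in for a user-defined class such as MyObject."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# Assumption: an explicit tag -> class table replaces importing by dotted path.
REGISTRY = {'__main__.Point': Point}
TAGS = {cls: tag for tag, cls in REGISTRY.items()}


def encode_sketch(obj):
    """Return a dict tagged with 'py/object' for registered classes."""
    tag = TAGS.get(type(obj))
    if tag is None:
        return obj  # assume plain values are already document-friendly
    doc = {'py/object': tag}
    for name, value in vars(obj).items():
        doc[name] = encode_sketch(value)  # recurse into nested objects
    return doc


def decode_sketch(doc):
    """Rebuild an instance from a dict carrying a 'py/object' tag."""
    if not (isinstance(doc, dict) and 'py/object' in doc):
        return doc
    cls = REGISTRY[doc['py/object']]
    obj = cls.__new__(cls)  # skip __init__ and restore attributes directly
    for name, value in doc.items():
        if name != 'py/object':
            setattr(obj, name, decode_sketch(value))
    return obj


encoded = encode_sketch(Point(1, 2))
decoded = decode_sketch(encoded)
```

The decoder can only succeed when the tag resolves to a class, which is the sketch's version of the requirement that the decoding system provide a compatible implementation.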

In [6]:
coll.insert_one(jaraco.modb.encode(ob1))
coll.insert_one(jaraco.modb.encode(ob2))

<pymongo.results.InsertOneResult at 0x10490c288>

Now the two objects should be persisted to the database. Query them to see how they appear.

In [7]:
list(coll.find())

[{'_id': ObjectId('56c9f5ff10334e93a39158b1'),
  'a': datetime.datetime(2016, 2, 21, 13, 38, 7, 359000),
  'b': 1.0},
 {'_id': ObjectId('56c9f5ff10334e93a39158b2'), 'items': [1, 2, 3, 'd']},
 {'_id': ObjectId('56c9f61410334e93a39158b3'),
  'py/object': '__main__.MyObject',
  'x': 1,
  'y': 2},
 {'_id': ObjectId('56c9f61410334e93a39158b4'),
  'py/object': '__main__.MyObject',
  'x': 0,
  'y': 5}]

In [8]:
next(map(jaraco.modb.decode, coll.find({'x': 0})))

MyObject(0, 5)

What about objects that contain other objects? Consider ob3, whose x attribute is another MyObject.

In [9]:
ob3 = MyObject(ob2, 2)
ob3

MyObject(MyObject(0, 5), 2)

In [10]:
coll.insert_one(jaraco.modb.encode(ob3))

<pymongo.results.InsertOneResult at 0x10489bf30>

Because MongoDB's document query engine can reach deep into documents, one can even query on a child object's attributes.

In [11]:
# Find all objects whose x attribute has a y attribute with a value of 5
query = {'x.y': 5}
next(map(jaraco.modb.decode, coll.find(query)))

MyObject(MyObject(0, 5), 2)

Where are the limitations? What about integer keys?

In [12]:
coll.insert_one({1: 3})

InvalidDocument: documents must have only string keys, key was 1

In [14]:
res = coll.insert_one(jaraco.modb.encode({1: 3}))

In [17]:
coll.find_one({'_id': res.inserted_id})

{'1': 3, '_id': ObjectId('56c9f63910334e93a39158b8')}

Note that the integer key 1 is now the string '1'. This limitation is an unfortunate side effect of relying on JSON as a serialization layer, since JSON object keys must be strings.
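
The same key conversion can be reproduced with the standard library's json module alone, with no database involved. The second half of the sketch shows one possible workaround, storing the mapping as a list of key/value pairs. This is an illustration only, not something jaraco.modb is claimed to do.

```python
import json

original = {1: 3}

# JSON object keys are always strings, so the integer key does not survive.
round_tripped = json.loads(json.dumps(original))
print(round_tripped)  # {'1': 3}

# Possible workaround: serialize the items as pairs, which keeps key types
# that JSON can represent as values (numbers, strings, booleans, null).
pairs = json.loads(json.dumps(list(original.items())))
restored = dict(pairs)
print(restored)  # {1: 3}
```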