This repository has been archived by the owner on Jul 14, 2020. It is now read-only.

Structure

namlook edited this page Aug 25, 2013 · 2 revisions

The Structure

The structure is a simple dict which defines the document's schema.

Field Types

Field types are simple Python types. By default, MongoKit allows the following types:

- None # Untyped field
- bool 
- int 
- float
- long
- basestring
- unicode
- list
- dict
- datetime.datetime
- bson.binary.Binary
- bson.objectid.ObjectId
- bson.dbref.DBRef
- bson.code.Code
- type(re.compile(""))
- uuid.UUID
- CustomType

Untyped field

Sometimes you don't want to specify a type for a field. To allow a field to hold any of the authorized types, set the field to None in the structure:

class MyDoc(Document):
  structure = {
    'foo': basestring,
    'bar': None
  }

In this example, bar can be of any of the types listed above except a CustomType.
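To picture what an untyped field means, here is a minimal standalone sketch. It is not MongoKit's implementation: the function name field_is_valid and the AUTHORIZED_TYPES tuple are hypothetical, the tuple is only a subset of the list above, and it uses Python 3's str where this page uses basestring and unicode.

```python
import datetime

# Hypothetical subset of the authorized types listed above (Python 3 names).
AUTHORIZED_TYPES = (bool, int, float, str, list, dict, datetime.datetime)


def field_is_valid(value, expected_type):
    """Return True if value satisfies expected_type.

    expected_type=None plays the role of an untyped field.
    """
    if value is None:
        # Assumption of this sketch: an unset field is always accepted.
        return True
    if expected_type is None:
        # Untyped field: any authorized type is fine.
        return isinstance(value, AUTHORIZED_TYPES)
    return isinstance(value, expected_type)


structure = {'foo': str, 'bar': None}

print(field_is_valid(3, structure['bar']))         # True
print(field_is_valid([1, 2], structure['bar']))    # True
print(field_is_valid(3, structure['foo']))         # False
print(field_is_valid(object(), structure['bar']))  # False
```

The last call shows that "untyped" does not mean "anything goes": the value must still be one of the authorized types.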

Nested Structure

MongoDB allows documents to include nested structures using lists and dicts. You can describe these nested structures in the structure dict as well.

Dicts

Python's dict syntax {} is used to describe a nested structure:

class Person(Document):
  structure = {
    'biography': {
      'name': basestring,
      'age': int
    }
  }

This validates that each document has a biography dict which contains a string name and an integer age.

If you want a nested dict with no validation, you must use the dict type keyword instead:

class Person(Document):
  structure = {
    'biography': dict
  }

If you don't specify the nested structure and don't use the dict type keyword, you won't be able to add values to the nested structure:

class Person(Document):
  structure = {
    'biography': {}
  }

then

>>> bob = Person()
>>> bob['biography']['foo'] = 'bar'
>>> bob.validate()
Traceback (most recent call last):
...
StructureError: unknown fields : ['foo']

Using the dict type is useful if you don't know what fields will be added or what types they will be. If you know the types of the keys and values, it's better to specify them explicitly:

class Person(Document):
  structure = {
    'biography': {
      unicode: unicode
    }
  }

This adds another validation layer: every key and every value in biography must be unicode.
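The four ways of describing a nested dict (named fields, the dict keyword, an empty {}, and a type used as a key) can be compared side by side with a small standalone validator. This is only a sketch of the rules described in this section, not MongoKit's code: validate_dict and join are hypothetical names, the error wording is approximate, and str stands in for unicode because the sketch targets Python 3.

```python
def join(path, key):
    """Build a dotted path such as 'biography.age'."""
    return '%s.%s' % (path, key) if path else str(key)


def validate_dict(doc, spec, path=''):
    """Return a list of error messages for `doc` checked against `spec`."""
    if not isinstance(doc, dict):
        return ['%s must be a dict' % (path or 'document')]
    if spec is dict:
        # Bare `dict` keyword: any dict passes, contents are not inspected.
        return []
    # A type used as a key (e.g. {str: str}) constrains every key and value.
    # Limitation of this sketch: the value must be a plain type, not a dict.
    type_keys = [key for key in spec if isinstance(key, type)]
    if type_keys:
        key_type = type_keys[0]
        value_type = spec[key_type]
        return ['%s has the wrong key or value type' % join(path, key)
                for key, value in doc.items()
                if not (isinstance(key, key_type)
                        and isinstance(value, value_type))]
    errors = []
    # Named fields: anything not declared is rejected (this covers {}).
    unknown = sorted(key for key in doc if key not in spec)
    if unknown:
        errors.append('unknown fields : %r' % unknown)
    for key, expected in spec.items():
        if key not in doc or doc[key] is None:
            continue
        child = join(path, key)
        if expected is dict or isinstance(expected, dict):
            errors.extend(validate_dict(doc[key], expected, child))
        elif not isinstance(doc[key], expected):
            errors.append('%s must be an instance of %s not %s' % (
                child, expected.__name__, type(doc[key]).__name__))
    return errors


bob = {'biography': {'foo': 'bar'}}
print(validate_dict(bob, {'biography': {}}))          # ["unknown fields : ['foo']"]
print(validate_dict(bob, {'biography': dict}))        # []
print(validate_dict(bob, {'biography': {str: str}}))  # []
print(validate_dict({'biography': {'name': 'Bob', 'age': 'old'}},
                    {'biography': {'name': str, 'age': int}}))
```

The same document is rejected by the empty {} spec and accepted by the other two, which is the practical difference between "no fields allowed" and "any fields allowed".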

Lists

The basic way to use a list is with no validation of its contents:

    'tags': list

In this example, the tags value must be a list but the contents of tags can be anything at all. To validate the contents of a list, you use Python's list syntax [] instead:

    'tags': [basestring]

You can also validate an array of complex objects by using a dict:

    'tags': [
      {
        'name': basestring,
        'count': int
      }
    ]
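The three list forms above differ only in how deeply the items are checked. A minimal standalone sketch, assuming Python 3 (str in place of basestring) and a hypothetical helper named validate_list whose error wording is not MongoKit's:

```python
def validate_list(name, value, spec):
    """Return a list of error messages for `value` checked against `spec`."""
    if not isinstance(value, list):
        return ['%s must be a list' % name]
    if spec is list:
        # Bare `list`: the contents are not inspected at all.
        return []
    item_spec = spec[0]  # [str] or [{...}]: one spec applied to every item
    errors = []
    for index, item in enumerate(value):
        where = '%s[%d]' % (name, index)
        if isinstance(item_spec, dict):
            # List of complex objects: check each declared key of each item.
            if not isinstance(item, dict):
                errors.append('%s must be a dict' % where)
                continue
            for key, expected in item_spec.items():
                if key in item and not isinstance(item[key], expected):
                    errors.append('%s.%s must be an instance of %s' % (
                        where, key, expected.__name__))
        elif not isinstance(item, item_spec):
            errors.append('%s must be an instance of %s' % (
                where, item_spec.__name__))
    return errors


print(validate_list('tags', ['a', 1], list))   # []
print(validate_list('tags', ['a', 1], [str]))  # ['tags[1] must be an instance of str']
print(validate_list('tags', [{'name': 'x', 'count': '3'}],
                    [{'name': str, 'count': int}]))
```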

Tuples

If you need a structured list with a fixed number of items, you can use tuple to describe it:

class MyDoc(Document):
  structure = {
    'book': (int, basestring, float)
  }

then

>>> con.register([MyDoc])
>>> mydoc = tutorial.MyDoc()
>>> mydoc['book']
[None, None, None]

Tuples are converted into simple lists and add another validation layer. Each item must have the right type:

>>> mydoc['book'] = ['foobar', 1, 1.0]
>>> mydoc.validate()
Traceback (most recent call last):
...
SchemaTypeError: book must be an instance of int not basestring

And they must have the right number of items:

>>> mydoc['book'] = [1, 'foobar']
>>> mydoc.validate()
Traceback (most recent call last):
...
SchemaTypeError: book must have 3 items not 2

As tuples are converted to lists internally, you can perform all list operations:

>>> mydoc['book'] = [1, 'foobar', 3.2]
>>> mydoc.validate()
>>> mydoc['book'][0] = 50
>>> mydoc.validate()
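The two tuple checks (item count and per-position type) can be sketched without MongoKit as follows. The helper name validate_tuple_field is hypothetical, the messages imitate the ones shown above, and str replaces basestring because the sketch is written for Python 3.

```python
def validate_tuple_field(name, value, spec):
    """Return an error message, or None if `value` matches the tuple `spec`."""
    if len(value) != len(spec):
        return '%s must have %d items not %d' % (name, len(spec), len(value))
    for item, expected in zip(value, spec):
        # None marks a slot that has not been filled in yet.
        if item is not None and not isinstance(item, expected):
            return '%s must be an instance of %s not %s' % (
                name, expected.__name__, type(item).__name__)
    return None


book_spec = (int, str, float)
empty_book = [None] * len(book_spec)  # the default value: one None per slot

print(empty_book)                                                  # [None, None, None]
print(validate_tuple_field('book', ['foobar', 1, 1.0], book_spec))
print(validate_tuple_field('book', [1, 'foobar'], book_spec))
print(validate_tuple_field('book', [1, 'foobar', 3.2], book_spec))  # None
```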

Sets

Python's set type is not supported by pymongo. If you want to use it anyway, use the Set() custom type:

class MyDoc(Document):
  structure = {
    'tags': Set(unicode),
  }

Using Custom Types

Sometimes we need to work with complex objects whose footprint in the database is fairly simple. Take a datetime object: it is useful for date arithmetic, and although MongoDB can store datetime objects natively, let's say that we just want to store the unicode representation.

MongoKit allows you to work on a datetime object and store the unicode representation on the fly. In order to do this, we have to implement a CustomType and fill in the custom_types attribute:

>>> import datetime

A CustomType object must implement two methods and one attribute:

  • to_bson(self, value): converts the value to an authorized type before it is saved in the database.
  • to_python(self, value): converts the value read from the database into a Python object.
  • validate(self, value, path): optional; adds a validation layer. See the code of the Set() CustomType for an example.
  • mongo_type: required property of the CustomType class. It describes the type of the value stored in MongoDB.
  • python_type: optional property giving the Python type the value is converted to. Specifying it adds validation and also serves as documentation.
  • init_type: optional attribute describing the empty value. For example, if you implement the Python set as a CustomType, you would set init_type to Set. Note that init_type must be a type or a callable instance.

Example:

class CustomDate(CustomType):
    mongo_type = unicode
    python_type = datetime.datetime  # optional, just for more validation
    init_type = None  # optional, fill the first empty value

    def to_bson(self, value):
        """convert type to a mongodb type"""
        return unicode(value.strftime('%y-%m-%d'))

    def to_python(self, value):
        """convert type to a python object"""
        if value is not None:
            return datetime.datetime.strptime(value, '%y-%m-%d')

    def validate(self, value, path):
        """OPTIONAL : useful to add a validation layer"""
        if value is not None:
            pass  # ... do something here

Now, let's create a Document:

class Foo(Document):
    structure = {
        'foo':{
            'date': CustomDate(),
        },
    }

Now we can create Foo objects and work with Python datetime objects:

>>> foo = Foo()
>>> foo['_id'] = 1
>>> foo['foo']['date'] = datetime.datetime(2003,2,1)
>>> foo.save()

The saved object in db has the unicode footprint as expected:

>>> tutorial.find_one({'_id':1})
{u'_id': 1, u'foo': {u'date': u'03-02-01'}}

Querying an object will automatically convert the CustomType into the correct python object:

>>> foo = tutorial.Foo.get_from_id(1)
>>> foo['foo']['date']
datetime.datetime(2003, 2, 1, 0, 0)
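The round trip above can be reproduced without a database to see exactly what the two conversion methods do. The class below is a standalone stand-in written for Python 3: DateAsText is a hypothetical name, it does not inherit from MongoKit's CustomType, and the None guard in to_bson is an addition of this sketch.

```python
import datetime


class DateAsText(object):
    """Stand-in for a custom type; same format string as CustomDate."""
    mongo_type = str
    python_type = datetime.datetime
    date_format = '%y-%m-%d'

    def to_bson(self, value):
        """Python object -> stored representation."""
        if value is None:
            return None
        return value.strftime(self.date_format)

    def to_python(self, value):
        """Stored representation -> Python object."""
        if value is None:
            return None
        return datetime.datetime.strptime(value, self.date_format)


converter = DateAsText()
stored = converter.to_bson(datetime.datetime(2003, 2, 1))
restored = converter.to_python(stored)
print(stored)    # 03-02-01
print(restored)  # 2003-02-01 00:00:00

# The stored form keeps less information than the Python object.
old = converter.to_python(converter.to_bson(datetime.datetime(1950, 1, 1)))
print(old.year)  # 2050
```

The last lines illustrate a design point worth keeping in mind when writing a custom type: to_python can only rebuild what to_bson stored. With a two-digit year and no time of day, dates before 1969 come back in the wrong century and any time component is dropped, so a format such as '%Y-%m-%d' is a safer choice when that matters.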

OR, NOT, and IS types

You can also use boolean logic to do field type validation.

OR operator

Let's say that we have a field which can be a unicode, an int or a float. We can use the OR operator to tell MongoKit how to validate the field:

>>> from mongokit import OR
>>> from datetime import datetime
>>> class Account(Document): 
...     structure = { 
...         "balance": {'foo': OR(unicode, int, float)} 
...     } 

>>> con.register([Account])
>>> account = tutorial.Account()
>>> account['balance']['foo'] = u'3.0'
>>> account.validate()
>>> account['balance']['foo'] = 3.0
>>> account.validate()

but:

>>> account['balance']['foo'] = datetime.now()
>>> account.validate()
Traceback (most recent call last):
...
SchemaTypeError: balance.foo must be an instance of <unicode or int or float> not datetime

NOT operator

You can also use the NOT operator to tell MongoKit which types you don't want for a field:

>>> from mongokit import NOT
>>> class Account(Document): 
...     structure = { 
...         "balance": {'foo': NOT(unicode, datetime)} 
...     } 

>>> con.register([Account])
>>> account = tutorial.Account()
>>> account['balance']['foo'] = 3
>>> account.validate()
>>> account['balance']['foo'] = 3.0
>>> account.validate()

and:

>>> account['balance']['foo'] = datetime.now()
>>> account.validate()
Traceback (most recent call last):
...
SchemaTypeError: balance.foo must be an instance of <not unicode, not datetime> not datetime

>>> account['balance']['foo'] = u'3.0'
>>> account.validate()
Traceback (most recent call last):
...
SchemaTypeError: balance.foo must be an instance of <not unicode, not datetime> not unicode

IS operator

Sometimes you might want to force a field to take one of a set of specific values. The IS operator is used for this purpose:

>>> from mongokit import IS
>>> class Account(Document):
...     structure = {
...         "flag": {'foo': IS(u'spam', u'controversy', u'phishing')}
...     }

>>> con.register([Account])
>>> account = tutorial.Account()
>>> account['flag']['foo'] = u'spam'
>>> account.validate()
>>> account['flag']['foo'] = u'phishing'
>>> account.validate()

and:

>>> account['flag']['foo'] = u'foo'
>>> account.validate()
Traceback (most recent call last):
...
SchemaTypeError: flag.foo must be in [u'spam', u'controversy', u'phishing'] not foo
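The three operators in this section can be pictured as small predicate objects: OR and NOT test the type of a value, while IS tests the value itself. The classes below are a standalone illustration with hypothetical names (AnyOf, NoneOf, OneOf) and a hypothetical accepts method; they are not MongoKit's OR, NOT and IS classes, and the sketch uses Python 3's str instead of unicode.

```python
import datetime


class AnyOf(object):
    """Plays the role of OR: the value must be one of the given types."""
    def __init__(self, *types):
        self.types = types

    def accepts(self, value):
        return isinstance(value, self.types)


class NoneOf(object):
    """Plays the role of NOT: the value must be none of the given types."""
    def __init__(self, *types):
        self.types = types

    def accepts(self, value):
        return not isinstance(value, self.types)


class OneOf(object):
    """Plays the role of IS: the value must equal one of the given values."""
    def __init__(self, *values):
        self.values = values

    def accepts(self, value):
        return value in self.values


now = datetime.datetime.now()

balance = AnyOf(str, int, float)
print(balance.accepts(3.0), balance.accepts(now))        # True False

no_text = NoneOf(str, datetime.datetime)
print(no_text.accepts(3), no_text.accepts('3.0'))        # True False

flag = OneOf('spam', 'controversy', 'phishing')
print(flag.accepts('phishing'), flag.accepts('foo'))     # True False
```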

Schemaless Structure

One of the main advantages of MongoDB is the ability to insert schemaless documents into the database. As of version 0.7, MongoKit allows you to save partially structured documents. For now, this feature must be activated explicitly. It will be the default behavior in a future release.

To enable schemaless support, use the use_schemaless attribute:

class MyDoc(Document):
    use_schemaless = True

Setting use_schemaless to True allows the structure to be left unset, but you can still specify one:

class MyDoc(Document):
    use_schemaless = True
    structure = {
        'title': basestring,
        'age': int
    }

    required_fields = ['title']

MongoKit will raise an exception only if required fields are missing:

>>> doc = MyDoc({'age': 21})
>>> doc.save()
Traceback (most recent call last):
...
StructureError: missed fields : ['title']
>>> doc = MyDoc({'age': 21, 'title': 'Hello World !'})
>>> doc.save()
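The rule stated above (only missing required fields cause an error) can be sketched in a few lines of plain Python. This is an illustration, not MongoKit's code: check_schemaless is a hypothetical name, ValueError stands in for StructureError, and type checking of declared fields is deliberately left out because this page does not say how it behaves in schemaless mode.

```python
def check_schemaless(doc, required_fields):
    """Raise if a required field is missing; accept any other fields."""
    missing = [field for field in required_fields if doc.get(field) is None]
    if missing:
        # ValueError is a stand-in for MongoKit's StructureError.
        raise ValueError('missed fields : %r' % missing)
    return doc


required_fields = ['title']

try:
    check_schemaless({'age': 21}, required_fields)
except ValueError as error:
    print(error)  # missed fields : ['title']

# Fields that appear in no structure ('nickname' here) are accepted.
doc = check_schemaless(
    {'age': 21, 'title': 'Hello World !', 'nickname': 'bob'},
    required_fields)
print(sorted(doc))  # ['age', 'nickname', 'title']
```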