### Experiment with MongoEngine

MongoEngine is the ODM (object-document mapper) we use to access MongoDB.

In [1]:
from dexter.DB import DB, Account, Entry, Transaction, Document

from datetime import date

Open the database:

In [2]:
DB.open('pytest')

Make an account:

In [3]:
acct = Account(name='equity', category='Q')

Save it:

In [4]:
acct.save()

<Account: <Acct equity Q>

If we open that DB with `mongosh` we should see the account.

```
$ mongosh

test> use foo
switched to db foo

foo> db.account.find()
[
  {
    _id: ObjectId('67c61fa19d0161a19b80469e'),
    name: 'equity',
    group: 'equity'
  }
]
```

It worked!  🎉

### Contents of a Collection

In [5]:
Account.objects

[<Account: <Acct equity Q>, <Account: <Acct yoyodyne I>, <Account: <Acct checking A>, <Account: <Acct amex L>, <Account: <Acct visa L>, <Account: <Acct groceries E>, <Account: <Acct household E>, <Account: <Acct mortgage E>, <Account: <Acct car E>, <Account: <Acct travel E>, <Account: <Acct equity Q>]

In [6]:
Account.objects[0]

<Account: <Acct equity Q>

In [7]:
acct = Account.objects[0]

In [8]:
acct.name

'equity'

### Low Level API

We can also connect to the DB directly to use the `pymongo` library, _e.g._ to get collection names.

After calling `DB.open` we can get a reference to the client and the current database using static vars of the module:

In [9]:
DB.client

MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, read_preference=Primary(), uuidrepresentation=4, driver=DriverInfo(name='MongoEngine', version='0.29.1', platform=None))

In [10]:
DB.database

Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, read_preference=Primary(), uuidrepresentation=4, driver=DriverInfo(name='MongoEngine', version='0.29.1', platform=None)), 'pytest')

In [11]:
db = DB.database

In [12]:
db.account

Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, read_preference=Primary(), uuidrepresentation=4, driver=DriverInfo(name='MongoEngine', version='0.29.1', platform=None)), 'pytest'), 'account')

In [13]:
db.account.find_one()

{'_id': ObjectId('67e4d0bd36cdfee80581e57d'),
 'name': 'equity',
 'category': 'Q'}

In [14]:
for c in db.list_collections():
    print(c)

{'name': 'entry', 'type': 'collection', 'options': {}, 'info': {'readOnly': False, 'uuid': Binary(b'J\x04F\xec\r\x1fH\x92\x9c\x80N\xf2\x9f\xeaJ\x89', 4)}, 'idIndex': {'v': 2, 'key': {'_id': 1}, 'name': '_id_'}}
{'name': 'account', 'type': 'collection', 'options': {}, 'info': {'readOnly': False, 'uuid': Binary(b'\xac\xcc\x14\xdf\xa3\\J\x89\xb0\xb8c\x00\x84\xef\xe8!', 4)}, 'idIndex': {'v': 2, 'key': {'_id': 1}, 'name': '_id_'}}
{'name': 'transaction', 'type': 'collection', 'options': {}, 'info': {'readOnly': False, 'uuid': Binary(b'\xf2I\xcb=\x82\xf8O\xe6\xb3Wf\xec\xb5;`\xfa', 4)}, 'idIndex': {'v': 2, 'key': {'_id': 1}, 'name': '_id_'}}


In [15]:
for name in db.list_collection_names():
    print(name)

entry
account
transaction


In [16]:
db['account']

Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, read_preference=Primary(), uuidrepresentation=4, driver=DriverInfo(name='MongoEngine', version='0.29.1', platform=None)), 'pytest'), 'account')

In [17]:
db['account'].find_one()

{'_id': ObjectId('67e4d0bd36cdfee80581e57d'),
 'name': 'equity',
 'category': 'Q'}

In [18]:
for obj in db['account'].find():
    print(obj)

{'_id': ObjectId('67e4d0bd36cdfee80581e57d'), 'name': 'equity', 'category': 'Q'}
{'_id': ObjectId('67e4d0bd36cdfee80581e57e'), 'name': 'yoyodyne', 'category': 'I'}
{'_id': ObjectId('67e4d0bd36cdfee80581e57f'), 'name': 'checking', 'category': 'A'}
{'_id': ObjectId('67e4d0bd36cdfee80581e580'), 'name': 'amex', 'category': 'L'}
{'_id': ObjectId('67e4d0bd36cdfee80581e581'), 'name': 'visa', 'category': 'L'}
{'_id': ObjectId('67e4d0bd36cdfee80581e582'), 'name': 'groceries', 'category': 'E'}
{'_id': ObjectId('67e4d0bd36cdfee80581e583'), 'name': 'household', 'category': 'E'}
{'_id': ObjectId('67e4d0bd36cdfee80581e584'), 'name': 'mortgage', 'category': 'E'}
{'_id': ObjectId('67e4d0bd36cdfee80581e585'), 'name': 'car', 'category': 'E'}
{'_id': ObjectId('67e4d0bd36cdfee80581e586'), 'name': 'travel', 'category': 'E'}
{'_id': ObjectId('67e4d19c32bc3fe7eeceadf4'), 'name': 'equity', 'category': 'Q'}


### From Low Level to High Level

Question:  given a collection name ("account") can we find the corresponding MongoEngine class (Account)?

In [19]:
Document

mongoengine.document.Document

In [20]:
Document.__subclasses__()

[mongoengine.document.DynamicDocument,
 dexter.DB.Account,
 dexter.DB.Entry,
 dexter.DB.Transaction]

In [21]:
[cls for cls in Document.__subclasses__() if hasattr(cls, 'objects')]

[dexter.DB.Account, dexter.DB.Entry, dexter.DB.Transaction]

In [22]:
Account._meta

{'abstract': False,
 'max_documents': None,
 'max_size': None,
 'ordering': [],
 'indexes': [],
 'id_field': 'id',
 'index_background': False,
 'index_opts': None,
 'delete_rules': None,
 'allow_inheritance': None,
 'collection': 'account',
 'index_specs': []}

In [23]:
for cls in Document.__subclasses__():
    if not hasattr(cls, 'objects'):
        continue
    print(cls._meta['collection'], cls)

account <class 'dexter.DB.Account'>
entry <class 'dexter.DB.Entry'>
transaction <class 'dexter.DB.Transaction'>
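The loop above answers the question, so the lookup can be wrapped in a small helper. This is only a sketch: `find_document_class` is a name made up here, and the `Fake...` classes are stand-ins so the code runs without MongoDB. It relies only on the `_meta['collection']` and `_meta['abstract']` entries seen in the output above.

```python
def find_document_class(base, collection_name):
    """Return the subclass of `base` whose _meta['collection'] matches, else None."""
    stack = list(base.__subclasses__())
    while stack:
        cls = stack.pop()
        meta = getattr(cls, '_meta', None) or {}
        if meta.get('collection') == collection_name and not meta.get('abstract'):
            return cls
        # Also look at indirect subclasses (models that inherit from models).
        stack.extend(cls.__subclasses__())
    return None


# Stand-in classes (hypothetical) that mimic the _meta dict shown above.
class FakeDocument:
    _meta = {'abstract': True, 'collection': None}

class FakeAccount(FakeDocument):
    _meta = {'abstract': False, 'collection': 'account'}

class FakeEntry(FakeDocument):
    _meta = {'abstract': False, 'collection': 'entry'}

print(find_document_class(FakeDocument, 'account'))
print(find_document_class(FakeDocument, 'nope'))
```

With the real models the call would presumably be `find_document_class(Document, 'account')`.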


### The Big Picture

Use the high level API when working with data.  MongoEngine converts the documents into objects (which is something we'd be doing ourselves if we didn't use it).

Use the low level API for collective operations: exporting, importing, ...

**NOTE**  It's possible to get a document using the low level API, as shown above, but it will be a `dict`, not a model instance.
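As a rough sketch of a "collective operation", here is an export that walks every collection with the low level calls used earlier (`list_collection_names`, `db[name]`, `find`) and writes each `dict` as one line of JSON. The function name, the `name: json` line format, and the `Fake...` classes are assumptions for illustration; the fakes only exist so the code runs without a server.

```python
import json

def export_lines(db):
    """Yield one 'collection: json' line per document in every collection."""
    for name in sorted(db.list_collection_names()):
        for doc in db[name].find():
            # default=str turns values json can't handle (ObjectId, dates) into strings
            yield f'{name}: {json.dumps(doc, default=str)}'


# Stand-ins (hypothetical) for pymongo's Collection and Database.
class FakeCollection:
    def __init__(self, docs):
        self.docs = docs

    def find(self):
        return iter(self.docs)

class FakeDatabase(dict):
    def list_collection_names(self):
        return list(self)

fake_db = FakeDatabase(account=FakeCollection([{'name': 'equity', 'category': 'Q'}]))
lines = list(export_lines(fake_db))
print(lines[0])
```

Converting with `default=str` loses type information, so a real export that has to be re-imported would need something more careful.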

### Transactions

In [24]:
t = Transaction(description='hi', comment='aloha')

In [25]:
t.description

'hi'

Nice -- the list fields are initially empty.

In [26]:
t.tags

[]

In [27]:
t.entries

[]

### Entries

In [28]:
e = Entry(uid='xxx', column='credit', date='2025-03-05', amount=1000, account='unknown')

In [29]:
type(e)

dexter.DB.Entry

In [30]:
e

<Entry: <En 2025-03-05 unknown -$1000.0>>

In [31]:
e.column

<Column.cr: 'credit'>

In [32]:
e.amount

1000.0

In [33]:
e.hash

'6fce51cdae9a1803b7c8d26e12244edc'

In [34]:
len(e.hash)

32

In [35]:
{e.uid for e in Entry.objects}

{'0b458afd05842be88d7fcf63faf5ed12',
 '0f066e49b6d7d7df6182f9a8c1e9170e',
 '0f5de0d0dc3b0244e8ee5b62b2efa1f4',
 '2101e974dfca234e4c093b84a9568e46',
 '2150450761027e5e2eafd311e2dc11c9',
 '22b01802c1d8f2bb80933b1ccaa17f2d',
 '2961ef04482764bd274281f568b6fc18',
 '2c8339e1a27b31ce8c69a33f3047bb10',
 '2d643f9ce502fa1a470fec13c257f91b',
 '39f7e339301b367f819580ae4eebfb3d',
 '3ebe742846ad94271ebbdb2e998957a6',
 '4a538877cdd08ec22af5d40b305cc7fc',
 '4b840d81d9b51a429452543bd0c86e35',
 '5e372ff68cf3e9430d2ac65285adce07',
 '6122b459dff6965be697cc43604340eb',
 '64eaff7caf4ac8028571ec5df719c458',
 '6d666f0bc82e916cd23feaf6b98d904c',
 '7029c58c3f1df094df529dc90bace187',
 '717d12e61f0e7e4b0c064fbfb3582faa',
 '728a6508185d6aea56dd6b28f8e776db',
 '7a3a7c0353115742f8df45d13ae8674d',
 '7e44ef5b7241568508bea6405857b619',
 '84890a1120008759237cdc91936af3ce',
 '88109b34c858f1d987d0521bdcb73218',
 '8a56619a42cea0e7e21ce674aae1434b',
 '8c73c88f9d62dca20a74781cf2fcd2eb',
 '9c50e74c3edb5c897ad523a4aaa278e0',
 

In [36]:
s = set()
for e in Entry.objects:
    if e.uid in s:    # s holds uid strings, so test the uid, not the Entry
        print(e.date, e.amount, e.description)
    s.add(e.uid)

In [37]:
len(s)

36

In [38]:
len(Entry.objects)

38

In [41]:
lst = sorted([e.uid for e in Entry.objects])

In [42]:
len(lst)

38

In [44]:
dup = []
for i in range(len(lst)-1):
    if lst[i] == lst[i+1]:
        dup.append(lst[i])

In [45]:
dup

['4a538877cdd08ec22af5d40b305cc7fc', 'b255906404b87487eca4f673896d9129']

In [47]:
for e in Entry.objects:
    if e.uid in dup:
        print(e.date, e.amount, e.account, e.description)

2024-01-02 5000.0 yoyodyne 
2024-01-02 5000.0 yoyodyne 
2024-02-02 5000.0 yoyodyne 
2024-02-02 5000.0 yoyodyne 
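The sort-and-compare loop above can also be written with `collections.Counter`. A minimal sketch, using shortened made-up IDs in place of the real `uid` values:

```python
from collections import Counter

def duplicated(values):
    """Return the sorted values that occur more than once."""
    counts = Counter(values)
    return sorted(v for v, n in counts.items() if n > 1)

# Made-up sample IDs; with the real data this would be
# duplicated(e.uid for e in Entry.objects)
uids = ['4a53', 'b255', '0b45', '4a53', 'b255', '9c50']
print(duplicated(uids))   # ['4a53', 'b255']
```

One difference: a value that occurs three times is reported once here, while the adjacent-pair loop would append it twice.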


### References

The big test -- can we add that Entry to the transaction?

In [None]:
t.entries.append(e)    # `p` is not defined at this point; `e` is the Entry variable

In [None]:
t.entries

Yes!  🎉

### Misc Commands

In [None]:
db.stats

In [None]:
db.stats.find_one

In [None]:
db.list_collection_names()

In [None]:
db.command('count','account')

In [None]:
db.command('hello')

In [None]:
db.command('hostInfo')

In [None]:
db.command('ping')

### Fetch Transactions

Specify constraints on transactions

In [None]:
Transaction.objects

In [None]:
Transaction.objects(description='Safeway')

In [None]:
for t in Transaction.objects(description='Safeway'):
    for e in t.entries:
        print(e.date, e.account, e.amount, e.column)

In [None]:
for t in Transaction.objects(description='Safeway'):
    print(t.accounts)

In [None]:
for t in Transaction.objects(description='Safeway'):
    print(t.pamount)

In [None]:
for t in Transaction.objects(description='Safeway'):
    print(t.pdate, type(t.pdate))

In [None]:
for t in Transaction.objects(description='Safeway'):
    print(t.originals)

In [None]:
lst = list(Transaction.objects(description='Safeway'))

In [None]:
lst[1].comment

In [None]:
lst[1].pamount

In [None]:
lst[1].pdate

In [None]:
list(Transaction.objects(pamount__lt=175.0))

In [None]:
for t in Transaction.objects(pdate=date(2024,1,2)):
    print(t.pdate, t.pamount, t.pdebit, t.pcredit)

In [None]:
for t in Transaction.objects(pdate__lte=date(2024,1,2)):
    print(t.pdate, t.pamount, t.pdebit, t.pcredit)

### Operators

In [None]:
for t in Transaction.objects(description__gte='Safeway'):
    print(t.pdate, t.description)

In [None]:
for t in Transaction.objects(description__regex='^S'):
    print(t.pdate, t.description)

The operator automatically applies to list elements.

In [None]:
for t in Transaction.objects(description__regex=r'\s'):
    print(t.pdate, t.description, t.pamount)

For compound constraints we need another class from MongoEngine.

In [None]:
from mongoengine.queryset.visitor import Q

In [None]:
for t in Transaction.objects(Q(description__regex=r'^S')):
    print(t.pdate, t.description)

In [None]:
for t in Transaction.objects(Q(description__regex=r'^S') & Q(description__regex=r'\s')):
    print(t.pdate, t.description)

### QuerySet

In [None]:
for a in Account.nominal_accounts:
    print(a.name)

### Combining Query Elements

In [None]:
q = Q(description__regex=r'^S')

In [None]:
q

In [None]:
type(q)

In [None]:
p = Q(description__regex=r'\s')

In [None]:
p & q

In [None]:
for t in Transaction.objects(p & q):
    print(t.pdate, t.description)

Create a Q object from a dictionary:

In [None]:
dct = {'description__regex': r'^S'}

In [None]:
Q(**dct)

Can an object have multiple constraints?

In [None]:
dct = {'description__regex': r'^S', 'pamount__gt': 100}

In [None]:
Q(**dct)

In [None]:
for t in Transaction.objects(Q(**dct)):
    print(t.pdate, t.description, t.pamount)

Yep!
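Since `Q(**dct)` accepts several constraints at once, the dictionary can be assembled from friendlier option names before it is handed to `Q`. The sketch below is plain Python with no MongoEngine import. The option names on the left of `KEYWORDS` and the helper `build_constraints` are hypothetical; the field names (`pdate`, `pamount`, `description`) are the ones used in the queries above.

```python
# Hypothetical mapping from option names to MongoEngine-style keywords.
KEYWORDS = {
    'start_date': 'pdate__gte',
    'end_date': 'pdate__lte',
    'min_amount': 'pamount__gte',
    'max_amount': 'pamount__lte',
    'description': 'description__regex',
}

def build_constraints(**options):
    """Translate option names into a keyword dict suitable for Q(**dct)."""
    unknown = set(options) - set(KEYWORDS)
    if unknown:
        raise ValueError(f'unknown options: {sorted(unknown)}')
    return {KEYWORDS[name]: value for name, value in options.items()}

dct = build_constraints(description=r'^S', min_amount=100)
print(dct)   # {'description__regex': '^S', 'pamount__gte': 100}
```

The result would then be used as `Transaction.objects(Q(**dct))`, as in the cell above.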

### Select Method

#### Select Transactions

All transactions:

In [None]:
for t in DB.select(Transaction):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

By date:

In [None]:
for t in DB.select(Transaction, date=date(2024,1,21)):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

In [None]:
for t in DB.select(Transaction, start_date=date(2024,1,21)):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

In [None]:
for t in DB.select(Transaction, end_date=date(2024,1,21)):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

By amount:

In [None]:
for t in DB.select(Transaction, amount=75):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

In [None]:
lst = DB.select(Transaction, amount=75)

In [None]:
all(t.pamount == 75 for t in lst)

In [None]:
for t in DB.select(Transaction, max_amount=75):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

In [None]:
for t in DB.select(Transaction, min_amount=75):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

By description:

In [None]:
for t in DB.select(Transaction, description = r'^s'):
    print(t.pdate, t.pamount, t.description, t.pcredit, t.pdebit)

In [None]:
for t in DB.select(Transaction, comment=r'budget'):
    print(t.pdate, t.pamount, t.description, t.comment, t.pcredit, t.pdebit)

By account:

In [None]:
for t in DB.select(Transaction, debit='mortgage'):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

In [None]:
for t in DB.select(Transaction, credit='mortgage'):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

Some random combinations:

In [None]:
for t in DB.select(Transaction, description = r'^s', min_amount=100):
    print(t.pdate, t.pamount, t.description, t.pcredit, t.pdebit)

In [None]:
for t in DB.select(Transaction, start_date = date(2024,2,1), credit='visa'):
    print(t.pdate, t.pamount, t.description, t.pcredit, t.pdebit)

#### Select Entries

All entries:

In [None]:
len(DB.select(Entry))

In [None]:
for e in DB.select(Entry):
    print(e.date, e.account, e.amount, e.column)

By date:

In [None]:
for e in DB.select(Entry, date=date(2024,1,5)):
    print(e.date, e.account, e.amount, e.column)

In [None]:
for e in DB.select(Entry, start_date=date(2024,1,5)):
    print(e.date, e.account, e.amount, e.column)

In [None]:
for e in DB.select(Entry, end_date=date(2024,1,5)):
    print(e.date, e.account, e.amount, e.column)

By amount:

In [None]:
for e in DB.select(Entry, amount=900):
    print(e.date, e.account, e.amount, e.column)

In [None]:
for e in DB.select(Entry, max_amount=900):
    print(e.date, e.account, e.amount, e.column)

In [None]:
for e in DB.select(Entry, min_amount=900):
    print(e.date, e.account, e.amount, e.column)

By account:

In [None]:
for e in DB.select(Entry, account='groceries'):
    print(e.date, e.account, e.amount, e.column)

By column:

In [None]:
for e in DB.select(Entry, column='credit'):
    print(e.date, e.account, e.amount, e.column)

In [None]:
for e in DB.select(Entry, column='debit'):
    print(e.date, e.account, e.amount, e.column)

### Serializing Objects

In [None]:
import json
from bson.objectid import ObjectId
import datetime

In [None]:
lst = DB.select(Transaction, start_date = date(2024,2,1), credit='visa')

In [None]:
lst[0].to_json()

In [None]:
type(lst[0])

In [None]:
obj = Transaction.objects.as_pymongo()[0]

In [None]:
type(obj)

In [None]:
obj

In [None]:
s = lst[0].to_json()

In [None]:
json.loads(s)

In [None]:
Transaction.from_json(s)

In [None]:
s = 'account: {...:...}'

In [None]:
s.find(':')

In [None]:
s[:s.find(':')]

In [None]:
s[s.find(':'):]
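The slicing experiments above split a string at its first colon. `str.partition` does the same in one call, and it makes the "no colon" case easy to detect (`find` returns -1 there, which silently gives wrong slices). A small sketch; `split_line` is a made-up helper name:

```python
def split_line(line):
    """Split 'name: payload' at the first colon; raise if there is no colon."""
    name, sep, payload = line.partition(':')
    if not sep:
        raise ValueError(f'no colon in {line!r}')
    return name.strip(), payload.strip()

s = 'account: {...:...}'
print(split_line(s))   # ('account', '{...:...}')
```

Only the first colon is used as the separator, so colons inside the payload are left alone.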

### Indexes

We want a field in Entry documents that serves as a unique ID so we can tell if an item was imported before.

MongoEngine has a UUID field.
* how is it computed?  is it a hash of all the other field values?
* when is it computed?  when the object is made, or when it is saved?

In [None]:
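# Explicit imports instead of `import *`: these are the names used in the cells below
from mongoengine import Document, StringField, FloatField, UUIDField, NotUniqueError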

In [None]:
class Foo(Document):
    name = StringField()
    amount = FloatField()
    uid = UUIDField(binary=False)

In [None]:
f = Foo(name='Fred', amount=10)

Just declaring it is not enough to give it a value:

In [None]:
f.uid is None

In [None]:
f.save()

The next model declares an index on `uid`.  The `#` prefix means it's a "hashed index", but we found no discussion of what that means or why we'd want one (over, say, a text index that we compute ourselves).

In [None]:
class Bar(Document):
    name = StringField()
    amount = FloatField()
    uid = UUIDField(binary=False)
    meta = {
        'indexes': ['#uid']
    }

In [None]:
b1 = Bar(name='george', amount=20, uid='123')

In [None]:
b1.uid

Ah -- the `uid` value is not checked until the document is saved, which is when the bad string is rejected (the same would be true for Foo, above, if a `uid` value were passed to the constructor).

```
b1.save()
...
ValidationError: ValidationError (Bar:None) (Could not convert to UUID: badly formed hexadecimal UUID string: ['uid'])
```

So how do we make a UUID?  Do we care?  Why not just use our hashed strings?
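For reference, a hashed string ID can be sketched with `hashlib`. This is not necessarily how `Entry.hash` is computed: the choice of MD5 (which does produce 32 hex characters), the fields, and the formatting are all assumptions, and `entry_uid` is a made-up name.

```python
import hashlib

def entry_uid(date, account, amount, description):
    """Return a 32-character hex digest derived from the given field values."""
    # A fixed format keeps the digest stable: 5000 and 5000.0 both become '5000.00'.
    text = f'{date}|{account}|{amount:.2f}|{description}'
    return hashlib.md5(text.encode('utf-8')).hexdigest()

a = entry_uid('2024-01-02', 'yoyodyne', 5000.0, '')
b = entry_uid('2024-01-02', 'yoyodyne', 5000, '')
c = entry_uid('2024-02-02', 'yoyodyne', 5000.0, '')
print(len(a), a == b, a == c)   # 32 True False
```

The catch with any content hash is that two genuinely different entries with identical field values get the same ID, so a unique constraint on it would reject the second one.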

#### UUID

It's in the PyMongo docs (and we had to specify how they're represented when we made the DB connection).  It's also a Python library.
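A quick look at Python's own `uuid` module, independent of MongoEngine, helps with the questions above. This sketch shows that `uuid4` is random (not a hash of anything), that `uuid5` is derived from a name and so is repeatable, and that a string like `'123'` is not a valid UUID. The namespace and the name string are arbitrary choices for illustration.

```python
from uuid import UUID, uuid4, uuid5, NAMESPACE_URL

# uuid4 is random: two calls give different values.
u1, u2 = uuid4(), uuid4()
print(u1 != u2)

# uuid5 hashes a namespace plus a name, so the same input gives the same UUID.
n1 = uuid5(NAMESPACE_URL, '2024-01-02|yoyodyne|5000.00')
n2 = uuid5(NAMESPACE_URL, '2024-01-02|yoyodyne|5000.00')
print(n1 == n2)

def is_uuid_string(text):
    """Return True if text parses as a UUID."""
    try:
        UUID(text)
    except ValueError:
        return False
    return True

print(is_uuid_string('123'), is_uuid_string(str(u1)))
```

So if we wanted a UUID that depends on the field values, `uuid5` would be one option; `uuid4` would give every object a fresh ID regardless of content.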

In [None]:
from uuid import uuid4

In [None]:
b2 = Bar(name='ringo', amount=30, uid=uuid4())

In [None]:
b2.uid

In [None]:
b2.save()

In [None]:
x = b2.uid

In [None]:
b3 = Bar(name='paul', amount=40, uid=x)

In [None]:
b3.save()

In [None]:
b3.uid == b2.uid

In [None]:
uuid4()

So just defining an index isn't enough to make it unique.

### Unique

In [None]:
class FooBar(Document):
    name = StringField()
    amount = FloatField()
    uid = StringField(unique=True)

In [None]:
f1 = FooBar(name='Fred', amount=100, uid='bedrock')

In [None]:
f2 = FooBar(name='Barney', amount=200, uid='bedrock')

In [None]:
f1.save()

In [None]:
try:
    f2.save()
except NotUniqueError as err:
    print(err)

In [None]:
FooBar._meta

Awesome!  Just defining a field as unique is enough to have MongoEngine create an index.  Don't know (and don't care, yet, at least) about the ramifications of `sparse = False`.

> Checked the PyMongo docs: it's not what we think, and not something we want (even though we can have it by specifying `sparse=True` in the field spec).