### Experiment with MongoEngine

MongoEngine is the ODM (object-document mapper) we use to access MongoDB.

In [1]:
from dexter.DB import DB, Account, Entry, Transaction, Document
from dexter.config import Tag

from datetime import date

Open the database:

In [2]:
DB.open('pytest')

Make an account:

In [3]:
acct = Account(name='equity', category='equity')

Save it:

In [4]:
acct.save()

<Account: <Acct equity equity>>

If we open that DB with `mongosh` we should see the account.

```
$ mongosh

test> use foo
switched to db foo

foo> db.account.find()
[
  {
    _id: ObjectId('67c61fa19d0161a19b80469e'),
    name: 'equity',
    group: 'equity'
  }
]
```

It worked!  🎉

### Contents of a Collection

In [5]:
Account.objects

[<Account: <Acct equity equity>>, <Account: <Acct yoyodyne income>>, <Account: <Acct bank:checking asset>>, <Account: <Acct amex:blue liability>>, <Account: <Acct chase:visa liability>>, <Account: <Acct groceries expense>>, <Account: <Acct household expense>>, <Account: <Acct mortgage expense>>, <Account: <Acct car expense>>, <Account: <Acct travel expense>>, <Account: <Acct equity equity>>]

In [6]:
Account.objects[0]

<Account: <Acct equity equity>>

In [7]:
acct = Account.objects[0]

In [8]:
acct.name

'equity'

In [9]:
acct['name']

'equity'

### Low Level API

We can also work with the DB directly through the `pymongo` library, _e.g._ to get collection names.

After calling `DB.open`, we can get a reference to the client and the current database through attributes of `DB`:

In [17]:
DB.client

MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, read_preference=Primary(), uuidrepresentation=4, driver=DriverInfo(name='MongoEngine', version='0.29.1', platform=None))

In [18]:
DB.client.list_database_names()

['admin', 'config', 'dev', 'dexter', 'doo', 'local', 'pytest']

In [11]:
DB.database

Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, read_preference=Primary(), uuidrepresentation=4, driver=DriverInfo(name='MongoEngine', version='0.29.1', platform=None)), 'pytest')

In [12]:
db = DB.database

In [None]:
db.account

In [None]:
db.account.find_one()

In [None]:
for c in db.list_collections():
    print(c)

In [None]:
for name in db.list_collection_names():
    print(name)

In [None]:
db['account']

In [None]:
db['account'].find_one()

In [None]:
for obj in db['account'].find():
    print(obj)

### From Low Level to High Level

Question:  given a collection name ("account") can we find the corresponding MongoEngine class (Account)?

In [None]:
Document

In [None]:
Document.__subclasses__()

In [None]:
[cls for cls in Document.__subclasses__() if hasattr(cls, 'objects')]

In [None]:
Account._meta

In [None]:
for cls in Document.__subclasses__():
    if not hasattr(cls, 'objects'):
        continue
    print(cls._meta['collection'], cls)
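
A minimal sketch of turning that loop into a lookup table, written with stand-in classes so it runs without a database. The `Document` class and the `_meta` dictionaries below are simplified placeholders rather than MongoEngine's own, and the helper names `all_subclasses` and `collection_map` are made up for illustration. One detail worth keeping in mind: `__subclasses__()` returns only direct subclasses, so the sketch walks the hierarchy recursively.

```python
class Document:
    """Stand-in for mongoengine.Document (assumption: models expose _meta['collection'])."""
    _meta = {}


class Account(Document):
    _meta = {'collection': 'account'}


class Entry(Document):
    _meta = {'collection': 'entry'}


class SplitEntry(Entry):
    """Hypothetical second-level subclass, to show the recursive walk."""
    _meta = {'collection': 'split_entry'}


def all_subclasses(cls):
    """Yield every subclass of cls, direct or indirect."""
    for sub in cls.__subclasses__():
        yield sub
        yield from all_subclasses(sub)


def collection_map(base):
    """Map collection name -> class for every subclass that names a collection."""
    return {
        sub._meta['collection']: sub
        for sub in all_subclasses(base)
        if sub._meta.get('collection')
    }


models = collection_map(Document)
print(models['account'])
```

With the real models, the same dictionary would answer the question above with a single lookup such as `models['account']`.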

### The Big Picture

Use the high level API when working with data.  MongoEngine converts the documents into objects (which is something we'd be doing ourselves if we didn't use it).

Use the low level API for collective operations: exporting, importing, ...

**NOTE**  It's possible to get a document using the low level API, as shown above, but it will be a `dict`, not a model instance.
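
To make the difference concrete, here is a rough sketch of the conversion we would be writing by hand without an ODM. `AccountRecord` is a hypothetical class, not part of Dexter or MongoEngine, and the `raw` dictionary is made-up data shaped like what a low-level `find_one()` might return.

```python
from dataclasses import dataclass


@dataclass
class AccountRecord:
    """Hypothetical hand-rolled model; MongoEngine's Account does this for us."""
    name: str
    category: str

    @classmethod
    def from_dict(cls, doc):
        # Pick out the fields we know about and ignore the rest (e.g. _id).
        return cls(name=doc['name'], category=doc['category'])


# Made-up document; the field names are assumptions for this sketch.
raw = {'_id': '67c61fa19d0161a19b80469e', 'name': 'equity', 'category': 'equity'}

acct = AccountRecord.from_dict(raw)

# The dict needs key lookups; the object has attributes.
print(raw['name'], acct.name)
```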

### Transactions

In [None]:
t = Transaction(description='hi', comment='aloha')

In [None]:
t.description

Nice -- the list fields are initially empty.

In [None]:
t.tags

In [None]:
t.entries

### Entries

In [None]:
e = Entry(uid='xxx', column='credit', date='2025-03-05', amount=1000, account='unknown')

In [None]:
type(e)

In [None]:
e

In [None]:
e.column

In [None]:
e.column.opposite()

In [None]:
e.column.opposite().opposite()

In [None]:
e.amount

In [None]:
e.hash

In [None]:
len(e.hash)

In [None]:
{e.uid for e in Entry.objects}

In [None]:
s = set()
for e in Entry.objects:
    if e.uid in s:
        print(e.date, e.amount, e.description)
    s.add(e.uid)

In [None]:
len(s)

In [None]:
len(Entry.objects)

In [None]:
lst = sorted([e.uid for e in Entry.objects])

In [None]:
len(lst)

In [None]:
dup = []
for i in range(len(lst)-1):
    if lst[i] == lst[i+1]:
        dup.append(lst[i])

In [None]:
dup

In [None]:
for e in Entry.objects:
    if e.uid in dup:
        print(e.date, e.amount, e.account, e.description)
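
The sorted-neighbors loop above adds a uid to `dup` once for every extra copy, so a uid that occurs three times is listed twice. A possible alternative, sketched here with made-up uids in place of `[e.uid for e in Entry.objects]`, is to count occurrences with `collections.Counter`; the helper name `duplicated` is just for illustration.

```python
from collections import Counter


def duplicated(values):
    """Return the values that occur more than once, each listed once, sorted."""
    counts = Counter(values)
    return sorted(v for v, n in counts.items() if n > 1)


# Made-up uids standing in for the uids of the stored entries.
uids = ['a1', 'b2', 'a1', 'c3', 'b2', 'a1']

dups = duplicated(uids)
print(dups)
```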

### Tags

In [None]:
e = Entry.objects[0]

In [None]:
e.description

In [None]:
e.tags

In [None]:
e.tags.append(Tag.U)

In [None]:
e.save()

In [None]:
e.tags

In [None]:
e.note = "hello"

In [None]:
e.save()

### References

The big test -- can we add that Entry to the transaction?

In [None]:
t.entries.append(e)

In [None]:
t.entries

Yes!  🎉

### Misc Commands

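Note that `db.stats` is not a statistics call in PyMongo: attribute access on a database returns the collection with that name, so this is a collection called `stats`.

In [None]:
db.stats

In [None]:
db.stats.find_one()

Database statistics come from the `dbstats` command instead.

In [None]:
db.command('dbstats')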

In [None]:
db.list_collection_names()

In [None]:
db.command('count','account')

In [None]:
db.command('hello')

In [None]:
db.command('hostInfo')

In [None]:
db.command('ping')

### Fetch Transactions

Specify constraints on transactions

In [None]:
Transaction.objects

In [None]:
Transaction.objects(description='Safeway')

In [None]:
for t in Transaction.objects(description='Safeway'):
    for e in t.entries:
        print(e.date, e.account, e.amount, e.column)

In [None]:
for t in Transaction.objects(description='Safeway'):
    print(t.accounts)

In [None]:
for t in Transaction.objects(description='Safeway'):
    print(t.pamount)

In [None]:
for t in Transaction.objects(description='Safeway'):
    print(t.pdate, type(t.pdate))

In [None]:
for t in Transaction.objects(description='Safeway'):
    print(t.originals)

In [None]:
lst = list(Transaction.objects(description='Safeway'))

In [None]:
lst[1].comment

In [None]:
lst[1].pamount

In [None]:
lst[1].pdate

In [None]:
list(Transaction.objects(pamount__lt=175.0))

In [None]:
for t in Transaction.objects(pdate=date(2024,1,2)):
    print(t.pdate, t.pamount, t.pdebit, t.pcredit)

In [None]:
for t in Transaction.objects(pdate__lte=date(2024,1,2)):
    print(t.pdate, t.pamount, t.pdebit, t.pcredit)

### Operators

In [None]:
for t in Transaction.objects(description__gte='Safeway'):
    print(t.pdate, t.description)

In [None]:
for t in Transaction.objects(description__regex='^S'):
    print(t.pdate, t.description)

The operator automatically applies to list elements.

In [None]:
for t in Transaction.objects(description__regex=r'\s'):
    print(t.pdate, t.description, t.pamount)

For compound constraints we need another class from MongoEngine.

In [None]:
from mongoengine.queryset.visitor import Q

In [None]:
for t in Transaction.objects(Q(description__regex=r'^S')):
    print(t.pdate, t.description)

In [None]:
for t in Transaction.objects(Q(description__regex=r'^S') & Q(description__regex=r'\s')):
    print(t.pdate, t.description)
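
To see what the compound constraint selects, the same two patterns can be tried on plain strings with Python's `re` module. This is only an illustration on made-up descriptions: MongoDB evaluates regular expressions with its own engine, though simple patterns like `^S` and `\s` should behave the same way in both.

```python
import re

# Made-up descriptions standing in for Transaction.description values.
descriptions = ['Safeway', 'Shell Oil', 'Home Depot', 'Starbucks', 'Costco']

starts_with_s = re.compile(r'^S')
has_whitespace = re.compile(r'\s')

# Both patterns must match, mirroring Q(...) & Q(...).
both = [d for d in descriptions
        if starts_with_s.search(d) and has_whitespace.search(d)]

# Either pattern may match, mirroring Q(...) | Q(...).
either = [d for d in descriptions
          if starts_with_s.search(d) or has_whitespace.search(d)]

print(both)
print(either)
```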

### QuerySet

In [None]:
for a in Account.nominal_accounts:
    print(a.name)

### Combining Query Elements

In [None]:
q = Q(description__regex=r'^S')

In [None]:
q

In [None]:
type(q)

In [None]:
p = Q(description__regex=r'\s')

In [None]:
p & q

In [None]:
for t in Transaction.objects(p & q):
    print(t.pdate, t.description)

Create a Q object from a dictionary:

In [None]:
dct = {'description__regex': r'^S'}

In [None]:
Q(**dct)

Can an object have multiple constraints?

In [None]:
dct = {'description__regex': r'^S', 'pamount__gt': 100}

In [None]:
Q(**dct)

In [None]:
for t in Transaction.objects(Q(**dct)):
    print(t.pdate, t.description, t.pamount)

Yep!
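
Since several constraints can live in one dictionary, a small helper could assemble that dictionary from optional arguments. The sketch below is hypothetical: `build_constraints` and its parameter names are made up, and only the field names (`description`, `pamount`, `pdate`) and the operator suffixes come from the cells above.

```python
from datetime import date


def build_constraints(description=None, min_amount=None, max_amount=None,
                      start_date=None, end_date=None):
    """Collect only the constraints that were actually supplied."""
    dct = {}
    if description is not None:
        dct['description__regex'] = description
    if min_amount is not None:
        dct['pamount__gte'] = min_amount
    if max_amount is not None:
        dct['pamount__lte'] = max_amount
    if start_date is not None:
        dct['pdate__gte'] = start_date
    if end_date is not None:
        dct['pdate__lte'] = end_date
    return dct


dct = build_constraints(description=r'^S', min_amount=100,
                        end_date=date(2024, 1, 31))
print(dct)
```

The resulting dictionary could then be used exactly like the hand-written one, as in `Transaction.objects(Q(**dct))`.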

### Select Method

#### Select Transactions

All transactions:

In [None]:
for t in DB.select(Transaction):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

By date:

In [None]:
for t in DB.select(Transaction, date=date(2024,1,21)):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

In [None]:
for t in DB.select(Transaction, start_date=date(2024,1,21)):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

In [None]:
for t in DB.select(Transaction, end_date=date(2024,1,21)):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

By amount:

In [None]:
for t in DB.select(Transaction, amount=75):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

In [None]:
lst = DB.select(Transaction, amount=75)

In [None]:
all(t.pamount == 75 for t in lst)

In [None]:
for t in DB.select(Transaction, max_amount=75):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

In [None]:
for t in DB.select(Transaction, min_amount=75):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

By description:

In [None]:
for t in DB.select(Transaction, description=r'^s'):
    print(t.pdate, t.pamount, t.description, t.pcredit, t.pdebit)

In [None]:
for t in DB.select(Transaction, comment=r'budget'):
    print(t.pdate, t.pamount, t.description, t.comment, t.pcredit, t.pdebit)

By account:

In [None]:
for t in DB.select(Transaction, debit='mortgage'):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

In [None]:
for t in DB.select(Transaction, credit='mortgage'):
    print(t.pdate, t.pamount, t.pcredit, t.pdebit)

Some random combinations:

In [None]:
for t in DB.select(Transaction, description=r'^s', min_amount=100):
    print(t.pdate, t.pamount, t.description, t.pcredit, t.pdebit)

In [None]:
for t in DB.select(Transaction, start_date=date(2024,2,1), credit='visa'):
    print(t.pdate, t.pamount, t.description, t.pcredit, t.pdebit)

#### Select Entries

All entries:

In [None]:
len(DB.select(Entry))

In [None]:
for e in DB.select(Entry):
    print(e.date, e.account, e.amount, e.column)

By date:

In [None]:
for e in DB.select(Entry, date=date(2024,1,5)):
    print(e.date, e.account, e.amount, e.column)

In [None]:
for e in DB.select(Entry, start_date=date(2024,1,5)):
    print(e.date, e.account, e.amount, e.column)

In [None]:
for e in DB.select(Entry, end_date=date(2024,1,5)):
    print(e.date, e.account, e.amount, e.column)

By amount:

In [None]:
for e in DB.select(Entry, amount=900):
    print(e.date, e.account, e.amount, e.column)

In [None]:
for e in DB.select(Entry, max_amount=900):
    print(e.date, e.account, e.amount, e.column)

In [None]:
for e in DB.select(Entry, min_amount=900):
    print(e.date, e.account, e.amount, e.column)

By account:

In [None]:
for e in DB.select(Entry, account='medical'):
    print(e.date, e.account, e.amount, e.column)

By column:

In [None]:
for e in DB.select(Entry, column='credit'):
    print(e.date, e.account, e.amount, e.column)

In [None]:
for e in DB.select(Entry, column='debit'):
    print(e.date, e.account, e.amount, e.column)

By tag:

In [None]:
for e in DB.select(Entry, tag='unpaired'):
    print(e.date, e.account, e.amount, e.column)

### Serializing Objects

In [None]:
import json
from bson.objectid import ObjectId
import datetime

In [None]:
lst = DB.select(Transaction, start_date=date(2024,2,1), credit='visa')

In [None]:
lst[0].to_json()

In [None]:
type(lst[0])

In [None]:
obj = Transaction.objects.as_pymongo()[0]

In [None]:
type(obj)

In [None]:
obj

In [None]:
s = lst[0].to_json()

In [None]:
json.loads(s)

In [None]:
Transaction.from_json(s)

In [None]:
s = 'account: {...:...}'

In [None]:
s.find(':')

In [None]:
s[:s.find(':')]

In [None]:
s[s.find(':'):]
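
The last slice keeps the colon at the front of the second piece. One alternative, sketched here on the same placeholder string, is `str.partition`, which splits at the first separator only and returns the separator on its own.

```python
s = 'account: {...:...}'

# partition splits at the first ':' only, so later colons stay in the value.
name, sep, rest = s.partition(':')

print(repr(name))
print(repr(rest.strip()))
```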

### Indexes

We want a field in Entry documents that serves as a unique ID so we can tell if an item was imported before.

MongoEngine has a UUID field.
* how is it computed?  is it a hash of all the other field values?
* when is it computed?  when the object is made, or when it is saved?
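
Independently of what MongoEngine does, Python's standard `uuid` module offers both flavors these questions hint at: `uuid4` is random, while `uuid5` is derived by hashing a namespace plus a name. A small sketch, where the `date|amount|description` key layout is a made-up example and `NAMESPACE_URL` is an arbitrary choice of namespace:

```python
from uuid import NAMESPACE_URL, uuid4, uuid5

# uuid4 is random: two calls give different values.
a, b = uuid4(), uuid4()

# uuid5 hashes a namespace plus a name, so equal input gives an equal UUID.
# The key layout below is an assumption for illustration only.
key = '2024-01-05|900.0|Safeway'
c, d = uuid5(NAMESPACE_URL, key), uuid5(NAMESPACE_URL, key)

print(a == b)
print(c == d)
```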

In [None]:
from mongoengine import Document, FloatField, NotUniqueError, StringField, UUIDField

In [None]:
class Foo(Document):
    name = StringField()
    amount = FloatField()
    uid = UUIDField(binary=False)

In [None]:
f = Foo(name='Fred', amount=10)

Just declaring it is not enough to give it a value:

In [None]:
f.uid is None

In [None]:
f.save()

The next model declares an index.  The `#` prefix means it's a "hashed index", but I found no discussion of what that means or why we'd want one (over, say, a text index that we compute ourselves).

In [None]:
class Bar(Document):
    name = StringField()
    amount = FloatField()
    uid = UUIDField(binary=False)
    meta = {
        'indexes': ['#uid']
    }

In [None]:
b1 = Bar(name='george', amount=20, uid='123')

In [None]:
b1.uid

Ah, the value is only checked as a UUID when the document is saved (the same would be true for Foo, above, if a `uid` value were passed to the constructor).

```
b1.save()
...
ValidationError: ValidationError (Bar:None) (Could not convert to UUID: badly formed hexadecimal UUID string: ['uid'])
```
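
The "badly formed hexadecimal UUID string" part of that message matches what Python's own `uuid.UUID` constructor raises for a string like `'123'`. A small sketch of checking a string up front; the helper name `is_uuid_string` is made up for illustration:

```python
from uuid import UUID, uuid4


def is_uuid_string(text):
    """Return True if text parses as a UUID."""
    try:
        UUID(text)
    except ValueError:
        return False
    return True


print(is_uuid_string('123'))
print(is_uuid_string(str(uuid4())))
```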

So how do we make a UUID?  Do we care?  Why not just use our hashed strings?
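
For comparison, a hashed string can be computed with nothing but `hashlib`. The sketch below is not the algorithm behind `Entry.hash`, which this notebook doesn't show; the function name `entry_key`, the choice of fields, and the use of SHA-256 are all assumptions.

```python
import hashlib


def entry_key(date, amount, account, description):
    """Hypothetical fingerprint: hex digest of the fields that identify an entry."""
    # Format the amount with two decimals so 900 and 900.0 hash the same.
    text = '|'.join([date, f'{amount:.2f}', account, description])
    return hashlib.sha256(text.encode('utf-8')).hexdigest()


k1 = entry_key('2024-01-05', 900, 'mortgage', 'Monthly payment')
k2 = entry_key('2024-01-05', 900.0, 'mortgage', 'Monthly payment')
k3 = entry_key('2024-01-06', 900, 'mortgage', 'Monthly payment')

print(len(k1), k1 == k2, k1 == k3)
```

A key like this is the same every time the same record is imported, which is the property we need for spotting repeats.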

#### UUID

UUIDs are covered in the PyMongo docs (and we had to specify how they're represented when we made the DB connection).  `uuid` is also a module in the Python standard library.

In [None]:
from uuid import uuid4

In [None]:
b2 = Bar(name='ringo', amount=30, uid=uuid4())

In [None]:
b2.uid

In [None]:
b2.save()

In [None]:
x = b2.uid

In [None]:
b3 = Bar(name='paul', amount=40, uid=x)

In [None]:
b3.save()

In [None]:
b3.uid == b2.uid

In [None]:
uuid4()

So just defining an index isn't enough to make it unique.

### Unique

In [None]:
class FooBar(Document):
    name = StringField()
    amount = FloatField()
    uid = StringField(unique=True)

In [None]:
f1 = FooBar(name='Fred', amount=100, uid='bedrock')

In [None]:
f2 = FooBar(name='Barney', amount=200, uid='bedrock')

If uncommented, the next two cells will add a record to the database.  To reset the database to its original condition, run pytest again.

In [None]:
# f1.save()

In [None]:
# try:
#     f2.save()
# except NotUniqueError as err:
#     print(err)

In [None]:
FooBar._meta

Awesome!  Just defining a field as unique is enough to have MongoEngine create an index.  Don't know (and don't care, yet, at least) about the ramifications of `sparse = False`.

> Checked the PyMongo docs: a sparse index skips documents that lack the indexed field, which is not what we assumed and not something we want (even though we can have it by specifying `sparse=True` in the field definition).

### Match Account Names

In [None]:
Account.objects(name__contains='checking')

In [None]:
Account.objects(name__contains='g')

In [None]:
DB.find_account('g')

In [None]:
DB.account_name_parts()

In [None]:
DB.account_name_parts('expense')

In [None]:
DB.full_names()

In [None]:
DB.full_names('expense')