### Code for parsers

In [1]:
import re

Sample lines from a journal, used to test splitting each line into a command and a trailing comment.

In [2]:
lines = [
    '# Groceries',
    '; account equity',
    '2024-02-02 paycheck',
    '2024-01-07  Safeway ; weekly groceries',
    '    income:yoyodyne      $5,000.00',
    '    income:yoyodyne      $5,000.00  ; Feb',
]

In [3]:
for s in lines:
    cmnd, *comment = re.split(r'[;#]', s)
    print(cmnd, comment)


 [' Groceries']
 [' account equity']
2024-02-02 paycheck []
2024-01-07  Safeway  [' weekly groceries']
    income:yoyodyne      $5,000.00 []
    income:yoyodyne      $5,000.00   [' Feb']
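
The loop above splits on every `;` or `#`, so a comment that itself contains one of those characters would come back in several pieces. A minimal sketch of a helper that splits only at the first marker and trims the surrounding whitespace; `split_comment` is a hypothetical name, not something defined elsewhere in this notebook.

```python
import re

def split_comment(line):
    """Split a journal line into (command, comment) at the first ';' or '#'."""
    # maxsplit=1 keeps any later ';' or '#' inside the comment text
    cmnd, *rest = re.split(r'[;#]', line, maxsplit=1)
    comment = rest[0].strip() if rest else ''
    # rstrip only: leading indentation marks a posting line, so keep it
    return cmnd.rstrip(), comment

print(split_comment('2024-01-07  Safeway ; weekly groceries'))
print(split_comment('2024-02-02 paycheck'))
print(split_comment('# Groceries'))
```

Lines without a comment come back with an empty string rather than an empty list, which may be easier to handle downstream.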


Check that `re.sub` removes the characters we don't want in an amount (commas and dollar signs), so that `float` can parse what is left.

In [4]:
amounts = [
    '123.45',
    '$123.45',
    '-123.45',
    '$-123.45',
    '-$123.45',
    '9,123.45',
    '$9,123.45',
    '-9,123.45',
    '$-9,123.45',
    '-$9,123.45',
]

In [5]:
for s in amounts:
    t = re.sub(r'[,$]', '', s)  # drop commas and dollar signs
    print(float(t))


123.45
123.45
-123.45
-123.45
-123.45
9123.45
9123.45
-9123.45
-9123.45
-9123.45
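
The substitution tested above could be wrapped in a small function for reuse in a parser. This is only a sketch: `parse_amount` is a hypothetical name, and it assumes amounts use `$` and `,` exactly as in the samples.

```python
import re

def parse_amount(text):
    """Convert a string such as '-$9,123.45' to a float."""
    # Remove thousands separators and dollar signs; a minus sign is kept
    return float(re.sub(r'[,$]', '', text))

for s in ['$123.45', '-$9,123.45', '$-9,123.45']:
    print(s, parse_amount(s))
```

If exact cents matter, the same cleaned string can be passed to `decimal.Decimal` instead of `float`; that swap is a suggestion, not something this notebook does.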


### TOML

We're using TOML (a JSON alternative) for configuration files.

In [7]:
import tomllib
from pathlib import Path

In [10]:
p = Path.home() / 'Personal/Finances/dex.toml'

In [11]:
p.is_file()

True

In [27]:
with open(p, 'rb') as f:
    res = tomllib.load(f)

In [28]:
res

{'terminology': {'credit': 'credit', 'debit': 'debit'},
 'csv': {'occu': {'accounts': ['buckets',
    'checking-jc',
    'checking-lc',
    'savings-jc',
    'savings-lc',
    'reservoir'],
   'description': 'desc',
   'date': 'date',
   'amount': 'amount',
   'column': 'column'}}}

What we learned:
* each section in the TOML document becomes a key in the resulting dict
* if a section has subsections, the section name maps to another dictionary whose keys are the subsection names (without the section-name prefix)

In [30]:
from collections import namedtuple

In [31]:
ColMap = namedtuple('ColMap', ['description', 'date', 'amount', 'column'])

In [35]:
dct = res['csv']['occu']

In [36]:
dct.pop('accounts')

['buckets',
 'checking-jc',
 'checking-lc',
 'savings-jc',
 'savings-lc',
 'reservoir']

In [37]:
dct

{'description': 'desc', 'date': 'date', 'amount': 'amount', 'column': 'column'}
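
The notebook defines `ColMap` but does not show it being filled. One way it might be used, assuming the intent is to carry the four column names from a `csv` subsection, is sketched here. The dict literal stands in for `res['csv']['occu']`, and selecting keys through `ColMap._fields` is an alternative to `pop('accounts')` that leaves the source dict unchanged.

```python
from collections import namedtuple

ColMap = namedtuple('ColMap', ['description', 'date', 'amount', 'column'])

# Stand-in for res['csv']['occu'] (shortened account list)
section = {
    'accounts': ['checking-jc', 'savings-jc'],
    'description': 'desc',
    'date': 'date',
    'amount': 'amount',
    'column': 'column',
}

# Pick only the ColMap fields; `section` still has its 'accounts' entry
colmap = ColMap(**{k: section[k] for k in ColMap._fields})
print(colmap)
print(colmap.description)
```

Because nothing is popped, the parsed configuration can be read again later without reloading the file.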

In [38]:
p = {'a': 0, 'b': 1}

In [40]:
p |= dct

In [42]:
p

{'a': 0,
 'b': 1,
 'description': 'desc',
 'date': 'date',
 'amount': 'amount',
 'column': 'column'}