Interface to KMyMoney saved files
===============
The data must be saved as SQL (or you can export your current data to an SQL file); this notebook reads it through SQLite.

In [None]:
# For convenience, autoreload scripts (kmymoney.py) before executing commands.
%load_ext autoreload
%autoreload 2

In [None]:
from kmymoney import KMyMoney    # our own file
from jupyter_utils import disp, as_numeric   # our own file

kmm = KMyMoney('/Users/briot/Comptes.kmm')

In [None]:
# Direct look at the database (needs `pip install ipython-sql`)
%load_ext sql
sqlite = kmm.sqlite
%sql $sqlite

In [None]:
# Setup matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
# List of accounts
bitcoin = 'A000242'      # to check fees
bourscommun = 'A000106'
eurokidA = 'A000195'
pilotee = 'A000216'
ethereum = 'A000241'

Display the ledger for one or more accounts
---------
A single SQL query computes the lines of the ledger in a format similar to what KMyMoney outputs. The running balance is computed directly in the SQL query, which is useful, for instance, to find the operations that brought the balance over some threshold.

We then manipulate the result with pandas to extract specific information. We cannot, for instance, restrict the range of dates directly in the SQL query: the running balance is computed over the result rows, so it would be wrong if some rows were removed. (An alternative would be a third SELECT statement that filters the result, but filtering in pandas is more flexible and also limits the number of queries sent to SQLite. Keeping everything in memory works fine for a typical KMyMoney file.)
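
As a minimal sketch of why the date filter has to come after the running balance, the following uses an in-memory SQLite table. The `splits` table and its columns are hypothetical and are not the KMyMoney schema, and the window function assumes SQLite 3.25 or later.

```python
import sqlite3

import pandas as pd

# Hypothetical toy table, NOT the real KMyMoney schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE splits (id INTEGER PRIMARY KEY, postdate TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO splits (postdate, amount) VALUES (?, ?)",
    [
        ("2017-04-10", 100.0),
        ("2017-04-20", -30.0),
        ("2017-05-02", 50.0),
        ("2017-05-15", -20.0),
    ],
)

# The running balance is a cumulative sum over the rows the query returns.
QUERY = """
    SELECT postdate, amount,
           SUM(amount) OVER (ORDER BY postdate, id) AS balance
    FROM splits
    {where}
    ORDER BY postdate, id
"""

# Compute the balance on all rows, then filter the dates in pandas.
ledger = pd.read_sql_query(QUERY.format(where=""), conn)
right = ledger[ledger["postdate"] >= "2017-05-01"]

# Filtering in SQL drops the earlier rows before the sum is computed,
# so the balance restarts from zero.
wrong = pd.read_sql_query(
    QUERY.format(where="WHERE postdate >= '2017-05-01'"), conn
)

print(right["balance"].tolist())
print(wrong["balance"].tolist())
```

The first list carries the April transactions into the May balances; the second one ignores them.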

In [None]:
disp(kmm.ledger(accounts=[ethereum, bitcoin], mindate="2017-05-01").fillna('-'))

All transactions in a given category
------
Display a subset of transactions. Useful to clean up a file, for instance.
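
The cell below orders the rows by the larger of their two amount columns and adds a total row. Here is a minimal sketch of that idiom on toy data. The `deposit` and `paiement` column names are the ones used in this notebook; the `payee` column and all the values are made up for illustration.

```python
import pandas as pd

# Toy data: each row has either a deposit or a paiement, the other is NaN.
p = pd.DataFrame({
    "payee": ["a", "b", "c"],
    "deposit": [None, 500.0, None],
    "paiement": [20.0, None, 120.0],
})

# Largest of the two amounts for each row (NaN values are skipped).
largest = p[["deposit", "paiement"]].max(axis=1)

# Sort that series, then reuse its index to reorder the original rows.
ordered = p.loc[largest.sort_values(ascending=False).index]

# Column totals as a one-row frame, appended at the end.
total = p.sum(numeric_only=True).to_frame().T
result = pd.concat([ordered, total], ignore_index=True)
print(result)
```

The non-numeric `payee` column is simply left empty in the total row.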

In [None]:
p = kmm.ledger(mindate="2019-01-01")  # Get all transactions
cat = (
    'Interne',
    'reconciliation',
    # 'Opening Balances',
)
p = p[ p['category'].isin(cat) ]              # Only keep transactions from specific categories
p = p.drop(['balance', 'reconcile'], axis=1)  # meaningless columns in this view
import pandas as pd  # for pd.concat; DataFrame.append was removed in pandas 2.0

by_size = p.loc[
    p[['deposit', 'paiement']].max(axis=1)  # create a series with max(deposit, paiement)
    .sort_values(ascending=False).index     # sort it, and retrieve the indexes in the original series
]  # list the rows of p using the indexes of the sorted series
total = p.sum(numeric_only=True).to_frame().T  # one-row frame holding the column totals
disp(pd.concat([by_size, total], ignore_index=True), height=200)  # add a 'Total' row

Deposits and payments for a specific date range
------------
This plots payments and deposits, which is not exactly the same as plotting the Expense and Income accounts: either kind of account can receive both payments and deposits (for instance, a reimbursement for a payment you made earlier).
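
A minimal sketch of that difference on toy data: a reimbursement shows up as a deposit inside an expense category. The category names, the values, and the sign convention used for `amount` (deposits minus payments) are assumptions made for this example; `plot_by_category` may compute it differently.

```python
import pandas as pd

# Toy transactions: a 100 payment to "Health", later reimbursed by 40.
tx = pd.DataFrame({
    "category": ["Health", "Health", "Salary"],
    "deposit": [0.0, 40.0, 2000.0],
    "paiement": [100.0, 0.0, 0.0],
})

# Payments and deposits are summed separately for each category...
by_cat = tx.groupby("category")[["paiement", "deposit"]].sum()

# ...while the net amount (assumed here to be deposit - paiement)
# hides the fact that "Health" also received a deposit.
by_cat["amount"] = by_cat["deposit"] - by_cat["paiement"]
print(by_cat)
```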

In [None]:
kmm.plot_by_category(mindate="2020-01-01", values=['paiement', 'deposit'])
kmm.plot_by_category(mindate="2020-01-01", values=['amount'], kind='bar')


Net worth by month
---------------
Net worth is computed by taking the positions in all accounts (in EUR or in number of shares) at the end of each period (month, year, ...) and applying the price of each stock as of the end of that period. In a ledger, by contrast, prices are those at the time of each transaction.
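
The computation described above can be sketched for a single stock as follows. This is only an illustration with made-up trades and prices, not the code behind `kmm.networth`.

```python
import pandas as pd

# Made-up trades (change in number of shares) and end-of-month prices.
trades = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-10", "2020-03-05"]),
    "shares": [10.0, -3.0],
})
prices = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-31", "2020-02-28", "2020-03-31"]),
    "price": [100.0, 110.0, 90.0],
})

months = pd.period_range("2020-01", "2020-03", freq="M")

# Position at the end of each month: cumulative sum of the share changes,
# with 0 for months without any trade.
position = (
    trades.groupby(trades["date"].dt.to_period("M"))["shares"].sum()
    .reindex(months, fill_value=0.0)
    .cumsum()
)

# Last known price of each month (carried forward if a month has none).
last_price = (
    prices.groupby(prices["date"].dt.to_period("M"))["price"].last()
    .reindex(months)
    .ffill()
)

# Value of the position, using the price at the end of each period.
networth = position * last_price
print(networth)
```

February has no trade, so its position is carried over from January and only the price changes.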

In [None]:
p = kmm.networth(mindate="2015-01-01", maxdate="2020-12-31", by_year=True)
disp(p, height=600)

p = kmm.networth(mindate="2020-01-01", maxdate="2020-06-30", by_year=False, with_total=True)
disp(p, height=600)

Group the transactions into bins
-----------------------

See https://medium.com/@soulsinporto/group-data-using-bins-and-categories-with-pandas-836c9c9bbd46
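
Binning with explicit intervals can be sketched on toy data as follows. The amounts are made up; the point is only to show how `pd.cut` behaves when it is given an `IntervalIndex`.

```python
import pandas as pd

# Left-closed bins: 50 belongs to [50, 100), not to [0, 50).
bins = [(0, 50), (50, 100), (100, 200)]
intervals = pd.IntervalIndex.from_tuples(bins, closed="left")

amounts = pd.Series([12.5, 50.0, 99.99, 150.0, 250.0])

# Each value is replaced by the interval it falls in; values outside
# all bins (250 here) become NaN.
binned = pd.cut(amounts, intervals)

# Number of values per bin, in bin order.
counts = binned.value_counts().sort_index()
print(counts)
```

Values that fall outside every interval are silently dropped from the counts, which is why the code below uses a very large upper bound for its last bin.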

In [None]:
from typing import List, Union
import plotly.express as px
import pandas as pd


def bin_and_plot(series: pd.Series) -> None:
    """
    Group data into left-closed bins, [0, 50), [50, 100), ..., and plot
    a histogram of the number of values in each bin.
    """
    bins = [(0, 50), (50, 100), (100, 200), (200, 500), (500, 1000), 
            (1000, 3000), (3000, 5000), (5000, 10000), (10000, 100000000)]

    # When the bins are an IntervalIndex, pd.cut ignores `labels`,
    # `precision` and `include_lowest`: the resulting categories are the
    # intervals themselves, so those arguments are not passed.
    intervals = pd.IntervalIndex.from_tuples(bins, closed="left")
    binned = pd.cut(series, intervals)

    binned.sort_values(ascending=True, inplace=True)
    # Change the values from categorical to string to be able to plot them
    binned = binned.astype("str")

    # For each element in `series`, binned contains the name of the bin it
    # belongs to.
    
    plot_histogram(
        binned, nbins=len(bins), title='Size of paiements',
        axes_titles=['Paiements', ''])

def plot_histogram(
    data_series: pd.Series,
    nbins: int,
    title: str,
    axes_titles: List[Union[str, None]]
) -> None:
    fig = px.histogram(
        x=data_series,
        nbins=nbins,
        title=title
    )

    fig.update_layout(
        xaxis_title=axes_titles[0],
        yaxis_title=axes_titles[1]
    )

    fig.update_layout(
        uniformtext_minsize=14,
        uniformtext_mode="hide",
        bargap=0.1,
        title_x=0.5
    )

    fig.show()

l = kmm.ledger(mindate='2015-01-01')
l = l[ ~l['paiement'].isnull() ]   # Only keep transactions with a paiement
bin_and_plot(l['paiement'])


Ideas
======

- investment value over time (plot)
- compute mean investment price, and current return (`current_price / mean_price`), or use one of the other usual valuation methods that GnuCash provides
- compute total invested in a given investment, and its current book value
- clean up the `ledger_investment` function, merge with `ledger`
- networth should compute earliest date from database
- networth should display diff between two columns
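
As a starting point for the mean-price idea above, here is a minimal sketch that only considers purchases. The numbers are made up, and nothing here exists in `kmymoney.py` yet.

```python
# Made-up purchases of one investment: (number of shares, price per share).
buys = [(10.0, 100.0), (5.0, 130.0)]
current_price = 132.0

total_shares = sum(shares for shares, _ in buys)
total_invested = sum(shares * price for shares, price in buys)

# Mean price paid per share, weighted by the number of shares bought.
mean_price = total_invested / total_shares

# Current return as defined in the list above.
current_return = current_price / mean_price

# Current value of the position at today's price.
current_value = total_shares * current_price

print(mean_price, current_return, current_value)
```

Sales, fees and dividends are ignored here; handling them is where the other valuation methods differ.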