Enhancement: Include transfer account path in list_splits() #2

donpellegrino · 2017-12-22T20:39:18Z

It seems that the transfer account path field is not included in book.list_splits(). It would be useful to include the transfer account path as a column within the DataFrame returned from list_splits(). This would allow for easy filtering and analysis on properties of the transfer account in combination with the split or transaction data.

LiosK · 2017-12-23T00:48:45Z

I am not convinced it is a good idea to join transfer accounts, because the transfer account cannot be determined when a transaction consists of more than two splits. Instead, you can easily join transfer accounts yourself with pandas:

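def join_transfer(sp):
    # Keep only splits from transactions with exactly two splits;
    # the transfer account is ambiguous otherwise.
    pairs = sp.reset_index() \
                .groupby(["trn_idtype", "trn_id"]) \
                .filter(lambda x: len(x) == 2)
    # Self-join on the transaction key: each split is matched with both
    # splits of its transaction, including itself.
    dup = pairs[["trn_idtype", "trn_id", "idtype", "id"]] \
              .join(pairs.set_index(["trn_idtype", "trn_id"]) \
              .add_prefix("transfer_"), on=["trn_idtype", "trn_id"])
    # Drop the self-matches so each split is paired with its counterpart only.
    dedup = dup[(dup["id"] != dup["transfer_id"]) \
                | (dup["idtype"] != dup["transfer_idtype"])] \
                .drop(columns=["trn_idtype", "trn_id"]) \
                .set_index(["idtype", "id"])
    # Left join: splits from larger transactions get NaN in the transfer_* columns.
    return sp.join(dedup)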

See this gist for usage.
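
For readers without the gist at hand, here is a minimal sketch of how join_transfer() might be exercised. It is not the gist's content. The toy DataFrame is a hypothetical stand-in for book.list_splits() output, assuming only the columns that appear in this thread (idtype, id, trn_idtype, trn_id, act_path); the account names and IDs are made up.

```python
import pandas as pd


def join_transfer(sp):
    # Same function as above, repeated so the sketch runs on its own.
    pairs = sp.reset_index() \
                .groupby(["trn_idtype", "trn_id"]) \
                .filter(lambda x: len(x) == 2)
    dup = pairs[["trn_idtype", "trn_id", "idtype", "id"]] \
              .join(pairs.set_index(["trn_idtype", "trn_id"]) \
              .add_prefix("transfer_"), on=["trn_idtype", "trn_id"])
    dedup = dup[(dup["id"] != dup["transfer_id"]) \
                | (dup["idtype"] != dup["transfer_idtype"])] \
                .drop(columns=["trn_idtype", "trn_id"]) \
                .set_index(["idtype", "id"])
    return sp.join(dedup)


# Hypothetical stand-in for book.list_splits(): t1 has two splits,
# t2 has three, so only t1 has a well-defined transfer account.
splits = pd.DataFrame(
    {
        "idtype": ["guid"] * 5,
        "id": ["s1", "s2", "s3", "s4", "s5"],
        "trn_idtype": ["guid"] * 5,
        "trn_id": ["t1", "t1", "t2", "t2", "t2"],
        "act_path": [
            "Liabilities:Credit Card",
            "Expenses:Groceries",
            "Liabilities:Credit Card",
            "Expenses:Dining",
            "Expenses:Tips",
        ],
    }
).set_index(["idtype", "id"])

joined = join_transfer(splits)
print(joined[["act_path", "transfer_act_path"]])
```

In this toy data the two splits of t1 point at each other's account path, while the three splits of t2 are left with missing transfer_* values, which is the ambiguity described above.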

donpellegrino · 2017-12-23T03:36:35Z

@LiosK You are right. Good point.

I was exploring the use of a function similar to the following that might allow for the addition of a column containing a list of account paths for the other splits within the same transaction. That might be one way of expressing the case of more than two splits within a transaction. However, it is not clear how useful such a derived column would be for downstream analytics. The join_transfer() function you provided is much more elegant. Thanks!

splits = book.list_splits()

def expenses(split):
    # Look up the transaction for the given split
    transaction = split.trn_id
    
    # Get all splits for that transaction
    all_tx_splits = splits.loc[lambda df: df.trn_id == transaction, :]
    
    # Exclude the current split
    other_tx_splits = all_tx_splits.loc[~all_tx_splits.index.isin([split.name])]
    
    # Return the account paths of the other splits
    return other_tx_splits.act_path.tolist()

My basic use case was to explore all of the accounts that money from a credit card is flowing into within a given time period. I suspect I could be approaching it more simply.

LiosK · 2017-12-23T04:13:34Z

As an accountant I would say that is always a challenge in the accounting field, and that's why simple, standardized split designs are often preferred over detailed flexible schemas. One suggested approach would be: collect transactions that include the credit card account, aggregate their splits by account and dr/cr, and manually exclude irrelevant accounts. This is actually similar to what GnuCash's cash flow statement does.
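
The suggested approach could be sketched roughly as follows. This is only an illustration under stated assumptions: the toy DataFrame stands in for book.list_splits(), the "value" column name and the sign convention (positive is a debit, negative is a credit) are assumptions rather than something confirmed in this thread, and the account names are made up.

```python
import pandas as pd

# Hypothetical stand-in for book.list_splits(); "value" is an assumed column name.
splits = pd.DataFrame(
    {
        "trn_id": ["t1", "t1", "t2", "t2", "t2", "t3", "t3"],
        "act_path": [
            "Liabilities:Credit Card", "Expenses:Groceries",
            "Liabilities:Credit Card", "Expenses:Dining", "Expenses:Tips",
            "Assets:Checking", "Expenses:Rent",
        ],
        "value": [-50.0, 50.0, -30.0, 25.0, 5.0, -900.0, 900.0],
    }
)

CARD = "Liabilities:Credit Card"

# 1. Collect the transactions that include the credit card account.
card_trn_ids = splits.loc[splits["act_path"] == CARD, "trn_id"].unique()
card_splits = splits[splits["trn_id"].isin(card_trn_ids)].copy()

# 2. Label each split as debit or credit by sign (assumed convention),
#    then aggregate by account and side.
card_splits["side"] = card_splits["value"].apply(lambda v: "dr" if v >= 0 else "cr")
flows = (
    card_splits.groupby(["act_path", "side"])["value"]
    .sum()
    .unstack(fill_value=0.0)
)

# 3. Manually exclude irrelevant accounts (here, just the card itself).
flows = flows.drop(index=[CARD])
print(flows)
```

Because the aggregation works per transaction rather than per split pair, transactions with more than two splits (t2 here) are handled without having to decide which split is "the" transfer account. Restricting to a time period would be an extra filter on a date column before step 1.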

donpellegrino changed the title ~~Enhancement: Include transfer account data in list_splits()~~ Enhancement: Include transfer account path in list_splits() Dec 22, 2017

LiosK closed this as completed Dec 23, 2017
