Enhancement: Include transfer account path in list_splits() #2

Closed
donpellegrino opened this issue Dec 22, 2017 · 3 comments

donpellegrino commented Dec 22, 2017

It seems that the transfer account path is not included in the output of book.list_splits(). It would be useful to include the transfer account path as a column in the DataFrame returned by list_splits(). This would allow easy filtering and analysis on properties of the transfer account in combination with the split or transaction data.

donpellegrino changed the title from "Enhancement: Include transfer account data in list_splits()" to "Enhancement: Include transfer account path in list_splits()" on Dec 22, 2017
@LiosK
Owner

LiosK commented Dec 23, 2017

I am not convinced that joining transfer accounts inside list_splits() is a good idea, because the transfer account cannot be determined when a transaction consists of more than two splits. Instead, you can easily join transfer accounts yourself using pandas:

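def join_transfer(sp):
    # Keep only transactions with exactly two splits; the transfer
    # account is ambiguous for anything larger.
    pairs = (
        sp.reset_index()
        .groupby(["trn_idtype", "trn_id"])
        .filter(lambda x: len(x) == 2)
    )
    # Self-join on the transaction key: every split is matched with both
    # splits of its transaction (itself and its counterpart).
    dup = pairs[["trn_idtype", "trn_id", "idtype", "id"]].join(
        pairs.set_index(["trn_idtype", "trn_id"]).add_prefix("transfer_"),
        on=["trn_idtype", "trn_id"],
    )
    # Drop the rows where a split was matched with itself.
    dedup = (
        dup[(dup["id"] != dup["transfer_id"])
            | (dup["idtype"] != dup["transfer_idtype"])]
        .drop(columns=["trn_idtype", "trn_id"])
        .set_index(["idtype", "id"])
    )
    # Left join: splits from larger transactions get NaN in transfer_* columns.
    return sp.join(dedup)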

See this gist for usage.
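Since the gist itself is not reproduced in this thread, here is a minimal usage sketch on a hand-built frame. It assumes only what join_transfer() relies on: an (idtype, id) index plus trn_idtype and trn_id columns. The act_path column name comes from this thread; the sample rows and the "guid" id type are made up for illustration, and the function is repeated so the snippet runs on its own.

```python
import pandas as pd


def join_transfer(sp):
    # Same logic as the function above, repeated so this snippet is standalone.
    pairs = (sp.reset_index()
               .groupby(["trn_idtype", "trn_id"])
               .filter(lambda x: len(x) == 2))
    dup = pairs[["trn_idtype", "trn_id", "idtype", "id"]].join(
        pairs.set_index(["trn_idtype", "trn_id"]).add_prefix("transfer_"),
        on=["trn_idtype", "trn_id"])
    dedup = (dup[(dup["id"] != dup["transfer_id"])
                 | (dup["idtype"] != dup["transfer_idtype"])]
             .drop(columns=["trn_idtype", "trn_id"])
             .set_index(["idtype", "id"]))
    return sp.join(dedup)


# Hypothetical stand-in for book.list_splits(): t1 has two splits,
# t2 has three (so its transfer account is ambiguous).
sample = pd.DataFrame(
    {
        "idtype": ["guid"] * 5,
        "id": ["s1", "s2", "s3", "s4", "s5"],
        "trn_idtype": ["guid"] * 5,
        "trn_id": ["t1", "t1", "t2", "t2", "t2"],
        "act_path": [
            "Liabilities:Credit Card", "Expenses:Groceries",
            "Liabilities:Credit Card", "Expenses:Dining", "Expenses:Tips",
        ],
    }
).set_index(["idtype", "id"])

joined = join_transfer(sample)
print(joined[["act_path", "transfer_act_path"]])
```

The two splits of t1 point at each other through transfer_act_path, while the three splits of t2 keep missing values in every transfer_* column, which is the ambiguity described above.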

@donpellegrino
Author

@LiosK You are right. Good point.

I was exploring the use of a function similar to the following that might allow for the addition of a column containing a list of account paths for the other splits within the same transaction. That might be one way of expressing the case of more than two splits within a transaction. However, it is not clear how useful such a derived column would be for downstream analytics. The join_transfer() function you provided is much more elegant. Thanks!

splits = book.list_splits()

def expenses(split):
    # Look up the transaction for the given split
    transaction = split.trn_id

    # Get all splits for that transaction
    all_tx_splits = splits.loc[lambda df: df.trn_id == transaction, :]

    # Exclude the current split
    other_tx_splits = all_tx_splits.loc[~all_tx_splits.index.isin([split.name])]

    # Return the account paths of the other splits
    return other_tx_splits.act_path.tolist()

My basic use case was to explore all of the accounts that money from a credit card is flowing into within a given time period. I suspect I could be approaching it more simply.
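For reference, a row-wise function like this could be turned into the derived column with DataFrame.apply(axis=1). The sketch below is only an illustration on a hand-built frame: the sample rows and the "guid" id type are invented, other_paths() is a hypothetical variant of the function above that also matches on trn_idtype (as join_transfer() does), and the new column name other_act_paths is made up.

```python
import pandas as pd

# Hypothetical stand-in for book.list_splits().
splits = pd.DataFrame(
    {
        "idtype": ["guid"] * 5,
        "id": ["s1", "s2", "s3", "s4", "s5"],
        "trn_idtype": ["guid"] * 5,
        "trn_id": ["t1", "t1", "t2", "t2", "t2"],
        "act_path": [
            "Liabilities:Credit Card", "Expenses:Groceries",
            "Liabilities:Credit Card", "Expenses:Dining", "Expenses:Tips",
        ],
    }
).set_index(["idtype", "id"])


def other_paths(split):
    # All splits that belong to the same transaction as this row.
    same_trn = splits[(splits["trn_idtype"] == split["trn_idtype"])
                      & (splits["trn_id"] == split["trn_id"])]
    # Exclude the current split (split.name is its (idtype, id) index tuple).
    others = same_trn[~same_trn.index.isin([split.name])]
    return others["act_path"].tolist()


# One list of counterpart account paths per split.
result = splits.assign(other_act_paths=splits.apply(other_paths, axis=1))
print(result[["act_path", "other_act_paths"]])
```

This scans the whole frame once per row, so it is fine for exploration but slow on large books; the join-based approach scales better.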

@LiosK
Owner

LiosK commented Dec 23, 2017

As an accountant, I would say that this is always a challenge in the accounting field, which is why simple, standardized split designs are often preferred over detailed, flexible schemas. One suggested approach would be: collect the transactions that include the credit card account, aggregate their splits by account and debit/credit (dr/cr), and manually exclude irrelevant accounts. This is similar to what GnuCash's cash flow statement does.
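The suggested approach might look roughly like the sketch below. It rests on assumptions not confirmed in this thread: a numeric `value` column on the splits frame, with positive amounts treated as debits and negative amounts as credits. The helper name flows_for_account(), the `side` column, and the sample data are hypothetical. Restricting to a time period is left out because the name of the date column is not given here.

```python
import pandas as pd

# Hypothetical stand-in for book.list_splits(); `value` is an assumed column.
splits = pd.DataFrame(
    {
        "idtype": ["guid"] * 7,
        "id": ["s1", "s2", "s3", "s4", "s5", "s6", "s7"],
        "trn_idtype": ["guid"] * 7,
        "trn_id": ["t1", "t1", "t2", "t2", "t2", "t3", "t3"],
        "act_path": [
            "Liabilities:Credit Card", "Expenses:Groceries",
            "Liabilities:Credit Card", "Expenses:Dining", "Expenses:Tips",
            "Assets:Checking", "Expenses:Rent",
        ],
        "value": [-50.0, 50.0, -30.0, 25.0, 5.0, -800.0, 800.0],
    }
).set_index(["idtype", "id"])

CARD = "Liabilities:Credit Card"


def flows_for_account(splits, account_path):
    trn_key = ["trn_idtype", "trn_id"]
    # 1. Collect the transactions that include the account.
    trns = splits.loc[splits["act_path"] == account_path, trn_key].drop_duplicates()
    # 2. Pull in every split of those transactions.
    related = splits.merge(trns, on=trn_key, how="inner")
    # 3. Tag each split as debit or credit (assumed sign convention).
    side = related["value"].ge(0).map({True: "dr", False: "cr"})
    # 4. Aggregate by account and dr/cr.
    return related.assign(side=side).groupby(["act_path", "side"])["value"].sum()


by_account = flows_for_account(splits, CARD)
# Manual exclusion step: here only the card account itself is dropped.
summary = by_account.drop(index=CARD, level="act_path")
print(summary)
```

Transaction t3 never touches the card, so Expenses:Rent does not appear in the result. Which accounts count as "irrelevant" in the last step is a judgment call that depends on the book.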

@LiosK LiosK closed this as completed Dec 23, 2017