## Coinbase Transaction Statement
Contains information regarding transactions performed on the base Coinbase platform (not Coinbase Pro/GDAX).
These include receipts or transfers of money or crypto, as well as any Coinbase-assisted trades/sales.
### How to get:
- Log in to Coinbase.com
- Go to: https://accounts.coinbase.com/profile 
- Select "statements"
- Click on the "transactions" tab
- "Generate custom statement" with:
    - "all assets"
    - "all transactions"
    - Select the desired year

#### Record types

What we are interested in here are:
- BTC/ETH "Receive" - Receipt of new crypto (income), treated as an Acquisition, though it may also be just a transfer into CB before a sale in CBPro. One tip is that even-numbered values from an external account are probably from me.
- BTC/ETH "Withdrawal" - implicit CB-assisted sale, treated as a Disposition
- BTC/ETH "Send" - maybe a payment (Disposition) or maybe just a transfer out of CB, which isn't a disposition. We can't know, so it gets exported as a disposition and you have to decide in fifo-tool what to do with it.

Less directly interesting (we don't do anything with them) are:

- BTC/ETH "Exchange Deposit" - Crypto being sent to CBPro/GDAX - almost certainly to be sold - not exported
- USD "Exchange withdrawal" - almost always right after the BTC Exchange deposit - is the cash from the sale - not exported
- USD "Withdrawal" - usually the same amount as the exchange withdrawal, and right after it: the money being transferred to a bank - not exported

...And there are new ones

- Pro Deposit/Withdrawal - at some point in 2018 "Pro" deposits and withdrawals appeared. I think they are the same as the "Exchange" ones? There are reports that have both - I'm just assuming they mean the same thing. We still don't process them.
- Buy - buy an asset on CB. *Acquisition*.
- Convert - "Convert" an asset to another asset or to cash, like "USD", which is just a sale. *Disposition* (unless the asset is the "Price currency", but I'll bet that never happens). For now we'll check and make sure it's ASSET -> USD and blow up otherwise.
- Reward Income - Yikes! Ignore.
- Advanced Trade Sell - GDAX/PRO is now "Advanced"?  *Disposition*. This is not reflected in the year's PRO Account report, so there's no chance of duplication
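
The handling described in the lists above can be summarized as a lookup table. This is only a sketch, not code from fifo-tool: the names `HANDLING` and `classify` are hypothetical, and the exact strings Coinbase uses for the ignored record types are assumed from the descriptions above. It applies to crypto assets only; a USD "Withdrawal" never reaches it because rows are filtered by asset first.

```python
from typing import Optional

# Record type -> how a BTC/ETH row of that type is treated.
# None means "seen, but deliberately not exported".
# The strings for the ignored types are assumptions, not verified CSV values.
HANDLING = {
    "Receive": "acquisition",
    "Buy": "acquisition",
    "Send": "disposition",            # may really be a self-transfer; decided later
    "Withdrawal": "disposition",
    "Convert": "disposition",         # assumes ASSET -> USD
    "Advanced Trade Sell": "disposition",
    "Exchange Deposit": None,
    "Exchange Withdrawal": None,
    "Pro Deposit": None,
    "Pro Withdrawal": None,
    "Reward Income": None,
}


def classify(tx_type: str) -> Optional[str]:
    """Return 'acquisition', 'disposition', or None; raise on unknown types."""
    if tx_type not in HANDLING:
        raise KeyError(f"Unrecognized transaction type: {tx_type!r}")
    return HANDLING[tx_type]


print(classify("Buy"))               # acquisition
print(classify("Exchange Deposit"))  # None
```

Raising on an unknown type is one option; logging and skipping would be gentler if new record types keep appearing.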

#### Fees

Fees are reported inconsistently, negative or positive, for different transaction types. There's probably a rhyme or reason, but it seems to me that
all fees are paid by the user, so the sign should be consistent. I choose positive.

| Record |  Sign  | Example      |
|-----|----------|-------------|
| Buy | Positive | (Fee: 3.67) |
| Convert | Positive | (Fee: 3.5497) |
| Withdrawal | Negative | (Fee: -94.06) |
| Advanced Trade Sell | Negative | (Fee: -7.014658035216) |
| Advanced Trade Sell | Positive | (Fee: 13.9426) |

WTF? Looks like the Advanced Trade Sell records in 2023 reported negative fees, positive thereafter?
A: Actually, the report in 2023 was inconsistent with other reports. A BTC "Withdrawal" in 2015 was also reported with negative fees.

We are just using the asset amount, asset price, and fees columns, so everything works out as long as we always call fees positive and subtract them
from amount*price to get net proceeds/cost.
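
As a small worked example of the sign normalization, shown for a sale only: the helper names `normalize_fee` and `net_proceeds` are hypothetical, the two fee values are taken from the table above, and the amount and price are made up for illustration.

```python
def normalize_fee(raw_fee: float) -> float:
    """Return the fee as a non-negative number, whatever sign the statement used."""
    return abs(raw_fee)


def net_proceeds(amount: float, price: float, raw_fee: float) -> float:
    """Gross value of a sale (amount * price) minus the positive fee."""
    return amount * price - normalize_fee(raw_fee)


# Fee values as they appear in the statement, with either sign.
print(normalize_fee(-94.06))    # 94.06
print(normalize_fee(13.9426))   # 13.9426

# Made-up sale: 2 units at 100.0 each with a fee of 10, reported with either sign.
print(net_proceeds(2.0, 100.0, -10.0))  # 190.0
print(net_proceeds(2.0, 100.0, 10.0))   # 190.0
```

Either way the fee is reported, the result is the same, which is the whole point of taking the absolute value up front.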

#### Notes

Start by assuming "Receive" is an acquisition, and "Send" and an asset "Withdrawal" are always dispositions.

Now also process "Buy", "Convert", and "Advanced Trade Sell".

I guess we should flag anything unrecognized or unexpected. 

What about other assets, like an ETH tx when we are processing BTC? Should we at least log it? Not doing it ATM.
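
One possible way to do that flagging, sketched on plain dict rows of the kind `DataFrame.to_dict` produces. The names `HANDLED_TYPES` and `audit_rows` are hypothetical and the sample rows are made up. Note that this also counts record types that are skipped on purpose, such as "Reward Income".

```python
from collections import Counter

# Record types the notebook currently turns into acquisitions or dispositions.
HANDLED_TYPES = {"Receive", "Send", "Withdrawal", "Buy", "Convert", "Advanced Trade Sell"}


def audit_rows(rows, assets):
    """Count rows that would be skipped: unhandled record types and unprocessed assets."""
    skipped_types = Counter(
        r["Transaction Type"] for r in rows if r["Transaction Type"] not in HANDLED_TYPES
    )
    skipped_assets = Counter(r["Asset"] for r in rows if r["Asset"] not in assets)
    return skipped_types, skipped_assets


# Made-up rows, keyed by the statement's column names.
rows = [
    {"Asset": "BTC", "Transaction Type": "Buy"},
    {"Asset": "ETH", "Transaction Type": "Reward Income"},
    {"Asset": "LTC", "Transaction Type": "Send"},
]
types, other_assets = audit_rows(rows, ["BTC", "ETH"])
print(dict(types))         # {'Reward Income': 1}
print(dict(other_assets))  # {'LTC': 1}
```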

Also - just as an aside - the records in these TX statements are ordered newest to oldest... which is weird.
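
If chronological order matters downstream, the rows can be re-sorted by timestamp. A minimal sketch on dict rows follows; the helper name `oldest_first` is hypothetical and the rows are made up. On a DataFrame, `sort_values("Timestamp")` would do the same job.

```python
from datetime import datetime, timezone


def oldest_first(rows):
    """Return a new list of rows sorted by ascending 'Timestamp'."""
    return sorted(rows, key=lambda r: r["Timestamp"])


# Made-up rows in the statement's newest-to-oldest order.
rows = [
    {"ID": "c", "Timestamp": datetime(2024, 12, 31, tzinfo=timezone.utc)},
    {"ID": "b", "Timestamp": datetime(2024, 11, 18, tzinfo=timezone.utc)},
    {"ID": "a", "Timestamp": datetime(2024, 10, 14, tzinfo=timezone.utc)},
]
print([r["ID"] for r in oldest_first(rows)])  # ['a', 'b', 'c']
```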


In [7]:
# allow import of local fifo-tool stuff
from typing import List
import os
import sys
sys.path.insert(0, os.path.abspath('../src'))

In [8]:
from typing import Dict
from datetime import datetime
import json
import numpy as np
import pandas as pd

from models.acquisition import Acquisition
from models.disposition import Disposition
from models.stash import Stash

In [9]:
def read_statement_csv(file_path):
    """Read a Coinbase transaction statement and return a pandas DataFrame.
        conversions done:
            'Timestamp' - parsed into a datetime
            dollar-amount columns - '$' stripped and parsed as float
    """
    dol = lambda x: float(x.replace('$','')) # money fields seem to have $ signs...
    date_flds = ['Timestamp']
    dollar_flds = {'Price at Transaction':dol,'Subtotal':dol,'Total (inclusive of fees and/or spread)':dol,'Fees and/or Spread':dol}
    # skiprows=3: the first 3 lines of the file are skipped, so the 4th line is read as the header row
    df = pd.read_csv(file_path, skiprows=3, parse_dates=date_flds, converters=dollar_flds)
    return df

In [27]:
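# tx_type: one of the "Transaction Type" values, e.g. Send, Receive, Withdrawal, Buy, Convert, Advanced Trade Sell
def parse_transactions(data, asset, tx_type):
    """Return the rows of `data` for one asset and one transaction type."""
    type_mask = data["Transaction Type"]==tx_type
    asset_mask = data["Asset"]==asset
    return data[asset_mask & type_mask]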


In [28]:
def debug_fee_data(rec):
    if rec['Fees and/or Spread']:
        print(f"Date: {rec['Timestamp']}, Rec Type: {rec['Transaction Type']}, Asset: {rec['Asset']}, Fee: {rec['Fees and/or Spread']}")

def rec_to_disposition(rec, asset) -> 'Disposition':
    """ rec is a single file row as a dict keyed by column names """
    debug_fee_data(rec)
    return Disposition(
        rec['Timestamp'].timestamp(),
        rec['Asset'] , # asset_type sold
        rec['Quantity Transacted'], # asset_amount
        rec['Price at Transaction'], # asset_price,
        abs(rec['Fees and/or Spread']), # fees - reported variously by CB as >0 or <0. We want them positive
        f"CB Id: {rec['ID']}", # reference
        rec['Notes']
    )

def rec_to_acquisition(rec, asset) -> 'Acquisition':
    debug_fee_data(rec)    
    return Acquisition(
        rec['Timestamp'].timestamp(),
        rec['Asset'], # asset_type bought/received
        rec['Quantity Transacted'], # asset_amount
        rec['Price at Transaction'], # asset_price,
        abs(rec['Fees and/or Spread']), # fees - see above
        f"Coinbase Id: {rec['ID']}", # reference
        f"{rec['Notes']}" #comment
    )

In [29]:
def process_recs(src_df, asset):
    # Receive
    rcv_df = parse_transactions(src_df, asset, 'Receive') # crypto receive *might* be an acquisition... or just a self-transfer
    rcv_recs = [r for r in rcv_df.to_dict(orient='index').values()] # converts DataFrame into a list of dicts
    # Send
    send_df = parse_transactions(src_df, asset, 'Send') # crypto "send" _might_ be a disposition (payment, for instance)... or just a self-transfer
    send_recs = [r for r in send_df.to_dict(orient='index').values()]
    # Withdrawal
    wd_df = parse_transactions(src_df, asset, 'Withdrawal') # a crypto "withdrawal" from Coinbase (not Pro) _is_ a sale/disposition
    wd_recs = [r for r in wd_df.to_dict(orient='index').values()]
    # Buy
    buy_df = parse_transactions(src_df, asset, 'Buy') # 'Buy' is an acquisition
    buy_recs = [r for r in buy_df.to_dict(orient='index').values()]
    # Convert
    cnv_df = parse_transactions(src_df, asset, 'Convert') # Assumes convert is converting FROM asset to USDC
    cnv_recs = [r for r in cnv_df.to_dict(orient='index').values()]  # Disposition
    # Advanced Trade Sell
    ast_df = parse_transactions(src_df, asset, 'Advanced Trade Sell') # Sell
    ast_recs = [r for r in ast_df.to_dict(orient='index').values()]  # Disposition
    
    acqs = [rec_to_acquisition(r, asset) for r in rcv_recs + buy_recs]
    disps = [rec_to_disposition(r, asset) for r in send_recs + wd_recs + cnv_recs + ast_recs ] 
    return acqs, disps

In [30]:

def process_file( year: str, assets: List[str]) -> None:
    filebase = f'local_data/coinbase-txs-{year}'
    main_df = read_statement_csv(filebase+'.csv')
    for asset in assets:
        acqs, disps = process_recs(main_df, asset)
        data = Stash(asset, f"Coinbase  {asset} txs - {year}", acqs, disps)
         #json.dumps(jd)
        with open(filebase+f'-{asset}.json', 'w') as f:
            jd = data.to_json_dict()
            json.dump(jd, f, indent=2)



In [31]:
for yr in ['2015', '2016', '2017', '2018', '2019', '2020', '2021', '2022', '2023', '2024']:  
    process_file(yr, ['BTC', 'ETH'] )  # there are probably other crypto assets, but I don't care at the moment

Date: 2015-10-22 18:56:12+00:00, Rec Type: Withdrawal, Asset: BTC, Fee: -94.06
Date: 2020-08-19 03:28:57+00:00, Rec Type: Buy, Asset: BTC, Fee: 3.67
Date: 2020-08-20 14:21:37+00:00, Rec Type: Buy, Asset: ETH, Fee: 3.09
Date: 2021-12-31 21:54:14+00:00, Rec Type: Convert, Asset: BTC, Fee: 3.5497
Date: 2023-12-28 20:54:52+00:00, Rec Type: Advanced Trade Sell, Asset: ETH, Fee: -7.014658035216
Date: 2023-12-28 20:54:52+00:00, Rec Type: Advanced Trade Sell, Asset: ETH, Fee: -2.399941964784
Date: 2024-12-31 12:27:49+00:00, Rec Type: Advanced Trade Sell, Asset: BTC, Fee: 15.136
Date: 2024-12-30 20:30:57+00:00, Rec Type: Advanced Trade Sell, Asset: BTC, Fee: 0.012657953
Date: 2024-12-30 20:30:57+00:00, Rec Type: Advanced Trade Sell, Asset: BTC, Fee: 0.013959176
Date: 2024-12-30 20:30:57+00:00, Rec Type: Advanced Trade Sell, Asset: BTC, Fee: 34.846583079
Date: 2024-11-18 21:10:37+00:00, Rec Type: Advanced Trade Sell, Asset: BTC, Fee: 13.9426
Date: 2024-10-14 19:49:45+00:00, Rec Type: Advanced Tr

In [None]:
# work below here

In [None]:
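# main_df is local to process_file(), so load a statement here first (the year is an arbitrary choice)
main_df = read_statement_csv('local_data/coinbase-txs-2024.csv')
recs = parse_transactions(main_df, "BTC", "Receive")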

In [None]:
 # **** Take a multi-row DataFrame and write out a list of dicts keyed by column name
[ i for i in recs.to_dict(orient='index').values()] 

# 

In [None]:
# deal with "receive" records
btc_mask = main_df["Asset"]=='BTC'
eth_mask = main_df["Asset"]=='ETH'
usd_mask = main_df["Asset"]=='USD'

In [None]:
receive_mask = main_df["Transaction Type"]=="Receive"
main_df[receive_mask & btc_mask]
#main_df[receive_mask]

In [None]:
wd_mask = main_df["Transaction Type"]=="Withdrawal"
main_df[wd_mask]

In [None]:
mask = main_df["Transaction Type"]=="Send"
main_df[mask]