# Calculating the potential loss from transaction intent leakage

Input: a list of transaction hashes that we think have been leaked.
Output: a file with the potential loss in dollars for each transaction, as well as the total potential loss in dollars, ETH, and the original token used in the transaction.

## Methodology
1. We start from a list of transaction hashes that we think have been leaked.
2. We get the details of each transaction using the Infura and Etherscan APIs.
3. We simulate each transaction as if it had been at the top of its block, using the Tenderly API.
4. We calculate the difference in dollars between the top-of-block simulation and the transaction's actual position.
5. We sum the potential loss of each transaction to get the total potential loss in dollars.

## Usage
1. Prepare a CSV file with a column named ' user_tx' (the leading space matters) containing the hashes of all the transactions you think were leaked.
2. Open the config_example.py file and follow the configuration instructions in it.
3. Run the Jupyter Notebook. The output file is called results/final_results_<INCIDENT_NAME>.csv, and the notebook itself shows additional intermediate information.
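
The leading space in the column name matters because pandas keeps whitespace in header names by default. A minimal sketch, using a made-up two-row CSV held in memory (the hashes and fee values are placeholders, not real data):

```python
import io
import pandas as pd

# Hypothetical file contents; note the space after each comma in the header.
csv_text = "block_number, user_tx, fees\n19375342,0xaaa,47\n19375361,0xbbb,70\n"

# Default parsing keeps the leading space, so the column is ' user_tx'.
df = pd.read_csv(io.StringIO(csv_text))
print(df.columns.tolist())

# skipinitialspace=True strips it; lookups on ' user_tx' would then fail.
df_stripped = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
print(df_stripped.columns.tolist())
```

If your file has no space in the header, either add it or adjust the column name used throughout the notebook.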

## Final results data catalog

The final result file has the following columns:

| Column | Meaning |
| :--- | :--- |
| ```tx_hash``` | the transaction hash of the potentially leaked transaction |
| ```sender``` | address which initiated the transaction |
| ```delta_eth``` | the potential loss for that transaction: the difference in ETH between the value of the transaction at the top of the block and at its actual position in the block. A value of -1 means that the leak cost the sender 1 ETH |
| ```delta_dollar``` | difference in dollars between the value of the transaction at the top of the block and at its actual position in the block |
| ```token_name_A``` | name of one of the two tokens exchanged by the sender |
| ```token_contract_address_A``` | contract address of token A |
| ```delta_token_A``` | difference in token A between the transaction at the top of the block and at its actual position in the block. A value of -1 means that the leak cost the sender 1 token A |
| ```token_name_B``` | name of the other token exchanged by the sender |
| ```token_contract_address_B``` | contract address of token B |
| ```delta_token_B``` | difference in token B between the transaction at the top of the block and at its actual position in the block. A value of -1 means that the leak cost the sender 1 token B |


In [2]:
import pandas as pd
import numpy as np 
import json
import requests 
from configurations import *

In [311]:
df_main = pd.read_csv(csv_file_path)
# drop duplicate transactions, if any
df_main = df_main.drop_duplicates(subset=[' user_tx'])
df_main.head()

Unnamed: 0,block_number,user_tx,fees
0,19375342,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,47408284226876256
1,19375361,0x202fc38a52652a0c49927c1771de43939b47e083ba1c...,70226021788882271
2,19375372,0xee51506e07ace44eaad85041210d025ac46241526d2d...,25816233944667279
3,19375384,0x4589dc3b7be6df22ed3657b3310bfff117329a0a7e68...,252120514831501060
4,19375388,0x153a70478d17e082740c30f9d5d20fbca5d298c34cc4...,272612861208189719


In [312]:
#make a list of transaction hash we need to analyse
tx_hash_list = [x for x in df_main[' user_tx'].to_list() if pd.notnull(x)]
print(f'There are {len(tx_hash_list)} transactions')

There are 130 transactions


## API calls - skip this if you already made them 

### Infura
With this API, we get all the inputs needed to simulate the transaction again later on. Infura gives us all of these inputs except for the timestamp of the transaction, which is why we also use the Etherscan API later on.
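
As a rough sketch of what the cells below send and receive, the request body and the hex-encoded response fields can be handled as follows. The helper names `build_payload` and `hex_to_int` are illustrative and not part of the notebook, the hash is a placeholder, and no network call is made here.

```python
import json

def build_payload(tx_hash: str) -> str:
    """Return the JSON-RPC request body for eth_getTransactionByHash."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "eth_getTransactionByHash",
        "params": [tx_hash],
        "id": 1,
    })

def hex_to_int(value: str) -> int:
    """Convert a 0x-prefixed quantity from the JSON-RPC response to an int."""
    return int(value, 16)

payload = build_payload("0x" + "ab" * 32)   # placeholder transaction hash
block_number = hex_to_int("0x127a4ee")      # a blockNumber value from this dataset
print(block_number)
```

Quantities such as `gas`, `gasPrice`, `value` and `transactionIndex` come back as hex strings, which is why they are converted with `int(...)` before the simulation step.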

#### Call Infura for the first transaction

In [4]:
# create a df to store all the potentially problematic transactions and the reason why
problematic_transactions = pd.DataFrame(columns=['tx_hash or block', 'where_problem_happened'])

In [5]:
#the url and the api key are stored in the config file.
url = infuria_url

#Get the infuria response for the first transaction in the list to create a dataframe
payload = json.dumps({
  "jsonrpc": "2.0",
  "method": "eth_getTransactionByHash",
  "params": [tx_hash_list[0]],
  "id": 1
})
headers = {
  'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=payload)
result = response.json()['result']
result = {k: None if not v else v for k, v in result.items()} # making sure none of the values are empty
df_infuria = pd.DataFrame(result, index=[0])
df_infuria.head()

Unnamed: 0,accessList,blockHash,blockNumber,chainId,from,gas,gasPrice,hash,input,maxFeePerGas,maxPriorityFeePerGas,nonce,r,s,to,transactionIndex,type,v,value,yParity
0,,0x40f2c6a5cca6816f67bc7a3b5e667bd27f7169ec84da...,0x127a4ee,0x1,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,0x2bfd7,0xeadb616ac,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,0x7ff36ab5000000000000000000000000000000000000...,0x13ad304380,0x5e69ec0,0x54,0x2a47d3286b8276d24c68cae1e1db8a05c1725b4759ba...,0x3ee3481e8d33fe94124d4e0e76c66694ae5118383e31...,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,0x1,0x2,0x0,0x429d069189e0000,0x0


#### Call Infura for all other transactions - !! This will take a few minutes

In [6]:
# get the infuria response for all the other transactions in the list and append the results to the above dataframe
for tx_hash in tx_hash_list[1:]:
  payload = json.dumps({
    "jsonrpc": "2.0",
    "method": "eth_getTransactionByHash",
    "params": [tx_hash],
    "id": 1
  })
  headers = {
    'Content-Type': 'application/json'
  }

  response = requests.request("POST", url, headers=headers, data=payload)

  if response.ok:
    result = response.json()['result']
    result = {k: None if not v else v for k, v in result.items()} # making sure none of the values are empty
    df_temp = pd.DataFrame(result, index=[0])
    df_infuria = pd.concat([df_infuria, df_temp])
    
  else: 
    print(f"error code {response.status_code} for transaction {tx_hash}")
    new_row = {'tx_hash or block':  tx_hash, 'where_problem_happened': 'infuria'}
    problematic_transactions = pd.concat([problematic_transactions, pd.DataFrame([new_row])], ignore_index=True)

df_infuria.head()


Unnamed: 0,accessList,blockHash,blockNumber,chainId,from,gas,gasPrice,hash,input,maxFeePerGas,maxPriorityFeePerGas,nonce,r,s,to,transactionIndex,type,v,value,yParity
0,,0x40f2c6a5cca6816f67bc7a3b5e667bd27f7169ec84da...,0x127a4ee,0x1,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,0x2bfd7,0xeadb616ac,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,0x7ff36ab5000000000000000000000000000000000000...,0x13ad304380,0x5e69ec0,0x54,0x2a47d3286b8276d24c68cae1e1db8a05c1725b4759ba...,0x3ee3481e8d33fe94124d4e0e76c66694ae5118383e31...,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,0x1,0x2,0x0,0x429d069189e0000,0x0
0,,0xbfe2748fb504fd42ee26c5d7db6c3cab5d99d56bbde3...,0x127a501,0x1,0x77314da6f40f71c3a850c89e1a05c438a0acd405,0x34479,0xf4c2310e3,0x202fc38a52652a0c49927c1771de43939b47e083ba1c...,0x3593564c000000000000000000000000000000000000...,0x158887c893,0x9402a0,0x4d,0xac3934714a06fc8c26f120276c842b67eee672f2b521...,0x59683f997587e36875f3d1830e8c2c77c8b7983a05d5...,0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad,0x1,0x2,0x0,0x0,0x0
0,,0xbc5e86793e57d80ef647106b54cce1a1f2a5b9ecdecc...,0x127a50c,0x1,0x1f7ea43d283d0ef906ee92ddead883a8f078cbc9,0x41bc6,0xff4f60c10,0xee51506e07ace44eaad85041210d025ac46241526d2d...,0x3593564c000000000000000000000000000000000000...,0x15ad608d4c,0x2017a9b,0x416,0xf085eab90fe023a2f519442d5cd4cb054fc63b4b22eb...,0x71ae9fd16ff569a146f8862f036845a40914ae1c27b4...,0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad,0x4,0x2,0x0,0x0,0x0
0,,0x6a845bcc4b05da8db0c5da927a0d9867903195a63624...,0x127a518,0x1,0x4e6b065262e3504f2511ef5b8cadc039630803be,0x2e52c,0x1039a71567,0x4589dc3b7be6df22ed3657b3310bfff117329a0a7e68...,0x3593564c000000000000000000000000000000000000...,0x15d0265dd6,0x1c69447,0x29,0x59b8ba6e68ed7261a58ae3d45a00f10cb5aaa064f8d7...,0x42d467edb9dd15965d31f6b444083a95e769294b4213...,0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad,0x4,0x2,0x1,0x0,0x1
0,,0xd2dcda496d5c7e73285f9ad24267ab537edbcd14efc6...,0x127a51c,0x1,0xdd3d41d3817abe28519f4f5c0890e9c0f0cfe69b,0x1e8480,0x14f46b0400,0x153a70478d17e082740c30f9d5d20fbca5d298c34cc4...,0xb6f9de95000000000000000000000000000000000000...,,,0x3f4,0x3be3b5b6a8c858a4546cdda89f350555c868e31a6f50...,0x5111bccd8ee8b1dd5dd68bee7d60e3a408884df5fd4f...,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,0x1,0x0,0x25,0x429d069189e0000,


In [7]:
print(f"there were {len(problematic_transactions)} transactions where the infuria api call did not work")

there were 0 transactions where the infuria api call did not work


#### Merge the Infura response into main

In [8]:
df_main = df_main.merge(df_infuria, left_on = ' user_tx', right_on = 'hash', how ='outer')
df_main.columns

Index(['block_number', ' user_tx', ' fees', 'accessList', 'blockHash',
       'blockNumber', 'chainId', 'from', 'gas', 'gasPrice', 'hash', 'input',
       'maxFeePerGas', 'maxPriorityFeePerGas', 'nonce', 'r', 's', 'to',
       'transactionIndex', 'type', 'v', 'value', 'yParity'],
      dtype='object')

#### save the api results
Store the results in a file so that we do not have to rerun the calls if we need this data again
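
The save-then-reload pattern used in the next two cells can also be wrapped in a small helper, sketched here under the assumption that the API results fit in a single DataFrame. `load_or_fetch` and `fake_fetch` are hypothetical names introduced for illustration; `fake_fetch` stands in for the real API loop.

```python
import os
import tempfile
import pandas as pd

def load_or_fetch(path, fetch):
    """Read the cached CSV if it exists; otherwise call fetch() and cache the result."""
    if os.path.exists(path):
        return pd.read_csv(path, index_col=0)
    df = fetch()
    df.to_csv(path)
    return df

calls = []

def fake_fetch():
    # Stand-in for the API calls; records how often it is invoked.
    calls.append(1)
    return pd.DataFrame({"hash": ["0xaaa"], "gas": ["0x2bfd7"]})

with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "example_infura.csv")
    first = load_or_fetch(cache_path, fake_fetch)    # fetches and writes the file
    second = load_or_fetch(cache_path, fake_fetch)   # reads the file, no fetch
```

With this shape, rerunning the notebook only triggers the slow calls when the intermediary file is missing.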

In [9]:
df_main.to_csv(f'data/intermediary/{name_of_incident}_infuria.csv')

In [10]:
# read the file so if we want to rerun the notebook, there is no need to remake the Infuria calls
df_main = pd.read_csv(f'data/intermediary/{name_of_incident}_infuria.csv', index_col=0)

### Etherscan 
We use this API to get block timestamps. We query blocks rather than individual transactions to reduce the number of API calls, since every transaction in a block shares the block's timestamp. We need the timestamp because the Tenderly API uses the current time as an input unless the timestamp is overridden.
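
The saving from querying per block can be illustrated with a toy table. This is only a sketch: the hashes, block numbers and the second timestamp are made up, and the `block_times` frame stands in for the Etherscan responses.

```python
import pandas as pd

txs = pd.DataFrame({
    "user_tx": ["0xaaa", "0xbbb", "0xccc"],   # placeholder hashes
    "block_number": [100, 100, 101],          # two transactions share block 100
})

# One lookup per distinct block instead of one per transaction.
unique_blocks = sorted(txs["block_number"].dropna().unique())

# Stand-in for the API responses: block number -> Unix timestamp.
block_times = pd.DataFrame({
    "blockNumber": unique_blocks,
    "timeStamp": [1709728067, 1709728079],
})

# Broadcast each block's timestamp back to all of its transactions.
txs = txs.merge(block_times, left_on="block_number", right_on="blockNumber", how="left")
txs["block_time_utc"] = pd.to_datetime(txs["timeStamp"], unit="s", utc=True)
print(txs[["user_tx", "block_number", "timeStamp"]])
```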

In [11]:
#getting all the block numbers of the transactions we want to analyse
block_number_list = list(set([x for x in df_main['block_number'].to_list() if pd.notnull(x)]))
print(f'there are {len(block_number_list)} different blocks')

there are 114 different blocks


#### Get the Etherscan response for the first block in the list to create a dataframe

In [12]:
url_eth = f"https://api.etherscan.io/api?module=block&action=getblockreward&blockno={block_number_list[0]}&apikey={eth_scan_api_key}"

response_eth = requests.request("POST", url_eth)

result = response_eth.json()['result']
result = {k: None if not v else v for k, v in result.items()} # making sure none of the values are empty
df_eth = pd.DataFrame(result, index=[0])
df_eth.head()

Unnamed: 0,blockNumber,timeStamp,blockMiner,blockReward,uncles,uncleInclusionReward
0,19376128,1709728067,0x4838b106fce9647bdf1e7877bf73ce8b0bad5f97,175903410846177465,,0


#### Get the etherscan responses for the other blocks in the list - !!! This will take a few minutes

In [13]:
for block in block_number_list[1:]:
  url_temp = f"https://api.etherscan.io/api?module=block&action=getblockreward&blockno={block}&apikey={eth_scan_api_key}"
  response_temp = requests.request("POST", url_temp)

  if response_temp.ok:
    dct_temp = response_temp.json()['result']
    dct_temp = {k: None if not v else v for k, v in dct_temp.items()} # making sure none of the values are empty
    df_temp = pd.DataFrame(dct_temp, index=[0])
    df_eth = pd.concat([df_eth, df_temp])
    
  else: 
    print(f"error code {response_temp.status_code} for block {block}")
    new_row = {'tx_hash or block':  block, 'where_problem_happened': 'etherscan. This is a block number'}
    problematic_transactions = pd.concat([problematic_transactions, pd.DataFrame([new_row])], ignore_index=True)

In [14]:
print(f"there were {len(problematic_transactions[problematic_transactions['where_problem_happened'] ==  'etherscan. This is a block number'])} blocks where the etherscan api call did not work")

there were 0 blocks where the etherscan api call did not work


In [15]:
# quick cleaning for later
df_eth['blockNumber'] = df_eth['blockNumber'].astype(int)

#### merge etherscan results to main

In [16]:
df_main = df_main.merge(df_eth, left_on = 'block_number', right_on = 'blockNumber', how ='outer')
df_main.columns

Index(['block_number', ' user_tx', ' fees', 'accessList', 'blockHash',
       'blockNumber_x', 'chainId', 'from', 'gas', 'gasPrice', 'hash', 'input',
       'maxFeePerGas', 'maxPriorityFeePerGas', 'nonce', 'r', 's', 'to',
       'transactionIndex', 'type', 'v', 'value', 'yParity', 'blockNumber_y',
       'timeStamp', 'blockMiner', 'blockReward', 'uncles',
       'uncleInclusionReward'],
      dtype='object')

#### save the results of etherscan 
Store the results in a file so that we do not have to rerun the calls if we need this data again


In [17]:
df_main.to_csv(f'data/intermediary/{name_of_incident}_etherscan.csv')

In [18]:
df_main = pd.read_csv(f'data/intermediary/{name_of_incident}_etherscan.csv', index_col=0)

### Tenderly !! This will take a few minutes
Here we finally run the simulation. Each transaction is simulated twice: once at its original index position, to get the token amounts actually transferred, and once at index position 0 (top of the block).
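
The two requests per transaction can be sketched as a small helper. `build_simulation_payloads` is a hypothetical function, the field names are copied from the request used in the cell below rather than from the Tenderly documentation, and the addresses and input in the example row are placeholders.

```python
def build_simulation_payloads(row: dict) -> list:
    """Build one payload per position: top of block (0) and the original index."""
    original_index = int(row["transactionIndex"], 0)
    # dict.fromkeys drops the duplicate when the transaction was already at index 0
    positions = list(dict.fromkeys([0, original_index]))
    return [
        {
            "network_id": int(row["chainId"], 0),
            "from": row["from"],
            "to": row["to"],
            "input": row["input"],
            "block_number": row["block_number"],
            "transaction_index": position,
            "simulation_type": "quick",
            "gas": int(row["gas"], 0),
            "value": int(row["value"], 0),
            "gas_price": int(row["gasPrice"], 0),
            "l1_timestamp": int(row["timeStamp"]),
        }
        for position in positions
    ]

example_row = {
    "chainId": "0x1", "from": "0x" + "11" * 20, "to": "0x" + "22" * 20,
    "input": "0x", "block_number": 19375342, "transactionIndex": "0x1",
    "gas": "0x2bfd7", "value": "0x0", "gasPrice": "0xeadb616ac",
    "timeStamp": 1709718599,
}
payloads = build_simulation_payloads(example_row)
print([p["transaction_index"] for p in payloads])
```

Comparing the asset changes returned for the two payloads is what yields the per-transaction delta later in the notebook.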

In [19]:
headers = {
    'X-Access-Key': f'{tenderly_access_token}',
    'content-type': 'application/json',
}

#creating an empty DataFrame for the results
columns = ['tx_hash', 'index', 'type', 'raw_amount', 'dollar_value', 'token_contract_address', 'token_name', 'token_dollar_value', 'from', 'to', 'sender', 'timestamp']
df_results = pd.DataFrame(columns = columns)


#iterating over every row of the main dataframe (one row is one transaction)
for index, row in df_main.iterrows():
    tx_index_list = [0]
    tx_index_list.append(int(row['transactionIndex'], 0))
    # for each transaction, simulate twice: once for each index
    for tx_index in tx_index_list:
        json_data = {
        'network_id': int(row['chainId'], 0),
        'from': row['from'],
        'to': row['to'],
        'input': row['input'],
        'block_number': row['block_number'],
        'transaction_index': tx_index,
        'simulation_type': 'quick',
        'gas': int(row['gas'], 0),
        'value': int(row['value'], 0),
        'gas_price': int(row['gasPrice'], 0),
        'l1_timestamp': int(row['timeStamp'])
        }
        
        response = requests.post(
        'https://api.tenderly.co/api/v1/account/aurelie2/project/cowswap2/simulate',
        headers=headers,
        json=json_data,
        )

        try:
            for data in response.json()['transaction']['transaction_info']['asset_changes']:
                tx_type = data['type']
                tx_raw_amount = data['raw_amount']
                tx_dollar_value = data['dollar_value']
                sender = response.json()['transaction']['from']

                #sometimes the following values are empty 
                try:
                    contract_address = data['token_info']['contract_address']
                except:
                    contract_address = 'None'

                try:
                    token_name = data['token_info']['name']
                except:
                    token_name = 'None'
                try:
                    token_dollar_value = data['token_info']['dollar_value']
                except:
                    token_dollar_value = 'None'

                try:
                    tx_from = data['from']
                except:
                    tx_from = 'None'

                try:
                    tx_to = data['to']
                except: 
                    tx_to = 'None'

                new_row = {
                    'tx_hash' : row['hash'],
                    'from':tx_from,
                    'to': tx_to,
                    'index' : tx_index, 
                    'type': tx_type, 
                    'raw_amount': tx_raw_amount, 
                    'dollar_value' : tx_dollar_value, 
                    'token_contract_address': contract_address, 
                    'token_name': token_name, 
                    'token_dollar_value': token_dollar_value,
                    'sender': sender, 
                    'timestamp': int(row['timeStamp'])
                    }
                df_results = pd.concat([df_results, pd.DataFrame([new_row])], ignore_index=True)
        except Exception:  # the simulation failed or the response has no asset_changes
            new_row = {'tx_hash or block':  row['hash'], 'where_problem_happened': 'tenderly'}
            problematic_transactions = pd.concat([problematic_transactions, pd.DataFrame([new_row])], ignore_index=True)
            


#### Create a new dataframe with only the transactions for which the Tenderly API returned something
For these good transactions, there are results for 2 distinct index values.
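
The filter in the next cell can be tried on a toy frame. In this sketch (hashes and indices are made up), `0xaaa` has results for two positions and is kept, while `0xbbb` has results for only one and is dropped:

```python
import pandas as pd

sim = pd.DataFrame({
    "tx_hash": ["0xaaa", "0xaaa", "0xaaa", "0xbbb"],
    "index":   [0, 0, 5, 0],
})

# Keep only transactions that have asset changes for two distinct positions.
complete = sim.groupby("tx_hash").filter(lambda g: g["index"].nunique() == 2)
print(complete)
```

Note that a transaction whose original position was already index 0 has only one distinct index value, so, as far as the code shows, it is also dropped by this filter.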

In [20]:
grouped = df_results.groupby('tx_hash')
df_results_good = grouped.filter(lambda x: x['index'].nunique() == 2)
df_results_good.head()

Unnamed: 0,tx_hash,index,type,raw_amount,dollar_value,token_contract_address,token_name,token_dollar_value,from,to,sender,timestamp
0,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,0,Mint,300000000000000000,965.1120117187501,0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2,WETH,3217.0400390625,,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,1709718599
1,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,0,Transfer,300000000000000000,965.1120117187501,0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2,WETH,3217.0400390625,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,0xa2fdb9b10af2d62d4baba5f165b781794428f385,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,1709718599
2,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,0,Transfer,109634497747398718977119,718.5828474254013,0x0026dfbd8dbb6f8d0c88303cc1b1596409fda542,SANSHU!,0.0065543497912585,0xa2fdb9b10af2d62d4baba5f165b781794428f385,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,1709718599
3,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,0,Transfer,300000000000000000,965.925,,Ethereum,3219.75,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,1709718599
4,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,0,Transfer,300000000000000000,965.925,,Ethereum,3219.75,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,1709718599


In [29]:
print(
    'number of transactions for which tenderly worked:', df_results_good.tx_hash.nunique(),
    "\nnumber of transactions for which tenderly did not work:", problematic_transactions[problematic_transactions['where_problem_happened'] == 'tenderly']['tx_hash or block'].nunique(),
    '\npercentage of properly simulated transactions with tenderly:', format(df_results_good.tx_hash.nunique()/(df_results_good.tx_hash.nunique() +  problematic_transactions[problematic_transactions['where_problem_happened'] == 'tenderly']['tx_hash or block'].nunique()), ".2%")
    )


number of transactions for which tenderly worked: 120 
number of transactions for which tenderly did not work: 10 
percentage of properly simulated transactions with tenderly: 92.31%


#### Save the tenderly results 

In [30]:
# save the transactions, with their simulated data, for which the tenderly api worked
df_results_good.to_csv(f'data/intermediary/{name_of_incident}_tenderly.csv')

#### Save the problematic transaction/block 

In [31]:
problematic_transactions.to_csv('data/results/transactions_or_blocks_with_api_problem.csv')

## Data Wrangling

In [191]:
problematic_transactions = pd.read_csv('data/results/transactions_or_blocks_with_api_problem.csv', index_col=0)

In [192]:
df_results_good = pd.read_csv(f'data/intermediary/{name_of_incident}_tenderly.csv', index_col=0)
print(f'we are now working with {df_results_good.tx_hash.nunique()} transactions that have been simulated')

we are now working with 120 transactions that have been simulated


### Some Cleanup and new columns
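Add some helper columns and rescale the raw amounts of certain tokens. Tenderly returns raw amounts in each token's smallest unit (for example wei for ETH), so for the tokens listed below we divide by 10 to the power of the token's decimals.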
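
The same rescaling can be written with a lookup table instead of one line per token. This is a sketch, not the notebook's code: `TOKEN_DECIMALS` is a hypothetical name whose values mirror the divisors used in the cell below, and tokens missing from the table are left unscaled, as in the notebook.

```python
import math
import pandas as pd

# Decimals per token name, matching the divisors used in this notebook.
TOKEN_DECIMALS = {
    "WETH": 18, "Ethereum": 18, "USDC": 6,
    "Tether": 6, "Dai": 18, "Wrapped Bitcoin": 8,
}

df = pd.DataFrame({
    "token_name": ["WETH", "USDC", "SANSHU!"],
    "raw_amount": [300000000000000000.0, 2500000.0, 109634497747398718977119.0],
})

decimals = df["token_name"].map(TOKEN_DECIMALS)   # NaN for unknown tokens
scale = (10.0 ** decimals).fillna(1.0)            # unknown tokens keep their raw amount
df["amount"] = df["raw_amount"] / scale
print(df)
```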

In [193]:
df_results_good['raw_amount'] = df_results_good['raw_amount'].astype(float)
df_results_good['sender_is_involved'] = (df_results_good['from'] == df_results_good['sender']) |  (df_results_good['to'] == df_results_good['sender'])
df_results_good['sender_gave_this'] = (df_results_good['from'] == df_results_good['sender'])
df_results_good['token_name'] = df_results_good['token_name'].fillna(df_results_good['token_contract_address'])
df_results_good['token_contract_address'] = df_results_good['token_contract_address'].fillna(df_results_good['token_name'])
df_results_good['dollar_value_net']  = np.where(df_results_good['sender_gave_this'], - df_results_good['dollar_value'], df_results_good['dollar_value'])

df_results_good.loc[df_results_good['token_name'] == 'WETH', 'raw_amount'] /= 1e+18
df_results_good.loc[df_results_good['token_name'] == 'Ethereum', 'raw_amount'] /= 1e+18
df_results_good.loc[df_results_good['token_name'] == 'USDC', 'raw_amount'] /= 1000000
df_results_good.loc[df_results_good['token_name'] == 'Tether', 'raw_amount'] /= 1000000
df_results_good.loc[df_results_good['token_name'] == 'Dai', 'raw_amount'] /= 1e+18
df_results_good.loc[df_results_good['token_name'] == 'Wrapped Bitcoin', 'raw_amount'] /= 1e+8

### get only the part of the transactions where the sender was directly involved

In [194]:
df_results_clean = df_results_good.copy()
df_senders = df_results_clean[df_results_clean['sender_is_involved']]
print(
    'out of the', df_results_clean.tx_hash.nunique(), 'transactions,', 
    'we could identify the sender for ', df_senders.tx_hash.nunique(), 'of them',
    )

out of the 120 transactions, we could identify the sender for  118 of them


#### Add the transactions for which we cannot identify a sender to the problematic transactions df

In [195]:
transaction_with_no_senders = set(df_results_clean.tx_hash.unique()) - set(df_senders.tx_hash.unique())
for transaction in transaction_with_no_senders:
    new_row = {'tx_hash or block':  transaction, 'where_problem_happened': 'could not identify a sender'}
    problematic_transactions = pd.concat([problematic_transactions, pd.DataFrame([new_row])], ignore_index=True)

problematic_transactions.tail()

Unnamed: 0,tx_hash or block,where_problem_happened
9,0xdcff2a7cb06cf1806a7f4dcb063356632a41e7d690ba...,tenderly
10,0xdf7e7bba421b1095029934e0810c4b30c0ec05056747...,tenderly
11,0x1a2e2817a4dcf4a74c6ee4ead753fcedfc773c33bdba...,tenderly
12,0x0c31c90ef634e3396fde9640ecc470c502aef98501d2...,could not identify a sender
13,0x3f334bd0902a6a8e287c173aebd8c8b0a90031d90d6b...,could not identify a sender


### For each sender, aggregate the value of the tokens that are swapped

In [260]:
grouped = df_senders.groupby(['tx_hash', 'index', 'token_name', 'token_contract_address', 'sender_is_involved', 'sender_gave_this'])
aggregated_senders = grouped.agg(
    sum_raw_amount=('raw_amount', 'sum'),               
    sum_dollar_value=('dollar_value', 'sum'),            
    token_dollar_value=('token_dollar_value', 'mean')  # all values are the same so first or mean is fine
).reset_index()
aggregated_senders.head() # one row per tx per token per index

Unnamed: 0,tx_hash,index,token_name,token_contract_address,sender_is_involved,sender_gave_this,sum_raw_amount,sum_dollar_value,token_dollar_value
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0,0xGasless,0x5fc111f3fa4c6b32eaf65659cfebdeed57234069,True,False,3.967761e+22,13389.170411,0.337449
1,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0,Hemule,0xeaa63125dd63f10874f99cdbbb18410e7fc79dd3,True,True,2.946374e+23,5302.383308,0.017996
2,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,1,0xGasless,0x5fc111f3fa4c6b32eaf65659cfebdeed57234069,True,False,3.690018e+22,12451.92919,0.337449
3,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,1,Hemule,0xeaa63125dd63f10874f99cdbbb18410e7fc79dd3,True,True,2.946374e+23,5302.383308,0.017996
4,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,0,Ethereum,Ethereum,True,True,2.0,6439.5,3219.75


### Calculate the net amounts for each sender (positive if the sender paid in this token, negative if the sender received the token)

In [261]:
aggregated_senders['raw_amount_net'] = aggregated_senders.apply(lambda row: row['sum_raw_amount'] if row['sender_gave_this'] else -row['sum_raw_amount'], axis=1)
aggregated_senders['dollar_value_net'] = aggregated_senders.apply(lambda row: row['sum_dollar_value'] if row['sender_gave_this'] else -row['sum_dollar_value'], axis=1)

aggregated_senders.head()


Unnamed: 0,tx_hash,index,token_name,token_contract_address,sender_is_involved,sender_gave_this,sum_raw_amount,sum_dollar_value,token_dollar_value,raw_amount_net,dollar_value_net
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0,0xGasless,0x5fc111f3fa4c6b32eaf65659cfebdeed57234069,True,False,3.967761e+22,13389.170411,0.337449,-3.967761e+22,-13389.170411
1,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0,Hemule,0xeaa63125dd63f10874f99cdbbb18410e7fc79dd3,True,True,2.946374e+23,5302.383308,0.017996,2.946374e+23,5302.383308
2,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,1,0xGasless,0x5fc111f3fa4c6b32eaf65659cfebdeed57234069,True,False,3.690018e+22,12451.92919,0.337449,-3.690018e+22,-12451.92919
3,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,1,Hemule,0xeaa63125dd63f10874f99cdbbb18410e7fc79dd3,True,True,2.946374e+23,5302.383308,0.017996,2.946374e+23,5302.383308
4,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,0,Ethereum,Ethereum,True,True,2.0,6439.5,3219.75,2.0,6439.5


In [262]:
# Group by 'tx_hash', 'index', and 'token_name' and aggregate to get one row per group
grouped_df = aggregated_senders.groupby(['tx_hash', 'index', 'token_name', 'token_dollar_value'], as_index=False, dropna=False).agg({
    'raw_amount_net': 'sum',
    'dollar_value_net': 'sum'
})
grouped_df.head()

Unnamed: 0,tx_hash,index,token_name,token_dollar_value,raw_amount_net,dollar_value_net
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0,0xGasless,0.337449,-3.967761e+22,-13389.170411
1,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0,Hemule,0.017996,2.946374e+23,5302.383308
2,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,1,0xGasless,0.337449,-3.690018e+22,-12451.92919
3,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,1,Hemule,0.017996,2.946374e+23,5302.383308
4,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,0,Ethereum,3219.75,2.0,6439.5


### Add all the tokens where the sender is involved
Keep only the transactions in which the sender exchanged exactly 2 tokens.
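
As an alternative sketch of the two-token filter (not the notebook's implementation), the number of distinct tokens per transaction can be computed with `transform`, which avoids building an intermediate list column. The hashes and token names are made up.

```python
import pandas as pd

net = pd.DataFrame({
    "tx_hash":    ["0xaaa", "0xaaa", "0xbbb", "0xccc", "0xccc", "0xccc"],
    "token_name": ["Ethereum", "TokenX", "Ethereum", "Ethereum", "TokenY", "TokenZ"],
})

# Number of distinct tokens per transaction, aligned with the original rows.
token_count = net.groupby("tx_hash")["token_name"].transform("nunique")

# Keep simple swaps only: exactly two tokens touch the sender.
two_token_rows = net[token_count == 2].copy()
print(two_token_rows)
```

Transactions with one token or with three or more are excluded here, which corresponds to the rows the notebook later records as problematic.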

In [263]:
tokens_per_tx = grouped_df.groupby(['tx_hash'])['token_name'].unique().reset_index()
normal_transactions = tokens_per_tx[tokens_per_tx['token_name'].apply(lambda x: len(x) == 2)].copy()  # copy to avoid SettingWithCopyWarning
normal_transactions[['token_name_A', 'token_name_B']] = pd.DataFrame(normal_transactions['token_name'].tolist(), index=normal_transactions.index)
normal_transactions = normal_transactions.drop(columns=['token_name'])
tokens_df = pd.merge(normal_transactions, grouped_df, on='tx_hash', how = 'left')
tokens_df.head()

Unnamed: 0,tx_hash,token_name_A,token_name_B,index,token_name,token_dollar_value,raw_amount_net,dollar_value_net
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0xGasless,Hemule,0,0xGasless,0.337449,-3.967761e+22,-13389.170411
1,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0xGasless,Hemule,0,Hemule,0.017996,2.946374e+23,5302.383308
2,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0xGasless,Hemule,1,0xGasless,0.337449,-3.690018e+22,-12451.92919
3,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0xGasless,Hemule,1,Hemule,0.017996,2.946374e+23,5302.383308
4,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Ethereum,Monai,0,Ethereum,3219.75,2.0,6439.5


#### Add the transactions in which the sender is not involved with exactly two tokens to the problematic transactions df

In [264]:
transaction_without_two_tokens = set(df_senders.tx_hash.unique()) - set(tokens_df.tx_hash.unique()) 
for transaction in transaction_without_two_tokens:
    new_row = {'tx_hash or block':  transaction, 'where_problem_happened': 'only one token or more than two'}
    problematic_transactions = pd.concat([problematic_transactions, pd.DataFrame([new_row])], ignore_index=True)

problematic_transactions.tail()

Unnamed: 0,tx_hash or block,where_problem_happened
17,0xc28a601d732927a32c5b9faaad082738b6778b556a43...,only one token or more than two
18,0x9d20ab61100de7641c2baa7627f6ca4833abcb242d73...,only one token or more than two
19,0x5c9ef032861b9cda8c047f3dd64713647b156074da88...,only one token or more than two
20,0xfbfcc2d4c15d11e6e73530fa49e06f0b4810bc2cc156...,only one token or more than two
21,0xc28a601d732927a32c5b9faaad082738b6778b556a43...,only one token or more than two


### Calculate the delta in tokens 
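
The cells below build the delta through several merges. The same idea can be sketched more compactly with a pivot, shown here on made-up numbers. The sign convention follows the code above: received amounts are negative and paid amounts are positive, and the delta is the top-of-block value minus the value at the actual position. The `position` column is a helper introduced only for this example.

```python
import pandas as pd

net = pd.DataFrame({
    "tx_hash": ["0xaaa"] * 4,
    "index": [0, 0, 7, 7],
    "token_name": ["TokenA", "TokenB", "TokenA", "TokenB"],
    # Sender receives TokenA (negative) and pays TokenB (positive).
    "raw_amount_net": [-110.0, 2.0, -100.0, 2.0],
})

# Label each simulation as top-of-block or actual position.
net["position"] = net["index"].map(lambda i: "top" if i == 0 else "actual")

# One row per transaction and token, one column per position.
wide = net.pivot_table(index=["tx_hash", "token_name"], columns="position",
                       values="raw_amount_net", aggfunc="sum")
wide["delta"] = wide["top"] - wide["actual"]
print(wide)
```

In this toy case the TokenA delta is -10: the sender would have received 110 TokenA at the top of the block but received only 100 at the actual position, so the leak cost 10 TokenA.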

In [265]:
#create two dataframes, for the token A and B at index 0
token_A_df_0 = tokens_df[(tokens_df['token_name'] == tokens_df['token_name_A']) & (tokens_df['index'] == 0)][['tx_hash', 'token_dollar_value', 'dollar_value_net', 'raw_amount_net', 'token_name_A']]
token_B_df_0 = tokens_df[(tokens_df['token_name'] == tokens_df['token_name_B']) & (tokens_df['index'] == 0)][['tx_hash', 'token_dollar_value', 'dollar_value_net', 'raw_amount_net', 'token_name_B']]

token_A_df_0.rename(columns = {'token_dollar_value': 'token_A_dollar_value_index_0', 'dollar_value_net' : 'token_A_tx_dollar_value_index_0', 'raw_amount_net' : 'token_A_raw_amount_index_0'}, inplace = True)
token_B_df_0.rename(columns = {'token_dollar_value': 'token_B_dollar_value_index_0', 'dollar_value_net' : 'token_B_tx_dollar_value_index_0', 'raw_amount_net' : 'token_B_raw_amount_index_0'}, inplace = True)

token_A_df_0 = token_A_df_0.drop_duplicates()
token_B_df_0 = token_B_df_0.drop_duplicates()

In [266]:
#create two dataframes, for the token A and B at index other than 0
token_A_df_other = tokens_df[(tokens_df['token_name'] == tokens_df['token_name_A']) & (tokens_df['index'] != 0)][['tx_hash', 'token_dollar_value','dollar_value_net', 'raw_amount_net', 'token_name_A']]
token_B_df_other = tokens_df[(tokens_df['token_name'] == tokens_df['token_name_B']) & (tokens_df['index'] != 0)][['tx_hash', 'token_dollar_value', 'dollar_value_net', 'raw_amount_net',  'token_name_B']]

token_A_df_other.rename(columns = {'token_dollar_value': 'token_A_dollar_value_index_other', 'dollar_value_net' : 'token_A_tx_dollar_value_index_other', 'raw_amount_net' : 'token_A_raw_amount_index_other'}, inplace = True)
token_B_df_other.rename(columns = {'token_dollar_value': 'token_B_dollar_value_index_other', 'dollar_value_net' : 'token_B_tx_dollar_value_index_other', 'raw_amount_net' : 'token_B_raw_amount_index_other'}, inplace = True)

token_A_df_other = token_A_df_other.drop_duplicates()
token_B_df_other = token_B_df_other.drop_duplicates()

In [267]:
# merge the 4 dataframes to have one with all the data of the tx with one row per tx
#token_B_df_other_grouped, token_B_df_0_grouped, token_A_df_other_grouped, token_A_df_0_grouped
merged_df = pd.merge(token_B_df_other, token_B_df_0, on=['tx_hash', 'token_name_B'])
merged_df = pd.merge(merged_df, token_A_df_other, on=['tx_hash'])
merged_df = pd.merge(merged_df, token_A_df_0, on=['tx_hash', 'token_name_A'])
merged_df.columns 


Index(['tx_hash', 'token_B_dollar_value_index_other',
       'token_B_tx_dollar_value_index_other', 'token_B_raw_amount_index_other',
       'token_name_B', 'token_B_dollar_value_index_0',
       'token_B_tx_dollar_value_index_0', 'token_B_raw_amount_index_0',
       'token_A_dollar_value_index_other',
       'token_A_tx_dollar_value_index_other', 'token_A_raw_amount_index_other',
       'token_name_A', 'token_A_dollar_value_index_0',
       'token_A_tx_dollar_value_index_0', 'token_A_raw_amount_index_0'],
      dtype='object')

In [268]:
def calculate_difference(row, column_0, column_other):
    return row[column_0] - row[column_other]

In [269]:
# add new columns with the delta in tokens and in dollars
merged_df['token_A_delta_raw_amount'] = merged_df.apply(lambda row: calculate_difference(row, 'token_A_raw_amount_index_0', 'token_A_raw_amount_index_other'), axis=1)
merged_df['token_B_delta_raw_amount'] = merged_df.apply(lambda row: calculate_difference(row, 'token_B_raw_amount_index_0', 'token_B_raw_amount_index_other'), axis=1)

merged_df['token_A_delta_dollar'] = merged_df.apply(lambda row: calculate_difference(row, 'token_A_tx_dollar_value_index_0', 'token_A_tx_dollar_value_index_other'), axis=1)
merged_df['token_B_delta_dollar'] = merged_df.apply(lambda row: calculate_difference(row, 'token_B_tx_dollar_value_index_0', 'token_B_tx_dollar_value_index_other'), axis=1)
merged_df.columns

Index(['tx_hash', 'token_B_dollar_value_index_other',
       'token_B_tx_dollar_value_index_other', 'token_B_raw_amount_index_other',
       'token_name_B', 'token_B_dollar_value_index_0',
       'token_B_tx_dollar_value_index_0', 'token_B_raw_amount_index_0',
       'token_A_dollar_value_index_other',
       'token_A_tx_dollar_value_index_other', 'token_A_raw_amount_index_other',
       'token_name_A', 'token_A_dollar_value_index_0',
       'token_A_tx_dollar_value_index_0', 'token_A_raw_amount_index_0',
       'token_A_delta_raw_amount', 'token_B_delta_raw_amount',
       'token_A_delta_dollar', 'token_B_delta_dollar'],
      dtype='object')
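
The row-wise `apply` above works, but the same deltas can also be computed with plain column arithmetic, which is usually faster on larger frames. A minimal sketch on a toy frame (the values are invented and carry no meaning about gains or losses):

```python
import pandas as pd

# Toy frame with the same column naming as merged_df (values are made up)
toy = pd.DataFrame({
    'tx_hash': ['0xaaa', '0xbbb'],
    'token_A_raw_amount_index_0': [10.0, 5.0],
    'token_A_raw_amount_index_other': [12.0, 5.0],
})

# Column-wise subtraction: index 0 minus the other index, as in calculate_difference
toy['token_A_delta_raw_amount'] = (
    toy['token_A_raw_amount_index_0'] - toy['token_A_raw_amount_index_other']
)
```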

In [270]:
nearly_final = merged_df[['tx_hash','token_name_A', 'token_A_delta_raw_amount', 'token_A_delta_dollar', 'token_name_B',  'token_B_delta_raw_amount', 'token_B_delta_dollar']]
nearly_final = nearly_final.rename(columns = {'token_A_delta_dollar' : 'token_A_delta_dollar_tenderly', 'token_B_delta_dollar' : 'token_B_delta_dollar_tenderly'})
nearly_final.head()  # one row per transaction with the delta in raw amounts, and also in tenderly dollars

Unnamed: 0,tx_hash,token_name_A,token_A_delta_raw_amount,token_A_delta_dollar_tenderly,token_name_B,token_B_delta_raw_amount,token_B_delta_dollar_tenderly
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0xGasless,-2.777431e+21,-937.241221,Hemule,0.0,0.0
1,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Ethereum,0.0,0.0,Monai,-6.152742e+20,-163.221164
2,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,0x857ffc55b1aa61a7ff847c82072790cae73cd883,-8.028344e+17,0.0,Ethereum,0.0,0.0
3,0x054d9a64147c776a19391680e82077d31de60d84ef07...,Ethereum,0.0,0.0,MetaZero,-5.19562e+21,-654.866321
4,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,Ribbon Finance,-3.949858e+20,-466.08324,Tether,0.0,0.0


### Use the stable coin of the transaction to calculate the difference in dollars 
Tenderly provides a dollar value for each token, but that value is unreliable for "meme" tokens and other unstable tokens. Therefore, whenever the sender exchanged one of the reference tokens listed in `stable_coins`, we use it instead: we calculate the dollar worth of the transaction from the amount of this stable token and its dollar price, and then derive the worth of the unstable token from the ratio at which it was swapped for the stable token. Transactions without such a token keep the Tenderly value.
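
As a rough illustration of this idea, the sketch below prices an unstable token from the stable leg of a single swap. The helper name `implied_unit_price` and the token amount are made up for the example; the notebook does the equivalent column-wise with pandas.

```python
def implied_unit_price(stable_amount, stable_price_usd, other_amount):
    """Dollar price of one unit of the other token, implied by the swap itself."""
    tx_value_usd = abs(stable_amount) * stable_price_usd
    return tx_value_usd / other_amount

# Hypothetical swap: the sender gives 2 ETH (priced at $3219.75)
# and receives 8000 units of some unstable token.
tx_value = 2.0 * 3219.75
price = implied_unit_price(2.0, 3219.75, 8000.0)
```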

In [271]:
stable_coins = ['WETH', 'Ethereum', 'USDC', 'Tether', 'Wrapped Bitcoin', 'Dai']

In [272]:
# find the stable coin in order of preference of the transaction
def select_stable_coin(group):
    for coin in stable_coins:
        involved_row = group[(group['sender_is_involved'] == True) & (group['token_name'] == coin)]
        if not involved_row.empty:
            return coin
 
    return None  

# Group by 'tx_hash' and apply custom function to each group to find the stable coin for all the transactions
selected_stable_token = df_senders.groupby('tx_hash').apply(select_stable_coin).reset_index(name='selected_stable_coin')
selected_stable_token.head()


Unnamed: 0,tx_hash,selected_stable_coin
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,
1,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Ethereum
2,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,Ethereum
3,0x054d9a64147c776a19391680e82077d31de60d84ef07...,Ethereum
4,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,Tether
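
To make the order of preference concrete, here is a small sketch on a toy frame. The hashes and rows are invented, and the groups are iterated explicitly rather than passed to `groupby(...).apply`, which is only one possible way to write it.

```python
import pandas as pd

stable_coins = ['WETH', 'Ethereum', 'USDC', 'Tether', 'Wrapped Bitcoin', 'Dai']

def select_stable_coin(group):
    """Return the first coin of stable_coins that the sender exchanged, else None."""
    involved = group[group['sender_is_involved'] & group['token_name'].isin(stable_coins)]
    for coin in stable_coins:
        if (involved['token_name'] == coin).any():
            return coin
    return None

# Toy data: three made-up transactions
toy = pd.DataFrame({
    'tx_hash': ['0x01', '0x01', '0x02', '0x02', '0x03', '0x03'],
    'token_name': ['Tether', 'Ethereum', 'USDC', 'Monai', 'Hemule', '0xGasless'],
    'sender_is_involved': [True, True, True, True, True, True],
})

selected = {tx: select_stable_coin(group) for tx, group in toy.groupby('tx_hash')}
```

In this toy data `0x01` contains both Tether and Ethereum, and Ethereum wins because it comes earlier in `stable_coins`; `0x03` has no stable coin at all and maps to `None`.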


In [273]:
# Create two dataframes with the dollar values and raw amounts of the stable coins at index 0 and the original index of the transaction
stable_coins_df_index_0 = df_senders[(df_senders['token_name'].isin(stable_coins)) & (df_senders['index'] == 0) & (df_senders['raw_amount'] != 0)][['tx_hash', 'token_name', 'dollar_value_net', 'raw_amount', 'sender_is_involved', 'sender_gave_this']]
stable_coins_df_index_0 = stable_coins_df_index_0.groupby(['tx_hash', 'token_name'])[['dollar_value_net', 'raw_amount']].sum().reset_index()


stable_coins_df_index_other = df_senders[(df_senders['token_name'].isin(stable_coins)) & (df_senders['index'] != 0) & (df_senders['raw_amount'] != 0)][['tx_hash', 'token_name', 'token_dollar_value', 'dollar_value_net', 'raw_amount', 'sender_gave_this']]
stable_coins_df_index_other = stable_coins_df_index_other.groupby(['tx_hash', 'token_name'])[['dollar_value_net', 'raw_amount']].sum().reset_index()
stable_coins_df_index_other.head()

Unnamed: 0,tx_hash,token_name,dollar_value_net,raw_amount
0,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Ethereum,-6439.5,2.0
1,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,Ethereum,-3219.75,1.0
2,0x054d9a64147c776a19391680e82077d31de60d84ef07...,Ethereum,-11269.125,3.5
3,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,Tether,99999.302626,100000.0
4,0x0dd4f1b0148c804994264c891276e69004e1b4bd7bc3...,Ethereum,-2253.825,0.7
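
When several conditions are combined into one mask, each comparison needs its own parentheses, because in Python `&` binds more tightly than `!=` and `==`. A minimal sketch on an invented frame:

```python
import pandas as pd

stable_coins = ['Ethereum']

# Toy rows: one stable coin row with a zero amount, one non-stable token
toy = pd.DataFrame({
    'token_name': ['Ethereum', 'Ethereum', 'Monai'],
    'index': [0, 0, 0],
    'raw_amount': [2.0, 0.0, 5.0],
})

# Every comparison is parenthesised before being combined with &
mask = (
    toy['token_name'].isin(stable_coins)
    & (toy['index'] == 0)
    & (toy['raw_amount'] != 0)
)
filtered = toy[mask]
```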


In [274]:
# two dataframes with one row per tx_hash with the value of the stable token if it was exchanged by the sender
selected_stable_coins_df_index_0 = pd.merge(selected_stable_token, stable_coins_df_index_0, on = ['tx_hash'])
selected_stable_coins_df_index_0 = selected_stable_coins_df_index_0.drop('token_name', axis = 1)
selected_stable_coins_df_index_0 = selected_stable_coins_df_index_0.rename(columns = {'dollar_value_net' : 'tx_stable_dollar_value_index_0_net', 'raw_amount' : 'stable_raw_amount_index_0'})


selected_stable_coins_df_index_other = pd.merge(selected_stable_token, stable_coins_df_index_other, on = ['tx_hash'])
selected_stable_coins_df_index_other = selected_stable_coins_df_index_other.drop('token_name', axis = 1)
selected_stable_coins_df_index_other = selected_stable_coins_df_index_other.rename(columns = { 'dollar_value_net' : 'tx_stable_dollar_value_index_other_net', 'raw_amount' : 'stable_raw_amount_index_other'})

selected_stable_coins_df_index_other.head() 

Unnamed: 0,tx_hash,selected_stable_coin,tx_stable_dollar_value_index_other_net,stable_raw_amount_index_other
0,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Ethereum,-6439.5,2.0
1,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,Ethereum,-3219.75,1.0
2,0x054d9a64147c776a19391680e82077d31de60d84ef07...,Ethereum,-11269.125,3.5
3,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,Tether,99999.302626,100000.0
4,0x0dd4f1b0148c804994264c891276e69004e1b4bd7bc3...,Ethereum,-2253.825,0.7


In [275]:
# create a dataframe where we merge the dataframes to get two rows per transaction with the amount of tokens

filtered_df_all = df_senders.groupby(['tx_hash', 'index', 'token_name', 'sender_gave_this'])['raw_amount'].sum().reset_index()
df_null_0 = filtered_df_all[(filtered_df_all['index'] == 0 )]

merged_null_0_all = pd.merge(df_null_0, selected_stable_coins_df_index_0, on=['tx_hash'], how = 'left')
merged_null_0_all = merged_null_0_all.drop('index', axis = 1)
merged_null_0_all['other_token_value_index_0'] = merged_null_0_all['tx_stable_dollar_value_index_0_net'].abs() / merged_null_0_all['raw_amount']
merged_null_0_all.head()

Unnamed: 0,tx_hash,token_name,sender_gave_this,raw_amount,selected_stable_coin,tx_stable_dollar_value_index_0_net,stable_raw_amount_index_0,other_token_value_index_0
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0xGasless,False,3.967761e+22,,,,
1,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,Hemule,True,2.946374e+23,,,,
2,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Ethereum,True,2.0,Ethereum,-6439.5,2.0,3219.75
3,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Monai,False,6.971418e+21,Ethereum,-6439.5,2.0,9.237001e-19
4,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,0x857ffc55b1aa61a7ff847c82072790cae73cd883,False,1.68596e+19,Ethereum,-3219.75,1.0,1.909743e-16


In [276]:
# calculate the transaction and net transaction dollar value

calculated_dollar_df_all = pd.merge(filtered_df_all, merged_null_0_all.drop('raw_amount', axis = 1), on=['tx_hash', 'token_name', 'sender_gave_this'], how = 'left')
calculated_dollar_df_all['calculated_tx_dollar_value'] = calculated_dollar_df_all['raw_amount']* calculated_dollar_df_all['other_token_value_index_0']
calculated_dollar_df_all['calculated_tx_dollar_value_net']  = np.where(calculated_dollar_df_all['sender_gave_this'], - calculated_dollar_df_all['calculated_tx_dollar_value'], calculated_dollar_df_all['calculated_tx_dollar_value'])
calculated_dollar_df_all.tail()

Unnamed: 0,tx_hash,index,token_name,sender_gave_this,raw_amount,selected_stable_coin,tx_stable_dollar_value_index_0_net,stable_raw_amount_index_0,other_token_value_index_0,calculated_tx_dollar_value,calculated_tx_dollar_value_net
461,0xfbfcc2d4c15d11e6e73530fa49e06f0b4810bc2cc156...,5,USDC,False,1000.0,USDC,1000.0,1000.0,1.0,1000.0,1000.0
462,0xfc4054990dfc6415785d866463cd4d9ae32c37932127...,0,Ethereum,False,3.126788,Ethereum,10067.474338,3.126788,3219.75,10067.474338,10067.474338
463,0xfc4054990dfc6415785d866463cd4d9ae32c37932127...,0,ZynCoin,True,1.9e+23,Ethereum,10067.474338,3.126788,5.298671e-20,10067.474338,-10067.474338
464,0xfc4054990dfc6415785d866463cd4d9ae32c37932127...,3,Ethereum,False,3.095869,Ethereum,10067.474338,3.126788,3219.75,9967.924351,9967.924351
465,0xfc4054990dfc6415785d866463cd4d9ae32c37932127...,3,ZynCoin,True,1.9e+23,Ethereum,10067.474338,3.126788,5.298671e-20,10067.474338,-10067.474338


In [277]:
final_test_all = pd.merge(df_senders, calculated_dollar_df_all, on = ['tx_hash', 'index', 'token_name', 'sender_gave_this'], how = 'left')
final_test_all['dollar_value_net']  = np.where(final_test_all['sender_gave_this'], - final_test_all['dollar_value'], final_test_all['dollar_value'])
final_test_all.head()

Unnamed: 0,tx_hash,index,type,raw_amount_x,dollar_value,token_contract_address,token_name,token_dollar_value,from,to,...,sender_is_involved,sender_gave_this,dollar_value_net,raw_amount_y,selected_stable_coin,tx_stable_dollar_value_index_0_net,stable_raw_amount_index_0,other_token_value_index_0,calculated_tx_dollar_value,calculated_tx_dollar_value_net
0,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,0,Transfer,1.096345e+23,718.582847,0x0026dfbd8dbb6f8d0c88303cc1b1596409fda542,SANSHU!,0.006554,0xa2fdb9b10af2d62d4baba5f165b781794428f385,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,...,True,False,718.582847,1.096345e+23,Ethereum,-965.925,0.3,8.810411e-21,965.925,965.925
1,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,0,Transfer,0.3,965.925,Ethereum,Ethereum,3219.75,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,...,True,True,-965.925,0.3,Ethereum,-965.925,0.3,3219.75,965.925,-965.925
2,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,1,Transfer,7.313843e+22,479.37485,0x0026dfbd8dbb6f8d0c88303cc1b1596409fda542,SANSHU!,0.006554,0xa2fdb9b10af2d62d4baba5f165b781794428f385,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,...,True,False,479.37485,7.313843e+22,Ethereum,-965.925,0.3,8.810411e-21,644.379633,644.379633
3,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,1,Transfer,0.3,965.925,Ethereum,Ethereum,3219.75,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,...,True,True,-965.925,0.3,Ethereum,-965.925,0.3,3219.75,965.925,-965.925
4,0x202fc38a52652a0c49927c1771de43939b47e083ba1c...,0,Transfer,7.359291e+23,,0x7f3b4b68ca0238f387d8b1a8fbc002d0e6d4cd5b,0x7f3b4b68ca0238f387d8b1a8fbc002d0e6d4cd5b,,0x77314da6f40f71c3a850c89e1a05c438a0acd405,0x4b882b9c26b3b3afd13307b4ab79ea4ec35e878e,...,True,True,,7.359291e+23,Ethereum,2936.876355,0.912144,3.990706e-21,2936.876355,-2936.876355


In [278]:
def prio_calculated(row):
    if not pd.isnull(row['calculated_tx_dollar_value_net']):
        return row['calculated_tx_dollar_value_net']
    else:
        return row['dollar_value_net']

# Apply the function to create the new column

final_test_all['calculated_dollar_value_priority'] = final_test_all.apply(prio_calculated, axis=1)

results = final_test_all.groupby(['tx_hash', 'index', 'token_name', 'sender_gave_this'])[['calculated_dollar_value_priority']].sum().reset_index()
results.head()

Unnamed: 0,tx_hash,index,token_name,sender_gave_this,calculated_dollar_value_priority
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0,0xGasless,False,13389.170411
1,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0,Hemule,True,-5302.383308
2,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,1,0xGasless,False,12451.92919
3,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,1,Hemule,True,-5302.383308
4,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,0,Ethereum,True,-6439.5
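
The priority rule in `prio_calculated` (use the calculated value, fall back to the Tenderly value when it is missing) can also be written without a row-wise `apply`. A sketch on a toy frame with invented numbers:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'calculated_tx_dollar_value_net': [965.925, np.nan],  # NaN: no stable coin found
    'dollar_value_net': [718.58, -120.0],                 # Tenderly-based value
})

# Keep the calculated value where present, otherwise take the Tenderly one
toy['calculated_dollar_value_priority'] = (
    toy['calculated_tx_dollar_value_net'].fillna(toy['dollar_value_net'])
)
```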


In [283]:
# Define a function to calculate differences in dollars of each coin
def calculate_differences(group):
    if len(group) == 2:
        difference_calculated = group.loc[group['index'] != 0, 'calculated_dollar_value_priority'].values[0] - \
                                group.loc[group['index'] == 0, 'calculated_dollar_value_priority'].values[0]
        return pd.Series({'difference_calculated': difference_calculated})
    else:
        return pd.Series({'difference_calculated': np.nan})

result = results.groupby(['tx_hash', 'token_name']).apply(calculate_differences).reset_index()

# for each tx there is the same amount of rows as tokens where the sender is involved (so 2 in case of normal transactions)
result.head()


Unnamed: 0,tx_hash,token_name,difference_calculated
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0xGasless,-937.241221
1,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,Hemule,0.0
2,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Ethereum,0.0
3,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Monai,-568.328817
4,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,0x857ffc55b1aa61a7ff847c82072790cae73cd883,-153.320721
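
As a check on the first transaction above, the same per-token difference can be reproduced by aligning the index-0 rows with the rows at the actual position. This is only a sketch: `tx_1` is a placeholder label, and it assumes exactly one non-zero index per transaction and token.

```python
import pandas as pd

# Dollar values copied from the `results` output above; 'tx_1' is a placeholder hash
rows = pd.DataFrame({
    'tx_hash': ['tx_1'] * 4,
    'index': [0, 0, 1, 1],
    'token_name': ['0xGasless', 'Hemule', '0xGasless', 'Hemule'],
    'calculated_dollar_value_priority': [13389.170411, -5302.383308,
                                         12451.92919, -5302.383308],
})

keys = ['tx_hash', 'token_name']
value = 'calculated_dollar_value_priority'
at_top = rows[rows['index'] == 0].set_index(keys)[value]
at_actual = rows[rows['index'] != 0].set_index(keys)[value]

# Same sign convention as calculate_differences: actual position minus top of block
difference = (at_actual - at_top).rename('difference_calculated').reset_index()
```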


In [284]:
#get the value of the transaction for each transaction
result.groupby('tx_hash')[['difference_calculated']].sum().reset_index()

Unnamed: 0,tx_hash,difference_calculated
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,-937.241221
1,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,-568.328817
2,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,-153.320721
3,0x054d9a64147c776a19391680e82077d31de60d84ef07...,-834.739871
4,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,-549.012222
...,...,...
113,0xf608e9543dd4951df24b8925e7b839a3a48e0dab91c2...,-6633.690459
114,0xf7681b8cc9849d0171bf84cc1af3cc9dce68de21792b...,-208.125521
115,0xfbab2abde357bed89e9a2aa1fb5f3e2919617849d061...,-233.801466
116,0xfbfcc2d4c15d11e6e73530fa49e06f0b4810bc2cc156...,0.000000


In [285]:
# collect the stable coin amounts of each tx at index 0 (used to calculate the delta in stable coins)
stable_coins_df_index_0_2 = df_senders[(df_senders['token_name'].isin(stable_coins)) & (df_senders['index'] == 0) & (df_senders['raw_amount'] != 0)][['tx_hash', 'token_name', 'dollar_value_net', 'raw_amount',  'sender_gave_this']]
stable_coins_df_index_0_2 = stable_coins_df_index_0_2.groupby(['tx_hash', 'token_name', 'sender_gave_this'])[['dollar_value_net', 'raw_amount']].sum().reset_index()
stable_coins_df_index_0_2.rename(columns= {'token_name': 'stable_token_name', 'dollar_value_net': 'stable_dollar_value_net', 'raw_amount': 'stable_raw_amount'}, inplace = True)
stable_coins_df_index_0_2.head()

Unnamed: 0,tx_hash,stable_token_name,sender_gave_this,stable_dollar_value_net,stable_raw_amount
0,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Ethereum,True,-6439.5,2.0
1,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,Ethereum,True,-3219.75,1.0
2,0x054d9a64147c776a19391680e82077d31de60d84ef07...,Ethereum,True,-11269.125,3.5
3,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,Tether,False,99999.302626,100000.0
4,0x0dd4f1b0148c804994264c891276e69004e1b4bd7bc3...,Ethereum,True,-2253.825,0.7


In [286]:
# collect the other (non-stable) coin amounts of each tx at index 0
other_coins_df_index_0_2 = df_senders[(df_senders['index'] == 0) & (~df_senders.token_name.isin(stable_coins))][['tx_hash', 'token_name', 'dollar_value_net', 'raw_amount',  'sender_gave_this']]
other_coins_df_index_0_2 = other_coins_df_index_0_2.groupby(['tx_hash', 'token_name', 'sender_gave_this'])[['dollar_value_net', 'raw_amount']].sum().reset_index()
other_coins_df_index_0_2.rename(columns= {'token_name': 'other_token_name', 'dollar_value_net': 'other_dollar_value_net', 'raw_amount': 'other_raw_amount'}, inplace = True)
other_coins_df_index_0_2.head()


Unnamed: 0,tx_hash,other_token_name,sender_gave_this,other_dollar_value_net,other_raw_amount
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0xGasless,False,13389.170411,3.967761e+22
1,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,Hemule,True,-5302.383308,2.946374e+23
2,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Monai,False,1849.391855,6.971418e+21
3,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,0x857ffc55b1aa61a7ff847c82072790cae73cd883,False,0.0,1.68596e+19
4,0x054d9a64147c776a19391680e82077d31de60d84ef07...,MetaZero,False,8840.802613,7.014172e+22


In [287]:
# get the ratios between other token and stable token
token_ratios = pd.merge(other_coins_df_index_0_2, stable_coins_df_index_0_2, on = ['tx_hash'])
token_ratios['ratio_other_to_stable_index_0'] = token_ratios['stable_raw_amount'] / token_ratios['other_raw_amount'] 
token_ratios = token_ratios[['tx_hash', 'other_token_name', 'stable_token_name', 'ratio_other_to_stable_index_0']]
token_ratios.head()

Unnamed: 0,tx_hash,other_token_name,stable_token_name,ratio_other_to_stable_index_0
0,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Monai,Ethereum,2.868857e-22
1,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,0x857ffc55b1aa61a7ff847c82072790cae73cd883,Ethereum,5.931339e-20
2,0x054d9a64147c776a19391680e82077d31de60d84ef07...,MetaZero,Ethereum,4.9898970000000004e-23
3,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,Ribbon Finance,Tether,1.389964e-18
4,0x0dd4f1b0148c804994264c891276e69004e1b4bd7bc3...,TAO INU,Ethereum,6.0513889999999996e-24


In [288]:
nearly_final_empty_A = nearly_final[(nearly_final['token_name_A'].isin(stable_coins)) & (nearly_final['token_A_delta_raw_amount'] == 0)]
nearly_final_empty_B = nearly_final[(nearly_final['token_name_B'].isin(stable_coins)) & (nearly_final['token_B_delta_raw_amount'] == 0)]
nearly_final_empty_A = pd.merge(nearly_final_empty_A, token_ratios, on = 'tx_hash')
nearly_final_empty_B = pd.merge(nearly_final_empty_B, token_ratios, on = 'tx_hash')
nearly_final_empty_A['token_A_delta_raw_amount'] = nearly_final_empty_A['token_B_delta_raw_amount'] * nearly_final_empty_A['ratio_other_to_stable_index_0']
nearly_final_empty_B['token_B_delta_raw_amount'] = nearly_final_empty_B['token_A_delta_raw_amount'] * nearly_final_empty_B['ratio_other_to_stable_index_0']

nearly_final_empty_A[['tx_hash', 'token_name_A', 'token_A_delta_raw_amount']].head()


Unnamed: 0,tx_hash,token_name_A,token_A_delta_raw_amount
0,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Ethereum,-0.176513
1,0x054d9a64147c776a19391680e82077d31de60d84ef07...,Ethereum,-0.259256
2,0x0dd4f1b0148c804994264c891276e69004e1b4bd7bc3...,Ethereum,-0.039622
3,0x1237263af074ecda6a6329f59aaf62148038ca8b41ac...,Ethereum,-0.308643
4,0x124213ce582eac769d8fecde46154dffd81fef4e081d...,Ethereum,-0.2
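
The first row of this output can be reproduced by hand from the values printed earlier for the Monai transaction: the exchange ratio at index 0 converts the delta in the unstable token into a delta in the stable token. The variable names below are only for illustration.

```python
# Amounts at index 0 for the Monai / Ethereum swap (from the outputs above)
stable_raw_amount = 2.0            # Ethereum given by the sender
other_raw_amount = 6.971418e+21    # Monai received, in raw units

# Stable tokens per raw unit of the other token
ratio = stable_raw_amount / other_raw_amount

# Delta in Monai between the two block positions (token_B_delta_raw_amount)
delta_other = -6.152742e+20

# Equivalent delta expressed in Ethereum
delta_stable = delta_other * ratio
```

Multiplying this delta by the Ethereum price of 3219.75 dollars gives roughly -568.33 dollars, in line with the dollar difference reported for that transaction.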


In [289]:
# merge them all to get the delta in all coins
delta_stable = pd.merge(nearly_final,nearly_final_empty_A[['tx_hash', 'token_name_A', 'token_name_B', 'token_A_delta_raw_amount']], on = ['tx_hash', 'token_name_A', 'token_name_B'], how = 'left')
delta_stable.loc[delta_stable['token_A_delta_raw_amount_x'] == 0, 'token_A_delta_raw_amount_x'] = delta_stable['token_A_delta_raw_amount_y']
delta_stable_2 =  pd.merge(delta_stable, nearly_final_empty_B[['tx_hash', 'token_name_A', 'token_name_B', 'token_B_delta_raw_amount']], on = ['tx_hash', 'token_name_A', 'token_name_B'], how = 'left')
delta_stable_2.loc[delta_stable_2['token_B_delta_raw_amount_x'] == 0, 'token_B_delta_raw_amount_x'] = delta_stable_2['token_B_delta_raw_amount_y']
delta_stable_2.head()

Unnamed: 0,tx_hash,token_name_A,token_A_delta_raw_amount_x,token_A_delta_dollar_tenderly,token_name_B,token_B_delta_raw_amount_x,token_B_delta_dollar_tenderly,token_A_delta_raw_amount_y,token_B_delta_raw_amount_y
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0xGasless,-2.777431e+21,-937.241221,Hemule,,0.0,,
1,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Ethereum,-0.1765133,0.0,Monai,-6.152742e+20,-163.221164,-0.176513,
2,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,0x857ffc55b1aa61a7ff847c82072790cae73cd883,-8.028344e+17,0.0,Ethereum,-0.04761883,0.0,,-0.047619
3,0x054d9a64147c776a19391680e82077d31de60d84ef07...,Ethereum,-0.2592561,0.0,MetaZero,-5.19562e+21,-654.866321,-0.259256,
4,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,Ribbon Finance,-3.949858e+20,-466.08324,Tether,-549.0161,0.0,,-549.016051


In [290]:
def select_stable_token(row, column_A, column_B):
    if row['token_name_A'] in stable_coins:
        return row[column_A]
    elif row['token_name_B'] in stable_coins:
        return row[column_B]
    else:
        return None  # Return None if neither token is in stable_coins


delta_stable_2['token_name_stable'] = delta_stable_2.apply(lambda row: select_stable_token(row, 'token_name_A', 'token_name_B'), axis=1)
delta_stable_2['token_delta_stable'] = delta_stable_2.apply(lambda row: select_stable_token(row, 'token_A_delta_raw_amount_x', 'token_B_delta_raw_amount_x'), axis=1)

delta_stable_2.head()

Unnamed: 0,tx_hash,token_name_A,token_A_delta_raw_amount_x,token_A_delta_dollar_tenderly,token_name_B,token_B_delta_raw_amount_x,token_B_delta_dollar_tenderly,token_A_delta_raw_amount_y,token_B_delta_raw_amount_y,token_name_stable,token_delta_stable
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,0xGasless,-2.777431e+21,-937.241221,Hemule,,0.0,,,,
1,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,Ethereum,-0.1765133,0.0,Monai,-6.152742e+20,-163.221164,-0.176513,,Ethereum,-0.176513
2,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,0x857ffc55b1aa61a7ff847c82072790cae73cd883,-8.028344e+17,0.0,Ethereum,-0.04761883,0.0,,-0.047619,Ethereum,-0.047619
3,0x054d9a64147c776a19391680e82077d31de60d84ef07...,Ethereum,-0.2592561,0.0,MetaZero,-5.19562e+21,-654.866321,-0.259256,,Ethereum,-0.259256
4,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,Ribbon Finance,-3.949858e+20,-466.08324,Tether,-549.0161,0.0,,-549.016051,Tether,-549.016051


In [291]:
dollar_diff = result.groupby('tx_hash')[['difference_calculated']].sum().reset_index()
stable_delta = pd.merge(dollar_diff, delta_stable_2, on = 'tx_hash', how = 'left')
stable_delta

Unnamed: 0,tx_hash,difference_calculated,token_name_A,token_A_delta_raw_amount_x,token_A_delta_dollar_tenderly,token_name_B,token_B_delta_raw_amount_x,token_B_delta_dollar_tenderly,token_A_delta_raw_amount_y,token_B_delta_raw_amount_y,token_name_stable,token_delta_stable
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,-937.241221,0xGasless,-2.777431e+21,-937.241221,Hemule,,0.000000,,,,
1,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,-568.328817,Ethereum,-1.765133e-01,0.000000,Monai,-6.152742e+20,-163.221164,-0.176513,,Ethereum,-0.176513
2,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,-153.320721,0x857ffc55b1aa61a7ff847c82072790cae73cd883,-8.028344e+17,0.000000,Ethereum,-4.761883e-02,0.000000,,-0.047619,Ethereum,-0.047619
3,0x054d9a64147c776a19391680e82077d31de60d84ef07...,-834.739871,Ethereum,-2.592561e-01,0.000000,MetaZero,-5.195620e+21,-654.866321,-0.259256,,Ethereum,-0.259256
4,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,-549.012222,Ribbon Finance,-3.949858e+20,-466.083240,Tether,-5.490161e+02,0.000000,,-549.016051,Tether,-549.016051
...,...,...,...,...,...,...,...,...,...,...,...,...
113,0xf608e9543dd4951df24b8925e7b839a3a48e0dab91c2...,-6633.690459,Ribbon Finance,-2.855250e+21,-3369.194805,USDC,-3.316845e+03,0.000000,,-3316.845230,USDC,-3316.845230
114,0xf7681b8cc9849d0171bf84cc1af3cc9dce68de21792b...,-208.125521,Ethereum,-6.464027e-02,-208.125521,Kendu Inu,,0.000000,,,Ethereum,-0.064640
115,0xfbab2abde357bed89e9a2aa1fb5f3e2919617849d061...,-233.801466,Ethereum,-7.261479e-02,-233.801466,Ribbon Finance,,0.000000,,,Ethereum,-0.072615
116,0xfbfcc2d4c15d11e6e73530fa49e06f0b4810bc2cc156...,0.000000,,,,,,,,,,


### Calculate the value in dollars based on the price of the stable coin

In [292]:
stable_value_index_0 = df_senders[(df_senders['index'] == 0) & (df_senders['token_name'].isin(stable_coins))][['tx_hash', 'token_name', 'token_dollar_value']]
stable_value_index_0.rename(columns = {'token_dollar_value' : 'stable_token_dollar_value_0', 'token_name' : 'token_name_stable'}, inplace = True)
stable_value_index_0

Unnamed: 0,tx_hash,token_name_stable,stable_token_dollar_value_0
3,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,Ethereum,3219.75
14,0x202fc38a52652a0c49927c1771de43939b47e083ba1c...,Ethereum,3219.75
32,0x4589dc3b7be6df22ed3657b3310bfff117329a0a7e68...,Ethereum,3219.75
41,0x153a70478d17e082740c30f9d5d20fbca5d298c34cc4...,Ethereum,3219.75
51,0xe2d9647da2db921969932a9050387703e6b317f61906...,Ethereum,3219.75
...,...,...,...
1302,0xc28a601d732927a32c5b9faaad082738b6778b556a43...,Ethereum,3219.75
1313,0x5d3b49daa767c190038a4c6a8ad66dea031ec1e2a2d5...,Ethereum,3219.75
1323,0x4ceedb9e5fc1b1912575173fdbba82e46499672888c2...,Ethereum,3219.75
1332,0xaf41b2b05c808213faad380a847a14442392bc72691b...,Ethereum,3219.75


In [293]:
stable_delta_with_dollar = pd.merge(stable_delta, stable_value_index_0, on = ['tx_hash', 'token_name_stable'], how = 'left')
def calculate_tx_dollar_loss(row):
    if row['token_name_stable'] is not None:
        return row['token_delta_stable'] * row['stable_token_dollar_value_0']

# Apply custom function to create 'delta_dollar' column
stable_delta_with_dollar['delta_dollar'] = stable_delta_with_dollar.apply(calculate_tx_dollar_loss, axis=1)
# Fall back to 'difference_calculated' when no stable coin delta is available
# (assignment instead of inplace=True, which pandas warns will stop working in 3.0)
stable_delta_with_dollar['delta_dollar'] = stable_delta_with_dollar['delta_dollar'].fillna(stable_delta_with_dollar['difference_calculated'])
stable_delta_with_dollar
#stable_delta_with_dollar.drop_duplicates().to_csv('yeah_results-2.csv')


Unnamed: 0,tx_hash,difference_calculated,token_name_A,token_A_delta_raw_amount_x,token_A_delta_dollar_tenderly,token_name_B,token_B_delta_raw_amount_x,token_B_delta_dollar_tenderly,token_A_delta_raw_amount_y,token_B_delta_raw_amount_y,token_name_stable,token_delta_stable,stable_token_dollar_value_0,delta_dollar
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,-937.241221,0xGasless,-2.777431e+21,-937.241221,Hemule,,0.000000,,,,,,-937.241221
1,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,-568.328817,Ethereum,-1.765133e-01,0.000000,Monai,-6.152742e+20,-163.221164,-0.176513,,Ethereum,-0.176513,3219.750000,-568.328817
2,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,-153.320721,0x857ffc55b1aa61a7ff847c82072790cae73cd883,-8.028344e+17,0.000000,Ethereum,-4.761883e-02,0.000000,,-0.047619,Ethereum,-0.047619,3219.750000,-153.320721
3,0x054d9a64147c776a19391680e82077d31de60d84ef07...,-834.739871,Ethereum,-2.592561e-01,0.000000,MetaZero,-5.195620e+21,-654.866321,-0.259256,,Ethereum,-0.259256,3219.750000,-834.739871
4,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,-549.012222,Ribbon Finance,-3.949858e+20,-466.083240,Tether,-5.490161e+02,0.000000,,-549.016051,Tether,-549.016051,0.999993,-549.012222
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
116,0xf608e9543dd4951df24b8925e7b839a3a48e0dab91c2...,-6633.690459,Ribbon Finance,-2.855250e+21,-3369.194805,USDC,-3.316845e+03,0.000000,,-3316.845230,USDC,-3316.845230,1.000000,-3316.845230
117,0xf7681b8cc9849d0171bf84cc1af3cc9dce68de21792b...,-208.125521,Ethereum,-6.464027e-02,-208.125521,Kendu Inu,,0.000000,,,Ethereum,-0.064640,3219.750000,-208.125521
118,0xfbab2abde357bed89e9a2aa1fb5f3e2919617849d061...,-233.801466,Ethereum,-7.261479e-02,-233.801466,Ribbon Finance,,0.000000,,,Ethereum,-0.072615,3219.750000,-233.801466
119,0xfbfcc2d4c15d11e6e73530fa49e06f0b4810bc2cc156...,0.000000,,,,,,,,,,,,0.000000
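
The frame printed above ends at row label 119, while `stable_delta` ended at 116, so the left merge with the price lookup has likely duplicated a few rows. One way to guard against this, sketched here on invented data, is to deduplicate the lookup and let `pd.merge` check the key cardinality with its `validate` argument.

```python
import pandas as pd

keys = ['tx_hash', 'token_name_stable']

# Toy deltas: one row per transaction (values are made up)
deltas = pd.DataFrame({
    'tx_hash': ['0x01', '0x02'],
    'token_name_stable': ['Ethereum', 'Tether'],
    'token_delta_stable': [-0.5, -10.0],
})

# Toy price lookup with a repeated key for 0x01
prices = pd.DataFrame({
    'tx_hash': ['0x01', '0x01', '0x02'],
    'token_name_stable': ['Ethereum', 'Ethereum', 'Tether'],
    'stable_token_dollar_value_0': [3219.75, 3219.75, 0.999993],
})

# Merging the raw lookup repeats the 0x01 row
naive = pd.merge(deltas, prices, on=keys, how='left')

# Deduplicate first; validate raises MergeError if the right keys are still not unique
unique_prices = prices.drop_duplicates()
checked = pd.merge(deltas, unique_prices, on=keys, how='left', validate='many_to_one')
```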


### Add the sender column

In [294]:
only_senders = df_senders[['tx_hash', 'sender']].drop_duplicates().reset_index(drop = True)
only_senders.head()

Unnamed: 0,tx_hash,sender
0,0xa9a1533c37d53d461be2821ca53bf04a426903809575...,0xf299dc09ec306e9ed207cdc1296ac6d0d9c5dc7c
1,0x202fc38a52652a0c49927c1771de43939b47e083ba1c...,0x77314da6f40f71c3a850c89e1a05c438a0acd405
2,0xee51506e07ace44eaad85041210d025ac46241526d2d...,0x1f7ea43d283d0ef906ee92ddead883a8f078cbc9
3,0x4589dc3b7be6df22ed3657b3310bfff117329a0a7e68...,0x4e6b065262e3504f2511ef5b8cadc039630803be
4,0x153a70478d17e082740c30f9d5d20fbca5d298c34cc4...,0xdd3d41d3817abe28519f4f5c0890e9c0f0cfe69b


In [295]:
stable_delta_with_dollar = pd.merge(stable_delta_with_dollar, only_senders, on = 'tx_hash')

### Add the token contract address columns

In [296]:
token_name_address = df_senders[['token_contract_address', 'token_name']].drop_duplicates()
token_name_address.head()

Unnamed: 0,token_contract_address,token_name
2,0x0026dfbd8dbb6f8d0c88303cc1b1596409fda542,SANSHU!
3,Ethereum,Ethereum
10,0x7f3b4b68ca0238f387d8b1a8fbc002d0e6d4cd5b,0x7f3b4b68ca0238f387d8b1a8fbc002d0e6d4cd5b
20,0x8ee325ae3e54e83956ef2d5952d3c8bc1fa6ec27,Fable Of The Dragon
23,0x15ee3f09712f4715904e1923c1ad504a673e88ac,0x15ee3f09712f4715904e1923c1ad504a673e88ac


In [297]:
# Merge for token_name_A
stable_delta_with_dollar = stable_delta_with_dollar.merge(
    token_name_address.rename(columns={'token_name': 'token_name_A', 'token_contract_address': 'token_contract_address_A'}),
    on='token_name_A', how='left')

# Merge for token_name_B
stable_delta_with_dollar = stable_delta_with_dollar.merge(
    token_name_address.rename(columns={'token_name': 'token_name_B', 'token_contract_address': 'token_contract_address_B'}),
    on='token_name_B', how='left')

stable_delta_with_dollar.head()

Unnamed: 0,tx_hash,difference_calculated,token_name_A,token_A_delta_raw_amount_x,token_A_delta_dollar_tenderly,token_name_B,token_B_delta_raw_amount_x,token_B_delta_dollar_tenderly,token_A_delta_raw_amount_y,token_B_delta_raw_amount_y,token_name_stable,token_delta_stable,stable_token_dollar_value_0,delta_dollar,sender,token_contract_address_A,token_contract_address_B
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,-937.241221,0xGasless,-2.777431e+21,-937.241221,Hemule,,0.0,,,,,,-937.241221,0xb15300f1eb79782eae04f10529adc0e1b85aa9aa,0x5fc111f3fa4c6b32eaf65659cfebdeed57234069,0xeaa63125dd63f10874f99cdbbb18410e7fc79dd3
1,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,-568.328817,Ethereum,-0.1765133,0.0,Monai,-6.152742e+20,-163.221164,-0.176513,,Ethereum,-0.176513,3219.75,-568.328817,0x725ef823a0c7ea654561e13796a9d81a9aa8398a,Ethereum,0x8c282c35b5e1088bb208991c151182a782637699
2,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,-153.320721,0x857ffc55b1aa61a7ff847c82072790cae73cd883,-8.028344e+17,0.0,Ethereum,-0.04761883,0.0,,-0.047619,Ethereum,-0.047619,3219.75,-153.320721,0xf10c20623790b8a92d1cb32d0e0b8a0384179130,0x857ffc55b1aa61a7ff847c82072790cae73cd883,Ethereum
3,0x054d9a64147c776a19391680e82077d31de60d84ef07...,-834.739871,Ethereum,-0.2592561,0.0,MetaZero,-5.19562e+21,-654.866321,-0.259256,,Ethereum,-0.259256,3219.75,-834.739871,0x941111f2be8ed9b4e0ce7cea556c8e1eee7077c2,Ethereum,0x328a268b191ef593b72498a9e8a481c086eb21be
4,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,-549.012222,Ribbon Finance,-3.949858e+20,-466.08324,Tether,-549.0161,0.0,,-549.016051,Tether,-549.016051,0.999993,-549.012222,0x9de09e96aa328cf6c0b4a67a62aaf464d084459e,0x6123b0049f904d730db3c36a31167d9d4121fa6b,0xdac17f958d2ee523a2206206994597c13d831ec7
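
The two merges above join on the token name rather than on the contract address, so a name shared by two contracts would silently duplicate rows. The following is a minimal sketch, not the notebook's own code, of how the `validate` argument of `DataFrame.merge` can surface that case. The helper name `attach_token_address` and the toy frames are assumptions made for illustration.

```python
import pandas as pd

def attach_token_address(df, lookup, side):
    """Hypothetical helper: add token_contract_address_<side> by joining on token_name_<side>."""
    renamed = lookup.rename(columns={
        'token_name': f'token_name_{side}',
        'token_contract_address': f'token_contract_address_{side}',
    })
    # validate='many_to_one' raises pandas.errors.MergeError if a token name
    # appears more than once in the lookup table.
    return df.merge(renamed, on=f'token_name_{side}', how='left', validate='many_to_one')

# Toy data for illustration only
swaps = pd.DataFrame({'tx_hash': ['0xaa', '0xbb'], 'token_name_A': ['Tether', 'Ethereum']})
lookup = pd.DataFrame({
    'token_contract_address': ['0xdac17f958d2ee523a2206206994597c13d831ec7', 'Ethereum'],
    'token_name': ['Tether', 'Ethereum'],
})

with_address = attach_token_address(swaps, lookup, 'A')
print(with_address)
```

If the check fails, one option is to deduplicate the lookup table on `token_name` before merging, or to carry the contract address through the pipeline and join on that instead.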


### Get the delta in ether using the WETH price nearest in time

In [298]:
#make sure the final df stable_delta_with_dollar has a timestamp column
timestamps_per_transaction = df_senders[['tx_hash', 'timestamp']].drop_duplicates()
stable_delta_with_dollar = pd.merge(stable_delta_with_dollar, timestamps_per_transaction, on = 'tx_hash')

In [299]:
#value of the weth token at the different timestamps of the dataset
weth_values = df_senders[df_senders['token_name'] == 'WETH'].groupby(['timestamp'])[['token_dollar_value']].mean().reset_index()
weth_values.rename(columns= {'token_dollar_value' : 'weth_dollar_value'}, inplace = True)
weth_values.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   timestamp          3 non-null      int64  
 1   weth_dollar_value  3 non-null      float64
dtypes: float64(1), int64(1)
memory usage: 180.0 bytes


In [300]:
# pd.merge_asof requires its key to be sorted, so sort the unique timestamps first
all_timestamps = pd.DataFrame({'timestamp': sorted(df_senders['timestamp'].unique())})
all_timestamps.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 106 entries, 0 to 105
Data columns (total 1 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   timestamp  106 non-null    int64
dtypes: int64(1)
memory usage: 980.0 bytes


In [301]:
#one df with all the timestamps and the weth dollar value for the nearest timestamp
weth_values_all = pd.merge_asof(all_timestamps, weth_values, on='timestamp', direction='nearest')
weth_values_all['weth_dollar_value'] = weth_values_all['weth_dollar_value'].astype(float)
weth_values_all.head()

Unnamed: 0,timestamp,weth_dollar_value
0,1709718599,3217.040039
1,1709718839,3217.040039
2,1709718971,3217.040039
3,1709719115,3217.040039
4,1709719163,3217.040039
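
Only a few timestamps carry a WETH price, so every transaction borrows the price from the nearest one. The sketch below shows the same nearest-match join on toy data. The `tolerance` of 3600 seconds is an assumption added here to show how matches that are too far away can be left empty; the notebook itself does not set one.

```python
import pandas as pd

# Toy data: two known WETH prices and three transaction timestamps (unsorted on purpose)
weth_prices = pd.DataFrame({'timestamp': [1000, 4600], 'weth_dollar_value': [3200.0, 3250.0]})
tx_times = pd.DataFrame({'timestamp': [4000, 1200, 9000]})

# merge_asof requires both frames to be sorted on the key
nearest_price = pd.merge_asof(
    tx_times.sort_values('timestamp'),
    weth_prices.sort_values('timestamp'),
    on='timestamp',
    direction='nearest',
    tolerance=3600,  # assumed limit: ignore prices more than one hour away
)
print(nearest_price)
```

Here the timestamp 9000 is 4400 seconds from the closest price, so its `weth_dollar_value` stays empty instead of borrowing a stale price.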


In [302]:
stable_delta_with_dollar = pd.merge(stable_delta_with_dollar, weth_values_all, on = 'timestamp')
#calculate the value of the tx loss in eth
stable_delta_with_dollar['delta_eth'] = stable_delta_with_dollar['delta_dollar'] / stable_delta_with_dollar['weth_dollar_value']
stable_delta_with_dollar.head()

Unnamed: 0,tx_hash,difference_calculated,token_name_A,token_A_delta_raw_amount_x,token_A_delta_dollar_tenderly,token_name_B,token_B_delta_raw_amount_x,token_B_delta_dollar_tenderly,token_A_delta_raw_amount_y,token_B_delta_raw_amount_y,token_name_stable,token_delta_stable,stable_token_dollar_value_0,delta_dollar,sender,token_contract_address_A,token_contract_address_B,timestamp,weth_dollar_value,delta_eth
0,0x02a5aed1bec0904ffe147e0e13cb029d4e4790e42dff...,-937.241221,0xGasless,-2.777431e+21,-937.241221,Hemule,,0.0,,,,,,-937.241221,0xb15300f1eb79782eae04f10529adc0e1b85aa9aa,0x5fc111f3fa4c6b32eaf65659cfebdeed57234069,0xeaa63125dd63f10874f99cdbbb18410e7fc79dd3,1709720003,3217.040039,-0.291337
1,0x03dd2c8d113eb60e10d571b99a419e4fb7e4f437e803...,-568.328817,Ethereum,-0.1765133,0.0,Monai,-6.152742e+20,-163.221164,-0.176513,,Ethereum,-0.176513,3219.75,-568.328817,0x725ef823a0c7ea654561e13796a9d81a9aa8398a,Ethereum,0x8c282c35b5e1088bb208991c151182a782637699,1709722667,3217.040039,-0.176662
2,0x04b1c431a72f7641fe176d132b2938dc24ca2ea2d522...,-153.320721,0x857ffc55b1aa61a7ff847c82072790cae73cd883,-8.028344e+17,0.0,Ethereum,-0.04761883,0.0,,-0.047619,Ethereum,-0.047619,3219.75,-153.320721,0xf10c20623790b8a92d1cb32d0e0b8a0384179130,0x857ffc55b1aa61a7ff847c82072790cae73cd883,Ethereum,1709732903,3217.040039,-0.047659
3,0x054d9a64147c776a19391680e82077d31de60d84ef07...,-834.739871,Ethereum,-0.2592561,0.0,MetaZero,-5.19562e+21,-654.866321,-0.259256,,Ethereum,-0.259256,3219.75,-834.739871,0x941111f2be8ed9b4e0ce7cea556c8e1eee7077c2,Ethereum,0x328a268b191ef593b72498a9e8a481c086eb21be,1709727623,3217.040039,-0.259475
4,0x067ecb28afdb4ebe732845b9321bd2c815bdda53d3ab...,-549.012222,Ribbon Finance,-3.949858e+20,-466.08324,Tether,-549.0161,0.0,,-549.016051,Tether,-549.016051,0.999993,-549.012222,0x9de09e96aa328cf6c0b4a67a62aaf464d084459e,0x6123b0049f904d730db3c36a31167d9d4121fa6b,0xdac17f958d2ee523a2206206994597c13d831ec7,1709724935,3217.040039,-0.170658
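
As a sanity check on the conversion, the first row of the table above can be recomputed by hand. This small sketch only reuses the two values printed for that row.

```python
# Values copied from row 0 of the table above
delta_dollar = -937.241221
weth_dollar_value = 3217.040039

# Loss in ether = loss in dollars divided by the dollar price of one WETH
delta_eth = delta_dollar / weth_dollar_value
print(round(delta_eth, 6))
```

The result agrees with the `delta_eth` value shown for that row, about -0.2913 ether.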


### Remove the transactions flagged as problematic

In [303]:
problematic_hashes = problematic_transactions['tx_hash or block']
stable_delta_with_dollar = stable_delta_with_dollar[~stable_delta_with_dollar['tx_hash'].isin(problematic_hashes)]
stable_delta_with_dollar = stable_delta_with_dollar.drop_duplicates()
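
The filter above can be illustrated on toy data. This is a sketch under assumptions: the frame names are hypothetical, and the mixed content of the flagged column (hashes and block numbers) is inferred only from its name, `'tx_hash or block'`.

```python
import pandas as pd

# Toy results with one exact duplicate row
results = pd.DataFrame({
    'tx_hash': ['0xaa', '0xbb', '0xbb', '0xcc'],
    'delta_dollar': [-10.0, -5.0, -5.0, -1.0],
})
# Assumed shape of the flagged column: transaction hashes mixed with block numbers
flagged = pd.Series(['0xcc', 18000000], name='tx_hash or block')

# Keep rows whose hash is not flagged, then drop exact duplicate rows
kept = results[~results['tx_hash'].isin(flagged)].drop_duplicates()

# Record which hashes were removed so the exclusion can be audited later
dropped = sorted(set(results['tx_hash']) - set(kept['tx_hash']))
print(kept)
print(dropped)
```

Block numbers in the flagged column never match a hash, so only the entries that are actual transaction hashes have an effect in this filter.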

### Clean up by keeping only the interesting columns before exporting the file

In [304]:
columns_to_keep = [
    'tx_hash', 'sender', 'delta_eth', 'delta_dollar',
    'token_name_A', 'token_contract_address_A', 'token_A_delta_raw_amount_x',
    'token_name_B', 'token_contract_address_B', 'token_B_delta_raw_amount_x',
]
final = stable_delta_with_dollar[columns_to_keep]

In [305]:
final = final.rename(columns = {'token_A_delta_raw_amount_x' : 'delta_token_A', 'token_B_delta_raw_amount_x' : 'delta_token_B'})
final.to_csv(f'data/results/final_results_{name_of_incident}.csv')

In [306]:
print("Total potential loss in dollars for the given transactions, using the calculated dollar values:", final.delta_dollar.sum())

Total potential loss in dollars for the given transactions, using the calculated dollar values: -88437.68901918392


In [320]:
df_main = pd.read_csv(csv_file_path)
tx_hash_list = [x for x in df_main[' user_tx'].to_list() if pd.notnull(x)]
print("FINAL STATS:",
      "\nOut of the", df_main[' user_tx'].nunique(), "original transactions,",
      "\nwe were able to find deltas in ethereum and dollar by simulating if the transactions had been on top of block",
      "\nfor", final.tx_hash.nunique(), "transactions in total.")

FINAL STATS: 
Out of the 130 original transactions, 
we were able to find deltas in ethereum and dollar by simulating if the transactions had been on top of block 
for 114 transactions in total.
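
The final stats report 114 of 130 transactions, roughly 88% coverage. To see which submitted hashes ended up without a result, a set difference is enough. The sketch below uses toy series in place of `df_main[' user_tx']` and `final.tx_hash`; the variable names are hypothetical.

```python
import pandas as pd

# Toy stand-ins for the submitted hashes and the hashes present in the final results
submitted = pd.Series(['0xaa', '0xbb', '0xcc', None], name=' user_tx')
priced = pd.Series(['0xaa', '0xcc'], name='tx_hash')

# Ignore empty rows, then compare the two sets of hashes
submitted_set = set(submitted.dropna())
priced_set = set(priced)

missing = sorted(submitted_set - priced_set)
coverage = len(submitted_set & priced_set) / len(submitted_set)
print(missing, round(coverage, 3))
```

Listing the missing hashes makes it easier to check whether a transaction was dropped as problematic, failed simulation, or was never found.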
