# Calculating the potential loss from transaction intent leakage

Input: a list of transaction hashes that we think have been leaked
Output: a file with the potential loss in dollars for each transaction, as well as the total potential loss in dollars

## The methodology: 
1. We start from a list of transaction hashes that we think have been leaked.
2. We get the details of each transaction using the Infura and Etherscan APIs.
3. We use the Tenderly API to simulate the result each transaction would have had at the top of its block.
4. We calculate the difference in dollars between the original and the simulated outcome for each transaction.
5. We sum the potential loss of each transaction to get the total potential loss in dollars.
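
Steps 4 and 5 come down to simple arithmetic. The sketch below is only an illustration: it assumes we already know, for each transaction, the sender's net dollar outcome at the original position and at the top of the block. The hashes, the values and the helper name potential_loss are made up for the example.

```python
# Hypothetical simulation results: net dollar value for the sender
# at the original position and at the top of the block (index 0).
simulated = {
    "0xaaa": {"original": 493.54, "top_of_block": 1082.71},
    "0xbbb": {"original": -4031.59, "top_of_block": -3998.59},
}


def potential_loss(values):
    """Original outcome minus top-of-block outcome; negative means the sender lost value."""
    return values["original"] - values["top_of_block"]


# Step 4: one difference per transaction
per_tx_loss = {tx: potential_loss(v) for tx, v in simulated.items()}
# Step 5: total over all transactions
total_loss = sum(per_tx_loss.values())

for tx, loss in per_tx_loss.items():
    print(f"{tx}: {loss:.2f}")
print(f"total: {total_loss:.2f}")
```

With this sign convention a negative number means the sender would have been better off at the top of the block.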


## Usage
1. Prepare a CSV file with one column called ' user_tx' (the leading space is important) containing all the transaction hashes that you think were leaked.
2. Open the config_example.py file and follow the instructions for the configuration.
3. Run the Jupyter Notebook. The final output is written to results/{name_of_incident}_final.csv, and the notebook itself shows a lot of intermediate information along the way.
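
The leading space in ' user_tx' most likely comes from a header row written with a space after each comma (an assumption, the input file itself is not shown here). A small sketch with the standard csv module and a made-up one-row file shows the effect:

```python
import csv
import io

# Hypothetical input file: note the space after each comma in the header row.
sample_csv = "block_number, user_tx, fees\n19412019,0xabc,18613219333282008\n"

rows = list(csv.DictReader(io.StringIO(sample_csv)))
column_names = list(rows[0].keys())
print(column_names)

# The hash column keeps its leading space, so it must be looked up as ' user_tx'.
tx_hashes = [row[" user_tx"] for row in rows]
print(tx_hashes)
```

If you would rather work with clean column names, pandas.read_csv accepts skipinitialspace=True, but the rest of this notebook would then need 'user_tx' instead of ' user_tx'.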


In [1]:
#todo add requirements.txt
import pandas as pd
import numpy as np 
import json
import requests 
from configurations import *

In [2]:
df_main = pd.read_csv(csv_file_path)
# drop duplicate transactions, if there are any
df_main = df_main.drop_duplicates(subset=[' user_tx'])
df_main.head()

Unnamed: 0,block_number,user_tx,fees
0,19412019,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,18613219333282008
1,19412030,0x354e8386267ca643793de913739df3f9680895776f3a...,46947531382579069
2,19412041,0xee8fd2c76181afa14ca0da158e0a01bba2d3df8e62c5...,61825063809872640
3,19412043,0x5f3954cc3cb4fbb88803a910d852aab6566af2866acf...,128344417502120006
4,19412043,0xbd9c86df3327871d630a0921c65cb7bec9b0c199d80c...,36408612437377108


In [3]:
# make a list of the transaction hashes we need to analyse
tx_hash_list = [x for x in df_main[' user_tx'].to_list() if pd.notnull(x)]
print(f'There are {len(tx_hash_list)} transactions')

There are 37 transactions


### Infura
With this API, we want to get all the inputs necessary to simulate the transaction again later on. Infura gives us all of these inputs except for the timestamp of the transaction, which is why we need to use the Etherscan API later on.
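
The request is a plain JSON-RPC call, and most fields in the response come back as hex strings. As a rough sketch (the helper names build_tx_request and decode_quantity are hypothetical, not part of the notebook), the body can be built and a returned value decoded like this:

```python
import json


def build_tx_request(tx_hash, request_id=1):
    """Build the JSON-RPC body for eth_getTransactionByHash (hypothetical helper)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "eth_getTransactionByHash",
        "params": [tx_hash],
        "id": request_id,
    })


def decode_quantity(hex_value):
    """Convert a hex-encoded quantity such as '0x1283433' to an int."""
    return int(hex_value, 16)


# Placeholder hash, for illustration only
body = json.loads(build_tx_request("0xabc"))
print(body["method"], body["params"])

# blockNumber as returned for the first transaction in this notebook
block_number = decode_quantity("0x1283433")
print(block_number)
```

The decoded block number matches the block_number column of the input file for that transaction, which is a quick sanity check on the response.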

#### Call Infura for the first transaction

In [4]:
#the api key is stored in the config file
#todo: put url as variable not just the key
url = f"https://mainnet.infura.io/v3/{infura_api_key}"

# Get the Infura response for the first transaction in the list to create a dataframe
payload = json.dumps({
  "jsonrpc": "2.0",
  "method": "eth_getTransactionByHash",
  "params": [tx_hash_list[0]],
  "id": 1
})
headers = {
  'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=payload)
dct = response.json()['result']
dct = {k: None if not v else v for k, v in dct.items()} # making sure none of the values are empty
df_infuria = pd.DataFrame(dct, index=[0])
df_infuria.head()

Unnamed: 0,accessList,blockHash,blockNumber,chainId,from,gas,gasPrice,hash,input,maxFeePerGas,maxPriorityFeePerGas,nonce,r,s,to,transactionIndex,type,v,value,yParity
0,,0x2c1ca96b35f726a2171d3ebc050e390aca8dd1f83a23...,0x1283433,0x1,0x36a2ffb33c1b427c46c3d30adac3ca4e8ed36179,0x30eb5,0x11cd060113,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0x4a25d94a000000000000000000000000000000000000...,0x187a3c7e80,0x5e69ec0,0x707,0xad8743a8c08357fb122fe099ce4ab1cc5155197bf17d...,0x6ac3aca96164f1bf4582dc771aa945939eccac0c562a...,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,0x2,0x2,0x0,0x0,0x0


#### Call Infura for all other transactions - !! This will take a few minutes

In [5]:
# get the Infura response for all the other transactions in the list and append the results to the above dataframe
for tx_hash in tx_hash_list[1:]:
  payload = json.dumps({
    "jsonrpc": "2.0",
    "method": "eth_getTransactionByHash",
    "params": [tx_hash],
    "id": 1
  })
  headers = {
    'Content-Type': 'application/json'
  }

  response = requests.request("POST", url, headers=headers, data=payload)

  if response.ok:
    dct = response.json()['result']
    dct = {k: None if not v else v for k, v in dct.items()} # making sure none of the values are empty
    df_temp = pd.DataFrame(dct, index=[0])
    df_infuria = pd.concat([df_infuria, df_temp])
    
  else: 
    print(f"error code {response.status_code} for transaction {tx_hash}")

df_infuria.head()


Unnamed: 0,accessList,blockHash,blockNumber,chainId,from,gas,gasPrice,hash,input,maxFeePerGas,maxPriorityFeePerGas,nonce,r,s,to,transactionIndex,type,v,value,yParity
0,,0x2c1ca96b35f726a2171d3ebc050e390aca8dd1f83a23...,0x1283433,0x1,0x36a2ffb33c1b427c46c3d30adac3ca4e8ed36179,0x30eb5,0x11cd060113,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0x4a25d94a000000000000000000000000000000000000...,0x187a3c7e80,0x5e69ec0,0x707,0xad8743a8c08357fb122fe099ce4ab1cc5155197bf17d...,0x6ac3aca96164f1bf4582dc771aa945939eccac0c562a...,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,0x2,0x2,0x0,0x0,0x0
0,,0xdd26fb3d61779a2527538b20a6710090ef329f09a3c0...,0x128343e,0x1,0x4ffb89a61a6db0586aff308efcfce39207aed2b2,0x59970,0x1254a6dbb8,0x354e8386267ca643793de913739df3f9680895776f3a...,0x24856bc3000000000000000000000000000000000000...,0x15bb9caac7,0x854d13a5,0x491,0x498ee90a0565829b8b473ede7b76174226bc1d5ed8ad...,0x2289f4cb7afc50f5db6b839681ae3f5d999e1bcf0eff...,0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad,0x1,0x2,0x1,0x0,0x1
0,,0x657b57c868eb09df8631cca1ddd349ee8ca921e33082...,0x1283449,0x1,0xbbf46514de004992d8dcbcec19f02f2e772ab51d,0x4ab7a,0x1172aa4101,0xee8fd2c76181afa14ca0da158e0a01bba2d3df8e62c5...,0x3593564c000000000000000000000000000000000000...,0x11bf4b06ba,0xe57e0,0x121,0xa80847747875f34798ef26d97d4ea1002a6ec099cf85...,0x193addaa244d04a85bfc29027fb9d26c6d5e8d1ab757...,0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad,0x2,0x2,0x1,0x0,0x1
0,,0x4163d5676b7bd172a054a5d85dda15ae43563d24ac0c...,0x128344b,0x1,0xaed06a6a9737ac56b5c3c7ecdfb233d70d64fc38,0xe6be,0xfdc73a9d9,0x5f3954cc3cb4fbb88803a910d852aab6566af2866acf...,0x095ea7b3000000000000000000000000000000000022...,0x1469ab8236,0x99d399e3,0x0,0x4f715cf4d4321da48de6d3fcdb07cdfca6b0c28366ff...,0xddc690541719b963ec3ab29167402aa86ca7b0b14bdc...,0xdac17f958d2ee523a2206206994597c13d831ec7,0x4,0x2,0x0,0x0,0x0
0,,0x4163d5676b7bd172a054a5d85dda15ae43563d24ac0c...,0x128344b,0x1,0xaed06a6a9737ac56b5c3c7ecdfb233d70d64fc38,0x4b07c,0xfdc73a9d9,0xbd9c86df3327871d630a0921c65cb7bec9b0c199d80c...,0x24856bc3000000000000000000000000000000000000...,0x1469ab8236,0x99d399e3,0x1,0x86b62356a6073de56805fc48478b472a06a3ccbfb5cc...,0x42c6bb462a201bbc2e2621f591c4a2b7cb293206b369...,0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad,0x5,0x2,0x0,0x0,0x0


#### merge the Infura response into main

In [6]:
df_main = df_main.merge(df_infuria, left_on = ' user_tx', right_on = 'hash', how ='outer')
df_main.columns

Index(['block_number', ' user_tx', ' fees', 'accessList', 'blockHash',
       'blockNumber', 'chainId', 'from', 'gas', 'gasPrice', 'hash', 'input',
       'maxFeePerGas', 'maxPriorityFeePerGas', 'nonce', 'r', 's', 'to',
       'transactionIndex', 'type', 'v', 'value', 'yParity'],
      dtype='object')

#### save the api results
Store the results in a file so that we do not have to rerun the calls if we need this data again
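
The save-then-reload pattern used in the next two cells can also be written as a small cache helper. This is only a sketch under assumptions: load_or_fetch and fake_fetch are hypothetical names, and JSON is used instead of CSV to keep the example free of dependencies.

```python
import json
import os
import tempfile


def load_or_fetch(cache_path, fetch):
    """Return cached results if the file exists, otherwise call fetch() and store them."""
    if os.path.exists(cache_path):
        with open(cache_path) as fh:
            return json.load(fh)
    data = fetch()
    with open(cache_path, "w") as fh:
        json.dump(data, fh)
    return data


calls = []


def fake_fetch():
    # Stand-in for the real API calls; records how often it runs.
    calls.append(1)
    return [{"hash": "0xabc", "blockNumber": "0x1283433"}]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "incident_infuria.json")
    first = load_or_fetch(path, fake_fetch)   # calls the API stand-in
    second = load_or_fetch(path, fake_fetch)  # served from the file

print(len(calls))
```

The second call never reaches the API stand-in, which is the point of storing the intermediate results.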

In [7]:
df_main.to_csv(f'results/{name_of_incident}_infuria.csv')

In [8]:
df_main = pd.read_csv(f'results/{name_of_incident}_infuria.csv')

### Etherscan 

We use this API to get the timestamp of each block rather than of each transaction, which reduces the number of API calls (all transactions in a block share the block's timestamp). We need the timestamp because the Tenderly API uses the current time as an input variable unless we override it.
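
To illustrate why querying per block is enough, here is a minimal sketch with made-up transactions. The timestamp for block 19412096 is the one returned later in this notebook; the other timestamp is a hypothetical value.

```python
from datetime import datetime, timezone

# Hypothetical transactions; two of them share a block.
transactions = [
    {"hash": "0xaaa", "block_number": 19412043},
    {"hash": "0xbbb", "block_number": 19412043},
    {"hash": "0xccc", "block_number": 19412096},
]

# One API call per unique block instead of one per transaction
unique_blocks = sorted({tx["block_number"] for tx in transactions})
print(f"{len(transactions)} transactions, {len(unique_blocks)} blocks to query")

# Stand-in for the Etherscan responses: one timestamp per block
# (the value for 19412043 is invented for this example).
block_timestamps = {19412043: 1710162347, 19412096: 1710162983}

# Every transaction inherits the timestamp of its block
for tx in transactions:
    tx["timeStamp"] = block_timestamps[tx["block_number"]]

# Unix timestamps are easier to sanity-check in readable form
readable = datetime.fromtimestamp(1710162983, tz=timezone.utc).isoformat()
print(readable)
```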

In [9]:
#getting all the block numbers of the transactions we want to analyse
block_number_list = list(set([x for x in df_main['block_number'].to_list() if pd.notnull(x)]))
print(f'there are {len(block_number_list)} different blocks')

there are 30 different blocks


#### Get the Etherscan response for the first block in the list to create a dataframe

In [10]:
url_eth = f"https://api.etherscan.io/api?module=block&action=getblockreward&blockno={block_number_list[0]}&apikey={eth_scan_api_key}"

response_eth = requests.request("POST", url_eth)

dct_eth = response_eth.json()['result']
dct_eth = {k: None if not v else v for k, v in dct_eth.items()} # making sure none of the values are empty
df_eth = pd.DataFrame(dct_eth, index=[0])
df_eth.head()

Unnamed: 0,blockNumber,timeStamp,blockMiner,blockReward,uncles,uncleInclusionReward
0,19412096,1710162983,0x95222290dd7278aa3ddd389cc1e1d165cc4bafe5,85616354749290394,,0


#### Get the Etherscan responses for the other blocks in the list - !!! This will take a few minutes

In [11]:
for block in block_number_list[1:]:
  url_temp = f"https://api.etherscan.io/api?module=block&action=getblockreward&blockno={block}&apikey={eth_scan_api_key}"
  response_temp = requests.request("POST", url_temp)

  if response_temp.ok:
    dct_temp = response_temp.json()['result']
    dct_temp = {k: None if not v else v for k, v in dct_temp.items()} # making sure none of the values are empty
    df_temp = pd.DataFrame(dct_temp, index=[0])
    df_eth = pd.concat([df_eth, df_temp])
    
  else: 
    print(f"error code {response_temp.status_code} for block {block}")

In [12]:
# quick cleaning for later
df_eth['blockNumber'] = df_eth['blockNumber'].astype(int)

#### merge etherscan results to main

In [13]:
df_main = df_main.merge(df_eth, left_on = 'block_number', right_on = 'blockNumber', how ='outer')
df_main.columns

Index(['Unnamed: 0', 'block_number', ' user_tx', ' fees', 'accessList',
       'blockHash', 'blockNumber_x', 'chainId', 'from', 'gas', 'gasPrice',
       'hash', 'input', 'maxFeePerGas', 'maxPriorityFeePerGas', 'nonce', 'r',
       's', 'to', 'transactionIndex', 'type', 'v', 'value', 'yParity',
       'blockNumber_y', 'timeStamp', 'blockMiner', 'blockReward', 'uncles',
       'uncleInclusionReward'],
      dtype='object')

#### save the results of etherscan 
Store the results in a file so that we do not have to rerun the calls if we need this data again


In [14]:
df_main.to_csv(f'results/{name_of_incident}_etherscan.csv')

In [15]:
df_main = pd.read_csv(f'results/{name_of_incident}_etherscan.csv')

### Tenderly !! This will take a few minutes
Here we finally run the simulation. Each transaction is simulated twice: once at its original index position, which reproduces the amounts of coin originally transferred, and once at index position 0, the top of the block.
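
The two simulations differ only in the transaction index. The sketch below builds both request bodies for one row without calling the API. The function name build_simulations is hypothetical, the field names mirror the ones used in this notebook's request rather than an authoritative description of the Tenderly API, and the 'input' value is a placeholder because the real calldata is truncated in the output above.

```python
def build_simulations(row):
    """Return one payload per position to simulate: top of the block, then the original index."""
    original_index = int(row["transactionIndex"], 0)
    # dict.fromkeys keeps the order and drops the duplicate when the original index is already 0
    positions = list(dict.fromkeys([0, original_index]))
    return [
        {
            "network_id": int(row["chainId"], 0),
            "from": row["from"],
            "to": row["to"],
            "input": row["input"],
            "block_number": row["block_number"],
            "transaction_index": position,
            "simulation_type": "quick",
            "gas": int(row["gas"], 0),
            "value": int(row["value"], 0),
            "gas_price": int(row["gasPrice"], 0),
        }
        for position in positions
    ]


# Values copied from the first Infura response above; 'input' is a placeholder.
sample_row = {
    "chainId": "0x1",
    "from": "0x36a2ffb33c1b427c46c3d30adac3ca4e8ed36179",
    "to": "0x7a250d5630b4cf539739df2c5dacb4c659f2488d",
    "input": "0x",
    "block_number": 19412019,
    "transactionIndex": "0x2",
    "gas": "0x30eb5",
    "value": "0x0",
    "gasPrice": "0x11cd060113",
}

simulations = build_simulations(sample_row)
print([s["transaction_index"] for s in simulations])
```

A transaction that was already first in its block yields a single payload in this sketch. Such a transaction has no second position to compare against, so it cannot show a loss from ordering.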

In [16]:
headers = {
    'X-Access-Key': f'{tenderly_access_token}',
    'content-type': 'application/json',
}

#creating an empty DataFrame for the results
columns = ['tx_hash', 'index', 'type', 'raw_amount', 'dollar_value', 'token_contract_address', 'token_name', 'token_dollar_value', 'from', 'to', 'sender']
df_results = pd.DataFrame(columns = columns)

# creating a list of tx hashes where the tenderly api returned nothing
tx_hash_problem_list = []

#iterating over every row of the main dataframe (one row is one transaction)
for index, row in df_main.iterrows():
    tx_index_list = [0]
    tx_index_list.append(int(row['transactionIndex'], 0))
    # for each transaction, simulate twice: once for each index
    for tx_index in tx_index_list:
        json_data = {
        'network_id': int(row['chainId'], 0),
        'from': row['from'],
        'to': row['to'],
        'input': row['input'],
        'block_number': row['block_number'],
        'transaction_index': tx_index,
        'simulation_type': 'quick',
        'gas': int(row['gas'], 0),
        'value': int(row['value'], 0),
        'gas_price': int(row['gasPrice'], 0),
        'l1_timestamp': int(row['timeStamp'])
        }
        
        response = requests.post(
        'https://api.tenderly.co/api/v1/account/aurelie2/project/cowswap2/simulate',
        headers=headers,
        json=json_data,
        )

        try:
            for data in response.json()['transaction']['transaction_info']['asset_changes']:
                tx_type = data['type']
                tx_raw_amount = data['raw_amount']
                tx_dollar_value = data['dollar_value']
                sender = response.json()['transaction']['from']

                #sometimes the following values are empty 
                try:
                    contract_address = data['token_info']['contract_address']
                except:
                    contract_address = 'None'

                try:
                    token_name = data['token_info']['name']
                except:
                    token_name = 'None'
                try:
                    token_dollar_value = data['token_info']['dollar_value']
                except:
                    token_dollar_value = 'None'

                try:
                    tx_from = data['from']
                except:
                    tx_from = 'None'

                try:
                    tx_to = data['to']
                except: 
                    tx_to = 'None'

                new_row = {
                    'tx_hash' : row['hash'],
                    'from':tx_from,
                    'to': tx_to,
                    'index' : tx_index, 
                    'type': tx_type, 
                    'raw_amount': tx_raw_amount, 
                    'dollar_value' : tx_dollar_value, 
                    'token_contract_address': contract_address, 
                    'token_name': token_name, 
                    'token_dollar_value': token_dollar_value,
                    'sender': sender
                    }
                df_results = pd.concat([df_results, pd.DataFrame([new_row])], ignore_index=True)
        except:
            tx_hash_problem_list.append(row['hash'])
            


#### creating a new dataframe with only the transactions where the Tenderly API returned something
This means that each of these good transactions has 2 distinct index values.
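
The rule is easy to check on a toy example. The sketch below applies the same idea in plain Python to made-up rows: a transaction is kept only if results exist for two distinct index values.

```python
from collections import defaultdict

# Hypothetical asset-change rows as (tx_hash, simulated index).
rows = [
    ("0xaaa", 0), ("0xaaa", 0), ("0xaaa", 2),  # simulated at both positions
    ("0xbbb", 0),                              # only one position returned data
    ("0xccc", 0), ("0xccc", 5),                # simulated at both positions
]

# Collect the distinct index values seen for each transaction
indices_per_tx = defaultdict(set)
for tx_hash, index in rows:
    indices_per_tx[tx_hash].add(index)

# Keep only the transactions with exactly two distinct index values
good_hashes = {tx for tx, indices in indices_per_tx.items() if len(indices) == 2}
good_rows = [r for r in rows if r[0] in good_hashes]

print(sorted(good_hashes))
print(len(good_rows))
```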

In [17]:
grouped = df_results.groupby('tx_hash')
df_results_good = grouped.filter(lambda x: x['index'].nunique() == 2)
df_results_good.head()

Unnamed: 0,tx_hash,index,type,raw_amount,dollar_value,token_contract_address,token_name,token_dollar_value,from,to,sender
0,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0,Transfer,807452531860501402532,,0xb0699d63aef20df3f1cffa9ca2bb8670416271d2,,,0x36a2ffb33c1b427c46c3d30adac3ca4e8ed36179,0xe1ebdf64f7f3a31723e767a561345f958233bb7d,0x36a2ffb33c1b427c46c3d30adac3ca4e8ed36179
1,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0,Transfer,400000000000000000,1417.68798828125,0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2,WETH,3544.219970703125,0xe1ebdf64f7f3a31723e767a561345f958233bb7d,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,0x36a2ffb33c1b427c46c3d30adac3ca4e8ed36179
2,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0,Burn,400000000000000000,1417.68798828125,0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2,WETH,3544.219970703125,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,,0x36a2ffb33c1b427c46c3d30adac3ca4e8ed36179
3,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0,Transfer,400000000000000000,1417.69599609375,,Ethereum,3544.239990234375,0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,0x36a2ffb33c1b427c46c3d30adac3ca4e8ed36179
4,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0,Transfer,400000000000000000,1417.69599609375,,Ethereum,3544.239990234375,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,0x36a2ffb33c1b427c46c3d30adac3ca4e8ed36179,0x36a2ffb33c1b427c46c3d30adac3ca4e8ed36179


In [18]:
print(
    'number of transactions for which this approach worked:', df_results_good.tx_hash.nunique(),
    "\nnumber of transactions for which this approach did not work:", len(set(tx_hash_problem_list)),
    '\npercentage of properly simulated transactions:', format(df_results_good.tx_hash.nunique()/(df_results_good.tx_hash.nunique() + len(set(tx_hash_problem_list))), ".2%")
    )


number of transactions for which this approach worked: 28 
number of transactions for which this approach did not work: 9 
percentage of properly simulated transactions: 75.68%


#### Save the tenderly results 

In [19]:
df_results_good.to_csv(f'results/{name_of_incident}_tenderly.csv')

In [140]:
df_results_good = pd.read_csv(f'results/{name_of_incident}_tenderly.csv')

In [141]:
df_tx_hash_problem = pd.DataFrame(data={"tx_hash_problem": tx_hash_problem_list})
df_tx_hash_problem.to_csv(f'results/{name_of_incident}_tx_hash_problem.csv')

### Calculate the dollar value of the potential loss for the coins that don't have a dollar value
The logic here is:
1. Keep only the parts of each transaction that involve the sender's address (the sender is the 'from' address of the simulated transaction).
2. Calculate the difference in dollars if the simulation gives us a dollar value: dollar_value_diff.
3. If we do not have a dollar value for a coin, there will always be a WETH value in the transaction, so we derive the value of the unknown coin from the WETH value: calculated_dollar_diff.
4. We get the loss for that transaction (dollar_diff) by choosing dollar_value_diff if it exists, and calculated_dollar_diff otherwise.
5. We sum dollar_diff to calculate the total loss over all transactions.
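
Step 3 can be checked by hand for one transaction. The sketch below uses values that appear in this notebook's outputs for the transaction starting with 0x1ee8d8e2; the variable names are made up for the example. It assumes, as the notebook does, that at index 0 the unpriced token leg is worth as much as the WETH leg.

```python
# Values for one transaction, taken from the simulation output shown in this notebook.
weth_dollar_value_index_0 = 1417.687988    # dollar value of the WETH leg at index 0
unknown_raw_amount_index_0 = 8.074525e20   # raw amount of the unpriced token at index 0
unknown_raw_amount_original = 8.492115e20  # raw amount of the same token at the original index

# Implied dollar price of one raw unit of the unknown token
implied_price = weth_dollar_value_index_0 / unknown_raw_amount_index_0

# Dollar value of the unknown token leg at both positions
value_index_0 = unknown_raw_amount_index_0 * implied_price
value_original = unknown_raw_amount_original * implied_price

# The sender gave this token, so paying more of it is a loss
extra_paid = value_original - value_index_0

print(f"implied price per raw unit: {implied_price:.6e}")
print(f"value at original index:    {value_original:.2f}")
print(f"extra paid by the sender:   {extra_paid:.2f}")
```

The extra amount paid is close to the difference of about 73.32 dollars reported for this transaction further down, up to rounding of the inputs and a small difference in the Ethereum leg.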

#### 1. Only keep the sender rows

In [205]:
df_results_good['raw_amount'] = df_results_good['raw_amount'].astype(float)
df_results_good.tx_hash.nunique()

28

In [382]:
df_results_good['is_sender'] = (df_results_good['from'] == df_results_good['sender']) |  (df_results_good['to'] == df_results_good['sender'])
df_results_good['sender_gave_this'] = (df_results_good['from'] == df_results_good['sender'])
df_senders = df_results_good.copy()
columns_to_keep = ['tx_hash', 'index', 'raw_amount', 'dollar_value', 'token_contract_address', 'token_name', 'token_dollar_value', 'sender_gave_this']
df_senders = df_senders[df_senders['is_sender']]

#### 2. dollar_value_diff

Replace the token_contract_address with Ethereum when token_name = Ethereum


In [383]:
df_senders.loc[df_senders['token_name'] == 'Ethereum', 'token_contract_address'] = 'Ethereum'
# we sort to make sure the original index is always on top
df_senders = df_senders.sort_values(by = ['tx_hash', 'index'], ascending = [True, False])
df_senders.head()

Unnamed: 0.1,Unnamed: 0,tx_hash,index,type,raw_amount,dollar_value,token_contract_address,token_name,token_dollar_value,from,to,sender,is_sender,sender_gave_this
237,260,0x012778bb6330737bed53ca488e582500498d81e1db22...,2,Transfer,1.110371e+28,1202.387914,0x4fe8d4775b7cb2546b9ee86182081cdf8f77b053,KAIJUNO8,1.08287e-07,0xda8f06e5e0fc0900fd09ec98a72630d3623e1158,0x9ea02f652955b90c0dd4f256003e4e339d3a4184,0x9ea02f652955b90c0dd4f256003e4e339d3a4184,True,False
238,261,0x012778bb6330737bed53ca488e582500498d81e1db22...,2,Transfer,2e+17,708.847998,Ethereum,Ethereum,3544.24,0x9ea02f652955b90c0dd4f256003e4e339d3a4184,0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad,0x9ea02f652955b90c0dd4f256003e4e339d3a4184,True,True
232,255,0x012778bb6330737bed53ca488e582500498d81e1db22...,0,Transfer,1.654453e+28,1791.557918,0x4fe8d4775b7cb2546b9ee86182081cdf8f77b053,KAIJUNO8,1.08287e-07,0xda8f06e5e0fc0900fd09ec98a72630d3623e1158,0x9ea02f652955b90c0dd4f256003e4e339d3a4184,0x9ea02f652955b90c0dd4f256003e4e339d3a4184,True,False
233,256,0x012778bb6330737bed53ca488e582500498d81e1db22...,0,Transfer,2e+17,708.847998,Ethereum,Ethereum,3544.24,0x9ea02f652955b90c0dd4f256003e4e339d3a4184,0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad,0x9ea02f652955b90c0dd4f256003e4e339d3a4184,True,True
45,53,0x08e622acdf6b27fe26f24e33815bb1a0789158f2d326...,2,Transfer,6.594532e+26,6601.126374,0xd807f7e2818db8eda0d28b5be74866338eaedb86,Jim,1.001e-05,0xe342253d5a0c1ac9da0203b0256e33c5cfe084f0,0x76ec733f445358232ea24aaf03d4536057439bfc,0x76ec733f445358232ea24aaf03d4536057439bfc,True,False


In [384]:
# First, if the token name is WETH, raw_amount needs to be converted from wei to WETH (1 WETH = 1e18 wei).
# Every transaction has WETH, so the analysis is based on that.
# Note: the division below modifies df_results_good in place, so run this cell only once per data load.
df_results_good['raw_amount'] = df_results_good['raw_amount'].astype(float)
df_results_good.loc[df_results_good['token_name'] == 'WETH', 'raw_amount'] /= 1e+18
# each comparison needs its own parentheses because & binds tighter than !=
weth_mask = (df_results_good['token_name'] == 'WETH') & (df_results_good['index'] == 0) & (df_results_good['raw_amount'] != 0)
weth_df = df_results_good[weth_mask][['tx_hash', 'token_dollar_value', 'dollar_value', 'raw_amount']].copy()
weth_df.rename(columns = {'token_dollar_value': '1_weth_dollar_value_index_0', 'dollar_value' : 'tx_weth_dollar_value_index_0', 'raw_amount' : 'weth_raw_amount_index_0'}, inplace = True)
weth_df = weth_df.drop_duplicates()
# the weth df has one row per transaction with the value of WETH at index 0
weth_df.head()

Unnamed: 0,tx_hash,1_weth_dollar_value_index_0,tx_weth_dollar_value_index_0,weth_raw_amount_index_0
1,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,3544.219971,1417.687988,4e-235
10,0xee8fd2c76181afa14ca0da158e0a01bba2d3df8e62c5...,3544.219971,30579.732767,8.628057e-234
18,0x77929a2b313bd1f5b15c9dbc61cc598d21e6f3594b30...,3544.219971,708.843994,2e-235
28,0x4f9b4cddc272c3803df798041fff63ae7e439982810f...,3544.219971,14006.496884,3.951927e-234
38,0x08e622acdf6b27fe26f24e33815bb1a0789158f2d326...,3544.219971,10632.659912,3e-234


In [385]:
filtered_df = df_senders[(df_senders['dollar_value'].isnull())]
filtered_df = filtered_df.groupby(['tx_hash', 'index', 'token_contract_address', 'sender_gave_this'])['raw_amount'].sum().reset_index()
df_null_0 = filtered_df[(filtered_df['index'] == 0 )]

merged_null_0 = pd.merge(df_null_0, weth_df, on='tx_hash', how = 'left') #we only care about the transactions without any dollar value
merged_null_0 = merged_null_0.drop('index', axis = 1)
merged_null_0['other_token_value_index_0'] = merged_null_0['tx_weth_dollar_value_index_0']/ merged_null_0['raw_amount']
# one row per transaction with the amount of token where there is no dollar value
merged_null_0

Unnamed: 0,tx_hash,token_contract_address,sender_gave_this,raw_amount,1_weth_dollar_value_index_0,tx_weth_dollar_value_index_0,weth_raw_amount_index_0,other_token_value_index_0
0,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0xb0699d63aef20df3f1cffa9ca2bb8670416271d2,True,8.074525e+20,3544.219971,1417.687988,4e-235,1.755754e-18
1,0x77929a2b313bd1f5b15c9dbc61cc598d21e6f3594b30...,0x1db61c337e5216941f53e6a0e41eed9640aec8bb,False,2.103232e+23,3544.219971,708.843994,2e-235,3.37026e-21
2,0x80756a409316570be65efd0861a99cc6c26adecbc1f2...,0xc548e90589b166e1364de744e6d35d8748996fe8,False,325968200000000.0,3544.219971,1660.649063,4.685514e-235,5.094512e-12
3,0xa231f74955550b91c045955dcba05ce6793104ffdc32...,0x0fa0ed0cbe0412379cd181320c93448968c76c1c,True,2.5e+23,3544.219971,16330.179871,4.607553e-234,6.532072e-20
4,0xaad7774f70fccad4b551f32964baa56254dc0796f8ea...,0xc548e90589b166e1364de744e6d35d8748996fe8,False,739617700000000.0,3544.219971,3498.124662,9.869942e-235,4.729639e-12
5,0xb9de9f69f52607d2661ef19d7ddd565d86d56ee8a8ef...,0x50a69cea809b4afed9a31a72f049a5b0b33bf5e3,False,1.603558e+23,3544.219971,2365.662937,6.674707e-235,1.475259e-20
6,0xc1a796c24a30a5c110b7cea4123085339445bb1953c0...,0x56b8be7c2d3ffe0d8d6feb4d4eb4650c3ea10bb6,False,1.036674e+23,3544.219971,3544.219971,1e-234,3.418836e-20
7,0xe081cfe651591ab6525ad185ca38ad9bfe3f6154e981...,0x857ffc55b1aa61a7ff847c82072790cae73cd883,True,1e+19,3544.219971,6609.582385,1.864891e-234,6.609582e-16
8,0xe3871507ad9fe26dd3b88e5b30185726417a222d2105...,0x1db61c337e5216941f53e6a0e41eed9640aec8bb,False,1.012629e+23,3544.219971,354.421997,1e-235,3.500017e-21


In [387]:
filtered_df = filtered_df[['tx_hash', 'index', 'raw_amount', 'token_contract_address', 'sender_gave_this']]
# sum the amount of same token per transaction and index so that we have one token per transaction
filtered_df.head()

Unnamed: 0,tx_hash,index,raw_amount,token_contract_address,sender_gave_this
0,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0,8.074525e+20,0xb0699d63aef20df3f1cffa9ca2bb8670416271d2,True
1,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,2,8.492115e+20,0xb0699d63aef20df3f1cffa9ca2bb8670416271d2,True
2,0x77929a2b313bd1f5b15c9dbc61cc598d21e6f3594b30...,0,2.103232e+23,0x1db61c337e5216941f53e6a0e41eed9640aec8bb,False
3,0x77929a2b313bd1f5b15c9dbc61cc598d21e6f3594b30...,1,1.450621e+23,0x1db61c337e5216941f53e6a0e41eed9640aec8bb,False
4,0x80756a409316570be65efd0861a99cc6c26adecbc1f2...,0,325968200000000.0,0xc548e90589b166e1364de744e6d35d8748996fe8,False


In [389]:
calculated_dollar_df = pd.merge(filtered_df, merged_null_0.drop('raw_amount', axis = 1), on=['tx_hash', 'token_contract_address', 'sender_gave_this'], how = 'left')
calculated_dollar_df['calculated_tx_dollar_value'] = calculated_dollar_df['raw_amount']* calculated_dollar_df['other_token_value_index_0']
calculated_dollar_df['calculated_token_value'] = calculated_dollar_df['raw_amount'] / calculated_dollar_df['calculated_tx_dollar_value']
calculated_dollar_df = calculated_dollar_df.drop('calculated_tx_dollar_value', axis = 1)
calculated_dollar_df.head()

Unnamed: 0,tx_hash,index,raw_amount,token_contract_address,sender_gave_this,1_weth_dollar_value_index_0,tx_weth_dollar_value_index_0,weth_raw_amount_index_0,other_token_value_index_0,calculated_token_value
0,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0,8.074525e+20,0xb0699d63aef20df3f1cffa9ca2bb8670416271d2,True,3544.219971,1417.687988,4e-235,1.755754e-18,5.695559e+17
1,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,2,8.492115e+20,0xb0699d63aef20df3f1cffa9ca2bb8670416271d2,True,3544.219971,1417.687988,4e-235,1.755754e-18,5.695559e+17
2,0x77929a2b313bd1f5b15c9dbc61cc598d21e6f3594b30...,0,2.103232e+23,0x1db61c337e5216941f53e6a0e41eed9640aec8bb,False,3544.219971,708.843994,2e-235,3.37026e-21,2.96713e+20
3,0x77929a2b313bd1f5b15c9dbc61cc598d21e6f3594b30...,1,1.450621e+23,0x1db61c337e5216941f53e6a0e41eed9640aec8bb,False,3544.219971,708.843994,2e-235,3.37026e-21,2.96713e+20
4,0x80756a409316570be65efd0861a99cc6c26adecbc1f2...,0,325968200000000.0,0xc548e90589b166e1364de744e6d35d8748996fe8,False,3544.219971,1660.649063,4.685514e-235,5.094512e-12,196289600000.0


In [390]:
final_test = pd.merge(df_senders, calculated_dollar_df, on = ['tx_hash', 'index', 'token_contract_address', 'sender_gave_this'], how = 'left')
final_test['calculated_tx_dollar_value'] = final_test['raw_amount_x']* final_test['other_token_value_index_0']

In [391]:
def calculate_use_dollar_value(row):
    if not pd.isnull(row['dollar_value']):
        if row['sender_gave_this']:
            return -row['dollar_value']
        else:
            return row['dollar_value']
    else:
        if row['sender_gave_this']:
            return -row['calculated_tx_dollar_value']
        else:
            return row['calculated_tx_dollar_value']

# Apply the function to create the new column
final_test['use_this_dollar_value'] = final_test.apply(calculate_use_dollar_value, axis=1)
final_test

Unnamed: 0.1,Unnamed: 0,tx_hash,index,type,raw_amount_x,dollar_value,token_contract_address,token_name,token_dollar_value,from,...,is_sender,sender_gave_this,raw_amount_y,1_weth_dollar_value_index_0,tx_weth_dollar_value_index_0,weth_raw_amount_index_0,other_token_value_index_0,calculated_token_value,calculated_tx_dollar_value,use_this_dollar_value
0,260,0x012778bb6330737bed53ca488e582500498d81e1db22...,2,Transfer,1.110371e+28,1202.387914,0x4fe8d4775b7cb2546b9ee86182081cdf8f77b053,KAIJUNO8,1.082870e-07,0xda8f06e5e0fc0900fd09ec98a72630d3623e1158,...,True,False,,,,,,,,1202.387914
1,261,0x012778bb6330737bed53ca488e582500498d81e1db22...,2,Transfer,2.000000e+17,708.847998,Ethereum,Ethereum,3.544240e+03,0x9ea02f652955b90c0dd4f256003e4e339d3a4184,...,True,True,,,,,,,,-708.847998
2,255,0x012778bb6330737bed53ca488e582500498d81e1db22...,0,Transfer,1.654453e+28,1791.557918,0x4fe8d4775b7cb2546b9ee86182081cdf8f77b053,KAIJUNO8,1.082870e-07,0xda8f06e5e0fc0900fd09ec98a72630d3623e1158,...,True,False,,,,,,,,1791.557918
3,256,0x012778bb6330737bed53ca488e582500498d81e1db22...,0,Transfer,2.000000e+17,708.847998,Ethereum,Ethereum,3.544240e+03,0x9ea02f652955b90c0dd4f256003e4e339d3a4184,...,True,True,,,,,,,,-708.847998
4,53,0x08e622acdf6b27fe26f24e33815bb1a0789158f2d326...,2,Transfer,6.594532e+26,6601.126374,0xd807f7e2818db8eda0d28b5be74866338eaedb86,Jim,1.001000e-05,0xe342253d5a0c1ac9da0203b0256e33c5cfe084f0,...,True,False,,,,,,,,6601.126374
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
109,165,0xf575687f780edd7918193dcfe97f5f2d871ea275b9c7...,0,Transfer,8.146946e+18,28874.730616,Ethereum,Ethereum,3.544240e+03,0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad,...,True,False,,,,,,,,28874.730616
110,250,0xf8196d4b1341fb7700603dd0abfebc5ee859b279eb9c...,4,Transfer,1.547439e+21,4905.381294,0x21e5c85a5b1f38bddde68307af77e38f747cd530,Doggensnout Skeptic,3.170000e-09,0x898fcb7b4e3bee37ebb0ca3a3fbd08cefdc8c995,...,True,False,,,,,,,,4905.381294
111,251,0xf8196d4b1341fb7700603dd0abfebc5ee859b279eb9c...,4,Transfer,3.000000e+18,10632.719971,Ethereum,Ethereum,3.544240e+03,0xc50259330f6984da3c322c1e77e053fef02ea347,...,True,True,,,,,,,,-10632.719971
112,245,0xf8196d4b1341fb7700603dd0abfebc5ee859b279eb9c...,0,Transfer,1.555176e+21,4929.907219,0x21e5c85a5b1f38bddde68307af77e38f747cd530,Doggensnout Skeptic,3.170000e-09,0x898fcb7b4e3bee37ebb0ca3a3fbd08cefdc8c995,...,True,False,,,,,,,,4929.907219


In [392]:

#final_test['use_this_dollar_value'] = final_test.apply(lambda row: row['dollar_value'] if pd.notnull(row['dollar_value']) else row['calculated_tx_dollar_value'], axis=1)
#final_test.head()
final_test[final_test.tx_hash == '0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8ebb80b968a367ed5e581']

Unnamed: 0.1,Unnamed: 0,tx_hash,index,type,raw_amount_x,dollar_value,token_contract_address,token_name,token_dollar_value,from,...,is_sender,sender_gave_this,raw_amount_y,1_weth_dollar_value_index_0,tx_weth_dollar_value_index_0,weth_raw_amount_index_0,other_token_value_index_0,calculated_token_value,calculated_tx_dollar_value,use_this_dollar_value
16,5,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,2,Transfer,8.492115e+20,,0xb0699d63aef20df3f1cffa9ca2bb8670416271d2,,,0x36a2ffb33c1b427c46c3d30adac3ca4e8ed36179,...,True,True,8.492115e+20,3544.219971,1417.687988,4e-235,1.755754e-18,5.695559e+17,1491.006475,-1491.006475
17,9,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,2,Transfer,4e+17,1417.695996,Ethereum,Ethereum,3544.23999,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,...,True,False,,,,,,,,1417.695996
18,0,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0,Transfer,8.074525e+20,,0xb0699d63aef20df3f1cffa9ca2bb8670416271d2,,,0x36a2ffb33c1b427c46c3d30adac3ca4e8ed36179,...,True,True,8.074525e+20,3544.219971,1417.687988,4e-235,1.755754e-18,5.695559e+17,1417.687988,-1417.687988
19,4,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0,Transfer,4e+17,1417.695996,Ethereum,Ethereum,3544.23999,0x7a250d5630b4cf539739df2c5dacb4c659f2488d,...,True,False,,,,,,,,1417.695996


In [393]:
final = final_test.groupby(['tx_hash', 'index'])['use_this_dollar_value'].sum().reset_index()
final  # one row per transaction and index for the total value of coins transferred by the sender

Unnamed: 0,tx_hash,index,use_this_dollar_value
0,0x012778bb6330737bed53ca488e582500498d81e1db22...,0,1082.70992
1,0x012778bb6330737bed53ca488e582500498d81e1db22...,2,493.539916
2,0x08e622acdf6b27fe26f24e33815bb1a0789158f2d326...,0,-3998.588352
3,0x08e622acdf6b27fe26f24e33815bb1a0789158f2d326...,2,-4031.593596
4,0x1a5eafe643fb4e41e4666f045341584ea22bb73d5857...,0,10209.706351
5,0x1a5eafe643fb4e41e4666f045341584ea22bb73d5857...,4,10122.178102
6,0x1bcc89149f51daa2d79355a340d576af22555be34567...,0,-7675.061354
7,0x1bcc89149f51daa2d79355a340d576af22555be34567...,2,-7675.758595
8,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,0,0.008008
9,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,2,-73.310478


In [394]:
# Define a function to calculate the difference
def calculate_difference(group):
    if len(group) == 2:
        return group.iloc[1]['use_this_dollar_value'] - group.iloc[0]['use_this_dollar_value']
    else:
        return np.nan

# Group by 'tx_hash' and apply the custom function
result = final.groupby('tx_hash').apply(calculate_difference).reset_index(name='difference')

result.head()

Unnamed: 0,tx_hash,difference
0,0x012778bb6330737bed53ca488e582500498d81e1db22...,-589.170004
1,0x08e622acdf6b27fe26f24e33815bb1a0789158f2d326...,-33.005245
2,0x1a5eafe643fb4e41e4666f045341584ea22bb73d5857...,-87.528248
3,0x1bcc89149f51daa2d79355a340d576af22555be34567...,-0.697241
4,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,-73.318486


In [395]:
result.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 28 entries, 0 to 27
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   tx_hash     28 non-null     object 
 1   difference  28 non-null     float64
dtypes: float64(1), object(1)
memory usage: 580.0+ bytes


#### 2. calculated_dollar_diff

#### Find the loss for each transaction

In [396]:
print("this is the total potential loss in dollars for the given transactions", result.difference.sum())

this is the total potential loss in dollars for the given transactions -8514.660605194244


### exporting the file

In [397]:
result.to_csv(f'results/{name_of_incident}_final.csv')

# Test

In [398]:
result[result.tx_hash == '0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8ebb80b968a367ed5e581']

Unnamed: 0,tx_hash,difference
4,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,-73.318486


In [370]:
result

Unnamed: 0,tx_hash,difference
0,0x012778bb6330737bed53ca488e582500498d81e1db22...,-589.170004
1,0x08e622acdf6b27fe26f24e33815bb1a0789158f2d326...,-33.005245
2,0x1a5eafe643fb4e41e4666f045341584ea22bb73d5857...,-87.528248
3,0x1bcc89149f51daa2d79355a340d576af22555be34567...,-0.697241
4,0x1ee8d8e23e4a026ce8afbf2dc79c196c6b1d43d5e9f8...,73.318486
5,0x36453126f7b08f65178da484754a9195f7f044b032bc...,-29.744761
6,0x3fbadfd097e3edf52396ca21392bbd34a2ad3aaba529...,-1.21003
7,0x4d8761e86be1ac935c925e4ac28513872a38583d4bf6...,-141.149817
8,0x4f9b4cddc272c3803df798041fff63ae7e439982810f...,-330.543631
9,0x623bae0e273896d748142353d5cec32276bf7f49b0ec...,-145.857418
