# Analysis of Transitions
With our implemented Heuristic Process Miner we mined transitions in a process. Let's reverse-engineer them and see whether we can explain what is happening on the Ethereum blockchain.

In [59]:
import pandas as pd
import numpy as np

# Widen the pandas display limits so that large tables print in full.
pd.set_option('display.max_rows', 4000)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
pd.set_option('display.max_seq_items', 2000)

In [46]:
path_contracts_lookup = '/Users/marcelmuller/Documents/Uni/Master/Semester_9_SS_18/Masterarbeit/contractsWithERCFlags.csv'
path_block_times = '/Users/marcelmuller/Documents/Uni/Master/Semester_9_SS_18/Masterarbeit/blockTimes.csv'

In [4]:
ts = pd.read_csv('../../parity_transactions/rgd_1523059200_to_1524528000.csv', index_col=0)

In [11]:
ts

day,CtC->CtC,CtC->CtU,CtC->UtU,CtC->end,CtU->CtC,CtU->CtU,CtU->UtC,CtU->UtU,CtU->end,UtC->CtC,UtC->CtU,UtC->UtC,UtC->UtU,UtC->end,UtU->CtC,UtU->CtU,UtU->UtC,UtU->UtU,UtU->end,sta->CtC,sta->CtU,sta->UtC,sta->UtU,total_events
1523059200000000000,3644,257,0,14,98,13,0,0,5,744,153,0,0,13,0,0,0,0,10,22,0.0,2443,2317,9.138355e+18
1523145600000000000,409560,40639,0,53574,30086,10735,0,0,30849,66147,21964,0,0,229435,0,0,0,0,259778,133,0.0,324546,268339,9.138874e+18
1523232000000000000,527713,32792,0,21920,18405,10861,0,0,9194,75365,17934,21,0,58682,0,0,0,0,79661,243,0.0,312377,295444,2.132525e+19
1523318400000000000,481590,45377,0,39946,35747,55599,0,0,17749,70366,18796,0,0,145202,0,0,0,0,194030,99,0.0,292383,276304,1.218655e+19
1523404800000000000,593556,56604,0,49423,46485,82875,0,0,19856,77733,18861,0,0,169408,0,0,0,0,185035,17,2.0,331382,276265,1.523405e+19
1523491200000000000,580596,30188,0,15991,17973,21309,0,0,6364,77943,15084,4,0,62977,0,0,0,0,82502,20,0.0,370173,319853,9.140947e+18
1523577600000000000,529568,25297,0,76059,14696,27605,0,0,27829,90966,17918,46,0,264359,0,0,0,0,343873,11,0.0,379062,349888,6.09431e+18
1523664000000000000,579405,23645,0,24610,14493,4293,0,0,6643,92045,13003,0,0,65303,0,0,0,0,66690,11,0.0,338424,271131,2.285496e+19
1523750400000000000,653266,28377,0,53692,16476,8500,0,0,19427,92682,15236,0,0,165027,0,0,0,0,200592,25,0.0,340373,278364,1.676125e+19
1523836800000000000,590996,26102,0,53966,13519,9124,0,0,17084,93054,12927,5,0,159039,0,0,0,0,184768,34,0.0,353828,308708,2.742906e+19


Breakdown of Transitions:
*  CtC->CtC: These events only invoke contract activities. They result from a contract calling another contract (or itself), followed by another transaction in which a contract calls another contract (or itself).
*  CtC->CtU: These events can be explained by traces in which a contract executes some logic that involves a transaction to an externally owned user account. Many applications fit this pattern, such as crowdfunding-like situations or any contract representing a composite transaction structure with payouts to one or more externally owned accounts.
*  CtC->UtU: Within one case, a contract-to-contract transaction should not be directly followed by a user-to-user transaction, since user-to-X transactions can only be triggered externally. However, on one day this transition happens four times (-> **EXAMINE**)
*  CtC->end: This kind of transition is common in the execution of any smart contract that does not have a transaction to a user account at the end of its execution.
*  CtU->CtC and CtU->CtU: These transitions make sense when we revisit how our event log is generated: the log is a linear, sequential abstraction of the subtraces in the smart contract execution, which themselves have more complex tree structures. Any composite of CtU..CtU transitions means that several externally owned accounts are being paid by a smart contract. If the composite includes a CtC transaction, the logic also includes an intermediate step.
*  CtU->UtC and CtU->UtU: Such transitions do not exist because user transactions can only be triggered externally.
*  CtU->end: A contract invokes a transaction to an externally owned account as its last activity before its execution terminates.
*  UtC->CtC: This transition is the "start" of the programming logic (CtC) in a smart contract, triggered by an externally owned account (UtC). This is how contracts get started.
*  UtC->CtU: "Forwarding" events (for example ERC20 tokens). A user issues a transaction to a contract, which directly afterwards issues another transaction to another user.
*  UtC->UtC, UtC->UtU: UtC->UtC is not so easily explained. If a user transfers something to a contract account, the contract account should be the next one able to issue a transaction in the same trace, not the user, since externally owned accounts contain no logic. UtC->UtU has the same explanation, and according to our reasoning it makes sense that this transition never occurs. (-> **Examine UtC->UtC**)
*  UtC->end: This is a simple transaction where a user transfers an amount to a contract and the contract "stores" it, meaning it performs no further computational steps afterwards.
*  UtU->CtC, UtU->CtU, UtU->UtC, UtU->UtU: All these transitions occur 0 times in our sample. This is explainable: after a user has issued a transaction to another external user, anything the second user does with the amount would be in another transaction / case.
*  UtU->end: These are simple (BTC-like) transactions where a user sends an amount of ether to another user and nothing further happens with it.
*  sta->CtC, sta->CtU: These seem to be contract transactions triggered "out of thin air" (meaning not by an externally owned user account). **Examine**
*  sta->UtU, sta->UtC: regular starting events. Nothing special to observe.
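
The breakdown above can be turned into a quick check. The following is a minimal sketch, assuming a frame shaped like `ts` with an index named `day` and one column per transition. The names `UNEXPECTED` and `flag_unexpected` are hypothetical; the two sample rows are copied from the table above, restricted to a few columns.

```python
import pandas as pd

# Transitions that the breakdown either expects never to occur
# or marks for closer examination.
UNEXPECTED = [
    "CtC->UtU", "CtU->UtC", "CtU->UtU",
    "UtC->UtC", "UtC->UtU",
    "UtU->CtC", "UtU->CtU", "UtU->UtC", "UtU->UtU",
    "sta->CtC", "sta->CtU",
]

def flag_unexpected(ts: pd.DataFrame) -> pd.DataFrame:
    """Return one row per (day, transition) where an unexpected transition has a nonzero count."""
    cols = [c for c in UNEXPECTED if c in ts.columns]
    long = ts[cols].reset_index().melt(
        id_vars="day", var_name="transition", value_name="count"
    )
    return long[long["count"] > 0].reset_index(drop=True)

# Two days copied from the table above (subset of columns).
sample = pd.DataFrame(
    {
        "CtC->CtC": [527713, 593556],
        "CtC->UtU": [0, 0],
        "UtC->UtC": [21, 0],
        "sta->CtC": [243, 17],
        "sta->CtU": [0.0, 2.0],
    },
    index=pd.Index([1523232000000000000, 1523404800000000000], name="day"),
)

flagged = flag_unexpected(sample)
print(flagged)
```

Applied to the full `ts` frame, the same function would list every day on which one of the transitions under examination shows up.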

Based on our transition explanations, we now examine the points that we cannot easily understand, to find out what is happening there.

## CtC->UtU - Why is this happening?
In our sample time frame this happens on exactly one day, at timestamp 1524096000000000000 (nanoseconds since the Unix epoch), which translates to April 19th, 2018.
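
As a small sanity check of that date, the day index can be converted with pandas. This sketch assumes the day values are nanoseconds since the Unix epoch in UTC; the variable names are illustrative.

```python
import pandas as pd

# Day index value from the text, assumed to be nanoseconds since the Unix epoch.
day_ns = 1524096000000000000

day = pd.to_datetime(day_ns, unit="ns")
print(day)  # 2018-04-19 00:00:00
```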

In [26]:
transitions_19_1 = pd.read_csv('/Users/marcelmuller/Documents/Uni/Master/Semester_9_SS_18/Masterarbeit/parity_transactions/mr_2018-04-16 00:00:00_in_transactions5450000-5459999_4_transitions.csv', index_col=0)



In [29]:
import glob, os
os.chdir("/Users/marcelmuller/Documents/Uni/Master/Semester_9_SS_18/Masterarbeit/parity_transactions/")
for file in glob.glob("*.csv"):
    print(file)

transactions5440000-5449999.csv
mr_2018-04-16 00:00:00_in_transactions5440000-5449999_0_transitions.csv
transactions5500000-5500000.csv
rgd_1523232000_to_1524528000.csv
mr_2018-04-19 00:00:00_in_transactions5460000-5469999_8_confidence.csv
transactions5490000-5499999.csv
mr_2018-04-16 00:00:00_in_transactions5450000-5459999_4_confidence.csv
transactions5410000-5419999.csv
mr_2018-04-08 00:00:00_in_transactions5400000-5409999_5_transitions.csv
rgd_1523664000_to_1524528000.csv
mr_2018-04-22 00:00:00_in_transactions5480000-5489999_6_transitions_agg.csv
mr_2018-04-23 00:00:00_in_transactions5480000-5489999_6_transitions.csv
ps_transactions5400000-5409999_trace_lengths.csv
ps_transactions5460000-5469999_trace_lengths.csv
mr_2018-04-14 00:00:00_in_transactions5430000-5439999_7_confidence.csv
ps_transactions5450000-5459999_event_log.csv
mr_2018-04-23 00:00:00_in_transactions5490000-5499999_2_transitions_agg.csv
rgtl_1523059200_to_1524528000.csv
mr_2018-04-13 00:00:00_in_transactions5430000-54

In [28]:
transitions_19_1.groupby('transition').count()

transition,total_pos,timestamp
CtC->CtC,372713,372713
CtC->CtU,18419,18419
CtC->end,53966,53966
CtU->CtC,9638,9638
CtU->CtU,3312,3312
CtU->end,17084,17084
UtC->CtC,64281,64281
UtC->CtU,8651,8651
UtC->UtC,1,1
UtC->end,159039,159039


As we can see, the 4 events do not occur in that file. We occasionally have more than one CSV file per saved day, since our raw input files are grouped by blocks and not by day (-> overlap).
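
Because one day can be spread over several block-range files, per-day counts have to be summed across all files for that day. A minimal sketch, assuming each file has a `transition` column as in the output above; `count_transitions` is a hypothetical helper, and the two small frames are toy data, not values from the dataset.

```python
import pandas as pd

def count_transitions(frames):
    """Concatenate per-file transition frames and count events per transition."""
    combined = pd.concat(frames, ignore_index=True)
    return combined["transition"].value_counts().sort_index()

# Toy frames standing in for two block-range files that cover the same day.
part_a = pd.DataFrame({"transition": ["CtC->CtC", "CtC->CtU", "CtC->CtC"]})
part_b = pd.DataFrame({"transition": ["CtC->UtU", "CtC->CtC"]})

counts = count_transitions([part_a, part_b])
print(counts)
```

In the notebook, the frames would presumably come from `pd.read_csv` over all `*_transitions.csv` files whose name starts with the day in question.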

In [33]:
transitions_19_2 = pd.read_csv('/Users/marcelmuller/Documents/Uni/Master/Semester_9_SS_18/Masterarbeit/parity_transactions/mr_2018-04-19 00:00:00_in_transactions5460000-5469999_8_transitions.csv', index_col=0)



In [35]:
transitions_19_2.groupby('transition').count()

transition,total_pos,timestamp
CtC->CtC,424077,424077
CtC->CtU,15326,15326
CtC->UtU,4,4
CtU->CtC,6879,6879
CtU->CtU,9701,9701
UtC->CtC,66308,66308
UtC->CtU,10363,10363
sta->CtC,773,773
sta->UtC,387439,387439
sta->UtU,281305,281305


Here are our unexplained CtC->UtU transitions.

In [41]:
ccuus = transitions_19_2[transitions_19_2['transition']=='CtC->UtU']

In [52]:
ccuus

Unnamed: 0,total_pos,timestamp,transition
4015020,5467829002004747,1524132332,CtC->UtU
4074762,5467943002034763,1524134036,CtC->UtU
4086084,5467969002040454,1524134430,CtC->UtU
4118586,5468031002056786,1524135398,CtC->UtU


In [44]:
timestamp = 1524132332

In [49]:
block_times = pd.read_csv(path_block_times, index_col=0)



In [56]:
transaction_lookup = pd.read_csv('/Users/marcelmuller/Documents/Uni/Master/Semester_9_SS_18/Masterarbeit/transaction_lookup.csv')

In [51]:
block_times[block_times['timestamp']==1524132332]

number,timestamp
5467829,1524132332
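
The four suspicious rows can be put next to this lookup. The sketch below uses the values printed above and makes two assumptions that are not confirmed by the source: that the event timestamps are seconds since the Unix epoch, and that the digits of `total_pos` above 10**9 correspond to the block number. Only the first row can be checked against the `block_times` output; the column names `time_utc` and `block_guess` are illustrative.

```python
import pandas as pd

# Values copied from the CtC->UtU output above.
ccuus_sample = pd.DataFrame(
    {
        "total_pos": [
            5467829002004747,
            5467943002034763,
            5467969002040454,
            5468031002056786,
        ],
        "timestamp": [1524132332, 1524134036, 1524134430, 1524135398],
    }
)

# Assumption: event timestamps are in seconds (the day index of `ts` is in nanoseconds).
ccuus_sample["time_utc"] = pd.to_datetime(ccuus_sample["timestamp"], unit="s")

# Assumption: the leading digits of total_pos are the block number.
ccuus_sample["block_guess"] = ccuus_sample["total_pos"] // 10**9

print(ccuus_sample[["time_utc", "block_guess"]])
```

For the first row the guess agrees with the block number returned by the `block_times` lookup, and all four events fall on the same calendar day.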


In [53]:
log_546 = pd.read_csv('/Users/marcelmuller/Documents/Uni/Master/Semester_9_SS_18/Masterarbeit/parity_transactions/ps_transactions5460000-5469999_event_log.csv', index_col=0)



In [73]:
# Join the suspicious transitions to the event log, then resolve transaction ids to hashes.
candidate_tx_hashes = (
    log_546
    .merge(ccuus, on='total_pos')
    .merge(transaction_lookup, left_on='transaction_id', right_on='id')
)['transaction_hash']

In [78]:
candidate_tx_hashes.values

array(['0x90f43e6e82b0305fcb6addf312d33a4c51f20b6f8af76c39917b912bfe37d02a',
       '0x5b6b4bb20e0b0f71377fe942bff7688eb2be6878915be228ccd904a0154fc394',
       '0x89ae3ac9140d4536f2313835f20c98d78ae2dbfc3e30e24dd330f3582ba8131e',
       '0x8659e5ff13cfa643c9d857f2dbc6794e41040fd307205220e49060b81ab0d645'],
      dtype=object)

In [None]:
raw_546 = pd.read_csv('/Users/marcelmuller/Documents/Uni/Master/Semester_9_SS_18/Masterarbeit/parity_transactions/transactions5460000-5469999.csv')

In [None]:
raw_546.head()