Near-cliques and stars: In a Bitcoin transaction graph, a near-clique is a group of addresses that transact densely with one another, which could indicate collusion or money laundering. A star is the opposite extreme: a hub address whose many counterparties rarely or never transact with each other, which could indicate suspicious behavior or an attempt to hide connections.

Heavy vicinities: In the context of Bitcoin transactions, a heavy vicinity is a neighborhood whose total transaction weight is unusually large for the number of edges it contains, for example an address that transacts repeatedly with the same counterparties. This could suggest activities such as mixing services or attempts to obfuscate the transaction flow by cycling funds through a set of addresses.

Dominant heavy links: A dominant heavy link in the 1-step neighborhood of an address means that a single edge carries a disproportionate share of the neighborhood's transactions. This could indicate stalker-like behavior toward one counterparty, or an address that is being used excessively for fraudulent activities.
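
The three patterns above are all statements about an address's 1-step neighborhood (its egonet). As a minimal sketch, assuming a small hypothetical edge list rather than the real data, the function below counts the neighbors, edges, and total weight of an egonet, plus the share of weight on its heaviest edge. The function name and the `max_edge_share` measure are illustrative assumptions, not part of the oddball code, which may score these patterns differently.

```python
from collections import defaultdict


def egonet_features(edges, ego):
    """Return (n_neighbors, n_edges, total_weight, max_edge_share) for ego's egonet.

    edges: iterable of (u, v, weight) tuples for an undirected graph,
    with each undirected pair listed once.
    """
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w

    # The egonet is the ego, its direct neighbors, and all edges among them.
    members = set(adj[ego]) | {ego}
    ego_edges = {}
    for u in members:
        for v, w in adj[u].items():
            if v in members:
                ego_edges[frozenset((u, v))] = w  # frozenset stores each pair once

    n_neighbors = len(members) - 1
    n_edges = len(ego_edges)
    total_weight = sum(ego_edges.values())
    # Illustrative proxy for a dominant heavy link: weight share of the heaviest edge.
    max_edge_share = max(ego_edges.values()) / total_weight if total_weight else 0.0
    return n_neighbors, n_edges, total_weight, max_edge_share


# Hypothetical toy graph with three disjoint pieces.
toy_edges = [
    # star: hub "h" with four neighbors that never transact with each other
    ("h", "a", 1), ("h", "b", 1), ("h", "c", 1), ("h", "d", 1),
    # clique: p, q, r, s all transact with one another
    ("p", "q", 1), ("p", "r", 1), ("p", "s", 1),
    ("q", "r", 1), ("q", "s", 1), ("r", "s", 1),
    # dominant link: almost all of x's weight sits on the edge to y
    ("x", "y", 50), ("x", "z", 1),
]

for node in ("h", "p", "x"):
    print(node, egonet_features(toy_edges, node))
```

In this toy graph the star has as many edges as neighbors, the clique has an edge between every pair of members, and the third egonet has nearly all of its weight on one edge.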

## Convert edges to txt

In [1]:
import numpy as np
import pandas as pd

In [2]:
df_edge = pd.read_csv('data/t_finance/edges.csv')
display(df_edge.head())
print(df_edge.shape)

Unnamed: 0,ID1,ID2
0,12230,6579
1,12230,8455
2,12230,272
3,12230,7868
4,12230,12859


(6032438, 2)


In [3]:
# count duplicate edges
print(df_edge.duplicated().sum())

0


In [4]:
# convert edges to 'ID1 ID2 weight' format in a txt file
df_edge['ID1'] = df_edge['ID1'].astype(str)
df_edge['ID2'] = df_edge['ID2'].astype(str)
# calculate weight: number of occurrences of each edge
df_edge['weight'] = df_edge.groupby(['ID1', 'ID2'])['ID1'].transform('count')
display(df_edge.head())

df_edge[['ID1', 'ID2', 'weight']].to_csv('oddball/converted_data/t_finance.txt', sep=' ', header=False, index=False)

Unnamed: 0,ID1,ID2,weight
0,12230,6579,1
1,12230,8455,1
2,12230,272,1
3,12230,7868,1
4,12230,12859,1
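
The duplicate check and the `groupby` above treat `(ID1, ID2)` as an ordered pair, while oddball reads the file as an undirected graph. Whether this dataset contains reversed pairs is not known from the output shown, so the following is only a sketch, on hypothetical pairs, of counting edge weights with the two endpoints put in a canonical order first. The function names are illustrative.

```python
from collections import Counter


def undirected_edge_weights(pairs):
    """Count occurrences of each undirected edge; (a, b) and (b, a) are the same edge."""
    counts = Counter()
    for a, b in pairs:
        key = (a, b) if a <= b else (b, a)  # canonical order: smaller ID first
        counts[key] += 1
    return counts


def to_oddball_lines(counts):
    """Format the counts as 'ID1 ID2 weight' lines, sorted by edge for stable output."""
    return [f"{a} {b} {w}" for (a, b), w in sorted(counts.items())]


# Hypothetical sample: the first two pairs are the same undirected edge.
sample_pairs = [(12230, 6579), (6579, 12230), (12230, 8455)]
sample_lines = to_oddball_lines(undirected_edge_weights(sample_pairs))
print("\n".join(sample_lines))
```

With ordered pairs the sample would yield three edges of weight 1; with canonical ordering it yields two edges, one of weight 2.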


## Run oddball

The input is a weighted undirected graph, one edge per line, in the format 'node1 node2 weight'.  
### Options:  
  --input: input file  
  --output: output file  
  --lof: whether to use LOF (Local Outlier Factor). 0: do not use. 1: use. Default value is 0.  
  --anomaly_type: anomaly type. 1: star_or_clique. 2: heavy_vicinity. 3: dominant_edge.

You can use --help for more details.  
Here is a sample.
```
python main.py --input inputFile --output outputFile --lof 0 --anomaly_type 1
```

In [5]:
!python oddball/code/main.py --input oddball/converted_data/t_finance.txt --output oddball/converted_data/result/output_type_2.txt --lof 0 --anomaly_type 2
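
The cell above covers anomaly type 2 only. As a sketch, the helper below builds the same command for each of the three anomaly types using only the flags listed in the options. The helper name is hypothetical, and the output file names for types 1 and 3 are an assumption that follows the `output_type_2.txt` pattern.

```python
ANOMALY_TYPES = {1: "star_or_clique", 2: "heavy_vicinity", 3: "dominant_edge"}


def build_oddball_command(input_path, output_dir, anomaly_type, lof=0):
    """Return the argument list for one oddball run (hypothetical helper)."""
    if anomaly_type not in ANOMALY_TYPES:
        raise ValueError(f"anomaly_type must be one of {sorted(ANOMALY_TYPES)}")
    # Assumed naming scheme, mirroring output_type_2.txt from the cell above.
    output_path = f"{output_dir}/output_type_{anomaly_type}.txt"
    return [
        "python", "oddball/code/main.py",
        "--input", input_path,
        "--output", output_path,
        "--lof", str(lof),
        "--anomaly_type", str(anomaly_type),
    ]


commands = [
    build_oddball_command(
        "oddball/converted_data/t_finance.txt",
        "oddball/converted_data/result",
        anomaly_type,
    )
    for anomaly_type in sorted(ANOMALY_TYPES)
]

for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to run it from Python instead of a `!python` cell.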