# Playing around with Blockchain Graphs
We want to explore blockchain data in a more intuitive way using graphs. We use the ShroomDK API from Flipside Crypto to run queries, then use graph libraries to explore the results. To gain access to ShroomDK, visit [https://sdk.flipsidecrypto.xyz/shroomdk](https://sdk.flipsidecrypto.xyz/shroomdk) and mint an API key, then assign it to the `my_key` variable below.

The approach is to follow connections between wallets for as many steps as it takes to hit a labelled wallet. This gives us information about known entities, plus a set of unlabelled addresses that might be of interest.

## General flow
- Input a list of seed addresses from which we will grow a graph of transactions and associated wallets.
- Run SQL queries with ShroomDK to find all transactions involving those seed addresses, look up any address labels, and check whether each address is a contract.
- Use NetworkX to build a graph from these transactions. This creates node and edge objects from a dataframe of transactions.
- Using the address labels, simplify the graph, e.g. if there are 10 different addresses associated with "Binance", represent them as a single node.
- Enrich the graph with data about the nodes and edges (e.g. total transaction volume between nodes, and net volume/direction).
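
The graph-building, simplification and enrichment steps can be sketched on a toy dataframe. This is only an illustration, not the code in `utils`: the column names `from_address`, `to_address` and `amount_usd`, the toy addresses and labels, and the helper `build_label_graph` are all assumptions made up for the example.

```python
import networkx as nx
import pandas as pd


def build_label_graph(tx_df, address_label_dict):
    """Collapse labelled addresses into one node per label and sum volumes.

    Assumes (hypothetically) that tx_df has the columns
    from_address, to_address and amount_usd.
    """
    tx = tx_df.copy()
    # Replace each address by its label when one is known, else keep the address.
    tx["src"] = tx["from_address"].map(lambda a: address_label_dict.get(a, a))
    tx["dst"] = tx["to_address"].map(lambda a: address_label_dict.get(a, a))

    # Total volume per directed pair of nodes.
    volume = tx.groupby(["src", "dst"], as_index=False)["amount_usd"].sum()

    # One directed edge per pair, carrying the summed volume as an attribute.
    return nx.from_pandas_edgelist(
        volume,
        source="src",
        target="dst",
        edge_attr="amount_usd",
        create_using=nx.DiGraph,
    )


# Toy data: two different addresses share the (made-up) label "Binance".
toy_tx = pd.DataFrame({
    "from_address": ["0xaaa", "0xaaa", "0xbbb", "0xccc"],
    "to_address": ["0xbbb", "0xccc", "0xaaa", "0xddd"],
    "amount_usd": [100.0, 50.0, 30.0, 20.0],
})
toy_labels = {"0xbbb": "Binance", "0xccc": "Binance"}

g = build_label_graph(toy_tx, toy_labels)

# Net volume between two nodes is the difference of the two directed totals.
net_to_binance = (
    g["0xaaa"]["Binance"]["amount_usd"] - g["Binance"]["0xaaa"]["amount_usd"]
)
```

Here the two "Binance" addresses end up as a single node, and the sign of `net_to_binance` gives the net direction of flow between `0xaaa` and that node.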

### First, initialize the API

In [1]:
%load_ext autoreload
%autoreload 2

import pandas as pd
from shroomdk import ShroomDK
import networkx as nx
from utils import grow_df, draw_graph

# my key
my_key = '20a4ac26-4880-4fa2-a4b7-53ea31edb54f'
sdk = ShroomDK(my_key)

### Grow the dataframe from the seed addresses

In [2]:
# list of starting address(es) of interest
seed_addresses = ['0x64e9B9cD74A46f71e7631CB033afA6E7849a8683']

# grow_df works the same way at every step of growing the graph, so we start by initializing empty objects
nogrow_addresses = [] # addresses that we won't continue to grow from; can be updated as we go
contracts = [] # contract addresses that we won't grow the graph from; can also be updated as we go
address_label_dict = {} # empty address -> label dictionary
df = pd.DataFrame() # empty pandas dataframe of transactions

# make sure all addresses are lowercase, otherwise we run into problems
seed_addresses = [x.lower() for x in seed_addresses]
nogrow_addresses = [x.lower() for x in nogrow_addresses]
contracts = [x.lower() for x in contracts]

limit_connections = 500 # max number of transactions to return from the set of seed addresses
# How the transactions are ranked before the limit is applied, e.g. with rank_by="amount_usd" and
# limit_connections=500 we get the top 500 transactions from the seed addresses ranked by USD amount.
# NOTE: rank_by is defined here but is not passed to grow_df in the call below.
rank_by = "amount_usd"

out = grow_df(seed_addresses,
              nogrow_addresses,
              sdk,
              address_label_dict_prev=address_label_dict,
              contracts_prev=contracts,
              df=df,
              limit_connections=limit_connections)

seed_addresses, nogrow_addresses, address_label_dict, label_address_dict, contracts, df = out

Running address query
Running contract query
Running label query
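
How `grow_df` applies `limit_connections` and the ranking is not shown in this notebook. As a rough pandas analogue of "keep the top N transactions ranked by USD amount", one could write something like the following; the helper `top_connections` and the toy columns `tx_hash` and `amount_usd` are hypothetical.

```python
import pandas as pd


def top_connections(tx_df, rank_by="amount_usd", limit_connections=500):
    """Return the limit_connections rows with the largest rank_by value."""
    return tx_df.nlargest(limit_connections, rank_by)


# Toy transactions (made-up hashes and amounts).
toy_tx = pd.DataFrame({
    "tx_hash": ["0x01", "0x02", "0x03", "0x04"],
    "amount_usd": [10.0, 2500.0, 75.0, 900.0],
})

# Keep only the two largest transfers.
top2 = top_connections(toy_tx, rank_by="amount_usd", limit_connections=2)
```

With a real query it is usually cheaper to apply the ranking and the limit in SQL, so that only the top rows are ever returned.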


### Use the dataframe to make a graph, exported as HTML
Currently set up so that:

- Addresses are circles
- Contracts are squares
- Labelled addresses are yellow and large
- Unlabelled addresses are red and small
- Edges connecting to at least one labelled node are yellow
- All other edges (connecting only unlabelled addresses) are red
- Edge widths are determined by total transaction volume

The HTML file is saved to the directory under the name given by the `filename` variable.
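
The styling rules above can be written down as two small functions. This is a sketch of the rules only, not of `draw_graph` itself: the function names, the attribute names and the sizes 30 and 10 are assumptions chosen for illustration, and the edge-width scaling is left out because it is not specified here.

```python
def node_style(address, address_label_dict, contracts):
    """Return shape, color and size for one node, following the rules above."""
    labelled = address in address_label_dict
    return {
        "shape": "square" if address in contracts else "circle",
        "color": "yellow" if labelled else "red",
        "size": 30 if labelled else 10,  # hypothetical "large" and "small" sizes
    }


def edge_color(src, dst, address_label_dict):
    """Yellow if at least one endpoint is labelled, red otherwise."""
    if src in address_label_dict or dst in address_label_dict:
        return "yellow"
    return "red"


# Toy inputs (made-up addresses).
toy_labels = {"0xbbb": "Binance"}
toy_contracts = ["0xccc"]

labelled_style = node_style("0xbbb", toy_labels, toy_contracts)
contract_style = node_style("0xccc", toy_labels, toy_contracts)
```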

In [3]:
filename = "test1"
html = draw_graph(df,address_label_dict,label_address_dict,contracts,filename)

### Now grow the graph by another step, i.e. for every new address, find all of its connections too

In [4]:
out = grow_df(seed_addresses,
              nogrow_addresses,
              sdk,
              address_label_dict_prev=address_label_dict,
              contracts_prev=contracts,
              df=df,
              limit_connections=limit_connections)

seed_addresses, nogrow_addresses, address_label_dict, label_address_dict, contracts, df = out

Running address query
Running contract query
Running label query
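
Each call returns an updated `seed_addresses` list, which is what makes the repeated call grow the graph outward. The internals of `grow_df` are not shown in this notebook, but the idea of picking the next round of seeds can be sketched as a breadth-first frontier: take every address seen in the transactions so far and drop the ones we have already expanded, the ones we chose not to grow from, and contracts. The helper `next_seed_addresses` and its arguments are hypothetical.

```python
def next_seed_addresses(edges, visited, nogrow_addresses, contracts):
    """Return the addresses to expand in the next step.

    edges is an iterable of (from_address, to_address) pairs. Addresses that
    were already expanded, are marked as no-grow, or are contracts are skipped.
    """
    excluded = set(visited) | set(nogrow_addresses) | set(contracts)
    frontier = set()
    for src, dst in edges:
        for address in (src, dst):
            address = address.lower()  # compare addresses in lowercase
            if address not in excluded:
                frontier.add(address)
    return sorted(frontier)


# Toy example (made-up addresses): only 0xccc is new and allowed to grow.
toy_edges = [("0xAAA", "0xbbb"), ("0xaaa", "0xccc"), ("0xbbb", "0xddd")]
new_seeds = next_seed_addresses(
    toy_edges,
    visited=["0xaaa"],
    nogrow_addresses=["0xbbb"],
    contracts=["0xddd"],
)
```

Presumably labelled addresses are added to the no-grow list as they are discovered, which is what stops the expansion at known entities.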


In [5]:
# make another graph, now including the second-order connections
filename = "test2"
html = draw_graph(df,address_label_dict,label_address_dict,contracts,filename)

### And one more step

In [6]:
out = grow_df(seed_addresses,
              nogrow_addresses,
              sdk,
              address_label_dict_prev=address_label_dict,
              contracts_prev=contracts,
              df=df,
              limit_connections=limit_connections)

seed_addresses, nogrow_addresses, address_label_dict, label_address_dict, contracts, df = out

Running address query
Running contract query
Running label query


In [7]:
# make another graph
filename = "test3"
html = draw_graph(df,address_label_dict,label_address_dict,contracts,filename)