# Exercise 5 - Automatically find Exchange links for DWM Query

In this demo we combine DWM and Iknaio to automatically find connections to exchanges, starting from a set of crypto addresses mentioned on a particular category of darkweb sites. Our topic today is CP.

## Preparations

First, we install the graphsense-python package (plus a few helper packages) and define an API key. An API key for the [GraphSense](https://graphsense.github.io/) instance hosted by [Iknaio](https://www.ikna.io/) can be requested by sending an email to [contact@iknaio.com](mailto:contact@iknaio.com).

In [57]:
!pip install graphsense-python seaborn tqdm json-api-doc openpyxl

import graphsense
from graphsense.api import bulk_api, general_api

import json
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from datetime import datetime

# Request the HTML for this web page:
# response = requests.get("https://stackoverflow.com/questions/31126596/saving-response-from-requests-to-file")
# with open("dwm.py", "w") as f:
#     f.write(response.text)

import dwm

def ts_to_pds(ts):
    return datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
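
Note that `datetime.fromtimestamp` without a `tz` argument converts to the local timezone of the machine running the notebook, so the activity periods printed later can differ between machines. A minimal sketch of a UTC-pinned variant follows; the name `ts_to_utc` is hypothetical and not part of the original notebook.

```python
from datetime import datetime, timezone

def ts_to_utc(ts):
    """Format a Unix timestamp (in seconds) as a UTC date-time string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime('%Y-%m-%d %H:%M:%S')

print(ts_to_utc(0))           # → 1970-01-01 00:00:00
print(ts_to_utc(1736027409))  # → 2025-01-04 21:50:09
```

Pinning the timezone makes the reported periods reproducible regardless of where the notebook runs.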



[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip available: [0m[31;49m22.3.1[0m[39;49m -> [0m[32;49m24.3.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [2]:
# secrets = {
#     "gs-api-key" : "",
#     "dwm-credentials" : {"username": "somename@somedomain.io", "password": ""}
# }
with open("secrets.json") as f:
    secrets = json.load(f)
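
Since every later step depends on these two entries, it can help to fail early when one of them is missing or empty. The following is a minimal sketch under that assumption; `load_secrets` and `REQUIRED_KEYS` are hypothetical names, and only the two key names are taken from the commented template above.

```python
import json

# Key names as they appear in the commented template of this notebook.
REQUIRED_KEYS = ("gs-api-key", "dwm-credentials")

def load_secrets(path="secrets.json"):
    """Load the secrets file and raise if an expected entry is missing or empty."""
    with open(path) as f:
        secrets = json.load(f)
    missing = [key for key in REQUIRED_KEYS if not secrets.get(key)]
    if missing:
        raise KeyError(f"missing or empty entries in {path}: {missing}")
    return secrets

# Usage (assumes secrets.json exists in the working directory):
# secrets = load_secrets()
```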

# 1. Load Starting Addresses from DWM

In [3]:
# Request authentication token
headers = dwm.authenticate_api(secrets['dwm-credentials'])

## Load Domains

In [4]:
# Collect domains related to title
title = "Alice with violence CP"

df_domains_all = dwm.get_domains_by_title(title, headers)
df_domains_all

Processed 1 out of 3 pages
Processed 2 out of 3 pages
Processed 3 out of 3 pages


Unnamed: 0,type,id,domain_url,title,status,uptime,page_count,clearnet_cohost_count,darknet_cohost_count,inbound_count,outbound_count,discovered_at
0,torv3,15979412,http://x5w2vdx4lmvha27xjgnnnceudiqd6f3gjuegadu...,Alice with violence CP,online,93,4,0,0,2,0,2024-10-29T20:28:53.000Z
1,torv3,15979413,http://x5cj2bvcxngjohqi7hpkf67fqbqg7wkptcqa2sa...,Alice with violence CP,online,96,4,0,0,2,0,2024-10-29T20:28:53.000Z
2,torv3,15979414,http://vvniruuxyborklcc3i7s5mlerjuysw2rwrd6svr...,Alice with violence CP,online,94,4,0,0,1,0,2024-10-29T20:28:53.000Z
3,torv3,15979415,http://bv34z4lb4mr7djs7y7y62db6pocnmwoa7suxs4o...,Alice with violence CP,online,98,4,0,0,1,0,2024-10-29T20:28:53.000Z
4,torv3,15979416,http://bzuk5hv4r2z3n3asimysuxzwctm75eq3fzcd2ah...,Alice with violence CP,online,99,4,0,0,1,0,2024-10-29T20:28:53.000Z
...,...,...,...,...,...,...,...,...,...,...,...,...
2378,torv2,213940,http://ai4gvgc3syetwn4q.onion,Alice with violence CP,offline,83,4,0,0,4,0,2020-03-31T19:20:52.000Z
2379,torv2,213884,http://4cw2nl4jpeaekp2x.onion,Alice with violence CP,offline,83,4,0,0,4,0,2020-03-31T19:19:16.000Z
2380,torv3,206429,http://c5u4kpqwzbns7ikojebppox22mic44ewokk2mxl...,Alice with violence CP,offline,79,3,0,0,34,0,2020-03-13T01:57:51.000Z
2381,torv2,171377,http://yt33fue5lk4j7bks.onion,Alice with violence CP,offline,5,3,0,0,0,0,2020-02-04T09:18:26.000Z


### Only Keep Online Domains

In [14]:
# only keep online domains
df_domains = df_domains_all.query("status=='online'")
nr_domains = len(df_domains)
print(f"We have found {nr_domains} online domains with title: {title}")

We have found 590 online domains with title: Alice with violence CP


## Get Crypto Addresses on the Domains

In [6]:
df_cryptos_all = dwm.get_crypto_addresses_for_domains(df_domains, headers)

Processing domains: 100%|██████████| 590/590 [05:17<00:00,  1.86it/s]


### Only Keep BTC

In [17]:
df_cryptos = df_cryptos_all.query("type=='BTC'")
unique_addresses = len(df_cryptos["address"].unique())
print(f"We have found {len(df_cryptos)} addresses on these domains, {unique_addresses} of which are unique")

We have found 8483 addresses on these domains, 6226 of which are unique
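
The gap between the row count and the unique count means that some addresses appear in more than one row. A small sketch of how such repeats could be counted is shown below. It uses only the standard library and made-up `(domain, address)` pairs in place of the real `df_cryptos` rows.

```python
from collections import Counter

# Toy (domain, address) pairs standing in for rows of df_cryptos; all values are made up.
rows = [
    ("site-a", "addr1"),
    ("site-b", "addr1"),
    ("site-c", "addr2"),
    ("site-c", "addr1"),
]

# Count how often each address occurs, then keep those seen more than once.
counts = Counter(address for _, address in rows)
repeated = {address: n for address, n in counts.items() if n > 1}

print(len(rows), "rows,", len(counts), "unique addresses")
print("listed more than once:", repeated)
```

Addresses that recur across many domains may point to a shared operator or template, although that interpretation needs to be checked case by case.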


In [49]:
# save output in an excel file
with pd.ExcelWriter("alice_dwm.xlsx") as writer:
    df_domains.to_excel(writer, sheet_name="Domains", index=False)
    df_cryptos.to_excel(writer, sheet_name="Crypto-Assets", index=False)

# Save unique addresses in a CSV file
df_cryptos[["address"]].drop_duplicates(subset=["address"]).to_csv("addresses.csv", index=False)
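
Before the saved addresses are sent to an external API, a coarse format check can filter out strings that were scraped incorrectly. The sketch below is an assumption-laden illustration, not part of the original workflow: the helper name is hypothetical, and the patterns only test character set and length for mainnet addresses without verifying any checksum.

```python
import re

# Coarse patterns only: they check character set and length, not the checksum.
LEGACY_OR_P2SH = re.compile(r"^[13][1-9A-HJ-NP-Za-km-z]{25,34}$")  # Base58, starts with 1 or 3
BECH32 = re.compile(r"^bc1[02-9ac-hj-np-z]{11,71}$")               # SegWit, starts with bc1

def looks_like_btc_address(s):
    """Return True if s has the surface shape of a mainnet Bitcoin address."""
    return bool(LEGACY_OR_P2SH.match(s) or BECH32.match(s))

print(looks_like_btc_address("3BHqn4f41ee6Rhb7mXTsoL65BmyBsqjcRL"))  # → True
print(looks_like_btc_address("not-an-address"))                      # → False
```

A string that passes this check can still be invalid, so the result is best treated as a pre-filter rather than as validation.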

# 2. Finding Exchanges with Iknaio

In [23]:
configuration = graphsense.Configuration(
    host = "https://api.ikna.io/",
    api_key = {
        'api_key': secrets["gs-api-key"]
    }
)

GraphSense supports Bitcoin-like UTXO and Ethereum-like Account-Model ledgers. Iknaio currently hosts BTC, LTC, BCH, ZEC, and ETH.

We are investigating Bitcoin transactions, so we set the default currency to Bitcoin (**BTC**).

In [24]:
CURRENCY = 'btc'

We can test whether our client works by checking what data the GraphSense endpoint provides.

In [25]:
with graphsense.ApiClient(configuration) as api_client:
    api_instance = general_api.GeneralApi(api_client)
    api_response = api_instance.get_statistics()
    display({x['name']:x['no_blocks'] for x in api_response['currencies']})

{'btc': 879056,
 'bch': 880818,
 'ltc': 2826418,
 'zec': 2784600,
 'eth': 21614477,
 'trx': 68627653}
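
The same statistics can also be used to confirm that the chosen currency is served at all before any bulk request is sent. The following sketch works on a plain dictionary shaped like the `currencies` part of the response used above; the real client returns a model object, and `block_heights` is a hypothetical helper.

```python
def block_heights(stats):
    """Map each currency name to its indexed block count."""
    return {c["name"]: c["no_blocks"] for c in stats["currencies"]}

# Toy response that reuses three of the values printed above.
stats = {"currencies": [
    {"name": "btc", "no_blocks": 879056},
    {"name": "ltc", "no_blocks": 2826418},
    {"name": "eth", "no_blocks": 21614477},
]}

CURRENCY = "btc"
heights = block_heights(stats)
if CURRENCY not in heights:
    raise ValueError(f"{CURRENCY!r} is not served by this endpoint: {sorted(heights)}")
print(f"{CURRENCY}: {heights[CURRENCY]} blocks indexed")
```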

# Q1: How many of the addresses are used?

Instead of querying each address individually, we pass the full list of known addresses to the bulk API in a single request.

In [79]:
seed_addresses = pd.read_csv("addresses.csv")

with graphsense.ApiClient(configuration) as clnt:
    blkapi = bulk_api.BulkApi(clnt)

    # documentation about available bulk operations can be found
    # here https://api.ikna.io/#/bulk/bulk_csv
    rcsv = blkapi.bulk_csv(
                CURRENCY,
                operation="get_address",
                body={
                    'address': seed_addresses['address'].to_list()
                },
                num_pages=1,
                _preload_content=False
              )
    respAddrDF = pd.read_csv(rcsv)

used_addresses = respAddrDF[[
    "address", "balance_eur", "total_received_eur", "total_spent_eur",
    "in_degree", "out_degree", "no_incoming_txs", "no_outgoing_txs",
    "first_tx_timestamp", "last_tx_timestamp", "entity"
]].dropna()
used_addresses.head(5)

Unnamed: 0,address,balance_eur,total_received_eur,total_spent_eur,in_degree,out_degree,no_incoming_txs,no_outgoing_txs,first_tx_timestamp,last_tx_timestamp,entity
6173,3BHqn4f41ee6Rhb7mXTsoL65BmyBsqjcRL,34.07,35.18,0.0,1.0,0.0,1.0,0.0,1735207000.0,1735207000.0,1357103000.0
6174,3MnjbkxCdkTYgZD3BkWbC8JA38sGS374To,37.7,38.56,0.0,1.0,0.0,1.0,0.0,1736027000.0,1736027000.0,1360100000.0
6175,3Ettx8jZGs1AK7WLs48sSEcjX1nz5EoeLt,33.1,33.07,0.0,1.0,0.0,1.0,0.0,1733043000.0,1733043000.0,1348621000.0
6176,3CmTgpAcgGU8XwzX7Sqowrj2AA2chLajCK,52.41,31.39,0.0,1.0,0.0,1.0,0.0,1728061000.0,1728061000.0,1330405000.0
6177,3DKx1zDQw3F7zz5Gwo7G9qbrikSmWjvqHP,0.0,0.27,0.27,1.0,1.0,1.0,1.0,1689982000.0,1690897000.0,1142124000.0
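
The request above sends all addresses in one call. Whether a given deployment limits the request size is not stated here; if it does, the list could be split into batches and the per-batch results concatenated. The sketch below shows only the batching step; `chunked` and the batch size are illustrative assumptions.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]

addresses = [f"addr{i}" for i in range(7)]  # placeholder values
batches = list(chunked(addresses, 3))
print([len(batch) for batch in batches])  # → [3, 3, 1]
```

Each batch could then be passed as the `address` list of one `bulk_csv` call, and the resulting dataframes combined with `pd.concat`.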


In [80]:
print(f"{len(used_addresses)} addresses received {sum(used_addresses['total_received_eur']):.2f} EUR, Balance {sum(used_addresses['balance_eur']):.2f} EUR")
print(f"Activity period of the addresses was: {ts_to_pds(min(used_addresses['first_tx_timestamp']))} to {ts_to_pds(max(used_addresses['last_tx_timestamp']))}")

53 addresses received 10044.19 EUR, Balance 546.49 EUR
Activity period of the addresses was: 2020-10-26 06:29:13 to 2025-01-04 22:50:09
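
To put these figures in proportion, the numbers printed so far can be combined into two simple ratios. This is plain arithmetic on the values reported in this notebook (6226 unique addresses, 53 rows left after `dropna()`, 10044.19 EUR received); the variable names are illustrative.

```python
unique_addresses = 6226        # unique BTC addresses found on the domains
used = 53                      # addresses remaining after the dropna() filter
total_received_eur = 10044.19  # sum of total_received_eur over those addresses

share_used = 100 * used / unique_addresses
mean_received = total_received_eur / used

print(f"{share_used:.2f}% of the addresses remain after filtering")      # → 0.85%
print(f"{mean_received:.2f} EUR received per used address on average")   # → 189.51 EUR
```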


# Q2: Are there direct links to exchanges?

In [81]:
with graphsense.ApiClient(configuration) as clnt:
    blkapi = bulk_api.BulkApi(clnt)

    # documentation about available bulk operations can be found
    # here https://api.ikna.io/#/bulk/bulk_csv
    rcsv = blkapi.bulk_csv(
                CURRENCY,
                operation="list_address_neighbors",
                body={
                    'address': used_addresses['address'].to_list(),
                    'direction': 'out',
                    'include_labels': True
                },
                num_pages=1,
                _preload_content=False
              )
    respAddrDF = pd.read_csv(rcsv)

with_label = respAddrDF.query("labels.notnull()")

with_outgoing = respAddrDF.query("_info != 'no data'")

print(f"We have found {len(with_outgoing)} outgoing neighbors, {len(with_label)} are known")

We have found 126 outgoing neighbors, 0 are known
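
The two filters used above can be illustrated on toy data. The sketch below relies only on the standard library and made-up rows; the column names `labels` and `_info` come from the notebook, whereas `neighbor` and all values are hypothetical. It also assumes that a row with `_info` set to `no data` stands for an address without outgoing neighbors, which is how the code above appears to treat it.

```python
import csv
import io

# Made-up rows; only the column names `labels` and `_info` come from the notebook.
toy_csv = """address,neighbor,labels,_info
a1,n1,,
a1,n2,some-exchange,
a2,,,no data
"""

rows = list(csv.DictReader(io.StringIO(toy_csv)))

# Rows that describe a real neighbor (anything not marked as 'no data').
with_outgoing = [r for r in rows if r["_info"] != "no data"]
# Rows whose neighbor carries at least one label.
with_label = [r for r in rows if r["labels"]]

print(len(with_outgoing), "outgoing neighbors,", len(with_label), "labelled")
```

With pandas, empty cells are read as `NaN`, which is why the notebook tests `labels.notnull()` instead of comparing against an empty string.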


# Q3: Can I find connections using clusters?

We now fetch summary statistics for each entity (cluster) that the used addresses belong to.

In [87]:
with graphsense.ApiClient(configuration) as clnt:
    # use the client opened by this `with` block
    blkapi = bulk_api.BulkApi(clnt)
    rcsv = blkapi.bulk_csv(
                CURRENCY,
                operation="get_entity",
                body={
                    'entity': used_addresses['entity'].drop_duplicates().to_list()
                },
                num_pages=1,
                _preload_content=False
              )
    respEntityDF = pd.read_csv(rcsv)

clusters = respEntityDF[
    ["best_address_tag_label",
     "root_address",
     "no_addresses",
     "balance_eur",
     "total_received_eur",
     "total_spent_eur",
     "first_tx_timestamp",
     "last_tx_timestamp"]
     ]

print(f"{sum(clusters['no_addresses'])-len(used_addresses)} new addresses have been found. They received {sum(clusters['total_received_eur']):.2f} EUR, Balance {sum(clusters['balance_eur']):.2f} EUR")
print(f"Activity period of the cluster addresses was: {ts_to_pds(min(clusters['first_tx_timestamp']))} to {ts_to_pds(max(clusters['last_tx_timestamp']))}")
clusters.query("best_address_tag_label.notnull()")

4640 new addresses have been found. They received 543251.10 EUR, Balance 3370.23 EUR
Activity period of the cluster addresses was: 2014-08-09 23:48:57 to 2025-01-12 13:50:05


Unnamed: 0,best_address_tag_label,root_address,no_addresses,balance_eur,total_received_eur,total_spent_eur,first_tx_timestamp,last_tx_timestamp
1,Dark Web,3MnjbkxCdkTYgZD3BkWbC8JA38sGS374To,1,37.7,38.56,0.0,1736027409,1736027409
2,Dark Web,3BRfy6BnyFb5XEdzCGrhjbqwMK71egoRDu,1,0.94,0.95,0.0,1732533419,1732533419
4,Dark Web,3CmTgpAcgGU8XwzX7Sqowrj2AA2chLajCK,1,52.41,31.39,0.0,1728060509,1728060509
7,Dark Web,3E2YaXkyp3uxSRVLqwx81mveAowHGEWSQw,1,53.33,31.87,0.0,1727915708,1727915708
12,Dark Web,3NV97tRYA74egEFfMen9JD7JzbM5U8hjYc,1,30.34,31.86,0.0,1734306315,1734306315
13,Dark Web,3BHqn4f41ee6Rhb7mXTsoL65BmyBsqjcRL,1,34.07,35.18,0.0,1735206829,1735206829
16,Dark Web,3EC9u8Jz9RzFAfinDBttDjfHnNiq3c8cPi,1,59.02,35.34,0.0,1728079448,1728079448
17,Dark Web,3BPtjKVvFhsFfBu4R5NAHWcqtsQJ1yVZqF,1,48.09,242.54,315.33,1697853167,1735528705
20,Dark Web,3Ettx8jZGs1AK7WLs48sSEcjX1nz5EoeLt,1,33.1,33.07,0.0,1733043388,1733043388
25,Dark Web,37MVTJ315tNA1fcEUiGDBkrkyvPLxyrqsT,1,42.29,32.45,0.0,1731084774,1731084774


# Q4: What if I look at multiple hops? Are there any exchanges?