# Web3 Community and Education sybil analysis

This notebook looks for patterns of sybil behavior in the Web3 Community and Education round.

This round was selected because during the grant review I flagged many projects from this round as suspicious.


We use sybil-scorer as well as new metrics to identify suspicious behavior.

We will then manually review the addresses using etherscan and DeBank to confirm our suspicions.

Finally, we check whether any project is disproportionately affected by these flagged addresses, to back up our claims of suspicious project applications. This also shows whether the sybil behavior targets a specific project.

In [5]:
import os
from pathlib import Path
import numpy as np

import pandas as pd

from sbdata.FlipsideApi import FlipsideApi

# Set path to data folder
current_dir = Path(os.getcwd())
PATH_TO_EXPORT = os.path.join(current_dir.parent.parent, 'tx_data', 'web3_community_and_education')
DATA_DIR = os.path.join(current_dir.parent.parent, 'data-regen-rangers')

# Set up the Flipside API client and the paths to the input files

api_key = os.environ['FLIPSIDE_API_KEY']
flipside_api = FlipsideApi(api_key, max_address=400)
PATH_TO_VOTES = os.path.join(DATA_DIR, "beta_round_votes.csv")
PATH_TO_GRANTS = os.path.join(DATA_DIR, "all-allo-rounds.csv")
PATH_TO_PROJECTS = os.path.join(DATA_DIR, "projects_QmQurt.csv")

# load data
df_votes = pd.read_csv(PATH_TO_VOTES)
df_grants = pd.read_csv(PATH_TO_GRANTS)
df_application = pd.read_csv(PATH_TO_PROJECTS)
# Lowercase all addresses because the Flipside API returns lowercase addresses
df_grants['Round ID'] = df_grants['Round ID'].str.lower()
str_columns_votes = ['id', 'transaction', 'projectId', 'roundId', 'voter', 'grantAddress']
df_votes[str_columns_votes] = df_votes[str_columns_votes].applymap(lambda x: x.lower())

str_columns_application = ['id', 'roundId', 'metadata.application.round', 'metadata.application.recipient']
df_application[str_columns_application] = df_application[str_columns_application].applymap(lambda x: str(x).lower())

round_id = df_grants[df_grants['Round name'] == 'Web3 Community and Education']['Round ID'].values[0]
array_unique_address = df_votes[df_votes['roundId'] == round_id]['voter'].unique()

array_unique_address = np.char.lower(array_unique_address.astype(str))


In [None]:
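# Sanity check with NumPy's set difference; comparing the array with itself returns an empty array
np.setdiff1d(array_unique_address, array_unique_address)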

In [7]:
from sbutils import LoadData

# Load data
data_loader = LoadData.LoadData(PATH_TO_EXPORT)
df_tx = data_loader.create_df_tx('ethereum')

In [8]:
len(array_unique_address)

372

In [9]:
df_tx.EOA.nunique()

371

In [10]:
c = np.setxor1d(array_unique_address, df_tx.EOA.values)
c

array(['0x4a35674727c44cf4375d80c6171281ba2f764213'], dtype=object)
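
`np.setxor1d` returns the addresses present in only one of the two arrays, without saying which side they come from. A minimal sketch of a directional check, assuming two toy arrays (`voters` and `retrieved` are hypothetical stand-ins for `array_unique_address` and `df_tx.EOA.values`):

```python
import numpy as np

# Toy stand-ins for the voter list and the addresses returned by the API.
voters = np.array(["0xa", "0xb", "0xc"])
retrieved = np.array(["0xa", "0xb", "0xd"])

# Voters for which no transaction data was retrieved.
missing_from_api = np.setdiff1d(voters, retrieved)

# Addresses with transaction data that are not in the voter list.
unexpected = np.setdiff1d(retrieved, voters)

print("missing from API:", missing_from_api)
print("not a voter:", unexpected)
```

Running both directions makes it explicit that a leftover address is a voter without transaction data rather than the reverse.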

## Looking at a specific address donation

I don't know why this address is not retrieved by the API, but it is a voter and shows some interesting patterns worth exploring.
The address 0x4a35674727c44cf4375d80c6171281ba2f764213 voted for many projects, giving exactly the same amount to each of them. Let's see:
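
One way to test the "same amount to every project" pattern is to count the distinct amounts per voter. The sketch below runs on a made-up table: the column names mirror `df_votes`, while the values and the helper `donation_profile` are hypothetical. Because USD values can drift with the token price, comparing the raw `amount` per `token` may be a stricter check.

```python
import pandas as pd

# Toy votes table; column names mirror df_votes, values are made up.
votes = pd.DataFrame({
    "voter": ["0xaaa", "0xaaa", "0xaaa", "0xbbb", "0xbbb"],
    "projectId": ["p1", "p2", "p3", "p1", "p2"],
    "amountUSD": [1.0, 1.0, 1.0, 5.0, 12.5],
})

def donation_profile(df: pd.DataFrame) -> pd.DataFrame:
    """Per voter: number of votes, number of distinct amounts, uniformity flag."""
    profile = df.groupby("voter")["amountUSD"].agg(
        n_votes="count", n_amounts="nunique"
    )
    # "Uniform" means several votes that all carry exactly the same amount.
    profile["uniform"] = (profile["n_votes"] > 1) & (profile["n_amounts"] == 1)
    return profile

profile = donation_profile(votes)
print(profile)
```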

In [11]:
suspicious_add = c[0]

In [16]:
df_vote_sus = df_votes[df_votes.voter== suspicious_add]
print(f'Number of votes: {len(df_vote_sus)}')
print(f'Number of unique projects: {df_vote_sus.projectId.nunique()}')
print(f'Number of unique grants: {df_vote_sus.grantAddress.nunique()}')

Number of votes: 48
Number of unique projects: 48
Number of unique grants: 47


Which project received the most votes from this address?

In [19]:
df_vote_sus['grantAddress'].value_counts().head(3)

grantAddress
0xb6e780438882f2daa11da0972807f4d12166af8b    2
0x486c853c41885d9c1edb57c423853fc4fbc60769    1
0xc36e4889a820bd8089a8ad226ee9d0c703aed314    1
Name: count, dtype: int64

In [27]:
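# Look up the project behind the most-voted grant address
# (suspicious_add is reused here and now holds a grant address, not the voter address)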
suspicious_add = df_vote_sus['grantAddress'].value_counts().index[0]
df_application[df_application['metadata.application.recipient'] == suspicious_add]['metadata.application.project.title']


200    Impact DAOs Research + Podcast + Book : Impact...
269    Impact DAO Course + IRL/Online Events: Educati...
Name: metadata.application.project.title, dtype: object

I find it very suspicious that this project, "Impact DAO", applied for a grant with two subprojects. It would be interesting to explore how much voter overlap there is between the two subprojects.
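
A possible way to quantify that overlap is the Jaccard index of the two voter sets (shared voters divided by all distinct voters). This is only a sketch on toy data; the helper `voter_overlap` and the project labels `"a"` and `"b"` are hypothetical, and the column names follow `df_votes`.

```python
import pandas as pd

# Toy votes table; "a" and "b" stand in for the two subproject IDs.
votes = pd.DataFrame({
    "projectId": ["a", "a", "a", "b", "b", "b", "b"],
    "voter": ["0x1", "0x2", "0x3", "0x2", "0x3", "0x4", "0x5"],
})

def voter_overlap(df, project_a, project_b):
    """Return the shared voters and the Jaccard index of the two voter sets."""
    voters_a = set(df.loc[df["projectId"] == project_a, "voter"])
    voters_b = set(df.loc[df["projectId"] == project_b, "voter"])
    shared = voters_a & voters_b
    union = voters_a | voters_b
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard

shared, jaccard = voter_overlap(votes, "a", "b")
print(sorted(shared), round(jaccard, 2))
```

A Jaccard index close to 1 would mean the two subprojects are funded by almost the same set of addresses.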

In [30]:
df_votes[df_votes['grantAddress'] == suspicious_add].groupby('projectId').count()

Unnamed: 0_level_0,id,transaction,blockNumber,applicationId,roundId,voter,grantAddress,token,amount,amountUSD,amountRoundToken
projectId,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1
0x3e7af275e7342f380ce8182e2f1e19ecbf32086974f53a386a8e1278016f6a50,7,7,7,7,7,7,7,7,7,7,7
0xa35e52635d40be090a49c587e58fab7410c30b9cff7ad5cc66b547d88cb93200,12,12,12,12,12,12,12,12,12,12,12


In [32]:
df_votes[df_votes['grantAddress'] == suspicious_add].groupby('projectId')['voter'].nunique()

projectId
0x3e7af275e7342f380ce8182e2f1e19ecbf32086974f53a386a8e1278016f6a50     7
0xa35e52635d40be090a49c587e58fab7410c30b9cff7ad5cc66b547d88cb93200    12
Name: voter, dtype: int64

In [34]:
df_votes[df_votes['grantAddress'] == suspicious_add].groupby('projectId')['amount'].describe()

Unnamed: 0_level_0,count,unique,top,freq
projectId,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
0x3e7af275e7342f380ce8182e2f1e19ecbf32086974f53a386a8e1278016f6a50,7,6,2000000000000000,2
0xa35e52635d40be090a49c587e58fab7410c30b9cff7ad5cc66b547d88cb93200,12,9,5000000000000000000,3


The average donation is $1 in both cases: 7 votes for $7 and 12 votes for $12.

This project is very suspicious and should probably be squelched.
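
The `describe()` output above summarizes the raw token `amount`, so the average in dollars has to come from the `amountUSD` column. A minimal sketch of that computation on made-up values (the column names follow `df_votes`; the numbers are illustrative only):

```python
import pandas as pd

# Toy votes table; values are illustrative, not taken from the round.
votes = pd.DataFrame({
    "projectId": ["p1", "p1", "p1", "p2", "p2"],
    "amountUSD": [1.0, 0.9, 1.1, 20.0, 30.0],
})

# Vote count, total and mean donation in USD for each project.
summary = votes.groupby("projectId")["amountUSD"].agg(
    n_votes="count", total_usd="sum", mean_usd="mean"
)
print(summary)
```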

## Computing legos booleans

In [41]:
from sblegos.TransactionAnalyser import TransactionAnalyser as txa
tx_analyser = txa(df_tx, df_address=pd.DataFrame(np.intersect1d(df_tx.EOA.unique(), array_unique_address)))

In [42]:
df_matching_address = pd.DataFrame(df_tx.EOA.unique(), columns=["address"])
df_matching_address.head(2)

Unnamed: 0,address
0,0x0010e3bce7e7d5890849fa2bb2681174f4352bc4
1,0x00173c4cb23e6d876fcb036ba954a2f9cfcafa19


Compute the boolean flags for each address.

In [43]:
df_matching_address['seed_same_naive'] = df_matching_address.loc[:, 'address'].apply(lambda x : tx_analyser.has_same_seed_naive(x))
df_matching_address['seed_same'] = df_matching_address.loc[:, 'address'].apply(lambda x : tx_analyser.has_same_seed(x))
df_matching_address['seed_suspicious'] = df_matching_address.loc[:, 'seed_same_naive'].ne(df_matching_address.loc[:, 'seed_same'])
df_matching_address['less_5_tx'] = df_matching_address.loc[:, 'address'].apply(lambda x : tx_analyser.has_less_than_n_transactions(x, 5))
df_matching_address['less_10_tx'] = df_matching_address.loc[:, 'address'].apply(lambda x : tx_analyser.has_less_than_n_transactions(x, 10))
df_matching_address['interacted_other_ctbt'] = df_matching_address.loc[:, 'address'].apply(lambda x : tx_analyser.has_interacted_with_other_contributor(x))

In [131]:
print(f'Number of voters: {len(df_matching_address)}')

Number of voters: 371


In [27]:
df_matching_address.sum()

address                  0x0010e3bce7e7d5890849fa2bb2681174f4352bc40x00...
interacted_other_ctbt                                                   69
seed_same_naive                                                        159
seed_same                                                              162
seed_suspicious                                                          3
less_5_tx                                                                7
less_10_tx                                                              25
dtype: object

### Investigating the seed_suspicious boolean

In [44]:
df_matching_address[df_matching_address['seed_suspicious'] == True]

Unnamed: 0,address,seed_same_naive,seed_same,seed_suspicious,less_5_tx,less_10_tx,interacted_other_ctbt
152,0x61ffe691821291d02e9ba5d33098adcee71a3a17,False,True,True,False,False,False
289,0xc28064b875ae25f9a2ca28c08f116a5c26229f69,False,True,True,False,False,False
331,0xe51200a4d161935fc311ed8a0401feb1abf20e3a,False,True,True,False,False,False


In [288]:
df_application2

Unnamed: 0,id,projectNumber,roundId,status,amountUSD,votes,uniqueContributors,createdAtBlock,statusUpdatedAtBlock,metadata.signature,...,metadata.application.project.credentials.github.credentialSubject.hash,metadata.application.project.credentials.github.credentialSubject.@context,metadata.application.project.credentials.github.issuer,metadata.application.project.credentials.github.issuanceDate,metadata.application.project.credentials.github.proof.type,metadata.application.project.credentials.github.proof.proofPurpose,metadata.application.project.credentials.github.proof.verificationMethod,metadata.application.project.credentials.github.proof.created,metadata.application.project.credentials.github.proof.jws,metadata.application.project.credentials.github.expirationDate
0,0xf0b0e62028f4a530344930f60db7a79d51b2810505f6...,350.0,0x87306d9cc4fd16b702470e0ebee3c05162ff238b,APPROVED,0.000000,0,0,17025221,17025224,0x952db3336fb9b536bd94de7fe42d9aca89bf1183d3df...,...,,,,,,,,,,
1,0x6aeb3ddab061203d32594b03901bf458679a66fbeab9...,355.0,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED,375.978974,65,62,17031319,17170541,0x4c3db176694bdfc0f9e4a04738205d562ef31a026699...,...,v0.0.0:f/yZ93nizxeRUzXoC4gixi1RdxQbOBduuVbmj+8...,"[{'hash': 'https://schema.org/Text', 'provider...",did:key:z6MkghvGHLobLEdj1bgRLhS4LPGJAvbMA1tn2z...,2023-04-12T09:48:40.402Z,Ed25519Signature2018,assertionMethod,did:key:z6MkghvGHLobLEdj1bgRLhS4LPGJAvbMA1tn2z...,2023-04-12T09:48:40.402Z,eyJhbGciOiJFZERTQSIsImNyaXQiOlsiYjY0Il0sImI2NC...,2023-07-11T09:48:40.402Z
2,0xdeffc2d190ddec88d06ba86e1eb09abe5a9ccb49ab1d...,174.0,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED,207.890062,28,25,17033387,17170541,0xc9f5e29fcb4db0b0c9a5909f68c162ea5d16a275f705...,...,v0.0.0:8/ii1lQxgy9/Dlk8MqHNNzoGhu6HeAvgA8HmTZs...,"[{'hash': 'https://schema.org/Text', 'provider...",did:key:z6MkghvGHLobLEdj1bgRLhS4LPGJAvbMA1tn2z...,2023-01-13T08:12:23.363Z,Ed25519Signature2018,assertionMethod,did:key:z6MkghvGHLobLEdj1bgRLhS4LPGJAvbMA1tn2z...,2023-01-13T08:12:23.363Z,eyJhbGciOiJFZERTQSIsImNyaXQiOlsiYjY0Il0sImI2NC...,2023-04-13T08:12:23.363Z
3,0x2f4109ec6cfc94018aba83b965d4debe69381b9e86c9...,291.0,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,REJECTED,0.000000,0,0,17034127,17170541,0x0d5b170ceeb72eaec00a595f8270a676937acbeb67b9...,...,v0.0.0:XA1gKtiO/NRInNlk96jyNIX4pvTuIvvMYQyD8Hq...,"[{'hash': 'https://schema.org/Text', 'provider...",did:key:z6MkghvGHLobLEdj1bgRLhS4LPGJAvbMA1tn2z...,2023-04-12T19:34:27.203Z,Ed25519Signature2018,assertionMethod,did:key:z6MkghvGHLobLEdj1bgRLhS4LPGJAvbMA1tn2z...,2023-04-12T19:34:27.203Z,eyJhbGciOiJFZERTQSIsImNyaXQiOlsiYjY0Il0sImI2NC...,2023-07-11T19:34:27.203Z
4,0xfea4b1dccc14b6b2d06d08ec300222497194f0f95437...,307.0,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED,3469.969739,115,110,17034383,17170541,0x32a6db08937189778a4fee954a738ad534598a1fc113...,...,v0.0.0:mvFg9cmBxIIsD0Dk4GkiMNXpJpV/1IdZF/porl/...,"[{'hash': 'https://schema.org/Text', 'provider...",did:key:z6MkghvGHLobLEdj1bgRLhS4LPGJAvbMA1tn2z...,2023-04-12T20:30:12.252Z,Ed25519Signature2018,assertionMethod,did:key:z6MkghvGHLobLEdj1bgRLhS4LPGJAvbMA1tn2z...,2023-04-12T20:30:12.252Z,eyJhbGciOiJFZERTQSIsImNyaXQiOlsiYjY0Il0sImI2NC...,2023-07-11T20:30:12.252Z
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
998,0x9335f70ec86665583005d31960562645d65b6420dc37...,840.0,0x64aa545c9c63944f8e765d9a65eda3cbbdc6e620,APPROVED,1.319853,1,1,17129225,17130975,0xb1dae0931988487ed1b36861832bde3931541d656885...,...,,,,,,,,,,
999,0x5d6aa906f7e3e33ecab4e99464a7b5d0171013794310...,746.0,0x64aa545c9c63944f8e765d9a65eda3cbbdc6e620,APPROVED,18.991042,1,1,17129850,17130975,0xd885ef4026295437c32c0ad7ea5089bf59d072cbc226...,...,,,,,,,,,,
1000,0xd3bb637c6c9d44aab8806a28f8b2b7d8a1aadc0a42bb...,743.0,0x64aa545c9c63944f8e765d9a65eda3cbbdc6e620,APPROVED,50.761380,7,6,17130240,17130975,0xa9af0c56b1249b841b8cc8454b9609cdc27f1e57e4fc...,...,,,,,,,,,,
1001,0x489bddc9bab6a3a7e3502e05e14606f66b13b11cd8c2...,723.0,0x64aa545c9c63944f8e765d9a65eda3cbbdc6e620,APPROVED,0.000000,0,0,17130382,17130975,0xc14ee9027c60eb759929126b898453c91b1fa6b0c96c...,...,,,,,,,,,,


In [294]:
projects_voted = df_votes[df_votes['voter'] == '0xc28064b875ae25f9a2ca28c08f116a5c26229f69']
print(f'Number of votes {projects_voted.shape[0]} Number of projects voted: {projects_voted.grantAddress.nunique()}')
# Merge the project the user voted for and the projects 
projects_voted.merge(df_application, left_on='grantAddress', right_on='metadata.application.recipient', how='left').drop_duplicates(subset='grantAddress').loc[:, ['grantAddress', 'metadata.application.project.title', 'status', 'metadata.application.round']].reset_index(drop=True)

Number of votes 16 Number of projects voted: 16


Unnamed: 0,grantAddress,metadata.application.project.title,status,metadata.application.round
0,0x10666d9c6295e838d3b8b84ffcc97d62ef7e6120,Inverter Network - Fund and build in web 3 wit...,APPROVED,0x12bb5bbbfe596dbc489d209299b8302c3300fa40
1,0x11ee133a1408fe2d7c62296d7eb33f234b774503,dm3 protocol - the interoperability initiative,REJECTED,0x12bb5bbbfe596dbc489d209299b8302c3300fa40
2,0x7d658841f8ba93299970f6e765c2ce205f1e70dd,Loanshark,APPROVED,0x12bb5bbbfe596dbc489d209299b8302c3300fa40
3,0xb67fd6f4b908d702c2cf0e2b9d30d52d4ea5b2bc,ECHO,APPROVED,0x12bb5bbbfe596dbc489d209299b8302c3300fa40
4,0x713bc00d1df5c452f172c317d39eff71b771c163,Kakarot zkEVM,APPROVED,0xdf22a2c8f6ba9376ff17ee13e6154b784ee92094
5,0xdecf6615152ac768bfb688c4ef882e35debe08ac,Rhino Review - Ethereum Staking Journal,APPROVED,0xdf22a2c8f6ba9376ff17ee13e6154b784ee92094
6,0x187089b65520d2208ab93fb471c4970c29eaf929,Ape Framework,APPROVED,0xdf22a2c8f6ba9376ff17ee13e6154b784ee92094
7,0xb352bb4e2a4f27683435f153a259f1b207218b1b,eth.limo,REJECTED,0x12bb5bbbfe596dbc489d209299b8302c3300fa40
8,0xb7081fd06e7039d198d10a8b72b824e60c1b1e16,Otterscan,APPROVED,0xdf22a2c8f6ba9376ff17ee13e6154b784ee92094
9,0x52277e5cf8df8ef51d9bd37d8d7553a72d378bef,Trustless zkMafia,APPROVED,0x274554eb289004e15a7679123901b7f070dda0fa


Some of the projects this address donated to were rejected, which suggests that the address may be a sybil and may have contributed to other fraudulent projects.

- Pulsar is not very active and is mostly forked code
- Fusion is not very active on GitHub but has a lot of activity on Twitter
- Share has a suspicious GitHub with no activity, and its Twitter account does not exist: suspicious

The other projects look fine.

In [127]:
projects_voted = df_votes[df_votes['voter'] == '0x61ffe691821291d02e9ba5d33098adcee71a3a17']
print(f'Number of votes {projects_voted.shape[0]} Number of projects voted: {projects_voted.grantAddress.nunique()}')
# Merge the project the user voted for and the projects 
projects_voted.merge(df_application, left_on='grantAddress', right_on='metadata.application.recipient', how='left').drop_duplicates(subset='grantAddress').loc[:, ['grantAddress', 'metadata.application.project.title']].reset_index(drop=True)

Number of votes 1 Number of projects voted: 1


Unnamed: 0,grantAddress,metadata.application.project.title
0,0x4cd6b4503d02973d78cff59c9c56f5f378688274,Mechanism Institute: Democratizing Cryptoecono...


In [283]:
projects_voted = df_votes[df_votes['voter'] == '0xe51200a4d161935fc311ed8a0401feb1abf20e3a']
print(f'Number of votes {projects_voted.shape[0]} Number of projects voted: {projects_voted.grantAddress.nunique()}')
# Merge the project the user voted for and the projects 
projects_voted.merge(df_application2, left_on='grantAddress', right_on='metadata.application.recipient', how='left').drop_duplicates(subset='grantAddress').loc[:, ['grantAddress', 'metadata.application.project.title']].reset_index(drop=True)

Number of votes 2 Number of projects voted: 2


Unnamed: 0,grantAddress,metadata.application.project.title
0,0x2f6c8f867df4e49e4fb24741c414f315e266c21b,Ethereum Lima
1,0xba97f51b1b097ce1ea43cf941cd62dfbcd70cae3,ETH Kipu


The seed_suspicious boolean is not relevant for this round, so we will not use it in the analysis.

### Computing the new DEX interaction score
This score was investigated in another notebook.

In [133]:
def get_interacted_address(from_address, to_address, address):
    if from_address == address:
        return to_address
    else:
        return from_address

def count_interaction_with_any(tx_analyser, address, array_address):
    """
    Return the number of transactions of address whose counterparty is in array_address

    Parameters
    ----------
    tx_analyser : TransactionAnalyser
        The analyser holding the transactions grouped by EOA
    address : str
        The address to check
    array_address : array-like of str
        The addresses to look for among the counterparties

    Returns
    -------
    count_interaction_with_any : int
        The number of interactions with the addresses in the array_address
    """
    tx_analyser.set_group_by_sorted_EOA()

    df = tx_analyser.gb_EOA_sorted.get_group(address)
    address_interacted = df.apply(lambda x: get_interacted_address(x['from_address'], x['to_address'], address), axis=1)
    tx_boolean_interacted = address_interacted.isin(array_address)
    return tx_boolean_interacted.sum()

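def has_interacted_with_any(tx_analyser, address, array_address):
    """
    Return a boolean whether the address has interacted with any address in the array_address

    Parameters
    ----------
    tx_analyser : TransactionAnalyser
        The analyser holding the transactions grouped by EOA
    address : str
        The address to check
    array_address : array-like of str
        The addresses to look for among the counterparties

    Returns
    -------
    has_interacted_with_any : bool
        True if the address has interacted with one or more of the addresses in the array_address
    """
    # Use a distinct local name: assigning to `count_interaction_with_any` here
    # would shadow the function and raise UnboundLocalError on the call.
    n_interactions = count_interaction_with_any(tx_analyser, address, array_address)
    return n_interactions > 0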
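
The helpers above resolve the counterparty with a row-wise `apply`, once per address. On larger rounds a vectorized variant may be faster. The sketch below assumes a transaction table with `EOA`, `from_address` and `to_address` columns, as in `df_tx`; the function `count_interactions` and the toy data are hypothetical and do not use `TransactionAnalyser`.

```python
import numpy as np
import pandas as pd

# Toy transaction table; column names follow df_tx, values are made up.
tx = pd.DataFrame({
    "EOA": ["0xa", "0xa", "0xa", "0xb"],
    "from_address": ["0xa", "0xpool1", "0xa", "0xb"],
    "to_address": ["0xpool1", "0xa", "0xother", "0xpool2"],
})
watchlist = ["0xpool1", "0xpool2"]

def count_interactions(tx: pd.DataFrame, watchlist) -> pd.Series:
    """Count, per EOA, the transactions whose counterparty is in the watchlist."""
    # The counterparty is the side of the transaction that is not the EOA.
    counterparty = np.where(
        tx["from_address"] == tx["EOA"], tx["to_address"], tx["from_address"]
    )
    hit = pd.Series(counterparty, index=tx.index).isin(watchlist)
    return hit.groupby(tx["EOA"]).sum()

counts = count_interactions(tx, watchlist)
print(counts)
```

The result is a Series indexed by EOA, so it can be mapped onto an address column in a single step instead of calling the helper for every address.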

In [135]:
label_query = '''
SELECT ADDRESS, CREATOR, LABEL_TYPE, ADDRESS_NAME, PROJECT_NAME
FROM crosschain.core.address_labels 
WHERE BLOCKCHAIN='ethereum'
AND LABEL_SUBTYPE = 'pool' 
;'''
df_label = flipside_api.execute_query(label_query)

In [136]:
# extract all the pool addresses
array_pool_address = df_label['address'].unique()

In [138]:
tx_analyser.set_group_by_sorted_EOA()

In [159]:
# Compute the number of interactions with any of the pools for each address
df_matching_address['count_interaction_with_pool'] = df_matching_address['address'].apply(lambda x: count_interaction_with_any(tx_analyser, x, array_pool_address))

In [160]:
(df_matching_address['count_interaction_with_pool'] > 0).sum() / len(df_matching_address)

0.40431266846361186

In [146]:
label_query = '''
SELECT DISTINCT(LABEL_SUBTYPE)
FROM crosschain.core.address_labels 
WHERE BLOCKCHAIN='ethereum'
;'''
df_distinct_labels = flipside_api.execute_query(label_query)

In [149]:
df_distinct_labels.label_subtype.unique()

array(['strategy', 'chadmin', 'router', 'multisig', 'airdrop_contract',
       'nf_token_contract', 'token_distribution', 'nf_position_manager',
       'donation_address', 'mining_pool', 'deposit_wallet',
       'staking_contract', 'pool', 'token_contract', 'bridge',
       'mint_burn', 'mint_contract', 'contract_deployer', 'marketplace',
       'foundation', 'vault', 'voting', 'cold_wallet', 'reserve',
       'swap_router', 'hot_wallet', 'toxic', 'fee_wallet', 'governance',
       'general_contract', 'swap_contract', 'distributor_cex', 'escrow',
       'rewards', 'dao', 'treasury', 'token_sale', 'oracle',
       'aggregator_contract'], dtype=object)

From these labels, let's flag any address that has interacted with a toxic wallet.

In [153]:
label_query = '''
SELECT ADDRESS, CREATOR, LABEL_TYPE, ADDRESS_NAME, PROJECT_NAME
FROM crosschain.core.address_labels 
WHERE BLOCKCHAIN='ethereum'
AND LABEL_SUBTYPE = 'toxic'
;'''
df_toxic = flipside_api.execute_query(label_query)

In [155]:
df_toxic.shape

(4857, 6)

In [158]:
# Compute the number of interactions with any of the toxic addresses for each address
df_matching_address['count_interaction_with_toxic'] = df_matching_address['address'].apply(lambda x: count_interaction_with_any(tx_analyser, x, df_toxic['address'].unique()))

In [192]:
print(f'Percentage of addresses that have interacted with a toxic address: {int((df_matching_address["count_interaction_with_toxic"] > 0).sum() / len(df_matching_address) *100)}%')

Percentage of addresses that have interacted with a toxic address: 3%


In [151]:
tag_query = '''
SELECT DISTINCT(TAG_TYPE)
FROM crosschain.core.address_tags 
WHERE BLOCKCHAIN='ethereum'
;'''
df_distinct_tags = flipside_api.execute_query(tag_query)

In [152]:
df_distinct_tags.tag_type.values

array(['contract', 'Balancer Delegates', 'wallet', 'activity', 'cex',
       'chainlink oracle', 'Aave Delegates', 'NFT', 'nft', 'dex'],
      dtype=object)

I found that the tag_name "airdrop master" could be interesting.

In [183]:
api_key2 = os.environ['FLIPSIDE_API_KEY2']
flipside_api2 = FlipsideApi(api_key2, max_address=400)

In [184]:
query_airdrop_master = '''
SELECT BLOCKCHAIN, CREATOR, ADDRESS, TAG_NAME
FROM crosschain.core.address_tags 
WHERE BLOCKCHAIN='ethereum'
AND TAG_NAME = 'airdrop master'
;
'''
df_airdrop_master = flipside_api2.execute_query(query_airdrop_master)

In [193]:
# Compute the number of interactions with any of the airdrop master addresses for each address
df_matching_address['count_interaction_with_airdrop_m'] = df_matching_address['address'].apply(lambda x: count_interaction_with_any(tx_analyser, x, df_airdrop_master['address'].unique()))
print(f'Percentage of addresses that interacted with airdrop master: {int((df_matching_address["count_interaction_with_airdrop_m"] > 0).sum() / len(df_matching_address) * 100)}%')

Percentage of addresses that interacted with airdrop master: 28%


In [194]:
# Boolean: whether the address itself is tagged as an airdrop master
df_matching_address['is_airdrop_master'] = df_matching_address['address'].isin(df_airdrop_master['address'])
print(f'Percentage of addresses that are airdrop master: {int((df_matching_address["is_airdrop_master"]).sum() / len(df_matching_address) * 100)}%')

Percentage of addresses that are airdrop master: 10%


In [195]:
sql_query_tornado = '''
SELECT DISTINCT PROJECT_NAME, ADDRESS
FROM crosschain.core.address_labels 
WHERE BLOCKCHAIN='ethereum'
AND PROJECT_NAME LIKE '%tornado%'
;
'''
df_tornado = flipside_api2.execute_query(sql_query_tornado)

In [196]:
# Count the number of interactions with tornado
df_matching_address['count_interaction_with_tornado'] = df_matching_address['address'].apply(lambda x: count_interaction_with_any(tx_analyser, x, df_tornado['address'].unique()))
print(f'Percentage of addresses that interacted with tornado: {int((df_matching_address["count_interaction_with_tornado"] > 0).sum() / len(df_matching_address) * 100)}%')

Percentage of addresses that interacted with tornado: 4%


In [200]:
# Count the number of times the address interacted with the disperse contract: '0xD152f549545093347A162Dce210e7293f1452150'
df_matching_address['count_interaction_with_disperse'] = df_matching_address['address'].apply(lambda x: count_interaction_with_any(tx_analyser, x, [str.lower('0xD152f549545093347A162Dce210e7293f1452150')]))
print(f'Percentage of addresses that interacted with disperse: {int((df_matching_address["count_interaction_with_disperse"] > 0).sum() / len(df_matching_address) * 100)}%')

Percentage of addresses that interacted with disperse: 2%


In [207]:
df_matching_address.describe(percentiles=[0.1, 0.25, 0.5, 0.75, 0.9])

Unnamed: 0,count_interaction_with_toxic,count_interaction_with_pool,count_interaction_with_airdrop_m,count_interaction_with_tornado,count_interaction_with_disperse
count,371.0,371.0,371.0,371.0,371.0
mean,0.053908,9.339623,2.975741,0.132075,0.12938
std,0.356112,37.67669,9.506013,0.76145,1.131597
min,0.0,0.0,0.0,0.0,0.0
10%,0.0,0.0,0.0,0.0,0.0
25%,0.0,0.0,0.0,0.0,0.0
50%,0.0,0.0,0.0,0.0,0.0
75%,0.0,3.0,1.0,0.0,0.0
90%,0.0,21.0,8.0,0.0,0.0
max,5.0,551.0,96.0,8.0,15.0


In [210]:
df_matching_address['has_interaction_toxic'] = df_matching_address['count_interaction_with_toxic'] > 0
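# Despite its name, this flag marks addresses with fewer than 6 pool interactions, not strictly zero
df_matching_address['has_no_pool_interaction'] = df_matching_address['count_interaction_with_pool'] < 6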
df_matching_address['has_interaction_airdrop_m'] = df_matching_address['count_interaction_with_airdrop_m'] > 0
df_matching_address['has_interaction_tornado'] = df_matching_address['count_interaction_with_tornado'] > 0
df_matching_address['has_interaction_disperse'] = df_matching_address['count_interaction_with_disperse'] > 0

In [209]:
boolean_features = ['has_interaction_toxic', 'has_no_pool_interaction', 'has_interaction_airdrop_m', 'has_interaction_tornado', 'has_interaction_disperse', 'is_airdrop_master', 'interacted_other_ctbt', 'less_10_tx', 'less_5_tx']

In [211]:
df_matching_address[boolean_features].sum() 

has_interaction_toxic         13
has_no_pool_interaction      303
has_interaction_airdrop_m    107
has_interaction_tornado       17
has_interaction_disperse      11
is_airdrop_master             38
interacted_other_ctbt         71
less_10_tx                    22
less_5_tx                      6
dtype: int64

In [216]:
(df_matching_address[boolean_features].sum(axis=1) > 2).sum()

51

In [298]:
df_matching_address['count_flags'] = df_matching_address[boolean_features].sum(axis=1)

In [299]:
df_matching_address['suspicious_1'] = df_matching_address['count_flags'] > 2

In [300]:
df_suspicious_1 = df_matching_address[df_matching_address['suspicious_1'] == True]

### Investigating the grants receiving the most votes from the flagged addresses

In [223]:
df_vote_sus1 = df_votes[df_votes['voter'].isin(df_suspicious_1['address'])]

In [295]:
print(f'Number of votes {df_vote_sus1.shape[0]} Number of projects voted: {df_vote_sus1.grantAddress.nunique()}')
# Merge the project the user voted for and the projects 
gr_sus = df_vote_sus1['grantAddress'].value_counts().reset_index().merge(df_application, left_on='grantAddress', right_on='metadata.application.recipient', how='left').drop_duplicates(subset='grantAddress').loc[:, ['grantAddress', 'metadata.application.project.title', 'count', 'roundId', 'status']].reset_index(drop=True)

Number of votes 542 Number of projects voted: 267


In [296]:
gr_sus.head(30)

Unnamed: 0,grantAddress,metadata.application.project.title,count,roundId,status
0,0x4adc8cc149a03f44386bee80bab36f9e8022b195,Unitap,13,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED
1,0x3a5bd1e37b099ae3386d13947b6a90d97675e5e3,Lenster,9,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
2,0x18aa467e40e1defb1956708830a343c1d01d3d7c,JediSwap,9,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
3,0x08a3c2a819e3de7aca384c798269b3ce1cd0e437,DefiLlama,8,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
4,0x4d9339dd97db55e3b9bcbe65de39ff9c04d1c2cd,Giveth,6,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
5,0xe126b3e5d052f1f575828f61feba4f4f2603652a,Revoke.cash - Helping you stay safe in web3,6,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
6,0x8110d1d04ac316fdcace8f24fd60c86b810ab15a,Commons Stack,6,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
7,0xda3dddae8119644f8be18c7afd16850b02a4c841,ETH Venezuela Community Projects,5,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED
8,0x0035cc37599241d007d0aba1fb931c5fa757f7a1,EVMcrispr,5,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
9,0x57ea12a3a8e441f5fe7b1f3af1121097b7d3b6a8,Umbra,5,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED


The Upchain Twitter handle does not exist, but the repo is very active. Suspicious?

- Seoul bound is either very new or suspicious
- Gravity DAO is an old project; not sure it is still active

The other projects are not suspicious.

### Look at the addresses with many flags

In [309]:
df_matching_address[boolean_features].sum()

has_interaction_toxic         13
has_no_pool_interaction      303
has_interaction_airdrop_m    107
has_interaction_tornado       17
has_interaction_disperse      11
is_airdrop_master             38
interacted_other_ctbt         71
less_10_tx                    22
less_5_tx                      6
dtype: int64

We are going to review the addresses that have at least one of the following flags:
- has_interaction_toxic
- has_interaction_tornado
- has_interaction_disperse
- has_interaction_airdrop_m
- is_airdrop_master

In [316]:
interaction_bool = ['has_interaction_toxic', 'has_interaction_airdrop_m', 'has_interaction_tornado', 'has_interaction_disperse', 'is_airdrop_master']

In [322]:
df_interact_sus = df_matching_address[df_matching_address[interaction_bool].sum(axis=1) > 0]
print(f'Number of addresses that interacted with a suspicious contract or address: {df_interact_sus.shape[0]}')

Number of addresses that interacted with a suspicious contract or address: 120


In [324]:
df_vote_interact_sus = df_votes[df_votes['voter'].isin(df_interact_sus['address'])]

In [371]:
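# The totals recorded below were printed from df_vote_sus1 and repeat the In [295] output; the frame for this subsection is df_vote_interact_sus
print(f'Number of votes {df_vote_interact_sus.shape[0]} Number of projects voted: {df_vote_interact_sus.grantAddress.nunique()}')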
# Merge the project the user voted for and the projects 
gr_sus = df_vote_interact_sus['grantAddress'].value_counts().reset_index().merge(df_application, left_on='grantAddress', right_on='metadata.application.recipient', how='left').drop_duplicates(subset='grantAddress').loc[:, ['grantAddress', 'metadata.application.project.title', 'count', 'roundId', 'status']].reset_index(drop=True)

Number of votes 542 Number of projects voted: 267


In [373]:
gr_sus.sort_values(by='count', ascending=False).head(30)

Unnamed: 0,grantAddress,metadata.application.project.title,count,roundId,status
0,0x3a5bd1e37b099ae3386d13947b6a90d97675e5e3,Lenster,25,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
1,0x18aa467e40e1defb1956708830a343c1d01d3d7c,JediSwap,22,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
2,0x08a3c2a819e3de7aca384c798269b3ce1cd0e437,DefiLlama,20,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
3,0xe126b3e5d052f1f575828f61feba4f4f2603652a,Revoke.cash - Helping you stay safe in web3,19,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
4,0x4adc8cc149a03f44386bee80bab36f9e8022b195,Unitap,16,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED
5,0x4b8810b079eb22ecf2d1f75e08e0abbd6fd87dbf,BrightID 🔆 Universal Proof of Uniqueness,14,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
6,0x6c5a2688c83c806150ca9dd0b2f10f16f8f1c33e,L2BEAT,13,0xdf22a2c8f6ba9376ff17ee13e6154b784ee92094,APPROVED
7,0x01d79bceaeaadfb8fd2f2f53005289cfcf483464,Lenstube,13,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
8,0x8110d1d04ac316fdcace8f24fd60c86b810ab15a,Commons Stack,13,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED
9,0x99b36fdbc582d113af36a21eba06bfeab7b9be12,Taho - Open Source and Community Owned Wallet,13,0x12bb5bbbfe596dbc489d209299b8302c3300fa40,APPROVED


No projects look suspicious in the top 30 by contribution
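
Raw counts tend to put the most popular projects at the top, whether or not they are targeted. A complementary view, sketched here on toy data, is the share of each project's votes that comes from flagged addresses. The column names follow `df_votes`; the list `flagged` and all values are hypothetical.

```python
import pandas as pd

# Toy votes: 0xg1 is popular, 0xg2 is small but mostly voted by flagged addresses.
votes = pd.DataFrame({
    "grantAddress": ["0xg1"] * 10 + ["0xg2"] * 4,
    "voter": [f"0xv{i}" for i in range(10)] + ["0xv0", "0xv1", "0xv2", "0xs1"],
})
flagged = ["0xv0", "0xv1", "0xv2"]

votes["flagged"] = votes["voter"].isin(flagged)
share = votes.groupby("grantAddress")["flagged"].agg(
    total_votes="count", flagged_votes="sum"
)
# Fraction of each project's votes cast by flagged addresses.
share["flagged_share"] = share["flagged_votes"] / share["total_votes"]
share = share.sort_values("flagged_share", ascending=False)
print(share)
```

In this toy case both projects receive 3 flagged votes, but the share is 0.3 for `0xg1` and 0.75 for `0xg2`, which is the kind of difference a raw count hides. A minimum-votes filter would probably be needed in practice so that projects with one or two votes do not dominate the ranking.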

### Investigating in more detail the type of donations made by the flagged addresses

In [379]:
# look at the average donation in USD
gb_donation_usd = df_vote_interact_sus.groupby('voter').describe()['amountUSD']

In [380]:
gb_donation_usd[gb_donation_usd['count'] == 1].sort_values(by='count', ascending=True)

Unnamed: 0_level_0,count,mean,std,min,25%,50%,75%,max
voter,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
0x014eecfa2e58d4975991f46026a2332561161912,1.0,18.666023,,18.666023,18.666023,18.666023,18.666023,18.666023
0x7b84c9d300551af54f1b415d5bf211f852725dd6,1.0,1.141818,,1.141818,1.141818,1.141818,1.141818,1.141818
0x7c99670ecc3a57cd1a8d1bdfc8e2ff5b4527b382,1.0,45.793905,,45.793905,45.793905,45.793905,45.793905,45.793905
0x80bfb857770a802f7ef375921ad5e83c76214a2d,1.0,100.117889,,100.117889,100.117889,100.117889,100.117889,100.117889
0x817c48bb59e866d5baefc9a90d04a0ce4e7d543b,1.0,55.24601,,55.24601,55.24601,55.24601,55.24601,55.24601
0x83a76d3d563a57b3ee7689232c3789e64ba71772,1.0,38.060456,,38.060456,38.060456,38.060456,38.060456,38.060456
0x85274906f537e0aa3823855bd6a0e374c771d19b,1.0,11.4046,,11.4046,11.4046,11.4046,11.4046,11.4046
0x87d922aec2a6a4c4b212a95a34a5a99245d2dd99,1.0,5.665876,,5.665876,5.665876,5.665876,5.665876,5.665876
0x9aa470470cac9ab75e2b8648c9c054560749b33d,1.0,9.995054,,9.995054,9.995054,9.995054,9.995054,9.995054
0xa34faa5b9b9e29c191e163a520edeea93a574ba3,1.0,9.526369,,9.526369,9.526369,9.526369,9.526369,9.526369


This is interesting. Let's look at which projects these single-donation addresses gave to.

Some of these donations are also very similar in amount, so let's dig into them.

In [401]:
# In particular, addresses whose single donation was close to $9.50 (between $9 and $10)
same_amount = gb_donation_usd[np.logical_and(gb_donation_usd['count'] == 1, np.floor(gb_donation_usd['mean']) == 9)].sort_values(by='count', ascending=True)
same_amount

voter,count,mean,std,min,25%,50%,75%,max
0x2e3e22f9f4d2de3dbdee9cbf350e22afb1c1d6b9,1.0,9.521698,,9.521698,9.521698,9.521698,9.521698,9.521698
0x70b224998167a500ef39ed91e35ad133d996e744,1.0,9.526369,,9.526369,9.526369,9.526369,9.526369,9.526369
0x71f84aa585e1eda8852dad8dff3a69d114366dba,1.0,9.760757,,9.760757,9.760757,9.760757,9.760757,9.760757
0x9aa470470cac9ab75e2b8648c9c054560749b33d,1.0,9.995054,,9.995054,9.995054,9.995054,9.995054,9.995054
0xa34faa5b9b9e29c191e163a520edeea93a574ba3,1.0,9.526369,,9.526369,9.526369,9.526369,9.526369,9.526369
0xaee2e6432db54a6f3770d57f1108cd326f2b224f,1.0,9.510891,,9.510891,9.510891,9.510891,9.510891,9.510891
0xbb49ccec023dc369844eec3190503853b1380606,1.0,9.510891,,9.510891,9.510891,9.510891,9.510891,9.510891
0xe4c140a67a3a95913a6e2fcfc6d6434d94c07641,1.0,9.443127,,9.443127,9.443127,9.443127,9.443127,9.443127
0xe8fb09228d1373f931007ca7894a08344b80901c,1.0,9.475181,,9.475181,9.475181,9.475181,9.475181,9.475181
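
The filter above hard-codes the $9 bucket. A minimal sketch of a more general check is given below, in plain Python so it runs without the notebook's data. It assumes a hypothetical `single_donations` mapping of voter to USD amount (the sample values are copied from the outputs above) and a hypothetical helper `crowded_amount_buckets`. The whole-dollar bucket width and the `min_size` threshold are illustrative assumptions, not tuned values.

```python
import math
from collections import defaultdict

def crowded_amount_buckets(single_donations, min_size=3):
    """Group voters by floor(amountUSD) and keep buckets with at least min_size voters."""
    buckets = defaultdict(list)
    for voter, amount_usd in single_donations.items():
        buckets[math.floor(amount_usd)].append(voter)
    return {bucket: sorted(voters) for bucket, voters in buckets.items() if len(voters) >= min_size}

# Sample values copied from the outputs above (not the full dataset).
single_donations = {
    "0x2e3e22f9f4d2de3dbdee9cbf350e22afb1c1d6b9": 9.521698,
    "0x70b224998167a500ef39ed91e35ad133d996e744": 9.526369,
    "0xa34faa5b9b9e29c191e163a520edeea93a574ba3": 9.526369,
    "0xaee2e6432db54a6f3770d57f1108cd326f2b224f": 9.510891,
    "0x014eecfa2e58d4975991f46026a2332561161912": 18.666023,
    "0x7b84c9d300551af54f1b415d5bf211f852725dd6": 1.141818,
}
crowded = crowded_amount_buckets(single_donations)
print(crowded)
```

On the real data the same idea maps onto a `groupby` over the floored `mean` column of `gb_donation_usd`.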


In [399]:
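def get_grant_title(address, df_application, df_votes):
    """Return the title of the project behind the first vote found for `address`."""
    # Match the grant address of that vote against the application's recipient address
    project_address = df_votes[df_votes['voter'] == address]['grantAddress'].values[0]
    return df_application[df_application['metadata.application.recipient'] == project_address]['metadata.application.project.title'].values[0]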

In [402]:
for address in same_amount.index:
    print(f'Address: {address} Project: {get_grant_title(address, df_application, df_votes)}')

Address: 0x2e3e22f9f4d2de3dbdee9cbf350e22afb1c1d6b9 Project: Unitap
Address: 0x70b224998167a500ef39ed91e35ad133d996e744 Project: BanklessDAO Projects
Address: 0x71f84aa585e1eda8852dad8dff3a69d114366dba Project: Web3TalentFair
Address: 0x9aa470470cac9ab75e2b8648c9c054560749b33d Project: Blocktrend（區塊勢）
Address: 0xa34faa5b9b9e29c191e163a520edeea93a574ba3 Project: BanklessDAO Projects
Address: 0xaee2e6432db54a6f3770d57f1108cd326f2b224f Project: BanklessDAO Projects
Address: 0xbb49ccec023dc369844eec3190503853b1380606 Project: BanklessDAO Projects
Address: 0xe4c140a67a3a95913a6e2fcfc6d6434d94c07641 Project: ReFi Japan
Address: 0xe8fb09228d1373f931007ca7894a08344b80901c Project: Unitap


This analysis points to suspicion around the projects "BanklessDAO Projects" and "Unitap", which received four and two of these nine near-identical donations respectively.

However, the project "BanklessDAO Projects" is well known.

My guess is that there is a lot of collusion among these addresses and that their donations should not be matched, especially since some of these addresses are already flagged. Let's verify it.

In [404]:
df_matching_address[df_matching_address['address'].isin(same_amount.index)]

Unnamed: 0,address,seed_same_naive,seed_same,seed_suspicious,less_5_tx,less_10_tx,interacted_other_ctbt,count_interaction_with_toxic,count_interaction_with_pool,count_interaction_with_airdrop_m,is_airdrop_master,count_interaction_with_tornado,count_interaction_with_disperse,has_interaction_toxic,has_no_pool_interaction,has_interaction_airdrop_m,has_interaction_tornado,has_interaction_disperse,suspicious_1,count_flags
57,0x2e3e22f9f4d2de3dbdee9cbf350e22afb1c1d6b9,False,False,False,False,False,True,0,5,12,True,0,0,False,True,True,False,False,True,4
173,0x70b224998167a500ef39ed91e35ad133d996e744,False,False,False,False,False,True,0,2,4,True,0,0,False,True,True,False,False,True,4
178,0x71f84aa585e1eda8852dad8dff3a69d114366dba,False,False,False,False,False,False,0,8,1,True,0,0,False,False,True,False,False,False,2
229,0x9aa470470cac9ab75e2b8648c9c054560749b33d,False,False,False,False,False,False,1,14,4,False,0,0,True,False,True,False,False,False,2
246,0xa34faa5b9b9e29c191e163a520edeea93a574ba3,False,False,False,False,False,True,0,3,42,False,1,0,False,True,True,True,False,True,4
259,0xaee2e6432db54a6f3770d57f1108cd326f2b224f,False,False,False,False,False,True,0,2,19,False,0,0,False,True,True,False,False,True,3
277,0xbb49ccec023dc369844eec3190503853b1380606,False,False,False,False,False,True,0,2,18,False,0,0,False,True,True,False,False,True,3
330,0xe4c140a67a3a95913a6e2fcfc6d6434d94c07641,False,False,False,False,False,False,0,18,2,False,0,0,False,False,True,False,False,False,1
338,0xe8fb09228d1373f931007ca7894a08344b80901c,False,False,False,False,False,True,0,5,20,False,0,0,False,True,True,False,False,True,3


In [407]:
same_amount_tx = df_tx[np.logical_or(df_tx['from_address'].isin(same_amount.index), df_tx['to_address'].isin(same_amount.index))]
txa_same_amount = txa(same_amount_tx, pd.DataFrame(same_amount.index, columns=['address']))

A True value means that the wallet has interacted with at least one other wallet in this small cluster of addresses.

In [409]:
same_amount.reset_index()['voter'].apply(lambda x : txa_same_amount.has_interacted_with_other_contributor(x)).values

array([ True,  True, False, False,  True,  True,  True, False,  True])
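
The implementation of `has_interacted_with_other_contributor` is not shown in this notebook. A minimal sketch of the idea, assuming transactions are reduced to `(from_address, to_address)` pairs and using a hypothetical function name and toy addresses, could look like this:

```python
def interacted_within_cluster(address, transactions, cluster):
    """Return True if `address` sent to or received from another address in `cluster`."""
    others = set(cluster) - {address}
    for sender, receiver in transactions:
        if sender == address and receiver in others:
            return True
        if receiver == address and sender in others:
            return True
    return False

# Hypothetical toy data: 0xaaa and 0xbbb transact with each other,
# 0xccc only touches an address outside the cluster.
cluster = ["0xaaa", "0xbbb", "0xccc"]
transactions = [("0xaaa", "0xbbb"), ("0xccc", "0xexchange")]
flags = [interacted_within_cluster(a, transactions, cluster) for a in cluster]
print(flags)  # [True, True, False]
```

A direct transfer is only one kind of link; shared funders or shared contract interactions would need separate checks.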

In [391]:
add_sus = gb_donation_usd[np.logical_and(gb_donation_usd['count'] == 1, gb_donation_usd['mean'] < 25)].index.values
len(add_sus)

28

In [392]:
df_1_donation = df_vote_interact_sus[df_vote_interact_sus['voter'].isin(add_sus)]

In [393]:
print(f'Number of votes {df_1_donation.shape[0]} Number of projects voted: {df_1_donation.grantAddress.nunique()}')
# Count votes per grant, then join the application data to get each project's title and status
gr_sus = (
    df_1_donation['grantAddress']
    .value_counts()
    .reset_index()
    .merge(df_application, left_on='grantAddress', right_on='metadata.application.recipient', how='left')
    .drop_duplicates(subset='grantAddress')
    .loc[:, ['grantAddress', 'metadata.application.project.title', 'count', 'roundId', 'status']]
    .reset_index(drop=True)
)
gr_sus.sort_values(by='count', ascending=False)

Number of votes 28 Number of projects voted: 14


Unnamed: 0,grantAddress,metadata.application.project.title,count,roundId,status
0,0x4adc8cc149a03f44386bee80bab36f9e8022b195,Unitap,8,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED
1,0xf26d1bb347a59f6c283c53156519cc1b1abaca51,BanklessDAO Projects,4,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED
2,0x4cd6b4503d02973d78cff59c9c56f5f378688274,Mechanism Institute: Democratizing Cryptoecono...,3,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED
3,0x36f322fc85b24ab13263cfe9217b28f8e2b38381,Blocktrend（區塊勢）,2,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED
4,0x2ec655969160ab38d308140f3817a0e3be0ca9f2,KultureCity F THE NEVERS,2,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED
5,0x300da191248a500b2174aed992d6697bf97f9139,DAZE,1,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED
6,0xd6823f807c45efdc56c9ae8db0226ca10af6e8ab,登链社区(Upchain),1,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED
7,0xf2f099d2d133e2f3d1ed8417ca71e914a3185897,Crypto Fundraising,1,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED
8,0x92ed8f6a9211f9eb0f16c83a052e75099b7bf4a5,Web3TalentFair,1,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED
9,0x3c8ad6f38a060a5b01af26fe1e1ed9c50c32902b,ReFi Japan,1,0xaa40e2e5c8df03d792a52b5458959c320f86ca18,APPROVED


Reviewing these projects manually, I did not find any of them suspicious.

In [315]:
df_matching_address[df_matching_address['has_interaction_toxic']]

Unnamed: 0,address,seed_same_naive,seed_same,seed_suspicious,less_5_tx,less_10_tx,interacted_other_ctbt,count_interaction_with_toxic,count_interaction_with_pool,count_interaction_with_airdrop_m,is_airdrop_master,count_interaction_with_tornado,count_interaction_with_disperse,has_interaction_toxic,has_no_pool_interaction,has_interaction_airdrop_m,has_interaction_tornado,has_interaction_disperse,suspicious_1,count_flags
10,0x0a251df99a88a20a93876205fb7f5faf2e85a481,False,False,False,False,False,False,1,106,10,True,1,0,True,False,True,True,False,True,4
14,0x0bb602f88bf886282ff69d4cec937cc2a7d9e19a,True,True,False,False,False,False,5,15,0,False,0,0,True,False,False,False,False,False,1
114,0x504c11bdbe6e29b46e23e9a15d9c8d2e2e795709,False,False,False,False,False,False,1,34,11,False,4,2,True,False,True,True,True,True,4
137,0x5a930b098ed8d58dd4590577af85a8e864a8f6fe,True,True,False,False,False,True,1,0,0,False,0,0,True,True,False,False,False,True,3
167,0x6d526f6b4c86fbdc8e359e6bef4cd6a42acea2d7,True,True,False,False,False,False,1,6,3,True,0,0,True,False,True,False,False,True,3
184,0x767a60f295aedd958932088f9cd6a4951d8739b6,False,False,False,False,False,False,1,78,8,True,0,0,True,False,True,False,False,True,3
200,0x85274906f537e0aa3823855bd6a0e374c771d19b,False,False,False,False,False,True,1,4,3,False,0,0,True,True,True,False,False,True,4
216,0x93907de38066d70109935732757b625d636e47b6,True,True,False,False,False,True,1,16,0,False,0,0,True,False,False,False,False,False,2
225,0x984b18b1823fef04a4ca7cf1e8a0ef5359fa522f,False,False,False,False,False,False,2,4,10,False,0,0,True,True,True,False,False,True,3
229,0x9aa470470cac9ab75e2b8648c9c054560749b33d,False,False,False,False,False,False,1,14,4,False,0,0,True,False,True,False,False,False,2


In [314]:
df_matching_address[np.logical_and(np.logical_and(df_matching_address['has_interaction_airdrop_m'], df_matching_address['is_airdrop_master']), df_matching_address['has_interaction_disperse'])]

Unnamed: 0,address,seed_same_naive,seed_same,seed_suspicious,less_5_tx,less_10_tx,interacted_other_ctbt,count_interaction_with_toxic,count_interaction_with_pool,count_interaction_with_airdrop_m,is_airdrop_master,count_interaction_with_tornado,count_interaction_with_disperse,has_interaction_toxic,has_no_pool_interaction,has_interaction_airdrop_m,has_interaction_tornado,has_interaction_disperse,suspicious_1,count_flags
122,0x52432473144056fe91aeae9240ad21e6b8213440,False,False,False,False,False,True,0,3,4,True,0,1,False,True,True,False,True,True,5
198,0x839395e20bbb182fa440d08f850e6c7a8f6f0780,False,False,False,False,False,False,0,245,96,True,8,1,False,False,True,True,True,True,4
263,0xb08f95dbc639621dbaf48a472ae8fce0f6f56a6e,True,True,False,False,False,True,0,56,40,True,0,3,False,False,True,False,True,True,4
363,0xfb40932271fc9db9dbf048e80697e2da4aa57250,False,False,False,False,False,True,1,76,68,True,0,1,True,False,True,False,True,True,5


Almost every airdrop master has interacted with another airdrop master, which is very suspicious: it means they are closely connected to each other. These addresses are probably controlled by the same person or group of people, and thus their donations should not be matched.
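
One way to make "connected to each other" concrete is to group the flagged addresses into connected components over their direct transfers. The sketch below is an illustration with a hypothetical function name and toy addresses, not the method used by sybil-scorer; it uses a small union-find over `(from_address, to_address)` pairs.

```python
def connected_groups(addresses, transactions):
    """Return groups of addresses linked by direct transfers, largest group first."""
    parent = {a: a for a in addresses}

    def find(a):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for sender, receiver in transactions:
        # Only link two addresses when both belong to the flagged set.
        if sender in parent and receiver in parent:
            parent[find(sender)] = find(receiver)

    groups = {}
    for a in addresses:
        groups.setdefault(find(a), []).append(a)
    return sorted((sorted(g) for g in groups.values()), key=len, reverse=True)

# Hypothetical toy data.
addresses = ["0xa1", "0xa2", "0xa3", "0xa4"]
transactions = [("0xa1", "0xa2"), ("0xa2", "0xa3"), ("0xa4", "0xoutside")]
groups = connected_groups(addresses, transactions)
print(groups)  # [['0xa1', '0xa2', '0xa3'], ['0xa4']]
```

A large component among airdrop masters would support treating those donations as coming from one entity, though manual review is still needed before excluding anyone from matching.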

In [304]:
df_suspicious_1.sort_values(by='count_flags', ascending=False).head(20)

Unnamed: 0,address,seed_same_naive,seed_same,seed_suspicious,less_5_tx,less_10_tx,interacted_other_ctbt,count_interaction_with_toxic,count_interaction_with_pool,count_interaction_with_airdrop_m,is_airdrop_master,count_interaction_with_tornado,count_interaction_with_disperse,has_interaction_toxic,has_no_pool_interaction,has_interaction_airdrop_m,has_interaction_tornado,has_interaction_disperse,suspicious_1,count_flags
363,0xfb40932271fc9db9dbf048e80697e2da4aa57250,False,False,False,False,False,True,1,76,68,True,0,1,True,False,True,False,True,True,5
161,0x693edbcf118ec982f5a8101498b6c789470b0b89,False,False,False,False,False,True,0,1,3,True,2,0,False,True,True,True,False,True,5
122,0x52432473144056fe91aeae9240ad21e6b8213440,False,False,False,False,False,True,0,3,4,True,0,1,False,True,True,False,True,True,5
114,0x504c11bdbe6e29b46e23e9a15d9c8d2e2e795709,False,False,False,False,False,False,1,34,11,False,4,2,True,False,True,True,True,True,4
309,0xd11b4b2a4b9b31bf4e8f879e2036ddccab6fcb6f,False,False,False,False,True,True,0,0,1,False,0,0,False,True,True,False,False,True,4
283,0xbe278527d392ebb1cbe4818b95d984ff0a773d73,False,False,False,False,False,True,0,0,1,False,3,0,False,True,True,True,False,True,4
282,0xbe0284a4a260df4c58eec491d41995ceee3fac58,True,True,False,False,False,False,0,5,4,True,2,0,False,True,True,True,False,True,4
263,0xb08f95dbc639621dbaf48a472ae8fce0f6f56a6e,True,True,False,False,False,True,0,56,40,True,0,3,False,False,True,False,True,True,4
246,0xa34faa5b9b9e29c191e163a520edeea93a574ba3,False,False,False,False,False,True,0,3,42,False,1,0,False,True,True,True,False,True,4
200,0x85274906f537e0aa3823855bd6a0e374c771d19b,False,False,False,False,False,True,1,4,3,False,0,0,True,True,True,False,False,True,4


In [None]:
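from scipy.cluster.hierarchy import dendrogram, linkage
# NOTE: linkage expects a 2-D numeric array; df_tx also contains address strings,
# so a numeric feature matrix has to be built before this can run.
X1 = df_tx
Z1 = linkage(X1, method='single', metric='euclidean')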

NameError: name 'X1' is not defined

In [None]:
from sklearn.cluster import AgglomerativeClustering

Z1 = AgglomerativeClustering(n_clusters=2, linkage='ward')

Z1.fit_predict(X1)

print(Z1.labels_)

NameError: name 'X1' is not defined

In [None]:
from scipy.cluster.hierarchy import dendrogram, linkage

Z1 = linkage(X1, method='single', metric='euclidean')

ModuleNotFoundError: No module named 'scipy'

In [None]:
import matplotlib.pyplot as plt  # plt was used without being imported
plt.figure(figsize=(15, 10))
plt.subplot(2,2,1), dendrogram(Z1), plt.title('Single')

Ideas:

Create a new function that whitelists funding wallet addresses that are not suspicious. For example, many people are funded from Binance or Coinbase wallets.
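
A minimal sketch of the whitelist idea follows, assuming a hypothetical mapping from each wallet to the address that first funded it. The whitelist entries are placeholders, not real exchange addresses, and the function name is invented for illustration.

```python
# Hypothetical placeholders: real exchange hot-wallet addresses must be sourced and verified separately.
WHITELISTED_FUNDERS = {"0xexchange_hot_wallet_a", "0xexchange_hot_wallet_b"}

def shared_funder_groups(first_funder, whitelist=WHITELISTED_FUNDERS):
    """Group wallets by the address that first funded them, skipping whitelisted funders."""
    groups = {}
    for wallet, funder in first_funder.items():
        if funder in whitelist:
            continue  # a shared exchange wallet says little about common ownership
        groups.setdefault(funder, []).append(wallet)
    # Only funders that seeded more than one wallet are interesting.
    return {funder: sorted(wallets) for funder, wallets in groups.items() if len(wallets) > 1}

# Hypothetical toy data.
first_funder = {
    "0xw1": "0xexchange_hot_wallet_a",
    "0xw2": "0xexchange_hot_wallet_a",
    "0xw3": "0xprivate_seed",
    "0xw4": "0xprivate_seed",
    "0xw5": "0xother",
}
suspicious_seeds = shared_funder_groups(first_funder)
print(suspicious_seeds)
```

Here the two wallets funded by the exchange placeholder are ignored, while the two funded by the same private address are kept for review.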

Another idea: apply funding-time detection to the wallets. If addresses were created on the same day and donated to the same grants, should they be considered suspicious?
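
As a rough sketch of this check, assume each wallet is described by the timestamp of its first transaction and the set of grants it donated to. The function name, data layout and `min_size` threshold are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

def same_day_same_grants(wallets, min_size=2):
    """Group wallets by (creation date, set of grants) and keep groups of at least min_size."""
    groups = defaultdict(list)
    for address, (first_seen, grants) in wallets.items():
        groups[(first_seen.date(), frozenset(grants))].append(address)
    return [sorted(members) for members in groups.values() if len(members) >= min_size]

# Hypothetical toy data: wallet -> (first transaction time, grants donated to).
wallets = {
    "0xw1": (datetime(2023, 4, 20, 9, 15), {"grant_a", "grant_b"}),
    "0xw2": (datetime(2023, 4, 20, 17, 40), {"grant_b", "grant_a"}),
    "0xw3": (datetime(2023, 4, 20, 11, 0), {"grant_c"}),
    "0xw4": (datetime(2022, 1, 5, 8, 30), {"grant_a", "grant_b"}),
}
matches = same_day_same_grants(wallets)
print(matches)
```

Exact-day matching is deliberately crude; a sliding time window would catch wallets created just before and after midnight.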

Is it useful to create a complex algorithm to detect suspicious funding times?

Donation time could also be inspected for each wallet.

Donation pattern? Do some wallets make the same number of donations, and to the same grants?
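
One possible way to compare donation patterns is the Jaccard similarity between the sets of grants two voters donated to. The sketch below assumes a hypothetical `grants_by_voter` mapping; the `threshold` and `min_grants` values are illustrative, not calibrated.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similar_donation_patterns(grants_by_voter, threshold=0.8, min_grants=2):
    """Return (voter_1, voter_2, score) for voter pairs with near-identical grant sets."""
    pairs = []
    for v1, v2 in combinations(sorted(grants_by_voter), 2):
        g1, g2 = grants_by_voter[v1], grants_by_voter[v2]
        if min(len(g1), len(g2)) < min_grants:
            continue  # single-grant donors overlap too easily to be informative
        score = jaccard(g1, g2)
        if score >= threshold:
            pairs.append((v1, v2, score))
    return pairs

# Hypothetical toy data: voter -> set of grants donated to.
grants_by_voter = {
    "0xv1": {"g1", "g2", "g3", "g4"},
    "0xv2": {"g1", "g2", "g3", "g4"},
    "0xv3": {"g1", "g5"},
    "0xv4": {"g2"},
}
pairs = similar_donation_patterns(grants_by_voter)
print(pairs)
```

Identical grant sets also imply the same number of donated grants, so this covers both questions above. The pairwise loop is quadratic in the number of voters, which is fine for a few hundred addresses but would need bucketing for a whole round.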

