This notebook imports an NFT similarity matrix and a dataframe listing the users who have purchased an NFT collection in the past. We then use this information to train our model in the following steps:<br>
<br>
1) Take the name of an NFT collection and use the cosine similarity matrix to get a list of collections that we deem "similar". <br><br>
2) Get the list of wallet addresses that have purchased the NFT collection in the past. <br><br>
3) Use the Alchemy NFT API endpoint to get the list of NFT collections that each user owns. If the user owns one of the recommended NFT collections, increase the corresponding similarity score in the similarity matrix by a large amount. If the user does not own a recommended NFT collection, decrease the similarity score by a small amount. We give a large bonus for correct recommendations and a small penalty for false recommendations because it is much more likely that a user does not own a recommended NFT than that they do.


In [2]:
# Import the necessary libraries
import numpy as np
import pandas as pd

In [3]:
# on_bad_lines='skip' replaces error_bad_lines=False, which was removed in pandas 2.0
Collection_Addresses = pd.read_csv('OpenseaData/OpenSeaSalesDataRightFormat_csv.csv', on_bad_lines='skip')

In [4]:
'''
Here we import a dataframe that holds a matrix of similarity scores between
different NFT collections. These similarity scores were calculated using cosine similarity
in another notebook, which is available in the GitHub repo.
'''
similarity_data_frame = pd.read_csv('OpenseaData/CosineSimilairtyDataFrame_csv.csv', on_bad_lines='skip')
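
The similarity scores loaded above were computed in a separate notebook that is not shown here. As a rough illustration only, cosine similarity between collections could be computed from a binary wallet-by-collection ownership matrix as sketched below. The ownership matrix, the function name `cosine_similarity_matrix`, and the toy data are assumptions made for this example, not the code used to produce the CSV.

```python
import numpy as np

def cosine_similarity_matrix(ownership):
    """Return the collection-by-collection cosine similarity matrix.

    ownership: 2-D array of shape (n_wallets, n_collections) where entry
    (w, c) is 1 if wallet w holds collection c and 0 otherwise (assumed input).
    """
    ownership = np.asarray(ownership, dtype=float)
    norms = np.linalg.norm(ownership, axis=0)  # one norm per collection column
    norms[norms == 0] = 1.0                    # avoid division by zero for empty columns
    normalized = ownership / norms
    return normalized.T @ normalized

# Hypothetical toy data: 3 wallets (rows) and 3 collections (columns).
toy_ownership = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]
toy_similarity = cosine_similarity_matrix(toy_ownership)
print(toy_similarity.round(5))
```

The result is symmetric with ones on the diagonal, which matches the shape of the matrix displayed later in this notebook.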

In [5]:
'''
Here we import a dataframe that maps wallet addresses to the collections they hold
'''
address_to_collections = pd.read_csv('OpenseaData/address_to_collections.csv', on_bad_lines='skip')

In [6]:
'''
This function uses the similarity matrix to return a list of recommended NFTs. The
recommended NFTs are the collections with the highest similarity scores to the input
collection. It inspects the top output_size scores and skips the input collection itself,
so it returns at most output_size names (output_size - 1 when the input collection is
among the top scores, which is the usual case because its self-similarity is 1.0).
'''
def getRecommendations(input_collection_name, output_size):
    # Pair each row position with its similarity score to the input collection
    scores = list(enumerate(similarity_data_frame[input_collection_name]))
    # Highest scores first
    scores_sorted = sorted(scores, key=lambda x: x[1], reverse=True)

    output_array = []
    for i in range(0, output_size):
        # The first column of each row holds the collection name
        collection_name = similarity_data_frame.iloc[scores_sorted[i][0]].values[0]
        if collection_name != input_collection_name:
            output_array.append(collection_name)
    return output_array
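
Because the function above skips the input collection after taking the top scores, a request for 10 recommendations typically yields 9. If exactly `k` names are wanted, one possible variant drops the input collection first and then takes the `k` largest scores. This is a sketch on a hypothetical toy matrix indexed by collection name; `get_top_k` and the toy data are not part of the notebook.

```python
import pandas as pd

def get_top_k(similarity, collection_name, k):
    """Return the k collections most similar to collection_name.

    similarity: square DataFrame whose index and columns are collection names.
    """
    # Remove the self-similarity entry before ranking
    scores = similarity[collection_name].drop(labels=collection_name)
    return scores.nlargest(k).index.tolist()

# Hypothetical 3x3 similarity matrix
toy = pd.DataFrame(
    [[1.0, 0.8, 0.1],
     [0.8, 1.0, 0.4],
     [0.1, 0.4, 1.0]],
    index=["A", "B", "C"],
    columns=["A", "B", "C"],
)
print(get_top_k(toy, "A", 2))  # ['B', 'C']
```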

In [7]:
#Here we are testing the recommendation function using CryptoPunks
crypto_punk_output = getRecommendations("CryptoPunks", 10)
crypto_punk_output

['FLUF World',
 'Cryptoadz',
 'CyberBrokers',
 'Aurory',
 'Hashmasks',
 'Animetas',
 'Acrocalypse',
 '0N1 Force',
 'Emblem Vault']

In [8]:
'''
Here we have an example of a user
'''
User1 = ['NBA Top Shot', 'The Sandbox', 'Imaginary Ones', 'Pixel Vault Founders DAO']

In [9]:
User1_0_rec = getRecommendations(User1[0], 40)
User1_0_rec

['Otherdeed',
 'Doodles',
 'Meebits',
 'Sorare',
 'DeadFellaz',
 'ALIENFRENS',
 'Axie Infinity',
 'Art Blocks',
 'The Sandbox',
 'Farmers World',
 'ZED RUN',
 'NFT Worlds',
 'Cool Pets',
 'Creature World NFT',
 'Ethereum Name Service',
 'MURI',
 'Bored Ape Yacht Club',
 'Mutant Ape Yacht Club',
 'VeeFriends Series 2',
 'Tubby Cats',
 'Panini America',
 'Azuki',
 'CloneX',
 'Moonbirds',
 'Crabada',
 'MFER',
 'projectPXN',
 'LOSTPOETS',
 '3Landers',
 'goblintown',
 'Adam Bomb Squad',
 'Gods Unchained Immutable',
 'NFL All Day',
 'Cool Cats',
 'Pudgy Penguins',
 'Okay Bears',
 'Lazy Lions',
 'Alien Worlds',
 'Mooncats']

In [10]:
'''
Here we have the learning rates. The missing_penalty is the amount by which we decrease the
similarity score if the test subject does not own a recommended NFT. The having_bonus is the
amount by which we increase the similarity score if the test subject does own the recommended NFT.
'''
missing_penalty = -0.000001
having_bonus = 0.001
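
One way to read these two constants (an interpretation added here, not an analysis from the original notebook): if a recommended collection is actually owned with probability `p`, the expected change to a pair's score per update is `p * having_bonus + (1 - p) * missing_penalty`. The sketch below computes the break-even hit rate at which that expectation is zero; `expected_change` and `break_even` are illustrative names.

```python
missing_penalty = -0.000001
having_bonus = 0.001

def expected_change(hit_rate):
    """Expected score change per update for a given ownership probability."""
    return hit_rate * having_bonus + (1 - hit_rate) * missing_penalty

# Solve hit_rate * bonus + (1 - hit_rate) * penalty = 0 for hit_rate
break_even = -missing_penalty / (having_bonus - missing_penalty)
print(f"{break_even:.6f}")  # 0.000999
```

With these values a pair's score drifts upward on average whenever the recommended collection is owned in more than roughly 0.1% of the updates.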

In [11]:
'''
Here we make a set of the NFTs the test subject owns. This way we can check whether a user owns a
certain NFT in O(1) time
'''
user1_nft_set = set(User1)
user1_nft_set

{'Imaginary Ones', 'NBA Top Shot', 'Pixel Vault Founders DAO', 'The Sandbox'}

In [12]:
similarity_data_frame.head()

Unnamed: 0,Collections,Axie Infinity,Bored Ape Yacht Club,CryptoPunks,Mutant Ape Yacht Club,Art Blocks,Otherdeed,NBA Top Shot,Azuki,CloneX,...,CryptoonGoonz,Deafbeef,Illuminati,Bastard Gan Punks V2,Fishy Fam,Potatoz,Mindblowon,Sipherian Surge,Wool Pouch,Los Muertos
0,Axie Infinity,1.0,0.57735,0.182574,0.57735,0.666667,0.730297,0.617213,0.547723,0.547723,...,0.0,0.0,0.0,0.0,0.0,0.0,0.182574,0.0,0.0,0.166667
1,Bored Ape Yacht Club,0.57735,1.0,0.158114,0.875,0.57735,0.632456,0.534522,0.474342,0.474342,...,0.0,0.0,0.0,0.0,0.0,0.0,0.158114,0.0,0.0,0.144338
2,CryptoPunks,0.182574,0.158114,1.0,0.158114,0.182574,0.2,0.169031,0.2,0.4,...,0.4,0.0,0.4,0.0,0.182574,0.2,0.2,0.365148,0.182574,0.182574
3,Mutant Ape Yacht Club,0.57735,0.875,0.158114,1.0,0.57735,0.632456,0.534522,0.474342,0.474342,...,0.0,0.0,0.0,0.0,0.0,0.0,0.158114,0.0,0.0,0.144338
4,Art Blocks,0.666667,0.57735,0.182574,0.57735,1.0,0.730297,0.617213,0.547723,0.547723,...,0.0,0.0,0.0,0.0,0.0,0.0,0.182574,0.0,0.0,0.166667


In [13]:
'''
Here we make a copy of the similarity matrix that we will use for the training.
We need it indexed by collection name so that scores can be looked up and updated by label.
'''
similarity_training_data_frame = similarity_data_frame.set_index('Collections')

In [14]:
similarity_training_data_frame.head()

Collections,Axie Infinity,Bored Ape Yacht Club,CryptoPunks,Mutant Ape Yacht Club,Art Blocks,Otherdeed,NBA Top Shot,Azuki,CloneX,Moonbirds,...,CryptoonGoonz,Deafbeef,Illuminati,Bastard Gan Punks V2,Fishy Fam,Potatoz,Mindblowon,Sipherian Surge,Wool Pouch,Los Muertos
Axie Infinity,1.0,0.57735,0.182574,0.57735,0.666667,0.730297,0.617213,0.547723,0.547723,0.547723,...,0.0,0.0,0.0,0.0,0.0,0.0,0.182574,0.0,0.0,0.166667
Bored Ape Yacht Club,0.57735,1.0,0.158114,0.875,0.57735,0.632456,0.534522,0.474342,0.474342,0.474342,...,0.0,0.0,0.0,0.0,0.0,0.0,0.158114,0.0,0.0,0.144338
CryptoPunks,0.182574,0.158114,1.0,0.158114,0.182574,0.2,0.169031,0.2,0.4,0.4,...,0.4,0.0,0.4,0.0,0.182574,0.2,0.2,0.365148,0.182574,0.182574
Mutant Ape Yacht Club,0.57735,0.875,0.158114,1.0,0.57735,0.632456,0.534522,0.474342,0.474342,0.474342,...,0.0,0.0,0.0,0.0,0.0,0.0,0.158114,0.0,0.0,0.144338
Art Blocks,0.666667,0.57735,0.182574,0.57735,1.0,0.730297,0.617213,0.547723,0.547723,0.547723,...,0.0,0.0,0.0,0.0,0.0,0.0,0.182574,0.0,0.0,0.166667


In [15]:
for rec in User1_0_rec:
    if rec in user1_nft_set:
        print("In user set")
        delta = having_bonus
    else:
        print("Not in user set")
        delta = missing_penalty
    # Update both symmetric entries; .loc avoids chained-indexing assignment
    similarity_training_data_frame.loc[User1[0], rec] += delta
    similarity_training_data_frame.loc[rec, User1[0]] += delta
        

Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
In user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set


We now have two similarity dataframes: one for the training and one for getting the recommendations.

In [16]:
def Unit_Train(User1, similarity_training_data_frame, User1_0_rec, user1_nft_set):
    # User1[0] is the collection the recommendations were generated for
    seed = User1[0]
    for rec in User1_0_rec:
        if rec in user1_nft_set:
            print("In user set")
            delta = having_bonus
        else:
            print("Not in user set")
            delta = missing_penalty
        # Update both symmetric entries; .loc avoids chained-indexing assignment
        similarity_training_data_frame.loc[seed, rec] += delta
        similarity_training_data_frame.loc[rec, seed] += delta
    return similarity_training_data_frame
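
The per-recommendation loop can also be written as two label-based assignments. The following is a minimal sketch on a hypothetical toy matrix, assuming the recommendation list has no duplicates and does not contain the seed collection; `unit_train_vectorized` is an illustrative name, not a function from this notebook.

```python
import numpy as np
import pandas as pd

def unit_train_vectorized(similarity, seed, recommendations, owned,
                          bonus=0.001, penalty=-0.000001):
    """Apply the bonus or penalty to every (seed, recommendation) pair in place."""
    # One delta per recommendation, in the same order as the list
    deltas = np.array([bonus if name in owned else penalty
                       for name in recommendations])
    similarity.loc[seed, recommendations] += deltas  # row of the seed
    similarity.loc[recommendations, seed] += deltas  # column of the seed
    return similarity

# Hypothetical 3x3 similarity matrix indexed by collection name
toy_matrix = pd.DataFrame(
    [[1.0, 0.8, 0.1],
     [0.8, 1.0, 0.4],
     [0.1, 0.4, 1.0]],
    index=["A", "B", "C"],
    columns=["A", "B", "C"],
)
unit_train_vectorized(toy_matrix, "A", ["B", "C"], owned={"B"})
print(toy_matrix)
```

Only the seed's row and column change; the entries between other collections (here B and C) are left as they were.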

In [17]:
test = similarity_training_data_frame.reset_index()
test.head()

Unnamed: 0,Collections,Axie Infinity,Bored Ape Yacht Club,CryptoPunks,Mutant Ape Yacht Club,Art Blocks,Otherdeed,NBA Top Shot,Azuki,CloneX,...,CryptoonGoonz,Deafbeef,Illuminati,Bastard Gan Punks V2,Fishy Fam,Potatoz,Mindblowon,Sipherian Surge,Wool Pouch,Los Muertos
0,Axie Infinity,1.0,0.57735,0.182574,0.57735,0.666667,0.730297,0.617212,0.547723,0.547723,...,0.0,0.0,0.0,0.0,0.0,0.0,0.182574,0.0,0.0,0.166667
1,Bored Ape Yacht Club,0.57735,1.0,0.158114,0.875,0.57735,0.632456,0.534521,0.474342,0.474342,...,0.0,0.0,0.0,0.0,0.0,0.0,0.158114,0.0,0.0,0.144338
2,CryptoPunks,0.182574,0.158114,1.0,0.158114,0.182574,0.2,0.169031,0.2,0.4,...,0.4,0.0,0.4,0.0,0.182574,0.2,0.2,0.365148,0.182574,0.182574
3,Mutant Ape Yacht Club,0.57735,0.875,0.158114,1.0,0.57735,0.632456,0.534521,0.474342,0.474342,...,0.0,0.0,0.0,0.0,0.0,0.0,0.158114,0.0,0.0,0.144338
4,Art Blocks,0.666667,0.57735,0.182574,0.57735,1.0,0.730297,0.617212,0.547723,0.547723,...,0.0,0.0,0.0,0.0,0.0,0.0,0.182574,0.0,0.0,0.166667


In [18]:
user2 = ['Jungle Freaks', 'Rug Radio - Genesis NFT', 'Blitmap', 'Bloot']
user2_set = set(user2)
user2_recs = getRecommendations(user2[0], 40)
similarity_training_data_frame = Unit_Train(user2, similarity_training_data_frame, user2_recs, user2_set)
#Update the similarity matrix used for the recommendations using our new improved similarity matrix
similarity_data_frame = similarity_training_data_frame.reset_index()

Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
In user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set
Not in user set


In [19]:
'''
This is the code for testing our model. We count how many of the recommended NFTs a user owns. 
'''
def Test_Metric(user_nfts_set, user_recommendations):
    correct_count = 0
    wrong_count = 0
    for i in range(0, len(user_recommendations)):
        if user_recommendations[i] in user_nfts_set:
            correct_count = correct_count + 1
        else:
            wrong_count = wrong_count + 1
    return (correct_count, wrong_count)
    

In [20]:
test_output = Test_Metric(user2_set, user2_recs)
print("Correct : " + str(test_output[0]))
print("Wrong : " + str(test_output[1]))

Correct : 1
Wrong : 38
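
The two counts can be folded into a single hit rate, the fraction of recommendations the user actually owns. The helper below is a small sketch (the name `hit_rate` is introduced here for illustration) applied to the counts printed above.

```python
def hit_rate(correct_count, wrong_count):
    """Fraction of recommendations that the user owns."""
    total = correct_count + wrong_count
    return correct_count / total if total else 0.0

print(f"{hit_rate(1, 38):.4f}")  # 0.0256
```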


In [21]:
# on_bad_lines='skip' replaces error_bad_lines=False, which was removed in pandas 2.0
AddressesToCollectionsFiltered = pd.read_csv('OpenseaData/AdressToCollectionsFiltered.csv', on_bad_lines='skip')

In [33]:
from sklearn.model_selection import train_test_split
train, test = train_test_split(AddressesToCollectionsFiltered, test_size=0.2)

In [34]:
train.head()

Unnamed: 0,winner_account.address,asset.collection.name
62,0xbce3bd3b206946abbe094903ae2b4244b52fb4e9,"{'ETH TOWN', 'MarbleCards', 'BlockchainCuties'}"
1,0x04801e48fc364d1c81b29f6d0cda654c614324ec,"{'MyCryptoHeroes', 'Ethermon', 'BlockchainCuti..."
79,0xe84694e5f139fc33c897767fb7a9aa40e9cf2ada,"{'PandaEarth', 'Ether Kingdoms', 'CryptoRacing'}"
6,0x104d7c320963e7914a570e9a511c5b1d545e4aba,"{'PandaEarth', 'CryptoAssault', 'Chibi Fighter..."
67,0xca11d10ceb098f597a0cab28117fc3465991a63c,"{'PandaEarth', 'CryptoAssault', 'Mythereum', '..."
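
The `asset.collection.name` column appears to hold each wallet's collections as the string form of a Python set. Assuming the stored strings are complete set literals (the `...` above being display truncation only), one way to parse them is `ast.literal_eval`, which avoids manual splitting on commas and quotes. The helper name `parse_collections` is hypothetical.

```python
import ast

def parse_collections(cell):
    """Parse a string such as "{'A', 'B'}" into a sorted list of names."""
    parsed = ast.literal_eval(cell)  # safely evaluates the set literal
    return sorted(parsed)            # sort for a stable, repeatable order

raw = "{'ETH TOWN', 'MarbleCards', 'BlockchainCuties'}"
print(parse_collections(raw))  # ['BlockchainCuties', 'ETH TOWN', 'MarbleCards']
```

Sorting is a design choice for reproducibility, since a set has no guaranteed iteration order.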


In [None]:
#Join the databases


In [98]:
def train_model(training_data):
    # Both dataframes live at module level and are reassigned below
    global similarity_training_data_frame, similarity_data_frame
    for index, row in training_data.iterrows():
        # The collections are stored as the string form of a set, e.g. "{'A', 'B'}":
        # strip the braces, split on ", ", then strip the quotes around each name
        collection_list = row['asset.collection.name'][1:-1].split(", ")
        collection_list2 = [word[1:-1] for word in collection_list]
        # Build the set after the names have been parsed so that it is not empty
        collection_set = set(collection_list2)
        print(type(collection_list2[0]))
        # Seed the recommendations with the second collection; this raises a KeyError
        # if that collection is not a column of the similarity matrix
        seed = collection_list2[1]
        user_recs = getRecommendations(seed, 40)
        # Unit_Train updates the scores of the first list element, so pass the seed
        similarity_training_data_frame = Unit_Train([seed], similarity_training_data_frame, user_recs, collection_set)
        # Update the similarity matrix used for the recommendations using our new improved similarity matrix
        similarity_data_frame = similarity_training_data_frame.reset_index()
        
    

In [99]:
similarity_data_frame.head()

Unnamed: 0,Collections,Axie Infinity,Bored Ape Yacht Club,CryptoPunks,Mutant Ape Yacht Club,Art Blocks,Otherdeed,NBA Top Shot,Azuki,CloneX,...,CryptoonGoonz,Deafbeef,Illuminati,Bastard Gan Punks V2,Fishy Fam,Potatoz,Mindblowon,Sipherian Surge,Wool Pouch,Los Muertos
0,Axie Infinity,1.0,0.57735,0.182574,0.57735,0.666667,0.730297,0.617212,0.547723,0.547723,...,0.0,0.0,0.0,0.0,0.0,0.0,0.182574,0.0,0.0,0.166667
1,Bored Ape Yacht Club,0.57735,1.0,0.158114,0.875,0.57735,0.632456,0.534521,0.474342,0.474342,...,0.0,0.0,0.0,0.0,0.0,0.0,0.158114,0.0,0.0,0.144338
2,CryptoPunks,0.182574,0.158114,1.0,0.158114,0.182574,0.2,0.169031,0.2,0.4,...,0.4,0.0,0.4,0.0,0.182574,0.2,0.2,0.365148,0.182574,0.182574
3,Mutant Ape Yacht Club,0.57735,0.875,0.158114,1.0,0.57735,0.632456,0.534521,0.474342,0.474342,...,0.0,0.0,0.0,0.0,0.0,0.0,0.158114,0.0,0.0,0.144338
4,Art Blocks,0.666667,0.57735,0.182574,0.57735,1.0,0.730297,0.617212,0.547723,0.547723,...,0.0,0.0,0.0,0.0,0.0,0.0,0.182574,0.0,0.0,0.166667


In [100]:
getRecommendations("Fishy Fam", 10)

['Shinsekai',
 'CatBloxGenesis',
 'Winter Bears',
 'Dour Darcels',
 'VaynerSports Pass',
 'The Wicked Craniums',
 'ApeKidsClub',
 'Smilesss',
 'SolPunks']

In [101]:
getRecommendations("ETH TOWN", 10)

KeyError: 'ETH TOWN'
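
The `KeyError` indicates that the requested collection is not a column of the similarity matrix. A possible guard, sketched here on a hypothetical toy matrix, is to filter a wallet's collections down to the ones the matrix knows about before asking for recommendations. `known_collections` is an illustrative helper, not part of the notebook.

```python
import pandas as pd

def known_collections(similarity, names):
    """Keep only the names that appear as columns of the similarity matrix."""
    return [name for name in names if name in similarity.columns]

# Hypothetical matrix that knows two collections
toy_sim = pd.DataFrame(
    [[1.0, 0.2],
     [0.2, 1.0]],
    index=["CryptoPunks", "Fishy Fam"],
    columns=["CryptoPunks", "Fishy Fam"],
)
wallet = ["ETH TOWN", "Fishy Fam", "MarbleCards"]
print(known_collections(toy_sim, wallet))  # ['Fishy Fam']
```

A wallet whose filtered list is empty would simply be skipped during training under this approach.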

In [103]:
train_model(train)

<class 'str'>


KeyError: 'MarbleCards'

In [97]:
train_model(train)

TypeError: train_model() missing 2 required positional arguments: 'similarity_training_data_frame' and 'similarity_data_frame'