# DATA COMPREHESION

## DATA MINING

El objetivo es obtener todos los datos relacionados a cada token identificado por un atributo
llamado clase: Clase: (0: IA, 1: Gaming, 2: RWA, 3: Meme).
Se deben capturar datos en los siguientes periodos (desde la fecha del Halving, los siguientes
250 días):
<br/>
- Halving #1: 2/12/2012 + 250 días
- Halving #2: 2/07/2016 + 250 días
- Halving #3: 3/05/2020 + 250 días

Algunos de las características importantes a considerar son las siguientes:
-  Fecha de la captura (en formato dd/mm/yyyy)
-  Token: nombre corto del token
-  Nombre: nombre del token/proyecto
-  Valor del activo (en US$)
-  Capitalización total de mercado (market cap)
-  Posición en el ranking de criptomonedas
-  Volumen (volumen 24h)
-  Suministro circulante (circulating supply)
-  Suministro total (total supply)
-  Suministro máximo (max supply)
-  Calificación
-  Indicador de multi cadena (multichain)
-  Indicador de listado o no en Exchanges Centralizados (CEX)
- Clase: (0: IA, 1: Gaming, 2: RWA, 3: Meme)

**IMPORTANTE: existirán tokens que son nuevos y por tanto no han vivido aun un Halving. Para estos
casos, contemplar adquirir los datos de dichos tokens desde la fecha de su creación hasta la fecha
del Halving #4: 20/04/2024.**


In [2]:
import os
from dotenv import load_dotenv
import time
load_dotenv()
api_key = os.getenv("API_KEY")

In [9]:
from requests import Session
from requests.exceptions import ConnectionError, Timeout, TooManyRedirects
import json
import pandas as pd
from pathlib import Path

In [8]:
#load json file
def load_json(filepath,filename):
    with open(filepath+filename, 'r') as f:
        data = json.load(f)
    return data

In [6]:
class CoinGeckoClass:
    def __init__(self):
        self.api_key="CG-4iykpMQk3bQNCYrS4pMjmmXJ"
        self.url_base="https://api.coingecko.com/api/v3/"
        self.headers={
            "accept": "application/json",
            "x-cg-api-key":"CG-4gmHK6qGSZHmTp1Uh7ThLwr6"
            }
        self.session = Session()
        self.session.headers.update(self.headers)

        
    def __catch_error(self, url):
        try:
            response = self.session.get(url)
            data = json.loads(response.text)
            return data
        except (ConnectionError, Timeout, TooManyRedirects) as e:
            print(e)
    
    def get_categories(self):
        url = 'coins/categories'
        return self.__catch_error(self.url_base+url)
    
    def get_asset_platforms(self):
        url = 'asset_platforms'
        return self.__catch_error(self.url_base+url)
    
    def get_coin_list(self):
        url = 'coins/list?include_platform=true'
        return self.__catch_error(self.url_base+url)
    
    def get_coin_list_with_market_data(self,category):
        url = f"coins/markets?vs_currency=usd&category={category}&per_page=250&sparkline=true&price_change_percentage=1h%2C24h%2C7d&precision=full"
        return self.__catch_error(self.url_base+url)
    
    def get_exchange_list(self):
        url = 'exchanges/list'
        return self.__catch_error(self.url_base+url)

    def save_json(self,data,filepath,filename):
        with open(filepath+filename, 'w') as f:
            json.dump(data, f)

     

    def ping(self):
        url=self.url_base+"ping"
        response = self.session.get(url)
        data = json.loads(response.text)
        print(data)

In [13]:
api = CoinGeckoClass()
api.ping()

{'gecko_says': '(V3) To the Moon!'}


In [18]:
categories=api.get_categories()
api.save_json(categories,"data/raw/","categories.json")

In [22]:
asset_platforms=api.get_asset_platforms()
api.save_json(asset_platforms,"data/raw/","asset_platforms.json")

In [25]:
coin_list=api.get_coin_list()
api.save_json(coin_list,"data/raw/","coin_list.json")

### rwa

In [28]:
coin_list_with_market_data=api.get_coin_list_with_market_data("real-world-assets-rwa")
api.save_json(coin_list_with_market_data,"data/raw/","coin_list_with_market_data_real_world_assets_rwa.json")

### gaming

In [30]:
coin_list_with_market_data=api.get_coin_list_with_market_data("gaming")
api.save_json(coin_list_with_market_data,"data/raw/","coin_list_with_market_data_gaming.json")

In [31]:
coin_list_with_market_data=api.get_coin_list_with_market_data("play-to-earn")
api.save_json(coin_list_with_market_data,"data/raw/coin_list_with_market_data/","play_to_earn.json")


In [32]:
coin_list_with_market_data=api.get_coin_list_with_market_data("gaming-blockchains")
api.save_json(coin_list_with_market_data,"data/raw/coin_list_with_market_data/","gaming_blockchains.json")

In [33]:
coin_list_with_market_data=api.get_coin_list_with_market_data("gaming-utility-token")
api.save_json(coin_list_with_market_data,"data/raw/coin_list_with_market_data/","gaming_utility_token.json")

In [34]:
coin_list_with_market_data=api.get_coin_list_with_market_data("gaming-governance-token")
api.save_json(coin_list_with_market_data,"data/raw/coin_list_with_market_data/","gaming_governance_token.json")

In [35]:
coin_list_with_market_data=api.get_coin_list_with_market_data("gaming-platform")
api.save_json(coin_list_with_market_data,"data/raw/coin_list_with_market_data/","gaming_platform.json")

In [36]:
coin_list_with_market_data=api.get_coin_list_with_market_data("on-chain-gaming")
api.save_json(coin_list_with_market_data,"data/raw/coin_list_with_market_data/","on_chain_gaming.json")

### memes

In [4]:
categories = load_json("data/raw/","categories.json")

In [5]:
df_categories=pd.DataFrame(categories)

In [6]:
df_categories_meme = df_categories.loc[df_categories['content'].apply(lambda x : "meme" in str(x).lower())]

In [7]:
df_categories_meme["id"].to_list()

['meme-token',
 'dog-themed-coins',
 'elon-musk-inspired-coins',
 'solana-meme-coins',
 'cat-themed-coins',
 'base-meme-coins',
 'presale-meme-coins',
 'politifi',
 'ai-meme-coins',
 'parody-meme-coins',
 'ton-meme-coins',
 'anime-themed-coins',
 'duck-themed-coins']

In [15]:
for category in df_categories_meme["id"].to_list():
    coin_list_with_market_data=api.get_coin_list_with_market_data(category)
    api.save_json(coin_list_with_market_data,"data/raw/coin_list_with_market_data/memes/",f"{category.replace('-','_')}.json")

### IA

In [107]:
coin_list_with_market_data=api.get_coin_list_with_market_data("artificial-intelligence")
api.save_json(coin_list_with_market_data,"data/raw/coin_list_with_market_data/AI/","artificial_intelligence.json")

### CEX TOKENS

In [88]:
coin_list_with_market_data=api.get_coin_list_with_market_data("centralized-exchange-token-cex")
api.save_json(coin_list_with_market_data,"data/raw/coin_list_with_market_data/","centralized_exchange_token_cex.json")


In [93]:
data_cex=load_json("data/raw/coin_list_with_market_data/","centralized_exchange_token_cex.json")

In [95]:
len(data_cex)

42

### exchanges

In [74]:
exchange_list=api.get_exchange_list()
api.save_json(exchange_list,"data/raw/","exchange_list.json")

### clasificando tokens por clase

#### gaming

In [27]:
#load gaming json files
data_folder = Path("data/raw/coin_list_with_market_data/gaming/")
gaming_files = [file for file in data_folder.iterdir() if file.is_file()]
# to string list

In [28]:
df_all_tokens_gaming = pd.DataFrame()
for file in gaming_files:
    json_file=load_json("data/raw/coin_list_with_market_data/gaming/",file.name)
    df = pd.DataFrame(json_file)
    df = df[['id','symbol','name']]
    df['class']=1
    df_all_tokens_gaming = pd.concat([df_all_tokens_gaming,df])
    

In [29]:
df_all_tokens_gaming.drop_duplicates(subset=['id'],inplace=True)

In [30]:
#numeor de toknens gaming
len(df_all_tokens_gaming)

329

#### ia

In [31]:
data_folder = Path("data/raw/coin_list_with_market_data/AI/")
files = [file for file in data_folder.iterdir() if file.is_file()]

In [32]:
df_all_tokens_ai = pd.DataFrame()
for file in files:
    json_file=load_json("data/raw/coin_list_with_market_data/AI/",file.name)
    df = pd.DataFrame(json_file)
    df = df[['id','symbol','name']]
    df['class']=0
    df_all_tokens_ai = pd.concat([df_all_tokens_ai,df])

In [41]:
len(df_all_tokens_ai)

166

#### memes

In [33]:
data_folder = Path("data/raw/coin_list_with_market_data/memes/")
files = [file for file in data_folder.iterdir() if file.is_file()]

In [34]:
df_all_tokens_memes = pd.DataFrame()
for file in files:
    json_file=load_json("data/raw/coin_list_with_market_data/memes/",file.name)
    df = pd.DataFrame(json_file)
    df = df[['id','symbol','name']]
    df['class']=3
    df_all_tokens_memes = pd.concat([df_all_tokens_memes,df])

In [35]:
df_all_tokens_memes.duplicated(subset=['id']).sum()

226

In [36]:
df_all_tokens_memes.drop_duplicates(subset=['id'],inplace=True)

In [40]:
len(df_all_tokens_memes)

414

### rwa

In [37]:
data_folder = Path("data/raw/coin_list_with_market_data/rwa/")
files = [file for file in data_folder.iterdir() if file.is_file()]

In [38]:
df_all_tokens_rwa = pd.DataFrame()
for file in files:
    json_file=load_json("data/raw/coin_list_with_market_data/rwa/",file.name)
    df = pd.DataFrame(json_file)
    df = df[['id','symbol','name']]
    df['class']=2
    df_all_tokens_rwa = pd.concat([df_all_tokens_rwa,df])

In [39]:
len(df_all_tokens_rwa)

131

In [44]:
df_all_tokens  = pd.concat([df_all_tokens_gaming,df_all_tokens_ai,df_all_tokens_memes,df_all_tokens_rwa])

In [45]:
df_all_tokens.duplicated(subset=['id']).sum() # observamos que existen tokenes que tienen  mas de una class 

21

In [46]:
df_all_tokens.to_csv("data/clean/df_all_tokens.csv",index=False)

In [22]:
df_all_tokens = pd.read_csv("data/clean/df_all_tokens.csv")

In [23]:
df_all_tokens.head()

Unnamed: 0,id,symbol,name,class
0,immutable-x,imx,Immutable,1
1,floki,floki,FLOKI,1
2,gala,gala,GALA,1
3,beam-2,beam,Beam,1
4,axie-infinity,axs,Axie Infinity,1


In [9]:
len(df_all_tokens)

1040

In [15]:
## is in exchange centralized
exchange_list=load_json("data/raw/coin_list_with_market_data/","centralized_exchange_token_cex.json")
df_exchange_list = pd.DataFrame(exchange_list)

In [17]:
df_exchange_list = df_exchange_list[['id','name']]

In [24]:
## if id is in exchange centralized centralizaed true
df_all_tokens['centralized']=df_all_tokens['id'].apply(lambda x: 1 if x in df_exchange_list['id'].to_list() else 0)

In [26]:
df_all_tokens[df_all_tokens['centralized']==1].count().sum()

0

In [27]:
df_all_tokens.head()

Unnamed: 0,id,symbol,name,class,centralized
0,immutable-x,imx,Immutable,1,0
1,floki,floki,FLOKI,1,0
2,gala,gala,GALA,1,0
3,beam-2,beam,Beam,1,0
4,axie-infinity,axs,Axie Infinity,1,0


In [29]:
token_list_json = load_json("data/raw/","coin_list.json")
df_token_list = pd.DataFrame(token_list_json)

In [30]:
df_token_list.head()

Unnamed: 0,id,symbol,name,platforms
0,01coin,zoc,01coin,{}
1,0chain,zcn,Zus,{'ethereum': '0xb9ef770b6a5e12e45983c5d8054525...
2,0-knowledge-network,0kn,0 Knowledge Network,{'ethereum': '0x4594cffbfc09bc5e7ecf1c2e1c1e24...
3,0-mee,ome,O-MEE,{'ethereum': '0xbd89b8d708809e7022135313683663...
4,0vix-protocol,vix,0VIX Protocol,{}


In [31]:
df_token_list['platforms']=df_token_list['platforms'].apply(lambda x: len(x.keys()))

In [32]:
df_token_list.head()

Unnamed: 0,id,symbol,name,platforms
0,01coin,zoc,01coin,0
1,0chain,zcn,Zus,2
2,0-knowledge-network,0kn,0 Knowledge Network,1
3,0-mee,ome,O-MEE,1
4,0vix-protocol,vix,0VIX Protocol,0


In [33]:
## drop where platforms is 0
df_token_list = df_token_list[df_token_list['platforms']>0]

In [34]:
df_token_list.head()

Unnamed: 0,id,symbol,name,platforms
1,0chain,zcn,Zus,2
2,0-knowledge-network,0kn,0 Knowledge Network,1
3,0-mee,ome,O-MEE,1
5,0vm,zerovm,0VM,1
6,0x,zrx,0x Protocol,4


In [35]:
## if id is in token list multiplatforms true and is more than one platform
df_all_tokens['isMultiplatforms']=df_all_tokens['id'].apply(lambda x: 1 if x in df_token_list['id'].to_list() 
                                                          and df_token_list[df_token_list['id']==x]['platforms'].values[0]>1 else 0)

In [36]:
df_all_tokens.head()

Unnamed: 0,id,symbol,name,class,centralized,isMultiplatforms
0,immutable-x,imx,Immutable,1,0,0
1,floki,floki,FLOKI,1,0,1
2,gala,gala,GALA,1,0,0
3,beam-2,beam,Beam,1,0,1
4,axie-infinity,axs,Axie Infinity,1,0,1


In [37]:
df_all_tokens.to_csv("data/clean/df_all_tokens_v2.csv",index=False)

### join all memes coins in one csv

In [55]:
data_folder = Path("data/raw/coin_list_with_market_data/gaming/")
files = [file for file in data_folder.iterdir() if file.is_file()]
df_gaming = pd.DataFrame()
for file in files:
    df = pd.read_json(file)
    df_gaming = pd.concat([df_gaming,df])
df_gaming.drop_duplicates(subset=['id'],inplace=True)
df_gaming = df_gaming[['id',"symbol","name",'current_price','market_cap','market_cap_rank','fully_diluted_valuation','total_volume','high_24h','low_24h','price_change_24h','price_change_percentage_24h','market_cap_change_24h','market_cap_change_percentage_24h','circulating_supply','total_supply','max_supply']]
df_gaming.to_csv("data/clean/df_gaming.csv",index=False)

In [56]:
data_folder = Path("data/raw/coin_list_with_market_data/memes/")
files = [file for file in data_folder.iterdir() if file.is_file()]
df_memes = pd.DataFrame()


In [57]:
for file in files:
    json_file=load_json("data/raw/coin_list_with_market_data/memes/",file.name)
    df = pd.DataFrame(json_file)
    df_memes = pd.concat([df_gaming,df])
df_memes.drop_duplicates(subset=['id'],inplace=True)
df_memes = df_memes[['id',"symbol","name",'current_price','market_cap','market_cap_rank','fully_diluted_valuation','total_volume','high_24h','low_24h','price_change_24h','price_change_percentage_24h','market_cap_change_24h','market_cap_change_percentage_24h','circulating_supply','total_supply','max_supply']]
df_memes.to_csv("data/clean/df_memes.csv",index=False)

In [58]:
data_folder = Path("data/raw/coin_list_with_market_data/AI/")
files = [file for file in data_folder.iterdir() if file.is_file()]
df_ai = pd.DataFrame()
for file in files:
    df = pd.read_json(file)
    df_ai = pd.concat([df_ai,df])
df_ai.drop_duplicates(subset=['id'],inplace=True)
df_ai = df_ai[['id',"symbol","name",'current_price','market_cap','market_cap_rank','fully_diluted_valuation','total_volume','high_24h','low_24h','price_change_24h','price_change_percentage_24h','market_cap_change_24h','market_cap_change_percentage_24h','circulating_supply','total_supply','max_supply']]
df_ai.to_csv("data/clean/df_ai.csv",index=False)

In [59]:
data_folder = Path("data/raw/coin_list_with_market_data/rwa/")
files = [file for file in data_folder.iterdir() if file.is_file()]
df_rwa = pd.DataFrame()
for file in files:
    df = pd.read_json(file)
    df_rwa = pd.concat([df_rwa,df])
df_rwa.drop_duplicates(subset=['id'],inplace=True)
df_rwa = df_rwa[['id',"symbol","name",'current_price','market_cap','market_cap_rank','fully_diluted_valuation','total_volume','high_24h','low_24h','price_change_24h','price_change_percentage_24h','market_cap_change_24h','market_cap_change_percentage_24h','circulating_supply','total_supply','max_supply']]
df_rwa.to_csv("data/clean/df_rwa.csv",index=False)

In [122]:
df_all_tokens = pd.read_csv("data/clean/df_all_tokens_v2.csv")

In [123]:
df_all_tokens.drop_duplicates(subset=['id'],inplace=True)

In [124]:
df_gaming = pd.read_csv("data/clean/df_gaming.csv")
df_memes = pd.read_csv("data/clean/df_memes.csv")
df_ai = pd.read_csv("data/clean/df_ai.csv")
df_rwa = pd.read_csv("data/clean/df_rwa.csv")

In [125]:
df_gaming = df_gaming.rename(columns=lambda x: x + '_gaming' if x not in ['id'] else x)
df_ai = df_ai.rename(columns=lambda x: x + '_ai' if x not in ['id'] else x)
df_memes = df_memes.rename(columns=lambda x: x + '_memes' if x not in ['id'] else x)
df_rwa = df_rwa.rename(columns=lambda x: x + '_rwa' if x not in ['id'] else x)


In [126]:
df_all_tokens = pd.merge(df_all_tokens,df_gaming,on='id',how='left')
df_all_tokens = pd.merge(df_all_tokens,df_ai,on='id',how='left')
df_all_tokens = pd.merge(df_all_tokens,df_memes,on='id',how='left')
df_all_tokens = pd.merge(df_all_tokens,df_rwa,on='id',how='left')


In [127]:
df_all_tokens.head()

Unnamed: 0,id,symbol,name,class,centralized,isMultiplatforms,symbol_gaming,name_gaming,current_price_gaming,market_cap_gaming,...,total_volume_rwa,high_24h_rwa,low_24h_rwa,price_change_24h_rwa,price_change_percentage_24h_rwa,market_cap_change_24h_rwa,market_cap_change_percentage_24h_rwa,circulating_supply_rwa,total_supply_rwa,max_supply_rwa
0,immutable-x,imx,Immutable,1,0,0,imx,Immutable,2.177597,3178447000.0,...,,,,,,,,,,
1,floki,floki,FLOKI,1,0,1,floki,FLOKI,0.000184,1791909000.0,...,,,,,,,,,,
2,gala,gala,GALA,1,0,0,gala,GALA,0.046156,1615799000.0,...,,,,,,,,,,
3,beam-2,beam,Beam,1,0,1,beam,Beam,0.026762,1424825000.0,...,,,,,,,,,,
4,axie-infinity,axs,Axie Infinity,1,0,1,axs,Axie Infinity,7.421892,1070006000.0,...,,,,,,,,,,


In [128]:
#change nana for 0 
df_all_tokens.fillna(0,inplace=True)

In [129]:
df_all_tokens.head()

Unnamed: 0,id,symbol,name,class,centralized,isMultiplatforms,symbol_gaming,name_gaming,current_price_gaming,market_cap_gaming,...,total_volume_rwa,high_24h_rwa,low_24h_rwa,price_change_24h_rwa,price_change_percentage_24h_rwa,market_cap_change_24h_rwa,market_cap_change_percentage_24h_rwa,circulating_supply_rwa,total_supply_rwa,max_supply_rwa
0,immutable-x,imx,Immutable,1,0,0,imx,Immutable,2.177597,3178447000.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,floki,floki,FLOKI,1,0,1,floki,FLOKI,0.000184,1791909000.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,gala,gala,GALA,1,0,0,gala,GALA,0.046156,1615799000.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,beam-2,beam,Beam,1,0,1,beam,Beam,0.026762,1424825000.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,axie-infinity,axs,Axie Infinity,1,0,1,axs,Axie Infinity,7.421892,1070006000.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [130]:
df_all_tokens.drop(columns=['symbol_gaming','name_gaming'],inplace=True)
df_all_tokens.drop(columns=['symbol_ai','name_ai'],inplace=True)
df_all_tokens.drop(columns=['symbol_memes','name_memes'],inplace=True)
df_all_tokens.drop(columns=['symbol_rwa','name_rwa'],inplace=True)

In [131]:
df_all_tokens['market_cap'] = df_all_tokens['market_cap_gaming']+df_all_tokens['market_cap_ai']+df_all_tokens['market_cap_memes']+df_all_tokens['market_cap_rwa']
df_all_tokens.drop(columns=['market_cap_gaming','market_cap_ai','market_cap_memes','market_cap_rwa'],inplace=True)

In [132]:
df_all_tokens['total_volume'] = df_all_tokens['total_volume_gaming']+df_all_tokens['total_volume_ai']+df_all_tokens['total_volume_memes']+df_all_tokens['total_volume_rwa']
df_all_tokens.drop(columns=['total_volume_gaming','total_volume_ai','total_volume_memes','total_volume_rwa'],inplace=True)

In [133]:
df_all_tokens['high_24h'] = df_all_tokens['high_24h_gaming']+df_all_tokens['high_24h_ai']+df_all_tokens['high_24h_memes']+df_all_tokens['high_24h_rwa']
df_all_tokens.drop(columns=['high_24h_gaming','high_24h_ai','high_24h_memes','high_24h_rwa'],inplace=True)

In [134]:
df_all_tokens['low_24h'] = df_all_tokens['low_24h_gaming']+df_all_tokens['low_24h_ai']+df_all_tokens['low_24h_memes']+df_all_tokens['low_24h_rwa']
df_all_tokens.drop(columns=['low_24h_gaming','low_24h_ai','low_24h_memes','low_24h_rwa'],inplace=True)

In [135]:
df_all_tokens['price_change_24h'] = df_all_tokens['price_change_24h_gaming']+df_all_tokens['price_change_24h_ai']+df_all_tokens['price_change_24h_memes']+df_all_tokens['price_change_24h_rwa']
df_all_tokens.drop(columns=['price_change_24h_gaming','price_change_24h_ai','price_change_24h_memes','price_change_24h_rwa'],inplace=True)



In [136]:
df_all_tokens['price_change_percentage_24h'] = df_all_tokens['price_change_percentage_24h_gaming']+df_all_tokens['price_change_percentage_24h_ai']+df_all_tokens['price_change_percentage_24h_memes']+df_all_tokens['price_change_percentage_24h_rwa']
df_all_tokens.drop(columns=['price_change_percentage_24h_gaming','price_change_percentage_24h_ai','price_change_percentage_24h_memes','price_change_percentage_24h_rwa'],inplace=True)



In [137]:
df_all_tokens['market_cap_change_24h'] = df_all_tokens['market_cap_change_24h_gaming']+df_all_tokens['market_cap_change_24h_ai']+df_all_tokens['market_cap_change_24h_memes']+df_all_tokens['market_cap_change_24h_rwa']
df_all_tokens.drop(columns=['market_cap_change_24h_gaming','market_cap_change_24h_ai','market_cap_change_24h_memes','market_cap_change_24h_rwa'],inplace=True)

In [138]:
df_all_tokens['current_price'] = df_all_tokens['current_price_gaming']+df_all_tokens['current_price_ai']+df_all_tokens['current_price_memes']+df_all_tokens['current_price_rwa']
df_all_tokens.drop(columns=['current_price_gaming','current_price_ai','current_price_memes','current_price_rwa'],inplace=True)

In [139]:
df_all_tokens['circulating_supply'] = df_all_tokens['circulating_supply_gaming']+df_all_tokens['circulating_supply_ai']+df_all_tokens['circulating_supply_memes']+df_all_tokens['circulating_supply_rwa']
df_all_tokens.drop(columns=['circulating_supply_gaming','circulating_supply_ai','circulating_supply_memes','circulating_supply_rwa'],inplace=True)

In [140]:
df_all_tokens['total_supply'] = df_all_tokens['total_supply_gaming']+df_all_tokens['total_supply_ai']+df_all_tokens['total_supply_memes']+df_all_tokens['total_supply_rwa']
df_all_tokens.drop(columns=['total_supply_gaming','total_supply_ai','total_supply_memes','total_supply_rwa'],inplace=True)

In [141]:
df_all_tokens['max_supply'] = df_all_tokens['max_supply_gaming']+df_all_tokens['max_supply_ai']+df_all_tokens['max_supply_memes']+df_all_tokens['max_supply_rwa']
df_all_tokens.drop(columns=['max_supply_gaming','max_supply_ai','max_supply_memes','max_supply_rwa'],inplace=True)

In [142]:
df_all_tokens['fully_diluted_valuation']= df_all_tokens['fully_diluted_valuation_gaming']+df_all_tokens['fully_diluted_valuation_ai']+df_all_tokens['fully_diluted_valuation_memes']+df_all_tokens['fully_diluted_valuation_rwa']
df_all_tokens.drop(columns=['fully_diluted_valuation_gaming','fully_diluted_valuation_ai','fully_diluted_valuation_memes','fully_diluted_valuation_rwa'],inplace=True)

In [143]:
df_all_tokens['market_cap_change_percentage_24h'] = df_all_tokens['market_cap_change_percentage_24h_gaming']+df_all_tokens['market_cap_change_percentage_24h_ai']+df_all_tokens['market_cap_change_percentage_24h_memes']+df_all_tokens['market_cap_change_percentage_24h_rwa']
df_all_tokens.drop(columns=['market_cap_change_percentage_24h_gaming','market_cap_change_percentage_24h_ai','market_cap_change_percentage_24h_memes','market_cap_change_percentage_24h_rwa'],inplace=True)

In [146]:
df_all_tokens.loc[df_all_tokens['class'] == 0, 'rank'] = df_all_tokens['market_cap_rank_ai']
df_all_tokens.loc[df_all_tokens['class'] == 1, 'rank'] = df_all_tokens['market_cap_rank_gaming']
df_all_tokens.loc[df_all_tokens['class'] == 2, 'rank'] = df_all_tokens['market_cap_rank_rwa']
df_all_tokens.loc[df_all_tokens['class'] == 3, 'rank'] = df_all_tokens['market_cap_rank_memes']

df_all_tokens.drop(columns=['market_cap_rank_gaming','market_cap_rank_ai','market_cap_rank_memes','market_cap_rank_rwa'],inplace=True)

In [147]:
df_all_tokens.head()

Unnamed: 0,id,symbol,name,class,centralized,isMultiplatforms,market_cap,total_volume,high_24h,low_24h,price_change_24h,price_change_percentage_24h,market_cap_change_24h,current_price,circulating_supply,total_supply,max_supply,fully_diluted_valuation,market_cap_change_percentage_24h,rank
0,immutable-x,imx,Immutable,1,0,0,6356894000.0,99469710.0,4.86,4.36,-0.332313,-14.17866,-482078100.0,4.355194,2913630000.0,4000000000.0,4000000000.0,8727111000.0,-14.09798,38.0
1,floki,floki,FLOKI,1,0,1,3583818000.0,588523888.0,0.000409,0.000361,-2.7e-05,-13.46436,-233830100.0,0.000367,19422530000000.0,20000000000000.0,20000000000000.0,3690373000.0,-12.24996,61.0
2,gala,gala,GALA,1,0,0,3231598000.0,354364488.0,0.104118,0.092706,-0.009738,-19.08504,-331121800.0,0.092311,69745280000.0,69745300000.0,100000000000.0,3231599000.0,-18.58814,67.0
3,beam-2,beam,Beam,1,0,1,2849650000.0,83199828.0,0.060932,0.053563,-0.006298,-21.0554,-329629100.0,0.053525,105996300000.0,124868000000.0,124868000000.0,3357006000.0,-20.73608,72.0
4,axie-infinity,axs,Axie Infinity,1,0,1,2140012000.0,206036026.0,16.76,14.8,-0.62477,-8.07794,-79353320.0,14.843783,287319800.0,540000000.0,540000000.0,4022022000.0,-7.151,89.0


In [149]:
df_all_tokens.rename(columns={'id':'id_coingecko'},inplace=True)

In [150]:
df_all_tokens.columns.to_list()

['id_coingecko',
 'symbol',
 'name',
 'class',
 'centralized',
 'isMultiplatforms',
 'market_cap',
 'total_volume',
 'high_24h',
 'low_24h',
 'price_change_24h',
 'price_change_percentage_24h',
 'market_cap_change_24h',
 'current_price',
 'circulating_supply',
 'total_supply',
 'max_supply',
 'fully_diluted_valuation',
 'market_cap_change_percentage_24h',
 'rank']

In [152]:
## dd/mm/yyyy
df_all_tokens['date'] = time.strftime("%d/%m/%Y")

In [153]:
df_all_tokens.to_csv("data/clean/df_all_tokens_v3.csv",index=False)

In [154]:
df_all_tokens = pd.read_csv("data/clean/df_all_tokens_v3.csv")

In [155]:
df_all_tokens.head()

Unnamed: 0,id_coingecko,symbol,name,class,centralized,isMultiplatforms,market_cap,total_volume,high_24h,low_24h,...,price_change_percentage_24h,market_cap_change_24h,current_price,circulating_supply,total_supply,max_supply,fully_diluted_valuation,market_cap_change_percentage_24h,rank,date
0,immutable-x,imx,Immutable,1,0,0,6356894000.0,99469710.0,4.86,4.36,...,-14.17866,-482078100.0,4.355194,2913630000.0,4000000000.0,4000000000.0,8727111000.0,-14.09798,38.0,28/04/2024
1,floki,floki,FLOKI,1,0,1,3583818000.0,588523888.0,0.000409,0.000361,...,-13.46436,-233830100.0,0.000367,19422530000000.0,20000000000000.0,20000000000000.0,3690373000.0,-12.24996,61.0,28/04/2024
2,gala,gala,GALA,1,0,0,3231598000.0,354364488.0,0.104118,0.092706,...,-19.08504,-331121800.0,0.092311,69745280000.0,69745300000.0,100000000000.0,3231599000.0,-18.58814,67.0,28/04/2024
3,beam-2,beam,Beam,1,0,1,2849650000.0,83199828.0,0.060932,0.053563,...,-21.0554,-329629100.0,0.053525,105996300000.0,124868000000.0,124868000000.0,3357006000.0,-20.73608,72.0,28/04/2024
4,axie-infinity,axs,Axie Infinity,1,0,1,2140012000.0,206036026.0,16.76,14.8,...,-8.07794,-79353320.0,14.843783,287319800.0,540000000.0,540000000.0,4022022000.0,-7.151,89.0,28/04/2024


In [None]:
### halving tokens