# Bybit Token Launch Performance Analysis

This notebook performs a historical analysis of tokens launched on the Bybit spot market between July 18, 2024 and January 18, 2025, tracking key metrics at specific time intervals using the CoinGecko Pro API.

## Analysis Overview
- **Time Period**: July 18, 2024 to January 18, 2025 (6 months)
- **Data Source**: CoinGecko Pro API
- **Metrics Tracked**: Price, Market Cap, FDV, Float %, Circulating Supply, Total Supply
- **Timepoints**: Launch, 7d, 14d, 28d, 90d, 180d (a timepoint is recorded only if it falls on or before the analysis end date)
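
As a rough illustration of how the timepoints map to calendar dates, the sketch below computes each target date from a launch date and drops any that fall after the end of the analysis period. The helper name `timepoint_dates` and the example launch date are assumptions made for illustration; they are not part of the notebook's code.

```python
from datetime import date, timedelta

ANALYSIS_END = date(2025, 1, 18)          # end of the analysis period (from the overview)
TIMEPOINT_DAYS = [0, 7, 14, 28, 90, 180]  # days from launch


def timepoint_dates(launch: date, end: date = ANALYSIS_END) -> dict:
    """Map each timepoint label to its calendar date, or None if it falls after `end`."""
    schedule = {}
    for days in TIMEPOINT_DAYS:
        label = "launch" if days == 0 else f"{days}d"
        target = launch + timedelta(days=days)
        schedule[label] = target if target <= end else None
    return schedule


# Hypothetical launch on the first day of the analysis period.
schedule = timepoint_dates(date(2024, 7, 18))
for label, target in schedule.items():
    print(f"{label:>6}: {target}")
```

Because the period spans about six months, only tokens launched in its first few days reach the 180d timepoint before the end date; later launches leave that timepoint empty.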

## Requirements
To run this notebook, you need:
1. Python 3.7+ with the following packages:
   - `requests`, `pandas`, `numpy`, `python-dotenv`, `pyarrow`, `matplotlib`, `seaborn`
2. A CoinGecko Pro API key stored in a `.env` file as `COINGECKO_PRO_API_KEY`
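
One way to confirm the packages are available before running anything is to check the import path, as in this minimal sketch. It uses only the standard library; the helper name `missing_modules` is an assumption for illustration.

```python
import importlib.util

# Import names for the packages listed above; note that `python-dotenv`
# is imported as `dotenv`.
REQUIRED_MODULES = ["requests", "pandas", "dotenv", "pyarrow", "matplotlib", "seaborn"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable")
```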

## Notebook Structure
1. **Environment Setup** - Import libraries and load configuration
2. **Data Configuration** - Define analysis parameters and token list
3. **Core Functions** - API client and data processing logic
4. **Testing** - Validate with single token before full run
5. **Analysis** - Process all tokens and collect metrics
6. **Exploration** - Analyze results and calculate performance
7. **Visualization** - Create charts and graphs
8. **Export** - Save results to files

## 1. Environment Setup and Configuration

First, let's set up our environment and verify we have all required dependencies.

In [None]:
# Import required libraries
import os
import sys
import time
import requests
import pandas as pd
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Union
from dotenv import load_dotenv
import warnings
warnings.filterwarnings('ignore')

# For visualizations
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# Load environment variables
load_dotenv()

print("Libraries imported successfully")
print(f"Python version: {sys.version.split()[0]}")
print(f"Pandas version: {pd.__version__}")

In [None]:
# Check environment and API key
api_key = os.environ.get("COINGECKO_PRO_API_KEY")
if api_key:
    print("✓ CoinGecko API key found")
    print(f"  Key prefix: {api_key[:4]}...")  # Show only a short prefix to limit exposure
else:
    print("✗ CoinGecko API key NOT found")
    print("  Please create a .env file with: COINGECKO_PRO_API_KEY=your-key-here")
    
# Check if output directory exists
if not os.path.exists("output"):
    os.makedirs("output")
    print("✓ Created output directory")
else:
    print("✓ Output directory exists")

## 2. Data Configuration

Define the analysis parameters and hardcoded token list.
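
Since the token list is maintained by hand, a small sanity check can catch typos before any API calls are made. The sketch below is one possible check, not part of the notebook: `validate_tokens` and `REQUIRED_KEYS` are hypothetical names, and the entry format mirrors the list defined in this section.

```python
from datetime import datetime

REQUIRED_KEYS = {"symbol", "coingecko_id", "bybit_launch_date"}


def validate_tokens(tokens, start, end):
    """Return a list of human-readable problems; an empty list means the config looks valid."""
    problems = []
    start_dt = datetime.strptime(start, "%Y-%m-%d")
    end_dt = datetime.strptime(end, "%Y-%m-%d")
    for i, token in enumerate(tokens):
        missing = REQUIRED_KEYS - set(token)
        if missing:
            problems.append(f"entry {i}: missing keys {sorted(missing)}")
            continue
        try:
            listed = datetime.strptime(token["bybit_launch_date"], "%Y-%m-%d")
        except ValueError:
            problems.append(f"{token['symbol']}: date is not in YYYY-MM-DD format")
            continue
        if not start_dt <= listed <= end_dt:
            problems.append(f"{token['symbol']}: listing date outside analysis period")
    return problems


sample_tokens = [
    {"symbol": "PIXEL", "coingecko_id": "pixels", "bybit_launch_date": "2024-07-20"},
]
print(validate_tokens(sample_tokens, "2024-07-18", "2025-01-18"))
```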

In [None]:
# Configuration for one-time historical analysis
# Analysis period: July 18, 2024 to January 18, 2025
ANALYSIS_START_DATE = "2024-07-18"
ANALYSIS_END_DATE = "2025-01-18"

# Timepoints to track (days from launch)
TIMEPOINTS = [0, 7, 14, 28, 90, 180]

# Set random seed for reproducibility
import numpy as np
np.random.seed(42)

print(f"Analysis Period: {ANALYSIS_START_DATE} to {ANALYSIS_END_DATE}")
print(f"Timepoints: {TIMEPOINTS} days from launch")

In [None]:
# Hardcoded list of tokens launched on Bybit in the analysis period
# Note: This is a representative sample of tokens that were listed on Bybit
BYBIT_TOKENS = [
    {
        "symbol": "PIXEL",
        "coingecko_id": "pixels",
        "bybit_launch_date": "2024-07-20"
    },
    {
        "symbol": "PORTAL", 
        "coingecko_id": "portal",
        "bybit_launch_date": "2024-07-25"
    },
    {
        "symbol": "STRK",
        "coingecko_id": "starknet",
        "bybit_launch_date": "2024-08-10"
    },
    {
        "symbol": "JUP",
        "coingecko_id": "jupiter-ag",
        "bybit_launch_date": "2024-08-15"
    },
    {
        "symbol": "W",
        "coingecko_id": "wormhole",
        "bybit_launch_date": "2024-09-01"
    },
    {
        "symbol": "ETHFI",
        "coingecko_id": "ether-fi",
        "bybit_launch_date": "2024-09-10"
    },
    {
        "symbol": "TNSR",
        "coingecko_id": "tensor",
        "bybit_launch_date": "2024-09-20"
    },
    {
        "symbol": "OMNI",
        "coingecko_id": "omni-network",
        "bybit_launch_date": "2024-10-01"
    },
    {
        "symbol": "ALT",
        "coingecko_id": "altlayer",
        "bybit_launch_date": "2024-10-15"
    },
    {
        "symbol": "PYTH",
        "coingecko_id": "pyth-network",
        "bybit_launch_date": "2024-11-01"
    }
]

print(f"Number of tokens to analyze: {len(BYBIT_TOKENS)}")
print("\nTokens:")
for token in BYBIT_TOKENS:
    print(f"  - {token['symbol']}: Listed on {token['bybit_launch_date']}")

## 3. Core Functions and Classes

Define the main `BybitTokenAnalyzer` class that handles all CoinGecko API interactions and data processing.
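
The retry behaviour used for API calls can be illustrated in isolation. The following is a simplified sketch of retrying with exponential backoff, not the class's actual method: `call_with_backoff` and `flaky_request` are hypothetical names, and the `sleep` parameter is injectable so the example runs without real delays.

```python
import time


def call_with_backoff(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call `func()` until it succeeds or `max_retries` attempts are used up.

    Waits base_delay * 2**attempt seconds between attempts and returns None
    if every attempt raises.
    """
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as exc:  # a real client would catch narrower exception types
            print(f"Attempt {attempt + 1} failed: {exc}")
            if attempt < max_retries - 1:
                sleep(base_delay * 2 ** attempt)
    return None


attempts = {"count": 0}


def flaky_request():
    """Hypothetical request that fails twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return {"status": "ok"}


# Record the requested delays instead of actually sleeping.
recorded_delays = []
result = call_with_backoff(flaky_request, sleep=recorded_delays.append)
print(result, recorded_delays)
```

Doubling the delay after each failure gives a struggling endpoint progressively more time to recover while keeping the first retry fast.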

**Note**: This notebook maintains state across cells. Run cells in order from top to bottom. To reset, use `Kernel > Restart & Run All`.

In [None]:
class BybitTokenAnalyzer:
    def __init__(self):
        self.api_key = os.environ.get("COINGECKO_PRO_API_KEY")
        if not self.api_key:
            raise ValueError("COINGECKO_PRO_API_KEY not found in environment variables")
        
        self.base_url = "https://pro-api.coingecko.com/api/v3"
        self.session = requests.Session()
        
    def safe_api_call(self, url: str, params: Dict, max_retries: int = 3) -> Optional[Dict]:
        """Make API call with proper error handling and retries."""
        for attempt in range(max_retries):
            try:
                response = self.session.get(url, params=params, timeout=10)
                
                if response.status_code == 429:
                    print(f"Rate limit exceeded. Waiting 60 seconds...")
                    time.sleep(60)
                    continue
                
                response.raise_for_status()
                return response.json()
                
            except requests.exceptions.Timeout:
                if attempt < max_retries - 1:
                    time.sleep(2 ** attempt)  # Exponential backoff
            except requests.exceptions.RequestException as e:
                print(f"Request error: {e}")
                if attempt < max_retries - 1:
                    time.sleep(2 ** attempt)
        
        return None
    
    def get_historical_data(self, coin_id: str, date: str) -> Optional[Dict]:
        """
        Fetch historical data for a coin on a specific date.
        
        Args:
            coin_id: CoinGecko coin ID
            date: Date in DD-MM-YYYY format
        
        Returns:
            Dictionary with market data, or None if the request failed
        """
        url = f"{self.base_url}/coins/{coin_id}/history"
        params = {
            "date": date,
            "localization": "false",
            "x_cg_pro_api_key": self.api_key
        }
        return self.safe_api_call(url, params)
    
    def find_launch_date(self, coin_id: str) -> Optional[datetime]:
        """Find the first day trading data is available on CoinGecko.

        Scans day by day from 30 days before the Bybit listing up to the listing
        date and returns the first date that has market data. If data already
        exists at the start of the window, the token predates it and the Bybit
        listing date is used instead. Returns None if no data is found.
        """
        token_info = next((t for t in BYBIT_TOKENS if t["coingecko_id"] == coin_id), None)
        if not token_info:
            return None
            
        bybit_date = datetime.strptime(token_info["bybit_launch_date"], "%Y-%m-%d")
        
        # Linear scan forward in time: days_back = 30 is the earliest date checked
        for days_back in range(30, -1, -1):
            check_date = bybit_date - timedelta(days=days_back)
            date_str = check_date.strftime("%d-%m-%Y")
            
            data = self.get_historical_data(coin_id, date_str)
            if data and "market_data" in data:
                # Data at the window start means the token was tracked earlier
                # than we searched, so fall back to the Bybit listing date
                return bybit_date if days_back == 30 else check_date
        
        # No market data anywhere in the window
        return None
    
    def calculate_float_percentage(self, circulating_supply: Optional[float], total_supply: Optional[float]) -> Optional[float]:
        """
        Calculate float percentage with edge case handling.
        
        Edge cases:
        - If circulating_supply is None: return None
        - If total_supply is 0, None, or infinite: return None
        - If circulating_supply > total_supply: return 100.0 (data error)
        """
        if circulating_supply is None or not total_supply or total_supply == float("inf"):
            return None
        
        if circulating_supply > total_supply:
            return 100.0  # Data error, cap at 100%
        
        return (circulating_supply / total_supply) * 100
    
    def extract_metrics_from_data(self, data: Optional[Dict]) -> Dict:
        """Extract relevant metrics from CoinGecko historical data."""
        if not data or "market_data" not in data:
            return {
                "price_usd": None,
                "market_cap_usd": None,
                "fdv_usd": None,
                "float_pct": None,
                "circulating_supply": None,
                "total_supply": None
            }
        
        market_data = data["market_data"]
        
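        # Extract metrics; `or {}` guards against keys that are present but null.
        # Fields absent from the response stay None (historical snapshots may
        # not include FDV or supply figures).
        price_usd = (market_data.get("current_price") or {}).get("usd")
        market_cap_usd = (market_data.get("market_cap") or {}).get("usd")
        fdv_usd = (market_data.get("fully_diluted_valuation") or {}).get("usd")
        circulating_supply = market_data.get("circulating_supply")
        total_supply = market_data.get("total_supply")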
        
        # Calculate float percentage
        float_pct = None
        if circulating_supply is not None and total_supply is not None:
            float_pct = self.calculate_float_percentage(circulating_supply, total_supply)
        
        return {
            "price_usd": price_usd,
            "market_cap_usd": market_cap_usd,
            "fdv_usd": fdv_usd,
            "float_pct": float_pct,
            "circulating_supply": circulating_supply,
            "total_supply": total_supply
        }
    
    def collect_token_data(self, token: Dict) -> Optional[Dict]:
        """Collect all timepoint data for a single token."""
        print(f"Collecting data for {token['symbol']}...")
        
        # Find launch date
        launch_date = self.find_launch_date(token["coingecko_id"])
        if not launch_date:
            print(f"Could not find launch date for {token['symbol']}")
            return None
        
        token_data = {
            "token_symbol": token["symbol"],
            "coingecko_id": token["coingecko_id"],
            "launch_date": launch_date.strftime("%Y-%m-%d")
        }
        
        # Collect data for each timepoint
        for days in TIMEPOINTS:
            target_date = launch_date + timedelta(days=days)
            
            # Check if target date is within our analysis period
            if target_date > datetime.strptime(ANALYSIS_END_DATE, "%Y-%m-%d"):
                # Target date is beyond analysis period, skip
                suffix = f"_{days}d" if days > 0 else "_launch"
                for metric in ["price_usd", "market_cap_usd", "fdv_usd", "float_pct", "circulating_supply", "total_supply"]:
                    token_data[f"{metric}{suffix}"] = None
                continue
            
            # Try to get data for target date
            date_str = target_date.strftime("%d-%m-%Y")
            data = self.get_historical_data(token["coingecko_id"], date_str)
            
            # If no data on exact date, search nearby dates
            if not data or "market_data" not in data:
                for offset in range(1, 8):  # Search ±7 days
                    # Try forward
                    alt_date = target_date + timedelta(days=offset)
                    if alt_date <= datetime.strptime(ANALYSIS_END_DATE, "%Y-%m-%d"):
                        date_str = alt_date.strftime("%d-%m-%Y")
                        data = self.get_historical_data(token["coingecko_id"], date_str)
                        if data and "market_data" in data:
                            break
                    
                    # Try backward
                    alt_date = target_date - timedelta(days=offset)
                    if alt_date >= launch_date:
                        date_str = alt_date.strftime("%d-%m-%Y")
                        data = self.get_historical_data(token["coingecko_id"], date_str)
                        if data and "market_data" in data:
                            break
            
            # Extract metrics
            metrics = self.extract_metrics_from_data(data)
            
            # Add to token data with appropriate suffix
            suffix = f"_{days}d" if days > 0 else "_launch"
            for key, value in metrics.items():
                token_data[f"{key}{suffix}"] = value
            
            # Rate limiting
            time.sleep(1.0)  # 60 calls/min = 1s per call minimum
        
        return token_data
    
    def analyze_all_tokens(self) -> pd.DataFrame:
        """Analyze all tokens and return results as DataFrame."""
        results = []
        
        for i, token in enumerate(BYBIT_TOKENS):
            print(f"\nProcessing token {i+1}/{len(BYBIT_TOKENS)}: {token['symbol']}")
            token_data = self.collect_token_data(token)
            
            if token_data:
                results.append(token_data)
            
            # Additional rate limiting between tokens
            if i < len(BYBIT_TOKENS) - 1:
                print("Waiting before next token...")
                time.sleep(2.0)
        
        return pd.DataFrame(results)
    
    def save_results(self, df: pd.DataFrame):
        """Save results to Parquet file."""
        # Create output directory
        os.makedirs("output", exist_ok=True)
        
        # Generate filename with timestamp
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = f"output/bybit_token_analysis_{timestamp}.parquet"
        
        # Save to Parquet
        df.to_parquet(filename, index=False)
        print(f"\nResults saved to: {filename}")
        
        # Also save as CSV for easy viewing
        csv_filename = filename.replace('.parquet', '.csv')
        df.to_csv(csv_filename, index=False)
        print(f"CSV version saved to: {csv_filename}")

print("BybitTokenAnalyzer class loaded successfully")

## 4. Initialize and Test

Create an instance of the analyzer class and test with a single token before running the full analysis.

In [None]:
# Initialize the analyzer
try:
    analyzer = BybitTokenAnalyzer()
    print("✓ Analyzer initialized successfully")
    print(f"✓ API key found: {analyzer.api_key[:4]}...")  # Show only a short prefix
except ValueError as e:
    print(f"✗ Error: {e}")
    print("Please ensure you have created a .env file with your COINGECKO_PRO_API_KEY")

### 4.1 Test with Single Token

Before running the full analysis, let's test with a single token to ensure everything works correctly. This helps validate:
- API connectivity
- Data extraction logic
- Rate limiting implementation
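
The data extraction logic can also be checked offline against a hand-written payload. The sketch below is illustrative only: `usd_value`, `float_pct`, and the `sample` dictionary are hypothetical, and a real API response may carry different fields and values.

```python
def usd_value(market_data, field):
    """Return market_data[field]["usd"], or None if any level is missing or null."""
    return (market_data.get(field) or {}).get("usd")


def float_pct(circulating, total):
    """Circulating supply as a percentage of total supply, capped at 100."""
    if circulating is None or not total:
        return None
    return min(circulating / total, 1.0) * 100


# Hypothetical payload for illustration only.
sample = {
    "market_data": {
        "current_price": {"usd": 0.5},
        "market_cap": {"usd": 50_000_000},
        "fully_diluted_valuation": None,  # explicit null instead of a dict
        "circulating_supply": 100_000_000,
        "total_supply": 400_000_000,
    }
}

md = sample["market_data"]
metrics = {
    "price_usd": usd_value(md, "current_price"),
    "market_cap_usd": usd_value(md, "market_cap"),
    "fdv_usd": usd_value(md, "fully_diluted_valuation"),
    "float_pct": float_pct(md.get("circulating_supply"), md.get("total_supply")),
}
print(metrics)
```

Running a check like this first separates parsing mistakes from connectivity or quota problems.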

In [None]:
# Test with first token
test_token = BYBIT_TOKENS[0]
print(f"Testing with {test_token['symbol']}...")

# Collect data for the test token
test_data = analyzer.collect_token_data(test_token)

if test_data:
    # Display results
    print("\nTest successful! Sample data:")
    for key, value in test_data.items():
        # Keys carry a timepoint suffix (e.g. market_cap_usd_7d), so match on
        # the metric name rather than the end of the key
        if value is None:
            print(f"{key}: None")
        elif key.startswith('price_usd'):
            print(f"{key}: ${value:,.4f}")
        elif '_usd_' in key:
            print(f"{key}: ${value:,.0f}")
        elif key.startswith('float_pct'):
            print(f"{key}: {value:.2f}%")
        else:
            print(f"{key}: {value}")
else:
    print("Test failed - no data collected")

## 5. Full Analysis

Run the complete analysis for all tokens. 

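**⚠️ Warning**: This will take approximately 10-15 minutes due to API rate limiting (60 requests/minute). Each token requires up to 31 API calls to locate its launch date plus one call per timepoint, and more when the nearby-date fallback is triggered.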
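
For a rough sense of where the time goes, the sketch below adds up only the deliberate pauses in the collection loop: 1 second after each fetched timepoint and 2 seconds between tokens, as configured in the analyzer. The function name is hypothetical, and the figure assumes every timepoint is fetched; request latency, the launch-date lookup, and any 60-second rate-limit waits come on top.

```python
def planned_sleep_seconds(n_tokens, timepoints_per_token=6, per_call_sleep=1.0, between_tokens=2.0):
    """Seconds spent in deliberate pauses, excluding network time and retries."""
    per_token = timepoints_per_token * per_call_sleep
    pauses = max(n_tokens - 1, 0) * between_tokens
    return n_tokens * per_token + pauses


# 10 tokens, 6 timepoints each: 10 * 6 * 1.0 + 9 * 2.0
print(planned_sleep_seconds(10))
```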

In [None]:
%%time
# Run full analysis with timing
print("Starting full analysis...")
print("This will take approximately 10-15 minutes due to API rate limiting")
print("-" * 50)

# Analyze all tokens
results_df = analyzer.analyze_all_tokens()

print("\nAnalysis complete!")
print(f"Successfully analyzed {len(results_df)} tokens")

## 6. Data Exploration and Analysis

Let's explore the collected data and calculate summary statistics.

In [None]:
# Display basic information about the results
print(f"Shape of results: {results_df.shape}")
print(f"\nColumns ({len(results_df.columns)} total):")
for i in range(0, len(results_df.columns), 4):
    print("  " + ", ".join(results_df.columns[i:i+4]))

# Show first few rows with key columns
display_cols = ['token_symbol', 'launch_date', 'price_usd_launch', 'market_cap_usd_launch', 'float_pct_launch']
print("\nFirst 5 tokens:")
results_df[display_cols].head()

In [None]:
# Summary statistics for key metrics at launch
print("Summary statistics for launch metrics:")
launch_cols = ['price_usd_launch', 'market_cap_usd_launch', 'fdv_usd_launch', 'float_pct_launch']
results_df[launch_cols].describe()

### 6.1 Price Performance Analysis

Calculate price performance across different timeframes.
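
The performance figure is a plain percentage change relative to the launch price. A minimal standalone sketch of that formula, with a hypothetical helper name and guards for missing or zero launch prices:

```python
def pct_change(launch_price, later_price):
    """Percentage change from launch_price to later_price, or None if it cannot be computed."""
    if launch_price is None or later_price is None or launch_price == 0:
        return None
    return (later_price - launch_price) / launch_price * 100


print(pct_change(2.0, 3.0))   # price rose from 2.0 to 3.0
print(pct_change(4.0, 1.0))   # price fell from 4.0 to 1.0
print(pct_change(2.0, None))  # timepoint not available
```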

In [None]:
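# Calculate price performance percentages
# Coerce to numeric so missing values (None) become NaN instead of raising
launch_price = pd.to_numeric(results_df['price_usd_launch'], errors='coerce')
for days in [7, 14, 28, 90, 180]:
    later_price = pd.to_numeric(results_df[f'price_usd_{days}d'], errors='coerce')
    results_df[f'price_change_{days}d'] = (later_price - launch_price) / launch_price * 100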

# Display price performance
performance_cols = ['token_symbol'] + [f'price_change_{d}d' for d in [7, 14, 28, 90, 180]]
performance_df = results_df[performance_cols].round(2)
print("Price Performance (% change from launch):")
performance_df

## 7. Visualization

Create visualizations to better understand token performance patterns.

**Note**: Visualizations may take a moment to render. If plots don't appear, try running the cell again.

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

# Set style
plt.style.use('default')
sns.set_palette("husl")

# Create a figure with subplots
fig, axes = plt.subplots(2, 2, figsize=(15, 12))

# 1. Price performance over time
ax1 = axes[0, 0]
for idx, row in results_df.iterrows():
    timepoints = [0, 7, 14, 28, 90, 180]
    prices = [row['price_usd_launch']]
    for tp in timepoints[1:]:
        prices.append(row[f'price_usd_{tp}d'])
    ax1.plot(timepoints, prices, marker='o', label=row['token_symbol'])
ax1.set_xlabel('Days from Launch')
ax1.set_ylabel('Price (USD)')
ax1.set_title('Token Price Evolution')
ax1.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
ax1.grid(True, alpha=0.3)

# 2. Market cap comparison at launch
ax2 = axes[0, 1]
tokens = results_df['token_symbol']
market_caps = results_df['market_cap_usd_launch'] / 1e6  # Convert to millions
ax2.bar(tokens, market_caps)
ax2.set_xlabel('Token')
ax2.set_ylabel('Market Cap (Millions USD)')
ax2.set_title('Market Cap at Launch')
ax2.tick_params(axis='x', rotation=45)

# 3. Float percentage evolution
ax3 = axes[1, 0]
float_cols = ['float_pct_launch', 'float_pct_7d', 'float_pct_14d', 'float_pct_28d', 'float_pct_90d', 'float_pct_180d']
float_data = results_df[['token_symbol'] + float_cols].set_index('token_symbol')
float_data = float_data.apply(pd.to_numeric, errors='coerce')  # None -> NaN so plotting does not fail
float_data.T.plot(ax=ax3, marker='o')
ax3.set_xlabel('Timepoint')
ax3.set_ylabel('Float Percentage (%)')
ax3.set_title('Float Percentage Evolution')
ax3.set_xticks(range(len(float_cols)))  # Pin one tick per timepoint so the labels line up
ax3.set_xticklabels(['Launch', '7d', '14d', '28d', '90d', '180d'])
ax3.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
ax3.grid(True, alpha=0.3)

# 4. Average price performance by timeframe
ax4 = axes[1, 1]
avg_performance = []
timeframes = [7, 14, 28, 90, 180]
for tf in timeframes:
    avg_perf = results_df[f'price_change_{tf}d'].mean()
    avg_performance.append(avg_perf)
# Use categorical labels so bars are evenly spaced and equally wide
ax4.bar([f'{tf}d' for tf in timeframes], avg_performance, color=['red' if x < 0 else 'green' for x in avg_performance])
ax4.set_xlabel('Days from Launch')
ax4.set_ylabel('Average Price Change (%)')
ax4.set_title('Average Price Performance by Timeframe')
ax4.axhline(y=0, color='black', linestyle='-', alpha=0.3)
ax4.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 8. Save Results and Export

Save the analysis results to both Parquet and CSV formats for further analysis or sharing.

**Output files will be saved in the `output/` directory with timestamps.**
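
The timestamped naming scheme can be sketched with the standard library alone. The helper below is hypothetical and only builds the two paths; it does not write any files.

```python
from datetime import datetime
from pathlib import Path


def output_paths(now=None, directory="output", stem="bybit_token_analysis"):
    """Return (parquet_path, csv_path) sharing one timestamped base name."""
    now = now or datetime.now()
    timestamp = now.strftime("%Y%m%d_%H%M%S")
    base = Path(directory) / f"{stem}_{timestamp}"
    return base.with_suffix(".parquet"), base.with_suffix(".csv")


# Fixed timestamp so the example is reproducible.
parquet_path, csv_path = output_paths(datetime(2025, 1, 18, 9, 30, 0))
print(parquet_path.name)
print(csv_path.name)
```

Including the timestamp in the name means repeated runs never overwrite earlier results.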

In [None]:
# Save results
if not results_df.empty:
    analyzer.save_results(results_df)
    
    # Display final summary
    print("\nFinal Summary:")
    print(f"Total tokens analyzed: {len(results_df)}")
    
    # Tokens whose 180-day mark falls after ANALYSIS_END_DATE have no value here,
    # so drop missing entries before ranking
    change_180d = pd.to_numeric(results_df['price_change_180d'], errors='coerce').dropna()
    if change_180d.empty:
        print("No tokens have 180-day data within the analysis period")
    else:
        best, worst = change_180d.idxmax(), change_180d.idxmin()
        print(f"Average price change after 180 days: {change_180d.mean():.2f}%")
        print(f"Best performer (180d): {results_df.loc[best, 'token_symbol']} ({change_180d.max():.2f}%)")
        print(f"Worst performer (180d): {results_df.loc[worst, 'token_symbol']} ({change_180d.min():.2f}%)")
else:
    print("No data to save")

## Summary and Next Steps

This analysis provides insights into token performance after Bybit listings. Key findings can help identify patterns in:
- Initial price movements post-listing
- Float percentage changes over time
- Market cap evolution
- Long-term performance trends

### To extend this analysis:
1. Add more tokens to the `BYBIT_TOKENS` list
2. Include additional metrics (e.g., trading volume, holder count)
3. Compare with tokens listed on other exchanges
4. Implement statistical analysis for performance patterns

### To re-run with updated data:
1. Update `ANALYSIS_END_DATE` (and `ANALYSIS_START_DATE`, which is only used for display) if analyzing a different period
2. Ensure your CoinGecko API key is valid and has sufficient quota
3. Run all cells in order using `Kernel > Restart & Run All`