# Cryptocurrency Market Data Analysis with CoinGecko API 

![CoinGecko API](https://img.shields.io/badge/DataSource-CoinGecko_API-blue)
![Python](https://img.shields.io/badge/Language-Python-green)
![ETL](https://img.shields.io/badge/Workflow-ETL-orange)

This project extracts, transforms, and analyzes market data for the top 100 cryptocurrencies from the CoinGecko API. It also loads the results into a SQLite database and runs SQL queries against them for insights.

**Key Features**:
- Real-time data extraction from CoinGecko API
- Currency conversion to EUR/GBP
- Automated logging system
- SQL database integration
- Example analytical queries

---

## Packages & Libraries

### Install Required Packages
```python
%pip install ipython-sql prettytable
%pip install seaborn
```

### Imported Libraries

```python
import pandas as pd
import numpy as np
from datetime import datetime
import requests
import prettytable
# Workaround for a KeyError: 'DEFAULT' that ipython-sql can raise with newer prettytable releases
prettytable.DEFAULT = 'DEFAULT'
import sqlite3
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
```

# Project Workflow

## 1. Logging Progress

The `log_progress` function appends a timestamped message to the `./log_progress` file at each stage of execution, for debugging and tracking.

```python
def log_progress(message):
    '''Append a timestamped message for the current stage of execution
    to the log file. Returns nothing.'''
    # %b is the portable abbreviated month name (%h is a platform-specific alias)
    timestamp_format = '%Y-%b-%d-%H:%M:%S'
    timestamp = datetime.now().strftime(timestamp_format)
    with open("./log_progress", "a") as f:
        f.write(f"{timestamp}:{message}\n")
```
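
To make the log format concrete, the sketch below builds a single log line for a fixed point in time. The helper name `format_log_line` is hypothetical and exists only for illustration; it uses the same year, month abbreviation, day, and time layout as `log_progress` but returns the line instead of writing it to disk.

```python
from datetime import datetime

# Same timestamp layout as log_progress: year, month abbreviation, day, time
TIMESTAMP_FORMAT = '%Y-%b-%d-%H:%M:%S'

def format_log_line(message, now=None):
    """Hypothetical helper: return the text that would be appended to the log."""
    now = now or datetime.now()
    return f"{now.strftime(TIMESTAMP_FORMAT)}:{message}"

# A fixed datetime keeps the example reproducible
line = format_log_line("ETL started", datetime(2024, 1, 5, 12, 30, 0))
print(line)
```

The month abbreviation depends on the system locale, so the exact text of the timestamp may differ between machines.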

## 2. Data Extraction

The `extract` function fetches data for the top 100 cryptocurrencies from the CoinGecko API and returns it as a pandas DataFrame ready for further processing.

**Parameters:**

- `url`: CoinGecko API endpoint URL (e.g., the `coins/markets` endpoint).

**Returns:**

- DataFrame with columns: `id`, `name`, `current_price`, `market_cap`.

```python
def extract(url):
    '''Fetch market data from the CoinGecko API and return a data frame
    with the id, name, current_price and market_cap columns.'''
    params = {
        'vs_currency': 'USD',
        # Request the top 100 coins by market cap explicitly
        'order': 'market_cap_desc',
        'per_page': 100,
        'page': 1,
    }
    # A timeout keeps the call from hanging indefinitely
    response = requests.get(url, params=params, timeout=30)
    if response.status_code != 200:
        raise RuntimeError(f"Failed to fetch data from API. Status code: {response.status_code}")
    data = response.json()
    df = pd.DataFrame(data)
    columns = ['id', 'name', 'current_price', 'market_cap']
    # Copy so later column assignments do not act on a slice of df
    return df[columns].copy()
```
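
The endpoint responds with a JSON list containing one object per coin, which is why it can be passed straight to `pd.DataFrame`. The sketch below shows that step offline with a made-up payload; the ids, names, and numbers are placeholders rather than real market data, and `sample_payload` and `sample_df` are illustrative names.

```python
import pandas as pd

# Made-up payload in the shape of a list of per-coin objects.
# The ids, names and numbers are placeholders, not real market data.
sample_payload = [
    {"id": "coin-a", "name": "Coin A", "symbol": "cna",
     "current_price": 100.0, "market_cap": 5_000_000},
    {"id": "coin-b", "name": "Coin B", "symbol": "cnb",
     "current_price": 0.5, "market_cap": 2_000_000},
]

columns = ['id', 'name', 'current_price', 'market_cap']
# One row per coin; keep only the columns used later in the workflow
sample_df = pd.DataFrame(sample_payload)[columns].copy()
print(sample_df)
```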

## 3. Data Transformation

The `transform` function reads exchange rates from `exchange_rate.csv` and adds four new columns to the DataFrame: the USD values of `current_price` and `market_cap` converted to GBP and EUR.

**Key Steps:**

- Load exchange rates from CSV.

- Calculate new columns: `current_price_GBP`, `current_price_EUR`, `market_cap_GBP`, `market_cap_EUR`.


```python
def transform(df, csv_path):
    '''Read exchange rates from the CSV file and add four columns to the
    data frame: current price and market cap converted to GBP and EUR.'''

    # Load the exchange rates and turn them into a {currency: rate} dictionary
    exchangerate_df = pd.read_csv(csv_path)
    exchange_rate = exchangerate_df.set_index('Currency').to_dict()['Rate']

    # Add the converted columns, rounded to two decimal places
    df['current_price_GBP'] = (df['current_price'] * exchange_rate['GBP']).round(2)
    df['current_price_EUR'] = (df['current_price'] * exchange_rate['EUR']).round(2)
    df['market_cap_GBP'] = (df['market_cap'] * exchange_rate['GBP']).round(2)
    df['market_cap_EUR'] = (df['market_cap'] * exchange_rate['EUR']).round(2)
    return df
```
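
As a rough illustration of the conversion, the sketch below reads a tiny in-memory CSV with the `Currency` and `Rate` columns that `transform` expects and converts two prices. The rates and prices are placeholders chosen for easy arithmetic, not real exchange rates or market data.

```python
import io
import pandas as pd

# Assumed CSV layout: a 'Currency' column and a 'Rate' column (USD-based).
# The rates below are placeholders, not real exchange rates.
csv_text = "Currency,Rate\nEUR,0.9\nGBP,0.8\n"
rates = pd.read_csv(io.StringIO(csv_text)).set_index('Currency')['Rate'].to_dict()

prices = pd.DataFrame({'name': ['Coin A', 'Coin B'],
                       'current_price': [100.0, 0.5]})
# Multiply the USD price by each rate and round to two decimals
prices['current_price_GBP'] = (prices['current_price'] * rates['GBP']).round(2)
prices['current_price_EUR'] = (prices['current_price'] * rates['EUR']).round(2)
print(prices)
```

Rounding to two decimals is convenient for display, but very low-priced coins can lose most of their precision this way.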

## 4. Data Loading

The first function, `load_to_csv`, saves the final DataFrame as a CSV file at the given path. It returns nothing.

```python
def load_to_csv(df, new_path):
    '''Save the data frame as a CSV file at the given path.'''
    df.to_csv(new_path)
```

The second function, `load_to_db`, saves the final DataFrame to a database table with the given name, replacing the table if it already exists. It returns nothing.

A helper function, `conection_to_database`, establishes the connection to the database.

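```python
def conection_to_database(database_name):
    '''Open a connection to the SQLite database and return it.'''
    try:
        db_connection = sqlite3.connect(database_name)
    except sqlite3.OperationalError:
        # Re-raise so the caller sees the original error
        raise
    else:
        print("connected")
    return db_connection

def load_to_db(df, sql_connection, table_name):
    '''Write the data frame to the given table, replacing it if it exists.'''
    df.to_sql(table_name, sql_connection, if_exists='replace', index=False)
```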
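
Because the table is written with `if_exists='replace'`, running the load step again overwrites the table instead of appending to it. The sketch below checks this against an in-memory SQLite database; the rows are placeholders and the `conn_demo` and `demo_df` names are illustrative.

```python
import sqlite3
import pandas as pd

# In-memory database so the example leaves no file behind
conn_demo = sqlite3.connect(':memory:')
demo_df = pd.DataFrame({'name': ['Coin A', 'Coin B'],
                        'market_cap': [5_000_000, 2_000_000]})

# Load the same frame twice; 'replace' drops and recreates the table each time
for _ in range(2):
    demo_df.to_sql('Crypto_Data', conn_demo, if_exists='replace', index=False)

row_count = conn_demo.execute('SELECT COUNT(*) FROM Crypto_Data').fetchone()[0]
print(row_count)  # 2 rows, not 4
conn_demo.close()
```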

## 5. ETL Execution

Now that we have all the required functions for this process, we can create a DataFrame based on the data extracted from the CoinGecko API, save it to a CSV file, and load it into a database for subsequent analysis.

### Configuration

```python
api_url = "https://api.coingecko.com/api/v3/coins/markets?x_cg_demo_api_key=CG-MbEY8jPE4gh6VQGrJrCCF5st"
exchange_rate_csv = './exchange_rate.csv'
data_csv_path = './Crypto_Data.csv'
db_name = 'CryptoData.db'
table_name = 'Crypto_Data'
log_progress("Variables are defined. Initiating ETL process")
```
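
A key written directly into the URL travels with the notebook wherever it is shared. One possible alternative, sketched below, reads the key from an environment variable and appends it as the `x_cg_demo_api_key` query parameter. The variable name `COINGECKO_API_KEY` and the helper `build_api_url` are assumptions made for this example, not part of the project.

```python
import os
from urllib.parse import urlencode

BASE_URL = "https://api.coingecko.com/api/v3/coins/markets"

def build_api_url(api_key):
    """Hypothetical helper: append the demo key as a query parameter if one is set."""
    if not api_key:
        return BASE_URL
    return f"{BASE_URL}?{urlencode({'x_cg_demo_api_key': api_key})}"

# COINGECKO_API_KEY is an assumed variable name, not one defined by the project
api_url_from_env = build_api_url(os.environ.get("COINGECKO_API_KEY", ""))
print(api_url_from_env.split("?")[0])  # print the URL without the key
```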

**Steps**
1. Extract: Fetch data from API.
2. Transform: Convert prices to EUR/GBP.
3. Load: Save to CSV and SQLite DB.

```python
df = extract(api_url)
log_progress("Extracted crypto data from CoinGecko API")
df = transform(df, exchange_rate_csv)
log_progress("Transformed data to EUR and GBP")
df
```

### Loading data

Next, we save the data to a new CSV file in the current working directory and then load it into the SQLite database.

```python
load_to_csv(df, data_csv_path)
log_progress("Loaded data to CSV")
conn = conection_to_database(db_name)
# A database cursor is needed to execute SQL statements and fetch their results
curs = conn.cursor()
load_to_db(df, conn, table_name)
log_progress("Loaded data to SQLite Database")
```

## 6. SQL Analysis

Now that the data is in the database, we connect the SQL magic extension to `CryptoData.db` so that queries can be run directly in the Jupyter notebook.

```python
%load_ext sql
%sql sqlite:///CryptoData.db
```

### Sample Queries

1. Check if the table exists

```python
%sql SELECT name FROM sqlite_master WHERE type='table'
```

2. Check the number of columns

```python
%sql SELECT count(name) FROM PRAGMA_TABLE_INFO('Crypto_Data')
```

3. Check the names and types of the columns

```python
%sql SELECT name, type FROM PRAGMA_TABLE_INFO('Crypto_Data')
```

4. Count the total number of coins

```python
%sql SELECT count(*) FROM Crypto_Data
```

5. 10 coins priced below $1

```python
%sql SELECT name, current_price FROM Crypto_Data WHERE current_price < 1 LIMIT 10
```

6. Coins with "Coin" in Name

```python
%sql SELECT name FROM Crypto_Data WHERE name LIKE '%coin%'
```

7. 10 coins with the lowest market cap

```python
%sql SELECT * FROM Crypto_Data ORDER BY market_cap LIMIT 10
```

8. Average Price of 10 Smallest Market Cap Coins

```python
%sql SELECT AVG(current_price) AS AVERAGE_PRICE_10 FROM (SELECT current_price FROM Crypto_Data ORDER BY market_cap LIMIT 10)
```

9. Coin with the minimum market cap

```python
%sql SELECT name, market_cap FROM Crypto_Data WHERE market_cap = (SELECT MIN(market_cap) FROM Crypto_Data)
```
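
Outside a notebook, where the `%sql` magic is not available, the same queries can be run through `pandas.read_sql_query`. The sketch below runs the minimum market cap query against an in-memory database filled with placeholder rows; the data and the `demo_conn` name are made up for illustration.

```python
import sqlite3
import pandas as pd

# In-memory table with placeholder rows standing in for Crypto_Data
demo_conn = sqlite3.connect(':memory:')
pd.DataFrame({
    'name': ['Coin A', 'Coin B', 'Coin C'],
    'current_price': [100.0, 0.5, 2.0],
    'market_cap': [5_000_000, 2_000_000, 3_000_000],
}).to_sql('Crypto_Data', demo_conn, index=False)

query = """
    SELECT name, market_cap
    FROM Crypto_Data
    WHERE market_cap = (SELECT MIN(market_cap) FROM Crypto_Data)
"""
# The result comes back as a DataFrame, ready for further analysis
smallest = pd.read_sql_query(query, demo_conn)
print(smallest)
demo_conn.close()
```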

## 7. Graphical Analysis

1. Top 10 Market Cap Coins

```python
top_10 = df.sort_values('market_cap', ascending=False).head(10)
ax = sns.barplot(
    data=top_10,
    x='market_cap',
    y='name',
    palette='viridis',
    edgecolor='black'
)

# market_cap is plotted in plain USD, not in millions
plt.xlabel('Market Capitalization (USD)', fontsize=12, weight='bold')
plt.ylabel('')
plt.title('Top 10 Cryptos by Market Cap', fontsize=16, pad=20, weight='bold')
plt.show()
```

2. Last 10 Coins by Market Cap

```python
last_10 = df.sort_values('market_cap', ascending=False).tail(10)
ax = sns.barplot(
    data=last_10,
    x='market_cap',
    y='name',
    palette='viridis',
    edgecolor='black'
)

# market_cap is plotted in plain USD, not in millions
plt.xlabel('Market Capitalization (USD)', fontsize=12, weight='bold')
plt.ylabel('')
plt.title('Bottom 10 Cryptos by Market Cap', fontsize=16, pad=20, weight='bold')
plt.show()
```

3. Market Share of the Top 5 Cryptos (Pie Chart)

```python
plt.figure(figsize=(10,10))
top_5 = df.sort_values('market_cap', ascending=False).head(5)
# Shares are relative to the combined market cap of these five coins
plt.pie(top_5['market_cap'], labels=top_5['name'], autopct='%1.1f%%',
        wedgeprops={'edgecolor':'black'}, startangle=90)
plt.title('Market Share of Top 5 Cryptos', fontsize=14, pad=20)
plt.show()
```

4. Market Cap vs. Current Price across All Coins

```python
plt.figure(figsize=(12,7))
sns.scatterplot(data=df, x='current_price', y='market_cap', hue='name',
                size='market_cap', sizes=(20, 500), legend=False)
# Log scales on both axes, since prices and market caps span many orders of magnitude
plt.xscale('log')
plt.yscale('log')
plt.title('Price vs Market Cap Relationship')
plt.xlabel('Price (USD - Log Scale)')
plt.ylabel('Market Cap (USD - Log Scale)')
plt.grid(alpha=0.3)
plt.show()
```

**Conclusion:**

The scatter plot shows no clear relationship between market cap and current price. Market cap is the product of price and circulating supply, so a high market cap can come from either a high price with limited supply or a low price with abundant supply.
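
The point about supply can be illustrated with a few made-up coins: two of them below reach the same market cap from very different prices. The sketch also computes a Pearson correlation on the log-scaled values, which is one way the visual impression from the plot could be checked numerically. The prices and the `supply` column are placeholders; on the real data the same calculation would use `df['current_price']` and `df['market_cap']`.

```python
import numpy as np
import pandas as pd

# Made-up coins: price and circulating supply are placeholders
coins = pd.DataFrame({
    'name': ['Coin A', 'Coin B', 'Coin C', 'Coin D'],
    'current_price': [1000.0, 0.01, 5.0, 50.0],
    'supply': [1_000, 100_000_000, 400_000, 10_000],
})
# Market cap is price times circulating supply
coins['market_cap'] = coins['current_price'] * coins['supply']

# Pearson correlation of the log-scaled values, matching the log-log plot
log_price = np.log10(coins['current_price'])
log_cap = np.log10(coins['market_cap'])
r = log_price.corr(log_cap)
print(coins[['name', 'current_price', 'market_cap']])
print(f"correlation of log values: {r:.2f}")
```

Coin A and Coin B end up with the same market cap even though their prices differ by five orders of magnitude.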

## 8. Closing

```python
log_progress("Process complete")
conn.close()
log_progress("Database connection closed")
```