# Fetch and Store Historical Bitcoin Data in PostgreSQL

This notebook demonstrates how to fetch historical daily Bitcoin (BTC) price data in USD from the CryptoCompare API, process it, and store it in a PostgreSQL database. The workflow is broken down into the following steps:

## Steps:
1. **Load Environment Variables**: Load the API key and database URL from a `.env` file.
2. **Configure Database Connection**: Set up a connection to the PostgreSQL database using SQLAlchemy.
3. **API Interaction**: Fetch historical daily BTC/USD data from the CryptoCompare API in chunks of up to 2,000 days per request.
4. **Data Processing**: Convert Unix timestamps to dates, rename the price and volume columns, drop duplicate dates, and sort the rows chronologically.
5. **Store in Database**: Write the processed data to a PostgreSQL table, replacing any existing table with the same name.

### Required Libraries:
- `os`: For handling environment variables.
- `time`: For getting the current Unix timestamp and pausing between API requests.
- `pandas`: For data manipulation and writing the result to the database.
- `requests`: For making HTTP requests to the CryptoCompare API.
- `dotenv` (installed as `python-dotenv`): For loading environment variables from the `.env` file.
- `sqlalchemy`: For connecting to the PostgreSQL database (using the `psycopg2` driver).


In [8]:
! pip install python-dotenv requests pandas sqlalchemy psycopg2-binary



In [2]:
import os
import time

import pandas as pd
import requests
from dotenv import load_dotenv
from sqlalchemy import create_engine

In [3]:
# Load environment variables from the .env file
load_dotenv(dotenv_path=".env")
API_KEY      = os.getenv("API_KEY")
DATABASE_URL = os.getenv("DATABASE_URL")


if not API_KEY:
    raise EnvironmentError("API_KEY missing.")
if not DATABASE_URL:
    raise EnvironmentError("DATABASE_URL missing.")
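The two checks above follow the same pattern, which a small helper could capture. The sketch below is an illustration only: `require_env` and the `EXAMPLE_API_KEY` variable are hypothetical names that do not appear in the notebook, and the helper reads from the process environment directly instead of calling `load_dotenv`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty.

    Hypothetical helper, not part of the original notebook.
    """
    value = os.getenv(name)
    if not value:
        raise EnvironmentError(f"{name} missing.")
    return value


# Demonstration with a placeholder value set inside this process.
os.environ["EXAMPLE_API_KEY"] = "placeholder-key"
example_key = require_env("EXAMPLE_API_KEY")
```

Failing early with the variable name in the message makes a missing or misnamed entry in the environment file easier to spot than a later authentication error would.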


## Patch Database URL for SQLAlchemy ≥ 1.4

In [4]:
# SQLAlchemy 1.4+ rejects "postgres://"; it needs "postgresql://"
if DATABASE_URL.startswith("postgres://"):
    DATABASE_URL = DATABASE_URL.replace("postgres://", "postgresql://", 1)

# Ensure psycopg2 is installed: pip install psycopg2-binary
engine = create_engine(DATABASE_URL, echo=False, future=True)

# CONFIG
TABLE_NAME   = "bitcion_daily_data"
HISTORICAL_CHUNK = 2000  # max per-request limit
BASE_URL         = "https://min-api.cryptocompare.com/data/v2/histoday"
HEADERS          = {"authorization": f"Apikey {API_KEY}"}
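The scheme rewrite in the cell above can be tried without a database. The following is a minimal sketch that assumes a made-up connection string; `normalize_postgres_url` is a hypothetical name, and the credentials and host are placeholders.

```python
def normalize_postgres_url(url: str) -> str:
    """Rewrite the legacy 'postgres://' scheme to 'postgresql://'.

    Hypothetical helper, not part of the original notebook.
    """
    legacy, current = "postgres://", "postgresql://"
    if url.startswith(legacy):
        # Only the leading scheme is swapped; the rest of the URL is untouched.
        return current + url[len(legacy):]
    return url


# Placeholder credentials and host, for illustration only.
legacy_url = "postgres://user:secret@localhost:5432/crypto"
fixed_url = normalize_postgres_url(legacy_url)
```

URLs that already use `postgresql://` (or a driver-qualified form such as `postgresql+psycopg2://`) pass through unchanged, so the function is safe to call more than once.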



## Fetching Data

In [5]:
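# Fetch one chunk of daily records
from typing import Optional


def fetch_chunk(days: int, to_ts: Optional[int] = None) -> list:
    """Fetch a batch of daily BTC/USD records ending at `to_ts` (Unix seconds).

    When `to_ts` is omitted, the parameter is not sent and the API default applies.
    """
    # limit = days - 1 assumes the endpoint returns limit + 1 records
    limit_param = max(days - 1, 1)
    params = {"fsym": "BTC", "tsym": "USD", "limit": limit_param}
    if to_ts is not None:
        params["toTs"] = to_ts

    # A timeout keeps a stalled connection from hanging the notebook
    resp = requests.get(BASE_URL, params=params, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    data = resp.json()
    if data.get("Response") != "Success":
        raise RuntimeError(f"API Error: {data.get('Message', 'Unknown')}")
    return data["Data"]["Data"]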
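The `limit` arithmetic in `fetch_chunk` can be checked offline. The sketch below assumes that the endpoint returns `limit + 1` records per request, which is consistent with the 4,383 records this notebook reports for 4,383 requested days. `plan_requests` is a hypothetical helper that makes no network calls and simply lists the batch size and `limit` value for each request.

```python
def plan_requests(total_days: int, chunk_size: int = 2000) -> list:
    """Return (batch_days, limit_param) pairs, one per request.

    Hypothetical helper for illustration; it performs no network calls.
    """
    plan = []
    days_left = total_days
    while days_left > 0:
        batch = min(chunk_size, days_left)
        # Same rule as fetch_chunk: limit = days - 1, but never below 1
        plan.append((batch, max(batch - 1, 1)))
        days_left -= batch
    return plan


batches = plan_requests(4383)
# Under the limit + 1 assumption, this is the number of records expected back.
expected_rows = sum(limit + 1 for _, limit in batches)
```

For 4,383 days the plan is two full batches of 2,000 days and a final batch of 383 days. Note that a batch of a single day still sends `limit=1`, so under the same assumption it would return two records rather than one.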



## Fetch Full Historical Data

In [6]:
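# Fetch the full history
def fetch_full_historical(total_days: int) -> pd.DataFrame:
    """Page backwards from the current time until `total_days` days are requested."""
    to_ts = int(time.time())
    days_left = total_days
    all_days = []

    while days_left > 0:
        batch = min(HISTORICAL_CHUNK, days_left)
        chunk = fetch_chunk(batch, to_ts)
        if not chunk:
            break

        all_days.extend(chunk)
        # The next request ends one second before the oldest record in this chunk
        to_ts = chunk[0]["time"] - 1
        days_left -= batch
        time.sleep(0.2)  # be kind to the API

    # Without this guard, an empty result would fail below with a KeyError
    if not all_days:
        raise RuntimeError("No data returned by the API.")

    df = pd.DataFrame(all_days)
    # Convert Unix seconds to calendar dates
    df["date"] = pd.to_datetime(df["time"], unit="s").dt.date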
    df = df.rename(columns={
        "open": "open_usd",
        "high": "high_usd",
        "low": "low_usd",
        "close": "close_usd",
        "volumeto": "volume_usd"
    })
    return (
        df[["date", "open_usd", "high_usd", "low_usd", "close_usd", "volume_usd"]]
        .drop_duplicates("date")
        .sort_values("date")
        .reset_index(drop=True)
    )
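The processing steps at the end of `fetch_full_historical` can be exercised on a few hand-written records without calling the API. In this sketch the two timestamps correspond to 2013-05-12 and 2013-05-13 (UTC midnight) and the prices are copied from the preview rows in this notebook; the third record is an artificial duplicate added only to show the de-duplication.

```python
import pandas as pd

# Hand-written sample records in the shape the API response is expected to have.
records = [
    {"time": 1368403200, "open": 114.82, "high": 118.88, "low": 114.50,
     "close": 117.98, "volumeto": 3058207.49},
    {"time": 1368316800, "open": 115.64, "high": 117.47, "low": 112.40,
     "close": 114.82, "volumeto": 2357929.41},
    # Artificial duplicate of the first record
    {"time": 1368403200, "open": 114.82, "high": 118.88, "low": 114.50,
     "close": 117.98, "volumeto": 3058207.49},
]

df = pd.DataFrame(records)
df["date"] = pd.to_datetime(df["time"], unit="s").dt.date
df = df.rename(columns={
    "open": "open_usd",
    "high": "high_usd",
    "low": "low_usd",
    "close": "close_usd",
    "volumeto": "volume_usd",
})
tidy = (
    df[["date", "open_usd", "high_usd", "low_usd", "close_usd", "volume_usd"]]
    .drop_duplicates("date")   # keep the first row seen for each date
    .sort_values("date")       # oldest first
    .reset_index(drop=True)
)
```

The result has one row per date in chronological order, regardless of the order in which the chunks were collected.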



## Storing Data in PostgreSQL

In [9]:
# MAIN
def main():
    # Number of years of data to fetch
    num_years = 12
    total_days = int(num_years * 365.25)

    print(f"\nFetching ~{total_days} days of BTC/USD data (~{num_years:.2f} years)...")
    df = fetch_full_historical(total_days)
    print(f"Fetched {len(df)} records. Writing to database…")

    # write to Postgres (will replace any existing table with the same name)
    df.to_sql(TABLE_NAME, engine, if_exists="replace", index=False)
    print(f"✅ Data saved to table '{TABLE_NAME}' in your database.")

    # Optional: preview first few rows
    print(df.head())


if __name__ == "__main__":
    main()


Fetching ~4383 days of BTC/USD data (~12.00 years)...
Fetched 4383 records. Writing to database…
✅ Data saved to table 'bitcion_daily_data' in your database.
         date  open_usd  high_usd  low_usd  close_usd   volume_usd
0  2013-05-12    115.64    117.47   112.40     114.82   2357929.41
1  2013-05-13    114.82    118.88   114.50     117.98   3058207.49
2  2013-05-14    117.98    119.80   109.42     111.40  10075279.73
3  2013-05-15    111.40    116.44   103.02     114.22  12997994.80
4  2013-05-16    114.22    118.97   112.10     118.21   5202992.37
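Because `if_exists="replace"` drops and recreates the table on every run, a second run overwrites the first rather than adding to it. The sketch below illustrates that behaviour with an in-memory SQLite database as a stand-in for PostgreSQL, which is an assumption made so the example runs without a server. The table name `demo_prices` is hypothetical, and the values are taken from the preview above.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for the PostgreSQL engine in this sketch.
conn = sqlite3.connect(":memory:")

first = pd.DataFrame({
    "date": ["2013-05-12", "2013-05-13"],
    "close_usd": [114.82, 117.98],
})
second = pd.DataFrame({
    "date": ["2013-05-14"],
    "close_usd": [111.40],
})

# The second write replaces the table created by the first write.
first.to_sql("demo_prices", conn, if_exists="replace", index=False)
second.to_sql("demo_prices", conn, if_exists="replace", index=False)

stored = pd.read_sql("SELECT date, close_usd FROM demo_prices", conn)
conn.close()
```

Only the row from the second frame remains. If earlier rows should be kept, `if_exists="append"` is the alternative, at the cost of having to handle duplicate dates separately.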
