# Downloading Bitcoin Transaction History Database

Various sources provide Bitcoin [datasets](https://www.kaggle.com/datasets/prasoonkottarathil/btcinusd), but most of them are distributed at daily intervals, whereas we need minute-by-minute data. Some datasets do offer minute-by-minute or hour-by-hour Bitcoin price data, but they are neither up to date nor complete.

Fortunately, [Bitget](https://www.bitget.com/data-download/spot-historical-transaction-record) offers Bitcoin transaction history from 2018 to 2024. However, files on Bitget can only be downloaded one at a time, so we use Selenium to automate the clicks a user would otherwise make by hand. This lets us download the files for each year from 2021 to 2024 and later merge them into a single comprehensive dataset.

In [1]:
import time
from selenium import webdriver
from chromedriver_py import binary_path
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import StaleElementReferenceException, WebDriverException

In [2]:
import os

options = Options()

# Chrome does not expand "~", so resolve the download directory to an absolute path.
# Update this path as needed.
download_dir = os.path.expanduser("~/btc-transactions/")
prefs = {"download.default_directory": download_dir}

options.add_experimental_option("prefs", prefs)
options.add_argument('--no-sandbox')

svc = webdriver.ChromeService(executable_path=binary_path)
driver = webdriver.Chrome(service=svc, options=options)

driver.get("https://www.bitget.com/data-download/spot-historical-transaction-record")

driver.find_element(By.XPATH, "//div[@class='item flex items-center px-18px h-104px rounded-12px text-primaryText cursor-pointer hover:bg-cardBg <md:px-0 <md:h-72px']").click()
time.sleep(2)

element = driver.find_element(By.XPATH, "//div[@class='ml-auto !w-288px !<xl:w-210px !<md:w-full <md:mt-24px bit-input bit-input--medium bit-input--round bit-input--prefix bit-input--suffix']")
time.sleep(2)

In [None]:
input_text = element.find_element(By.XPATH, ".//input[@class='bit-input__inner']")
input_text.clear()
input_text.send_keys('2021')
time.sleep(1)

button_element = driver.find_element(By.XPATH, "//div[@class='file-list']")
next_button = button_element.find_element(By.XPATH, ".//button[@class='pagination-label-arrow__next']")

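for i in range(60):  # upper bound on pages; the loop exits early once "next" fails
    # A negative y offset is clamped to 0, so this scrolls back to the top of the page.
    driver.execute_script("window.scrollTo(0, -document.body.scrollHeight);")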

    downloads = driver.find_elements(By.XPATH, "//a[@class='download text-16px text-primaryText cursor-pointer <md:mt-4px <md:ml-40px <md:text-12px']")

    for download in downloads:
        try:
            download.click()
        except StaleElementReferenceException:
            continue
    
    try:
        time.sleep(1)
        next_button.click()
        time.sleep(1)
    except (StaleElementReferenceException, WebDriverException):
        print("Next button is no longer clickable.")
        break
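
The same click-and-paginate loop is repeated in the following cells for each year. As a rough sketch only, the shared logic could be factored into one helper. The name `download_pages` and its parameters are hypothetical and not part of the original notebook. Unlike the cells here, it uses a single `skip_errors` tuple for both the link clicks and the "next" click.

```python
import time


def download_pages(find_links, click_next, max_pages, pause=1.0,
                   skip_errors=(Exception,)):
    """Click every link on each page, then advance to the next page.

    find_links  -- callable returning the clickable download elements
    click_next  -- callable that advances the pagination
    max_pages   -- upper bound on the number of pages to visit
    skip_errors -- exception types treated as "skip this link" or
                   "no more pages" (hypothetical parameter)
    Returns the number of successful clicks.
    """
    clicked = 0
    for _ in range(max_pages):
        for link in find_links():
            try:
                link.click()
                clicked += 1
            except skip_errors:
                continue  # element went stale; move on to the next link
        time.sleep(pause)
        try:
            click_next()
        except skip_errors:
            break  # the "next" button is no longer clickable
        time.sleep(pause)
    return clicked
```

In this notebook, `find_links` could be a lambda wrapping the `driver.find_elements` call above, `click_next` could be `next_button.click`, and `skip_errors` could be `(StaleElementReferenceException, WebDriverException)`.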

In [None]:
input_text = element.find_element(By.XPATH, ".//input[@class='bit-input__inner']")
input_text.clear()
input_text.send_keys('2022')
time.sleep(1)

button_element = driver.find_element(By.XPATH, "//div[@class='file-list']")
next_button = button_element.find_element(By.XPATH, ".//button[@class='pagination-label-arrow__next']")

for i in range(60):
    driver.execute_script("window.scrollTo(0, -document.body.scrollHeight);")

    downloads = driver.find_elements(By.XPATH, "//a[@class='download text-16px text-primaryText cursor-pointer <md:mt-4px <md:ml-40px <md:text-12px']")

    for download in downloads:
        try:
            download.click()
        except StaleElementReferenceException:
            continue

    try:
        time.sleep(1)
        next_button.click()
        time.sleep(1)
    except (StaleElementReferenceException, WebDriverException):
        print("Next button is no longer clickable.")
        break

In [93]:
input_text = element.find_element(By.XPATH, ".//input[@class='bit-input__inner']")
input_text.clear()
input_text.send_keys('2023')
time.sleep(1)

button_element = driver.find_element(By.XPATH, "//div[@class='file-list']")
next_button = button_element.find_element(By.XPATH, ".//button[@class='pagination-label-arrow__next']")

for i in range(60):
    driver.execute_script("window.scrollTo(0, -document.body.scrollHeight);")

    downloads = driver.find_elements(By.XPATH, "//a[@class='download text-16px text-primaryText cursor-pointer <md:mt-4px <md:ml-40px <md:text-12px']")

    for download in downloads:
        try:
            download.click()
        except StaleElementReferenceException:
            continue

    try:
        time.sleep(1)
        next_button.click()
        time.sleep(1)
    except (StaleElementReferenceException, WebDriverException):
        print("Next button is no longer clickable.")
        break

In [95]:
input_text = element.find_element(By.XPATH, ".//input[@class='bit-input__inner']")
input_text.clear()
input_text.send_keys('2024')
time.sleep(1)

button_element = driver.find_element(By.XPATH, "//div[@class='file-list']") 
next_button = button_element.find_element(By.XPATH, ".//button[@class='pagination-label-arrow__next']")

for i in range(30):
    driver.execute_script("window.scrollTo(0, -document.body.scrollHeight);")

    downloads = driver.find_elements(By.XPATH, "//a[@class='download text-16px text-primaryText cursor-pointer <md:mt-4px <md:ml-40px <md:text-12px']")

    for download in downloads:
        try:
            download.click()
        except StaleElementReferenceException:
            continue

    try:
        time.sleep(1)
        next_button.click()
        time.sleep(1)
    except (StaleElementReferenceException, WebDriverException):
        print("Next button is no longer clickable.")
        break
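
Closing the browser while files are still being written can leave incomplete downloads behind. The sketch below polls the download directory until no `.crdownload` files remain, which is the suffix Chrome gives to in-progress downloads. The function name `wait_for_downloads` and its defaults are assumptions for illustration. It cannot detect a download that has not started yet, so it works best after a short pause.

```python
import time
from pathlib import Path


def wait_for_downloads(directory, timeout=300.0, poll=1.0):
    """Return True once no Chrome partial-download files remain, False on timeout."""
    folder = Path(directory).expanduser()
    deadline = time.monotonic() + timeout
    while True:
        # Chrome names in-progress downloads "<file>.crdownload".
        pending = list(folder.glob("*.crdownload"))
        if not pending:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

Calling something like `wait_for_downloads("~/btc-transactions/")` before `driver.quit()` would give pending downloads a chance to finish.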

In [97]:
driver.quit()

#### The downloaded files are too numerous to store on GitHub, so the final dataset will be hosted on Kaggle for easier management and access.
Link to Kaggle dataset: _(Not defined yet)_
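
The merge step mentioned at the start is not shown in this notebook. As a minimal sketch, and assuming the downloaded files are plain CSV files that share one header row (the actual Bitget format may differ, for example zipped archives), they could be concatenated as follows. The function name `merge_csv_files` and its parameters are hypothetical.

```python
import csv
from pathlib import Path


def merge_csv_files(source_dir, output_path, pattern="*.csv"):
    """Concatenate CSV files sharing one header; return the number of data rows."""
    output = Path(output_path).expanduser()
    # Sort for a stable order and never read the output file as an input.
    files = [
        p for p in sorted(Path(source_dir).expanduser().glob(pattern))
        if p.resolve() != output.resolve()
    ]
    header = None
    rows_written = 0
    with open(output, "w", newline="") as out:
        writer = csv.writer(out)
        for path in files:
            with open(path, newline="") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"Header mismatch in {path.name}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Because rows are streamed one at a time, the merge does not need to hold all files in memory at once.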