<a href="https://colab.research.google.com/github/snowyTheHamster/cryptocurrency_historical_value_scrapper/blob/master/scrape_crypto_historical_values.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Get historic values of cryptocurrencies

Scrape historic values of cryptocurrencies and save them to CSV files.

+ Check the terms of service (TOS) of the website you are scraping before running this notebook.
+ Use this script and the data it pulls at your own risk.
+ Data source: https://coincheckup.com/

---

### How to use
**Prepare folders**

+ Create a project folder in Google Drive.
+ Create a folder inside the project folder to hold the CSV files.

**Adjust variables and settings in step 2**

+ List the coins you want data for.

+ Set the start date for the historic values.

+ Make sure the folder names in the settings match the folders you created in Google Drive.

**Run Code**

+ Run blocks 1 to 4 in order.
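
A mismatch between the folder names in the settings and the folders in Drive only surfaces when the script tries to write its first file. The sketch below shows one way to catch that earlier. `missing_folders` is a hypothetical helper, not part of this notebook, and the demo uses a temporary directory as a stand-in for the Drive project folder.

```python
import os
import tempfile

def missing_folders(base_dir, folder_names):
    """Return the names in folder_names that are not directories under base_dir."""
    return [name for name in folder_names
            if not os.path.isdir(os.path.join(base_dir, name))]

# Demo: a temporary directory stands in for the Drive project folder.
with tempfile.TemporaryDirectory() as project_dir:
    os.mkdir(os.path.join(project_dir, 'historic_values'))
    demo_missing = missing_folders(project_dir, ['historic_values', 'typo_folder'])
    print(demo_missing)  # only the folder that was never created is reported
```

In Colab, the same check could be run after mounting Drive by passing the project folder path and the output folder name from step 2.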

### 1. Mount Google Drive and install packages

In [14]:
from google.colab import drive
import os

# install chromium, its driver, and selenium
!apt update
!apt install chromium-chromedriver
!pip install selenium

from bs4 import BeautifulSoup
import csv
from datetime import date
from time import sleep
import random

# selenium imports (the headless options themselves are set in step 3)
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

drive.mount('/content/drive/')


### 2. Adjust Settings Below

In [15]:
# list of coins to fetch; each name must match the coin's name in the coincheckup.com URL
currency_name_list = [
    "ethereum",
    "bitcoin",
    "cardano",
    "nem",
    "binance-coin",
    "polkadot-new",
    "stellar",
    "xrp",
    ]

start_date = '2017-03-01'

# Make sure the folder names match the folders you created in Google Drive
project_folder = 'crypto_folder'
output_folder = 'historic_values'


# no need to change these
work_dir = os.path.join('/content/drive/My Drive/', project_folder)
OUTPUT_DIR = os.path.join(work_dir, output_folder)
end_date = str(date.today())
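
A mistyped start date would only show up later as an empty or wrong table. As a hedged sketch, the settings could be validated before any page is requested, assuming the `YYYY-MM-DD` format used above. `check_date_range` is a hypothetical helper, not part of this notebook.

```python
from datetime import date

def check_date_range(start_date, end_date):
    """Parse two YYYY-MM-DD strings and confirm start is not after end.

    Raises ValueError on a malformed date or an inverted range.
    """
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError(f'start_date {start_date} is after end_date {end_date}')
    return start, end

# Same values as the settings cell: a fixed start date and today's date.
start, end = check_date_range('2017-03-01', str(date.today()))
print((end - start).days, 'days requested')
```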

### 3. Run Script

In [16]:
page_counter = 1 # track number of currencies to download
for currency_name in currency_name_list:
  options = webdriver.ChromeOptions()
  options.add_argument('--headless')
  options.add_argument('--no-sandbox')
  options.add_argument('--disable-dev-shm-usage')

  driver = webdriver.Chrome(options=options)
  driver.get(f"https://coincheckup.com/coins/{currency_name}/charts")
  sleep(10) # let's wait until the page loads

  # input date range here to generate table
  date_picker = driver.find_element(By.XPATH, '//input[@ng-model="dateHistory"]')
  date_picker.click()
  sleep(1)
  date_picker.send_keys(Keys.CONTROL, 'a')
  sleep(0.5)
  date_picker.send_keys(Keys.DELETE)
  sleep(0.5)
  date_picker.send_keys(start_date)
  sleep(0.5)
  date_picker.send_keys(' - ')
  sleep(0.5)
  date_picker.send_keys(end_date)
  sleep(2)
  date_picker.send_keys(Keys.ENTER)

  sleep(10) # let's wait for the data table to fully generate

  page_source = driver.page_source # prepare page for beautiful soup to scrape

  soup = BeautifulSoup(page_source, 'html.parser')
  price_table = soup.find("table",{"class":"table dataTable"})

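  # save table data to a CSV file (skipped if the table was not found on the page)
  if price_table is None:
      print(f'no price table found for {currency_name}; skipping')
  else:
      # newline="" lets the csv module control line endings
      with open(f"{OUTPUT_DIR}/{currency_name}_price.csv", "w", newline="") as ofile:
          print(f'save CSV data for {currency_name} from {start_date} to {end_date}... coin:{page_counter}/{len(currency_name_list)}')
          writer = csv.writer(ofile)
          rows = price_table.find_all("tr")
          for row in rows:
              csv_row = []
              for cell in row.find_all(["td", "th"]):
                  csv_row.append(cell.get_text())
              writer.writerow(csv_row)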

  driver.quit()

  # pause after each page request to avoid being banned, except after the last page
  if page_counter < len(currency_name_list):
      n = random.randint(7, 15)
      # count down one second at a time
      for remaining in range(n, 0, -1):
          print(f'pausing {remaining} seconds before accessing next page')
          sleep(1)
  else:
      print('finished attempting to download CSV data for all coins')

  page_counter += 1 # increment page_counter by 1



save CSV data for ethereum from 2017-03-01 to 2021-03-18... coin:1/2
pausing 8 seconds before accessing next page
pausing 7 seconds before accessing next page
pausing 6 seconds before accessing next page
pausing 5 seconds before accessing next page
pausing 4 seconds before accessing next page
pausing 3 seconds before accessing next page
pausing 2 seconds before accessing next page
pausing 1 seconds before accessing next page
save CSV data for bitcoin from 2017-03-01 to 2021-03-18... coin:2/2
finished attempting to download CSV data for all coins
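
The script in step 3 waits a fixed 10 seconds for the page and again for the table, which is either too long or too short depending on the connection. One alternative is to poll until the table appears. The sketch below is a generic, standard-library version of that idea. `wait_until` and `table_ready` are hypothetical names, and the predicate here is a stand-in rather than a real page check.

```python
import time

def wait_until(predicate, timeout=30.0, interval=0.5):
    """Poll predicate() until it returns a truthy value or the timeout elapses.

    Returns the truthy value, or None if the timeout is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)

# Stand-in predicate: becomes truthy on the third call, like a table that
# only appears once the page has finished rendering.
calls = {'count': 0}

def table_ready():
    calls['count'] += 1
    return 'table' if calls['count'] >= 3 else None

found = wait_until(table_ready, timeout=5.0, interval=0.01)
print(found, 'after', calls['count'], 'checks')
```

In the notebook, the predicate could re-parse `driver.page_source` and return the result of the same `soup.find(...)` call used in step 3. Selenium also provides its own explicit-wait helper, `WebDriverWait`, which may be a better fit than a hand-rolled loop.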


### 4. Unmount Google Drive

In [17]:
drive.flush_and_unmount()
print('All changes made in this colab session should now be visible in Drive.')

All changes made in this colab session should now be visible in Drive.
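
Once the files are in Drive, a downloaded copy can be sanity-checked by reading it back and counting rows. This is a minimal sketch: `summarize_csv` is a hypothetical helper, and the column names in the demo file are made up for illustration, not the columns the site actually returns.

```python
import csv
import os
import tempfile

def summarize_csv(path):
    """Return (header_row, number_of_data_rows) for a CSV file."""
    with open(path, newline='') as infile:
        rows = list(csv.reader(infile))
    if not rows:
        return [], 0
    return rows[0], len(rows) - 1

# Demo with a small stand-in file (hypothetical columns and values).
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'sample_price.csv')
    with open(sample_path, 'w', newline='') as ofile:
        writer = csv.writer(ofile)
        writer.writerow(['Date', 'Close'])
        writer.writerow(['2021-03-17', '1.0'])
        writer.writerow(['2021-03-18', '2.0'])
    header, n_rows = summarize_csv(sample_path)
    print(header, n_rows)
```

A file with a header but zero data rows usually means the table had not finished loading when the page was scraped.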
