## Extract Cardano financial data from Binance
In this notebook we use the 'python-binance' package to extract financial data from the Binance API. For this to work you need your own Binance account and must request an API key and API secret. We extract financial data for Cardano (the ADAUSDT pair) at three candle intervals: 1m, 5m and 1h.

In [1]:
#!pip install python-binance

import pandas as pd
import math
import os.path
import time
from binance.client import Client
from datetime import timedelta, datetime
from dateutil import parser


### API

binance_api_key = '[YOUR API KEY HERE]'
binance_api_secret = '[YOUR SECRET API PASSWORD HERE]'

### CONSTANTS
binsizes = {"1m": 1, "5m": 5, "1h": 60}
batch_size = 750
binance_client = Client(api_key=binance_api_key, api_secret=binance_api_secret)


### FUNCTIONS
def minutes_of_new_data(symbol, kline_size, data, source):
    # Resume from the last stored candle, or start at 1 Jan 2017 when nothing is saved yet.
    if len(data) > 0:
        old = parser.parse(data["timestamp"].iloc[-1])
    elif source == "binance":
        old = datetime.strptime('1 Jan 2017', '%d %b %Y')
    # Open time of the most recent candle currently available on Binance.
    if source == "binance":
        new = pd.to_datetime(binance_client.get_klines(symbol=symbol, interval=kline_size)[-1][0], unit='ms')
    return old, new

def get_all_binance(symbol, kline_size, save = False):
    filename = '%s-%s-data.csv' % (symbol, kline_size)
    if os.path.isfile(filename): data_df = pd.read_csv(filename)
    else: data_df = pd.DataFrame()
    oldest_point, newest_point = minutes_of_new_data(symbol, kline_size, data_df, source = "binance")
    delta_min = (newest_point - oldest_point).total_seconds()/60
    available_data = math.ceil(delta_min/binsizes[kline_size])
    if oldest_point == datetime.strptime('1 Jan 2017', '%d %b %Y'): print('Downloading all available %s data for %s. Be patient..!' % (kline_size, symbol))
    else: print('Downloading %d minutes of new data available for %s, i.e. %d instances of %s data.' % (delta_min, symbol, available_data, kline_size))
    klines = binance_client.get_historical_klines(symbol, kline_size, oldest_point.strftime("%d %b %Y %H:%M:%S"), newest_point.strftime("%d %b %Y %H:%M:%S"))
    data = pd.DataFrame(klines, columns = ['timestamp', 'open', 'high', 'low', 'close', 'volume', 'close_time', 'quote_av', 'trades', 'tb_base_av', 'tb_quote_av', 'ignore' ])
    data['timestamp'] = pd.to_datetime(data['timestamp'], unit='ms')
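    if len(data_df) > 0:
        # Timestamps read back from the CSV are strings; parse them so they match the new rows.
        data_df['timestamp'] = pd.to_datetime(data_df['timestamp'])
        # DataFrame.append was removed in pandas 2.0, so use pd.concat instead.
        # The last stored candle is downloaded again, so keep only the fresh copy.
        data_df = pd.concat([data_df, data]).drop_duplicates(subset='timestamp', keep='last')
    else: data_df = data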
    data_df.set_index('timestamp', inplace=True)
    if save: data_df.to_csv(filename)
    print('All caught up..!')
    return data_df


In [2]:
ADAUSDT_1m = get_all_binance('ADAUSDT', "1m", save = True )
ADAUSDT_5m = get_all_binance('ADAUSDT', "5m", save = True )
ADAUSDT_1h = get_all_binance('ADAUSDT', "1h", save = True )


Downloading all available 1m data for ADAUSDT. Be patient..!
All caught up..!
Downloading all available 5m data for ADAUSDT. Be patient..!
All caught up..!
Downloading all available 1h data for ADAUSDT. Be patient..!
All caught up..!


After manually inspecting the datasets, it became clear that data is missing from the 1m and 1h datasets on 30/11/2020, 21/12/2020 and 11/02/2021. The missing data corresponds to announced maintenance. All missing intervals are therefore manually inserted with zero trading volume and related fields, and with constant open, high, low and close prices.
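
The gap filling described above could also be scripted. The following is a minimal sketch, not the procedure actually used for this notebook: the helper name `fill_missing_candles` is hypothetical, the frame is assumed to be indexed by `timestamp` with numeric columns (for example after reloading the saved CSV), and the flat price for an inserted candle is assumed to be the last close before the gap.

```python
import pandas as pd

# Column names follow the kline columns used in this notebook.
PRICE_COLS = ["open", "high", "low", "close"]
ACTIVITY_COLS = ["volume", "quote_av", "trades", "tb_base_av", "tb_quote_av"]


def fill_missing_candles(df, minutes):
    """Insert flat, zero-volume candles wherever the timestamp index has gaps.

    `minutes` is the candle width in minutes, e.g. binsizes["1m"] == 1.
    Returns the filled frame and the index of the timestamps that were missing.
    """
    df = df.sort_index()
    # Every candle open time that should exist between the first and last row.
    full_index = pd.date_range(df.index[0], df.index[-1],
                               freq=pd.Timedelta(minutes=minutes),
                               name=df.index.name)
    missing = full_index.difference(df.index)
    filled = df.reindex(full_index)
    # Assumption: an inserted candle stays flat at the last close before the gap.
    last_close = filled["close"].ffill()
    for col in PRICE_COLS:
        filled[col] = filled[col].fillna(last_close)
    # No trading happened during maintenance, so activity columns become 0.
    filled[ACTIVITY_COLS] = filled[ACTIVITY_COLS].fillna(0)
    # Other columns (close_time, ignore) are left as NaN in the inserted rows.
    return filled, missing
```

Calling it with the 1m frame and `binsizes["1m"]` returns the completed frame together with the timestamps that were inserted, which makes it easy to check them against the maintenance dates listed above.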
