Cryptoreader #917

Open
wants to merge 25 commits into
base: main
8 changes: 8 additions & 0 deletions docs/source/readers/crypto.rst
@@ -0,0 +1,8 @@
Cryptocurrency Data Reader
--------------------------

.. py:module:: pandas_datareader.crypto

.. autoclass:: CryptoReader
   :members:
   :inherited-members: read
1 change: 1 addition & 0 deletions docs/source/readers/index.rst
@@ -22,3 +22,4 @@ Data Readers
    tsp
    world-bank
    yahoo
    crypto
104 changes: 104 additions & 0 deletions docs/source/remote_data.rst
@@ -44,6 +44,7 @@ Currently the following sources are supported:
- :ref:`Tiingo<remote_data.tiingo>`
- :ref:`World Bank<remote_data.wb>`
- :ref:`Yahoo Finance<remote_data.yahoo>`
- :ref:`Cryptocurrency Data<remote_data.crypto>`

It should be noted, that various sources support different kinds of data, so not all sources implement the same methods and the data elements returned might also differ.

@@ -762,3 +763,106 @@ The following endpoints are available:

    dividends = web.DataReader('IBM', 'yahoo-dividends', start, end)
    dividends.head()


.. _remote_data.crypto:

Cryptocurrency Data
===================

Access historical data feeds from popular, liquid exchanges and data-aggregating platforms, returned as OHLCV candles.
Platforms such as ``Coingecko`` or ``Coinpaprika`` additionally return the global turnover volume and market capitalization.

The ``CryptoReader`` provides helper methods that list all supported exchanges/platforms and the
currency pairs listed on each of them:

.. ipython:: python

    from pandas_datareader.crypto import CryptoReader
    CryptoReader.get_all_exchanges()

    ['50x',
     'alterdice',
     ...]

To list the currency pairs supported by a selected exchange:

.. ipython:: python

    reader = CryptoReader(exchange_name="coinbase")
    reader.get_currency_pairs()

         Exchange   Base Quote
    0    COINBASE  SUPER  USDT
    1    COINBASE   COMP   USD
    2    COINBASE  COVAL   USD
    3    COINBASE    GTC   USD
    4    COINBASE   ATOM   BTC
    ..        ...    ...   ...
    399  COINBASE    TRB   BTC
    400  COINBASE    GRT   USD
    401  COINBASE   BICO   USD
    402  COINBASE    FET   USD
    403  COINBASE    ORN   USD
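
A pair string can be checked against such a listing before any data is requested. The sketch below is a standalone illustration, not the reader's own code: ``is_listed`` is a hypothetical helper, and it assumes the raw listing is a list of lower-case ``(exchange, base, quote)`` tuples, which is how ``_check_symbols`` in this PR appears to treat the result of ``get_currency_pairs(raw_data=True)``.

```python
from typing import Iterable, Tuple

Pair = Tuple[str, str, str]  # (exchange, base, quote), all lower case


def is_listed(symbol: str, exchange: str, pairs: Iterable[Pair]) -> bool:
    """Return True if `symbol` (e.g. "btc-usd" or "btc/usd") is in `pairs`."""
    listed = set(pairs)
    # Accept both separators, mirroring the "-" and "/" handling in the PR.
    for sep in ("-", "/"):
        parts = symbol.lower().split(sep)
        if len(parts) == 2 and (exchange.lower(), parts[0], parts[1]) in listed:
            return True
    return False


# Hypothetical listing (assumed shape, not real exchange output).
pairs = [("coinbase", "comp", "usd"), ("coinbase", "atom", "btc")]

print(is_listed("ATOM-BTC", "coinbase", pairs))
print(is_listed("atom/btc", "coinbase", pairs))
print(is_listed("btc-eur", "coinbase", pairs))
```

Converting the listing to a set keeps each lookup cheap even when an exchange lists several hundred pairs.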

The CryptoReader class takes the following arguments:

* ``symbols`` - the currency-pair of interest
* ``exchange_name`` - the name of the exchange or platform
* ``start`` - start date
* ``end`` - end date
* ``interval`` - the candle interval (e.g. "minutes", "hours", "days")
* ``**kwargs`` - additional arguments passed to the parent classes

There are several equivalent ways to retrieve cryptocurrency data; all take the same arguments:

.. ipython:: python

    import pandas_datareader.data as web
    web.DataReader(...)

    import pandas_datareader as pdr
    pdr.get_data_crypto(...)

    from pandas_datareader.crypto import CryptoReader
    reader = CryptoReader(...)
    reader.read()

.. ipython:: python

    import pandas_datareader.data as web

    df = web.DataReader("btc-usd", "coinbase")
    df.head()
                                 open    high     low   close       volume
    time
    2015-07-20 23:59:59+00:00  277.98  280.00  277.37  280.00   782.883420
    2015-07-21 23:59:59+00:00  279.96  281.27  276.85  277.32  4943.559434
    2015-07-22 23:59:59+00:00  277.33  278.54  275.01  277.89  4687.909383
    2015-07-23 23:59:59+00:00  277.96  279.75  276.28  277.39  5306.919575
    2015-07-24 23:59:59+00:00  277.23  291.52  276.43  289.12  7362.469083

Additionally, the ``CryptoReader`` can be used directly:

.. ipython:: python

    from pandas_datareader.crypto import CryptoReader

    reader = CryptoReader("eth-usd", "coinbase", interval="minutes", start="2021-10-01", end="2021-10-02")
    df = reader.read()
    print(df)

                                  open     high      low    close      volume
    time
    2021-10-01 00:00:59+00:00  3001.14  3001.42  2998.49  2999.89  100.564601
    2021-10-01 00:01:59+00:00  2999.67  3005.99  2999.67  3005.99   91.007463
    2021-10-01 00:02:59+00:00  3006.00  3015.14  3001.83  3014.67  494.276213
    2021-10-01 00:03:59+00:00  3015.13  3020.19  3011.47  3020.19  174.329287
    2021-10-01 00:04:59+00:00  3019.60  3026.60  3015.58  3024.96  131.651872
                                   ...      ...      ...      ...         ...
    2021-10-01 23:55:59+00:00  3305.32  3307.15  3301.70  3306.91   49.974808
    2021-10-01 23:56:59+00:00  3306.90  3308.94  3306.05  3307.74   24.950854
    2021-10-01 23:57:59+00:00  3308.18  3309.99  3307.31  3309.66   28.768391
    2021-10-01 23:58:59+00:00  3309.97  3311.25  3308.18  3311.23  131.851763
    2021-10-01 23:59:59+00:00  3311.23  3311.72  3308.70  3311.16   72.940978
    [1444 rows x 5 columns]
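
The timestamps in the outputs above are timezone-aware (UTC), while ``start`` and ``end`` are typically given as naive dates, so they have to be made timezone-aware before the series can be cut to the requested range. The PR does this with ``pytz.utc.localize`` on a pandas index; the sketch below shows the same idea with the standard library only. ``cut_to_range`` and the sample rows are hypothetical, and the naive inputs are assumed to mean UTC.

```python
from datetime import datetime, timezone


def cut_to_range(rows, start, end):
    """Keep rows whose UTC timestamp lies in [start, end], sorted by time.

    `rows` are (timestamp, value) tuples with timezone-aware timestamps;
    `start` and `end` are naive datetimes interpreted as UTC (assumption).
    """
    start_utc = start.replace(tzinfo=timezone.utc)
    end_utc = end.replace(tzinfo=timezone.utc)
    return sorted(
        (row for row in rows if start_utc <= row[0] <= end_utc),
        key=lambda row: row[0],
    )


# Illustrative rows only, not real market data.
rows = [
    (datetime(2021, 10, 2, 0, 0, 59, tzinfo=timezone.utc), 3311.0),
    (datetime(2021, 10, 1, 0, 0, 59, tzinfo=timezone.utc), 2999.89),
    (datetime(2021, 9, 30, 23, 59, 59, tzinfo=timezone.utc), 3001.0),
]
trimmed = cut_to_range(rows, datetime(2021, 10, 1), datetime(2021, 10, 2))
print(trimmed)
```

Comparing a naive ``datetime`` with an aware one raises ``TypeError`` in Python, which is why the bounds are converted first rather than the data.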
2 changes: 2 additions & 0 deletions pandas_datareader/__init__.py
@@ -28,6 +28,7 @@
    get_records_iex,
    get_summary_iex,
    get_tops_iex,
    get_data_crypto,
)

PKG = os.path.dirname(__file__)
@@ -62,6 +63,7 @@
    "get_data_tiingo",
    "get_iex_data_tiingo",
    "get_data_alphavantage",
    "get_data_crypto",
    "test",
]

237 changes: 237 additions & 0 deletions pandas_datareader/crypto.py
@@ -0,0 +1,237 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-

from typing import Dict, List, Union, Optional

import warnings
from abc import ABC
from sys import stdout
import pandas as pd
import time
from datetime import datetime
import pytz
import requests.exceptions

from pandas_datareader.crypto_utils.exchange import Exchange
from pandas_datareader.crypto_utils.utilities import get_exchange_names
from pandas_datareader.crypto_utils.utilities import sort_columns, print_timestamp
from pandas_datareader.exceptions import EmptyResponseError


class CryptoReader(Exchange, ABC):
    """ Class to request the data from a given exchange for a given currency-pair.
    The class inherits from Exchange to extract and format the request urls,
    as well as to extract and format the values from the response json.
    The requests are performed by the _BaseReader.
    """

    def __init__(
        self,
        symbols: Optional[Union[str, dict]] = None,
        exchange_name: Optional[str] = None,
        start: Optional[Union[str, datetime]] = None,
        end: Optional[Union[str, datetime]] = None,
        interval: str = "days",
        **kwargs,
    ):
        """ Constructor. Inherits from the Exchange and _BaseReader class.

        @param symbols: Currency pair to request (e.g. BTC-USD)
        @param exchange_name: String repr of the exchange name
        @param start: The start time of the request, handed over to the BaseReader.
        @param end: The end time of the request, handed over to the BaseReader.
        @param interval: Candle interval (e.g. minutes, hours, days, weeks, months)
        @param **kwargs: Additional kw-arguments for the _BaseReader class.
        """

        if not start:
            start = datetime(2009, 1, 1)

        super(CryptoReader, self).__init__(
            exchange_name, interval, symbols, start, end, **kwargs
        )

    def _get_data(self) -> Dict:
        """ Requests the data and returns the response json.

        @return: Response json
        """

        # Extract and format the url and parameters for the request
        self.param_dict = "historic_rates"
        self.url_and_params = "historic_rates"

        # Perform the request
        resp = self._get_response(self.url, params=self.params, headers=None)

        # Await the rate-limit to avoid ip ban.
        self._await_rate_limit()

        return resp.json()

    def read(self, new_symbols: Optional[str] = None) -> pd.DataFrame:
        """ Requests and extracts the data. Requests may be performed iteratively
        over time to collect the full time-series.

        @param new_symbols: New currency-pair to request, if it differs from
                            the one given to the constructor.
        @return: pd.DataFrame of the returned data.
        """

        if new_symbols:
            self.symbol_setter(new_symbols)

        # Check if the provided currency-pair is listed on the exchange.
        if not self._check_symbols:
            raise KeyError(
                "The provided currency-pair is not listed on "
                "'%s'. "
                "Call CryptoReader.get_currency_pairs() for an overview."
                % self.name.capitalize()
            )

        result = list()
        mappings = list()
        # Repeat until no "older" timestamp is delivered.
        # Cryptocurrency exchanges often restrict the amount of
        # data points returned by a single request, thus making it
        # necessary to iterate backwards in time and merge the retrieved data.
        while True:
            # perform request and extract data.
            resp = self._get_data()
            try:
                data, mappings = self.format_data(resp)
            # Break if no data is returned
            except EmptyResponseError:
                break

            # or all returned data points already exist.
            if result == data or all([datapoint in result for datapoint in data]):
                break

            if self.interval == "minutes":
                print_timestamp(list(self.symbols.values())[0])

            # Append new data to the result list
            result = result + data

            # Find the place in the mapping list for the key "time".
            time_key = {v: k for k, v in enumerate(mappings)}
            time_key = time_key.get("time")

            # Extract the minimum timestamp from the response for further requests.
            new_time = min(item[time_key] for item in data)

            # Break if min timestamp is lower than initial start time.
            if new_time.timestamp() <= self.start.timestamp():
                break
            # Or continue requesting from the new timestamp.
            else:
                self.symbols.update({list(self.symbols.keys())[0]: new_time})

        # Move cursor to the next line to ensure that
        # new print statements are executed correctly.
        stdout.write("\n")
        # If there is data put it into a pd.DataFrame,
        # set index and cut it to fit the initial start/end time.
        if result:
            result = pd.DataFrame(result, columns=mappings)
            result = self._index_and_cut_dataframe(result)

        # Reset the self.end date of the _BaseReader for further requesting.
        self.reset_request_start_date()

        return result

    @staticmethod
    def get_all_exchanges() -> List:
        """ Get all supported exchange names.

        @return: List of exchange names.
        """

        return get_exchange_names()

    def get_currency_pairs(
        self, raw_data: bool = False
    ) -> Optional[Union[pd.DataFrame, List]]:
        """ Requests all supported currency pairs from the exchange.

        @param raw_data: Return the raw data as a list of tuples.
        @return: pd.DataFrame of all listed currency pairs, the raw list of
                 tuples if raw_data is True, or None if the request fails.
        """

        self.param_dict = "currency_pairs"
        self.url_and_params = "currency_pairs"
        try:
            resp = self._get_response(self.url, params=None, headers=None)
            resp = self.format_currency_pairs(resp.json())
        except (requests.exceptions.MissingSchema, Exception):
            return None

        if raw_data:
            return resp

        # create pd.DataFrame and apply upper case to values
        data = pd.DataFrame(resp, columns=["Exchange", "Base", "Quote"])
        data = data.apply(lambda x: x.str.upper(), axis=0)

        return data

    @property
    def _check_symbols(self) -> bool:
        """ Checks if the specified currency-pair is listed on the exchange"""

        currency_pairs = self.get_currency_pairs(raw_data=True)
        symbols = (
            self.symbols.keys() if isinstance(self.symbols, dict) else [self.symbols]
        )

        if currency_pairs is None:
            warnings.warn(
                "Currency-pair request is dysfunctional. "
                "Check for valid symbols is skipped."
            )
            return True

        return any(
            [
                all(
                    [
                        (self.name, *symbol.lower().split("/")) in currency_pairs
                        for symbol in symbols
                    ]
                ),
                all(
                    [
                        (self.name, *symbol.lower().split("-")) in currency_pairs
                        for symbol in symbols
                    ]
                ),
            ]
        )

    def _await_rate_limit(self):
        """ Sleep time in order to not violate the rate limit,
        measured in requests per minute."""

        time.sleep(self.rate_limit)

    def _index_and_cut_dataframe(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        """ Set index and cut data according to user specification.

        @param dataframe: Requested raw data
        @return: pd.DataFrame with specified length and proper index.
        """

        # Reindex dataframe and cut it to the specified start and end dates.
        dataframe.set_index("time", inplace=True)
        dataframe.sort_index(inplace=True)
        # Returned timestamps from the exchanges are converted into UTC
        # and therefore timezone aware. Make start and end dates
        # timezone aware in order to make them comparable.
        dataframe = dataframe.loc[
            pytz.utc.localize(self.start) : pytz.utc.localize(self.end)
        ]

        return sort_columns(dataframe)
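
The backward pagination that ``read()`` relies on (exchanges cap the number of candles per request, so the reader walks back in time and merges the pages) can be sketched in isolation. This is a simplified illustration under stated assumptions, not the PR's implementation: ``fetch_all`` and ``fake_fetch`` are hypothetical, timestamps are plain integers, and the fake exchange returns at most four candles ending at the requested cursor.

```python
from typing import Callable, Dict, List, Tuple

Candle = Tuple[int, float]  # (timestamp, close)


def fetch_all(
    fetch_page: Callable[[int], List[Candle]], start: int, end: int
) -> List[Candle]:
    """Page backwards from `end` until `start` is reached or nothing new arrives."""
    seen: Dict[int, Candle] = {}
    cursor = end
    while True:
        page = fetch_page(cursor)
        new = [candle for candle in page if candle[0] not in seen]
        if not new:  # empty page, or only candles we already hold
            break
        for candle in new:
            seen[candle[0]] = candle
        oldest = min(candle[0] for candle in page)
        if oldest <= start:  # requested range is covered
            break
        cursor = oldest  # continue from the oldest timestamp seen so far
    # Sort chronologically and cut to the requested range.
    return sorted(c for c in seen.values() if start <= c[0] <= end)


def fake_fetch(before: int, limit: int = 4) -> List[Candle]:
    """Hypothetical exchange: candles at t = 0..9, at most `limit` per request."""
    history = [(t, 100.0 + t) for t in range(10)]
    return [c for c in history if c[0] <= before][-limit:]


candles = fetch_all(fake_fetch, start=2, end=8)
print([t for t, _ in candles])
```

Keying the merged candles by timestamp makes the overlap between consecutive pages harmless, and the "nothing new" check guarantees termination even if an exchange keeps returning the same oldest page.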
2 changes: 2 additions & 0 deletions pandas_datareader/crypto_utils/__init__.py
@@ -0,0 +1,2 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-