##### algom/playbooks

# etl

ETL pipeline for asset prices (OHLCV), standard indicators and engineered features. Loads output data to [BigQuery](https://console.cloud.google.com/bigquery?project=algomosaic-nyc&p=algomosaic-nyc&page=project).


#### Steps

1. Initialize ETL process
2. Specify data and feature libraries (optional)
3. Run ETL process without loading to BigQuery
4. Run ETL process and load to BigQuery

<br>

<br>

### CryptoCompare: run an ETL process

Extract OHLCV data from the [CryptoCompare API](https://min-api.cryptocompare.com/documentation?key=Historical&cat=dataHistoday).


In [1]:
from src.ta.ticker_extract import run_etl_process


In [2]:
# Run an ETL process for hourly BTC-USD data covering 2017
data = run_etl_process(
    ticker='BTC-USD',
    start_date='2017-01-01',
    end_date='2018-01-01',
    project=None,
    # {ticker} and {interval} are filled in from table_params
    destination_table='cryptocompare.features_{ticker}_{interval}_2017',
    table_params={
        'ticker': 'BTC-USD',
        'interval': 'hour'
    },
    interval='hour',
    exchange='CCCAGG',
    data_library='src.ta.cryptocompare_data',
    features_library='src.ta.get_features_1h',
    to_bq=True,  # load the output to BigQuery (step 4)
    if_exists='replace'
)


RUNNING: algom-trading:cryptocompare.features_{ticker}_{interval}_2017 is being extracted and transformed.
RUNNING: Extracting data using src.ta.cryptocompare_data.
Extracting 1 of 5: BTC-USD up to 2018-01-01 00:00:00
Extracting 2 of 5: BTC-USD up to 2017-10-09 16:00:00
Extracting 3 of 5: BTC-USD up to 2017-07-18 08:00:00
Extracting 4 of 5: BTC-USD up to 2017-04-26 00:00:00
Extracting 5 of 5: BTC-USD up to 2017-02-01 16:00:00
RUNNING: Applying feature engineering using src.ta.get_features_1h.
RUNNING: loading technical_analysis into BigQuery.


1it [01:07, 67.88s/it]

SUCCESS: algom-trading:cryptocompare.features_BTC_USD_hour_2017 has been loaded to BigQuery. Runtime: 0:01:28.373937.
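
The "Extracting N of 5" timestamps in the log are exactly 2,000 hours apart, which suggests the extractor pages backwards from `end_date` in requests of 2,000 hourly rows. The sketch below reproduces those timestamps under that assumption. The helper `batch_end_times` and the constant `ROWS_PER_REQUEST` are hypothetical names for illustration; they are not taken from `src.ta.cryptocompare_data`.

```python
from datetime import datetime, timedelta
from math import ceil

# Assumption: each request returns at most 2,000 hourly rows, which matches
# the 2,000-hour spacing between the timestamps in the log.
ROWS_PER_REQUEST = 2000


def batch_end_times(start_date, end_date, rows_per_request=ROWS_PER_REQUEST):
    """Return the 'up to' timestamp of each request, newest first."""
    start = datetime.strptime(start_date, '%Y-%m-%d')
    end = datetime.strptime(end_date, '%Y-%m-%d')
    total_hours = int((end - start).total_seconds() // 3600)
    # Enough requests to cover the whole range, rounding up.
    n_batches = ceil(total_hours / rows_per_request)
    step = timedelta(hours=rows_per_request)
    return [end - i * step for i in range(n_batches)]


batches = batch_end_times('2017-01-01', '2018-01-01')
for i, ts in enumerate(batches, start=1):
    print(f'Extracting {i} of {len(batches)}: BTC-USD up to {ts}')
```

One year of hourly data is 8,760 rows (8,784 in a leap year), so five requests are needed. The oldest request reaches back past `start_date`, so the surplus rows would have to be trimmed somewhere downstream.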





In [3]:
data = run_etl_process(
    ticker='BTC-USD',
    start_date='2018-01-01',
    end_date='2019-01-01',
    project=None,
    destination_table='cryptocompare.features_{ticker}_{interval}_2018',
    table_params={
        'ticker': 'BTC-USD',
        'interval': 'hour'
    },
    interval='hour',
    exchange='CCCAGG',
    data_library='src.ta.cryptocompare_data',
    features_library='src.ta.get_features_1h',
    to_bq=True,
    if_exists='replace'
)

RUNNING: algom-trading:cryptocompare.features_{ticker}_{interval}_2018 is being extracted and transformed.
RUNNING: Extracting data using src.ta.cryptocompare_data.
Extracting 1 of 5: BTC-USD up to 2019-01-01 00:00:00
Extracting 2 of 5: BTC-USD up to 2018-10-09 16:00:00
Extracting 3 of 5: BTC-USD up to 2018-07-18 08:00:00
Extracting 4 of 5: BTC-USD up to 2018-04-26 00:00:00
Extracting 5 of 5: BTC-USD up to 2018-02-01 16:00:00
RUNNING: Applying feature engineering using src.ta.get_features_1h.
RUNNING: loading technical_analysis into BigQuery.


1it [01:04, 64.26s/it]

SUCCESS: algom-trading:cryptocompare.features_BTC_USD_hour_2018 has been loaded to BigQuery. Runtime: 0:01:33.590202.
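
The SUCCESS line shows the `destination_table` template resolved to `cryptocompare.features_BTC_USD_hour_2018`: the placeholders are filled from `table_params` and the hyphen in `BTC-USD` becomes an underscore. A minimal sketch of that substitution, assuming it is a plain `str.format` plus a hyphen replacement (the function name `resolve_table_name` is hypothetical, and the actual implementation in the repository may differ):

```python
def resolve_table_name(template, table_params):
    """Fill the template placeholders from table_params."""
    # Assumption: hyphens in values become underscores,
    # as with BTC-USD -> BTC_USD in the log output.
    safe_params = {
        key: str(value).replace('-', '_')
        for key, value in table_params.items()
    }
    return template.format(**safe_params)


name = resolve_table_name(
    'cryptocompare.features_{ticker}_{interval}_2018',
    {'ticker': 'BTC-USD', 'interval': 'hour'},
)
print(name)
```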





In [4]:
data = run_etl_process(
    ticker='BTC-USD',
    start_date='2019-01-01',
    end_date='2020-01-01',
    project=None,
    destination_table='cryptocompare.features_{ticker}_{interval}_2019',
    table_params={
        'ticker': 'BTC-USD',
        'interval': 'hour'
    },
    interval='hour',
    exchange='CCCAGG',
    data_library='src.ta.cryptocompare_data',
    features_library='src.ta.get_features_1h',
    to_bq=True,
    if_exists='replace'
)

RUNNING: algom-trading:cryptocompare.features_{ticker}_{interval}_2019 is being extracted and transformed.
RUNNING: Extracting data using src.ta.cryptocompare_data.
Extracting 1 of 5: BTC-USD up to 2020-01-01 00:00:00
Extracting 2 of 5: BTC-USD up to 2019-10-09 16:00:00
Extracting 3 of 5: BTC-USD up to 2019-07-18 08:00:00
Extracting 4 of 5: BTC-USD up to 2019-04-26 00:00:00
Extracting 5 of 5: BTC-USD up to 2019-02-01 16:00:00
RUNNING: Applying feature engineering using src.ta.get_features_1h.
RUNNING: loading technical_analysis into BigQuery.


1it [01:00, 60.43s/it]

SUCCESS: algom-trading:cryptocompare.features_BTC_USD_hour_2019 has been loaded to BigQuery. Runtime: 0:01:22.749948.





In [5]:
data = run_etl_process(
    ticker='BTC-USD',
    start_date='2020-01-01',
    end_date='2021-01-01',
    project=None,
    destination_table='cryptocompare.features_{ticker}_{interval}_2020',
    table_params={
        'ticker': 'BTC-USD',
        'interval': 'hour'
    },
    interval='hour',
    exchange='CCCAGG',
    data_library='src.ta.cryptocompare_data',
    features_library='src.ta.get_features_1h',
    to_bq=True,
    if_exists='replace'
)

RUNNING: algom-trading:cryptocompare.features_{ticker}_{interval}_2020 is being extracted and transformed.
RUNNING: Extracting data using src.ta.cryptocompare_data.
Extracting 1 of 5: BTC-USD up to 2021-01-01 00:00:00
Extracting 2 of 5: BTC-USD up to 2020-10-09 16:00:00
Extracting 3 of 5: BTC-USD up to 2020-07-18 08:00:00
Extracting 4 of 5: BTC-USD up to 2020-04-26 00:00:00
Extracting 5 of 5: BTC-USD up to 2020-02-02 16:00:00
RUNNING: Applying feature engineering using src.ta.get_features_1h.
RUNNING: loading technical_analysis into BigQuery.


1it [00:47, 47.97s/it]

SUCCESS: algom-trading:cryptocompare.features_BTC_USD_hour_2020 has been loaded to BigQuery. Runtime: 0:01:06.990266.
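
The four cells above differ only in the year. As a sketch, the arguments could be generated in a loop instead of being copied per cell. The helper `build_yearly_runs` is a hypothetical name introduced here; every argument value is taken from the calls above.

```python
def build_yearly_runs(ticker, years, to_bq=True):
    """Build one set of run_etl_process keyword arguments per calendar year."""
    runs = []
    for year in years:
        runs.append({
            'ticker': ticker,
            'start_date': f'{year}-01-01',
            'end_date': f'{year + 1}-01-01',
            'project': None,
            # The placeholders stay unresolved here, as in the cells above.
            'destination_table': 'cryptocompare.features_{ticker}_{interval}_' + str(year),
            'table_params': {'ticker': ticker, 'interval': 'hour'},
            'interval': 'hour',
            'exchange': 'CCCAGG',
            'data_library': 'src.ta.cryptocompare_data',
            'features_library': 'src.ta.get_features_1h',
            'to_bq': to_bq,
            'if_exists': 'replace',
        })
    return runs


runs = build_yearly_runs('BTC-USD', range(2017, 2021))
for kwargs in runs:
    print(kwargs['start_date'], kwargs['end_date'], kwargs['destination_table'])
    # With the repository importable, each entry could be passed on:
    # data = run_etl_process(**kwargs)
```

Going by the parameter name, passing `to_bq=False` would presumably run the extract and transform steps without the BigQuery load (step 3); that behavior is an assumption and is not shown in the output above.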



