##### algom/playbooks

# etl

ETL pipeline for asset prices (OHLCV), standard indicators and engineered features. Loads output data to [BigQuery](https://console.cloud.google.com/bigquery?project=algomosaic-nyc&p=algomosaic-nyc&page=project).


#### Steps

1. Initialize ETL process
2. Specify data and feature libraries (optional)
3. Run ETL process without loading to BigQuery
4. Run ETL process and load to BigQuery

<br>

In [1]:
from src.extract import ticker_extract

<br><br>

### BTC-USDT -- hour -- i03 -- 2017-2020

In [2]:
years = [
    2017,
    2018,
    2019,
    2020,
]

In [3]:
for year in years:
    print("RUNNING: {}.".format(year))
    model = ticker_extract.run_extract_process(
        ticker='BTC-USDT',
        # One run per calendar year: Jan 1 of `year` through Jan 1 of the next year.
        start_date='{}-01-01'.format(year),
        end_date='{}-01-01'.format(year+1),
        project='algom-trading',
        # The {placeholders} are filled from `table_params` for the loaded table name.
        destination_table='train_features.features_{ticker}_{interval}_{iteration}_{year}0101',
        table_params={
            'ticker': 'BTC-USDT',
            'interval': 'hour',
            'iteration': 'i03',
            'year': str(year)
        },
        interval='hour',
        exchange='binance',
        data_library='src.extract.cryptocompare_ticker_data',
        features_library='src.features.algom_trading_v001.get_features_hour_i03',
        # Step 4: load the output to BigQuery.
        to_bq=True,
    )

# `model` holds the result of the last iteration only (2020).
model.data.df.tail()

RUNNING: 2017.
RUNNING: algom-trading:train_features.features_{ticker}_{interval}_{iteration}_{year}0101 is being extracted and transformed.
RUNNING: Extracting data using src.extract.cryptocompare_ticker_data.
Extracting 1 of 5: BTC-USDT up to 2018-01-01 00:00:00
Extracting 2 of 5: BTC-USDT up to 2017-10-09 16:00:00
Extracting 3 of 5: BTC-USDT up to 2017-07-18 08:00:00
Extracting 4 of 5: BTC-USDT up to 2017-04-26 00:00:00
Extracting 5 of 5: BTC-USDT up to 2017-02-01 16:00:00
RUNNING: Applying feature engineering using src.features.algom_trading_v001.get_features_hour_i03.




RUNNING: Cleaning final dataset.
SUCCESS: Loaded DataFrame.
RUNNING: loading features into BigQuery.


1it [00:15, 15.73s/it]


SUCCESS: algom-trading:train_features.features_BTC_USDT_hour_i03_20170101 has been loaded to BigQuery. Runtime: 0:00:21.724870.
RUNNING: 2018.
RUNNING: algom-trading:train_features.features_{ticker}_{interval}_{iteration}_{year}0101 is being extracted and transformed.
RUNNING: Extracting data using src.extract.cryptocompare_ticker_data.
Extracting 1 of 5: BTC-USDT up to 2019-01-01 00:00:00
Extracting 2 of 5: BTC-USDT up to 2018-10-09 16:00:00
Extracting 3 of 5: BTC-USDT up to 2018-07-18 08:00:00
Extracting 4 of 5: BTC-USDT up to 2018-04-26 00:00:00
Extracting 5 of 5: BTC-USDT up to 2018-02-01 16:00:00
RUNNING: Applying feature engineering using src.features.algom_trading_v001.get_features_hour_i03.
RUNNING: Cleaning final dataset.
SUCCESS: Loaded DataFrame.
RUNNING: loading features into BigQuery.


1it [00:28, 28.63s/it]


SUCCESS: algom-trading:train_features.features_BTC_USDT_hour_i03_20180101 has been loaded to BigQuery. Runtime: 0:00:33.660935.
RUNNING: 2019.
RUNNING: algom-trading:train_features.features_{ticker}_{interval}_{iteration}_{year}0101 is being extracted and transformed.
RUNNING: Extracting data using src.extract.cryptocompare_ticker_data.
Extracting 1 of 5: BTC-USDT up to 2020-01-01 00:00:00
Extracting 2 of 5: BTC-USDT up to 2019-10-09 16:00:00
Extracting 3 of 5: BTC-USDT up to 2019-07-18 08:00:00
Extracting 4 of 5: BTC-USDT up to 2019-04-26 00:00:00
Extracting 5 of 5: BTC-USDT up to 2019-02-01 16:00:00
RUNNING: Applying feature engineering using src.features.algom_trading_v001.get_features_hour_i03.
RUNNING: Cleaning final dataset.
SUCCESS: Loaded DataFrame.
RUNNING: loading features into BigQuery.


1it [00:35, 35.09s/it]


SUCCESS: algom-trading:train_features.features_BTC_USDT_hour_i03_20190101 has been loaded to BigQuery. Runtime: 0:00:41.486127.
RUNNING: 2020.
RUNNING: algom-trading:train_features.features_{ticker}_{interval}_{iteration}_{year}0101 is being extracted and transformed.
RUNNING: Extracting data using src.extract.cryptocompare_ticker_data.
Extracting 1 of 5: BTC-USDT up to 2021-01-01 00:00:00
Extracting 2 of 5: BTC-USDT up to 2020-10-09 16:00:00
Extracting 3 of 5: BTC-USDT up to 2020-07-18 08:00:00
Extracting 4 of 5: BTC-USDT up to 2020-04-26 00:00:00
Extracting 5 of 5: BTC-USDT up to 2020-02-02 16:00:00
RUNNING: Applying feature engineering using src.features.algom_trading_v001.get_features_hour_i03.
RUNNING: Cleaning final dataset.
SUCCESS: Loaded DataFrame.
RUNNING: loading features into BigQuery.


1it [00:34, 34.56s/it]

SUCCESS: algom-trading:train_features.features_BTC_USDT_hour_i03_20200101 has been loaded to BigQuery. Runtime: 0:00:39.200627.
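The `SUCCESS` lines above show the `destination_table` template resolved to names such as `train_features.features_BTC_USDT_hour_i03_20170101`, while the earlier `RUNNING` line still prints the unformatted template. The sketch below reproduces that resolution under two assumptions read off the log: the placeholders are filled from `table_params`, and the hyphen in `BTC-USDT` is replaced with an underscore. `resolve_table_name` is a hypothetical helper for illustration, not a function from `src.extract`.

```python
def resolve_table_name(template: str, table_params: dict) -> str:
    """Fill the {placeholders} in a table-name template.

    Assumption (inferred from the log output): hyphens are replaced
    with underscores after the placeholders are filled.
    """
    name = template.format(**table_params)
    return name.replace("-", "_")


template = "train_features.features_{ticker}_{interval}_{iteration}_{year}0101"

for year in [2017, 2018, 2019, 2020]:
    params = {
        "ticker": "BTC-USDT",
        "interval": "hour",
        "iteration": "i03",
        "year": str(year),
    }
    print(resolve_table_name(template, params))
```

Running this before a load can be a cheap way to confirm which tables a loop over `years` would write to.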





| | ticker_time_sec | close | high | low | open | volume_base | volume | conversionType | conversionSymbol | partition_date | ... | open_low21 | open_close22 | open_high22 | open_low22 | open_close23 | open_high23 | open_low23 | open_close24 | open_high24 | open_low24 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 1609444800 | 29126.7 | 29139.65 | 28862.0 | 28897.83 | 1936.48 | 56103301.54 | force_direct | | 2021-01-03 | ... | 0.000355 | 0.000355 | 0.000355 | 0.000355 | 0.000355 | 0.000356 | 0.000355 | 0.000355 | 0.000356 | 0.000355 |
| 3 | 1609448400 | 28966.36 | 29169.55 | 28900.79 | 29126.7 | 2524.47 | 73351462.94 | force_direct | | 2021-01-03 | ... | 0.000353 | 0.000353 | 0.000353 | 0.000352 | 0.000352 | 0.000353 | 0.000352 | 0.000353 | 0.000353 | 0.000352 |
| 2 | 1609452000 | 29100.84 | 29143.73 | 28910.19 | 28966.36 | 1438.51 | 41807122.89 | force_direct | | 2021-01-03 | ... | 0.000354 | 0.000355 | 0.000355 | 0.000355 | 0.000355 | 0.000355 | 0.000354 | 0.000354 | 0.000355 | 0.000354 |
| 1 | 1609455600 | 28923.63 | 29110.35 | 28780.0 | 29100.84 | 1976.42 | 57243040.07 | force_direct | | 2021-01-03 | ... | 0.000353 | 0.000353 | 0.000353 | 0.000352 | 0.000353 | 0.000353 | 0.000353 | 0.000353 | 0.000353 | 0.000353 |
| 0 | 1609459200 | 28995.13 | 29031.34 | 28690.17 | 28923.63 | 2311.81 | 66768830.34 | force_direct | | 2021-01-03 | ... | 0.000355 | 0.000355 | 0.000355 | 0.000355 | 0.000355 | 0.000355 | 0.000354 | 0.000355 | 0.000356 | 0.000355 |
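The `ticker_time_sec` values in the output above are easier to check against `end_date` once decoded. The sketch below assumes the column holds Unix epoch seconds in UTC (the column name and the hourly spacing suggest this, but the source does not state it). `to_utc` is a hypothetical helper.

```python
from datetime import datetime, timezone


def to_utc(ts: int) -> datetime:
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


# The five `ticker_time_sec` values shown in the tail of the 2020 run.
ticker_time_sec = [1609444800, 1609448400, 1609452000, 1609455600, 1609459200]

for ts in ticker_time_sec:
    print(ts, to_utc(ts).isoformat())

# Consecutive rows are one hour (3600 seconds) apart.
steps = {b - a for a, b in zip(ticker_time_sec, ticker_time_sec[1:])}
print("step sizes in seconds:", steps)
```

Under that assumption the last row decodes to 2021-01-01 00:00 UTC, which matches the `end_date` passed for the 2020 run.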


In [4]:
# list(model.data.df)
# model.data.df.head()
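
The "Extracting N of 5" lines in the run log step backwards from `end_date` in equal jumps of 2000 hours, so a year of hourly data takes five requests. The sketch below reproduces those boundaries. The 2000-hour page size is inferred from the spacing of the logged timestamps, and `page_end_times` is a hypothetical helper; neither is taken from `src.extract.cryptocompare_ticker_data`.

```python
import math
from datetime import datetime, timedelta

# Assumption: page size inferred from the spacing of the logged timestamps.
PAGE_HOURS = 2000


def page_end_times(start: datetime, end: datetime, page_hours: int = PAGE_HOURS):
    """Return the end timestamp of each page, newest first."""
    total_hours = (end - start) / timedelta(hours=1)
    n_pages = math.ceil(total_hours / page_hours)
    return [end - timedelta(hours=page_hours * i) for i in range(n_pages)]


pages = page_end_times(datetime(2017, 1, 1), datetime(2018, 1, 1))
for i, page_end in enumerate(pages, start=1):
    print("Extracting {} of {}: BTC-USDT up to {}".format(i, len(pages), page_end))
```

For the 2017 window this yields five boundaries that line up with the logged ones, and the same arithmetic accounts for the 2020 run ending its last page on 2020-02-02 rather than 2020-02-01, since 2020 is a leap year.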