##### algom/playbooks

# etl

ETL pipeline for asset prices (OHLCV), standard indicators and engineered features. Loads output data to [BigQuery](https://console.cloud.google.com/bigquery?project=algomosaic-nyc&p=algomosaic-nyc&page=project).


#### Steps

1. Initialize ETL process
2. Specify data and feature libraries (optional)
3. Run ETL process without loading to BigQuery
4. Run ETL process and load to BigQuery

<br>

In [1]:
from src.ta.ticker_extract import run_etl_process


<br>

### Yahoo: Run an ETL process

This module uses several default parameters, which are declared in `src/setup.py`. These defaults can be overridden by passing arguments to the function below.


In [2]:
# Run an ETL process 
model = run_etl_process(
    ticker='BTC-USD',
    start_date='2010-01-01',
    end_date=None,
    project_id=None,
    destination_table='yahoo_features.yahoo_features_{ticker}_YYYYMMDD',
    table_params={'ticker': 'BTC-USD'},
    data_library=None,
    features_library=None,
    date_range=None,
    look_back=None,
    look_forward=None,
    to_bq=True
)


RUNNING: algom-trading:yahoo_features.yahoo_features_{ticker}_YYYYMMDD is being extracted and transformed.
RUNNING: Extracting data using src.ta.get_ticker_data.
RUNNING: Applying feature engineering using src.ta.get_features.
RUNNING: loading technical_analysis into BigQuery.


1it [00:20, 20.01s/it]

SUCCESS: algom-trading:yahoo_features.yahoo_features_BTC_USD_20201031 has been loaded to BigQuery. Runtime: 0:00:27.902160.
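The log above shows the `destination_table` template `yahoo_features.yahoo_features_{ticker}_YYYYMMDD` being loaded as `yahoo_features.yahoo_features_BTC_USD_20201031`. A minimal sketch of how such a template might be resolved is shown here. It assumes that `table_params` fills the `{...}` placeholders, that hyphens are replaced with underscores, and that `YYYYMMDD` is replaced with the run date. The helper `resolve_table_name` is hypothetical and is not taken from `src`.

```python
from datetime import date


def resolve_table_name(template, table_params, run_date):
    """Hypothetical helper: fill a destination_table template.

    Assumptions (inferred from the log output, not from the source code):
    - values in table_params replace the {placeholders} in the template
    - hyphens in those values become underscores
    - the literal token YYYYMMDD becomes the run date
    """
    safe_params = {
        key: str(value).replace('-', '_')
        for key, value in table_params.items()
    }
    name = template.format(**safe_params)
    return name.replace('YYYYMMDD', run_date.strftime('%Y%m%d'))


table_name = resolve_table_name(
    'yahoo_features.yahoo_features_{ticker}_YYYYMMDD',
    {'ticker': 'BTC-USD'},
    date(2020, 10, 31),
)
print(table_name)  # yahoo_features.yahoo_features_BTC_USD_20201031
```

The same sketch applied to `cryptocompare.features_{ticker}_{interval}_2016` with `ticker='BTC-USD'` and `interval='hour'` gives `cryptocompare.features_BTC_USD_hour_2016`.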





<br>

### CryptoCompare: Run an ETL process

Extract OHLCV data from the [CryptoCompare API](https://min-api.cryptocompare.com/documentation?key=Historical&cat=dataHistoday).


In [1]:
from src.ta.ticker_extract import run_etl_process


In [3]:
# Run an ETL process 
model = run_etl_process(
    ticker='BTC-USD',
    start_date='2016-01-01',
    end_date='2017-01-01',
    project=None,
    destination_table='cryptocompare.features_{ticker}_{interval}_2016',
    table_params={
        'ticker': 'BTC-USD',
        'interval': 'hour'
    },
    interval='hour',
    exchange='CCCAGG',
    data_library='src.ta.cryptocompare_data',
    features_library='src.ta.get_features',
    to_bq=True,
    if_exists='append'
)


RUNNING: algom-trading:cryptocompare.features_{ticker}_{interval}_2016 is being extracted and transformed.
RUNNING: Extracting data using src.ta.cryptocompare_data.
Extracting 1 of 5: BTC-USD up to 2017-01-01 00:00:00
Extracting 2 of 5: BTC-USD up to 2016-10-09 16:00:00
Extracting 3 of 5: BTC-USD up to 2016-07-18 08:00:00
Extracting 4 of 5: BTC-USD up to 2016-04-26 00:00:00
Extracting 5 of 5: BTC-USD up to 2016-02-02 16:00:00
RUNNING: Applying feature engineering using src.ta.get_features.
RUNNING: loading technical_analysis into BigQuery.


1it [01:30, 90.85s/it]

SUCCESS: algom-trading:cryptocompare.features_BTC_USD_hour_2016 has been loaded to BigQuery. Runtime: 0:01:51.712278.
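The five "Extracting n of 5" timestamps in the log above are exactly 2,000 hours apart, which suggests the extractor pages backwards from `end_date` in windows of 2,000 hourly rows. The sketch below reproduces those timestamps under that assumption. The function `chunk_end_times` and its `rows_per_request` parameter are hypothetical and are not part of `src.ta.cryptocompare_data`.

```python
from datetime import datetime, timedelta
from math import ceil


def chunk_end_times(start, end, rows_per_request=2000, step=timedelta(hours=1)):
    """Hypothetical helper: end timestamp of each request window, newest first.

    Assumes each request returns up to `rows_per_request` rows spaced `step`
    apart, ending at the window's end timestamp.
    """
    window = rows_per_request * step
    n_chunks = ceil((end - start) / window)
    return [end - i * window for i in range(n_chunks)]


ends = chunk_end_times(datetime(2016, 1, 1), datetime(2017, 1, 1))
for i, ts in enumerate(ends, start=1):
    print(f'Extracting {i} of {len(ends)}: BTC-USD up to {ts}')
```

For the 2016 date range, this covers 8,784 hours in five windows, so the last window reaches back past `start_date`. Whether the real module trims those extra rows is not shown in the log.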





In [None]:
model.df.tail()

<br><br>

## References

#### Documentation: Finance
+ [Best Packages for Financial Analysis](https://financetrain.com/best-python-librariespackages-finance-financial-data-scientists/)
+ [UltraFinance: TA-Lib calculations](https://github.com/panpanpandas/ultrafinance/blob/master/ultrafinance/pyTaLib/pandasImpl.py)
+ [Quantopian: talib](https://www.quantopian.com/posts/technical-analysis-indicators-without-talib-code)
+ [ta: GitHub](https://github.com/bukosabino/ta): Note that installing this didn't work.

#### Documentation: Python 
+ [Python Tips](http://book.pythontips.com/en/latest/args_and_kwargs.html)
+ [pandas: Rolling Functions](https://pandas.pydata.org/pandas-docs/version/0.17.0/api.html#standard-moving-window-functions)
+ [Learn Data Sci: EMA](https://www.learndatasci.com/tutorials/python-finance-part-3-moving-average-trading-strategy/)
