This repository has been archived by the owner on Nov 26, 2022. It is now read-only.

Ensure that external price data bundles are supported #65

Closed
fredfortier opened this issue Nov 14, 2017 · 23 comments

Comments

@fredfortier
Contributor

fredfortier commented Nov 14, 2017

While we are adding more built-in exchange price data, users may want to backtest with their own. For example, someone may have purchased rare historical data from Coinigy. While this should work in theory with the regular ingest command, we have been focused on exchange bundles, so this needs to be re-validated and fully supported.

@vonpupp
Contributor

vonpupp commented Nov 14, 2017

@fredfortier,

Check PR1860, which adds CSV file support; it was recently merged.

@fredfortier
Contributor Author

@vonpupp thanks for pointing this out. It's useful.

@vonpupp
Contributor

vonpupp commented Nov 14, 2017

Cool @fredfortier, I am working on two fronts, trying to get both catalyst and zipline to work. I am not yet able to use it on zipline, most probably due to calendar issues. Since you are far more experienced with zipline than I am, if you are able to use it on zipline and could help me with a minimal crypto example, I would greatly appreciate it.

I am definitely also interested in catalyst and in understanding more about it, but since they are related, all the knowledge I am acquiring with zipline will be helpful with catalyst later on.

Thanks a lot.

@fredfortier
Contributor Author

For now, instead of enabling a generic bundle, I'm adding a "--csv" option to the ingest-exchange command. It might look like this:

catalyst ingest-exchange -x binance --csv binance-eng_eth.csv

This way, the behavior is completely consistent with existing bundles. When we add this data set to our set of bundles, you won't have to make any changes to your algo. All you have to do is provide the right attributes in your CSV, which I will detail shortly.

@avn3r
Contributor

avn3r commented Nov 23, 2017

Can you give a small template example of what the CSV should look like?

Thanks.

@fredfortier
Contributor Author

fredfortier commented Nov 23, 2017 via email

@vonpupp
Contributor

vonpupp commented Nov 24, 2017

Awesome, thank you very much @fredfortier!

@fredfortier
Contributor Author

fredfortier commented Nov 25, 2017

@vonpupp and @abnerA I changed my mind on the columns. After experimenting with it, I found having to specify a header less annoying than having to count or re-order header-less columns.

Here are the proposed required headers:

  • symbol: symbol of the currency pair using the Catalyst convention (e.g. eth_btc)
  • last_traded: the close date of the candle, use any date format which Pandas can parse (e.g. 2017-11-25 17:00)
  • open: float
  • high: float
  • low: float
  • close: float
  • volume: float

Example attached for convenience.

bittrex_bat_eth.csv.zip
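
For readers without the attachment, here is a minimal sketch of producing a file with the headers listed above, using only the Python standard library. The candle values are made up for illustration, and the helper name `candles_to_csv` is hypothetical; it is not part of Catalyst.

```python
import csv
import io

# Required headers, in the order listed in the comment above.
REQUIRED_HEADERS = ["symbol", "last_traded", "open", "high", "low", "close", "volume"]

# Made-up candles for illustration only; this is not real market data.
SAMPLE_CANDLES = [
    ("bat_eth", "2017-11-25 17:00", 0.00050, 0.00052, 0.00049, 0.00051, 1200.0),
    ("bat_eth", "2017-11-25 17:01", 0.00051, 0.00053, 0.00050, 0.00052, 950.5),
]


def candles_to_csv(candles):
    """Return CSV text: one header row followed by one row per candle."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(REQUIRED_HEADERS)
    writer.writerows(candles)
    return buffer.getvalue()


csv_text = candles_to_csv(SAMPLE_CANDLES)
print(csv_text)
```

Writing `csv_text` to a file should give something shaped like the attached sample, assuming the attachment follows the same header list.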

There is room for additional optional columns, to specify trading amounts and fees for example. But we can add these features separately.

@fredfortier
Contributor Author

fredfortier commented Nov 27, 2017

This has been implemented. Here is how you would ingest the previously attached sample CSV:

catalyst ingest-exchange -x bittrex --csv ~/Data/bittrex_bat_eth.csv -f minute

We will add this info to the documentation shortly.

@fredfortier
Contributor Author

In release 0.3.9.

@vonpupp
Contributor

vonpupp commented Nov 28, 2017

Thanks @fredfortier.

I will be testing this today. I need to convert the external data to adapt it to the format you use.

I have been given external btc_usd data that I need to use. Since it is available on your server, I wonder whether ingesting it under a "different" exchange would work. Something like:

catalyst ingest-exchange -x gdax --csv mydata.csv -f minute

Furthermore, the data is already resampled to half-hour bars, which is also the timeframe I am going to use (along with 1h and 2h). Will ingestion work in this scenario? My gut feeling is that I should ask for raw 1m data, import that, and then resample; but if you have an alternative idea, please let me know.
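
The "import 1m data, then resample" idea can be sketched in plain Python as follows. This is only an illustration of the aggregation rule (first open, highest high, lowest low, last close, summed volume). The function name `resample_candles` is hypothetical, the sample values are made up, and the code assumes each timestamp marks the start of its candle, which may differ from how your data or Catalyst labels candles.

```python
from datetime import datetime, timedelta


def resample_candles(candles, minutes=30):
    """Aggregate (timestamp, open, high, low, close, volume) tuples into wider bars.

    Assumes the input is sorted by time, that each timestamp marks the START
    of its candle, and that `minutes` divides evenly into a day.
    """
    bars = {}
    for ts, open_, high, low, close, volume in candles:
        # Floor the timestamp to the start of its bucket within the day.
        day_start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        elapsed = int((ts - day_start).total_seconds()) // 60
        bucket = day_start + timedelta(minutes=(elapsed // minutes) * minutes)
        if bucket not in bars:
            bars[bucket] = [open_, high, low, close, volume]
        else:
            bar = bars[bucket]
            bar[1] = max(bar[1], high)  # highest high
            bar[2] = min(bar[2], low)   # lowest low
            bar[3] = close              # last close wins
            bar[4] += volume            # volume accumulates
    return [(start, *values) for start, values in sorted(bars.items())]


# Made-up minute candles for illustration only.
minute_candles = [
    (datetime(2017, 11, 25, 17, 0), 10.0, 11.0, 9.5, 10.5, 2.0),
    (datetime(2017, 11, 25, 17, 1), 10.5, 12.0, 10.0, 11.0, 3.0),
    (datetime(2017, 11, 25, 17, 30), 11.0, 11.5, 10.8, 11.2, 1.0),
]
half_hour_bars = resample_candles(minute_candles, minutes=30)
print(half_hour_bars)
```

Passing `minutes=60` or `minutes=120` would produce the 1h and 2h bars mentioned above. Whether Catalyst accepts pre-resampled bars at ingestion is not settled in this thread.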

@fredfortier
Contributor Author

fredfortier commented Nov 29, 2017 via email

@vonpupp
Contributor

vonpupp commented Dec 7, 2017

Hi @fredfortier,

Unfortunately I am not able to use this feature.

If I try to ingest on a fake name exchange as follows:

catalyst ingest-exchange -x csv --csv data/bittrex_bat_eth.csv -f minute

I get the following error:

Error: Invalid value for "-x" / "--exchange-name": invalid choice: csv. (choose from poloniex, bitfinex, bittrex)

If I try as you exemplified (using bittrex), I get the following message:

(.envc-unstable) > $ catalyst ingest-exchange -x bittrex --csv data/bittrex_bat_eth.csv -f minute
Ingesting exchange bundle bittrex...
[2017-12-07 17:59:12.023942] INFO: exchange_bundle: ingesting csv file: data/bittrex_bat_eth.csv

This process goes really quick so I am not sure if it really ingested the data. When I try to run a simulation with the bat_eth pair I get the following error:

[2017-12-07 18:00:34.951097] INFO: run_algo: running algo in backtest mode
[2017-12-07 18:00:35.558723] INFO: exchange_algorithm: initialized trading algorithm in backtest mode
Error traceback: /home/av/repos/customers/mdbasset/.envc-unstable/lib/python2.7/site-packages/catalyst/exchange/exchange.py (line 248)
SymbolNotFoundOnExchange:  Symbol BAT_ETH not found on exchange Bitfinex. Choose from: [u'avt_btc', u'avt_eth', u'avt_usd', u'bcc_btc', u'bcc_usd', u'bch_btc', u'bch_eth', u'bch_usd', u'bcu_btc', u'bcu_usd', u'bt1_btc', u'bt1_usd', u'bt2_btc', u'bt2_usd', u'btc_eur', u'btc_usd', u'btg_btc', u'btg_usd', u'dat_btc', u'dat_eth', u'dat_usd', u'dsh_btc', u'dsh_usd', u'edo_btc', u'edo_eth', u'edo_usd', u'eos_btc', u'eos_eth', u'eos_usd', u'etc_btc', u'etc_usd', u'eth_btc', u'eth_usd', u'etp_btc', u'etp_eth', u'etp_usd', u'iot_btc', u'iot_eth', u'iot_usd', u'ltc_btc', u'ltc_usd', u'neo_btc', u'neo_eth', u'neo_usd', u'omg_btc', u'omg_eth', u'omg_usd', u'qsh_btc', u'qsh_eth', u'qsh_usd', u'qtm_btc', u'qtm_eth', u'qtm_usd', u'rrt_btc', u'rrt_usd', u'san_btc', u'san_eth', u'san_usd', u'xmr_btc', u'xmr_usd', u'xrp_btc', u'xrp_usd', u'yyw_btc', u'yyw_eth', u'yyw_usd', u'zec_btc', u'zec_usd']

I am using the latest version of catalyst installed via pip (git+https...).

Any ideas, please?

Thank you very much.

@fredfortier
Contributor Author

fredfortier commented Dec 7, 2017 via email

@vonpupp
Contributor

vonpupp commented Dec 7, 2017

Thanks @fredfortier,

If by "don't specify -x" you mean this:

catalyst ingest-exchange bittrex --csv data/bittrex_bat_eth.csv -f minute

It doesn't work either:

Usage: catalyst ingest-exchange [OPTIONS]

Error: Got unexpected extra argument (bittrex)

I am using two dashes already on the csv option (--csv).

I believe you might not have noticed that I was trying to name the exchange "csv" (hence -x csv), since you said the name didn't matter, but it didn't work. Then I tried to ingest the data using bittrex and it didn't work either. These are the same two scenarios I asked about earlier.

@vonpupp
Contributor

vonpupp commented Dec 8, 2017

@fredfortier, it works.

It was my fault: I was ingesting on one exchange while the strategy used another, and I didn't notice. That said, it is not possible to ingest under a new exchange name like gdax or csv, just to let you know.

I apologize for the confusion.

@DanielKillenberger

DanielKillenberger commented Feb 10, 2018

Hi @fredfortier

I'm trying to ingest csv data as well to do backtesting.
The ingestion seems to work.
But I get the following error when trying to backtest:

Requested data for trading pair XVG/BTC is not available on exchange bittrex in minute frequency at this time. Check http://enigma.co/catalyst/status for market coverage.

This happens even though, as I said, I ingested the XVG data using a csv file.
Is there a way of figuring out if the data has been ingested correctly?
Any ideas on how to fix the problem?

Thanks in advance!
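
One check that can be done independently of Catalyst is validating the CSV itself before ingesting it. The sketch below does not confirm that ingestion succeeded; it only rules out a malformed input file. It assumes the required headers proposed earlier in this thread. The function name `find_csv_problems` and the sample rows are hypothetical.

```python
import csv
import io

# Headers proposed earlier in this thread.
REQUIRED_HEADERS = {"symbol", "last_traded", "open", "high", "low", "close", "volume"}
NUMERIC_COLUMNS = ("open", "high", "low", "close", "volume")


def find_csv_problems(csv_file):
    """Return a list of human-readable problems found in an open CSV file."""
    reader = csv.DictReader(csv_file)
    missing = REQUIRED_HEADERS - set(reader.fieldnames or [])
    if missing:
        return ["missing headers: " + ", ".join(sorted(missing))]
    problems = []
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        for column in NUMERIC_COLUMNS:
            try:
                float(row[column])
            except (TypeError, ValueError):
                problems.append(
                    "line %d: %s is not a float: %r" % (line_number, column, row[column])
                )
    return problems


# Made-up sample inputs; the second one lacks the volume column.
good = io.StringIO(
    "symbol,last_traded,open,high,low,close,volume\n"
    "xvg_btc,2018-02-10 12:00,1.0,2.0,0.5,1.5,100\n"
)
bad = io.StringIO(
    "symbol,last_traded,open,high,low,close\n"
    "xvg_btc,2018-02-10 12:00,1.0,2.0,0.5,1.5\n"
)
good_problems = find_csv_problems(good)
bad_problems = find_csv_problems(bad)
print(good_problems, bad_problems)
```

For a file on disk, pass an open file handle, for example `find_csv_problems(open(path, newline=""))`.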

@fredfortier
Contributor Author

fredfortier commented Feb 10, 2018 via email

@DanielKillenberger

I'm pretty sure that the issue is that the enddate of the minute data doesn't get written to the symbols.json file.
If I add it manually, backtesting seems to work.
I'm trying to figure out where and when you actually write to the symbols.json file, so that I can add the enddate to symbols.json within the ingest_csv method.

@DanielKillenberger

I think it's overwriting symbols_local.json when ingesting csv, without updating symbols.json.
So any data ingested before isn't in symbols_local.json after you ingest a new file.
It should probably parse the json file and write back the new data instead of overwriting symbols_local.json.
It does so in exchange_utils.py on line 170.

Also, it seems to parse symbols.json for backtesting, so it wouldn't work anyway unless there is an option to use symbols_local.json.
Is there even any point to the symbols_local.json file, given that the ingested data doesn't differentiate between local and non-local? Or does it?
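
The "parse and write back instead of overwriting" suggestion can be sketched generically as a read-merge-write on a JSON file. This is not Catalyst's actual code. The function name `merge_symbols`, the `end_minute` field, and the file layout (one top-level object keyed by symbol) are all assumptions made for illustration.

```python
import json
import os
import tempfile


def merge_symbols(path, new_entries):
    """Merge new_entries into the JSON object stored at path and write it back.

    Hypothetical layout: one top-level object keyed by symbol.
    """
    existing = {}
    if os.path.exists(path):
        with open(path) as handle:
            existing = json.load(handle)
    existing.update(new_entries)  # new keys are added, matching keys are replaced
    with open(path, "w") as handle:
        json.dump(existing, handle, indent=2, sort_keys=True)
    return existing


# Demo in a throwaway directory; the entries and field names are made up.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "symbols_local.json")
merge_symbols(demo_path, {"bat_eth": {"end_minute": "2017-11-25"}})
merged = merge_symbols(demo_path, {"xvg_btc": {"end_minute": "2018-02-10"}})
print(sorted(merged))
```

With this approach the second ingest keeps the entry written by the first, which is the behavior the comment above asks for.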

@fredfortier
Contributor Author

fredfortier commented Feb 10, 2018 via email

@DanielKillenberger

DanielKillenberger commented Feb 10, 2018

Cheers!
Let me know if you need more info from my side or, even better, if you find a solution ;)

@Dan733

Dan733 commented Mar 31, 2018

I have been able to successfully import custom pricing data from a separate exchange to catalyst using the $ catalyst ingest -b command, but am running into issues when backtesting.

Catalyst forces me to choose one of its three exchanges and then, when running the backtest, also trades symbols on the chosen exchange in addition to my custom bundle data.

Is there a way to force catalyst to only trade on the custom ingested bundle data?

Edit: Right now I'm considering creating a custom base currency for my data.
