PortAnaRecord - calendar not exists for freq day #237

Open
ChengYen-Tang opened this issue Jan 31, 2021 · 3 comments
Labels
bug Something isn't working

Comments

@ChengYen-Tang
Contributor

🐛 Bug Description

Similar to #231: using 30m data raises the error "calendar not exists for freq day".

To Reproduce

Steps to reproduce the behavior:

  1. Load the data
import time
import numpy as np
import pandas as pd

import qlib
from qlib.config import REG_US
from qlib.contrib.model.gbdt import LGBModel
from qlib.contrib.data.handler import Alpha158
from qlib.contrib.strategy.strategy import TopkDropoutStrategy
from qlib.contrib.evaluate import (
    backtest as normal_backtest,
    risk_analysis
)
from qlib.utils import exists_qlib_data, init_instance_by_config
from qlib.workflow import R
from qlib.workflow.record_temp import SignalRecord, PortAnaRecord
from qlib.utils import flatten_dict
from qlib.data import D

qlib.init(provider_uri='~/.qlib/qlib_data/my_data/')
instruments = D.instruments(market='all')

[18021:MainThread](2021-01-31 23:42:01,889) INFO - qlib.Initialization - [config.py:277] - default_conf: client.
[18021:MainThread](2021-01-31 23:42:01,896) WARNING - qlib.Initialization - [config.py:292] - redis connection failed(host=127.0.0.1 port=6379), cache will not be used!
[18021:MainThread](2021-01-31 23:42:01,900) INFO - qlib.Initialization - [__init__.py:46] - qlib successfully initialized based on client settings.
[18021:MainThread](2021-01-31 23:42:01,901) INFO - qlib.Initialization - [__init__.py:47] - data_path=/home/kenneth/.qlib/qlib_data/my_data
  2. Define the training parameters
data_handler_config = {
    'start_time': '2017-07-15',
    'end_time': '2021-01-15',
    'fit_start_time': '2017-07-15',
    'fit_end_time': '2020-06-30',
    'instruments': instruments,
    'freq': '30m'
}

task = {
    'model': {
        'class': 'LGBModel',
        'module_path': 'qlib.contrib.model.gbdt',
        'kwargs':{
            'loss': 'mse',
            'colsample_bytree': 0.8879,
            'learning_rate': 0.0421,
            'subsample': 0.8789,
            'lambda_l1': 205.6999,
            'lambda_l2': 580.9768,
            'max_depth': 8,
            'num_leaves': 210,
            'num_threads': 20
        }
    },
    'dataset':{
        'class': 'DatasetH',
        'module_path': 'qlib.data.dataset',
        'kwargs':{
            'handler':{
                'class': 'Alpha158',
                'module_path': 'qlib.contrib.data.handler',
                'kwargs': data_handler_config
            },
            'segments':{
                'train': ('2017-07-15', '2020-01-01'),
                'valid': ('2020-01-02', '2020-06-30'),
                'test': ('2020-07-07', '2021-01-15'),
            }
        }
    }
}
model = init_instance_by_config(task['model'])
dataset = init_instance_by_config(task['dataset'])

[18021:MainThread](2021-02-01 00:16:14,111) INFO - qlib.timer - [log.py:81] - Time cost: 46.901s | Loading data Done
[18021:MainThread](2021-02-01 00:16:15,333) INFO - qlib.timer - [log.py:81] - Time cost: 1.017s | DropnaLabel Done
[18021:MainThread](2021-02-01 00:21:03,812) INFO - qlib.timer - [log.py:81] - Time cost: 288.477s | CSZScoreNorm Done
[18021:MainThread](2021-02-01 00:21:03,815) INFO - qlib.timer - [log.py:81] - Time cost: 289.700s | fit & process data Done
[18021:MainThread](2021-02-01 00:21:03,816) INFO - qlib.timer - [log.py:81] - Time cost: 336.607s | Init data Done
  3. Train the model
t_start = time.time()

with R.start(experiment_name='train_model'):
    R.log_params(**flatten_dict(task))
    model.fit(dataset)
    R.save_objects(trained_model=model)
    rid = R.get_recorder().id

t_end = time.time()
print('train model - Time count: %.3fs'%(t_end - t_start))

[18021:MainThread](2021-02-01 00:24:55,000) INFO - qlib.workflow - [expm.py:245] - No tracking URI is provided. Use the default tracking URI.
[18021:MainThread](2021-02-01 00:24:55,014) INFO - qlib.workflow - [expm.py:168] - No valid experiment found. Create a new experiment with name train_model.
[18021:MainThread](2021-02-01 00:24:55,022) INFO - qlib.workflow - [exp.py:181] - Experiment 1 starts running ...
[18021:MainThread](2021-02-01 00:24:55,213) INFO - qlib.workflow - [recorder.py:233] - Recorder 03e37d24ab8b4c809b619bdfecad8c78 starts running under Experiment 1 ...
Training until validation scores don't improve for 50 rounds
[20]	train's l2: 0.891066	valid's l2: 0.94816
[40]	train's l2: 0.889417	valid's l2: 0.948044
[60]	train's l2: 0.888093	valid's l2: 0.948017
[80]	train's l2: 0.886899	valid's l2: 0.948024
[100]	train's l2: 0.885763	valid's l2: 0.948017
[120]	train's l2: 0.884643	valid's l2: 0.948036
Early stopping, best iteration is:
[87]	train's l2: 0.886497	valid's l2: 0.947999
train model - Time count: 34.624s
  4. Run the backtest
port_analysis_config = {
    'strategy':{
        'class': 'TopkDropoutStrategy',
        'module_path': 'qlib.contrib.strategy.strategy',
        'kwargs':{
            'topk': 50,
            'n_drop': 5
        }
    },
    'backtest':{
        'verbose': False,
        'limit_threshold': np.inf,
        'account': 100000000,
        'benchmark': 'btcusdt-futuresusdt',
        'deal_price': 'close',
        'open_cost': 0.1,
        'close_cost': 0.1,
        'min_cost': 1,
    }
}

t_start = time.time()

with R.start(experiment_name='backtest_analysis'):
    recorder = R.get_recorder(rid, experiment_name='train_model')
    model = recorder.load_object('trained_model')

    # Generate predictions
    recorder = R.get_recorder()
    ba_rid = recorder.id
    sr = SignalRecord(model, dataset, recorder)
    sr.generate()

    # Backtest and analysis
    par = PortAnaRecord(recorder, port_analysis_config)
    par.generate()

t_end = time.time()
print('backtest and analysis - Time count: %.3fs'%(t_end - t_start))

Error message:

[18021:MainThread](2021-02-01 01:10:40,088) INFO - qlib.workflow - [expm.py:245] - No tracking URI is provided. Use the default tracking URI.
[18021:MainThread](2021-02-01 01:10:40,097) INFO - qlib.workflow - [exp.py:181] - Experiment 2 starts running ...
[18021:MainThread](2021-02-01 01:10:40,127) INFO - qlib.workflow - [recorder.py:233] - Recorder 346169cbdcca4617abb3efd3e38d82c6 starts running under Experiment 2 ...
[18021:MainThread](2021-02-01 01:10:41,141) INFO - qlib.workflow - [record_temp.py:125] - Signal record 'pred.pkl' has been saved as the artifact of the Experiment 2
[18021:MainThread](2021-02-01 01:10:41,240) INFO - qlib.backtest caller - [__init__.py:148] - Create new exchange
'The following are prediction results of the LGBModel model.'
                                   score
datetime   instrument                   
2020-07-07 ADABTC-SPOT         -0.012265
           ADAUSDT-FUTURESUSDT -0.011834
           ADAUSDT-SPOT        -0.034770
           BCHBTC-SPOT          0.000807
           BCHUSDT-FUTURESUSDT -0.023978
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-19-c9015082854c> in <module>
     36     # Backtest and analysis
     37     par = PortAnaRecord(recorder, port_analysis_config)
---> 38     par.generate()
     39 
     40 t_end = time.time()

~/.local/lib/python3.8/site-packages/pyqlib-0.6.1.99-py3.8-linux-x86_64.egg/qlib/workflow/record_temp.py in generate(self, **kwargs)
    241         # custom strategy and get backtest
    242         pred_score = super().load()
--> 243         report_dict = normal_backtest(pred_score, strategy=self.strategy, **self.backtest_config)
    244         report_normal = report_dict.get("report_df")
    245         positions_normal = report_dict.get("positions")

~/.local/lib/python3.8/site-packages/pyqlib-0.6.1.99-py3.8-linux-x86_64.egg/qlib/contrib/backtest/__init__.py in backtest(pred, account, shift, benchmark, verbose, return_order, **kwargs)
    301     spec = inspect.getfullargspec(get_exchange)
    302     ex_args = {k: v for k, v in kwargs.items() if k in spec.args}
--> 303     trade_exchange = get_exchange(pred, **ex_args)
    304 
    305     # init executor:

~/.local/lib/python3.8/site-packages/pyqlib-0.6.1.99-py3.8-linux-x86_64.egg/qlib/contrib/backtest/__init__.py in get_exchange(pred, exchange, subscribe_fields, open_cost, close_cost, min_cost, trade_unit, limit_threshold, deal_price, extract_codes, shift)
    156 
    157         dates = sorted(pred.index.get_level_values("datetime").unique())
--> 158         dates = np.append(dates, get_date_range(dates[-1], left_shift=1, right_shift=shift))
    159 
    160         exchange = Exchange(

~/.local/lib/python3.8/site-packages/pyqlib-0.6.1.99-py3.8-linux-x86_64.egg/qlib/utils/__init__.py in get_date_range(trading_date, left_shift, right_shift, future)
    488     from ..data import D
    489 
--> 490     start = get_date_by_shift(trading_date, left_shift, future=future)
    491     end = get_date_by_shift(trading_date, right_shift, future=future)
    492 

~/.local/lib/python3.8/site-packages/pyqlib-0.6.1.99-py3.8-linux-x86_64.egg/qlib/utils/__init__.py in get_date_by_shift(trading_date, shift, future, clip_shift)
    508     from qlib.data import D
    509 
--> 510     cal = D.calendar(future=future)
    511     if pd.to_datetime(trading_date) not in list(cal):
    512         raise ValueError("{} is not trading day!".format(str(trading_date)))

~/.local/lib/python3.8/site-packages/pyqlib-0.6.1.99-py3.8-linux-x86_64.egg/qlib/data/data.py in calendar(self, start_time, end_time, freq, future)
    929 
    930     def calendar(self, start_time=None, end_time=None, freq="day", future=False):
--> 931         return Cal.calendar(start_time, end_time, freq, future=future)
    932 
    933     def instruments(self, market="all", filter_pipe=None, start_time=None, end_time=None):

~/.local/lib/python3.8/site-packages/pyqlib-0.6.1.99-py3.8-linux-x86_64.egg/qlib/data/data.py in calendar(self, start_time, end_time, freq, future)
    532 
    533     def calendar(self, start_time=None, end_time=None, freq="day", future=False):
--> 534         _calendar, _calendar_index = self._get_calendar(freq, future)
    535         if start_time == "None":
    536             start_time = None

~/.local/lib/python3.8/site-packages/pyqlib-0.6.1.99-py3.8-linux-x86_64.egg/qlib/data/data.py in _get_calendar(self, freq, future)
    118             _calendar, _calendar_index = H["c"][flag]
    119         else:
--> 120             _calendar = np.array(self.load_calendar(freq, future))
    121             _calendar_index = {x: i for i, x in enumerate(_calendar)}  # for fast search
    122             H["c"][flag] = _calendar, _calendar_index

~/.local/lib/python3.8/site-packages/pyqlib-0.6.1.99-py3.8-linux-x86_64.egg/qlib/data/data.py in load_calendar(self, freq, future)
    527             fname = self._uri_cal.format(freq)
    528         if not os.path.exists(fname):
--> 529             raise ValueError("calendar not exists for freq " + freq)
    530         with open(fname) as f:
    531             return [pd.Timestamp(x.strip()) for x in f]

ValueError: calendar not exists for freq day
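
The traceback shows `load_calendar` failing because no calendar file exists for the requested frequency. A quick way to see which calendars a data directory actually provides is sketched below. It assumes the calendar files live at `<provider_uri>/calendars/<freq>.txt`; that layout is an assumption here and is not confirmed anywhere in this thread. The helper names `available_calendar_freqs` and `has_calendar` are hypothetical and not part of Qlib.

```python
from pathlib import Path
from typing import List


def available_calendar_freqs(provider_uri: str) -> List[str]:
    """List the frequencies that have a calendar file under provider_uri.

    Assumption: calendars are stored as <provider_uri>/calendars/<freq>.txt.
    """
    cal_dir = Path(provider_uri).expanduser() / "calendars"
    if not cal_dir.is_dir():
        return []
    # The file stem (name without ".txt") is treated as the frequency label.
    return sorted(p.stem for p in cal_dir.glob("*.txt"))


def has_calendar(provider_uri: str, freq: str) -> bool:
    """Return True if a calendar file for freq exists under provider_uri."""
    return freq in available_calendar_freqs(provider_uri)


if __name__ == "__main__":
    # Path taken from the qlib.init call in this report.
    uri = "~/.qlib/qlib_data/my_data/"
    print("calendars found:", available_calendar_freqs(uri))
    print("has day calendar:", has_calendar(uri, "day"))
```

If the directory only contains a 30m calendar, a request for `day` cannot succeed regardless of what frequency the data handler was configured with.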

Environment

Note: Users can run cd scripts && python collect_info.py all under the project directory to get system information
and paste it here directly.

  • Qlib version:
  • Python version: 3.8.7
  • OS (Windows, Linux, MacOS): Linux
  • Commit number (optional, please provide it if you are using the dev version): c0e7cbc
@ChengYen-Tang added the bug label Jan 31, 2021
@you-n-g
Collaborator

you-n-g commented Feb 1, 2021

@ChengYen-Tang
This bug is fixed in #234.

The frequency parameter will only be used in the dataloader in the future.

@ChengYen-Tang
Contributor Author

@you-n-g
So PortAnaRecord does not support the frequency parameter?

@you-n-g
Collaborator

you-n-g commented Feb 3, 2021

@ChengYen-Tang No, PortAnaRecord does not support frequency parameters so far.
We are focusing on developing this feature.

Thanks
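
Why the error names `day` even though the handler was configured with `30m` can be illustrated with a simplified mock. In the traceback, `get_date_by_shift` calls `D.calendar(future=future)` without a frequency, so the default `freq="day"` applies. The code below is not Qlib code: the functions are stand-ins that reuse names from the traceback for readability, the calendar contents are made up, and `get_date_by_shift_with_freq` is a hypothetical variant, not the change made in #234.

```python
# Stand-in for the calendars available on disk (made-up timestamps).
AVAILABLE_CALENDARS = {
    "30m": ["2020-07-07 00:00:00", "2020-07-07 00:30:00"],
}


def load_calendar(freq):
    """Mimic the failing check: raise if no calendar exists for freq."""
    if freq not in AVAILABLE_CALENDARS:
        raise ValueError("calendar not exists for freq " + freq)
    return AVAILABLE_CALENDARS[freq]


def calendar(freq="day"):
    """Mimic a calendar lookup whose frequency defaults to 'day'."""
    return load_calendar(freq)


def get_date_by_shift_without_freq():
    """Mirror the call pattern in the traceback: no freq is forwarded."""
    return calendar()


def get_date_by_shift_with_freq(freq):
    """Hypothetical variant that forwards the caller's frequency."""
    return calendar(freq=freq)


if __name__ == "__main__":
    try:
        get_date_by_shift_without_freq()
    except ValueError as err:
        print("without freq:", err)
    print("with freq:", get_date_by_shift_with_freq("30m"))
```

The mock only shows how a default argument can override the frequency chosen elsewhere in the configuration; it says nothing about how the backtest code was or will be changed.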
