PortAnaRecord backtest fails for high-frequency data #1122

@2young-2simple-sometimes-naive

Description

🐛 Bug Description

I am backtesting data at a 30-minute interval. The PortAnaRecord module raises the error below.

To Reproduce

Steps to reproduce the behavior:

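from qlib.workflow import R
from qlib.workflow.record_temp import SignalRecord, SigAnaRecord, PortAnaRecord

# model, dataset and port_analysis_config are defined earlier in the script (not shown)
with R.start(experiment_name="train_model"):
    recorder = R.get_recorder()
    rid = recorder.id
    print("RID: " + rid)
    model.fit(dataset)
    R.save_objects(trained_model=model)
    # prediction
    sr = SignalRecord(model, dataset, recorder)
    sr.generate()
    # signal analysis
    sig = SigAnaRecord(recorder, ana_long_short=True, ann_scaler=252, skip_existing=False)
    sig.generate()
    # backtest (this is the call that raises the ValueError)
    par = PortAnaRecord(recorder, port_analysis_config, risk_analysis_freq="day", indicator_analysis_freq="day")
    par.generate()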

Expected Behavior

The backtest runs to completion on the 30-minute data.

Screenshot

[33234:MainThread](2022-06-10 12:09:26,179) INFO - qlib.timer - [log.py:113] - Time cost: 22.111s | fit & process data Done
[33234:MainThread](2022-06-10 12:09:26,180) INFO - qlib.timer - [log.py:113] - Time cost: 104.391s | Init data Done
[33234:MainThread](2022-06-10 12:09:37,809) INFO - qlib.workflow - [expm.py:315] - <mlflow.tracking.client.MlflowClient object at 0x1554dcd826d0>
[33234:MainThread](2022-06-10 12:09:37,875) INFO - qlib.workflow - [exp.py:257] - Experiment 1 starts running ...
[33234:MainThread](2022-06-10 12:09:38,267) INFO - qlib.workflow - [recorder.py:293] - Recorder f9b0c9f7bdbf49c8b65eb37a57ba1ab0 starts running under Experiment 1 ...
/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/contrib/model/highfreq_gdbt_model.py:93: FutureWarning: Using the level keyword in DataFrame and Series aggregations is deprecated and will be removed in a future version. Use groupby instead. df.median(level=1) should use df.groupby(level=1).median().
  df_train["label"][l_name] = df_train["label"][l_name] - df_train["label"][l_name].mean(level=0)
/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/contrib/model/highfreq_gdbt_model.py:93: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_train["label"][l_name] = df_train["label"][l_name] - df_train["label"][l_name].mean(level=0)
/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/contrib/model/highfreq_gdbt_model.py:94: FutureWarning: Using the level keyword in DataFrame and Series aggregations is deprecated and will be removed in a future version. Use groupby instead. df.median(level=1) should use df.groupby(level=1).median().
  df_valid["label"][l_name] = df_valid["label"][l_name] - df_valid["label"][l_name].mean(level=0)
/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/contrib/model/highfreq_gdbt_model.py:94: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_valid["label"][l_name] = df_valid["label"][l_name] - df_valid["label"][l_name].mean(level=0)
[33234:MainThread](2022-06-10 12:41:35,649) INFO - qlib.workflow - [record_temp.py:194] - Signal record 'pred.pkl' has been saved as the artifact of the Experiment 1
[33234:MainThread](2022-06-10 12:41:55,350) INFO - qlib.timer - [log.py:113] - Time cost: 0.000s | waiting `async_log` Done
[33234:MainThread](2022-06-10 12:41:55,351) ERROR - qlib.workflow - [utils.py:41] - An exception has been raised[ValueError: can't find a freq from [Freq(30min)] that can resample to 1min!].
  File "./wfhf.py", line 218, in <module>
    par.generate()
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/workflow/record_temp.py", line 232, in generate
    return self._generate(*args, **kwargs)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/workflow/record_temp.py", line 468, in _generate
    portfolio_metric_dict, indicator_dict = normal_backtest(
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/backtest/__init__.py", line 245, in backtest
    trade_strategy, trade_executor = get_strategy_executor(
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/backtest/__init__.py", line 178, in get_strategy_executor
    trade_account = create_account_instance(
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/backtest/__init__.py", line 158, in create_account_instance
    return Account(**kwargs)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/backtest/account.py", line 103, in __init__
    self.init_vars(init_cash, position_dict, freq, benchmark_config)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/backtest/account.py", line 124, in init_vars
    self.reset(freq=freq, benchmark_config=benchmark_config)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/backtest/account.py", line 166, in reset
    self.reset_report(self.freq, self.benchmark_config)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/backtest/account.py", line 137, in reset_report
    self.portfolio_metrics = PortfolioMetrics(freq, benchmark_config)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/backtest/report.py", line 70, in __init__
    self.init_bench(freq=freq, benchmark_config=benchmark_config)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/backtest/report.py", line 88, in init_bench
    self.bench = self._cal_benchmark(self.benchmark_config, self.freq)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/backtest/report.py", line 107, in _cal_benchmark
    _temp_result, _ = get_higher_eq_freq_feature(_codes, fields, start_time, end_time, freq=freq)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/utils/resam.py", line 92, in get_higher_eq_freq_feature
    _result = D.features(instruments, fields, start_time, end_time, freq="1min", disk_cache=disk_cache)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 1189, in features
    return DatasetD.dataset(instruments, fields, start_time, end_time, freq, inst_processors=inst_processors)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 915, in dataset
    cal = Cal.calendar(start_time, end_time, freq)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 91, in calendar
    _calendar, _calendar_index = self._get_calendar(freq, future)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 172, in _get_calendar
    _calendar = np.array(self.load_calendar(freq, future))
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/data/data.py", line 662, in load_calendar
    backend_obj = self.backend_obj(freq=freq, future=future).data
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/data/storage/file_storage.py", line 124, in data
    self.check()
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/data/storage/file_storage.py", line 72, in check
    if not self.uri.exists():
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/data/storage/file_storage.py", line 120, in uri
    return self.dpm.get_data_uri(self._freq_file).joinpath(f"{self.storage_name}s", self.file_name)
  File "/dataxxxxxxxxxx/.venv/lib/python3.8/site-packages/pyqlib-0.8.5.99-py3.8-linux-x86_64.egg/qlib/data/storage/file_storage.py", line 101, in _freq_file
    raise ValueError(f"can't find a freq from {self.support_freq} that can resample to {self.freq}!")
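
The last frames show the benchmark lookup requesting freq="1min" while the data directory only provides Freq(30min), so no stored frequency can be resampled to the requested one. The snippet below is a minimal, self-contained sketch of that rule and not qlib's actual implementation: `freq_to_minutes` and `pick_source_freq` are hypothetical helpers, and treating a day as 1440 minutes is an assumption made only to order the frequencies.

```python
import re

# Hypothetical unit table; a "day" is treated as 1440 minutes for ordering only.
_UNIT_MINUTES = {"min": 1, "day": 1440}


def freq_to_minutes(freq: str) -> int:
    """Convert a frequency string such as '30min' or 'day' to a minute count."""
    match = re.fullmatch(r"(\d*)(min|day)", freq)
    if match is None:
        raise ValueError(f"unsupported freq: {freq}")
    count = int(match.group(1) or 1)
    return count * _UNIT_MINUTES[match.group(2)]


def pick_source_freq(supported, requested):
    """Pick the coarsest stored frequency that can be resampled to `requested`.

    A stored frequency qualifies only if it is equal to or finer than the
    requested one and divides it evenly. Coarser data cannot be turned into
    finer bars, which is the situation in the traceback above.
    """
    target = freq_to_minutes(requested)
    candidates = [
        f for f in supported
        if freq_to_minutes(f) <= target and target % freq_to_minutes(f) == 0
    ]
    if not candidates:
        raise ValueError(
            f"can't find a freq from {supported} that can resample to {requested}!"
        )
    return max(candidates, key=freq_to_minutes)


if __name__ == "__main__":
    # Same shape as the failing case: only 30min data, 1min requested.
    try:
        pick_source_freq(["30min"], "1min")
    except ValueError as err:
        print(err)
```

Under this reading, the request can only succeed if the data directory also contains a calendar at 1min (or the benchmark data is available at 30min so the fallback to 1min is never attempted).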

Environment

Linux
x86_64
Linux-4.18.0-147.el8.x86_64-x86_64-with-glibc2.2.5
#1 SMP Wed Dec 4 21:51:45 UTC 2019

Python version: 3.8.6 (default, Oct 22 2020, 17:03:03)  [GCC 9.3.0]

Qlib version: 0.8.5.99
numpy==1.22.3
pandas==1.3.5
scipy==1.8.1
requests==2.27.1
sacred==0.8.2
python-socketio==5.6.0
redis==4.3.1
python-redis-lock==3.7.0
schedule==1.1.0
cvxpy==1.2.1
hyperopt==0.1.2
fire==0.4.0
statsmodels==0.13.2
xlrd==2.0.1
plotly==5.8.0
matplotlib==3.5.2
tables==3.7.0
pyyaml==6.0
mlflow==1.26.0
tqdm==4.64.0
loguru==0.6.0
lightgbm==3.3.2
tornado==6.1
joblib==1.1.0
fire==0.4.0
ruamel.yaml==0.17.21

Additional Notes

Labels

bug (Something isn't working)