Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RL backtest with simulator #1299

Merged
merged 9 commits into from
Oct 12, 2022
Merged

RL backtest with simulator #1299

merged 9 commits into from
Oct 12, 2022

Conversation

lihuoran
Copy link
Contributor

@lihuoran lihuoran commented Sep 19, 2022

Now support running RL backtest with SAOE simulator.

Description

Motivation and Context

How Has This Been Tested?

  • Pass the test by running: pytest qlib/tests/test_all_pipeline.py under upper directory of qlib.
  • If you are adding a new feature, test on your own test scripts.

Screenshots of Test Results (if appropriate):

  1. Pipeline test:
  2. Your own tests:

Types of changes

  • Fix bugs
  • Add new feature
  • Update documentation

split: str = "stock",
cash_limit: float = None,
generate_report: bool = False,
) -> Union[Tuple[pd.DataFrame, dict], pd.DataFrame]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be some docstring for this function.

@@ -90,26 +92,109 @@ def _convert_indicator_to_dataframe(indicator: dict) -> Optional[pd.DataFrame]:
return records


def _generate_report(decisions: list, report_dict: dict) -> dict:
def _generate_report(decisions: list, report_dicts: List[dict]) -> dict:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be richer annotation for the input (e.g. report) and the returned report

For example, @ dataclass with typed fields and detailed docstrings

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some research on this issue and I found that this is related to the entire collect_data_loop() / backtest() lifecycle, so we need a lot of efforts to optimize it. Since it is not about the core function of RL backtest, I suggest we leave this in future PRs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can leave a TODO here...

}
)

simulator = SingleAssetOrderExecution(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this simulator generate reports?
I didn't find how the step is called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simulator will execute one hidden step when it is created. When it is used in training, it will pause at the first yield of the internal strategies. However, in backtest, there will (should) not be any pauses, so the simulator will run until it stops.

stock_pool = stock_pool

single = single_with_simulator if with_simulator else single_with_collect_data_loop
if parallel_mode:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think parallel_mode is a required parameter.
Joblib will fall to single process when n_jobs == 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

executor_config
Executor configuration
exchange_config
Exchange configuration
qlib_config
Configuration used to initialize Qlib. If it is None, Qlib will not be initialized.
cash_limit:
Cash limit.
backtest_mode
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

backtest_mode is not a necessary parameter if we carefully design it.
It should disappear with CollectDataEnvWrapper in the future.

Please add doc for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc added.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. backtest_mode looks ugly in the init signature.

executor.inner_strategy.set_env(CollectDataEnvWrapper())
executor = executor.inner_executor

self.step(action=None)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should it call step in the reset phase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call step() with None is to "activate" the internal generator.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please Add comments about it.

)
assert isinstance(self._collect_data_loop, Generator)

self._last_yielded_saoe_strategy = self._iter_strategy(action=None)
if backtest_mode:
executor: BaseExecutor = self._executor
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comments that it should be removed in the future

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done~

fill_val = fill_method(original_data)
return np.array([tmp.get(t, fill_val) for t in total_time_list])


class SAOEStateAdapter:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest moving this adapter to state_adapter.py or simulator_qlib.py. If I'm using the simple simluator, there is no reason I will be interested in all these adapter-related logics.

Can be left as TODO. :)

for key in ["1minute", "5minute", "30minute", "1day"]:
if key not in report_dict["indicator"]:
decision_details = pd.concat([getattr(d, "details") for d in decisions if hasattr(d, "details")])
for key in ["1min", "5min", "30min", "1day"]:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I hard-coded this to quickly run through the experiments.
For open source version, it's worth making it more general.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of the following issue mentioned by you-n-g before. I will redesign the entire logic in later PRs.

I think there should be richer annotation for the input (e.g. report) and the returned report
For example, @ dataclass with typed fields and detailed docstrings

@@ -576,3 +576,16 @@ def __repr__(self) -> str:
f"trade_range: {self.trade_range}; "
f"order_list[{len(self.order_list)}]"
)


class TradeDecisionWithDetails(TradeDecisionWO):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some explanations on why (in what scenarios) we need this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

qlib/rl/order_execution/policy.py Show resolved Hide resolved
executor_config
Executor configuration
exchange_config
Exchange configuration
qlib_config
Configuration used to initialize Qlib. If it is None, Qlib will not be initialized.
cash_limit:
Cash limit.
backtest_mode

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. backtest_mode looks ugly in the init signature.

@you-n-g you-n-g merged commit 216a8ec into main Oct 12, 2022
@you-n-g you-n-g deleted the huoran/migrate_amc4th branch October 12, 2022 08:44
@ChiahungTai
Copy link
Collaborator

Why this PR is merged even break the CI???
It make other PRs fail the checks........

Screen Shot 2022-10-14 at 8 43 13 AM

@you-n-g
Copy link
Collaborator

you-n-g commented Oct 14, 2022

The CI will be fixed in this PR soon #1314

@you-n-g you-n-g added the enhancement New feature or request label Dec 9, 2022
qianyun210603 pushed a commit to qianyun210603/qlib that referenced this pull request Mar 23, 2023
* RL backtest with simulator

* Minor modification in init_qlib

* Cherry pick PR 1302

* Resolve PR comments

* Fix missing data processing

* Minor bugfix

* Add TODOs and docs

* Add a comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants