Inconsistency of the adjusted close price provided by Qlib #849

@XeniaLLL

🐛 Bug Description

I asked the authors about an inconsistency between the data I obtained by following the documentation and the data shown in the documentation's example. They explained that the newly released data is normalized to the price on the stock's first listed day, and is adjusted so that it reflects the real trend of the stock in the market. However, the start date used for normalization is inconsistent with the real market data in cases such as acquisitions. Take SH600018 (上港集团, Shanghai International Port Group) as an example: it was not listed on the Shanghai Stock Exchange until 2006-10-16, and its data is recorded only from 2006-10-26. Before that, from 2005-04-01, the code SH600018 referred to a different company, 上港集箱 (Shanghai Port Container). The dataset treats the two as a single stock, which can cause problems for prediction:

  1. A model may be fitted on the performance of 上港集箱, which no longer exists.
  2. 上港集箱 is not a constituent of csi100 or csi300, which conflicts with the market indices.
  3. The first recorded price of 上港集团 is 0.26, not 1, and the two companies have different results after stock splits and ex-dividend/ex-rights adjustments.
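Point 3 can be made concrete with a minimal sketch. It assumes the two values reported in this issue for 2006-10-26 and 2006-10-27 and uses a hypothetical helper, `rebase_from`, which is not part of Qlib. Rebasing the series from the first recorded date of 上港集团 makes it start at 1 instead of 0.26:

```python
import pandas as pd


def rebase_from(close: pd.Series, start: str) -> pd.Series:
    """Hypothetical helper: drop everything before `start` and
    rescale so that the first remaining valid value equals 1."""
    segment = close.loc[pd.Timestamp(start):].dropna()
    return segment / segment.iloc[0]


# Values copied from the output reported in this issue.
close = pd.Series(
    [1.141278, 0.264230, 0.249589],
    index=pd.to_datetime(["2006-09-25", "2006-10-26", "2006-10-27"]),
    name="$close",
)

rebased = rebase_from(close, "2006-10-26")
print(rebased)
```

This only changes the scale of the later segment. It does not decide which segment belongs in the dataset, which is the question raised in points 1 and 2.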

To Reproduce

Steps to reproduce the behavior:

import qlib
from qlib.data import D

# Initialize Qlib against the local data directory
qlib.init(auto_mount=False, mount_path='/data/csdesign/qlib')
# Query the adjusted close price of SH600018
D.features(['SH600018'], ['$close'], start_time='20000101', end_time='20090101')

Running this returns the following:
    $close
    instrument datetime
    SH600018 2005-01-04 1.000000 (the initial listing date of 上港集箱, Shanghai Port Container)
    2005-01-05 1.007989
    2005-01-06 1.011984
    2005-01-07 1.019308
    2005-01-10 1.023302
    2005-01-11 1.045939
    2005-01-12 1.039947
    ...
    2006-09-13 1.141976
    2006-09-14 1.141976
    2006-09-15 1.146159
    2006-09-18 1.148250
    2006-09-19 1.145461
    2006-09-20 1.140581
    2006-09-21 1.141976
    2006-09-22 1.148947
    2006-09-25 1.141278
    2006-09-26 NaN
    2006-09-27 NaN
    2006-09-28 NaN
    2006-09-29 NaN
    2006-10-09 NaN
    2006-10-10 NaN
    2006-10-11 NaN
    2006-10-12 NaN
    2006-10-13 NaN
    2006-10-16 NaN
    2006-10-17 NaN
    2006-10-18 NaN
    2006-10-19 NaN
    2006-10-20 NaN
    2006-10-23 NaN
    2006-10-24 NaN
    2006-10-25 NaN
    2006-10-26 0.264230 (the first recorded date of 上港集团, Shanghai International Port Group)
    2006-10-27 0.249589
    2006-10-30 0.252378
    2006-10-31 0.262138
    2006-11-01 0.288631
    2006-11-02 0.306061
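The break in this output (a run of NaN rows followed by a jump in price level) can be detected mechanically. The following is a minimal sketch under stated assumptions: `find_breaks`, `min_gap`, and `max_ratio` are hypothetical names and thresholds chosen for illustration and are not part of Qlib, and the sample series contains only the rows around the gap, copied from the output above.

```python
import numpy as np
import pandas as pd


def find_breaks(close: pd.Series, min_gap: int = 5, max_ratio: float = 2.0) -> pd.DataFrame:
    """Hypothetical helper: report NaN runs of at least `min_gap` rows
    across which the price level changes by more than `max_ratio`."""
    is_nan = close.isna()
    # Consecutive rows with the same NaN status share one run id.
    run_id = (is_nan != is_nan.shift()).cumsum()
    rows = []
    for _, run in close[is_nan].groupby(run_id[is_nan]):
        if len(run) < min_gap:
            continue
        before = close.loc[: run.index[0]].dropna()
        after = close.loc[run.index[-1]:].dropna()
        if before.empty or after.empty:
            continue
        ratio = after.iloc[0] / before.iloc[-1]
        if ratio > max_ratio or ratio < 1.0 / max_ratio:
            rows.append({
                "last_before": before.index[-1],
                "first_after": after.index[0],
                "gap_rows": len(run),
                "ratio": ratio,
            })
    return pd.DataFrame(rows)


# Rows around the gap, copied from the output reported in this issue.
gap_days = [
    "2006-09-26", "2006-09-27", "2006-09-28", "2006-09-29", "2006-10-09",
    "2006-10-10", "2006-10-11", "2006-10-12", "2006-10-13", "2006-10-16",
    "2006-10-17", "2006-10-18", "2006-10-19", "2006-10-20", "2006-10-23",
    "2006-10-24", "2006-10-25",
]
dates = pd.to_datetime(["2006-09-22", "2006-09-25"] + gap_days + ["2006-10-26", "2006-10-27"])
values = [1.148947, 1.141278] + [np.nan] * len(gap_days) + [0.264230, 0.249589]
close = pd.Series(values, index=dates, name="$close")

breaks = find_breaks(close)
print(breaks)
```

For SH600018 this flags a single break: the value after the gap is about 0.23 times the value before it, which is what one would expect if the two segments are normalized against different bases.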

Expected Behavior

  1. Restrict the dataset to the period in which each stock is actually a constituent of the market indices, and correct the start date to the real listing date.
  2. Document the source of the open dataset for the Chinese market.
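The first request could look roughly like the sketch below. It is an illustration only: `mask_outside_spans` is a hypothetical helper, not a Qlib function, and the span used for SH600018 is a placeholder that starts at the first recorded date from this issue rather than real index membership data.

```python
import numpy as np
import pandas as pd


def mask_outside_spans(df: pd.DataFrame, spans: dict) -> pd.DataFrame:
    """Hypothetical helper: keep only rows whose datetime lies inside one of
    the (start, end) spans given for that row's instrument."""
    inst = df.index.get_level_values("instrument")
    dt = df.index.get_level_values("datetime")
    keep = np.zeros(len(df), dtype=bool)
    for name, periods in spans.items():
        for start, end in periods:
            in_span = (dt >= pd.Timestamp(start)) & (dt <= pd.Timestamp(end))
            keep |= np.asarray((inst == name) & in_span)
    return df.loc[keep]


# A few rows copied from the output reported in this issue.
index = pd.MultiIndex.from_tuples(
    [
        ("SH600018", pd.Timestamp("2006-09-25")),
        ("SH600018", pd.Timestamp("2006-10-26")),
        ("SH600018", pd.Timestamp("2006-10-27")),
    ],
    names=["instrument", "datetime"],
)
df = pd.DataFrame({"$close": [1.141278, 0.264230, 0.249589]}, index=index)

# Placeholder span (assumption): valid from the first recorded date onward.
spans = {"SH600018": [("2006-10-26", "2009-01-01")]}

filtered = mask_outside_spans(df, spans)
print(filtered)
```

With real membership spans in place of the placeholder, rows from the 上港集箱 period would be dropped before training, so a model would not be fitted on a company that no longer trades under this code.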

Environment

  • Qlib version: the latest one
  • Python version: python3.8
  • OS (Windows, Linux, MacOS): Windows
  • Commit number (optional, please provide it if you are using the dev version):
