.. currentmodule:: qlib
Users can get stock data with Qlib. The following examples demonstrate the basic user interface.
Qlib initialization:
Note
In order to get the data, users need to initialize Qlib with qlib.init first. Please refer to initialization.
If users followed the steps in initialization and downloaded the data, they should use the following code to initialize Qlib:
>>> import qlib
>>> qlib.init(provider_uri='~/.qlib/qlib_data/cn_data')
Load trading calendar with given time range and frequency:
>>> from qlib.data import D
>>> D.calendar(start_time='2010-01-01', end_time='2017-12-31', freq='day')[:2]
[Timestamp('2010-01-04 00:00:00'), Timestamp('2010-01-05 00:00:00')]
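To make the time-range semantics concrete, here is a minimal standard-library sketch of slicing a sorted trading calendar by an inclusive date range. It is not Qlib's implementation; the helper `slice_calendar` and the sample calendar are hypothetical, and only the two dates in the result come from the output above.

```python
from bisect import bisect_left, bisect_right
from datetime import date


def slice_calendar(calendar, start, end):
    """Return the trading days in the inclusive range [start, end].

    `calendar` must be sorted in ascending order.
    """
    lo = bisect_left(calendar, start)   # first index with day >= start
    hi = bisect_right(calendar, end)    # first index with day > end
    return calendar[lo:hi]


# Hypothetical calendar for illustration only.
trading_days = [
    date(2009, 12, 31),
    date(2010, 1, 4),
    date(2010, 1, 5),
    date(2010, 1, 6),
]

first_two = slice_calendar(trading_days, date(2010, 1, 1), date(2017, 12, 31))[:2]
print(first_two)
```

A start date that is not itself a trading day (here 2010-01-01) simply selects the first trading day after it.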
Parse a given market name into a stock pool config:
>>> from qlib.data import D
>>> D.instruments(market='all')
{'market': 'all', 'filter_pipe': []}
Load the instruments of a certain stock pool in a given time range:
>>> from qlib.data import D
>>> instruments = D.instruments(market='csi300')
>>> D.list_instruments(instruments=instruments, start_time='2010-01-01', end_time='2017-12-31', as_list=True)[:6]
['SH600036', 'SH600110', 'SH600087', 'SH600900', 'SH600089', 'SZ000912']
Load dynamic instruments from a base market according to a name filter:
>>> from qlib.data import D
>>> from qlib.data.filter import NameDFilter
>>> nameDFilter = NameDFilter(name_rule_re='SH[0-9]{4}55')
>>> instruments = D.instruments(market='csi300', filter_pipe=[nameDFilter])
>>> D.list_instruments(instruments=instruments, start_time='2015-01-01', end_time='2016-02-15', as_list=True)
['SH600655', 'SH601555']
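The name rule above is an ordinary regular expression, so its effect can be checked without Qlib. The sketch below assumes the rule is matched from the start of each instrument code (as `re.match` does); the helper `filter_by_name` and the candidate list are hypothetical.

```python
import re


def filter_by_name(codes, name_rule_re):
    """Keep the codes that match the regular expression from their first character."""
    pattern = re.compile(name_rule_re)
    return [code for code in codes if pattern.match(code)]


# Hypothetical candidate codes; two of them appear in the output above.
codes = ['SH600036', 'SH600655', 'SH601555', 'SZ000655']

# 'SH', then any four digits, then '55'.
matched = filter_by_name(codes, 'SH[0-9]{4}55')
print(matched)
```

`SZ000655` is rejected because of its `SZ` prefix, and `SH600036` because it does not end in `55`.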
Load dynamic instruments from a base market according to an expression filter:
>>> from qlib.data import D
>>> from qlib.data.filter import ExpressionDFilter
>>> expressionDFilter = ExpressionDFilter(rule_expression='$close>2000')
>>> instruments = D.instruments(market='csi300', filter_pipe=[expressionDFilter])
>>> D.list_instruments(instruments=instruments, start_time='2015-01-01', end_time='2016-02-15', as_list=True)
['SZ000651', 'SZ000002', 'SH600655', 'SH600570']
For more details about filters, please refer to the Filter API.
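One way to picture an expression filter such as `$close>2000` is as a rule that keeps an instrument only over the date spans in which the expression holds. The sketch below illustrates that idea in plain Python; it is an assumption about the semantics rather than Qlib's implementation, and the helper `matching_spans`, the codes `AAA` and `BBB`, and all prices are made up.

```python
from datetime import date


def matching_spans(series, predicate):
    """Return (first_day, last_day) spans of consecutive rows where predicate(value) is true.

    `series` is a list of (day, value) pairs sorted by day.
    """
    spans = []
    start = prev = None
    for day, value in series:
        if predicate(value):
            if start is None:
                start = day          # a new span opens here
            prev = day
        elif start is not None:
            spans.append((start, prev))  # the span closed on the previous matching day
            start = None
    if start is not None:
        spans.append((start, prev))
    return spans


# Hypothetical daily closing prices.
closes = {
    "AAA": [
        (date(2015, 1, 5), 1990.0),
        (date(2015, 1, 6), 2010.0),
        (date(2015, 1, 7), 2050.0),
        (date(2015, 1, 8), 1980.0),
        (date(2015, 1, 9), 2100.0),
    ],
    "BBB": [(date(2015, 1, 5), 15.0), (date(2015, 1, 6), 15.5)],
}

spans = {code: matching_spans(rows, lambda close: close > 2000) for code, rows in closes.items()}

# An instrument is listed if the expression held on at least one day in the range.
selected = [code for code, found in spans.items() if found]
print(spans["AAA"])
print(selected)
```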
Load features of certain instruments in a given time range:
>>> from qlib.data import D
>>> instruments = ['SH600000']
>>> fields = ['$close', '$volume', 'Ref($close, 1)', 'Mean($close, 3)', '$high-$low']
>>> D.features(instruments, fields, start_time='2010-01-01', end_time='2017-12-31', freq='day').head()
                          $close     $volume  Ref($close, 1)  Mean($close, 3)  $high-$low
instrument datetime
SH600000   2010-01-04  86.778313  16162960.0       88.825928        88.061483    2.907631
           2010-01-05  87.433578  28117442.0       86.778313        87.679273    3.235252
           2010-01-06  85.713585  23632884.0       87.433578        86.641825    1.720009
           2010-01-07  83.788803  20813402.0       85.713585        85.645322    3.030487
           2010-01-08  84.730675  16044853.0       83.788803        84.744354    2.047623
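The expression fields can be read directly off the output above: `Ref($close, 1)` is the previous row's close, and `Mean($close, 3)` is the average of the current and two preceding closes. The following plain-Python sketch reproduces those columns from the `$close` values shown. The helpers `ref` and `rolling_mean` are hypothetical stand-ins, not Qlib operators, and they return `None` where the sample has no earlier rows (Qlib's own output fills those rows from data before 2010-01-04).

```python
def ref(values, n):
    """Value from n rows earlier (n > 0); None where no such row exists in the sample."""
    return [values[i - n] if i >= n else None for i in range(len(values))]


def rolling_mean(values, n):
    """Mean of the current value and the n - 1 before it; None until n values are available."""
    return [
        sum(values[i - n + 1:i + 1]) / n if i >= n - 1 else None
        for i in range(len(values))
    ]


# $close for SH600000 from 2010-01-04 to 2010-01-08, copied from the output above.
close = [86.778313, 87.433578, 85.713585, 83.788803, 84.730675]

prev_close = ref(close, 1)
mean_3 = rolling_mean(close, 3)

print(prev_close[1])          # previous close as seen on 2010-01-05
print(round(mean_3[2], 6))    # 3-day mean on 2010-01-06
```

The values for 2010-01-06 onward agree with the `Mean($close, 3)` column to the displayed precision.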
Load features of a certain stock pool in a given time range:
Note
With the cache enabled, the Qlib data server always caches data for the requested stock pool and fields, so the first request may take longer to process than it would without the cache. After the first request, requests with the same stock pool and fields hit the cache and are processed faster, even if the requested time period changes.
>>> from qlib.data import D
>>> from qlib.data.filter import NameDFilter, ExpressionDFilter
>>> nameDFilter = NameDFilter(name_rule_re='SH[0-9]{4}55')
>>> expressionDFilter = ExpressionDFilter(rule_expression='$close>Ref($close,1)')
>>> instruments = D.instruments(market='csi300', filter_pipe=[nameDFilter, expressionDFilter])
>>> fields = ['$close', '$volume', 'Ref($close, 1)', 'Mean($close, 3)', '$high-$low']
>>> D.features(instruments, fields, start_time='2010-01-01', end_time='2017-12-31', freq='day').head()
                            $close        $volume  Ref($close, 1)  Mean($close, 3)  $high-$low
instrument datetime
SH600655   2010-01-04  2699.567383  158193.328125     2619.070312      2626.097738  124.580566
           2010-01-08  2612.359619   77501.406250     2584.567627      2623.220133   83.373047
           2010-01-11  2712.982422  160852.390625     2612.359619      2636.636556  146.621582
           2010-01-12  2788.688232  164587.937500     2712.982422      2704.676758  128.413818
           2010-01-13  2790.604004  145460.453125     2788.688232      2764.091553  128.413818
For more details about features, please refer to the Feature API.
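The caching note in this example says that repeated requests for the same stock pool and fields are served faster even when the time period changes. One plausible way to get that behavior is to key the cache on the pool and fields only and slice by time afterwards. The sketch below illustrates that idea; it is not Qlib's cache, and the class `PoolFieldCache`, the loader `slow_loader`, and the rows are all hypothetical.

```python
class PoolFieldCache:
    """Cache full-history rows per (pool, fields); the time range is applied after lookup."""

    def __init__(self, loader):
        self._loader = loader   # expensive function: (pool, fields) -> list of (day, value)
        self._store = {}
        self.loads = 0          # how many times the expensive loader actually ran

    def features(self, pool, fields, start, end):
        key = (pool, tuple(fields))   # note: the time range is not part of the key
        if key not in self._store:
            self._store[key] = self._loader(pool, fields)
            self.loads += 1
        # ISO-formatted date strings compare correctly as plain strings.
        return [row for row in self._store[key] if start <= row[0] <= end]


def slow_loader(pool, fields):
    """Hypothetical stand-in for reading the whole history from disk."""
    return [("2010-01-04", 1.0), ("2011-06-01", 2.0), ("2012-03-15", 3.0)]


cache = PoolFieldCache(slow_loader)
first = cache.features("csi300", ["$close"], "2010-01-01", "2010-12-31")
second = cache.features("csi300", ["$close"], "2011-01-01", "2012-12-31")
print(cache.loads)
```

The second call uses a different time range but the same pool and fields, so the loader is not invoked again.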
Note
When calling D.features() on the client, use the parameter disk_cache=0 to skip the dataset cache, or disk_cache=1 to generate and use the dataset cache. In addition, when calling on the server, users can pass disk_cache=2 to update the dataset cache.
To learn more about how to use Data, see API Reference: Data API.