microsoft · zhupr · Sep 15, 2020 · Sep 21, 2020
diff --git a/CHANGES.rst b/CHANGES.rst
@@ -0,0 +1,152 @@
+Changelog
+====================
+Here you can see the full list of changes between each QLib release.
+
+Version 0.1.0
+--------------------
+This is the initial release of the QLib library.
+
+Version 0.1.1
+--------------------
+Optimize performance. Add more features and operators.
+
+Version 0.1.2
+--------------------
+- Support operator syntax. Now ``High() - Low()`` is equivalent to ``Sub(High(), Low())``.   
+- Add more technical indicators.
+
+Version 0.1.3
+--------------------
+Fix bugs and add an instrument filtering mechanism.
+
+Version 0.2.0
+--------------------
+- Redesign ``LocalProvider`` database format for performance improvement.
+- Support loading features as string fields.
+- Add scripts for database construction.
+- More operators and technical indicators.
+
+Version 0.2.1
+--------------------
+- Support registering user-defined ``Provider``.
+- Support using operators in string format, e.g. ``['Ref($close, 1)']`` is a valid field format.
+- Support dynamic fields in ``$some_field`` format. Existing fields like ``Close()`` may be deprecated in the future.
+
+Version 0.2.2
+--------------------
+- Add ``disk_cache`` for reusing features (enabled by default).
+- Add ``qlib.contrib`` for experimental model construction and evaluation.
+
+
+Version 0.2.3
+--------------------
+- Add ``backtest`` module
+- Decouple the Strategy, Account, Position, and Exchange from the backtest module
+
+Version 0.2.4
+--------------------
+- Add ``profit attribution`` module
+- Add ``risk_control`` and ``cost_control`` strategies
+
+Version 0.3.0
+--------------------
+- Add ``estimator`` module
+
+Version 0.3.1
+--------------------
+- Add ``filter`` module
+
+Version 0.3.2
+--------------------
+- Add real-price trading; if the ``factor`` field in the dataset is incomplete, use ``adj_price`` trading
+- Refactor ``handler``, ``launcher``, and ``trainer`` code
+- Support ``backtest`` configuration parameters in the configuration file
+- Fix a bug when position ``amount`` is 0
+- Fix bug of ``filter`` module
+
+Version 0.3.3
+-------------------
+- Fix bug of ``filter`` module
+
+Version 0.3.4
+--------------------
+- Support for ``finetune model``
+- Refactor ``fetcher`` code
+
+Version 0.3.5
+--------------------
+- Support multi-label training: you can provide multiple labels in ``handler``. (LightGBM does not support this because of the algorithm itself.)
+- Refactor ``handler`` code: dataset.py is no longer used, and you can define your own labels and features in ``feature_label_config``
+- ``handler`` only provides DataFrame. Likewise, ``trainer`` and model.py only accept DataFrame
+- Change ``split_rolling_data``: the data is now rolled on the market calendar, not on normal dates
+- Move some date config from ``handler`` to ``trainer``
+
+Version 0.4.0
+--------------------
+- Add ``data`` package that holds all data-related code
+- Reform the data provider structure
+- Create a server for centralized data management: `qlib-server <https://amc-msra.visualstudio.com/trading-algo/_git/qlib-server>`_
+- Add a ``ClientProvider`` to work with the server
+- Add a pluggable cache mechanism
+- Add a recursive backtracking algorithm to inspect the furthest reference date for an expression
+
+.. note::
+    The ``D.instruments`` function does not support the ``start_time``, ``end_time``, and ``as_list`` parameters. To get the same results as previous versions of ``D.instruments``, you can do this:
+
+
+    >>> from qlib.data import D
+    >>> instruments = D.instruments(market='csi500')
+    >>> D.list_instruments(instruments=instruments, start_time='2015-01-01', end_time='2016-02-15', as_list=True)
+
+
+Version 0.4.1
+--------------------
+- Add Windows support
+- Fix ``instruments`` type bug
+- Fix the bug where ``features`` is empty (it caused failures when updating)
+- Fix ``cache`` lock and update bug
+- Fix cache reuse so that the same field uses the same cache (originally, a space would add a new cache)
+- Change "logger handler" from config
+- Change model loading to support version 0.4.0 and later
+- The default value of the ``method`` parameter of the ``risk_analysis`` function is changed from **ci** to **si**
+
+
+Version 0.4.2
+--------------------
+- Refactor DataHandler
+- Add ``ALPHA360`` DataHandler
+
+
+Version 0.4.3
+--------------------
+- Implement the online inference and trading framework
+- Refactor the interfaces of the backtest and strategy modules.
+
+
+Version 0.4.4
+--------------------
+- Optimize cache generation performance
+- Add report module
+- Fix bug when using ``ServerDatasetCache`` offline.
+- In previous versions of ``long_short_backtest``, ``np.nan`` could appear in long_short. This is fixed in version ``0.4.4``, so the results of ``long_short_backtest`` will differ from previous versions.
+- In version ``0.4.2`` of the ``risk_analysis`` function, ``N`` is ``250``; from ``0.4.3``, ``N`` is ``252``. As a result, the ``0.4.2`` value is ``0.002122`` smaller than the ``0.4.3`` value, and the backtest results differ slightly between ``0.4.2`` and ``0.4.3``.
+- Refactor the arguments of the backtest function.
+    - **NOTE**:
+      - The default arguments of the topk margin strategy are changed. Please pass the arguments explicitly if you want to get the same backtest result as the previous version.
+      - The TopkWeightStrategy is changed slightly. It will try to sell the stocks in excess of ``topk``. (The backtest result of TopkAmountStrategy remains the same.)
+- The margin ratio mechanism is supported in the Topk Margin strategies.
+
+
+Version 0.4.5
+--------------------
+- Add multi-kernel implementation for both client and server.
+    - Support a new way to load data from client which skips dataset cache.
+    - Change the default dataset method from the single-kernel implementation to the multi-kernel implementation.
+- Accelerate high-frequency data reading by optimizing the related modules.
+- Support a new way to write the config file using a dict.
+
+Version 0.4.6
+--------------------
+- Some bugs are fixed
+    - The default config in ``Version 0.4.5`` is not friendly to daily frequency data.
+    - Backtest error in TopkWeightStrategy when ``WithInteract=True``.
diff --git a/README.md b/README.md
@@ -1,3 +1,189 @@
+Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment.
+
+With Qlib, you can easily apply your favorite model to create a better Quant investment strategy.
+
+
+- [Framework of Qlib](#framework-of-qlib)
+- [Quick start](#quick-start)
+  - [Installation](#installation)
+  - [Get Data](#get-data)
+  - [Auto Quant research workflow with _estimator_](#auto-quant-research-workflow-with-estimator)
+  - [Customized Quant research workflow by code](#customized-quant-research-workflow-by-code)
+- [More About Qlib](#more-about-qlib)
+  - [Offline mode and online mode](#offline-mode-and-online-mode)
+  - [Performance of Qlib Data Server](#performance-of-qlib-data-server)
+
+
+
+# Framework of Qlib
+![framework](docs/_static/img/framework.png)
+
+At the module level, Qlib is a platform that consists of the components above. Each component is loosely coupled and can be used on its own.
+
+| Name | Description|
+|------| -----|
+| _Data layer_ | _DataServer_ focuses on providing high-performance infrastructure for users to retrieve raw data. _DataEnhancement_ preprocesses the data and provides the best dataset to feed into the models |
+| _Interday Model_ | _Interday model_ focuses on producing forecasting signals (aka _alpha_). Models are trained by _Model Creator_ and managed by _Model Manager_. Users can choose one or multiple models for forecasting. Multiple models can be combined with the _Ensemble_ module |
+| _Interday Strategy_ | _Portfolio Generator_ takes forecasting signals as input and outputs orders based on the current position to achieve the target portfolio |
+| _Intraday Trading_ | _Order Executor_ is responsible for executing the orders output by _Interday Strategy_ and returning the executed results. |
+| _Analysis_ | Users can get a detailed analysis report of the forecasting signals and portfolio in this part. |
+
+* Modules drawn in a hand-drawn style are under development and will be released in the future.
+* Modules with a dashed border are highly user-customizable and extensible.
+
+
+# Quick start
+
+## Installation
+
+To install Qlib from source, you need _Cython_ in addition to the normal dependencies:
+
+```bash
+pip install cython
+```
+
+Clone the repository and then run:
+```bash
+python setup.py install
+```
+
+
+## Get Data
+- Load and prepare the data by running the following command:
+  ```bash
+  python scripts/get_data.py qlib_data_cn --target_dir ~/.qlib/qlib_data/cn_data
+  ```
+<!-- 
+- Run the initialization code and get stock data:
+
+  ```python
+  import qlib
+  from qlib.data import D
+  from qlib.config import REG_CN
+
+  # Initialization
+  mount_path = "~/.qlib/qlib_data/cn_data"  # target_dir
+  qlib.init(mount_path=mount_path, region=REG_CN)
+
+  # Get stock data by Qlib
+  # Load trading calendar with the given time range and frequency
+  print(D.calendar(start_time='2010-01-01', end_time='2017-12-31', freq='day')[:2])
+
+  # Parse a given market name into a stockpool config
+  instruments = D.instruments('csi500')
+  print(D.list_instruments(instruments=instruments, start_time='2010-01-01', end_time='2017-12-31', as_list=True)[:6])
+
+  # Load features of certain instruments in given time range
+  instruments = ['SH600000']
+  fields = ['$close', '$volume', 'Ref($close, 1)', 'Mean($close, 3)', '$high-$low']
+  print(D.features(instruments, fields, start_time='2010-01-01', end_time='2017-12-31', freq='day').head())
+  ```
+ -->
+
+## Auto Quant research workflow with _estimator_
+Qlib provides a tool named `estimator` to run the whole workflow automatically (including building the dataset, training models, backtesting, and analysis).
+
+1. Run _estimator_ (_config.yaml_ for: [estimator_config.yaml](example/estimator/estimator_config.yaml)):
+
+    ```bash
+    estimator -c example/estimator/estimator_config.yaml
+    ```
+
+    Estimator result:
+
+    ```bash
+    pred_long       mean    0.001386
+                    std     0.004403
+                    annual  0.349379
+                    sharpe  4.998428
+                    mdd    -0.049486
+    pred_short      mean    0.002703
+                    std     0.004680
+                    annual  0.681071
+                    sharpe  9.166842
+                    mdd    -0.053523
+    pred_long_short mean    0.004089
+                    std     0.007028
+                    annual  1.030451
+                    sharpe  9.236475
+                    mdd    -0.045817
+    sub_bench       mean    0.000953
+                    std     0.004688
+                    annual  0.240123
+                    sharpe  3.226878
+                    mdd    -0.064588
+    sub_cost        mean    0.000718
+                    std     0.004694
+                    annual  0.181003
+                    sharpe  2.428964
+                    mdd    -0.072977
+    ```
+    See the full documentation: [Use _Estimator_ to Start An Experiment](TODO:URL).
+
+2. Analysis
+
+    Run `examples/estimator/analyze_from_estimator.ipynb` in `jupyter notebook`
+    1.  forecasting signal analysis
+        - Model Performance
+        ![Model Performance](docs/_static/img/model_performance.png)
+
+    2.  portfolio analysis
+        - Report
+        ![Report](docs/_static/img/report.png)
+        <!-- 
+        - Score IC
+        ![Score IC](docs/_static/img/score_ic.png)
+        - Cumulative Return
+        ![Cumulative Return](docs/_static/img/cumulative_return.png)
+        - Risk Analysis
+        ![Risk Analysis](docs/_static/img/risk_analysis.png)
+        - Rank Label
+        ![Rank Label](docs/_static/img/rank_label.png)
+        -->
+
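+The `annual` and `sharpe` rows in the estimator result above look consistent with a simple annualization of the daily `mean` and `std`. The sketch below is a minimal check under stated assumptions: `annual = mean * N` and `sharpe = mean / std * sqrt(N)` with `N = 252` trading days. These formulas and the `annualize` helper are hypothetical, inferred from the printed numbers rather than taken from Qlib's source.
+
+```python
+import math
+
+# Assumed number of trading days per year (not confirmed by the Qlib source).
+N = 252
+
+
+def annualize(mean, std, n=N):
+    """Return (annualized return, Sharpe ratio) from a daily mean and std."""
+    annual = mean * n
+    sharpe = mean / std * math.sqrt(n)
+    return annual, sharpe
+
+
+# Daily mean and std of `pred_long`, copied from the estimator result above.
+annual, sharpe = annualize(mean=0.001386, std=0.004403)
+print(f"annual={annual:.6f} sharpe={sharpe:.6f}")
+```
+
+Because the printed `mean` and `std` are rounded to six decimals, the recomputed values only match the table approximately.
+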
+## Customized Quant research workflow by code
+An automatic workflow may not suit the research workflow of every Quant researcher. To support flexible Quant research workflows, Qlib also provides a modularized interface that allows researchers to build their own workflow. [Here](TODO_URL) is a demo of a customized Quant research workflow by code.
+
+
+
+# More About Qlib
+The detailed documents are organized in [docs](docs).
+[Sphinx](http://www.sphinx-doc.org) and the readthedocs theme are required to build the documentation in HTML format.
+```bash
+cd docs/
+conda install sphinx sphinx_rtd_theme -y
+# Otherwise, you can install them with pip
+# pip install sphinx sphinx_rtd_theme
+make html
+```
+You can also view the [latest documentation](TODO_URL) online directly.
+
+
+
+## Offline mode and online mode
+The Qlib data server can be deployed in either offline mode or online mode. The default is offline mode.
+
+In offline mode, the data is deployed locally.
+
+In online mode, the data is deployed as a shared data service. The data and its cache are shared by all clients, so data retrieval performance is expected to improve thanks to a higher cache hit rate. It also uses less disk space. Documentation for online mode can be found in [Qlib-Server](TODO_link). Online mode can be deployed automatically with [Azure CLI based scripts](TODO_link).
+
+## Performance of Qlib Data Server
+The performance of data processing is important to data-driven methods like AI technologies. As an AI-oriented platform, Qlib provides a solution for data storage and data processing. To demonstrate the performance of Qlib, we
+compare Qlib with several other solutions.
+
+The task for the solutions is to create a dataset from the
+basic OHLCV daily data of a stock market, which involves
+data query and processing.
+
+
+
+Most general-purpose databases take too much time to load data. After looking into the underlying implementation, we find that data goes through too many layers of interfaces and unnecessary format transformations in general-purpose database solutions.
+Such overhead greatly slows down the data loading process.
+Qlib data is stored in a compact format that can be efficiently combined into arrays for scientific computation.
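+
+As a rough illustration of why a flat binary layout loads quickly, the sketch below writes one column of values as raw 32-bit floats and reads it back into an array in a single call, with no per-row parsing. The file name `close.bin`, the sample prices, and the layout are assumptions made for this example; they are not Qlib's actual storage format, and the standard-library `array` module stands in for a numerical array library.
+
+```python
+import os
+import tempfile
+from array import array
+
+# Hypothetical daily close prices for one instrument, stored as 32-bit floats.
+close = array("f", [10.0, 10.5, 10.25, 10.75])
+
+with tempfile.TemporaryDirectory() as tmp_dir:
+    path = os.path.join(tmp_dir, "close.bin")  # illustrative file name
+
+    # Write raw bytes only: no header and no per-row framing.
+    with open(path, "wb") as f:
+        close.tofile(f)
+
+    # One read turns the whole file back into an array.
+    loaded = array("f")
+    with open(path, "rb") as f:
+        loaded.fromfile(f, len(close))
+
+print(list(loaded))
+```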
+
+
+
+
 
 # Contributing
 

diff --git a/README.rst b/README.rst
@@ -0,0 +1,34 @@
+QLib
+==========
+
+QLib is a Quantitative-research Library that provides research data with high consistency, reusability, and extensibility.
+
+.. note:: Anaconda python is strongly recommended for this library. See https://www.anaconda.com/download/.
+
+
+Install
+----------
+
+Install as root:
+
+.. code-block:: bash
+
+   $ python setup.py install
+
+
+Install as single user (if you have no root permission):
+
+.. code-block:: bash
+
+   $ python setup.py install --user
+
+
+To verify your installation, open a Python shell:
+
+.. code-block:: python
+
+   >>> import qlib
+   >>> qlib.__version__
+   '0.2.2'
+
+You can also run ``tests/test_data_sim.py`` to verify your installation.
diff --git a/docs/Makefile b/docs/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line.
+SPHINXOPTS    =
+SPHINXBUILD   = python3 -msphinx
+SPHINXPROJ    = Quantlab
+SOURCEDIR     = .
+BUILDDIR      = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
diff --git a/docs/_static/img/cumulative_return.png b/docs/_static/img/cumulative_return.png
diff --git a/docs/_static/img/framework.png b/docs/_static/img/framework.png
diff --git a/docs/_static/img/model_performance.png b/docs/_static/img/model_performance.png
diff --git a/docs/_static/img/rank_label.png b/docs/_static/img/rank_label.png
diff --git a/docs/_static/img/report.png b/docs/_static/img/report.png
diff --git a/docs/_static/img/risk_analysis.png b/docs/_static/img/risk_analysis.png
diff --git a/docs/_static/img/score_ic.png b/docs/_static/img/score_ic.png