change get_data url #1558

Merged on Jun 25, 2023 (38 commits)
Changes from 29 commits
21 changes: 15 additions & 6 deletions .github/workflows/test_qlib_from_source.yml
@@ -20,18 +20,28 @@ jobs:

steps:
- name: Test qlib from source
uses: actions/checkout@v2
uses: actions/checkout@v3

# The CI installs Python 3.7.17 for Python 3.7 on macOS, and that version causes a "_bz not found error".
# So we pin a more specific Python 3.7 version for macOS.
# refs: https://github.com/actions/setup-python/issues/682
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
if: (matrix.os == 'macos-latest' && matrix.python-version == '3.7') || (matrix.os == 'macos-11' && matrix.python-version == '3.7')
uses: actions/setup-python@v4
with:
python-version: "3.7.16"

- name: Set up Python ${{ matrix.python-version }}
if: (matrix.os != 'macos-latest' || matrix.python-version != '3.7') && (matrix.os != 'macos-11' || matrix.python-version != '3.7')
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}

- name: Update pip to the latest version
# pip released version 23.1 on Apr. 15, 2023 and CI failed to run; please refer to #1495 for detailed logs.
# The pip version has been temporarily fixed to 23.0.1
# The pip version has been temporarily fixed to 23.0
run: |
python -m pip install pip==23.0.1
python -m pip install pip==23.0

- name: Installing pytorch for macos
if: ${{ matrix.os == 'macos-11' || matrix.os == 'macos-latest' }}
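Review note: the two `if:` guards on the "Set up Python" steps above are meant to be exact complements, so that every matrix entry runs exactly one of the two steps. The Python sketch below checks this by brute force. The OS and Python version lists are assumed for illustration only, since the real matrix lies outside this hunk, and the function names are hypothetical.

```python
from itertools import product

# Assumed matrix values for illustration; the real matrix is not shown in this hunk.
OSES = ["windows-latest", "ubuntu-20.04", "macos-11", "macos-latest"]
PYTHONS = ["3.7", "3.8"]


def pinned(os_name, py):
    """Mirror of the `if:` on the step that pins python-version to "3.7.16"."""
    return (os_name == "macos-latest" and py == "3.7") or (
        os_name == "macos-11" and py == "3.7"
    )


def default(os_name, py):
    """Mirror of the `if:` on the step that uses matrix.python-version."""
    return (os_name != "macos-latest" or py != "3.7") and (
        os_name != "macos-11" or py != "3.7"
    )


def resolved_version(os_name, py):
    """Python version that a given matrix entry ends up installing."""
    return "3.7.16" if pinned(os_name, py) else py


# By De Morgan's laws the second guard is the negation of the first,
# so exactly one of the two must hold for every combination.
combos = list(product(OSES, PYTHONS))
exactly_one = all(pinned(o, p) != default(o, p) for o, p in combos)

if __name__ == "__main__":
    print("guards are complementary:", exactly_one)
    for o, p in combos:
        print(f"{o:16s} {p} -> {resolved_version(o, p)}")
```

Because the second condition is just the negation of the first, the same effect could probably be had with a single step and a conditional expression, but two explicit steps keep the workflow easy to read.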
@@ -129,8 +139,7 @@ jobs:
- name: Test data downloads
run: |
python scripts/get_data.py qlib_data --name qlib_data_simple --target_dir ~/.qlib/qlib_data/cn_data --interval 1d --region cn
azcopy copy https://qlibpublic.blob.core.windows.net/data/rl /tmp/qlibpublic/data --recursive
mv /tmp/qlibpublic/data tests/.data
python scripts/get_data.py rl_data --target_dir tests/.data/rl

- name: Install Lightgbm for MacOS
if: ${{ matrix.os == 'macos-11' || matrix.os == 'macos-latest' }}
18 changes: 14 additions & 4 deletions .github/workflows/test_qlib_from_source_slow.yml
@@ -20,18 +20,28 @@ jobs:

steps:
- name: Test qlib from source slow
uses: actions/checkout@v2
uses: actions/checkout@v3

# The CI installs Python 3.7.17 for Python 3.7 on macOS, and that version causes a "_bz not found error".
# So we pin a more specific Python 3.7 version for macOS.
# refs: https://github.com/actions/setup-python/issues/682
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
if: (matrix.os == 'macos-latest' && matrix.python-version == '3.7') || (matrix.os == 'macos-11' && matrix.python-version == '3.7')
uses: actions/setup-python@v4
with:
python-version: "3.7.16"

- name: Set up Python ${{ matrix.python-version }}
if: (matrix.os != 'macos-latest' || matrix.python-version != '3.7') && (matrix.os != 'macos-11' || matrix.python-version != '3.7')
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}

- name: Set up Python tools
# pip released version 23.1 on Apr. 15, 2023 and CI failed to run; please refer to #1495 for detailed logs.
# The pip version has been temporarily fixed to 23.0.1
# The pip version has been temporarily fixed to 23.0
run: |
python -m pip install pip==23.0.1
python -m pip install pip==23.0
pip install --upgrade cython numpy
pip install -e .[dev]

39 changes: 38 additions & 1 deletion qlib/tests/data.py
@@ -16,7 +16,7 @@

class GetData:
DATASET_VERSION = "v2"
REMOTE_URL = "https://qlibpublic.blob.core.windows.net/data/default/stock_data"
REMOTE_URL = "http://fintech.msra.cn/stock_data/downloads"
QLIB_DATA_NAME = "{dataset_name}_{region}_{interval}_{qlib_version}.zip"

def __init__(self, delete_zip_file=False):
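Review note: `QLIB_DATA_NAME` is a plain `str.format` template. The sketch below shows how such a template expands into a file name, using the values from the workflow's `qlib_data` invocation (`qlib_data_simple`, `cn`, `1d`) and the `"latest"` version string seen in the next hunk. `build_file_name` is a hypothetical helper, not a function from the PR, and how the resulting name is combined with `REMOTE_URL` is not shown in this diff, so it is left out here.

```python
# Template copied from the class attribute in the diff above.
QLIB_DATA_NAME = "{dataset_name}_{region}_{interval}_{qlib_version}.zip"


def build_file_name(dataset_name, region, interval, qlib_version):
    """Hypothetical helper: fill the template and lower-case the result,
    as the diff does with `file_name.lower()` before downloading."""
    return QLIB_DATA_NAME.format(
        dataset_name=dataset_name,
        region=region,
        interval=interval,
        qlib_version=qlib_version,
    ).lower()


if __name__ == "__main__":
    print(build_file_name("qlib_data_simple", "cn", "1d", "latest"))
    # -> qlib_data_simple_cn_1d_latest.zip
```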
@@ -165,6 +165,43 @@ def _get_file_name(v):
file_name = _get_file_name("latest")
self._download_data(file_name.lower(), target_dir, delete_old, dataset_version=version)

def rl_data(
self,
target_dir="~/.qlib/qlib_data/rl_data",
version=None,
delete_old=True,
exists_skip=False,
):
"""download rl data from remote

Parameters
----------
target_dir: str
data save directory
version: str
data version, value from [v1, ...], by default None(use script to specify version)
delete_old: bool
delete an existing directory, by default True
exists_skip: bool
skip the download if the data already exists, by default False

Examples
---------
# get rl data
python get_data.py rl_data --target_dir ~/.qlib/qlib_data/rl_data
-------

"""
if exists_skip and exists_qlib_data(target_dir):
logger.warning(
f"Data already exists: {target_dir}, the data download will be skipped\n"
f"\tIf downloading is required: `exists_skip=False` or `change target_dir`"
)
return

file_name = "rl.zip"
self._download_data(file_name.lower(), target_dir, delete_old, dataset_version=version)
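Review note: the `exists_skip` guard in `rl_data` can be sketched in isolation as below. This is a simplified illustration under stated assumptions, not the PR's implementation: `data_exists` is a hypothetical stand-in for `exists_qlib_data` (whose actual checks are not shown in this diff), and `download` is an injected callable standing in for `self._download_data`, so the sketch runs without network access.

```python
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger("get_data_sketch")


def data_exists(target_dir):
    """Hypothetical stand-in for exists_qlib_data:
    true when the directory exists and contains at least one entry."""
    path = Path(target_dir).expanduser()
    return path.is_dir() and any(path.iterdir())


def fetch_rl_data(target_dir, download, exists_skip=False):
    """Call `download("rl.zip", target_dir)` unless skipping is requested
    and data is already present. Returns True if a download was triggered."""
    if exists_skip and data_exists(target_dir):
        logger.warning(
            "Data already exists: %s, the data download will be skipped", target_dir
        )
        return False
    download("rl.zip", target_dir)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fake_download = lambda name, target: print("downloading", name, "to", target)
        fetch_rl_data(tmp, fake_download, exists_skip=True)  # empty dir: downloads
        (Path(tmp) / "marker.txt").write_text("x")
        fetch_rl_data(tmp, fake_download, exists_skip=True)  # non-empty: skipped
```

Note that with the default `exists_skip=False`, as in the CI command `python scripts/get_data.py rl_data --target_dir tests/.data/rl`, the guard never fires and the download always runs.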

def csv_data_cn(self, target_dir="~/.qlib/csv_data/cn_data"):
"""download cn csv data from remote
