New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the error when the stock code is a number #78
Conversation
231fabd
to
4f6fe04
Compare
@zhupr Please check the CI results. |
qlib/data/data.py
Outdated
@@ -591,6 +594,8 @@ def _load_instruments(self, market): | |||
df = pd.read_csv(fname, sep="\t", names=["inst", "start_datetime", "end_datetime", "save_inst"]) | |||
df["start_datetime"] = pd.to_datetime(df["start_datetime"]) | |||
df["end_datetime"] = pd.to_datetime(df["end_datetime"]) | |||
df["inst"] = df["inst"].astype(str) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
give more docs about the save_inst
qlib/data/data.py
Outdated
@@ -223,8 +223,11 @@ def convert_instruments(self, instrument): | |||
for _path in Path(C.get_data_path()).joinpath("instruments").glob("*.txt"): | |||
_df = pd.read_csv(_path, sep="\t", names=["inst", "start_datetime", "end_datetime", "save_inst"]) | |||
_df_list.append(_df.iloc[:, [0, -1]]) | |||
df = pd.concat(_df_list, sort=False).sort_values("save_inst") | |||
df = df.drop_duplicates(subset=["save_inst"], keep="first").fillna(axis=1, method="ffill") | |||
df = pd.concat(_df_list, sort=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dtype: Type name or dict of column -> type, optional
Data type for data or columns. E.g. {‘a’: np.float64, ‘b’: np.int32, ‘c’: ‘Int64’} Use str or object together with suitable na_values settings to preserve and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.
Will dtype be a cleaner solution?
qlib/data/data.py
Outdated
@@ -223,8 +223,11 @@ def convert_instruments(self, instrument): | |||
for _path in Path(C.get_data_path()).joinpath("instruments").glob("*.txt"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is very strange that InstrumentProvider
has such such a function named convert_instruments
qlib/data/data.py
Outdated
# `day` | ||
begin = inst_time[1] | ||
end = inst_time[2] | ||
elif len(inst_time) == 5: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very strange to use different logic for processing different frequency data.
Why not using code like pd.read_csv(fname, sep="\t", names=["inst", "start_datetime", "end_datetime"])
qlib/__init__.py
Outdated
@@ -2,7 +2,7 @@ | |||
# Licensed under the MIT License. | |||
|
|||
|
|||
__version__ = "0.6.1.dev" | |||
__version__ = "0.6.2.dev" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussion
qlib/tests/data.py
Outdated
@staticmethod | ||
def _delete_qlib_data(file_dir: Path): | ||
logger.info(f"delete {file_dir}") | ||
for _name in ["features", "calendars", "instruments"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about cache data ?
qlib/tests/data.py
Outdated
|
||
Examples | ||
--------- | ||
python get_data.py qlib_data --name qlib_data --target_dir ~/.qlib/qlib_data/cn_data --version latest --interval 1d --region cn | ||
------- | ||
|
||
""" | ||
file_name = f"{name}_{region.lower()}_{interval.lower()}_{version}.zip" | ||
self._download_data(file_name.lower(), target_dir) | ||
_version = re.search(r"(\d+)(.)(\d+)(.)(\d+)", qlib.__version__)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add more documents about the background for this code.
qlib/tests/data.py
Outdated
dataset_name=name, | ||
region=region.lower(), | ||
interval=interval.lower(), | ||
qlib_version=qlib_version, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider the situation that the qlib version changes but the data version remains the same?
qlib/utils/__init__.py
Outdated
replace_names += [f"COM{i}" for i in range(10)] | ||
replace_names += [f"LPT{i}" for i in range(10)] | ||
|
||
prefix = "_QLIB_" if code.isupper() else "_qlib_" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this is necessary?
qlib/__init__.py
Outdated
@@ -2,7 +2,7 @@ | |||
# Licensed under the MIT License. | |||
|
|||
|
|||
__version__ = "0.6.1.dev" | |||
__version__ = "0.6.1.99.dev" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add ".99", do we still need ".dev" ?
Description
Motivation and Context
How Has This Been Tested?
pytest qlib/tests/test_all_pipeline.py
under upper directory ofqlib
.Screenshots of Test Results (if appropriate):
Types of changes