Fix the error when the stock code is a number #78

zhupr · 2020-12-08T15:09:43Z

Description

Motivation and Context

How Has This Been Tested?

Pass the test by running: pytest qlib/tests/test_all_pipeline.py from the parent directory of qlib.
If you are adding a new feature, test on your own test scripts.

Screenshots of Test Results (if appropriate):

Pipeline test:
Your own tests:

Types of changes

Fix bugs
Add new feature
Update documentation

you-n-g · 2020-12-09T03:59:34Z

@zhupr Please check the CI results.
Thanks

you-n-g · 2020-12-16T07:05:08Z

qlib/data/data.py

@@ -591,6 +594,8 @@ def _load_instruments(self, market):
        df = pd.read_csv(fname, sep="\t", names=["inst", "start_datetime", "end_datetime", "save_inst"])
        df["start_datetime"] = pd.to_datetime(df["start_datetime"])
        df["end_datetime"] = pd.to_datetime(df["end_datetime"])
+        df["inst"] = df["inst"].astype(str)


Please add more documentation about save_inst.

you-n-g · 2020-12-16T07:08:29Z

qlib/data/data.py

@@ -223,8 +223,11 @@ def convert_instruments(self, instrument):
            for _path in Path(C.get_data_path()).joinpath("instruments").glob("*.txt"):
                _df = pd.read_csv(_path, sep="\t", names=["inst", "start_datetime", "end_datetime", "save_inst"])
                _df_list.append(_df.iloc[:, [0, -1]])
-            df = pd.concat(_df_list, sort=False).sort_values("save_inst")
-            df = df.drop_duplicates(subset=["save_inst"], keep="first").fillna(axis=1, method="ffill")
+            df = pd.concat(_df_list, sort=False)


From the pandas read_csv documentation:
dtype: Type name or dict of column -> type, optional. Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32, 'c': 'Int64'}. Use str or object together with suitable na_values settings to preserve and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.

Would passing dtype to read_csv be a cleaner solution?
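
A minimal sketch of one practical difference between the two approaches, assuming a hypothetical tab-separated instruments file whose codes are purely numeric with leading zeros (the sample rows are invented for illustration). Calling astype(str) after the read converts values that pandas has already parsed as integers, while dtype keeps the original text:

```python
import io

import pandas as pd

# Hypothetical instruments content: tab-separated, numeric codes with leading zeros.
raw = "000001\t2010-01-04\t2020-12-01\n600000\t2010-01-04\t2020-12-01\n"
names = ["inst", "start_datetime", "end_datetime"]

# Converting after the read is too late: the column was already parsed as integers.
late = pd.read_csv(io.StringIO(raw), sep="\t", names=names)
late["inst"] = late["inst"].astype(str)

# Declaring the dtype up front keeps the raw text of the column.
early = pd.read_csv(io.StringIO(raw), sep="\t", names=names, dtype={"inst": str})

print(late["inst"].tolist())   # ['1', '600000']
print(early["inst"].tolist())  # ['000001', '600000']
```

Whether leading zeros matter depends on the market's code format; for codes such as SH600000 both approaches give the same result.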

you-n-g · 2020-12-16T07:26:12Z

qlib/data/data.py

@@ -223,8 +223,11 @@ def convert_instruments(self, instrument):
            for _path in Path(C.get_data_path()).joinpath("instruments").glob("*.txt"):


It is very strange that InstrumentProvider has a function named convert_instruments.

you-n-g · 2020-12-21T03:10:54Z

qlib/data/data.py

+                    # `day`
+                    begin = inst_time[1]
+                    end = inst_time[2]
+                elif len(inst_time) == 5:


It is very strange to use different logic for processing data of different frequencies.
Why not use code like pd.read_csv(fname, sep="\t", names=["inst", "start_datetime", "end_datetime"])?
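
A rough sketch of the suggestion, assuming the fields are tab-separated so that the space inside an intraday timestamp stays within a single field. The helper name load_instruments and the sample rows are hypothetical, and the helper takes text instead of a file name only to keep the example self-contained:

```python
import io

import pandas as pd

NAMES = ["inst", "start_datetime", "end_datetime"]


def load_instruments(text: str) -> pd.DataFrame:
    """Parse tab-separated instrument spans with one code path for any frequency."""
    df = pd.read_csv(io.StringIO(text), sep="\t", names=NAMES, dtype={"inst": str})
    df["start_datetime"] = pd.to_datetime(df["start_datetime"])
    df["end_datetime"] = pd.to_datetime(df["end_datetime"])
    return df


# Daily and minute-level rows go through the same function.
daily = load_instruments("SH600000\t2020-01-02\t2020-12-31\n")
minute = load_instruments("SH600000\t2020-01-02 09:30:00\t2020-12-31 15:00:00\n")

print(daily.loc[0, "end_datetime"])
print(minute.loc[0, "start_datetime"])
```

With this shape there is no need to branch on the number of whitespace-separated tokens per line.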

you-n-g · 2021-01-07T10:40:09Z

qlib/__init__.py

@@ -2,7 +2,7 @@
 # Licensed under the MIT License.


-__version__ = "0.6.1.dev"
+__version__ = "0.6.2.dev"


you-n-g · 2021-01-07T10:43:28Z

qlib/tests/data.py

+    @staticmethod
+    def _delete_qlib_data(file_dir: Path):
+        logger.info(f"delete {file_dir}")
+        for _name in ["features", "calendars", "instruments"]:


What about the cache data?

you-n-g · 2021-01-07T10:46:34Z

qlib/tests/data.py


        Examples
        ---------
        python get_data.py qlib_data --name qlib_data --target_dir ~/.qlib/qlib_data/cn_data --version latest --interval 1d --region cn
        -------

        """
-        file_name = f"{name}_{region.lower()}_{interval.lower()}_{version}.zip"
-        self._download_data(file_name.lower(), target_dir)
+        _version = re.search(r"(\d+)(.)(\d+)(.)(\d+)", qlib.__version__)[0]


Please add more documentation about the background for this code.
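
For context, a small sketch of what the version-extraction line appears to do: take the leading major.minor.patch part of a version string. The helper name release_part is hypothetical. Note that an unescaped "." in a regular expression matches any character, so the sketch also shows a stricter pattern with escaped dots:

```python
import re

# Pattern as written in the diff: the unescaped "." matches any character.
LOOSE = re.compile(r"(\d+)(.)(\d+)(.)(\d+)")
# Escaped dots only match a literal "major.minor.patch" sequence.
STRICT = re.compile(r"\d+\.\d+\.\d+")


def release_part(version: str) -> str:
    """Return the leading major.minor.patch part of a version string."""
    match = STRICT.search(version)
    if match is None:
        raise ValueError(f"no major.minor.patch found in {version!r}")
    return match.group(0)


print(release_part("0.6.2.dev"))           # 0.6.2
print(release_part("0.6.1.99.dev"))        # 0.6.1
print(LOOSE.search("1x2y3") is not None)   # True: the loose pattern accepts this
print(STRICT.search("1x2y3") is not None)  # False
```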

you-n-g · 2021-01-07T10:47:16Z

qlib/tests/data.py

+            dataset_name=name,
+            region=region.lower(),
+            interval=interval.lower(),
+            qlib_version=qlib_version,


Did you consider the case where the qlib version changes but the data version remains the same?

you-n-g · 2021-01-07T10:50:44Z

qlib/utils/__init__.py

+    replace_names += [f"COM{i}" for i in range(10)]
+    replace_names += [f"LPT{i}" for i in range(10)]
+
+    prefix = "_QLIB_" if code.isupper() else "_qlib_"


Why is this necessary?
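
Some background that may explain the change: Windows reserves device names such as CON, PRN, AUX, NUL, COMn and LPTn, so a stock code with one of these names cannot be used directly as a file or directory name there. The sketch below mirrors the lines in the diff; the helper name safe_file_stem is hypothetical and the surrounding logic is an assumption, not the PR's actual implementation:

```python
# Names that Windows treats as reserved device names (same ranges as in the diff).
RESERVED = {"CON", "PRN", "AUX", "NUL"}
RESERVED |= {f"COM{i}" for i in range(10)}
RESERVED |= {f"LPT{i}" for i in range(10)}


def safe_file_stem(code: str) -> str:
    """Hypothetical helper: prefix a code that collides with a reserved name."""
    if code.upper() in RESERVED:
        # Keep the prefix in the same case as the code, as the diff does.
        prefix = "_QLIB_" if code.isupper() else "_qlib_"
        return prefix + code
    return code


print(safe_file_stem("CON"))   # _QLIB_CON
print(safe_file_stem("con"))   # _qlib_con
print(safe_file_stem("AAPL"))  # AAPL
```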

you-n-g · 2021-01-21T09:36:02Z

qlib/__init__.py

@@ -2,7 +2,7 @@
 # Licensed under the MIT License.


-__version__ = "0.6.1.dev"
+__version__ = "0.6.1.99.dev"


If we add ".99", do we still need ".dev"?

zhupr force-pushed the main branch 3 times, most recently from 231fabd to 4f6fe04 Compare December 8, 2020 16:56

zhupr force-pushed the main branch from 4f6fe04 to 7fc88b3 Compare December 10, 2020 09:14

you-n-g reviewed Dec 16, 2020

zhupr mentioned this pull request Dec 20, 2020

fail to update the data #126

Open

you-n-g reviewed Dec 21, 2020

you-n-g mentioned this pull request Jan 2, 2021

fix the bug of DumpDataUpdate #165

Closed

5 tasks

zhupr force-pushed the main branch from 7c16ef1 to 41eac9f Compare January 6, 2021 02:41

Fix the error when the stock code is a number

df55653

zhupr force-pushed the main branch from 41eac9f to ff0c862 Compare January 6, 2021 03:27

you-n-g reviewed Jan 7, 2021

zhupr force-pushed the main branch from 049a714 to 896716f Compare January 21, 2021 05:36

you-n-g reviewed Jan 21, 2021

zhupr added 2 commits January 26, 2021 16:06

US stock code supports Windows

1a1c459

Merge remote-tracking branch 'qlib/main' into save_inst

7579f4b

zhupr force-pushed the main branch from 896716f to 7579f4b Compare January 26, 2021 08:21

zhupr added 2 commits January 26, 2021 16:29

version removed .dev

1eaf09c

Merge remote-tracking branch 'qlib/main' into save_inst

ae45711

you-n-g merged commit 36e5c60 into microsoft:main Jan 26, 2021
