Bug fix for Rank and WMA operators #1228

qianyun210603 · 2022-07-24T08:52:31Z

Description

1) Change the scaling in the inner function of the Rank operator from len(x1) to 100.
2) Add 1 to the non-normalized weights of the WMA operator.

Motivation and Context

For 1): I would guess the goal is to scale the ranking result to between 0 and 1. According to the scipy documentation, percentileofscore does the following:

Compute the percentile rank of a score relative to a list of scores.
A percentileofscore of, for example, 80% means that 80% of the scores in a are below the given score.

Since the result is a percentage between 0 and 100, it does not make sense to divide it by the array length; it should be divided by 100 instead.
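The scaling argument above can be sketched with a small stdlib-only stand-in. The helper `percentile_of_score` below is hypothetical and only assumed to agree with scipy's default ranking behaviour for simple inputs like these; it is not the code from the pull request.

```python
def percentile_of_score(scores, score):
    """Hypothetical stand-in: percentile rank of `score` within `scores`, in 0-100.

    Ties are averaged, which is assumed to match scipy's default behaviour.
    """
    n = len(scores)
    below = sum(1 for s in scores if s < score)
    at_or_below = sum(1 for s in scores if s <= score)
    bump = 1 if at_or_below > below else 0
    return (below + at_or_below + bump) * 50.0 / n


window = [1.0, 2.0, 3.0, 4.0]

# Percentile of the most recent value within its window (a value in 0-100).
pct = percentile_of_score(window, window[-1])

# Old scaling: the result depends on the window length and can exceed 1.
old_scaled = pct / len(window)

# New scaling: always lands in the 0-1 range.
new_scaled = pct / 100
```

With a window of four values whose last element is the largest, the old scaling gives 25.0 while the new one gives 1.0.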

For 2): By common convention, a linearly weighted MA with window size d uses the weights d, d-1, ..., 1 rather than d-1, d-2, ..., 0; see e.g. https://www.fidelity.com/learning-center/trading-investing/technical-analysis/technical-indicator-guide/wma and the decay_linear function in the well-known 101 Formulaic Alphas by Zura Kakushadze of WorldQuant.
The convention is arguable, but I think the more intuitive one is better.
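A minimal sketch of the difference between the two weighting schemes, assuming the window is ordered oldest first so that the most recent value receives the largest weight. The helper `weighted_ma` is hypothetical and only illustrates the arithmetic.

```python
def weighted_ma(values, weights):
    """Weighted average of `values` with the given non-normalized weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


window = [10.0, 20.0, 30.0]  # oldest value first (assumption)
d = len(window)

old_weights = list(range(d))         # d-1, ..., 0 read newest to oldest: [0, 1, 2]
new_weights = list(range(1, d + 1))  # d, ..., 1 read newest to oldest: [1, 2, 3]

old_wma = weighted_ma(window, old_weights)  # (0*10 + 1*20 + 2*30) / 3
new_wma = weighted_ma(window, new_weights)  # (1*10 + 2*20 + 3*30) / 6
```

Under the old weights the oldest observation gets weight 0, so a window of size d effectively averages only d-1 values.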

How Has This Been Tested?

Pass the test by running: pytest qlib/tests/test_all_pipeline.py under upper directory of qlib.
If you are adding a new feature, test on your own test scripts.

Screenshots of Test Results (if appropriate):

Pipeline test:
Your own tests:

Types of changes

Fix bugs
Add new feature
Update documentation


qianyun210603 · 2022-08-03T05:29:02Z

@you-n-g
Would you take a look please? Let me know if anything needs improvement. Thanks!

qlib/data/ops.py

you-n-g · 2022-08-12T14:45:51Z

Nice shot!
Thanks for fixing the bug!
I have left one comment.
LGTM after it is fixed.

ghost · 2022-08-13T01:43:47Z

All CLA requirements met.

qianyun210603 · 2022-08-13T02:00:42Z

Thanks for the reply. Updated as you suggested. I also corrected some wrong parameter-list comments.

qianyun210603 · 2022-08-18T09:23:30Z

@you-n-g any feedback please?

you-n-g · 2022-08-22T09:47:20Z

@qianyun210603
It looks like the CI still fails.
Please check the error. Thanks

qianyun210603 · 2022-08-23T01:24:36Z

@you-n-g That is because 'microsoft:main' as of Aug 13, which was the base of my pull request, had a failing CI.
It is fixed after merging the latest 'microsoft:main'.

qianyun210603 · 2022-08-23T05:41:31Z

@you-n-g
I took a further look at the failure.
It complains:

self = Rolling [window=10,min_periods=1,center=False,axis=0,method=single]
attr = 'rank'

    def __getattr__(self, attr: str):
        if attr in self._internal_names_set:
            return object.__getattribute__(self, attr)
        if attr in self.obj:
            return self[attr]
    
        raise AttributeError(
>           f"'{type(self).__name__}' object has no attribute '{attr}'"
        )
E       AttributeError: 'Rolling' object has no attribute 'rank'

My guess is that the pandas version is too old and does not yet implement rank on Rolling.
Could you let me know where to check the pandas version in the Actions run? Thanks!

qianyun210603 · 2022-08-23T13:09:53Z

@you-n-g

I dug in a bit further. The py3.7 tests fail because Rolling.rank was added in pandas 1.4.0, but py3.7 can only run pandas up to 1.3.5.

Therefore we need to fall back to the original scipy solution for rank on py3.7.
I modified the pull request to use the scipy version for pandas 1.3.5 and below and the native pandas Rolling.rank for pandas 1.4.0 and above. The code is a bit ugly, though.

The alternatives would be to drop py3.7 support and use the native pandas Rolling.rank everywhere, or
to fall back to the scipy solution for all versions (I would recommend against that, as the scipy version was about 20 times slower than native pandas in a simple test I did).

Let me know your thoughts. Thanks.

qianyun210603 · 2022-08-29T05:43:50Z

@you-n-g any thoughts please?

you-n-g · 2022-08-29T09:41:21Z

qlib/data/ops.py

@@ -1154,18 +1148,32 @@ class Rank(Rolling):

    def __init__(self, feature, N):
        super(Rank, self).__init__(feature, N, "rank")
+        major_version, minor_version, *_ = pd.__version__.split(".")
+        self._load_internal = (


I would recommend implementing it as a method that overrides the parent method instead of setting an attribute, for the following reasons.

Fewer attributes make the class simpler.

Setting the attribute will not work in some special cases (for example, when someone dumps an instance of the operator under a lower version of pandas and loads it under a higher version).

qianyun210603 · 2022-08-30

@you-n-g

I thought of a way to use hasattr instead of a version check. See the latest commit. Let me know if you have a better idea. Thanks.
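The hasattr approach can be sketched roughly as follows. This is not the code from the commit: `LegacyRoller` and `NativeRoller` are hypothetical stand-ins for a rolling object without and with a native rank method, written with the standard library only. The method names `rank(pct=True)` and `apply(func, raw=True)` mirror the pandas rolling API, so `rolling_pct_rank` is expected to accept a real pandas rolling object as well, though that is not exercised here and NaN handling is left out.

```python
def average_rank_of_last(window):
    """Average rank (1-based, ties averaged) of the last value within its window."""
    last = window[-1]
    below = sum(1 for v in window if v < last)
    equal = sum(1 for v in window if v == last)
    return below + (equal + 1) / 2


def fallback_pct_rank(window):
    """Fallback used when the rolling object has no rank(): rank scaled to 0-1."""
    return average_rank_of_last(window) / len(window)


class LegacyRoller:
    """Hypothetical stand-in for a rolling object that has no rank() method."""

    def __init__(self, values, window):
        self.values = list(values)
        self.window = window

    def _windows(self):
        # Trailing windows; shorter at the start, similar to min_periods=1.
        for i in range(len(self.values)):
            yield self.values[max(0, i - self.window + 1): i + 1]

    def apply(self, func, raw=True):
        return [func(w) for w in self._windows()]


class NativeRoller(LegacyRoller):
    """Hypothetical stand-in for a rolling object that ships a native rank()."""

    def rank(self, pct=False):
        return [average_rank_of_last(w) / (len(w) if pct else 1) for w in self._windows()]


def rolling_pct_rank(roller):
    # Feature detection happens at call time, so nothing version-specific
    # is stored on the operator instance when it is created or pickled.
    if hasattr(roller, "rank"):
        return roller.rank(pct=True)
    return roller.apply(fallback_pct_rank, raw=True)


values = [3, 1, 2, 5, 4]
legacy_result = rolling_pct_rank(LegacyRoller(values, 3))
native_result = rolling_pct_rank(NativeRoller(values, 3))
```

Because the check runs inside the function on every call, an instance created under one library version and loaded under another picks whichever path is available at that moment, which addresses the dump-and-load concern raised in the review.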

qianyun210603 · 2022-09-08

Any update or feedback, please?


qianyun210603 · 2022-09-14T01:40:57Z

@you-n-g any progress please?

qianyun210603 · 2022-11-08T07:56:04Z

@you-n-g any comments on this?

you-n-g · 2022-11-13T11:03:19Z

Sorry for the late response.
Thanks for your great efforts!

* bug fix: 1) 100 should be used to scale down percentileofscore return to 0-1, not length of array; 2) for (linear) weighted MA(n), weight should be n, n-1, ..., 1 instead of n-1, ..., 0
* use native pandas function for rank
* remove useless import
* require pandas 1.4+
* rank for py37+pandas 1.3.5 compatibility
* lint improvement
* lint black fix
* use hasattr instead of version to check whether rolling.rank is implemented

bug fix: 1) 100 should be used to scale down percentileofscore return to 0-1, not length of array; 2) for (linear) weighted MA(n), weight should be n, n-1, ..., 1 instead of n-1, ..., 0

2f3edf1

you-n-g reviewed Aug 12, 2022

qlib/data/ops.py

use native pandas function for rank

9e88ac3

qianyun210603 added 2 commits August 13, 2022 09:46

remove useless import

11785c9

Merge branch 'microsoft:main' into main

5e7b9f0

Merge branch 'microsoft:main' into main

e28fdb3

qianyun210603 added 5 commits August 23, 2022 16:19

require pandas 1.4+

b4f8408

Merge branch 'main' of github.com:qianyun210603/qlib

bef38c4

rank for py37+pandas 1.3.5 compatibility

223f2c7

lint improvement

ad05334

lint black fix

33ed6c3

you-n-g reviewed Aug 29, 2022

use hasattr instead of version to check whether rolling.rank is implemented

ee68adc

qianyun210603 force-pushed the main branch from b6b2b84 to ee68adc Compare August 30, 2022 06:43

Merge branch 'microsoft:main' into main

c6533ff

you-n-g self-assigned this Sep 2, 2022

Merge branch 'microsoft:main' into main

1669bf9

you-n-g merged commit 4001a5d into microsoft:main Nov 13, 2022

you-n-g added the bug label Dec 9, 2022

qianyun210603 Aug 30, 2022

qianyun210603 Sep 8, 2022