【PaddlePaddle Hackathon 2】3. Add the corrcoef (Pearson product-moment correlation coefficient) API to Paddle #40690

liqitong-a · 2022-03-17T16:03:53Z

PR types

Others

PR changes

APIs

Describe

Issue link: corrcoef
RFC PR link: PaddlePaddle/community#46
Chinese documentation link: PaddlePaddle/docs#4316

Ligoml · 2022-03-18T08:23:14Z

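Hi, a few details need attention @liqitong-a
1. The PR title must match the issue title exactly.
2. The Describe section must include the issue link, the RFC link, and the Chinese documentation link.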

paddle-bot-old · 2022-03-18T08:58:27Z

The format inspection passed. Your PR will be reviewed by experts of Paddle and developers from the open-source community. Stay tuned.

liqitong-a · 2022-03-18T09:09:07Z

OK, it has been fixed.

zhiboniu · 2022-03-21T03:56:20Z

python/paddle/tensor/linalg.py

+
+def corrcoef(x, rowvar=True, ddof=False, name=None):
+    """
+    Return Pearson product-moment correlation coefficients.


Please don't copy the numpy documentation verbatim; try writing it yourself and see whether you can make it clearer.

zhiboniu · 2022-03-21T03:58:16Z

python/paddle/tensor/linalg.py

+
+        x(Tensor): A N-D(N<=2) Tensor containing multiple variables and observations. By default, each row of x represents a variable. Also see rowvar below.
+        rowvar(Bool, optional): If rowvar is True (default), then each row represents a variable, with observations in the columns. Default: True
+        ddof(Bool, optional): Has no effect, do not use.


If it is already deprecated, why keep it at all?

zhiboniu · 2022-03-21T03:58:28Z

python/paddle/tensor/linalg.py

+
+    """
+
+    if ddof is not False:


zhiboniu · 2022-03-21T03:59:37Z

python/paddle/tensor/linalg.py

+        warnings.warn('ddof have no effect and are deprecated',
+                      DeprecationWarning)
+    c = cov(x, rowvar)
+    try:


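It is better to handle the different cases explicitly in the code rather than with try; otherwise other errors become hard to notice.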
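The suggestion above can be illustrated with a minimal sketch. It uses numpy rather than Paddle, and `corrcoef_sketch` is a hypothetical name, not the code that was merged in this PR. The point it shows is that the single-variable case is detected with an explicit dimension check instead of wrapping the normalization in try, so unrelated errors still surface.

```python
import numpy as np


def corrcoef_sketch(x, rowvar=True):
    """Pearson correlation from the covariance matrix, with explicit case handling."""
    x = np.asarray(x, dtype=np.float64)
    if x.ndim > 2:
        # Reject unsupported inputs up front instead of catching a later failure.
        raise ValueError("x must have at most 2 dimensions")

    c = np.cov(x, rowvar=rowvar)

    if c.ndim == 0:
        # A single variable: the covariance is a scalar, so the
        # correlation of the variable with itself is c / c.
        return c / c

    # General case: divide each entry C[i, j] by sqrt(C[i, i] * C[j, j]).
    d = np.sqrt(np.diag(c))
    c = c / d[:, None]
    c = c / d[None, :]
    # Rounding can push values slightly outside [-1, 1].
    return np.clip(c, -1.0, 1.0)


x = np.array([[1.0, 2.0, 4.0, 7.0],
              [2.0, 1.0, 0.5, 3.0],
              [0.0, 1.0, 1.0, 5.0]])
result = corrcoef_sketch(x)
```

Because each branch is chosen by inspecting the shape, an exception raised inside the function points at a real problem rather than being silently treated as the scalar case.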

zhiboniu · 2022-03-21T04:01:27Z

python/paddle/fluid/tests/unittests/test_corr.py

+        self.shape = [20, 10]
+
+    def test_tensor_corr_default(self):
+        typelist = ['float64']


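If complex numbers are supported, please add tests for complex types.

A question: numpy's cov supports complex numbers but Paddle's cov does not, and corrcoef is built on top of cov. Does Paddle's corrcoef need to support complex numbers? If so, does cov need to be changed to support them as well?

I took a look: complex support in the basic operations is still incomplete, so the complex tests can be left out for now.

Should fp32 be added to typelist?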

paddle-bot-old · 2022-03-24T02:35:33Z

There’s the latest feedback about your PR. Please check.

zhiboniu · 2022-03-24T08:27:45Z

python/paddle/fluid/tests/unittests/test_corr.py

@@ -0,0 +1,116 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.


zhiboniu · 2022-03-24T08:33:59Z

python/paddle/fluid/tests/unittests/test_corr.py

+        self.assertRaises(ValueError, test_err)
+
+
+class Corr_Test4(unittest.TestCase):


Please add a comment to each test class, and add test cases for unsupported data types.

I've made the changes; please take a look.

zhwesky2010 · 2022-04-19T10:45:53Z

python/paddle/fluid/tests/unittests/test_corr.py

+                tensor = paddle.to_tensor(np_arr, place=p)
+                corr = paddle.linalg.corrcoef(tensor)
+                np_corr = numpy_corr(np_arr, rowvar=True, dtype=dtype)
+                self.assertTrue(np.allclose(np_corr, corr.numpy(), atol=1.e-2))


I suggest writing separate unit tests for float64 and float32. Their precision differs considerably, so it is better to set the thresholds separately: leave rtol unchanged for float64, and for float32 set rtol=1e-4 or 1e-5 and see whether CI passes.

I tried it: locally, float32 passes with 1e-5, but it fails when run on submission.

That should not happen in principle. Your local setup is probably just one environment, whereas CI tests many combinations of the mkl/openblas/GPU libraries on Linux/mac/windows, and some case may fail. Which test failed, specifically?

This is the one that fails when float32 is used: self.assertTrue(np.allclose(np_corr, corr.numpy(), atol=1.e-2))

I meant which pipeline failed.

The one that failed is PR-CI-Static-Check.

If 1e-5 doesn't work, 1e-4 is fine too. Please resubmit.

I'll try again.

zhwesky2010 · 2022-04-25T03:19:59Z

python/paddle/fluid/tests/unittests/test_corr.py

+        self.shape = [20, 10]
+
+    def test_tensor_corr_default(self):
+        typelist = ['float64', 'float32']


It still doesn't pass. float32 precision is indeed somewhat worse, so just test float64; please find the smallest atol that works.

OK, I'll adjust it.

zhwesky2010

Hello, please don't set atol for float64, set 1e-5 for float32, test them separately, and submit again.

liqitong-a · 2022-04-26T11:56:42Z

float64 passes in my tests, but float32 fails with anything below 1e-3.

zhwesky2010 · 2022-04-26T14:59:46Z

Just submit according to this standard; we will handle it afterwards.

liqitong-a · 2022-04-26T15:03:09Z

Do you mean testing float32 and float64 in two separate files?

zhwesky2010 · 2022-04-27T02:58:00Z

Just use two separate assertTrue calls.
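As a rough illustration of checking the two dtypes with separate tolerances, the sketch below uses plain numpy in place of paddle.linalg.corrcoef. The helper `corr_in_dtype` and the input shape are assumptions made for this example, not code from the PR, and the tolerances that pass here say nothing about what passes in Paddle's CI.

```python
import numpy as np


def corr_in_dtype(x, dtype):
    """Compute the correlation matrix entirely in the given floating dtype."""
    x = np.asarray(x, dtype=dtype)
    centered = x - x.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / (x.shape[1] - 1)
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)


rng = np.random.default_rng(0)
data = rng.random((4, 10))          # 4 variables, 10 observations each
reference = np.corrcoef(data)       # float64 reference result

corr64 = corr_in_dtype(data, np.float64)
corr32 = corr_in_dtype(data, np.float32)

# Two independent checks, one per dtype, each with its own threshold.
ok64 = np.allclose(reference, corr64)              # default tolerances
ok32 = np.allclose(reference, corr32, atol=1e-5)   # looser absolute tolerance
```

In a unittest.TestCase, `ok64` and `ok32` would each go into their own self.assertTrue call, so a float32 failure is reported separately from a float64 one.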

zhwesky2010 · 2022-04-27T11:28:02Z

@liqitong-a The code style check did not pass.

liqitong-a · 2022-05-05T03:56:49Z

Is it OK like this?

Ligoml

LGTM for docs

luotao1 · 2022-05-05T07:28:08Z

According to the PR-CI-Coverage history at https://xly.bce.baidu.com/paddlepaddle/paddle/newipipe/builds/6512?module=github/PaddlePaddle/Paddle&pipeline=PR-CI-Coverage&branch=pull/40690(develop) the unit test sometimes times out; consider reducing the input data dimensions to avoid this.

liqitong-a · 2022-05-06T03:56:06Z

I've changed it, and it works now.

XiaoguangHu01 · 2022-05-07T06:39:20Z

python/paddle/tensor/linalg.py

@@ -3181,3 +3182,72 @@ def lstsq(x, y, rcond=None, driver=None, name=None):
        singular_values = paddle.static.data(name='singular_values', shape=[0])

    return solution, residuals, rank, singular_values
+
+
+def corrcoef(x, rowvar=True, name=None):


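Compared with numpy, the interface lacks the y parameter. What is the reason, and will y be added later?

Computing corrcoef first requires computing cov, and Paddle's cov was also written without a y parameter, unlike numpy. Whether to add it later depends on whether cov gains a y parameter.

Source code:

Example:

The reason is that y only supplements the elements of x; when it is finally used it is concatenated with x, so the result is the same as passing a single, already concatenated x.
For the sake of simplicity, y is also somewhat redundant, so it will probably not be added in the future.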
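The equivalence described above can be checked directly in numpy. This is a small sketch with made-up data, assuming the default rowvar=True so that each row is a variable; it is not taken from the PR.

```python
import numpy as np

# Two variables in x and one extra variable in y, four observations each.
x = np.array([[1.0, 2.0, 3.0, 5.0],
              [2.0, 1.0, 4.0, 3.0]])
y = np.array([[0.5, 1.5, 1.0, 4.0]])

# numpy's two-argument form ...
with_y = np.corrcoef(x, y)
# ... versus stacking y under x first and passing a single array.
stacked = np.corrcoef(np.concatenate([x, y], axis=0))

same = np.allclose(with_y, stacked)
```

Both calls return a 3x3 matrix over the same three variables, which is why a caller without a y parameter can concatenate the inputs beforehand.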

XiaoguangHu01

LG API

liqitong-a added 2 commits March 17, 2022 23:57

corrcoef commit

001a5f1

corrcoef commit

45a53eb

liqitong-a mentioned this pull request Mar 18, 2022

【PaddlePaddle Hackathon Phase 2】Task Overview #40234

Closed

Ligoml assigned zhiboniu Mar 18, 2022

Ligoml added contributor External developers status: proposed labels Mar 18, 2022

liqitong-a changed the title ~~【Hackathon No.3】corrcoef~~ 【PaddlePaddle Hackathon 2】3. Add the corrcoef (Pearson product-moment correlation coefficient) API to Paddle Mar 18, 2022

Ligoml added the status: open review label Mar 18, 2022

paddle-bot-old bot removed the status: proposed label Mar 18, 2022

zhiboniu reviewed Mar 21, 2022


liqitong-a added 3 commits March 22, 2022 15:51

Merge branch 'PaddlePaddle:develop' into corrcoef

d2aa0fb

Update test_corr.py

bb5c04d

Update linalg.py

6d88d9a

Ligoml requested a review from zhiboniu March 24, 2022 02:34

Ligoml added the status: revision label Mar 24, 2022

paddle-bot-old bot removed the status: open review label Mar 24, 2022

zhiboniu reviewed Mar 24, 2022


liqitong-a added 9 commits March 25, 2022 14:52

Merge branch 'PaddlePaddle:develop' into corrcoef

2c1cfe8

Merge branch 'PaddlePaddle:develop' into corrcoef

1cf83e8

Update test_corr.py

2d654fd

Merge branch 'PaddlePaddle:develop' into corrcoef

e5b66e9

Update test_corr.py

6efaca6

Merge branch 'PaddlePaddle:develop' into corrcoef

53ee671

Merge branch 'PaddlePaddle:develop' into corrcoef

9a74568

Update test_corr.py

fe98fcd

Update test_corr.py

f39ad07

zhwesky2010 reviewed Apr 19, 2022


Update test_corr.py

decb986

liqitong-a dismissed Ligoml’s stale review via decb986 April 24, 2022 08:09

Merge branch 'PaddlePaddle:develop' into corrcoef

189a29f

zhwesky2010 reviewed Apr 25, 2022


zhwesky2010 reviewed Apr 26, 2022


liqitong-a added 2 commits April 27, 2022 13:21

Update test_corr.py

69064d2

Merge branch 'PaddlePaddle:develop' into corrcoef

7c3b09d

liqitong-a added 4 commits April 27, 2022 19:42

Update test_corr.py

7c5efcf

Merge branch 'PaddlePaddle:develop' into corrcoef

e97d91c

Update test_corr.py

4fca073

Merge branch 'PaddlePaddle:develop' into corrcoef

51da5d6

Ligoml previously approved these changes May 5, 2022


Update test_corr.py

2255acf

liqitong-a dismissed Ligoml’s stale review via 2255acf May 5, 2022 07:49

Merge branch 'PaddlePaddle:develop' into corrcoef

9cbac57

jeff41404 approved these changes May 7, 2022


Ligoml approved these changes May 7, 2022


XiaoguangHu01 reviewed May 7, 2022


XiaoguangHu01 approved these changes May 7, 2022


jeff41404 merged commit 95a502a into PaddlePaddle:develop May 9, 2022

pangyoki mentioned this pull request Jan 29, 2023

paddle.linalg.corrcoef produces an asymmetric result #50048

Closed
