
【Hackathon 6th No.1】Add AdaptiveLogSoftmaxWithLoss API to Paddle -part #63302

Merged · 30 commits · May 23, 2024

Conversation

Chen-Lun-Hao (Contributor)

PR Category

Others

PR Types

New features

Description

Add the AdaptiveLogSoftmaxWithLoss API.

Link

RFC PR: PaddlePaddle/community#856


paddle-bot bot commented Apr 8, 2024

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

@paddle-bot added the contributor (External developers) label on Apr 8, 2024
@Chen-Lun-Hao (Contributor, Author)

@luotao1 @GGBond8488
[screenshot]
Does this `y` also need to be approved? The two issues are the same.

from paddle.nn import functional as F


class TestNNAdaptiveLogSoftmaxWithLossAPI(unittest.TestCase):
Contributor

For the tests, I suggest adding a realistic AdaptiveLogSoftmax usage scenario: build a small but complete network around AdaptiveLogSoftmax, optimizer included, then verify both its output and that its own weights actually get updated.
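A minimal sketch of the kind of test being requested, assuming the layer is constructed as `nn.AdaptiveLogSoftmaxWithLoss(in_features, n_classes, cutoffs)`, exposes a `head_weight` parameter, and returns an `(output, loss)` pair from `forward`, as the docstrings later in this PR describe; the model and names here are illustrative, not the PR's actual test code:

```python
import paddle
import paddle.nn as nn


class TinyModel(nn.Layer):
    """Small but complete network ending in AdaptiveLogSoftmaxWithLoss."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(16, 8)
        # 20 classes: shortlist [0, 5), tail clusters [5, 10) and [10, 20).
        self.asfm = nn.AdaptiveLogSoftmaxWithLoss(
            in_features=8, n_classes=20, cutoffs=[5, 10]
        )

    def forward(self, x, label):
        return self.asfm(self.fc(x), label)  # (output, loss)


model = TinyModel()
opt = paddle.optimizer.SGD(learning_rate=0.1, parameters=model.parameters())

x = paddle.randn([4, 16])
label = paddle.randint(0, 20, [4])

# `head_weight` is assumed to be the layer's head parameter name.
before = model.asfm.head_weight.clone()  # snapshot before the step
output, loss = model(x, label)
loss.backward()
opt.step()
opt.clear_grad()

# The adaptive softmax's own weights should have been updated.
assert not paddle.allclose(model.asfm.head_weight, before)
```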

Contributor Author

No problem. Is there anything else that needs adding?

Contributor Author

I've added it back; please review.

Contributor

What I'm hoping for here is a fairly complete model definition whose forward pass uses AdaptiveLogSoftmax; then actually run that model and check that the data it produces is correct.

Contributor Author

Let me make sure I understand.

Contributor Author

> What I'm hoping for here is a fairly complete model definition whose forward pass uses AdaptiveLogSoftmax; then actually run that model and check that the data it produces is correct.

On that approach, the result doesn't change: the input to AdaptiveLogSoftmaxWithLoss just passes through other model layers first, but the result stays the same. This part confuses me.

the index `n_classes - 1`. To compute log-probabilities for all classes, the ``log_prob`` method can be used.
"""

def __init__(
Contributor

If `weight` isn't passed in here, how does one explicitly specify the initialization?
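For reference, the standard Paddle way to control initialization without passing a weight tensor is `paddle.ParamAttr` with an explicit initializer, which is what the `weight_attr` argument added later in this PR exposes; a sketch (the constructor signature and the `weight_attr` name are taken from the PR discussion, not verified against the final code):

```python
import paddle
import paddle.nn as nn

# Explicitly specify initialization via weight_attr instead of a weight tensor.
weight_attr = paddle.ParamAttr(initializer=paddle.nn.initializer.XavierNormal())
layer = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=8, n_classes=20, cutoffs=[5, 10], weight_attr=weight_attr
)
```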

Contributor Author

This has been changed.

Contributor

If this changed, remember to update the RFC design document as well, so the parameters stay consistent on both sides.
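Separately, the docstring excerpt above mentions a ``log_prob`` method for computing log-probabilities over all classes; a hedged usage sketch (same assumed constructor as above):

```python
import paddle
import paddle.nn as nn

layer = nn.AdaptiveLogSoftmaxWithLoss(in_features=8, n_classes=20, cutoffs=[5, 10])
x = paddle.randn([4, 8])

# log_prob computes log-probabilities over all n_classes for each sample.
log_probs = layer.log_prob(x)      # shape: [4, 20]
pred = log_probs.argmax(axis=-1)   # most likely class per sample
```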


@Chen-Lun-Hao force-pushed the softmax branch 6 times, most recently from 4c5a6d8 to 0d427ce on April 22, 2024 03:13
@Chen-Lun-Hao force-pushed the softmax branch 4 times, most recently from b2dcf9d to 9ea3e14 on April 23, 2024 01:35
@luotao1 (Contributor) commented May 11, 2024

This needs to pass CI.

@Chen-Lun-Hao (Contributor, Author)

> This needs to pass CI.

None of the reported errors are in the part I wrote.

@Chen-Lun-Hao (Contributor, Author)

Is the environment broken? @luotao1

@luotao1 (Contributor) commented May 11, 2024

PR-CI-Codestyle-Check means there is a problem with the submitted code.

> Is the environment broken?

You can rerun CI later, or merge develop to re-trigger it.

@Chen-Lun-Hao (Contributor, Author) commented May 16, 2024

> PR-CI-Codestyle-Check means there is a problem with the submitted code.
> You can rerun CI later, or merge develop to re-trigger it.

Doesn't Paddle support multiplying bool data with float32 data?
[screenshot of the CI error]
@luotao1
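For context on the error in the screenshot: elementwise multiplication between a bool tensor and a float32 tensor is typically rejected rather than implicitly promoted, so the usual fix is an explicit cast; a small sketch:

```python
import paddle

mask = paddle.to_tensor([True, False, True])
values = paddle.to_tensor([1.0, 2.0, 3.0], dtype="float32")

# Cast the bool mask to float32 before multiplying to avoid a dtype error.
result = values * mask.astype("float32")
print(result.numpy())  # [1. 0. 3.]
```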

@Chen-Lun-Hao (Contributor, Author)

It's fixed now. @jeff41404

jeff41404 previously approved these changes May 20, 2024

@jeff41404 (Contributor) left a comment

LGTM

):
r"""Compute adaptive logsoftmax result and negative log likelihood between ``input`` and ``label``.
Parameter ``head``, ``tail_weights``, ``cutoffs`` are inner members of AdaptiveLogSoftmaxWithLoss
Please refer to :ref:`_cn_api_paddle_nn_AdaptiveLogSoftmaxWithLoss`.
Contributor

Suggested change
Please refer to :ref:`_cn_api_paddle_nn_AdaptiveLogSoftmaxWithLoss`.
Please refer to :ref:`api_paddle_nn_AdaptiveLogSoftmaxWithLoss`.

Comment on lines 4307 to 4308
output (Tensor): The tensor sotring adaptive logsoftmax result, the shape of output is [N]
loss (Tensor): The tensor variable storing the adaptive_log_softmax_loss of input and label.
Contributor

Suggested change
output (Tensor): The tensor sotring adaptive logsoftmax result, the shape of output is [N]
loss (Tensor): The tensor variable storing the adaptive_log_softmax_loss of input and label.
- output (Tensor). The tensor sotring adaptive logsoftmax result, the shape of output is [N]
- loss (Tensor). The tensor variable storing the adaptive_log_softmax_loss of input and label.


Examples::
Contributor

Suggested change
Examples::
Examples:

Args:
input (Tensor): Input tensor, the data type should be float32 or float64.
label (Tensor): Label tensor, the data type should be float32 or float64.
head_weight (Tensor): weight tensor for linear computation, the data type should be float32 or float64, the shape should be [input.shape[1], shortlist_size + n_clusters], where shortlist_size is the first element in the cutoffs list, and n_clusters is the length of the cutoffs list minus 1.
Contributor

Suggested change
head_weight (Tensor): weight tensor for linear computation, the data type should be float32 or float64, the shape should be [input.shape[1], shortlist_size + n_clusters], where shortlist_size is the first element in the cutoffs list, and n_clusters is the length of the cutoffs list minus 1.
head_weight (Tensor): weight tensor for linear computation, the data type should be float32 or float64, the shape should be ``[input.shape[1], shortlist_size + n_clusters]``, where ``shortlist_size is`` the first element in the cutoffs list, and ``n_clusters`` is the length of the cutoffs list minus 1.

Please try to make this render a bit more nicely on the docs site; right now it's all squeezed together.

tail_weights (list[Tensor]): weight tensor list for linear computation, the data type should be float32 or float64. The number of elements in the tail_weights depends on the value of the n_clusters, and each element contains the weights of two linear layers, their dimensions are [input.shape[1], hsz] and [hsz, osz], where hsz is the number of input features in_features divided by div_value to the power (i + 1), where i is the cyclic variable, from 0 to n_clusters - 1, and osz is the (i + 1) The difference between the cutoff and the ith cutoff.
Contributor

Suggested change
tail_weights (list[Tensor]): weight tensor list for linear computation, the data type should be float32 or float64. The number of elements in the tail_weights depends on the value of the n_clusters, and each element contains the weights of two linear layers, their dimensions are [input.shape[1], hsz] and [hsz, osz], where hsz is the number of input features in_features divided by div_value to the power (i + 1), where i is the cyclic variable, from 0 to n_clusters - 1, and osz is the (i + 1) The difference between the cutoff and the ith cutoff.
tail_weights (list[Tensor]): weight tensor list for linear computation, the data type should be float32 or float64. The number of elements in the tail_weights depends on the value of the n_clusters, and each element contains the weights of two linear layers, their dimensions are ``[input.shape[1], hsz]`` and ``[hsz, osz]``, where ``hsz`` is the number of input features in_features divided by div_value to the power (i + 1), where i is the cyclic variable, from 0 to n_clusters - 1, and ``osz`` is the (i + 1) The difference between the cutoff and the ith cutoff.
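To make the tail_weights shape rule above concrete, a short sketch that just evaluates ``hsz`` and ``osz`` per cluster from the description (plain arithmetic; the numbers are illustrative):

```python
in_features = 64
n_classes = 20
cutoffs = [5, 10]      # shortlist [0, 5); tail clusters [5, 10) and [10, 20)
div_value = 4.0

boundaries = cutoffs + [n_classes]
for i in range(len(cutoffs)):               # i = 0 .. n_clusters - 1
    hsz = int(in_features // div_value ** (i + 1))
    osz = boundaries[i + 1] - boundaries[i]
    # Cluster i uses weight shapes [in_features, hsz] and [hsz, osz].
    print(f"cluster {i}: [{in_features}, {hsz}], [{hsz}, {osz}]")
# cluster 0: [64, 16], [16, 5]
# cluster 1: [64, 4], [4, 10]
```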

class AdaptiveLogSoftmaxWithLoss(Layer):
r"""Adaptive softmax is an approximate strategy for training models with large output spaces. It is most effective when
the label distribution is highly imbalanced, for example in natural language modelling, where the word frequency
distribution approximately follows the ``Zipf's law``.
Contributor

Since this follows the PyTorch docs, copy them in full: attach a link for Zipf's law.

Suggested change
distribution approximately follows the ``Zipf's law``.
distribution approximately follows the `Zipf's law <https://en.wikipedia.org/wiki/Zipf%27s_law>`_ .
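As a usage illustration for this class, a minimal forward-pass sketch, again assuming the `(in_features, n_classes, cutoffs)` constructor and the `(output, loss)` return described in the shape section below:

```python
import paddle
import paddle.nn as nn

paddle.seed(2024)
asfm = nn.AdaptiveLogSoftmaxWithLoss(in_features=8, n_classes=20, cutoffs=[5, 10])

x = paddle.randn([4, 8])               # [N, in_features]
label = paddle.randint(0, 20, [4])     # [N]

output, loss = asfm(x, label)          # output: [4], loss: scalar
```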

weight_attr (ParamAttr, optional): The attribute for the learnable
weight of this layer. The default value is None. If the Initializer of the
param_attr is not set, the parameter is initialized with Xavier.
For detailed information, please refer to paddle.ParamAttr.
Contributor

Suggested change
For detailed information, please refer to paddle.ParamAttr.
For detailed information, please refer to :ref:`api_paddle_ParamAttr`

of this layer. If it is set to False, no bias will be added to the output.
If it is set to None or one kind of ParamAttr, a bias parameter will
be created according to ParamAttr. For detailed information, please refer
to paddle.ParamAttr. The default value is None and the bias will be
Contributor

Suggested change
to paddle.ParamAttr. The default value is None and the bias will be
to :ref:`api_paddle_ParamAttr`. The default value is None and the bias will be

- input (Tensor): The input tensor. The shapes is [N, in_features]. N is batch size.
- label (Tensor): target. The shapes is `[N]`
- output1 (Tensor): The shape is `[N]`
- output2 (Scalar):
Contributor

Suggested change
- output2 (Scalar):
- output2 (Scalar).

Returns:
A callable object of AdaptiveLogSoftmaxWithLoss.

Examples::
Contributor

Suggested change
Examples::
Examples:

Contributor Author

All of them have been fixed.

@Chen-Lun-Hao (Contributor, Author)

The PR is already implemented; could it not be closed for now? @luotao1

@luotao1 (Contributor) commented May 22, 2024

> The PR is already implemented; could it not be closed for now?

Close what? This PR hasn't been closed.

@Chen-Lun-Hao (Contributor, Author)

> Close what? This PR hasn't been closed.

OK, got it.

@sunzhongkai588 (Contributor) left a comment

LGTM

@luotao1 changed the title from 【Hackathon 6th No.1】Add AdaptiveLogSoftmaxWithLoss API to Paddle to 【Hackathon 6th No.1】Add AdaptiveLogSoftmaxWithLoss API to Paddle -part on May 23, 2024
@luotao1 merged commit d0e08a8 into PaddlePaddle:develop on May 23, 2024
31 checks passed
co63oc pushed a commit to co63oc/Paddle that referenced this pull request May 23, 2024
【Hackathon 6th No.1】Add AdaptiveLogSoftmaxWithLoss API to Paddle -part (PaddlePaddle#63302)

* Add AdaptiveLogSoftmaxWithLoss API

* update codestyle

* update loss

* test

* update test

* add weight_attr

* update forward

* update forward

* update

* update

* update

* update test_gard

* update

* update information

* update

* update

* codestyle

* update

* update

* update

* update
chen2016013 pushed a commit to chen2016013/Paddle that referenced this pull request May 26, 2024