【PIR API adaptor No.143、144】 Migrate margin_cross_entropy、masked_multihead_attention #58762
Conversation
Sorry to inform you that b0fc204's CIs have passed more than 7 days ago. To prevent PR conflicts, please re-run all CIs manually.
Sorry, since this PR was not linked in issue #58067, I kept forgetting to review it 😭 I'll review it today. 健飞, do you still have time to push this PR toward merging?~
LGTM
Please merge the latest branch to resolve the conflicts~
Updated~
nice work~
@0x45f @YuanRisheng I ran into a problem, could you help take a look? Running the unit tests locally fails; the failing case is `TestLayerNormStaticInt8Op` in `test/legacy_test/test_masked_multihead_attention_op.py`:

```
--- Running PIR pass [inplace_pass]
I1221 07:52:11.319885 676665 pass.cc:38] --- detected [0] subgraphs!
I1221 07:52:13.244624 676665 pir_interpreter.cc:1264] New Executor is Running ...
W1221 07:52:13.245051 676665 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.0, Runtime API Version: 11.8
W1221 07:52:13.279644 676665 gpu_resources.cc:164] device: 0, cuDNN Version: 8.6.

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::framework::StandaloneExecutor::Run(std::vector<std::string, std::allocator<std::string > > const&, bool)
1   paddle::framework::InterpreterCore::Run(std::vector<std::string, std::allocator<std::string > > const&, std::vector<phi::DenseTensor, std::allocator<phi::DenseTensor> > const&, bool, bool)
2   paddle::framework::PirInterpreter::Run(std::vector<std::string, std::allocator<std::string > > const&, std::vector<phi::DenseTensor, std::allocator<phi::DenseTensor> > const&, bool, bool)
3   paddle::framework::PirInterpreter::BuildInstruction()
4   paddle::framework::PhiKernelInstruction::PhiKernelInstruction(unsigned long, phi::Place const&, pir::Operation*, paddle::framework::ValueExecutionInfo const*)
5   void paddle::framework::BuildPhiContext<phi::InferMetaContext, phi::MetaTensor, phi::MetaTensor, paddle::small_vector<phi::MetaTensor, 15u>, paddle::small_vector<phi::MetaTensor, 15u>, false>(pir::Operation*, paddle::framework::ValueExecutionInfo const&, paddle::dialect::OpYamlInfoParser const&, phi::InferMetaContext*)
6   phi::DenseTensor const& paddle::framework::Variable::Get<phi::DenseTensor>() const

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
[TimeInfo: *** Aborted at 1703145133 (unix time) try "date -d @1703145133" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0x0) received by PID 676665 (TID 0x7f332e345740) from PID 0 ***]
```

Reproduction environment: CUDA Version: 12.0

Reproduction code: `bugReproduce.py`
```python
# bugReproduce.py
import numpy as np
import paddle
from paddle.framework import core
from paddle.incubate.nn.functional import masked_multihead_attention
from paddle.pir_utils import test_with_pir_api

np.random.seed(0)
bsz = 2
cache_bsz = 2
num_head = 32
dim_head = 128
beam_size = 1
max_seq_len = 33
sequence_length = 32
x = np.random.uniform(-0.05, 0.05, [bsz, 3, num_head, dim_head])
bias = np.random.uniform(-0.05, 0.05, [3, num_head, dim_head])
src_mask = np.zeros([bsz, 1, 1, sequence_length + 1])
cum_offsets = None
sequence_lengths = None
rotary_tensor = None
beam_cache_offset = None
cache_kv_out = np.random.uniform(
    -0.05,
    0.05,
    [2, cache_bsz, num_head, sequence_length, dim_head],
)
numpy_ones = np.zeros([2, cache_bsz, num_head, 1, dim_head])
cache_kv_mmha_out = np.concatenate((cache_kv_out, numpy_ones), axis=3)
qkv_out_scale = None
out_shift = None
out_smooth = None
seq_len = 1
rotary_emb_dims = 0
use_neox_rotary_style = False
out_scale = -1
quant_round_type = 1
quant_max_bound = 127
quant_min_bound = -127
place = paddle.CUDAPlace(0)


def check_main(
    x,
    bias,
    src_mask,
    cache_kv_out,
    cache_kv_mmha_out,
    qkv_out_scale,
    out_scale,
    dtype,
):
    paddle.enable_static()
    with paddle.static.program_guard(paddle.static.Program()):
        x_static = paddle.static.data(
            name="x_static",
            shape=[bsz, 3 * num_head * dim_head],
            dtype=dtype,
        )
        bias_static = paddle.static.data(
            name="bias_static",
            shape=[3, num_head, dim_head],
            dtype=dtype,
        )
        src_mask_static = paddle.static.data(
            name="src_mask_static",
            shape=[bsz, 1, 1, sequence_length + 1],
            dtype=dtype,
        )
        cache_kv_mmha_out_static = paddle.static.data(
            name="cache_kv_mmha_out_static",
            shape=[2, cache_bsz, num_head, sequence_length + 1, dim_head],
            dtype=dtype,
        )
        outs = masked_multihead_attention(
            x_static,
            cache_kv_mmha_out_static,
            bias_static,
            src_mask_static,
            None,
            None,
            None,
            None,
            None,
            None,
            None,
            32,
            0,
            False,
            "fp16",
            -1,
            1,
            127.0,
            -127.0,
        )
        exe = paddle.static.Executor(place)
        exe.run(
            feed={
                "x_static": x.reshape(bsz, -1).astype(dtype),
                "cache_kv_mmha_out_static": cache_kv_mmha_out.astype(dtype),
                "bias_static": bias.astype(dtype),
                "src_mask_static": src_mask.astype(dtype),
            },
        )


with paddle.pir_utils.IrGuard():
    check_main(
        x,
        bias,
        src_mask,
        cache_kv_out,
        cache_kv_mmha_out,
        qkv_out_scale,
        out_scale,
        'float16',
    )
```

Troubleshooting: running `GLOG_v=8 python3 bugReproduce.py` produces the following error:

```
File "bugReproduce.py", line 123, in check_main
  exe.run(
File "/home/aistudio/Paddle/build-gpu/python/paddle/base/executor.py", line 1763, in run
  res = self._run_pir_impl(
File "/home/aistudio/Paddle/build-gpu/python/paddle/base/executor.py", line 2113, in _run_pir_impl
  ret = new_exe.run(list(feed.keys()), return_numpy)
File "/home/aistudio/Paddle/build-gpu/python/paddle/base/executor.py", line 828, in run
  tensors = self._new_exe.run(
RuntimeError: (PreconditionNotMet) var() should exist in var_name_2_id_ (at /home/aistudio/Paddle/paddle/fluid/framework/new_executor/pir_interpreter.cc:819)
```

This indicates that the PIR-mode executor is missing a runtime Variable.
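As a side note on the error mechanism: the `PreconditionNotMet` message is raised by a name-to-id lookup inside the PIR interpreter, which expects every variable referenced at `Run()` time to already be registered. A minimal sketch of that failure mode in plain Python (hypothetical names; the real logic is C++ in `pir_interpreter.cc`):

```python
# Rough illustration (plain Python, hypothetical names) of the precondition
# behind the error above: every variable name looked up at Run() time must
# already be registered in a name -> id map (var_name_2_id_ in the C++ code).
class VarNameToId:
    def __init__(self):
        self._name_to_id = {}

    def register(self, name):
        # Assign the next free id if the name is not yet registered.
        self._name_to_id.setdefault(name, len(self._name_to_id))

    def lookup(self, name):
        if name not in self._name_to_id:
            # Mirrors: (PreconditionNotMet) var() should exist in var_name_2_id_
            raise RuntimeError(
                f"(PreconditionNotMet) {name} should exist in var_name_2_id_"
            )
        return self._name_to_id[name]


registry = VarNameToId()
registry.register("x_static")
print(registry.lookup("x_static"))  # a feed var that was registered resolves fine
```

A feed variable that was never registered (as happened here in PIR mode) trips the lookup and surfaces as the `RuntimeError` shown in the traceback.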
The issue mentioned above has been fixed in a separate PR; once #60269 is merged, we will move forward with merging this PR~
Please merge the latest branch~ 🥳
LGTM
PR types
Others
PR changes
Others
Description
PIR API full-coverage upgrade:
- Migrated `paddle.nn.functional.margin_cross_entropy` to PIR and updated its unit tests (coverage: 6/6).
- Migrated `paddle.incubate.nn.functional.masked_multihead_attention` to PIR and updated its unit tests (coverage: 2/2).