mindnlp 0.4: memory keeps growing during DeepSeek LoRA fine-tuning #2221

@TJHsiao-Ni

Describe the bug (Mandatory)
Running the DeepSeek LoRA fine-tuning script deepseek-r1-distill-qwen-1.5b-lora.py with the mindnlp 0.4 branch on an Orange Pi AIpro 20T, memory usage keeps increasing throughout training.
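
To quantify the growth described above, host memory could be sampled at regular step intervals. The following is only a minimal sketch, not part of the original script: `rss_mib` and `growth_per_step` are hypothetical helper names, the loop is a stand-in for the real training loop, and it measures the resident set size of the Python process only (not device memory).

```python
import os
import resource
import sys


def rss_mib():
    """Return this process's resident set size in MiB."""
    try:
        # Linux: the second field of /proc/self/statm is resident pages.
        with open("/proc/self/statm") as f:
            pages = int(f.read().split()[1])
        return pages * os.sysconf("SC_PAGE_SIZE") / 2**20
    except OSError:
        # Fallback: peak RSS (reported in KiB on Linux, bytes on macOS).
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        return peak / 2**20 if sys.platform == "darwin" else peak / 2**10


def growth_per_step(samples):
    """Average RSS growth in MiB per step from (step, rss_mib) samples."""
    if len(samples) < 2 or samples[-1][0] == samples[0][0]:
        return 0.0
    (s0, m0), (s1, m1) = samples[0], samples[-1]
    return (m1 - m0) / (s1 - s0)


samples = []
for step in range(0, 50, 10):  # stand-in for the real training loop
    # ... run one training step here (assumption: not shown) ...
    samples.append((step, rss_mib()))

print(f"average growth: {growth_per_step(samples):.3f} MiB/step")
```

A consistently positive slope over many steps would support the report of continuous growth; a slope that flattens after warm-up would suggest caching rather than a leak.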

  • Hardware Environment (Ascend/GPU/CPU):
    Ascend310B

  • Software Environment (Mandatory):
    -- MindSpore version: 2.6.0
    -- CANN: 8.1.rc1
    -- Python version: 3.9.0
    -- OS platform and distribution: Linux Ubuntu 22.04

  • Execute Mode (Mandatory) (PyNative/Graph):

Please delete the mode not involved:
/mode pynative

To be tested on a newer version of MindSpore.

Labels: bug (Something isn't working)