[squeezenet] [Ascend910] [GRAPH] Unable to reproduce precision #715

Closed
787918582 opened this issue Jul 31, 2023 · 0 comments
Labels
bug Something isn't working

787918582 commented Jul 31, 2023

If this is your first time, please read our contributor guidelines:
https://github.com/mindspore-lab/mindcv/blob/main/CONTRIBUTING.md

Describe the bug (Mandatory)
squeezenet_1_0 and squeezenet_1_1 show abnormal accuracy when validation is run during training.

  • Hardware Environment (Ascend/GPU/CPU):

Please delete the backend not involved:
/device ascend

  • Software Environment (Mandatory):
    -- MindSpore version (e.g., 1.7.0.Bxxx): mindspore_v2.0.0 mindcv_0.2.2
    -- Python version (e.g., Python 3.7.5): 3.7.5
    -- OS platform and distribution (e.g., Linux Ubuntu 16.04): EulerOS2.8
    -- GCC/Compiler version (if compiled from source): 7.3.0

  • Execute Mode (Mandatory) (PyNative/Graph):

Please delete the mode not involved:
/mode graph

To Reproduce (Mandatory)
Steps to reproduce the behavior:

  1. mpirun --allow-run-as-root -n 8 python train.py --config configs/squeezenet/squeezenet_1.0_ascend.yaml --distribute True --data_dir /ImageNet_Origin/

Expected behavior (Mandatory)
The target accuracy can be reproduced.

Screenshots / Logs (Mandatory)
[2023-07-11 13:53:06] mindcv.utils.callbacks INFO - Epoch: [195/200], batch: [5004/5004], loss: 6.907755, lr: 0.000154, time: 97.860354s
[2023-07-11 13:53:11] mindcv.utils.callbacks INFO - Validation Top_1_Accuracy: 0.1000%, Top_5_Accuracy: 0.5000%, time: 4.961876s
[2023-07-11 13:53:11] mindcv.utils.callbacks INFO - Saving model to ./ckpt/squeezenet1_0-195_5004.ckpt
[2023-07-11 13:53:11] mindcv.utils.callbacks INFO - Total time since last epoch: 102.965617(train: 97.866245, val: 4.961876)s, ETA: 514.828086s
[2023-07-11 13:53:11] mindcv.utils.callbacks INFO - --------------------------------------------------------------------------------
[2023-07-11 13:54:49] mindcv.utils.callbacks INFO - Epoch: [196/200], batch: [5004/5004], loss: 6.907755, lr: 0.000099, time: 98.291733s
[2023-07-11 13:54:54] mindcv.utils.callbacks INFO - Validation Top_1_Accuracy: 0.1000%, Top_5_Accuracy: 0.5000%, time: 4.957407s
[2023-07-11 13:54:54] mindcv.utils.callbacks INFO - Saving model to ./ckpt/squeezenet1_0-196_5004.ckpt
[2023-07-11 13:54:54] mindcv.utils.callbacks INFO - Total time since last epoch: 103.394972(train: 98.297709, val: 4.957407)s, ETA: 413.579886s
[2023-07-11 13:54:54] mindcv.utils.callbacks INFO - --------------------------------------------------------------------------------
[2023-07-11 13:56:34] mindcv.utils.callbacks INFO - Epoch: [197/200], batch: [5004/5004], loss: 6.907755, lr: 0.000056, time: 99.201918s
[2023-07-11 13:56:38] mindcv.utils.callbacks INFO - Validation Top_1_Accuracy: 0.1000%, Top_5_Accuracy: 0.5000%, time: 4.923371s
[2023-07-11 13:56:39] mindcv.utils.callbacks INFO - Saving model to ./ckpt/squeezenet1_0-197_5004.ckpt
[2023-07-11 13:56:39] mindcv.utils.callbacks INFO - Total time since last epoch: 104.277067(train: 99.208602, val: 4.923371)s, ETA: 312.831202s
[2023-07-11 13:56:39] mindcv.utils.callbacks INFO - --------------------------------------------------------------------------------
[2023-07-11 13:58:17] mindcv.utils.callbacks INFO - Epoch: [198/200], batch: [5004/5004], loss: 6.907755, lr: 0.000025, time: 98.811944s
[2023-07-11 13:58:22] mindcv.utils.callbacks INFO - Validation Top_1_Accuracy: 0.1000%, Top_5_Accuracy: 0.5000%, time: 4.935712s
[2023-07-11 13:58:22] mindcv.utils.callbacks INFO - Saving model to ./ckpt/squeezenet1_0-198_5004.ckpt
[2023-07-11 13:58:23] mindcv.utils.callbacks INFO - Total time since last epoch: 103.892637(train: 98.817379, val: 4.935712)s, ETA: 207.785274s
[2023-07-11 13:58:23] mindcv.utils.callbacks INFO - --------------------------------------------------------------------------------
[2023-07-11 14:00:01] mindcv.utils.callbacks INFO - Epoch: [199/200], batch: [5004/5004], loss: 6.907755, lr: 0.000006, time: 98.633017s
[2023-07-11 14:00:06] mindcv.utils.callbacks INFO - Validation Top_1_Accuracy: 0.1000%, Top_5_Accuracy: 0.5000%, time: 4.964054s
[2023-07-11 14:00:06] mindcv.utils.callbacks INFO - Saving model to ./ckpt/squeezenet1_0-199_5004.ckpt
[2023-07-11 14:00:06] mindcv.utils.callbacks INFO - Total time since last epoch: 103.742033(train: 98.637943, val: 4.964054)s, ETA: 103.742033s
[2023-07-11 14:00:06] mindcv.utils.callbacks INFO - --------------------------------------------------------------------------------
[2023-07-11 14:01:45] mindcv.utils.callbacks INFO - Epoch: [200/200], batch: [5004/5004], loss: 6.907755, lr: 0.000000, time: 99.083169s
[2023-07-11 14:01:50] mindcv.utils.callbacks INFO - Validation Top_1_Accuracy: 0.1000%, Top_5_Accuracy: 0.5000%, time: 4.841324s
[2023-07-11 14:01:50] mindcv.utils.callbacks INFO - Saving model to ./ckpt/squeezenet1_0-200_5004.ckpt
[2023-07-11 14:01:50] mindcv.utils.callbacks INFO - Total time since last epoch: 104.066254(train: 99.087659, val: 4.841324)s, ETA: 0.000000s
[2023-07-11 14:01:50] mindcv.utils.callbacks INFO - --------------------------------------------------------------------------------
[2023-07-11 14:01:50] mindcv.utils.callbacks INFO - Finish training!
[2023-07-11 14:01:50] mindcv.utils.callbacks INFO - The best validation Top_1_Accuracy is: 0.1000% at epoch 1.
[2023-07-11 14:01:51] mindcv.utils.callbacks INFO - ================================================================================
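
A note on reading the log above: the loss is pinned at 6.907755, which matches ln(1000), and the Top-1/Top-5 accuracies of 0.1% and 0.5% match 1/1000 and 5/1000. Those are the values a 1000-class classifier with uniform outputs would produce, which suggests (but does not prove) that the network has collapsed to chance level rather than merely falling short of the target. The sketch below checks this arithmetic against two lines copied from the log. `NUM_CLASSES = 1000` is an assumption inferred from the ImageNet data directory, and `parse_metrics` and `is_chance_level` are hypothetical helpers written for illustration, not part of mindcv.

```python
import math
import re

# Assumption: ImageNet-1k (1000 classes), inferred from --data_dir /ImageNet_Origin/.
NUM_CLASSES = 1000

# Two lines copied verbatim from the final epoch of the log above.
LOG = """\
[2023-07-11 14:01:45] mindcv.utils.callbacks INFO - Epoch: [200/200], batch: [5004/5004], loss: 6.907755, lr: 0.000000, time: 99.083169s
[2023-07-11 14:01:50] mindcv.utils.callbacks INFO - Validation Top_1_Accuracy: 0.1000%, Top_5_Accuracy: 0.5000%, time: 4.841324s
"""


def parse_metrics(log_text):
    """Extract (loss, top1 %, top5 %) from the first matching log entries."""
    loss = float(re.search(r"loss: ([\d.]+)", log_text).group(1))
    top1 = float(re.search(r"Top_1_Accuracy: ([\d.]+)%", log_text).group(1))
    top5 = float(re.search(r"Top_5_Accuracy: ([\d.]+)%", log_text).group(1))
    return loss, top1, top5


def is_chance_level(loss, top1, top5, num_classes=NUM_CLASSES, tol=1e-4):
    """Return True if the metrics match a uniform-output classifier."""
    chance_loss = math.log(num_classes)   # cross-entropy of a uniform prediction
    chance_top1 = 100.0 / num_classes     # percent
    chance_top5 = 500.0 / num_classes     # percent
    return (
        abs(loss - chance_loss) < tol
        and abs(top1 - chance_top1) < tol
        and abs(top5 - chance_top5) < tol
    )


loss, top1, top5 = parse_metrics(LOG)
print("chance level:", is_chance_level(loss, top1, top5))
```

The same check can be pointed at an earlier epoch of a full log to see when the loss first settles at ln(1000), which may help narrow down where training diverged.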

Additional context (Optional)
Add any other context about the problem here.

@787918582 787918582 added the bug Something isn't working label Jul 31, 2023
@IASZHT IASZHT closed this as completed Jul 24, 2024