StreamContext #6129

Merged
merged 15 commits into from
Sep 4, 2021

liujuncheng
Collaborator

No description provided.

@liujuncheng liujuncheng changed the title [WIP]StreamContext StreamContext Sep 3, 2021
@liujuncheng liujuncheng requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 3, 2021 07:39
@oneflow-ci-bot oneflow-ci-bot removed their request for review September 3, 2021 08:10
@liujuncheng liujuncheng requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 3, 2021 09:52
@oneflow-ci-bot oneflow-ci-bot removed their request for review September 3, 2021 12:17
@liujuncheng liujuncheng requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 3, 2021 13:40
@github-actions
Contributor

github-actions bot commented Sep 3, 2021

Speed stats:
GPU Name: GeForce GTX 1080 

OneFlow resnet50 time: 128.7ms (= 6435.9ms / 50, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 143.5ms (= 7174.4ms / 50, input_shape=[16, 3, 224, 224])
Relative speed: 1.11 (= 143.5ms / 128.7ms)

OneFlow resnet50 time: 74.9ms (= 3743.9ms / 50, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 84.1ms (= 4203.4ms / 50, input_shape=[8, 3, 224, 224])
Relative speed: 1.12 (= 84.1ms / 74.9ms)

OneFlow resnet50 time: 48.6ms (= 2427.9ms / 50, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 63.4ms (= 3170.5ms / 50, input_shape=[4, 3, 224, 224])
Relative speed: 1.31 (= 63.4ms / 48.6ms)

OneFlow resnet50 time: 42.3ms (= 2112.6ms / 50, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 47.2ms (= 2361.6ms / 50, input_shape=[2, 3, 224, 224])
Relative speed: 1.12 (= 47.2ms / 42.3ms)

OneFlow resnet50 time: 38.1ms (= 1906.9ms / 50, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 41.0ms (= 2048.1ms / 50, input_shape=[1, 3, 224, 224])
Relative speed: 1.07 (= 41.0ms / 38.1ms)

OneFlow resnet50 time: 139.7ms (= 6986.6ms / 50, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 159.1ms (= 7953.6ms / 50, input_shape=[16, 3, 224, 224], ddp, world size=2)
Relative speed: 1.14 (= 159.1ms / 139.7ms)

OneFlow resnet50 time: 92.1ms (= 4607.2ms / 50, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 109.5ms (= 5477.2ms / 50, input_shape=[8, 3, 224, 224], ddp, world size=2)
Relative speed: 1.19 (= 109.5ms / 92.1ms)

OneFlow resnet50 time: 70.2ms (= 3511.6ms / 50, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.1ms (= 3957.5ms / 50, input_shape=[4, 3, 224, 224], ddp, world size=2)
Relative speed: 1.13 (= 79.1ms / 70.2ms)

OneFlow resnet50 time: 64.3ms (= 3213.6ms / 50, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 72.2ms (= 3609.7ms / 50, input_shape=[2, 3, 224, 224], ddp, world size=2)
Relative speed: 1.12 (= 72.2ms / 64.3ms)

OneFlow resnet50 time: 61.6ms (= 3077.7ms / 50, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 67.1ms (= 3354.5ms / 50, input_shape=[1, 3, 224, 224], ddp, world size=2)
Relative speed: 1.09 (= 67.1ms / 61.6ms)

@oneflow-ci-bot oneflow-ci-bot removed their request for review September 3, 2021 15:11
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 3, 2021 15:46
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 3, 2021 17:14
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 3, 2021 18:19
@github-actions
Contributor

github-actions bot commented Sep 3, 2021

CI failed, removing label automerge

@github-actions github-actions bot removed the automerge label Sep 3, 2021
@oneflow-ci-bot oneflow-ci-bot removed their request for review September 3, 2021 20:21
@github-actions
Contributor

github-actions bot commented Sep 4, 2021

Speed stats:
GPU Name: GeForce GTX 1080 

OneFlow resnet50 time: 128.7ms (= 6433.7ms / 50, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 141.4ms (= 7067.9ms / 50, input_shape=[16, 3, 224, 224])
Relative speed: 1.10 (= 141.4ms / 128.7ms)

OneFlow resnet50 time: 74.7ms (= 3737.5ms / 50, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 82.7ms (= 4134.7ms / 50, input_shape=[8, 3, 224, 224])
Relative speed: 1.11 (= 82.7ms / 74.7ms)

OneFlow resnet50 time: 48.3ms (= 2415.5ms / 50, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 58.9ms (= 2945.7ms / 50, input_shape=[4, 3, 224, 224])
Relative speed: 1.22 (= 58.9ms / 48.3ms)

OneFlow resnet50 time: 41.7ms (= 2083.4ms / 50, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 48.3ms (= 2412.6ms / 50, input_shape=[2, 3, 224, 224])
Relative speed: 1.16 (= 48.3ms / 41.7ms)

OneFlow resnet50 time: 42.4ms (= 2117.8ms / 50, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 41.2ms (= 2062.1ms / 50, input_shape=[1, 3, 224, 224])
Relative speed: 0.97 (= 41.2ms / 42.4ms)

OneFlow resnet50 time: 140.4ms (= 7021.5ms / 50, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 159.2ms (= 7958.5ms / 50, input_shape=[16, 3, 224, 224], ddp, world size=2)
Relative speed: 1.13 (= 159.2ms / 140.4ms)

OneFlow resnet50 time: 92.1ms (= 4607.4ms / 50, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 104.8ms (= 5240.8ms / 50, input_shape=[8, 3, 224, 224], ddp, world size=2)
Relative speed: 1.14 (= 104.8ms / 92.1ms)

OneFlow resnet50 time: 69.2ms (= 3462.0ms / 50, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 74.2ms (= 3711.9ms / 50, input_shape=[4, 3, 224, 224], ddp, world size=2)
Relative speed: 1.07 (= 74.2ms / 69.2ms)

OneFlow resnet50 time: 60.2ms (= 3008.6ms / 50, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.5ms (= 3575.9ms / 50, input_shape=[2, 3, 224, 224], ddp, world size=2)
Relative speed: 1.19 (= 71.5ms / 60.2ms)

OneFlow resnet50 time: 68.9ms (= 3445.4ms / 50, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 57.7ms (= 2882.8ms / 50, input_shape=[1, 3, 224, 224], ddp, world size=2)
Relative speed: 0.84 (= 57.7ms / 68.9ms)
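For anyone checking the figures above by hand, the arithmetic can be reproduced with a minimal sketch like the one below. It assumes the report divides each total by the iteration count (50), rounds the per-iteration time to one decimal, and then divides the PyTorch time by the OneFlow time, so a value below 1 means OneFlow was slower in that run. The function names `PerIterMs`, `RoundTo`, `RelativeSpeed` and `FormatRelativeSpeed` are hypothetical and are not taken from the CI script.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Hypothetical helper: per-iteration time in ms from a total over `iters` runs.
double PerIterMs(double total_ms, int iters) { return total_ms / iters; }

// Hypothetical helper: round to `decimals` places, as the report does for display.
double RoundTo(double value, int decimals) {
  const double scale = std::pow(10.0, decimals);
  return std::round(value * scale) / scale;
}

// Relative speed as printed in the report: PyTorch time / OneFlow time.
// A result above 1 means OneFlow was faster; below 1 means it was slower.
double RelativeSpeed(double pytorch_ms, double oneflow_ms) {
  return pytorch_ms / oneflow_ms;
}

// Format the ratio with two decimals, matching the "Relative speed" lines.
std::string FormatRelativeSpeed(double pytorch_ms, double oneflow_ms) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%.2f", RelativeSpeed(pytorch_ms, oneflow_ms));
  return std::string(buf);
}
```

Feeding in the last ddp entry (OneFlow 3445.4ms and PyTorch 2882.8ms over 50 iterations) gives 68.9ms and 57.7ms per iteration, and `FormatRelativeSpeed(57.7, 68.9)` yields the 0.84 shown above.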

@oneflow-ci-bot oneflow-ci-bot removed their request for review September 4, 2021 02:49
@oneflow-ci-bot oneflow-ci-bot merged commit 480c6cd into master Sep 4, 2021
@oneflow-ci-bot oneflow-ci-bot deleted the dev_stream_context branch September 4, 2021 02:50
#endif

// cublas_pmd_handle
OF_CUBLAS_CHECK(cublasCreate(&cublas_pmd_handle_));
Contributor

On the instrusive branch, this line of code fails with:

Check failed: cublasCreate(&cublas_pmh_handle_) : CUBLAS_STATUS_NOT_INITIALIZED (1)

What is the likely cause?

Contributor

The stack trace is below; remember to scroll right to see all of it.

[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose] *** Check failure stack trace: ***
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcc9e2d79b3  google::LogMessage::Fail()
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcc9e2dc75b  google::LogMessage::SendToLog()
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcc9e2d76af  google::LogMessage::Flush()
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcc9e2d7edf  google::LogMessageFatal::~LogMessageFatal()
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcc9a3eb343  std::_Function_handler<>::_M_invoke()
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcc9a3ef186  oneflow::Thread::Thread()
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcc9a3f438d  oneflow::ThreadMgr::AddPlan()
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcc999c18c6  oneflow::Runtime::Runtime()
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcc9996c92a  oneflow::Oneflow::Init()
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcce9f125e6  oneflow::StartLazyGlobalSession()
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcce9f0f1c2  StartLazyGlobalSession()
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcce9dbd447  _ZZN8pybind1112cpp_function10initializeIRPFvvEvJEJNS_4nameENS_5scopeENS_7siblingEEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE_8__invokeESL_
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x7fcce979be8f  pybind11::cpp_function::dispatcher()
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c6f78ee  cfunction_call_varargs
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c6ec75f  _PyObject_MakeTpCall
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c795fd7  _PyEval_EvalFrameDefault
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c788667  _PyFunction_Vectorcall
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c791070  _PyEval_EvalFrameDefault
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c788667  _PyFunction_Vectorcall
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c791070  _PyEval_EvalFrameDefault
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c787480  _PyEval_EvalCodeWithName
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c788a44  _PyFunction_Vectorcall
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c6f287d  PyObject_Call
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c792abe  _PyEval_EvalFrameDefault
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c787f72  _PyEval_EvalCodeWithName
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c788a44  _PyFunction_Vectorcall
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c7912cb  _PyEval_EvalFrameDefault
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c787f72  _PyEval_EvalCodeWithName
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c788a44  _PyFunction_Vectorcall
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c6f287d  PyObject_Call
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c792abe  _PyEval_EvalFrameDefault
[/home/lixinqi/miniconda3/envs/oneflow-dev-clang10-v2/bin/python3 test/ops/test_broadcast_logical_ops.py --failfast --verbose]     @     0x564b5c788667  _PyFunction_Vectorcall

4 participants