Skip to content

Commit

Permalink
Dev randint refine (#5981)
Browse files Browse the repository at this point in the history
* disable backward pass consistent tensor meta check. (#5871)

* disable backward pass consistent tensor meta check.

* auto format by CI

Co-authored-by: binbinHan <han_binbin@163.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* ddp broadcast params and buffers (#5913)

* ddp broadcast params and buffers

Signed-off-by: daquexian <daquexian566@gmail.com>

* auto format by CI

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* add clang tidy target (#5957)

* add clang tidy target

* fix a bug

* refine

* refine

* reformat

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* cfg: add move assignment operator for performance (#5962)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* add zhangshen op-test (#5600)

* add some op-test

* fix dims_error in my branch

* Fix the bad backward kernel function by using 'cuda::atomic::Add' (#5614)

* Test `nn.AdaptiveAvgPoolXd` (#5615)

* Fix the bad backward kernel function by using 'cuda::atomic::Add'

* Support the 'NoneType' annotation

* Support objects of 'collections.abc.Iterable' as 'output_size'

* Test with all cases of 'output_size'

* Update adaptive_pool_gpu_kernel.cu

* Skip testing `nn.AdaptiveAvgPool3d` for the current PyTorch

* remove some useless test

* Format TODO

* Add the assertion messages for 'output_size'

* Reformat codes

* Remove raw tests for `flow.negative`

* Remove unnecessary codes and add the assertion messages

* Merge updates for 'generators.py' from master

* Remove unnecessary 'random()'

* Delete the separate test for `AvgPool2d`

* Fix import paths

* Fix import problems

* Remove the PyTorch import

* Denote the annotations for `tile` and `repeat` ops

* Add the test for `nn.AvgPool1d`

* Choose better generators for `nn.MaxPoolXd`

* Randomly choose `dilation` and default values

* auto format by CI

* Test more kwargs for `nn.AvgPoolXd`

* Add tests for `return_indices`

* auto format by CI

Co-authored-by: Tianyu Zhao <guikarist@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* fix wrong names (#5951)

* fix wrong names

* auto format by CI

* refine

* auto format by CI

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Enable more checkers for clang-tidy in CI (#5738)

* CI: enable more checkers for clang-tidy

* .clang-tidy: remove cppcoreguidelines-pro-type-vararg

* CI: remove duplicate checkers

* CI: remove clang-analyzer-alpha.deadcode.*

* .clang-tidy: add performance-*

* oneflow/core/eager: remove unnecessary malloc & free

* .clang-tidy: add clang-analyzer-cplusplus.* to werror

* user_kernel: remove useless move

* quantization_aware_training: fix move return

* .clang-tidy: add google-*

* CI: fix clang tidy command

* CI: fix test

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Feat grad mode classes (#5956)

* feat(no_grad): support no_grad decorator

* feat(AutogradMode): export flow.autograd_mode

* feat(GradMode): export some grad_mode class

* docs(GradMode): export documents

* refine

* docs(GradMode): export document for is_grad_enabled

* auto format by CI

* fix(GradMode): fix single client bug

* fix bug

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* extract_consistent_to_consistent_op_expr (#5870)

* abstract_consistent_to_consistent_op_expr

* fix compiler complaint

* refactor consistent-to-consistent eager consisitent op interpreter

* fix compiler complaint

* refactor ConsistentToConsistentOpExpr

* lazy interpreter (#5903)

* fix bugs about consistent_id

* refactor functional::ToConsistent

* refactor GetNdSbp

* Update eager_consistent_op_interpreter.cpp

* Update eager_mirrored_op_interpreter.cpp

* fix error

* fix error

* auto format by CI

* Update nd_sbp.h

* refine identity boxing

* fix sync checkmeta error

* avoid consistent id check in lazy

Co-authored-by: Xinqi Li <lixinqi0703106@163.com>
Co-authored-by: leaves-zwx <kunta0932@gmail.com>
Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* add CMAKE_INTERPROCEDURAL_OPTIMIZATION in fast cmake cache (#5970)

* add CMAKE_INTERPROCEDURAL_OPTIMIZATION in fast cmake cache

* skip test targets of re2

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* check: fix clang-tidy-diff commands (#5972)

* check: fix clang-tidy-diff commands

* CI: fix step names

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Cpu mpi (#5865)

* cuda base cpu mpi boxing

* cpu_mpi

* fix conflicts

* add cpu mpi unittests

* more checks and unittests

* abstract_consistent_to_consistent_op_expr

* fix compiler complaint

* refactor consistent-to-consistent eager consisitent op interpreter

* fix compiler complaint

* refactor ConsistentToConsistentOpExpr

* lazy interpreter (#5903)

* fix bugs about consistent_id

* more test_consistent_cast unittests

* refactor functional::ToConsistent

* refactor GetNdSbp

* fix compiler complaints

* refactor GetDevice4CurrentProcessCtx

* fix error

Co-authored-by: clackhan <han_binbin@163.com>
Co-authored-by: leaves-zwx <kunta0932@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* fix_bug_test_tensor_str (#5958)

* fix bug int test_tensor_str

* format

* fix comment

* fix bug to(cuda) is unavailable in cpu env

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* common/error: fix build error in mac (#5971)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Prevent running oneflow in forked subprocess (#5976)

* prevent_running_oneflow_in_forked_subprocess

* add line change

* IsFork => IsForkedSubProcess

* auto format by CI

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* refine randint

Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
Co-authored-by: binbinHan <han_binbin@163.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: daquexian <daquexian566@gmail.com>
Co-authored-by: Peihong Liu <mosout@qq.com>
Co-authored-by: Twice <i@twice.moe>
Co-authored-by: ZhangShen <55383772+zhangshen12356@users.noreply.github.com>
Co-authored-by: Tianyu Zhao <guikarist@gmail.com>
Co-authored-by: Luyang <flowingsun007@163.com>
Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
Co-authored-by: Xinqi Li <lixinqi0703106@163.com>
Co-authored-by: leaves-zwx <kunta0932@gmail.com>
Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
Co-authored-by: liufengwei0103 <2472937968@qq.com>
  • Loading branch information
16 people committed Aug 20, 2021
1 parent 038320c commit 671f4f5
Show file tree
Hide file tree
Showing 92 changed files with 1,943 additions and 2,377 deletions.
17 changes: 14 additions & 3 deletions .clang-tidy
Original file line number Diff line number Diff line change
@@ -1,4 +1,15 @@
# maybe-* checks are only available on OneFlow custom clang-tidy and clangd
Checks: '-*, maybe-*'
# `maybe-*` checks are only available on OneFlow custom clang-tidy and clangd
# `-allow-enabling-analyzer-alpha-checkers` should be passed to clang-tidy for CSA checkers named `clang-analyzer-alpha.*` (or `-allow-enabling-alpha-checkers` for run-clang-tidy.py)
# `aggressive-binary-operation-simplification` should be enabled (via `-Xclang -analyzer-config -Xclang aggressive-binary-operation-simplification=true` in clang)
# there is some problem in `clang-analyzer-alpha.clone.*`, so do not enable it
# `clang-analyzer-alpha.deadcode.*` is just too verbose to enable
Checks: '-*, maybe-*, clang-analyzer-core.*, clang-analyzer-cplusplus.*, clang-analyzer-nullability.*, clang-analyzer-deadcode.*, clang-analyzer-security.*, clang-analyzer-optin.cplusplus.*, clang-analyzer-optin.performance.*, clang-analyzer-alpha.core.*, clang-analyzer-alpha.cplusplus.*, clang-analyzer-alpha.security.*, cppcoreguidelines-avoid-goto, cppcoreguidelines-init-variables, cppcoreguidelines-interfaces-global-init, cppcoreguidelines-no-malloc, cppcoreguidelines-prefer-member-initializer, cppcoreguidelines-pro-type-member-init, cppcoreguidelines-pro-type-static-cast-downcast, cppcoreguidelines-slicing, cppcoreguidelines-special-member-functions, performance-unnecessary-value-param, performance-unnecessary-copy-initialization, performance-noexcept-move-constructor, performance-no-automatic-move, performance-move-const-arg, performance-implicit-conversion-in-loop, performance-for-range-copy, google-default-arguments, google-global-names-in-headers, google-explicit-constructor'
# TODO: treat all maybe warnings as errors when existing warnings are all fixed
WarningsAsErrors: 'maybe-unused'
WarningsAsErrors: 'maybe-unused, clang-analyzer-nullability.*, clang-analyzer-cplusplus.*, performance-implicit-conversion-in-loop, performance-move-const-arg, performance-no-automatic-move, performance-noexcept-move-constructor, google-default-arguments, google-global-names-in-headers'

CheckOptions:
# `cppcoreguidelines-special-member-functions` is enabled, refer to https://en.cppreference.com/w/cpp/language/rule_of_three
- key: cppcoreguidelines-special-member-functions.AllowSoleDefaultDtor
value: True
- key: performance-move-const-arg.CheckTriviallyCopyableMove
value: False
4 changes: 2 additions & 2 deletions .github/workflows/simple.yml
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ jobs:
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_TESTING=ON
cmake --build . -j$(nproc) --target of_git_version oneflow_deps generate_functional of_cfgobj generate_py_cfg
- name: Run Maybe-related checks by clang-tidy
- name: Run clang-tidy for all translation units
# use clang as compiler for correct compiler flags
run: |
cd build
Expand All @@ -62,7 +62,7 @@ jobs:
-DBUILD_TESTING=ON \
-DCMAKE_EXPORT_COMPILE_COMMANDS=ON
cd ..
./run-clang-tidy.py -clang-tidy-binary ./clang-tidy-489012f-x86_64.AppImage -p build -quiet
./run-clang-tidy.py -clang-tidy-binary ./clang-tidy-489012f-x86_64.AppImage -p build -quiet -allow-enabling-alpha-checkers -extra-arg="-Xclang" -extra-arg="-analyzer-config" -extra-arg="-Xclang" -extra-arg="aggressive-binary-operation-simplification=true" '^((?!third_party_install).)+(?<!cfg.cpp)(?<!pb.cc)$'
hosted:
name: CPU-only
Expand Down
4 changes: 2 additions & 2 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -749,7 +749,7 @@ jobs:
run: |
git remote add upstream https://github.com/Oneflow-Inc/oneflow
git fetch upstream
- name: Run Maybe-related checks by clang-tidy
- name: Run clang-tidy for modified files
# use clang as compiler for correct compiler flags
run: |
cd build
Expand All @@ -761,4 +761,4 @@ jobs:
-DBUILD_TESTING=ON \
-DCMAKE_EXPORT_COMPILE_COMMANDS=ON
cd ..
git diff -U0 ${{ github.event.pull_request.base.sha }} | ./clang-tidy-diff.py -clang-tidy-binary ./clang-tidy-489012f-x86_64.AppImage -path build -quiet -j $(nproc) -p1
git diff -U0 ${{ github.event.pull_request.base.sha }} | ./clang-tidy-diff.py -clang-tidy-binary ./clang-tidy-489012f-x86_64.AppImage -path build -allow-enabling-alpha-checkers -j $(nproc) -p1 -extra-arg="-Xclang" -extra-arg="-analyzer-config" -extra-arg="-Xclang" -extra-arg="aggressive-binary-operation-simplification=true"
1 change: 1 addition & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ if (NOT THIRD_PARTY AND NOT ONEFLOW)
endif()

option(USE_CLANG_FORMAT "" OFF)
option(USE_CLANG_TIDY "" OFF)
option(BUILD_RDMA "" OFF)
option(BUILD_CUDA "" ON)
option(BUILD_TESTING "" OFF)
Expand Down
97 changes: 97 additions & 0 deletions ci/check/run_clang_tidy.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
#!/usr/bin/env python2
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import asyncio
import argparse
import subprocess
import os


def split_and_print(prefix, text):
lines = text.decode().splitlines(keepends=True)
prefixed = ""
for l in lines:
prefixed += f"{prefix} {l.strip()}"
if l.strip():
print(prefixed, flush=True)


async def handle_stream(stream, cb):
while True:
line = await stream.readline()
if line:
cb(line)
else:
break


async def run_command(cmd=None, dry=False, name=None):
if dry:
print(f"[dry] {cmd}")
return 0
process = await asyncio.create_subprocess_shell(
cmd, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE,
)
l = lambda x: split_and_print(f"[{name}]" if name else "", x)
await asyncio.gather(
handle_stream(process.stdout, l), handle_stream(process.stderr, l),
)
await process.wait()
return process.returncode


def download(build_dir, dry=False):
urls = [
"https://github.com/Oneflow-Inc/llvm-project/releases/download/latest/clang-tidy-489012f-x86_64.AppImage"
if os.getenv("CI")
else "https://oneflow-static.oss-cn-beijing.aliyuncs.com/bin/clang-tidy/linux-x86_64/clang-tidy.AppImage",
"https://raw.githubusercontent.com/oneflow-inc/llvm-project/maybe/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py",
]
dst_dir = f"{build_dir}/cache/bin"
dst = [f"{dst_dir}/clang-tidy", f"{dst_dir}/clang-tidy-diff.py"]
if dry:
if os.path.isfile(dst[0]) and os.path.isfile(dst[1]):
return dst
else:
None
else:
assert subprocess.call(f"mkdir -p {dst_dir}", shell=True) == 0
for i, _dst in enumerate(dst):
assert subprocess.call(f"curl -L {urls[i]} -o {_dst}", shell=True) == 0
assert subprocess.call(f"chmod +x {_dst}", shell=True) == 0
return dst


if __name__ == "__main__":
parser = argparse.ArgumentParser(
description="Runs clang-tidy on all of the source files."
)
parser.add_argument(
"--build_dir", required=True,
)
args = parser.parse_args()
loop = asyncio.get_event_loop()
downloaded = download(args.build_dir, dry=True)
if downloaded is None:
downloaded = download(args.build_dir)
promises = [
run_command(
f"cd .. && git diff -U0 master | {downloaded[1]} -clang-tidy-binary {downloaded[0]} -path {args.build_dir} -j $(nproc) -p1 -allow-enabling-alpha-checkers -extra-arg=-Xclang -extra-arg=-analyzer-config -extra-arg=-Xclang -extra-arg=aggressive-binary-operation-simplification=true"
)
]
loop.run_until_complete(asyncio.gather(*promises))
1 change: 1 addition & 0 deletions cmake/caches/cn/fast/cpu.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,4 @@ set(CMAKE_BUILD_TYPE RelWithDebInfo CACHE STRING "")
set(CMAKE_GENERATOR Ninja CACHE STRING "")
set(CMAKE_C_COMPILER_LAUNCHER sccache CACHE STRING "")
set(CMAKE_CXX_COMPILER_LAUNCHER sccache CACHE STRING "")
set(CMAKE_INTERPROCEDURAL_OPTIMIZATION OFF CACHE BOOL "")
1 change: 1 addition & 0 deletions cmake/caches/cn/fast/cuda-61.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -8,3 +8,4 @@ set(CUDA_NVCC_GENCODES "arch=compute_61,code=sm_61" CACHE STRING "")
set(CMAKE_C_COMPILER_LAUNCHER sccache CACHE STRING "")
set(CMAKE_CXX_COMPILER_LAUNCHER sccache CACHE STRING "")
set(CMAKE_CUDA_COMPILER_LAUNCHER sccache CACHE STRING "")
set(CMAKE_INTERPROCEDURAL_OPTIMIZATION OFF CACHE BOOL "")
1 change: 1 addition & 0 deletions cmake/caches/cn/fast/cuda-75.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -8,3 +8,4 @@ set(CUDA_NVCC_GENCODES "arch=compute_75,code=sm_75" CACHE STRING "")
set(CMAKE_C_COMPILER_LAUNCHER sccache CACHE STRING "")
set(CMAKE_CXX_COMPILER_LAUNCHER sccache CACHE STRING "")
set(CMAKE_CUDA_COMPILER_LAUNCHER sccache CACHE STRING "")
set(CMAKE_INTERPROCEDURAL_OPTIMIZATION OFF CACHE BOOL "")
9 changes: 8 additions & 1 deletion cmake/oneflow.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -209,7 +209,11 @@ add_custom_target(of_format
COMMAND ${Python_EXECUTABLE} ${CMAKE_CURRENT_SOURCE_DIR}/ci/check/run_clang_format.py --source_dir ${CMAKE_CURRENT_SOURCE_DIR}/oneflow --fix --quiet
COMMAND ${Python_EXECUTABLE} ${CMAKE_CURRENT_SOURCE_DIR}/ci/check/run_py_format.py --source_dir ${CMAKE_CURRENT_SOURCE_DIR} --fix
)

# clang tidy
add_custom_target(of_tidy
COMMAND ${Python_EXECUTABLE} ${CMAKE_SOURCE_DIR}/ci/check/run_clang_tidy.py --build_dir ${CMAKE_BINARY_DIR}
DEPENDS of_git_version oneflow_deps generate_functional of_cfgobj generate_py_cfg
)
# generate version
set(OF_GIT_VERSION_DIR ${CMAKE_CURRENT_BINARY_DIR}/of_git_version)
set(OF_GIT_VERSION_FILE ${OF_GIT_VERSION_DIR}/version.cpp)
Expand Down Expand Up @@ -292,6 +296,9 @@ add_dependencies(of_ccobj of_git_version)
if (USE_CLANG_FORMAT)
add_dependencies(of_ccobj of_format)
endif()
if (USE_CLANG_TIDY)
add_dependencies(of_ccobj of_tidy)
endif()

target_link_libraries(of_ccobj of_protoobj of_cfgobj ${ONEFLOW_CUDA_LIBS} glog_imported)

Expand Down
1 change: 1 addition & 0 deletions cmake/third_party/re2.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -29,5 +29,6 @@ if (THIRD_PARTY)
-DCMAKE_INSTALL_PREFIX:PATH=${RE2_INSTALL_DIR}
-DCMAKE_INSTALL_LIBDIR:PATH=${RE2_LIBRARY_DIR}
-DCMAKE_POSITION_INDEPENDENT_CODE:BOOL=ON
-DRE2_BUILD_TESTING:BOOL=OFF
-DCMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE})
endif (THIRD_PARTY)
4 changes: 4 additions & 0 deletions docs/source/oneflow.rst
Original file line number Diff line number Diff line change
Expand Up @@ -123,5 +123,9 @@ oneflow
zeros,
zeros_like,
is_nonzero,
no_grad,
grad_enable,
inference_mode,
is_grad_enabled,

.. autofunction:: oneflow.data.load_mnist(train_batch_size=100, test_batch_size=100, data_format='NCHW')
4 changes: 2 additions & 2 deletions oneflow/api/python/autograd/autograd.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,7 @@ Maybe<one::TensorTuple> Backward(const one::TensorTuple& outputs, const one::Ten
bool retain_graph, bool create_graph) {
if (create_graph) { retain_graph = true; }
std::shared_ptr<one::TensorTuple> gradients = JUST(CheckAndInitOutGrads(outputs, out_grads));
JUST(one::GetThreadLocalAutogradEngine()->RunBackwardAndSaveGrads4LeafTensor(
JUST(one::GetThreadLocalAutogradEngine()->RunBackwardAndSaveGrads4LeafTensorIf(
outputs, *gradients, retain_graph, create_graph));
return std::make_shared<one::TensorTuple>(0);
}
Expand All @@ -86,7 +86,7 @@ Maybe<one::TensorTuple> Grad(const one::TensorTuple& outputs, const one::TensorT
[](const std::shared_ptr<one::Tensor>& tensor) { return tensor->requires_grad(); }))
<< "All input tensors `.requires_grad` should be true";
std::shared_ptr<one::TensorTuple> gradients = JUST(CheckAndInitOutGrads(outputs, out_grads));
return one::GetThreadLocalAutogradEngine()->RunBackwardAndReturnInputsTensorGrad(
return one::GetThreadLocalAutogradEngine()->RunBackwardAndReturnInputsTensorGradIf(
outputs, inputs, *gradients, retain_graph, create_graph);
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,11 +26,12 @@ namespace oneflow {
namespace autograd {

ONEFLOW_API_PYBIND11_MODULE("autograd", m) {
py::class_<NoGradGuard, std::shared_ptr<NoGradGuard>>(m, "no_grad")
.def(py::init([]() { return std::make_shared<NoGradGuard>(); }))
.def("__enter__", [](const NoGradGuard& no_grad_obj) {})
.def("__exit__", [](const NoGradGuard& no_grad_obj, const py::object& type,
py::class_<AutoGradMode, std::shared_ptr<AutoGradMode>>(m, "AutoGradMode")
.def(py::init([](bool mode) { return std::make_shared<AutoGradMode>(mode); }))
.def("__enter__", [](const AutoGradMode& no_grad_obj) {})
.def("__exit__", [](const AutoGradMode& no_grad_obj, const py::object& type,
const py::object& value, const py::object& traceback) {});
m.def("is_grad_enabled", &GradMode::is_enabled);
}

} // namespace autograd
Expand Down
11 changes: 5 additions & 6 deletions oneflow/api/python/framework/tensor.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ limitations under the License.
#include "oneflow/core/framework/tensor_method.h"
#include "oneflow/core/framework/device.h"
#include "oneflow/core/framework/stride.h"
#include "oneflow/core/framework/nd_sbp.h"
#include "oneflow/core/framework/py_distribute.h"
#include "oneflow/core/functional/value_types.h"
#include "oneflow/core/job/placement.cfg.h"
Expand Down Expand Up @@ -299,9 +300,8 @@ Maybe<Tensor> NewTensor(py::args args, py::kwargs kwargs, Symbol<DType> desired_
if (other_tensor->is_local()) {
if (placement) {
// LocalTensor -> ConsistentTensor
tensor = JUST(functional::ToConsistent(other_tensor, placement, sbp_tuple,
/* identity_grad */ false,
/* grad_sbp_parallels */ {}));
tensor =
JUST(functional::ToConsistent(other_tensor, placement, sbp_tuple, GetNoneSbpList()));
} else {
// LocalTensor -> LocalTensor
if (!device) { device = JUST(Device::New("cpu")); }
Expand All @@ -310,9 +310,8 @@ Maybe<Tensor> NewTensor(py::args args, py::kwargs kwargs, Symbol<DType> desired_
} else {
if (placement) {
// ConsistentTensor -> ConsistentTensor
tensor = JUST(functional::ToConsistent(other_tensor, placement, sbp_tuple,
/* identity_grad */ false,
/* grad_sbp_parallels */ {}));
tensor =
JUST(functional::ToConsistent(other_tensor, placement, sbp_tuple, GetNoneSbpList()));
} else {
// ConsistentTensor -> LocalTensor
tensor = JUST(functional::ConsistentToLocal(other_tensor));
Expand Down
49 changes: 1 addition & 48 deletions oneflow/api/python/symbol/placement_symbol.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -40,53 +40,6 @@ Maybe<Shape> MakeShape(const py::tuple& py_shape) {
return std::make_shared<Shape>(shape_dims);
}

std::string SerializePlacementSymbol2String(Symbol<ParallelDesc> placement) {
std::string device_type = placement->device_tag() == "gpu" ? "\"cuda\"" : "\"cpu\"";
std::vector<int64_t> sorted_node_ids;
HashMap<int64_t, std::vector<int64_t>> node_id2sorted_dev_phy_ids;
for (int64_t machine_id : placement->sorted_machine_ids()) {
int64_t node_id = GlobalProcessCtx::NodeId(machine_id);
if (!std::count(sorted_node_ids.begin(), sorted_node_ids.end(), node_id)) {
sorted_node_ids.push_back(node_id);
}
for (int64_t device_id : placement->sorted_dev_phy_ids(machine_id)) {
node_id2sorted_dev_phy_ids[node_id].push_back(device_id);
}
}
std::string machine_device_ids = "{";
int64_t node_idx = 0;
for (int64_t node_id : sorted_node_ids) {
std::string device_name = std::to_string(node_id) + " : [";
int64_t device_idx = 0;
for (int64_t device_id : node_id2sorted_dev_phy_ids.at(node_id)) {
device_name += std::to_string(device_id);
if (++device_idx != node_id2sorted_dev_phy_ids.at(node_id).size()) { device_name += ", "; }
}
device_name += "]";
if (++node_idx != sorted_node_ids.size()) { device_name += ", "; }
machine_device_ids += device_name;
}
machine_device_ids += "}";
std::string hierarchy = "(";
int32_t hierarchy_dim_idx = 0;
for (int64_t dim : placement->hierarchy()->dim_vec()) {
hierarchy += std::to_string(dim);
if (++hierarchy_dim_idx != placement->hierarchy()->dim_vec().size()) {
hierarchy += ", ";
} else if (placement->hierarchy()->dim_vec().size() == 1) {
hierarchy += ",";
}
}
hierarchy += ")";
std::string placement_str = "oneflow.placement(device_type=" + device_type
+ ", machine_device_ids=" + machine_device_ids
+ ", hierarchy=" + hierarchy + ")";
return placement_str;
}

auto* CachedSerializePlacementSymbol2String =
DECORATE(&SerializePlacementSymbol2String, ThreadLocal);

struct PlacementSymbolExportUtil {
static std::shared_ptr<ParallelDesc> ApiCreatePlacementSymbol(
int64_t symbol_id, const std::shared_ptr<cfg::ParallelConf>& symbol_conf) {
Expand Down Expand Up @@ -207,7 +160,7 @@ struct PlacementSymbolExportUtil {
}

static std::string PlacementSymbol2String(Symbol<ParallelDesc> placement) {
return CachedSerializePlacementSymbol2String(placement);
return *PlacementToString(placement).GetPtrOrThrow();
}

static Maybe<Symbol<ParallelDesc>> ReplacePlacementDeviceTag(Symbol<ParallelDesc> parallel_desc,
Expand Down
13 changes: 2 additions & 11 deletions oneflow/api/python/symbol/sbp_symbol.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ limitations under the License.
#include "oneflow/core/common/constant.h"
#include "oneflow/core/common/maybe.h"
#include "oneflow/core/common/symbol.h"
#include "oneflow/core/framework/nd_sbp.h"
#include "oneflow/core/job/sbp_parallel.cfg.h"
#include "oneflow/core/job/sbp_parallel.h"

Expand All @@ -30,17 +31,7 @@ namespace oneflow {
namespace {

std::string SbpParallelSymbolToString(const Symbol<cfg::SbpParallel>& sbp_sym) {
std::string sbp_str = "oneflow.sbp.";
if (sbp_sym->has_broadcast_parallel()) {
sbp_str += "broadcast";
} else if (sbp_sym->has_partial_sum_parallel()) {
sbp_str += "partial_sum";
} else if (sbp_sym->has_split_parallel()) {
sbp_str += "split(axis=" + std::to_string(sbp_sym->split_parallel().axis()) + ")";
} else {
UNIMPLEMENTED();
}
return sbp_str;
return *SbpToString(sbp_sym).GetPtrOrThrow();
}

Maybe<std::vector<Symbol<cfg::SbpParallel>>> MakeSplitSbpParallelList(int max_split_axis) {
Expand Down

0 comments on commit 671f4f5

Please sign in to comment.