
tensorsplit_op #7258

Merged: 8 commits into master, Jan 27, 2022
Conversation

@lcylcy (Contributor) commented Jan 14, 2022

@CLAassistant commented Jan 14, 2022

CLA assistant check
All committers have signed the CLA.

TensorSplitVecFunctor() = default;
Maybe<TensorTuple> operator()(const std::shared_ptr<one::Tensor>& input,
const std::vector<int32_t>& indices_or_sections,
const int32_t& dim) const {
Contributor comment: See the comment on the previous PR: #7275 (comment)

std::vector<int64_t> stop(ndim);
std::vector<int64_t> step(ndim, 1);
for (int32_t i = 0; i < ndim; i++) {
  stop[i] = input->shape()->At(i);
Contributor comment: one::Tensor has a simpler interface: input->dim(i)

Reference: https://github.com/Oneflow-Inc/oneflow/blob/master/oneflow/core/framework/tensor.h#L49

output[i] = JUST(Slice(input, start, stop, step));
start[pos_dim] = end_idx;
}
stop[pos_dim] = input->shape()->At(ndim-1);
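The functor above produces each output by slicing `[start, stop)` along the split dimension, advancing `start` to the previous split index each iteration. As a rough illustration of that index bookkeeping for the vector form of `tensor_split` (a sketch only; `tensor_split_bounds` is a hypothetical helper, not OneFlow code):

```python
def tensor_split_bounds(dim_size, indices):
    """(start, stop) pairs along the split dim for tensor_split(x, indices)."""
    # A list of n split indices yields n + 1 slices.
    bounds = [0] + list(indices) + [dim_size]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]

# Splitting a dimension of size 6 at indices (1, 3) gives three slices.
print(tensor_split_bounds(6, (1, 3)))  # [(0, 1), (1, 3), (3, 6)]
```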
@wyushun reviewed Jan 24, 2022
Comment on lines 1794 to 1795
const int32_t& indices_or_sections,
const int32_t& dim) const {
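For the integer form, `torch.tensor_split` (unlike `torch.split`) does not require the dimension size to be divisible by the section count: the first `size % n` chunks each take one extra element. A minimal sketch of that size computation (hypothetical helper name, not the PR's code):

```python
def tensor_split_sizes(dim_size, sections):
    """Chunk sizes along the split dim for tensor_split(x, sections)."""
    assert sections > 0, "indices_or_sections must be greater than 0"
    base, rem = divmod(dim_size, sections)
    # The first `rem` chunks take one extra element; the rest take `base`.
    return [base + 1] * rem + [base] * (sections - rem)

print(tensor_split_sizes(7, 3))  # [3, 2, 2]
```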

std::vector<int64_t> stop(ndim);
std::vector<int64_t> step(ndim, 1);
for (int32_t i = 0; i < ndim; i++) {
  stop[i] = input->shape()->At(i);

HsplitIntFunctor() = default;
Maybe<TensorTuple> operator()(const std::shared_ptr<one::Tensor>& input,
const int32_t& indices_or_sections) const {
int32_t ndim = input->shape()->NumAxes();
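`HsplitIntFunctor` needs `ndim` because `torch.hsplit` chooses its split axis from the tensor's rank: dimension 0 for 1-D tensors, dimension 1 otherwise. A hedged sketch of that dispatch (illustrative Python, not the functor itself):

```python
def hsplit_dim(ndim):
    """Axis torch.hsplit splits along, given the tensor rank."""
    if ndim < 1:
        raise ValueError("torch.hsplit requires a tensor with at least 1 dimension")
    # 1-D tensors are split along dim 0; everything else along dim 1.
    return 0 if ndim == 1 else 1

print(hsplit_dim(1), hsplit_dim(4))  # 0 1
```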
@wyushun reviewed Jan 24, 2022

public:
HsplitIntFunctor() = default;
Maybe<TensorTuple> operator()(const std::shared_ptr<one::Tensor>& input,
const int32_t& indices_or_sections) const {

HsplitVecFunctor() = default;
Maybe<TensorTuple> operator()(const std::shared_ptr<one::Tensor>& input,
const std::vector<int32_t>& indices_or_sections) const {
int32_t ndim = input->shape()->NumAxes();

public:
VsplitIntFunctor() = default;
Maybe<TensorTuple> operator()(const std::shared_ptr<one::Tensor>& input,
const int32_t& indices_or_sections) const {

int32_t ndim = input->shape()->NumAxes();
CHECK_OR_RETURN(ndim >= 2) << "torch.vsplit requires a tensor with at least 2 dimensions, but got a tensor with " << ndim << " dimensions!";
CHECK_OR_RETURN(indices_or_sections > 0) << "indices_or_sections must be greater than 0";
CHECK_OR_RETURN(input->shape()->At(0) % indices_or_sections == 0) << "torch.vsplit attempted to split along dimension " << 0
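The three checks above mirror `torch.vsplit`'s preconditions: rank at least 2, a positive section count, and divisibility of dimension 0. A compact Python restatement (a sketch only; `vsplit_check` is a hypothetical name):

```python
def vsplit_check(shape, sections):
    """Validate torch.vsplit-style preconditions; return the chunk size."""
    ndim = len(shape)
    if ndim < 2:
        raise ValueError(f"vsplit requires a tensor with at least 2 dimensions, got {ndim}")
    if sections <= 0:
        raise ValueError("indices_or_sections must be greater than 0")
    if shape[0] % sections != 0:
        raise ValueError(f"dimension 0 of size {shape[0]} is not divisible by {sections}")
    return shape[0] // sections

print(vsplit_check((4, 3), 2))  # 2
```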
Contributor comment: consider input->dim()

VsplitVecFunctor() = default;
Maybe<TensorTuple> operator()(const std::shared_ptr<one::Tensor>& input,
const std::vector<int32_t>& indices_or_sections) const {
int32_t ndim = input->shape()->NumAxes();
Contributor comment: consider input->ndim()

Comment on lines 22 to 51
class TestHsplitVec(flow.unittest.TestCase):
@autotest(check_graph=False)
def test_flow_hsplit_vec(test_case):
device = random_device()
x = random_pytorch_tensor(
ndim=4,
dim1=random(3, 6),
dim2=random(3, 6),
dim3=random(3, 6),
dim4=random(3, 6),
).to(device)
z = torch.hsplit(x, (1, 2))
return z[0]

class TestHsplitInt(flow.unittest.TestCase):
@autotest(check_graph=False)
def test_flow_hsplit_int(test_case):
device = random_device()
x = random_pytorch_tensor(
ndim=4,
dim1=random(3, 6),
dim2=random(3, 6),
dim3=random(3, 6),
dim4=random(3, 6),
).to(device)
split = random(1, 3).to(int)
z = torch.hsplit(x, split)
return z[0]


Contributor comment: Please refer to this: #7275 (comment)

Comment on lines 22 to 51
class TestTorchSplitVec(flow.unittest.TestCase):
@autotest(check_graph=False)
def test_flow_tensor_split_vec(test_case):
device = random_device()
x = random_pytorch_tensor(
ndim=4,
dim1=random(3, 6),
dim2=random(3, 6),
dim3=random(3, 6),
dim4=random(3, 6),
).to(device)
dim = random(-3, 3).to(int)
z = torch.tensor_split(x, (1, 2), dim)
return z[0]

class TestTorchSplitInt(flow.unittest.TestCase):
@autotest(check_graph=False)
def test_flow_tensor_split_int(test_case):
device = random_device()
x = random_pytorch_tensor(
ndim=4,
dim1=random(3, 6),
dim2=random(3, 6),
dim3=random(3, 6),
dim4=random(3, 6),
).to(device)
split = random(-3, 3).to(int)
dim = random(-3, 3).to(int)
z = torch.tensor_split(x, split, dim)
return z[0]
Contributor comment: Same as above.

Comment on lines 22 to 49
class TestVsplitVec(flow.unittest.TestCase):
@autotest(check_graph=False)
def test_flow_vsplit_vec(test_case):
device = random_device()
x = random_pytorch_tensor(
ndim=4,
dim1=random(3, 6),
dim2=random(3, 6),
dim3=random(3, 6),
dim4=random(3, 6),
).to(device)
z = torch.vsplit(x, (1, 2))
return z[0]

class TestVsplitInt(flow.unittest.TestCase):
@autotest(check_graph=False)
def test_flow_vsplit_int(test_case):
device = random_device()
x = random_pytorch_tensor(
ndim=4,
dim1=random(3, 6),
dim2=random(3, 6),
dim3=random(3, 6),
dim4=random(3, 6),
).to(device)
split = random(1, 3).to(int)
z = torch.vsplit(x, split)
return z[0]
Contributor comment: Same as above.

@wyushun left a comment

Review done. Nicely written! I left some comments, most of them the same as on the previous PR (implement as strided), so please address them at your discretion. I am approving directly to keep things moving; just make sure you test thoroughly yourself to guarantee correctness. @lcylcy

@lcylcy lcylcy changed the title first tensorsplit_op Jan 24, 2022
@lcylcy (Author) commented Jan 24, 2022

OK

@github-actions commented

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

@lcylcy lcylcy requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 27, 2022 02:26
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 27, 2022 03:52
@github-actions commented

CI failed when running job: cuda-module. PR label automerge has been removed

@oneflow-ci-bot oneflow-ci-bot removed their request for review January 27, 2022 05:12
@github-actions commented

Speed stats:
GPU Name: GeForce GTX 1080 

✔️ OneFlow resnet50 time: 136.6ms (= 13656.7ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 139.4ms (= 13943.8ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.02 (= 139.4ms / 136.6ms)

✔️ OneFlow resnet50 time: 78.6ms (= 7855.6ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 83.6ms (= 8357.3ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.06 (= 83.6ms / 78.6ms)

OneFlow resnet50 time: 52.9ms (= 10581.3ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 57.4ms (= 11486.2ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.09 (= 57.4ms / 52.9ms)

OneFlow resnet50 time: 41.7ms (= 8343.9ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 48.0ms (= 9597.2ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.15 (= 48.0ms / 41.7ms)

OneFlow resnet50 time: 40.9ms (= 8175.1ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 38.1ms (= 7614.3ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 0.93 (= 38.1ms / 40.9ms)

✔️ OneFlow resnet50 time: 148.9ms (= 14888.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 158.8ms (= 15884.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.07 (= 158.8ms / 148.9ms)

OneFlow resnet50 time: 90.0ms (= 9004.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.2ms (= 10117.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.12 (= 101.2ms / 90.0ms)

OneFlow resnet50 time: 65.7ms (= 13137.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 72.8ms (= 14555.0ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.11 (= 72.8ms / 65.7ms)

OneFlow resnet50 time: 52.6ms (= 10526.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 62.3ms (= 12454.1ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.18 (= 62.3ms / 52.6ms)

OneFlow resnet50 time: 57.4ms (= 11474.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 57.9ms (= 11583.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.01 (= 57.9ms / 57.4ms)

@oneflow-ci-bot oneflow-ci-bot removed their request for review January 27, 2022 07:48
@oneflow-ci-bot oneflow-ci-bot merged commit e11cd60 into master Jan 27, 2022
@oneflow-ci-bot oneflow-ci-bot deleted the lcy_tensorsplit branch January 27, 2022 07:50
5 participants