@AlexanderKalistratov
Collaborator

No description provided.

@AlexanderKalistratov
Collaborator Author

@Hardcode84

"deformable_convolution_sycl",
]

"""l2-norm calculation of n vectors
Contributor

Comments need to be added for this workload


return (
default_rng.random((batch, in_channels, in_height, in_width)).astype(dtype),
# np.ones((batch, in_channels, in_height, in_width)).astype(dtype),
Contributor

Please clean up the commented-out code.

@@ -0,0 +1,81 @@
# Copyright 2022 Intel Corp.
Contributor

Looks like this _numpy implementation is the same as the _numba_npr version. If so, we should not be adding this.

)
find_package(MKL CONFIG REQUIRED)

# target_include_directories(${py_module_name} PUBLIC ${Dpctl_INCLUDE_DIRS} $)
Contributor

Please remove the commented-out lines.

@@ -0,0 +1,297 @@
//==- _l2_norm_sycl.cpp - Python native extension of l2-norm ===//
Contributor

Update comments to reflect this workload

// SPDX - License - Identifier : Apache 2.0
///
/// \file
/// The files implements a SYCL-based Python native extension for the
Contributor

Update comments


auto offset_y = *get_ptr_5d(offset, k_height, k_width, 2, out_height, out_width, kh, kw, 1, h, w) + h*stride_y + (kh - k_h_m)*dilation_y - (pad_y - k_h_m);
auto offset_x = *get_ptr_5d(offset, k_height, k_width, 2, out_height, out_width, kh, kw, 0, h, w) + w*stride_x + (kw - k_w_m)*dilation_x - (pad_x - k_w_m);
// auto offset_y = h*stride_y + (kh - k_h_m)*dilation_y - (pad_y - k_h_m);
Contributor

Remove commented lines

auto _input = get_ptr_3d(input, in_channels, in_height, in_width, c, 0, 0);

_output[w] = bilinear(_input, in_height, in_width, offset_y, offset_x);
// _output[w] = offset_y;
Contributor

Remove line

+ " ========================"
)
if result.error_state == 0:
if result.error_state == enums.ErrorCodes.SUCCESS:
Contributor

This change does not appear to be part of this workload PR. Consider moving it to a separate PR.

@ZzEeKkAa
Contributor

ZzEeKkAa commented May 5, 2023

@AlexanderKalistratov thank you for your PR. Could you please update the implementation (@diptorupd said that you've improved it)? Please also get rid of unrelated comments, update the remaining comments to match the current implementation, and provide a PR description.

Hoping to see updates in the near future!

@AlexanderKalistratov
Collaborator Author

@ZzEeKkAa deformable convolution can't pass CI because it uses the SYCL implementation as its reference, and CI includes no-SYCL runs.
benchmark_runner.py also has to be modified; otherwise it simply hangs on no-SYCL runs.
https://github.com/IntelPython/dpbench/pull/168/files#diff-84ed6cf60ee660c2164668c4d4296513ccb0a6b01e124b2de6a35d6133653ad8R360

@Hardcode84
Contributor

The M size is too small compared to other workloads. I suggest moving L to M and making L even bigger.

@ZzEeKkAa
Contributor

ZzEeKkAa commented Jun 29, 2023

Thanks, @AlexanderKalistratov and @Hardcode84. Is it hard to add a NumPy implementation (using NumPy features)?

@ZzEeKkAa
Contributor

Are the MKL and TBB CMake files patched? If not, and compilation is failing, we need to update the workflow rather than ship copies of the CMake files. I guess dpnp or dpctl does this in its conda recipe, so we can refer to that.

@Hardcode84
Contributor

> Is it hard to add a NumPy implementation (using NumPy features)?

A NumPy version would be prohibitively slow even for the S size.
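
To make the cost concrete, here is a minimal sketch of what a straightforward, loop-based sampling step might look like. It is not the code from this PR: the function names `bilinear` and `deform_sample`, the single `stride`/`pad`/`dilation` value shared by both axes, and the kernel-centre definition `k_h_m = (k_height - 1) // 2` are all assumptions made for illustration. Only the offset formula and the offset layout `(k_height, k_width, 2, out_height, out_width)` with index 1 for y and index 0 for x follow the SYCL snippet quoted in the review above. The sketch uses only the standard library and plain nested lists.

```python
import math


def bilinear(img, y, x):
    """Bilinearly sample a 2-D grid at (y, x); taps outside the grid count as 0."""
    height, width = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    dy, dx = y - y0, x - x0
    total = 0.0
    for yy, wy in ((y0, 1.0 - dy), (y0 + 1, dy)):
        for xx, wx in ((x0, 1.0 - dx), (x0 + 1, dx)):
            if 0 <= yy < height and 0 <= xx < width:
                total += wy * wx * img[yy][xx]
    return total


def deform_sample(inp, offset, stride=1, pad=0, dilation=1):
    """Sample inp (C, H, W) at the positions described by offset.

    Returns nested lists shaped (C, KH, KW, OH, OW).
    """
    k_height, k_width = len(offset), len(offset[0])
    out_height, out_width = len(offset[0][0][0]), len(offset[0][0][0][0])
    # Assumed definition of the kernel centre (not shown in the review snippet).
    k_h_m, k_w_m = (k_height - 1) // 2, (k_width - 1) // 2
    out = []
    for channel in inp:
        per_channel = []
        for kh in range(k_height):
            per_kh = []
            for kw in range(k_width):
                plane = [[0.0] * out_width for _ in range(out_height)]
                for h in range(out_height):
                    for w in range(out_width):
                        # Same offset arithmetic as the quoted SYCL lines.
                        oy = (offset[kh][kw][1][h][w] + h * stride
                              + (kh - k_h_m) * dilation - (pad - k_h_m))
                        ox = (offset[kh][kw][0][h][w] + w * stride
                              + (kw - k_w_m) * dilation - (pad - k_w_m))
                        plane[h][w] = bilinear(channel, oy, ox)
                per_kh.append(plane)
            per_channel.append(per_kh)
        out.append(per_channel)
    return out
```

Every output element costs one Python-level call to `bilinear`, so the work grows as C × KH × KW × OH × OW interpreted iterations per image, before the convolution itself is even applied.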

@AlexanderKalistratov
Collaborator Author

@ZzEeKkAa

> Is it hard to add a NumPy implementation (using NumPy features)?

As Ivan said, a straightforward implementation would always time out, even on the S preset. I had one and then removed it.
It is probably possible to write a faster implementation, but it is quite tricky and has no practical value.

I'd say we need an option to exclude workloads from a run. Right now we have an option to list the workloads to run; we also need an option to list workloads NOT to run. Then we can simply exclude deformable convolution from no-SYCL runs.

> Are the MKL and TBB CMake files patched? If not, and compilation is failing, we need to update the workflow rather than ship copies of the CMake files. I guess dpnp or dpctl does this in its conda recipe, so we can refer to that.

They are patched. These CMake files were taken from the dpnp repo.
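
The exclusion option proposed in this comment could be sketched roughly as follows. This is only an illustration of the idea, not dpbench's actual interface: the function name `select_workloads` and its `include`/`exclude` parameters are hypothetical.

```python
def select_workloads(available, include=None, exclude=None):
    """Return the workloads to run, preserving the order of `available`.

    `include=None` means "run everything"; `exclude` is applied afterwards.
    """
    include_set = None if include is None else set(include)
    exclude_set = set(exclude or ())

    # Fail early on typos instead of silently running the wrong set.
    unknown = ((include_set or set()) | exclude_set) - set(available)
    if unknown:
        raise ValueError(f"unknown workloads: {sorted(unknown)}")

    return [
        name
        for name in available
        if (include_set is None or name in include_set)
        and name not in exclude_set
    ]
```

A no-SYCL run could then call something like `select_workloads(all_names, exclude=["deformable_convolution"])`, while the existing include-only behavior stays the default.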

@AlexanderKalistratov
Collaborator Author

We also need to handle the case where the reference implementation is not found more gracefully than we do now.
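
One possible shape for that handling, as a sketch under assumptions: the helper name `load_reference` is hypothetical, and how the runner actually locates reference implementations is not shown in this conversation. The idea is to return `None` and log a warning so the caller can skip validation instead of hanging or crashing.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def load_reference(module_name, func_name):
    """Return the reference callable, or None if it cannot be found."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # For example, a SYCL-only reference on a no-SYCL run.
        logger.warning("reference module %s is unavailable: %s", module_name, exc)
        return None

    func = getattr(module, func_name, None)
    if func is None:
        logger.warning("%s has no attribute %s", module_name, func_name)
    return func
```

The caller can then report the workload as "not validated" when the result is `None`, which keeps the rest of the run going.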

@adarshyoga
Contributor

Closing this PR. Please reopen if there are further updates.

@adarshyoga adarshyoga closed this Apr 26, 2024