ToDecibels GPU operator #1837
Conversation
Force-pushed from 6d0b7f1 to e82fa28
Force-pushed from e82fa28 to e4f19a0
auto in_view = view<const T, Dims>(input);
auto out_view = view<T, Dims>(output);

kernels::KernelContext ctx;
Can't we share the ctx between setup and run? I see that we set the stream twice for ctx, which looks error-prone.
Actually, we don't need the stream during Setup, so I'll remove it.
Regarding sharing the context, I think @mzient's intention was for KernelContext to be a small object that you create every time, rather than keeping it in the instance. Am I right, Michal?
@mzient - so what is the purpose of the context if we keep recreating it and sometimes it doesn't carry any state?
It's the context the kernel needs to run. We also pass it to the Setup function, but in most cases, if not all, it is not needed there.
We are also passing it to the Setup function
My reservation is that the ctx is different between setup and run, and I think the idea was to share state between these two functions.
Actually, it's mostly for the CUDA stream. If we decide to go for suballocators (rather than making them global), they would be passed this way, too. It's not for kernel state.
};

template <typename T, int Dims>
bool ToDecibelsImpl<T, Dims>::SetupImpl(std::vector<OutputDesc> &output_desc,
It's something I missed in the kernel review: do we need Dims as a compile-time argument at all? If we flatten the tensor in the kernel anyway, it can just as well be DynamicDimensions, which would save us some compile time.
Sounds good
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Force-pushed from f7b47e6 to c05eea3
#include "dali/pipeline/util/operator_impl_utils.h"

#define TO_DB_SUPPORTED_TYPES (float)
#define TO_DB_SUPPORTED_NDIMS (1, 2, 3, 4)
remove?
#include <string>
#include <vector>
#include "dali/core/common.h"
#include "dali/kernels/kernel_manager.h"
#include "dali/kernels/signal/decibel/to_decibels_args.h"
#include "dali/pipeline/operator/common.h"
#include "dali/pipeline/operator/operator.h"
#include "dali/pipeline/util/operator_impl_utils.h"

#define TO_DB_SUPPORTED_TYPES (float)
I wonder if we need this kind of indirection for one type... And couldn't we support doubles on CPU?
Force-pushed from c05eea3 to e2e2de2
CI MESSAGE: [1226058]: BUILD STARTED
CI MESSAGE: [1226058]: BUILD FAILED
…rnel Signed-off-by: Joaquin Anton <janton@nvidia.com>
Force-pushed from e2e2de2 to 59ccdb4
CI MESSAGE: [1226268]: BUILD STARTED
CI MESSAGE: [1226268]: BUILD PASSED
Why do we need this PR?
What happened in this PR?
Introduced the ToDecibels GPU operator, based on the ToDecibels kernel and the ReduceAll kernel (to calculate the maximum)
Removed some leftover OpImplBase that should have been removed in a previous commit
ToDecibels operator
The operator
Python tests added
Operator documentation already present from the CPU implementation
JIRA TASK: [DALI-1345]