-
Notifications
You must be signed in to change notification settings - Fork 605
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove CUDA headers and generate stubs in runtime #2420
Conversation
cd59368
to
645ab70
Compare
CI MESSAGE: [1749684]: BUILD STARTED |
CI MESSAGE: [1749684]: BUILD FAILED |
This pull request introduces 1 alert when merging 8ed0dabe24242d23b05c8d103e0b28f9eb4c7161 into 359a6a5 - view on LGTM.com new alerts:
|
dc49beb
to
77aa257
Compare
This pull request introduces 1 alert when merging 77aa257045fa8ad2eeb8d5c5eeab6da887cf011d into 359a6a5 - view on LGTM.com new alerts:
|
77aa257
to
df4e19f
Compare
CI MESSAGE: [1755428]: BUILD STARTED |
CI MESSAGE: [1755428]: BUILD FAILED |
df4e19f
to
f19fd54
Compare
CI MESSAGE: [1757622]: BUILD STARTED |
tools/stub_generator/stub_codegen.py
Outdated
raise Exception(str(diag)) | ||
|
||
args.output.write('#include <string>\n') | ||
args.output.write('#include <cuda.h>\n') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need that for other headers (nvcuvid, optical flow, etc...)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All uses cuda.h defined types.
string is used symbol loader function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is: we must include the relevant header anyway (e.g. to get the types) and it should include cuda.h
if it's necessary, so this inclusion is redundant. Meanwhile, if we use this tool for something else entirely, then keeping cuda.h may be a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving to json file.
tools/stub_generator/stub_codegen.py
Outdated
args.output.write(prolog.format(args.unique_prefix)) | ||
|
||
for cursor in translation_unit.cursor.get_children(): | ||
if cursor.kind != clang.cindex.CursorKind.FUNCTION_DECL: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we check if there's a definition of that function in this translation unit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, I hope.
tools/stub_generator/stub_codegen.py
Outdated
if cursor.spelling not in config['functions']: | ||
continue | ||
|
||
with open(cursor.location.file.name, 'r', encoding='latin-1') as file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
latin-1? Why not utf-8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Apparently cuda.h is not utf-8, xavier build yields strange result. Switching back.
tools/stub_generator/stub_codegen.py
Outdated
if cursor.kind != clang.cindex.CursorKind.FUNCTION_DECL: | ||
continue | ||
|
||
if cursor.spelling not in config['functions'] or cursor.is_definition(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6/10. This would work as long as there's no declaration followed by definition. Also, I think that C allows multiple declarations of the same function (not that I expect that in the headers we're interested in).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, I hope.
CI MESSAGE: [1757622]: BUILD FAILED |
ad1250e
to
0966726
Compare
CI MESSAGE: [1758673]: BUILD STARTED |
@@ -2978,3 +2978,209 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | |||
THE SOFTWARE. | |||
|
|||
======================== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe say what it is that we are using?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See L2984 - Copyright 2020 The TensorFlow Runtime Authors. All rights reserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still, I'd add a line like - otherwise it looks like we're using the whole runtime
Dynamic loader generator for shared libraries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH - I don't know if anyone is using such convention in the Acknowledgements. Usually it is this project is using parts of...
, and the appropriate license is copied. Also we may use something else later and I don't think we will remember to update this file.
// | ||
// Licensed under the Apache License, Version 2.0 (the "License"); | ||
// you may not use this file except in compliance with the License. | ||
// You may obtain a copy of the License at |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering. What's the purpose of this file now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have cuInitChecked
which opens/loads the lib itself.
#endif /* __cplusplus */ | ||
|
||
#endif | ||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happened here? I understand this is not our file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the file and the license, also updated the line endings to be unix one.
|
||
#endif | ||
/* | ||
* Copyright(c) 2020, NVIDIA CORPORATION.All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above.
// it is defined in the generated file | ||
typedef void *tLoadSymbol(const std::string &name); | ||
void NvcuvidSetSymbolLoader(tLoadSymbol loader_func); | ||
void Nvcuvid_2SetSymbolLoader(tLoadSymbol loader_func); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why there's a _2
in the name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two separated dynlink implementations, each have a separate *SetSymbolLoader
function.
static std::unordered_map<std::string, void*> symbol_map; | ||
std::lock_guard<std::mutex> lock(symbol_mutex); | ||
auto it = symbol_map.find(name); | ||
if (it == symbol_map.end()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like that a function called "is symbol availave" is modying the symbol map. Can you explain why is that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It lazy loads the symbols to the map only for the purpose of checking is it is available. It is not very frequently used functionality, so it won't happen very often.
I don't have a good idea how to move it to the generated file and not affect the perf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current implementation of generated functions is:
using FuncPtr = CUresult (CUDAAPI *)(CUresult, const char **);
static auto func_ptr = reinterpret_cast<FuncPtr>(load_symbol_func("cuGetErrorString"));
if (!func_ptr) return CUDA_ERROR_SHARED_OBJECT_SYMBOL_NOT_FOUND;
return func_ptr(error, pStr);
}
If we add a map there we would need add a sync point for that.
We cannot easily expose the f_pointer itself as well from it to check the value.
No idea how to approach differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just commenting on the name. This function is actually loading symbols. Perhaps it should be called something like LoadSymbolIfNeeded
to better reflect what it does
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it doesn't load the symbol either, in the way that the first invocation of the relevant function will call again LoadSymbol
.
I would keep the name as it is as this is the purpose of the function - check if a function with the given name is available, not to load it. One day we may rework the stub implementation but the external API would rather stay the same.
|
||
# there is a one cuvidGetVideoSourceState function that doesn't follow the returned value convention | ||
# so need a dedicated stub for it | ||
set(NVCUVID_2_GENERATED_STUB "${CMAKE_CURRENT_BINARY_DIR}/dynlink_nvcuvid_2_gen.cc") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we have a _2
stub?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a one method that doesn't follow the common and we need a separate json for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out wrappers returns CUresult and CUDA_ERROR_SHARED_OBJECT_SYMBOL_NOT_FOUND is symbol is not available.
cuvidGetVideoSourceState
returns cudaVideoState so we need to return cudaVideoState_Error in case of lack of the symbol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we add one more level of JSON that would aggregate functions, return type and error code? This doesn't seem too complicated - we'd have a dictionary that maps a function to a proper return type and error code and we'd generate stubs with these. I think we might run into this issue in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
CI MESSAGE: [1758777]: BUILD STARTED |
CI MESSAGE: [1758777]: BUILD FAILED |
127fbe9
to
5d847f7
Compare
CI MESSAGE: [1760163]: BUILD STARTED |
CI MESSAGE: [1760163]: BUILD FAILED |
CI MESSAGE: [1777411]: BUILD FAILED |
CI MESSAGE: [1777423]: BUILD FAILED |
CI MESSAGE: [1782650]: BUILD STARTED |
CI MESSAGE: [1782661]: BUILD STARTED |
CI MESSAGE: [1782650]: BUILD FAILED |
CI MESSAGE: [1782661]: BUILD PASSED |
static std::unordered_map<std::string, void*> symbol_map; | ||
std::lock_guard<std::mutex> lock(symbol_mutex); | ||
auto it = symbol_map.find(name); | ||
if (it == symbol_map.end()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just commenting on the name. This function is actually loading symbols. Perhaps it should be called something like LoadSymbolIfNeeded
to better reflect what it does
"cuGetErrorName": {}, | ||
"cuGetErrorString": {}, | ||
"cuDriverGetVersion": {}, | ||
"cuDeviceGetCount": {}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question: Can this be somehow generated (offline is fine) by listing the symbols from the library? I am wondering if we need to manually add any new symbol that we'd like to use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's something I've already suggested - to have a regex pattern. But maybe let's merge it as-is and improve later - at least more people will have an opportunity to do that when they have the code in their working copies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can try to regexp inside stub_codegen.py
. But we can think about this in the next step, now it provides the same functionality as we used to have - hardcoded list of supported functions.
extra_args = args.extra_args | ||
|
||
translation_unit = index.parse(header, args=extra_args) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see a mix of 2-space and 4-space indentations here. Can you unify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
CI MESSAGE: [1783091]: BUILD STARTED |
- removes files that should not be in the repo due to license - adds a step that generates dynlink_cuda during the build based on the available headers Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
CI MESSAGE: [1783091]: BUILD FAILED |
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
60c02c9
to
0b6c651
Compare
CI MESSAGE: [1783362]: BUILD STARTED |
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
CI MESSAGE: [1783647]: BUILD STARTED |
CI MESSAGE: [1783647]: BUILD FAILED |
CI MESSAGE: [1783362]: BUILD PASSED |
CI MESSAGE: [1783647]: BUILD PASSED |
Signed-off-by: Janusz Lisiecki jlisiecki@nvidia.com
Why we need this PR?
Pick one, remove the rest
What happened in this PR?
Fill relevant points, put NA otherwise. Replace anything inside []
removes files that should not be in the repo due to license
adds a step that generates dynlink_cuda during the build based on the available headers
build system
dynlink cuda
NA
CI
NA
JIRA TASK: [DALI-1610]