Remove CUDA headers and generate stubs in runtime #2420

Merged

JanuszL merged 11 commits into NVIDIA:master from move_to_stubs on Nov 10, 2020

Conversation

JanuszL
Contributor

@JanuszL JanuszL commented Oct 30, 2020

  • removes files that should not be in the repo for licensing reasons
  • adds a step that generates dynlink_cuda during the build, based on the available headers

Signed-off-by: Janusz Lisiecki jlisiecki@nvidia.com

Why do we need this PR?

  • It removes CUDA headers and generates stubs at runtime

What happened in this PR?


  • What solution was applied:
    removes files that should not be in the repo for licensing reasons
    adds a step that generates dynlink_cuda during the build, based on the available headers
  • Affected modules and functionalities:
    build system
    dynlink cuda
  • Key points relevant for the review:
    NA
  • Validation and testing:
    CI
  • Documentation (including examples):
    NA

JIRA TASK: [DALI-1610]

@dali-automaton
Collaborator

CI MESSAGE: [1749684]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1749684]: BUILD FAILED

@lgtm-com
Contributor

lgtm-com bot commented Nov 2, 2020

This pull request introduces 1 alert when merging 8ed0dabe24242d23b05c8d103e0b28f9eb4c7161 into 359a6a5 - view on LGTM.com

new alerts:

  • 1 for Unused import

@JanuszL JanuszL force-pushed the move_to_stubs branch 2 times, most recently from dc49beb to 77aa257 on November 2, 2020 at 19:38
@lgtm-com
Contributor

lgtm-com bot commented Nov 2, 2020

This pull request introduces 1 alert when merging 77aa257045fa8ad2eeb8d5c5eeab6da887cf011d into 359a6a5 - view on LGTM.com

new alerts:

  • 1 for Unused import

@dali-automaton
Collaborator

CI MESSAGE: [1755428]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1755428]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [1757622]: BUILD STARTED

raise Exception(str(diag))

args.output.write('#include <string>\n')
args.output.write('#include <cuda.h>\n')
Contributor

Do we need that for other headers (nvcuvid, optical flow, etc.)?

Contributor Author

All of them use types defined in cuda.h.
<string> is used by the symbol loader function.

Contributor

My point is: we must include the relevant header anyway (e.g. to get the types) and it should include cuda.h if it's necessary, so this inclusion is redundant. Meanwhile, if we use this tool for something else entirely, then keeping cuda.h may be a problem.

Contributor Author

Moving it to a JSON file.
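
A minimal sketch of what such a JSON-driven include list could look like, assuming a hypothetical "headers" field in the config and an illustrative write_includes helper (not the PR's actual schema):

import json

def write_includes(config_path, output):
    # Hypothetical schema: {"headers": ["cuda.h"], "functions": {...}}.
    # Emits one #include per listed header instead of hardcoding cuda.h.
    with open(config_path, 'r') as f:
        config = json.load(f)
    output.write('#include <string>\n')  # used by the symbol loader signature
    for header in config.get('headers', []):
        output.write('#include <%s>\n' % header)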

args.output.write(prolog.format(args.unique_prefix))

for cursor in translation_unit.cursor.get_children():
if cursor.kind != clang.cindex.CursorKind.FUNCTION_DECL:
Contributor

Can we check if there's a definition of that function in this translation unit?

Contributor Author

Done

Contributor Author

Done, I hope.

if cursor.spelling not in config['functions']:
continue

with open(cursor.location.file.name, 'r', encoding='latin-1') as file:
Contributor

latin-1? Why not utf-8?

Contributor Author

Fixed.

Contributor Author

OK. Apparently cuda.h is not UTF-8; the Xavier build yields strange results. Switching back.
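
For context: latin-1 maps every byte value to a character, so reading never raises regardless of the file's actual encoding. A hedged sketch of the UTF-8-tolerant alternative (not what the PR ended up with):

with open(cursor.location.file.name, 'r', encoding='utf-8', errors='replace') as file:
    source = file.read()  # undecodable bytes become U+FFFD instead of raising

Whether replacement characters are acceptable depends on what the generator does with the text, which is presumably why latin-1 was the safer choice here.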

if cursor.kind != clang.cindex.CursorKind.FUNCTION_DECL:
continue

if cursor.spelling not in config['functions'] or cursor.is_definition():
Contributor

6/10. This would work as long as there's no declaration followed by a definition. Also, I think that C allows multiple declarations of the same function (not that I expect that in the headers we're interested in).

Contributor Author

Done, I hope.
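
A minimal sketch of a robust variant (hypothetical stub_candidates helper, using the same clang.cindex API as stub_codegen.py): collect the definitions in a first pass, then emit each requested declaration at most once.

import clang.cindex

def stub_candidates(translation_unit, config):
    # First pass: remember every function that is defined in this TU.
    defined = set()
    for cursor in translation_unit.cursor.get_children():
        if (cursor.kind == clang.cindex.CursorKind.FUNCTION_DECL
                and cursor.is_definition()):
            defined.add(cursor.spelling)

    # Second pass: yield each requested function once, skipping any name
    # that has a definition (covers declaration-then-definition order)
    # and deduplicating repeated declarations, which C allows.
    seen = set()
    for cursor in translation_unit.cursor.get_children():
        if cursor.kind != clang.cindex.CursorKind.FUNCTION_DECL:
            continue
        name = cursor.spelling
        if name not in config['functions'] or name in defined or name in seen:
            continue
        seen.add(name)
        yield cursor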

@dali-automaton
Collaborator

CI MESSAGE: [1757622]: BUILD FAILED

@JanuszL JanuszL force-pushed the move_to_stubs branch 3 times, most recently from ad1250e to 0966726 on November 3, 2020 at 16:25
@dali-automaton
Collaborator

CI MESSAGE: [1758673]: BUILD STARTED

@@ -2978,3 +2978,209 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

========================
Contributor

maybe say what it is that we are using?

Contributor Author

See L2984 - Copyright 2020 The TensorFlow Runtime Authors. All rights reserved.

Contributor

Still, I'd add a line like "Dynamic loader generator for shared libraries." Otherwise it looks like we're using the whole runtime.

Contributor Author

TBH, I don't know if anyone uses such a convention in the Acknowledgements. Usually it is "this project uses parts of...", and the appropriate license is copied. Also, we may use something else later, and I don't think we will remember to update this file.

//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
Contributor

I am wondering. What's the purpose of this file now?

Contributor Author

@JanuszL JanuszL Nov 3, 2020

We have cuInitChecked, which opens/loads the lib itself.

#endif /* __cplusplus */

#endif
/*
Contributor

What happened here? I understand this is not our file.

Contributor Author

I have updated the file and the license, and also changed the line endings to Unix ones.


#endif
/*
* Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
Contributor

and here

Contributor Author

As above.

// it is defined in the generated file
typedef void *tLoadSymbol(const std::string &name);
void NvcuvidSetSymbolLoader(tLoadSymbol loader_func);
void Nvcuvid_2SetSymbolLoader(tLoadSymbol loader_func);
Contributor

Why is there a _2 in the name?

Contributor Author

There are two separate dynlink implementations; each has its own *SetSymbolLoader function.

static std::unordered_map<std::string, void*> symbol_map;
std::lock_guard<std::mutex> lock(symbol_mutex);
auto it = symbol_map.find(name);
if (it == symbol_map.end()) {
Contributor

I don't like that a function called "is symbol available" is modifying the symbol map. Can you explain why that is?

Contributor Author

It lazily loads the symbols into the map, only for the purpose of checking whether they are available. It is not frequently used functionality, so this won't happen very often.
I don't have a good idea of how to move it to the generated file without affecting performance.

Contributor Author

Current implementation of a generated function is:

CUresult CUDAAPI cuGetErrorString(CUresult error, const char **pStr) {
  using FuncPtr = CUresult (CUDAAPI *)(CUresult, const char **);
  static auto func_ptr = reinterpret_cast<FuncPtr>(load_symbol_func("cuGetErrorString"));
  if (!func_ptr) return CUDA_ERROR_SHARED_OBJECT_SYMBOL_NOT_FOUND;
  return func_ptr(error, pStr);
}

If we added a map there, we would need a synchronization point for it.
We also cannot easily expose the func_ptr itself from the stub to check its value.
No idea how to approach this differently.
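
For illustration, a rough sketch of how such a stub can be emitted from a libclang cursor (STUB_TEMPLATE and emit_stub are illustrative names; the actual generator in this PR may differ). The static local in the generated code means the symbol is resolved once, on first call, with no locking needed in the stub itself:

STUB_TEMPLATE = (
    '{ret} CUDAAPI {name}({params}) {{\n'
    '  using FuncPtr = {ret} (CUDAAPI *)({param_types});\n'
    '  static auto func_ptr = reinterpret_cast<FuncPtr>(load_symbol_func("{name}"));\n'
    '  if (!func_ptr) return {error};\n'
    '  return func_ptr({args});\n'
    '}}\n')

def emit_stub(output, cursor,
              error='CUDA_ERROR_SHARED_OBJECT_SYMBOL_NOT_FOUND'):
    # Pull the signature out of the clang cursor for this declaration.
    params = [(a.type.spelling, a.spelling) for a in cursor.get_arguments()]
    output.write(STUB_TEMPLATE.format(
        ret=cursor.result_type.spelling,
        name=cursor.spelling,
        params=', '.join('%s %s' % p for p in params),
        param_types=', '.join(t for t, _ in params),
        args=', '.join(n for _, n in params),
        error=error))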

Contributor

I was just commenting on the name. This function is actually loading symbols. Perhaps it should be called something like LoadSymbolIfNeeded to better reflect what it does.

Contributor Author

But it doesn't really load the symbol either, in the sense that the first invocation of the relevant function will call LoadSymbol again.
I would keep the name as it is, as this is the purpose of the function: to check if a function with the given name is available, not to load it. One day we may rework the stub implementation, but the external API should stay the same.


# there is one function, cuvidGetVideoSourceState, that doesn't follow the return value convention
# so we need a dedicated stub for it
set(NVCUVID_2_GENERATED_STUB "${CMAKE_CURRENT_BINARY_DIR}/dynlink_nvcuvid_2_gen.cc")
Contributor

Why do we have a _2 stub?

Contributor Author

We have one method that doesn't follow the common convention, and we need a separate JSON for it.

Contributor Author

Our wrappers return CUresult, and CUDA_ERROR_SHARED_OBJECT_SYMBOL_NOT_FOUND if the symbol is not available.
cuvidGetVideoSourceState returns cudaVideoState, so we need to return cudaVideoState_Error when the symbol is missing.

Contributor

Can't we add one more level of JSON that would aggregate functions, return type and error code? This doesn't seem too complicated - we'd have a dictionary that maps a function to a proper return type and error code and we'd generate stubs with these. I think we might run into this issue in the future.

Contributor Author

Done
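
A sketch of the resulting config shape, with per-function return type and error code overriding a global default (field names are illustrative, not necessarily the ones used in the PR):

import json

DEFAULTS = {'return_type': 'CUresult',
            'not_found_error': 'CUDA_ERROR_SHARED_OBJECT_SYMBOL_NOT_FOUND'}

CONFIG = json.loads('''
{
  "functions": {
    "cuvidCreateVideoSource": {},
    "cuvidGetVideoSourceState": {
      "return_type": "cudaVideoState",
      "not_found_error": "cudaVideoState_Error"
    }
  }
}
''')

def stub_settings(name):
    # Most functions inherit the defaults; odd ones override per field.
    settings = dict(DEFAULTS)
    settings.update(CONFIG['functions'].get(name, {}))
    return settings

print(stub_settings('cuvidGetVideoSourceState')['not_found_error'])
# -> cudaVideoState_Error

In principle, such a mapping removes the need for a dedicated _2 stub per odd function.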

@dali-automaton
Collaborator

CI MESSAGE: [1758777]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1758777]: BUILD FAILED

@JanuszL JanuszL force-pushed the move_to_stubs branch 4 times, most recently from 127fbe9 to 5d847f7 on November 3, 2020 at 22:36
@dali-automaton
Collaborator

CI MESSAGE: [1760163]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1760163]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [1777411]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [1777423]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [1782650]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1782661]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1782650]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [1782661]: BUILD PASSED

"cuGetErrorName": {},
"cuGetErrorString": {},
"cuDriverGetVersion": {},
"cuDeviceGetCount": {},
Contributor

Just a question: can this be somehow generated (offline is fine) by listing the symbols from the library? I am wondering whether we need to manually add every new symbol that we'd like to use.

Contributor

That's something I've already suggested - to have a regex pattern. But maybe let's merge it as-is and improve later - at least more people will have an opportunity to do that when they have the code in their working copies.

Contributor Author

We can try to use a regexp inside stub_codegen.py. But we can think about this in the next step; for now it provides the same functionality as we used to have: a hardcoded list of supported functions.
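
A possible shape for that later improvement, sketched here (hypothetical matching_functions helper; not part of this PR): derive the function list from the header by matching declaration names against a pattern.

import re
import clang.cindex

def matching_functions(translation_unit, pattern=r'^cuvid\w+$'):
    # Collect every function declaration whose name matches the pattern,
    # instead of maintaining a hardcoded allow-list in the JSON config.
    name_re = re.compile(pattern)
    names = set()
    for cursor in translation_unit.cursor.get_children():
        if (cursor.kind == clang.cindex.CursorKind.FUNCTION_DECL
                and name_re.match(cursor.spelling)):
            names.add(cursor.spelling)
    return sorted(names)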

extra_args = args.extra_args

translation_unit = index.parse(header, args=extra_args)

Contributor

I see a mix of 2-space and 4-space indentations here. Can you unify?

Contributor Author

Done

@dali-automaton
Collaborator

CI MESSAGE: [1783091]: BUILD STARTED

- removes files that should not be in the repo due to license
- adds a step that generates dynlink_cuda during the build based
  on the available headers

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [1783091]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [1783362]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1783647]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1783647]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [1783362]: BUILD PASSED

@dali-automaton
Collaborator

CI MESSAGE: [1783647]: BUILD PASSED

@JanuszL JanuszL merged commit a2ad452 into NVIDIA:master Nov 10, 2020
@JanuszL JanuszL deleted the move_to_stubs branch November 10, 2020 16:51