Fix TORCH_LIBRARIES variables when do static build #49458

Closed
wants to merge 3 commits into from

gemfield
Contributor

Fixes #21737

With this fix, the TORCH_LIBRARIES variable provides all the necessary static libraries built from the PyTorch repo.
A user program (when built statically) can now just link with ${TORCH_LIBRARIES} + MKL + the CUDA runtime.
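
As a rough illustration of what this means downstream, a consumer's CMakeLists.txt might look like the sketch below. The target and source names (`app`, `main.cpp`), the MKL library names, and the use of `CUDAToolkit` to locate the CUDA runtime are assumptions for illustration and are not part of this PR.

```cmake
cmake_minimum_required(VERSION 3.17)  # FindCUDAToolkit requires CMake >= 3.17
project(app CXX)

# Assumption: CMAKE_PREFIX_PATH points at a statically built libtorch install.
find_package(Torch REQUIRED)
# Assumption: the CUDA runtime is located through CMake's FindCUDAToolkit module.
find_package(CUDAToolkit REQUIRED)

add_executable(app main.cpp)  # hypothetical source file
set_property(TARGET app PROPERTY CXX_STANDARD 14)

target_link_libraries(app
  ${TORCH_LIBRARIES}   # static libraries built from the PyTorch repo
  # Assumption: a sequential LP64 MKL is available on the linker search path.
  mkl_intel_lp64 mkl_sequential mkl_core
  CUDA::cudart_static  # CUDA runtime
)
```

Depending on the platform, additional system libraries may still be needed on the link line; the sketch only shows the three groups named in the description.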

@facebook-github-bot
Contributor

facebook-github-bot commented Dec 16, 2020

💊 CI failures summary and remediations

As of commit 1b2e102 (more details on the Dr. CI page):


  • 3/3 failures introduced in this PR

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_build (1/3)

Step: "(Optional) Merge target branch"

Automatic merge failed; fix conflicts and then commit the result.
CONFLICT (add/add): Merge conflict in .circleci/scripts/binary_linux_test.sh
Auto-merging .circleci/scripts/binary_linux_test.sh
CONFLICT (add/add): Merge conflict in .circleci/generate_config_yml.py
Auto-merging .circleci/generate_config_yml.py
CONFLICT (add/add): Merge conflict in .circleci/docker/common/install_conda.sh
Auto-merging .circleci/docker/common/install_conda.sh
CONFLICT (add/add): Merge conflict in .circleci/config.yml
Auto-merging .circleci/config.yml
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/dimensions.py
Auto-merging .circleci/cimodel/data/dimensions.py
Automatic merge failed; fix conflicts and then commit the result.


Exited with code exit status 1

See CircleCI builds pytorch_linux_xenial_cuda9_2_cudnn7_py3_gcc5_4_build (2/3) and pytorch_xla_linux_bionic_py3_6_clang9_build (3/3)

Step: "(Optional) Merge target branch"

Both builds fail with the same merge conflicts and the same exit status 1 as build (1/3).


This comment was automatically generated by Dr. CI.

@gemfield
Contributor Author

@malfet could you review this PR?

Contributor

@malfet malfet left a comment

Overall LGTM, but please address the review feedback, namely refactoring the repeated patterns into append_wholearchive_lib_if_found.

Comment on lines 63 to 64
find_library(C10_LIBRARY c10 PATHS "${TORCH_INSTALL_PREFIX}/lib")
list(APPEND TORCH_LIBRARIES ${C10_LIBRARY})
Contributor

Why not use the append_torchlib macro here?

Contributor Author

I didn't plan to touch the code that belongs to the shared library build. Anyway, I can fix this if that looks good to you.

if(${_arg}_LIBRARY)
list(APPEND TORCH_LIBRARIES ${${_arg}_LIBRARY})
else()
message(WARNING "static library ${${_arg}_LIBRARY} not found.")
Contributor

Can you explain why this one should not be an error?
Also, why is this macro specific to static library discovery?

Contributor Author

It is a warning instead of an error for two main reasons:
1. I cannot find a corresponding CMake option for every individual static library, so if a static library is not found, it means the implied CMake option is off (rather than an error);
2. Some static libraries are not mandatory for the end user, so a warning instead of an error provides flexibility in this situation.

Contributor

Hmm, can you give an example of optional library dependencies?
Also, the "static library" message is misleading, as append_torchlib_if_found is called from both the shared and the static library paths.

Contributor Author

eigen_blas is optional, as we may use MKL as the LAPACK backend, and I cannot find a public CMake option to control this, e.g.:

if(@USE_EIGENBLAS@)

If an eigen_blas option can be found, and caffe2_protos, protobuf-lite, protobuf, protoc, onnx, onnx_proto, foxi_loader, fmt, sleef, and asmjit are all mandatory libraries for the static build, we can change the warning to an error.
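
One way to reconcile the two positions could be to make the severity depend on whether a build option is known for the library. The sketch below is hypothetical: `append_torchlib_checked` and its `required` argument are illustrative names that do not come from this PR, and no exported eigen_blas option is assumed to exist.

```cmake
# Hypothetical helper: look up one library and append it to TORCH_LIBRARIES.
# `required` should be ON when a build option says the library was built.
macro(append_torchlib_checked name required)
  find_library(${name}_LIBRARY ${name} PATHS "${TORCH_INSTALL_PREFIX}/lib")
  if(${name}_LIBRARY)
    list(APPEND TORCH_LIBRARIES ${${name}_LIBRARY})
  elseif("${required}")
    # The library was expected, so a missing file is a real error.
    message(FATAL_ERROR "Library ${name} was enabled at build time but was not found.")
  else()
    # No option is known for this library, so treat it as optional.
    message(WARNING "Optional library ${name} not found; skipping.")
  endif()
endmacro()

append_torchlib_checked(c10 ON)
append_torchlib_checked(eigen_blas OFF)  # no public option is known for this one
```

The trade-off is that every mandatory library needs a reliable option behind it; where none exists, the warning path is the only safe default.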

endforeach()
endmacro()

function(add_whole_archive_flag lib output_var)
Contributor

Can you please explain why add_whole_archive_flag is a function, but append_torchlib_if_found is a macro?

Contributor Author

@gemfield gemfield Dec 17, 2020

list(APPEND) does not seem to support the parent scope, while "set list" seems unintuitive, so I used a macro instead. To be consistent, I will change function(add_whole_archive_flag) to a macro if that looks good.
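
The scoping difference mentioned here can be seen in a small standalone script. This is a minimal sketch; `append_lib_function` and `append_lib_macro` are hypothetical names used only for the comparison, and the script can be run with `cmake -P`.

```cmake
# A function has its own variable scope, so list(APPEND) only changes a local
# copy; the result has to be exported explicitly with set(... PARENT_SCOPE).
function(append_lib_function lib)
  list(APPEND TORCH_LIBRARIES ${lib})
  set(TORCH_LIBRARIES ${TORCH_LIBRARIES} PARENT_SCOPE)
endfunction()

# A macro is expanded in the caller's scope, so list(APPEND) is visible directly.
macro(append_lib_macro lib)
  list(APPEND TORCH_LIBRARIES ${lib})
endmacro()

set(TORCH_LIBRARIES "")
append_lib_function(a)
append_lib_macro(b)
message(STATUS "TORCH_LIBRARIES = ${TORCH_LIBRARIES}")  # the list is now a;b
```

Without the `set(... PARENT_SCOPE)` line, the function's append would be lost when the function returns, which is the behavior described in the comment above.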

Comment on lines 79 to 81
else()
set(library_with_flag "")
endif()
Contributor

Can you please explain why we shouldn't raise an error if torch_LIBRARY could not be found?

Contributor Author

@gemfield gemfield Dec 17, 2020

Some static libraries are optional, e.g. torch_cuda and c10_cuda. After a PyTorch CUDA build, the end user can still link libtorch without the CUDA libraries (and just use the CPU device).

Comment on lines 86 to 90
add_whole_archive_flag(${torch_cpu_LIBRARY} library_with_flag)
else()
set(library_with_flag "")
endif()
list(APPEND TORCH_LIBRARIES ${library_with_flag})
Contributor

Can you please explain why the else() statement is needed?

Suggested change
- add_whole_archive_flag(${torch_cpu_LIBRARY} library_with_flag)
- else()
- set(library_with_flag "")
- endif()
- list(APPEND TORCH_LIBRARIES ${library_with_flag})
+ add_whole_archive_flag(${torch_cpu_LIBRARY} library_with_flag)
+ list(APPEND TORCH_LIBRARIES ${library_with_flag})
+ endif()

Contributor Author

You are right.

Comment on lines 93 to 98
if(torch_cuda_LIBRARY)
add_whole_archive_flag(${torch_cuda_LIBRARY} library_with_flag)
else()
set(library_with_flag "")
endif()
list(APPEND TORCH_LIBRARIES ${library_with_flag})
Contributor

This looks like a repeated pattern. Can it be refactored into a macro? Something like add_wholearchive_library_if_found?

Contributor Author

Yes, this will be done.
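
For readers following the thread, the requested refactoring could look roughly like the sketch below. It folds the repeated "find, wrap in whole-archive flags, append" pattern into one macro, using the name suggested in the review summary. The per-platform linker flags and the warning text are assumptions of this sketch, not a quote of the merged code.

```cmake
# Sketch: find each library, wrap it in whole-archive flags, and append it.
macro(append_wholearchive_lib_if_found)
  foreach(_arg ${ARGN})
    find_library(${_arg}_LIBRARY ${_arg} PATHS "${TORCH_INSTALL_PREFIX}/lib")
    if(${_arg}_LIBRARY)
      if(APPLE)
        # Apple's linker forces loading of every object with -force_load.
        list(APPEND TORCH_LIBRARIES "-Wl,-force_load,${${_arg}_LIBRARY}")
      elseif(MSVC)
        list(APPEND TORCH_LIBRARIES "-WHOLEARCHIVE:${${_arg}_LIBRARY}")
      else()
        # GNU-style linkers: enable whole-archive only around this library.
        list(APPEND TORCH_LIBRARIES
          "-Wl,--whole-archive ${${_arg}_LIBRARY} -Wl,--no-whole-archive")
      endif()
    else()
      # Optional libraries (e.g. torch_cuda) are skipped with a warning.
      message(WARNING "Library ${_arg} not found; skipping.")
    endif()
  endforeach()
endmacro()

append_wholearchive_lib_if_found(torch torch_cpu torch_cuda)
```

Because it is a macro, the appends land directly in the caller's TORCH_LIBRARIES, and no else() branch with an empty placeholder is needed.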

@gemfield
Contributor Author

@malfet All of the review comments have been addressed.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@malfet has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@malfet merged this pull request in deba3bd.

Successfully merging this pull request may close these issues.

Static linking master bug
4 participants