Optimizer support via Libtorch C++ on Android #42665

Open
Nitya05 opened this issue Aug 6, 2020 · 7 comments
Labels
  • module: cpp (Related to C++ API)
  • module: optimizer (Related to torch.optim)
  • oncall: mobile (Related to mobile support, including iOS and Android)
  • triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)

Comments

@Nitya05

Nitya05 commented Aug 6, 2020

🐛 Bug

We are building libtorch using ./scripts/build_android.sh.

We need support for ATen ops and TorchScript, so we are building without the BUILD_CAFFE2_MOBILE option.

The build succeeds, but the library does not link ${TORCH_SRC_DIR}/csrc/api/src/optim/adam.cpp and its dependencies, because NO_API is set for mobile builds.

As a result, I cannot instantiate an Adam optimizer from C++ code, and therefore cannot train the model on the device.

torch::optim::Adam optimizer(parameters, lr); //Linker Error
optimizer.zero_grad(); //Linker Error
optimizer.step(); //Linker Error

The linker reports the following errors:
/home/atibrewal/work/apprecommender/src/RNNRecommender.cpp:374: undefined reference to `torch::optim::AdamOptions::AdamOptions(double)'
/home/atibrewal/work/apprecommender/src/RNNRecommender.cpp:375: undefined reference to `torch::optim::Optimizer::zero_grad()'
/home/atibrewal/work/apprecommender/src/RNNRecommender.cpp:395: undefined reference to `torch::optim::Adam::step(std::__ndk1::function<at::Tensor ()>)'
CMakeFiles/hxRecommenderEngine.dir/src/RNNRecommender.cpp.o: In function `OptimizerParamGroup':
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/optimizer.h:68: undefined reference to `torch::optim::OptimizerParamGroup::params() const'
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/optimizer.h:68: undefined reference to `torch::optim::OptimizerParamGroup::has_options() const'
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/optimizer.h:68: undefined reference to `torch::optim::OptimizerParamGroup::options() const'
CMakeFiles/hxRecommenderEngine.dir/src/RNNRecommender.cpp.o: In function `AdamOptions':
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/adam.h:21: undefined reference to `vtable for torch::optim::AdamOptions'
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/adam.h:21: undefined reference to `vtable for torch::optim::AdamOptions'
CMakeFiles/hxRecommenderEngine.dir/src/RNNRecommender.cpp.o: In function `Adam':
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/adam.h:52: undefined reference to `vtable for torch::optim::Adam'
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/adam.h:52: undefined reference to `vtable for torch::optim::Adam'
CMakeFiles/hxRecommenderEngine.dir/src/RNNRecommender.cpp.o: In function `OptimizerOptions':
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/optimizer.h:49: undefined reference to `vtable for torch::optim::OptimizerOptions'
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/optimizer.h:49: undefined reference to `vtable for torch::optim::OptimizerOptions'
CMakeFiles/hxRecommenderEngine.dir/src/RNNRecommender.cpp.o: In function `Optimizer':
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/optimizer.h:91: undefined reference to `vtable for torch::optim::Optimizer'
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/optimizer.h:91: undefined reference to `vtable for torch::optim::Optimizer'
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/optimizer.h:93: undefined reference to `torch::optim::Optimizer::add_param_group(torch::optim::OptimizerParamGroup const&)'
CMakeFiles/hxRecommenderEngine.dir/src/RNNRecommender.cpp.o: In function `~Optimizer':
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/optimizer.h:103: undefined reference to `vtable for torch::optim::Optimizer'
/home/atibrewal/work/hielibs_android/include/torch/csrc/api/include/torch/optim/optimizer.h:103: undefined reference to `vtable for torch::optim::Optimizer'
CMakeFiles/hxRecommenderEngine.dir/src/RNNRecommender.cpp.o:(.data.rel.ro._ZTVN5torch5optim25OptimizerCloneableOptionsINS0_11AdamOptionsEEE[_ZTVN5torch5optim25OptimizerCloneableOptionsINS0_11AdamOptionsEEE]+0x18): undefined reference to `torch::optim::OptimizerOptions::serialize(torch::serialize::InputArchive&)'
CMakeFiles/hxRecommenderEngine.dir/src/RNNRecommender.cpp.o:(.data.rel.ro._ZTVN5torch5optim25OptimizerCloneableOptionsINS0_11AdamOptionsEEE[_ZTVN5torch5optim25OptimizerCloneableOptionsINS0_11AdamOptionsEEE]+0x20): undefined reference to `torch::optim::OptimizerOptions::serialize(torch::serialize::OutputArchive&) const'
CMakeFiles/hxRecommenderEngine.dir/src/RNNRecommender.cpp.o:(.data.rel.ro._ZTIN5torch5optim25OptimizerCloneableOptionsINS0_11AdamOptionsEEE[_ZTIN5torch5optim25OptimizerCloneableOptionsINS0_11AdamOptionsEEE]+0x10): undefined reference to `typeinfo for torch::optim::OptimizerOptions'

Environment

  • PyTorch Version (e.g., 1.0): PyTorch 1.5.1 and 1.6.0
  • OS (e.g., Linux): Android arm64-v8a
  • How you installed PyTorch (conda, pip, source): source
  • Build command you used (if compiling from source): ./scripts/build_android.sh
  • Python version: NA
  • CUDA/cuDNN version: NA (using a CPU version)
  • GPU models and configuration: NA
  • Any other relevant information: Android NDK version 19c

cc @yf225 @glaringlee @vincentqb

@ailzhang ailzhang added module: cpp Related to C++ API module: optimizer Related to torch.optim oncall: mobile Related to mobile support, including iOS and Android triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module labels Aug 7, 2020
@glaringlee
Contributor

@malfet Can we enable libtorch even if BUILD_CAFFE2_MOBILE is off? I'm not sure why it has to be one or the other.

# BUILD_CAFFE2_MOBILE is the master switch to choose between libcaffe2 v.s. libtorch mobile build.

@ezyang
Contributor

ezyang commented Aug 10, 2020

I suppose you could YOLO removing the relevant line at

if(INTERN_BUILD_MOBILE AND NOT BUILD_CAFFE2_MOBILE)
  if(NOT BUILD_SHARED_LIBS AND NOT "${SELECTED_OP_LIST}" STREQUAL "")
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DNO_EXPORT")
  endif()
  set(BUILD_PYTHON OFF)
  set(BUILD_CAFFE2_OPS OFF)
  set(USE_DISTRIBUTED OFF)
  set(FEATURE_TORCH_MOBILE ON)
  set(NO_API ON) # this one

in CMakeLists.txt. No warranty provided though...

@Nitya05
Author

Nitya05 commented Aug 11, 2020

We have already tried this option.
After disabling the NO_API option in CMakeLists.txt, the link fails with the following errors instead:

/home/atibrewal/work/hielibs_android/lib/libtorch_cpu.a(output-archive.cpp.o): In function `torch::serialize::OutputArchive::save_to(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)':
/home/atibrewal/work/pytorchnew/pytorch/torch/csrc/api/src/serialize/output-archive.cpp:39: undefined reference to `torch::jit::ExportModule(torch::jit::Module const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, std::__ndk1::unordered_map<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::hash<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > >, std::__ndk1::equal_to<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > >, std::__ndk1::allocator<std::__ndk1::pair<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > > > > const&, bool)'
/home/atibrewal/work/hielibs_android/lib/libtorch_cpu.a(output-archive.cpp.o): In function `torch::serialize::OutputArchive::save_to(std::__ndk1::basic_ostream<char, std::__ndk1::char_traits<char> >&)':
/home/atibrewal/work/pytorchnew/pytorch/torch/csrc/api/src/serialize/output-archive.cpp:43: undefined reference to `torch::jit::ExportModule(torch::jit::Module const&, std::__ndk1::basic_ostream<char, std::__ndk1::char_traits<char> >&, std::__ndk1::unordered_map<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::hash<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > >, std::__ndk1::equal_to<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > >, std::__ndk1::allocator<std::__ndk1::pair<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > > > > const&, bool)'
/home/atibrewal/work/hielibs_android/lib/libtorch_cpu.a(output-archive.cpp.o): In function `torch::serialize::OutputArchive::save_to(std::__ndk1::function<unsigned long (void const*, unsigned long)> const&)':
/home/atibrewal/work/pytorchnew/pytorch/torch/csrc/api/src/serialize/output-archive.cpp:48: undefined reference to `torch::jit::ExportModule(torch::jit::Module const&, std::__ndk1::function<unsigned long (void const*, unsigned long)> const&, std::__ndk1::unordered_map<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::hash<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > >, std::__ndk1::equal_to<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > >, std::__ndk1::allocator<std::__ndk1::pair<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > > > > const&, bool)'
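With torch::optim failing to link in both configurations, one possible stopgap is to apply the Adam update by hand. The following is only a sketch of the standard Adam rule on plain std::vector buffers, so it does not depend on libtorch at all; `AdamState` and `adam_step` are hypothetical names, the default hyperparameters are the usual textbook values, and the result is not guaranteed to match torch::optim::Adam bit for bit:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of libtorch: Adam state for one flat buffer.
struct AdamState {
  double lr = 1e-3;
  double beta1 = 0.9;
  double beta2 = 0.999;
  double eps = 1e-8;
  long step = 0;
  std::vector<double> m;  // running mean of gradients (first moment)
  std::vector<double> v;  // running mean of squared gradients (second moment)
};

// Applies one Adam update in place. Assumes grads.size() == params.size().
void adam_step(AdamState& s, std::vector<double>& params,
               const std::vector<double>& grads) {
  if (s.m.size() != params.size()) {
    // Lazily size the moment buffers on first use.
    s.m.assign(params.size(), 0.0);
    s.v.assign(params.size(), 0.0);
  }
  ++s.step;
  // Bias-correction terms compensate for the zero-initialised moments.
  const double bc1 = 1.0 - std::pow(s.beta1, static_cast<double>(s.step));
  const double bc2 = 1.0 - std::pow(s.beta2, static_cast<double>(s.step));
  for (std::size_t i = 0; i < params.size(); ++i) {
    const double g = grads[i];
    s.m[i] = s.beta1 * s.m[i] + (1.0 - s.beta1) * g;
    s.v[i] = s.beta2 * s.v[i] + (1.0 - s.beta2) * g * g;
    const double m_hat = s.m[i] / bc1;
    const double v_hat = s.v[i] / bc2;
    params[i] -= s.lr * m_hat / (std::sqrt(v_hat) + s.eps);
  }
}
```

In a real model the same arithmetic would be applied per parameter tensor using its gradient after backward(), which in turn requires autograd to be available in the mobile build.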

@ezyang
Contributor

ezyang commented Aug 12, 2020

cc @ann-ss who is working on a different project to remove dependence on InputArchive/OutputArchive from samplers

@ankurtibrewal

We got this to work, along with on-device training support on Android, by tweaking the CMake files to allow both modes (TorchScript and CAFFE2_MOBILE) simultaneously.

To summarize, we made the following changes to the build files:
  • Enabled protobuf/protoc.
  • Enabled autograd.
  • Disabled static dispatch.
  • Made some minor code changes.
  • Disabled ONNX.
  • Allowed all Caffe2 features to be built that were previously disabled when CAFFE2_MOBILE was not set.

The changes are part of this commit on a fork:
ankurtibrewal@77c2e10

Can someone please review it and let us know if you foresee major issues with this change?

@ezyang
Contributor

ezyang commented Aug 18, 2020

Your changes look reasonable. We'd probably be interested in taking them upstream, but only if they are guarded behind a flag (as not everyone is going to want on-device training). Thanks for the work!

@ankurtibrewal

@ezyang Thank you! I will clean up the changes and raise a pull request.
