-
Notifications
You must be signed in to change notification settings - Fork 22.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Torch rename #20774
Torch rename #20774
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think important part is missing. Today we have libtorch depending on {libcaffe2, libcaffe2_gpu}. So when USE_CUDA=True libtorch.so depends on gpu bits of C2.
Your PR changes to have two separate libtorch and libtorch_gpu that depends on it
so if the user links with libtorch.so they won't get cuda 😕
I think we should just give up _gpu separation and just conditionally include files with USE_CUDA into libtorch.so (we already do it for "pytorchy" bits but not for "c2-ish" bits). We can do it only for the .so for now and still keep pybind_state_gpu separation.
So long as we still have the internal separation, I think combining with _gpu is okay in CMake. @bddppq any concerns with this in the AMD/HIP case? |
Just a quick shout: We used to build caffe2 statically and link it into libtorch for Android libtorch. This doesn't work in the current intermittent state of master until this PR is merged. |
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> ghstack-source-id: 7300c8219806b54f44d0529fd996f69d5a997f3f Pull Request resolved: #30315
…cuda,hip}" The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
…cuda,hip}" The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> ghstack-source-id: 9e721d2f76108ebc63d53cca68e0becd6e3aabeb Pull Request resolved: #30315
…cuda,hip}" The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> ghstack-source-id: 9e69bf273a77f37f49db42dc84c968f2abb72f4e Pull Request resolved: #30315
…cuda,hip}" The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly * I had to torch_cpu/torch_cuda caffe2_interface_library so that they get whole-archived linked into torch when you statically link. And I had to do this in an *exported* fashion because torch needs to depend on torch_cpu_library. In the end I exported everything and removed the redefinition in the Caffe2Config.cmake. However, I am not too sure why the old code did it in this way in the first place; however, it doesn't seem to have broken anything to switch it this way. * There's some uses of `__HIP_PLATFORM_HCC__` still in `torch_cpu` code, so I had to apply it to that library too (UGH). This manifests as a failer when trying to run the CUDA fuser. This doesn't really matter substantively right now because we still in-place HIPify, but it would be good to fix eventually. This was a bit difficult to debug because of an unrelated HIP bug, see ROCm/HIP#1706 Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly * I had to torch_cpu/torch_cuda caffe2_interface_library so that they get whole-archived linked into torch when you statically link. And I had to do this in an *exported* fashion because torch needs to depend on torch_cpu_library. In the end I exported everything and removed the redefinition in the Caffe2Config.cmake. However, I am not too sure why the old code did it in this way in the first place; however, it doesn't seem to have broken anything to switch it this way. * There's some uses of `__HIP_PLATFORM_HCC__` still in `torch_cpu` code, so I had to apply it to that library too (UGH). This manifests as a failer when trying to run the CUDA fuser. This doesn't really matter substantively right now because we still in-place HIPify, but it would be good to fix eventually. This was a bit difficult to debug because of an unrelated HIP bug, see ROCm/HIP#1706 Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> ghstack-source-id: 4b2a8bd9678b36f81cb1ac1be19f501a3cc2c812 Pull Request resolved: #30315
…cuda,hip}" The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly * I had to torch_cpu/torch_cuda caffe2_interface_library so that they get whole-archived linked into torch when you statically link. And I had to do this in an *exported* fashion because torch needs to depend on torch_cpu_library. In the end I exported everything and removed the redefinition in the Caffe2Config.cmake. However, I am not too sure why the old code did it in this way in the first place; however, it doesn't seem to have broken anything to switch it this way. * There's some uses of `__HIP_PLATFORM_HCC__` still in `torch_cpu` code, so I had to apply it to that library too (UGH). This manifests as a failer when trying to run the CUDA fuser. This doesn't really matter substantively right now because we still in-place HIPify, but it would be good to fix eventually. This was a bit difficult to debug because of an unrelated HIP bug, see ROCm/HIP#1706 Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly * I had to torch_cpu/torch_cuda caffe2_interface_library so that they get whole-archived linked into torch when you statically link. And I had to do this in an *exported* fashion because torch needs to depend on torch_cpu_library. In the end I exported everything and removed the redefinition in the Caffe2Config.cmake. However, I am not too sure why the old code did it in this way in the first place; however, it doesn't seem to have broken anything to switch it this way. * There's some uses of `__HIP_PLATFORM_HCC__` still in `torch_cpu` code, so I had to apply it to that library too (UGH). This manifests as a failer when trying to run the CUDA fuser. This doesn't really matter substantively right now because we still in-place HIPify, but it would be good to fix eventually. This was a bit difficult to debug because of an unrelated HIP bug, see ROCm/HIP#1706 Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> ghstack-source-id: 46c198af7b201d130cec6499476de2be4c102ad1 Pull Request resolved: #30315
…cuda,hip}" The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly * I had to torch_cpu/torch_cuda caffe2_interface_library so that they get whole-archived linked into torch when you statically link. And I had to do this in an *exported* fashion because torch needs to depend on torch_cpu_library. In the end I exported everything and removed the redefinition in the Caffe2Config.cmake. However, I am not too sure why the old code did it in this way in the first place; however, it doesn't seem to have broken anything to switch it this way. * There's some uses of `__HIP_PLATFORM_HCC__` still in `torch_cpu` code, so I had to apply it to that library too (UGH). This manifests as a failer when trying to run the CUDA fuser. This doesn't really matter substantively right now because we still in-place HIPify, but it would be good to fix eventually. This was a bit difficult to debug because of an unrelated HIP bug, see ROCm/HIP#1706 Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: [D18790941](https://our.internmc.facebook.com/intern/diff/D18790941) [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly * I had to torch_cpu/torch_cuda caffe2_interface_library so that they get whole-archived linked into torch when you statically link. And I had to do this in an *exported* fashion because torch needs to depend on torch_cpu_library. In the end I exported everything and removed the redefinition in the Caffe2Config.cmake. However, I am not too sure why the old code did it in this way in the first place; however, it doesn't seem to have broken anything to switch it this way. * There's some uses of `__HIP_PLATFORM_HCC__` still in `torch_cpu` code, so I had to apply it to that library too (UGH). This manifests as a failer when trying to run the CUDA fuser. This doesn't really matter substantively right now because we still in-place HIPify, but it would be good to fix eventually. This was a bit difficult to debug because of an unrelated HIP bug, see ROCm/HIP#1706 Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: [D18790941](https://our.internmc.facebook.com/intern/diff/D18790941) [ghstack-poisoned]
The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> ghstack-source-id: cb309b0d9b535a762964142b96fcedc6a45c6e67 Pull Request resolved: #30315
Summary: Pull Request resolved: #30315 The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of #29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly * I had to torch_cpu/torch_cuda caffe2_interface_library so that they get whole-archived linked into torch when you statically link. And I had to do this in an *exported* fashion because torch needs to depend on torch_cpu_library. In the end I exported everything and removed the redefinition in the Caffe2Config.cmake. However, I am not too sure why the old code did it in this way in the first place; however, it doesn't seem to have broken anything to switch it this way. * There's some uses of `__HIP_PLATFORM_HCC__` still in `torch_cpu` code, so I had to apply it to that library too (UGH). This manifests as a failer when trying to run the CUDA fuser. This doesn't really matter substantively right now because we still in-place HIPify, but it would be good to fix eventually. This was a bit difficult to debug because of an unrelated HIP bug, see ROCm/HIP#1706 Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D18790941 Pulled By: ezyang fbshipit-source-id: 01296f6089d3de5e8365251b490c51e694f2d6c7
Summary: Pull Request resolved: pytorch#30315 The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of pytorch#29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in pytorch#20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly * I had to torch_cpu/torch_cuda caffe2_interface_library so that they get whole-archived linked into torch when you statically link. And I had to do this in an *exported* fashion because torch needs to depend on torch_cpu_library. In the end I exported everything and removed the redefinition in the Caffe2Config.cmake. However, I am not too sure why the old code did it in this way in the first place; however, it doesn't seem to have broken anything to switch it this way. * There's some uses of `__HIP_PLATFORM_HCC__` still in `torch_cpu` code, so I had to apply it to that library too (UGH). This manifests as a failer when trying to run the CUDA fuser. This doesn't really matter substantively right now because we still in-place HIPify, but it would be good to fix eventually. This was a bit difficult to debug because of an unrelated HIP bug, see ROCm/HIP#1706 Fixes pytorch#27215 (as our libraries are smaller), and executes on part of the plan in pytorch#29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D18790941 Pulled By: ezyang fbshipit-source-id: 01296f6089d3de5e8365251b490c51e694f2d6c7
The torchbind tests didn't work be cause somehow we missed the rename of caffe2_gpu to torch_... (hip for us) in pytorch#20774 (merged 2019-06-13, oops) and thus the libs didn't compile.
Summary: The torchbind tests didn't work be cause somehow we missed the rename of caffe2_gpu to torch_... (hip for us) in #20774 (merged 2019-06-13, oops) and still tried to link against it. Pull Request resolved: #45426 Reviewed By: VitalyFedyunin Differential Revision: D24112439 Pulled By: walterddr fbshipit-source-id: a66a574e63714728183399c543d2dafbd6c028f7
This renames the CMake
caffe2
target totorch
, as well as renamingcaffe2_gpu
totorch_gpu
(and likewise for other gpu target variants). Many intermediate variables that don't manifest as artifacts of the build remain for now with the "caffe2" name; a complete purge ofcaffe2
from CMake variable names is beyond the scope of this PR.The shell
libtorch
library that had been introduced as a stopgap in #17783 is again flattened in this PR.