Windows C++ tensorflow_cc.dll has overlapping memory address between string gpu options for "allocator type" and "visible device list" #39439

Closed
kognat-docs opened this issue May 12, 2020 · 152 comments
Labels: comp:gpu (GPU related issues), comp:runtime (c++ runtime, performance issues (cpu)), TF 1.12 (Issues related to TF 1.12), type:performance (Performance Issue)


@kognat-docs


System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: NA
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 1.12.0 branched from 5b900cf
  • Python version: NA
  • Bazel version (if compiling from source): 0.19.2
  • GCC/Compiler version (if compiling from source): MSVC 2015
  • CUDA/cuDNN version: 10.0.130, 9.2.148
  • GPU model and memory: NVIDIA GP100 16Gb


Describe the current behavior

I am creating a session as follows (adapted from the original code):

   // Owns the session; the original code received this through a pointer.
   std::unique_ptr<tensorflow::Session> session;
   tensorflow::SessionOptions options;
   float fraction = 0.8f;
   int whichGPU = 0;
   int cuda_device_count = 1;
   tensorflow::GraphDef graph_def;
   // Load the frozen graph from disk.
   tensorflow::Status status = tensorflow::ReadBinaryProto(tensorflow::Env::Default(), "C:\\models\\graph.pb", &graph_def);
   auto* device_count = options.config.mutable_device_count();
   device_count->insert({ "GPU", cuda_device_count });
   device_count->insert({ "CPU", 1 });
   // Restrict the session to one GPU and 80% of its memory.
   options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(fraction);
   options.config.mutable_gpu_options()->set_visible_device_list(std::to_string(whichGPU));
   session.reset(tensorflow::NewSession(options));
   status = session->Create(graph_def);

which results in

    2020-05-12 09:41:28.214176: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
    name: Quadro GP100 major: 6 minor: 0 memoryClockRate(GHz): 1.4425
    pciBusID: 0000:01:00.0
    totalMemory: 16.00GiB freeMemory: 13.28GiB
    2020-05-12 09:41:28.215329: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
    2020-05-12 09:41:28.952392: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
    2020-05-12 09:41:28.952785: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
    2020-05-12 09:41:28.953095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
    2020-05-12 09:41:28.953962: E tensorflow/core/common_runtime/gpu/gpu_process_state.cc:106] Invalid allocator type: 0
    2020-05-12 09:41:28.954425: E tensorflow/core/common_runtime/session.cc:64] Failed to create session: Internal: Failed to get memory allocator for TF GPU 0 with 6899999744 bytes of memory.

Describe the expected behavior

The session is created and runs on GPU 0 only, using only 80% of the available memory.

Standalone code to reproduce the issue

#include "tensorflow/core/protobuf/control_flow.pb.h"
#include "tensorflow/core/protobuf/config.pb.h"
#include <iostream>

int main() {
  tensorflow::GPUOptions gpu_options;

  gpu_options.set_visible_device_list("0");

  std::cout << "allocator_type " << gpu_options.allocator_type() << std::endl; //print 0

}

Other info / logs

Please see the following issues
#16291
fo40225/tensorflow-windows-wheel#39

I have built my tensorflow.dll as follows:

$ENV:USE_BAZEL_VERSION="0.19.2"
$ENV:PYTHON_BIN_PATH=C:\ProgramData\Anaconda3\python.exe
$ENV:Path += ";C:\msys64\usr\bin"
$ENV:Path += ";C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin"
$ENV:Path += ";C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\extras\CUPTI\libx64"
$ENV:Path += ";C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\cudnn-9.2-windows10-x64-v7.5.0.56\cuda\bin"
$ENV:BAZEL_SH = "C:\msys64\usr\bin\bash.exe"
$ENV:CUDA_TOOLKIT_PATH="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2"
$ENV:TF_CUDA_VERSION="9.2"
$ENV:CUDNN_INSTALL_PATH="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\cudnn-9.2-windows10-x64-v7.5.0.56\cuda"
$ENV:TF_CUDNN_VERSION="7"
$ENV:TF_NCCL_VERSION="1"
$ENV:TF_CUDA_COMPUTE_CAPABILITIES="3.5,3.7,5.0,5.2,6.0,6.1"
$ENV:TF_CUDA_CLANG="0"
$ENV:TF_NEED_CUDA="1"
$ENV:TF_NEED_ROCM="0"
$ENV:TF_NEED_OPENCL_SYCL="0"

$params = "configure.py",""
Remove-Item -Recurse -Force "C:\Windows\system32\config\systemprofile_bazel_SYSTEM\install\75b09cf1ac98c0ffb0534079b30efcc4"
cmd /c "ECHO Y" | & python.exe @params
bazel.exe clean --expunge
bazel.exe build --copt=-nvcc_options=disable-warnings --test_tag_filters=-no_oss,-gpu,-benchmark-test,-nomac,-no_mac --announce_rc --test_timeout 300,450,1200,3600 --test_size_filters=small,medium --jobs=12 //tensorflow:libtensorflow_cc.so //tensorflow:libtensorflow_framework.so

Edits have been made to the following files:

within

tensorflow/BUILD

`"//tensorflow:windows": [],`

becomes

"//tensorflow:windows": [
            "-def:" +  # This line must be directly followed by the exported_symbols_msvc.lds file
            "$(location //tensorflow:tf_exported_symbols_msvc.lds)",
        ],

and within the tf_cc_shared_object function of tensorflow/BUILD

    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow:tf_exported_symbols.lds",
        "//tensorflow:tf_version_script.lds",
        "//tensorflow/c:c_api",
        "//tensorflow/c/eager:c_api",

becomes

    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow:tf_exported_symbols.lds",
        "//tensorflow:tf_exported_symbols_msvc.lds",
        "//tensorflow:tf_version_script.lds",
        "//tensorflow/c:c_api",
        "//tensorflow/c/eager:c_api",

The contents of tf_exported_symbols_msvc.lds are

LIBRARY tensorflow_cc
EXPORTS
    ??0MetaGraphDef@tensorflow@@QEAA@XZ
    ??1MetaGraphDef@tensorflow@@UEAA@XZ
    ??0LogMessageFatal@internal@tensorflow@@QEAA@PEBDH@Z
    ??1LogMessageFatal@internal@tensorflow@@UEAA@XZ
    ??0CheckOpMessageBuilder@internal@tensorflow@@QEAA@PEBD@Z
    ??1CheckOpMessageBuilder@internal@tensorflow@@QEAA@XZ
    ?ForVar2@CheckOpMessageBuilder@internal@tensorflow@@QEAAPEAV?$basic_ostream@DU?$char_traits@D@std@@@std@@XZ
    ?NewString@CheckOpMessageBuilder@internal@tensorflow@@QEAAPEAV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@XZ
    ?GetVarint32PtrFallback@core@tensorflow@@YAPEBDPEBD0PEAI@Z
    ?SlowCopyFrom@Status@tensorflow@@AEAAXPEBUState@12@@Z
    ?_GraphDef_default_instance_@tensorflow@@3VGraphDefDefaultTypeInternal@1@A
    ?NewSession@tensorflow@@YA?AVStatus@1@AEBUSessionOptions@1@PEAPEAVSession@1@@Z
    ?InitMain@port@tensorflow@@YAXPEBDPEAHPEAPEAPEAD@Z
    ?FastUInt64ToBufferLeft@strings@tensorflow@@YA_K_KPEAD@Z
    ?StrCat@strings@tensorflow@@YA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AEBVAlphaNum@12@@Z
    ?empty_string@Status@tensorflow@@CAAEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@XZ
    ?NewSession@tensorflow@@YAPEAVSession@1@AEBUSessionOptions@1@@Z
    ?InternalSwap@ConfigProto@tensorflow@@AEAAXPEAV12@@Z
    ?CopyFrom@ConfigProto@tensorflow@@QEAAXAEBV12@@Z
    ??0ConfigProto@tensorflow@@QEAA@XZ
    ?DebugString@TensorShapeRep@tensorflow@@QEBA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@XZ
    ??6tensorflow@@YAAEAV?$basic_ostream@DU?$char_traits@D@std@@@std@@AEAV12@AEBVStatus@0@@Z
    ??0Status@tensorflow@@QEAA@W4Code@error@1@Vstring_view@absl@@@Z
    ?StrCat@strings@tensorflow@@YA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AEBVAlphaNum@12@00@Z
    ??0SessionOptions@tensorflow@@QEAA@XZ
    ??1ConfigProto@tensorflow@@UEAA@XZ
    ?ReadBinaryProto@tensorflow@@YA?AVStatus@1@PEAVEnv@1@AEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@PEAVMessageLite@protobuf@google@@@Z
    ?Default@Env@tensorflow@@SAPEAV12@XZ
    ?CheckIsAlignedAndSingleElement@Tensor@tensorflow@@AEBAXXZ
    ?_SaverDef_default_instance_@tensorflow@@3VSaverDefDefaultTypeInternal@1@A
    ?CheckTypeAndIsAligned@Tensor@tensorflow@@AEBAXW4DataType@2@@Z
    ??1Tensor@tensorflow@@QEAA@XZ
    ??0Tensor@tensorflow@@QEAA@W4DataType@1@AEBVTensorShape@1@@Z
    ??0?$TensorShapeBase@VTensorShape@tensorflow@@@tensorflow@@QEAA@XZ
    ??0?$TensorShapeBase@VTensorShape@tensorflow@@@tensorflow@@QEAA@V?$Span@$$CB_J@absl@@@Z
    ?DestructorOutOfLine@TensorShapeRep@tensorflow@@AEAAXXZ
    ?TfCheckOpHelperOutOfLine@tensorflow@@YAPEAV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AEBVStatus@1@PEBD@Z
    ?SlowCopyFrom@TensorShapeRep@tensorflow@@AEAAXAEBV12@@Z
    ?dim_size@?$TensorShapeBase@VTensorShape@tensorflow@@@tensorflow@@QEBA_JH@Z
    ?CheckDimsEqual@TensorShape@tensorflow@@AEBAXH@Z
    ?CheckDimsAtLeast@TensorShape@tensorflow@@AEBAXH@Z
    ??0Tensor@tensorflow@@QEAA@XZ
    ??0GraphDef@tensorflow@@QEAA@XZ
    ??1GraphDef@tensorflow@@UEAA@XZ
    ?CheckType@Tensor@tensorflow@@AEBAXW4DataType@2@@Z
    ??1NodeDef@tensorflow@@UEAA@XZ
    ??0NodeDef@tensorflow@@QEAA@AEBV01@@Z
    ?DEVICE_CPU@tensorflow@@3QEBDEB
    ?DEVICE_GPU@tensorflow@@3QEBDEB
    ?DEVICE_SYCL@tensorflow@@3QEBDEB
    ?ThenBlasGemm@Stream@stream_executor@@QEAAAEAV12@W4Transpose@blas@2@0_K11MAEBV?$DeviceMemory@M@2@H2HMPEAV52@H@Z
    ?kDatasetGraphKey@DatasetBase@data@tensorflow@@2QBDB
    ??0?$TensorShapeBase@VTensorShape@tensorflow@@@tensorflow@@QEAA@V?$Span@$$CB_J@absl@@@Z
    ??0?$TensorShapeBase@VTensorShape@tensorflow@@@tensorflow@@QEAA@XZ
    ??0CheckOpMessageBuilder@internal@tensorflow@@QEAA@PEBD@Z
    ??0GraphDef@tensorflow@@QEAA@XZ
    ??0LogMessageFatal@internal@tensorflow@@QEAA@PEBDH@Z
    ??0MetaGraphDef@tensorflow@@QEAA@XZ
    ??0SessionOptions@tensorflow@@QEAA@XZ
    ??0Tensor@tensorflow@@QEAA@W4DataType@1@AEBVTensorShape@1@@Z
    ?DebugString@Tensor@tensorflow@@QEBA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@XZ
    ?Default@Env@tensorflow@@SAPEAV12@XZ
    ?DestructorOutOfLine@TensorShapeRep@tensorflow@@AEAAXXZ
    ?ForVar2@CheckOpMessageBuilder@internal@tensorflow@@QEAAPEAV?$basic_ostream@DU?$char_traits@D@std@@@std@@XZ
    ?GetVarint32PtrFallback@core@tensorflow@@YAPEBDPEBD0PEAI@Z
    ?SlowCopyFrom@TensorShapeRep@tensorflow@@AEAAXAEBV12@@Z
    ?TfCheckOpHelperOutOfLine@tensorflow@@YAPEAV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AEBVStatus@1@PEBD@Z
    ?ThenBlasGemm@Stream@stream_executor@@QEAAAEAV12@W4Transpose@blas@2@0_K11MAEBV?$DeviceMemory@M@2@H2HMPEAV52@H@Z
    ?ToString@Status@tensorflow@@QEBA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@XZ
    ??0Tensor@tensorflow@@QEAA@XZ
    ??1CheckOpMessageBuilder@internal@tensorflow@@QEAA@XZ
    ??1ConfigProto@tensorflow@@UEAA@XZ
    ??1LogMessageFatal@internal@tensorflow@@UEAA@XZ
    ??1NodeDef@tensorflow@@UEAA@XZ
    ??1Tensor@tensorflow@@QEAA@XZ
    ?CheckDimsAtLeast@TensorShape@tensorflow@@AEBAXH@Z
    ?CheckDimsEqual@TensorShape@tensorflow@@AEBAXH@Z
    ?CheckIsAlignedAndSingleElement@Tensor@tensorflow@@AEBAXXZ
    ?CheckTypeAndIsAligned@Tensor@tensorflow@@AEBAXW4DataType@2@@Z
    ?CopyFromInternal@Tensor@tensorflow@@AEAAXAEBV12@AEBVTensorShape@2@@Z
    ?_GraphDef_default_instance_@tensorflow@@3VGraphDefDefaultTypeInternal@1@A
    ?dim_size@?$TensorShapeBase@VTensorShape@tensorflow@@@tensorflow@@QEBA_JH@Z
    ?DebugString@Tensor@tensorflow@@QEBA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@XZ
    ?StrCat@strings@tensorflow@@YA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AEBVAlphaNum@12@0@Z
    ?CopyFrom@GraphDef@tensorflow@@QEAAXAEBV12@@Z
    ??_7ConfigProto@tensorflow@@6B@
    ??$CreateMaybeMessage@VGPUOptions@tensorflow@@$$V@Arena@protobuf@google@@CAPEAVGPUOptions@tensorflow@@PEAV012@@Z
    ??0GraphDef@tensorflow@@QEAA@AEBV01@@Z
    ?fixed_address_empty_string@internal@protobuf@google@@3V?$ExplicitlyConstructed@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@123@A

As documented by
#22047 (comment)

My software is linked against libprotobuf.lib from https://mirror.bazel.build/github.com/google/protobuf/archive/v3.6.0.tar.gz

built as

cmake -G "Visual Studio 14 2015 Win64"  .. -DCMAKE_INSTALL_PREFIX="%current%\protobuf-3.6.0" -Dprotobuf_BUILD_TESTS=OFF -Dprotobuf_BUILD_SHARED_LIBS=ON -Dprotobuf_MSVC_STATIC_RUNTIME=OFF
cmake --build . --target install --config Release -- /maxcpucount:12

I also tried editing tensorflow\tf_version_script.lds to include

*protobuf*

I also tried the TF_EXPORT macro from #include "tensorflow/core/platform/macros.h"

in
tensorflow/core/public/session_options.h
and
tensorflow/core/common_runtime/session_options.cc

as suggested by
https://github.com/sitting-duck/stuff/tree/master/ai/tensorflow/build_tensorflow_1.14_source_for_Windows

Do you have any suggestions about how to make sure that the GPU options for allocator type and visible device list do not share the same memory, while still having a monolithic DLL under Windows?

@kognat-docs kognat-docs added the type:bug Bug label May 12, 2020
@ravikyram ravikyram added comp:gpu GPU related issues TF 1.12 Issues related to TF 1.12 labels May 12, 2020
@ravikyram ravikyram assigned gowthamkpr and unassigned ravikyram May 12, 2020
@ravikyram ravikyram added comp:runtime c++ runtime, performance issues (cpu) type:performance Performance Issue and removed type:bug Bug labels May 12, 2020
@samhodge

So clearly it is a compilation and linking problem. These attributes are part of the same protobuf message:

https://github.com/tensorflow/tensorflow/blob/r1.12/tensorflow/core/protobuf/config.proto

So the symbol they address will have the same name.

Which is

?fixed_address_empty_string@internal@protobuf@google@@3V?$ExplicitlyConstructed@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@123@A

How can the compilation process resolve the same protobuf symbol to different memory addresses for the message in the .cc file above?

@kognat-docs
Author

Mentioning similar issue:
#23542

@kognat-docs
Author

Where is the object file for the config.proto file mentioned above?

I could find

bazel-bin/tensorflow/core/_objs/master_proto_cc/master.pb.o

I could try linking against that rather than exposing the symbol from tensorflow.dll

But I have yet to do a symbol dump from that file to see if

?fixed_address_empty_string@internal@protobuf@google@@3V?$ExplicitlyConstructed@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@123@A is there.

@kognat-docs
Author

Here is another sign of hope:

#23542 (comment)

@kognat-docs
Author

Mentioning @ttdd11 @Steroes @ZhuoranLyu @brantl @sitting-duck who have been near this issue before.

@kognat-docs
Author

OK here it is

From my continuous integration test:

$ cl -nologo -EHsc -GR -Zc:forScope -Zc:wchar_t  .\main.cpp .\tensorflow-r1.12\bazel-bin\tensorflow\tensorflow_cc.lib .\tensorflow-r1.12\bazel-bin\tensorflow\tensorflow_framework.lib .\protobuf-3.6.0\lib\libprotobuf.lib  /I.\abseil-cpp-kognat\include /I.\eigen-fd6845384b86\include /I.\eigen-fd6845384b86\include\eigen3 /I.\protobuf-3.6.0\include /I.\tensorflow-r1.12 /I.\tensorflow-r1.12\bazel-genfiles;
main.cpp
$ Set-Item -Path Env:Path -Value ("C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files (x86)\MSBuild\14.0\bin\amd64;C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\amd64;C:\Windows\Microsoft.NET\Framework64\v4.0.30319;C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\VCPackages;C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE;C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\Tools;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files (x86)\Microsoft Visual Studio 14.0\Team Tools\Performance Tools\x64;C:\Program Files (x86)\Microsoft Visual Studio 14.0\Team Tools\Performance Tools;C:\Program Files (x86)\Windows Kits\10\bin\x64;C:\Program Files (x86)\Windows Kits\10\bin\x86;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.6.1 Tools\x64\;C:\Perl64\site\bin;C:\Perl64\bin;C:\ProgramData\Anaconda3;C:\ProgramData\Anaconda3\Library\mingw-w64\bin;C:\ProgramData\Anaconda3\Library\usr\bin;C:\ProgramData\Anaconda3\Library\bin;C:\ProgramData\Anaconda3\Scripts;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Windows\System32\OpenSSH\;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files (x86)\Kognat\shared_libraries;C:\Users\Sam Hodge\.dnx\bin;C:\Program Files\Microsoft DNX\Dnvm\;C:\Program Files\Microsoft SQL Server\130\Tools\Binn\;C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\;C:\Program Files\Git\cmd;C:\Program Files (x86)\Bazel;C:\Program Files\CMake\bin;C:\Program Files\Git LFS;C:\Program Files\7-Zip;C:\Program Files (x86)\GnuWin32\bin;C:\Program Files (x86)\wget-1.20.3-win64;C:\Program Files\NASM;C:\msys64\mingw64\bin;C:\msys64\usr\bin;C:\msys64\clang64\bin;C:\Users\Sam Hodge\AppData\Local\Microsoft\WindowsApps;.\protobuf-3.6.0\bin;.\tensorflow-r1.12\bazel-bin\tensorflow;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\extras\CUPTI\libx64;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\cudnn-9.2-windows10-x64-v7.5.0.56\cuda\bin;")
$ .\main.exe
allocator_type 0
visible_device_list 0
address visible_device_list 00007FFE73777BC0
address allocator_type 00007FFE73777BC0
Job succeeded

Here is main.cpp

#include "tensorflow/core/protobuf/control_flow.pb.h"
#include "tensorflow/core/protobuf/config.pb.h"
#include <iostream>

int main() {
  tensorflow::GPUOptions gpu_options;

  gpu_options.set_visible_device_list("0");

  std::cout << "allocator_type " << gpu_options.allocator_type() << std::endl; //print 0
  std::cout << "visible_device_list " << gpu_options.visible_device_list() << std::endl; //print 0
  std::cout << "address visible_device_list " << static_cast<const void*>((gpu_options.visible_device_list().c_str())) << std::endl; //Where is this damn string
  std::cout << "address allocator_type " << static_cast<const void*>((gpu_options.allocator_type().c_str())) << std::endl; //Where is this damn string

}

Now I will see if I can get it to work somehow so those two things are not on top of each other.

@kognat-docs
Author

Can anyone describe to me how

https://github.com/tensorflow/tensorflow/blob/r1.12/tensorflow/core/protobuf/config.proto

becomes those two std::string c_str() pointers at the same memory address?

I do not understand the protobuf and bazel process particularly well.

@kognat-docs
Author

kognat-docs commented May 14, 2020

Here are the contents of config.pb.h and config.pb.cc, found in .\tensorflow-r1.12\bazel-genfiles\tensorflow\core\protobuf, in the attached .zip files:

config.pb.h.zip
config.pb.cc.zip

@kognat-docs
Author

kognat-docs commented May 14, 2020

Here are the control_flow.pb.h and control_flow.pb.cc as .zip files also from .\tensorflow-r1.12\bazel-genfiles\tensorflow\core\protobuf

control_flow.pb.h.zip
control_flow.pb.cc.zip

@gowthamkpr gowthamkpr assigned sanjoy and unassigned gowthamkpr May 14, 2020
@gowthamkpr gowthamkpr added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label May 14, 2020
@samhodge

The same code built against a static version of TensorFlow under Linux does not share the same address.

@samhodge

@sanjoy if you need any additional information, everything I do is triggered by repeatable scripts in a CI environment. This is not a roulette process but a repeatable one.

@samhodge

@gunan Do you have anyone working on this?

@gunan
Contributor

gunan commented May 20, 2020

Does the issue occur on a newer version?
1.12 is definitely outside our support window. I can try exploring this on 1.15, but realistically, master or 2.2 are the ones I will be able to help most with.

@samhodge

@gunan

I will make an attempt, but r1.12+ requires hardware instructions that are more modern than some legacy hardware I wanted to support (I think AVX, AVX2 and SSE4).

I will see if I can do the build on a cloud box. I think others have reported this on 1.15, so I will look for a reference to that first before spinning up a whole new build platform.

Sam

@samhodge

samhodge commented Jun 20, 2020

I think I need static linking a little more "out of the box"

My hacked solution, without linking to protocol buffers (protobuf_archive) 3.6.0 on TensorFlow r1.12, leaves me with the following missing symbols.

Without betting property on it, these all come from .proto files:

  example.obj : error LNK2019: unresolved external symbol "void __cdecl tensorflow::port::InitMain(char const *,int *,char * * *)"
  example.obj : error LNK2019: unresolved external symbol "__declspec(dllimport) public: void * __cdecl google::protobuf::internal::ArenaImpl::AllocateAligned(unsigned __int64)"
  example.obj : error LNK2019: unresolved external symbol "__declspec(dllimport) public: void __cdecl google::protobuf::internal::ArenaImpl::AddCleanup(void *,void (__cdecl*)(void *))"
  example.obj : error LNK2019: unresolved external symbol "__declspec(dllimport) private: void __cdecl google::protobuf::Arena::OnArenaAllocation(class type_info const *,unsigned __int64)const "
  example.obj : error LNK2019: unresolved external symbol "__declspec(dllimport) public: void __cdecl google::protobuf::internal::ArenaStringPtr::Set(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >const *,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,class google::protobuf::Arena *)"
  example.obj : error LNK2019: unresolved external symbol "__declspec(dllimport) class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const & __cdecl google::protobuf::internal::GetEmptyStringAlreadyInited(void)"
  example.obj : error LNK2019: unresolved external symbol "__declspec(dllimport) public: bool __cdecl google::protobuf::MessageLite::ParseFromArray(void const *,int)"
  example.obj : error LNK2019: unresolved external symbol "public: __cdecl tensorflow::GraphDef::GraphDef(void)" (??0GraphDef@tensorflow@@QEAA@XZ)
  example.obj : error LNK2019: unresolved external symbol "public: virtual __cdecl tensorflow::GraphDef::~GraphDef(void)" (??1GraphDef@tensorflow@@UEAA@XZ)
  example.obj : error LNK2019: unresolved external symbol "public: __cdecl tensorflow::internal::LogMessageFatal::LogMessageFatal(char const *,int)"
  example.obj : error LNK2019: unresolved external symbol "public: __cdecl tensorflow::internal::CheckOpMessageBuilder::~CheckOpMessageBuilder(void)"
  example.obj : error LNK2019: unresolved external symbol "public: class std::basic_ostream<char,struct std::char_traits<char> > * __cdecl tensorflow::internal::CheckOpMessageBuilder::ForVar2(void)" (?ForVar2@CheckOpMessageBuilder@internal@tensorflow@@QEAAPEAV?$basic_ostream@DU?$char_traits@D@std@@@std@@XZ) referenced in function "class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > * __cdecl tensorflow::internal::MakeCheckOpString<__int64,__int64>(__int64 const &,__int64 const &,char const *)"
  example.obj : error LNK2019: unresolved external symbol "public: class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > * __cdecl tensorflow::internal::CheckOpMessageBuilder::NewString(void)" (?NewString@CheckOpMessageBuilder@internal@tensorflow@@QEAAPEAV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@XZ) referenced in function "class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > * __cdecl tensorflow::internal::MakeCheckOpString<__int64,__int64>(__int64 const &,__int64 const &,char const *)"
  example.obj : error LNK2019: unresolved external symbol "unsigned __int64 __cdecl tensorflow::strings::FastUInt64ToBufferLeft(unsigned __int64,char *)"
  example.obj : error LNK2019: unresolved external symbol "class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > __cdecl tensorflow::strings::StrCat(class tensorflow::strings::AlphaNum const &)"
  example.obj : error LNK2019: unresolved external symbol "class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > __cdecl tensorflow::strings::StrCat(class tensorflow::strings::AlphaNum const &,class tensorflow::strings::AlphaNum const &)"
  example.obj : error LNK2019: unresolved external symbol "public: __cdecl tensorflow::Status::Status(enum tensorflow::error::Code,class absl::string_view)"
  example.obj : error LNK2019: unresolved external symbol "private: static class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const & __cdecl tensorflow::Status::empty_string(void)"
  example.obj : error LNK2019: unresolved external symbol "public: class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > __cdecl tensorflow::TensorShapeRep::DebugString(void)const "
  example.obj : error LNK2019: unresolved external symbol "private: void __cdecl tensorflow::TensorShapeRep::DestructorOutOfLine(void)" (?DestructorOutOfLine@TensorShapeRep@tensorflow@@AEAAXXZ) referenced in function "public: __cdecl tensorflow::TensorShape::~TensorShape(void)"
  example.obj : error LNK2019: unresolved external symbol "private: void __cdecl tensorflow::TensorShapeRep::SlowCopyFrom(class tensorflow::TensorShapeRep const &)"
  example.obj : error LNK2019: unresolved external symbol "public: __cdecl tensorflow::TensorShapeBase<class tensorflow::TensorShape>::TensorShapeBase<class tensorflow::TensorShape>(class absl::Span<__int64 const >)"
  example.obj : error LNK2019: unresolved external symbol "private: void __cdecl tensorflow::TensorShape::CheckDimsEqual(int)const "
  example.obj : error LNK2019: unresolved external symbol "private: void __cdecl tensorflow::TensorShape::CheckDimsAtLeast(int)const "
  example.obj : error LNK2019: unresolved external symbol "public: __cdecl tensorflow::Tensor::Tensor(enum tensorflow::DataType,class tensorflow::TensorShape const &)"
  example.obj : error LNK2019: unresolved external symbol "public: __cdecl tensorflow::Tensor::~Tensor(void)"
  example.obj : error LNK2019: unresolved external symbol "private: void __cdecl tensorflow::Tensor::CheckTypeAndIsAligned(enum tensorflow::DataType)const "
  example.obj : error LNK2019: unresolved external symbol "private: static class tensorflow::GPUOptions * __cdecl google::protobuf::Arena::CreateMaybeMessage<class tensorflow::GPUOptions>(class google::protobuf::Arena *)"
  example.obj : error LNK2019: unresolved external symbol "public: virtual __cdecl tensorflow::ConfigProto::~ConfigProto(void)"
  example.obj : error LNK2001: unresolved external symbol "public: void __cdecl tensorflow::ConfigProto::CopyFrom(class tensorflow::ConfigProto const &)"
  example.obj : error LNK2001: unresolved external symbol "private: void __cdecl tensorflow::ConfigProto::InternalSwap(class tensorflow::ConfigProto *)"
  example.obj : error LNK2019: unresolved external symbol "__declspec(dllimport) public: __cdecl tensorflow::SessionOptions::SessionOptions(void)"
  example.obj : error LNK2019: unresolved external symbol "class tensorflow::Session * __cdecl tensorflow::NewSession(struct tensorflow::SessionOptions const &)"
  example.obj : error LNK2001: unresolved external symbol "const tensorflow::ConfigProto::`vftable'"

Here is a list of my libs:

$ENV:TFLIBS1="/WHOLEARCHIVE:c/eager/libc_api.a;/WHOLEARCHIVE:core/distributed_runtime/rpc/eager/libgrpc_eager_client.a;/WHOLEARCHIVE:core/distributed_runtime/rpc/libgrpc_server_lib.lo;/WHOLEARCHIVE:core/distributed_runtime/rpc/libgrpc_master_service.lo;/WHOLEARCHIVE:core/distributed_runtime/rpc/libgrpc_master_service_impl.a;/WHOLEARCHIVE:core/distributed_runtime/liblocal_master.a;/WHOLEARCHIVE:core/distributed_runtime/libmaster.a;/WHOLEARCHIVE:core/distributed_runtime/libmaster_session.a;/WHOLEARCHIVE:core/distributed_runtime/libscheduler.a;/WHOLEARCHIVE:core/distributed_runtime/librpc_collective_executor_mgr.a;/WHOLEARCHIVE:core/distributed_runtime/libcollective_param_resolver_distributed.a;/WHOLEARCHIVE:core/distributed_runtime/libcollective_rma_distributed.a;/WHOLEARCHIVE:core/distributed_runtime/libdevice_resolver_distributed.a;/WHOLEARCHIVE:core/distributed_runtime/rpc/eager/libgrpc_eager_service_impl.a;/WHOLEARCHIVE:core/distributed_runtime/rpc/eager/libgrpc_eager_service.a;/WHOLEARCHIVE:core/distributed_runtime/eager/libeager_service_impl.a;/WHOLEARCHIVE:core/common_runtime/eager/libexecute.a;/WHOLEARCHIVE:core/common_runtime/eager/libeager_operation.a;/WHOLEARCHIVE:core/common_runtime/eager/libattr_builder.a;/WHOLEARCHIVE:core/common_runtime/eager/libtensor_handle.a;/WHOLEARCHIVE:core/common_runtime/eager/libcontext.a;/WHOLEARCHIVE:core/common_runtime/eager/libeager_executor.a;/WHOLEARCHIVE:core/common_runtime/eager/libkernel_and_device.a;/WHOLEARCHIVE:core/libeager_service_proto_cc.a;/WHOLEARCHIVE:core/distributed_runtime/rpc/libgrpc_worker_cache.a;/WHOLEARCHIVE:core/distributed_runtime/rpc/libgrpc_channel.a;/WHOLEARCHIVE:core/distributed_runtime/rpc/libgrpc_remote_worker.a;/WHOLEARCHIVE:core/distributed_runtime/libworker_cache_logger.a;/WHOLEARCHIVE:core/distributed_runtime/libworker_cache_partial.a;/WHOLEARCHIVE:core/distributed_runtime/rpc/libgrpc_worker_service.a;/WHOLEARCHIVE:core/distributed_runtime/rpc/libgrpc_tensor_coding.a;/WHOLEARCHIVE:core/distr
ibuted_runtime/rpc/libgrpc_worker_service_impl.a;/WHOLEARCHIVE:core/distributed_runtime/rpc/libgrpc_util.a;/WHOLEARCHIVE:core/distributed_runtime/librecent_request_ids.a;/WHOLEARCHIVE:core/distributed_runtime/libworker.a;/WHOLEARCHIVE:core/distributed_runtime/libpartial_run_mgr.a;/WHOLEARCHIVE:core/distributed_runtime/libsession_mgr.a;/WHOLEARCHIVE:core/distributed_runtime/rpc/librpc_rendezvous_mgr.a;/WHOLEARCHIVE:core/distributed_runtime/libbase_rendezvous_mgr.a;/WHOLEARCHIVE:core/distributed_runtime/libworker_session.a;/WHOLEARCHIVE:core/distributed_runtime/libgraph_mgr.a;/WHOLEARCHIVE:core/debug/libdebug.lo;/WHOLEARCHIVE:core/debug/libdebugger_state_impl.lo;/WHOLEARCHIVE:core/libdebug_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libdebug_ops.lo;/WHOLEARCHIVE:core/debug/libdebug_io_utils.lo;/WHOLEARCHIVE:core/debug/libdebug_callback_registry.a;/WHOLEARCHIVE:core/debug/libdebug_node_key.a;/WHOLEARCHIVE:core/debug/libdebug_service_proto_cc.a;/WHOLEARCHIVE:core/debug/libdebugger_event_metadata_proto_cc.a;/WHOLEARCHIVE:grpc/libgrpc++.a;/WHOLEARCHIVE:grpc/libgrpc++_base.a;/WHOLEARCHIVE:grpc/libgrpc.a;/WHOLEARCHIVE:grpc/libcensus.a;/WHOLEARCHIVE:grpc/libgrpc_lb_policy_pick_first.a;/WHOLEARCHIVE:grpc/libgrpc_lb_policy_round_robin.a;/WHOLEARCHIVE:grpc/libgrpc_server_load_reporting.a;/WHOLEARCHIVE:grpc/libgrpc_max_age_filter.a;/WHOLEARCHIVE:grpc/libgrpc_message_size_filter.a;/WHOLEARCHIVE:grpc/libgrpc_resolver_dns_ares.a;/WHOLEARCHIVE:grpc/third_party/address_sorting/libaddress_sorting.a;/WHOLEARCHIVE:grpc/libgrpc_resolver_dns_native.a;/WHOLEARCHIVE:grpc/libgrpc_resolver_sockaddr.a;/WHOLEARCHIVE:grpc/libgrpc_transport_chttp2_server_insecure.a;/WHOLEARCHIVE:grpc/libgrpc_transport_inproc.a;/WHOLEARCHIVE:grpc/libgrpc_workaround_cronet_compression_filter.a;/WHOLEARCHIVE:grpc/libgrpc_server_backward_compatibility.a;/WHOLEARCHIVE:grpc/libgrpc_lb_policy_grpclb_secure.a;/WHOLEARCHIVE:grpc/libgrpc_resolver_fake.a;/WHOLEARCHIVE:grpc/libgrpc_transport_chttp2_client_secure.a;/WHOLEARCHIVE
:grpc/libgrpc_transport_chttp2_server_secure.a;/WHOLEARCHIVE:grpc/libgrpc_secure.a;/WHOLEARCHIVE:grpc/libtsi.a;/WHOLEARCHIVE:grpc/libalts_frame_protector.a;/WHOLEARCHIVE:grpc/libalts_util.a;/WHOLEARCHIVE:grpc/libalts_proto.a;/WHOLEARCHIVE:grpc/libgrpc_nanopb.a;/WHOLEARCHIVE:grpc/libgrpc_transport_chttp2_client_insecure.a;/WHOLEARCHIVE:grpc/libgrpc_transport_chttp2_client_connector.a;/WHOLEARCHIVE:grpc/libgrpc_client_channel.a;/WHOLEARCHIVE:grpc/libgrpc_client_authority_filter.a;/WHOLEARCHIVE:grpc/libgrpc_deadline_filter.a;/WHOLEARCHIVE:grpc/libtsi_interface.a;/WHOLEARCHIVE:boringssl/libssl.a;/WHOLEARCHIVE:boringssl/libcrypto.a;/WHOLEARCHIVE:grpc/libgrpc_transport_chttp2_server.a;/WHOLEARCHIVE:grpc/libgrpc_transport_chttp2.a;/WHOLEARCHIVE:grpc/libgrpc_http_filters.a;/WHOLEARCHIVE:grpc/libgrpc_base.a;/WHOLEARCHIVE:grpc/libgrpc_base_c.a;/WHOLEARCHIVE:grpc/libgrpc_trace.a;/WHOLEARCHIVE:grpc/libgrpc_transport_chttp2_alpn.a;/WHOLEARCHIVE:grpc/libgpr_base.a;/WHOLEARCHIVE:grpc/libgrpc++_codegen_base_src.a;/WHOLEARCHIVE:core/distributed_runtime/librequest_id.a;/WHOLEARCHIVE:core/distributed_runtime/libtensor_coding.a;/WHOLEARCHIVE:core/distributed_runtime/libremote_device.a;/WHOLEARCHIVE:core/distributed_runtime/libcall_options.a;/WHOLEARCHIVE:core/distributed_runtime/libmessage_wrappers.a;/WHOLEARCHIVE:core/libmaster_proto_cc.a;/WHOLEARCHIVE:core/libworker_proto_cc.a;/WHOLEARCHIVE:core/distributed_runtime/libserver_lib.a;/WHOLEARCHIVE:cc/libclient_session.a;/WHOLEARCHIVE:cc/profiler/libprofiler.a;/WHOLEARCHIVE:core/profiler/internal/libtfprof_stats.a;/WHOLEARCHIVE:core/profiler/internal/libtfprof_code.a;/WHOLEARCHIVE:core/profiler/internal/libtfprof_graph.a;/WHOLEARCHIVE:core/profiler/internal/libtfprof_op.a;/WHOLEARCHIVE:core/profiler/internal/libtfprof_show_multi.a;/WHOLEARCHIVE:core/profiler/internal/libtfprof_scope.a;/WHOLEARCHIVE:core/profiler/internal/libtfprof_show.a;/WHOLEARCHIVE:core/profiler/internal/libtfprof_tensor.a;/WHOLEARCHIVE:core/profiler/internal/libtfpro
f_timeline.a;/WHOLEARCHIVE:core/profiler/internal/libtfprof_node_show.a;/WHOLEARCHIVE:core/profiler/internal/libtfprof_node.a;/WHOLEARCHIVE:jsoncpp_git/libjsoncpp.a;/WHOLEARCHIVE:core/profiler/internal/libtfprof_utils.a;/WHOLEARCHIVE:c/libcheckpoint_reader.a;/WHOLEARCHIVE:c/libtf_status_helper.a;/WHOLEARCHIVE:c/libc_api.a;/WHOLEARCHIVE:cc/saved_model/libloader_lite.a;/WHOLEARCHIVE:cc/saved_model/libreader.a;/WHOLEARCHIVE:cc/libarray_grad.lo;/WHOLEARCHIVE:cc/libdata_flow_grad.lo;/WHOLEARCHIVE:cc/libimage_grad.lo;/WHOLEARCHIVE:cc/libmath_grad.lo;/WHOLEARCHIVE:cc/libnn_grad.lo;/WHOLEARCHIVE:cc/libgradients.a;/WHOLEARCHIVE:cc/libgrad_op_registry.a;/WHOLEARCHIVE:cc/libwhile_loop.a;/WHOLEARCHIVE:cc/libcc_ops.lo;/WHOLEARCHIVE:cc/libcc_ops_internal.lo;/WHOLEARCHIVE:cc/libconst_op.a;/WHOLEARCHIVE:cc/libscope.a;/WHOLEARCHIVE:cc/libops.a;/WHOLEARCHIVE:core/libop_gen_lib.a;/WHOLEARCHIVE:core/profiler/libtfprof_options.a;/WHOLEARCHIVE:core/profiler/libprotos_all_cc.a;/WHOLEARCHIVE:core/kernels/libbatch_space_ops.lo;/WHOLEARCHIVE:core/kernels/libbatch_space_ops_gpu.lo;/WHOLEARCHIVE:core/kernels/libbcast_ops.lo;/WHOLEARCHIVE:core/kernels/libbitcast_op.lo;/WHOLEARCHIVE:core/kernels/libbroadcast_to_op.lo;/WHOLEARCHIVE:core/kernels/libbroadcast_to_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libconcat_op.lo;/WHOLEARCHIVE:core/kernels/libconstant_op.lo;/WHOLEARCHIVE:core/kernels/libconstant_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libdepth_space_ops.lo;/WHOLEARCHIVE:core/kernels/libdepth_space_ops_gpu.lo;/WHOLEARCHIVE:core/kernels/libdiag_op.lo;/WHOLEARCHIVE:core/kernels/libdiag_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libedit_distance_op.lo;/WHOLEARCHIVE:core/kernels/libextract_image_patches_op.lo;/WHOLEARCHIVE:core/kernels/libextract_image_patches_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libextract_volume_patches_op.lo;/WHOLEARCHIVE:core/kernels/libextract_volume_patches_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libgather_nd_op.lo;/WHOLEARCHIVE:core/kernels/libgather_nd_op_gpu.lo;/WHOLEARCHIVE:core/kernels/
libgather_op.lo;/WHOLEARCHIVE:core/kernels/libguarantee_const_op.lo;/WHOLEARCHIVE:core/kernels/libhost_constant_op.lo;/WHOLEARCHIVE:core/kernels/libidentity_n_op.lo;/WHOLEARCHIVE:core/kernels/libidentity_op.lo;/WHOLEARCHIVE:core/kernels/liblistdiff_op.lo;/WHOLEARCHIVE:core/kernels/libmatrix_diag_op.lo;/WHOLEARCHIVE:core/kernels/libmatrix_diag_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libmatrix_set_diag_op.lo;/WHOLEARCHIVE:core/kernels/libmatrix_set_diag_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libmirror_pad_op.lo;/WHOLEARCHIVE:core/kernels/libmirror_pad_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libone_hot_op.lo;/WHOLEARCHIVE:core/kernels/libone_hot_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libpack_op.lo;/WHOLEARCHIVE:core/kernels/libpad_op.lo;/WHOLEARCHIVE:core/kernels/libpad_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libquantize_and_dequantize_op.lo;/WHOLEARCHIVE:core/kernels/libquantize_and_dequantize_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libreshape_op.lo;/WHOLEARCHIVE:core/kernels/libreverse_op.lo;/WHOLEARCHIVE:core/kernels/libreverse_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libreverse_sequence_op.lo;/WHOLEARCHIVE:core/kernels/libreverse_sequence_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libshape_ops.lo;/WHOLEARCHIVE:core/kernels/libslice_op.lo;/WHOLEARCHIVE:core/kernels/libslice_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libsnapshot_op.lo;/WHOLEARCHIVE:core/kernels/libsnapshot_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libsplit_op.lo;/WHOLEARCHIVE:core/kernels/libsplit_v_op.lo;/WHOLEARCHIVE:core/kernels/libstrided_slice_op.lo;/WHOLEARCHIVE:core/kernels/libstrided_slice_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libtile_ops.lo;/WHOLEARCHIVE:core/kernels/libtile_ops_gpu.lo;/WHOLEARCHIVE:core/kernels/libtranspose_op.lo;/WHOLEARCHIVE:core/kernels/libunique_op.lo;/WHOLEARCHIVE:core/kernels/libunpack_op.lo;/WHOLEARCHIVE:core/kernels/libunravel_index_op.lo;/WHOLEARCHIVE:core/kernels/libwhere_op.lo;/WHOLEARCHIVE:core/kernels/libwhere_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libdecode_wav_op.lo;/WHOLEARCHIVE:core/kernels/liben
code_wav_op.lo;/WHOLEARCHIVE:core/kernels/libmfcc_op.lo;/WHOLEARCHIVE:core/kernels/libmfcc.a;/WHOLEARCHIVE:core/kernels/libmfcc_dct.a;/WHOLEARCHIVE:core/kernels/libmfcc_mel_filterbank.a;/WHOLEARCHIVE:core/kernels/libspectrogram_op.lo;/WHOLEARCHIVE:core/kernels/libspectrogram.a;/WHOLEARCHIVE:core/libaudio_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libbatch_kernels.lo;/WHOLEARCHIVE:core/libbatch_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/batching_util/libperiodic_function_dynamic.a;/WHOLEARCHIVE:core/kernels/boosted_trees/libprediction_ops.lo;/WHOLEARCHIVE:core/kernels/boosted_trees/libquantile_ops.lo;/WHOLEARCHIVE:core/kernels/boosted_trees/libresource_ops.lo;/WHOLEARCHIVE:core/kernels/boosted_trees/libstats_ops.lo;/WHOLEARCHIVE:core/kernels/boosted_trees/libtraining_ops.lo;/WHOLEARCHIVE:core/kernels/boosted_trees/libresources.a;/WHOLEARCHIVE:core/libboosted_trees_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/boosted_trees/libboosted_trees_proto_cc.a;/WHOLEARCHIVE:core/kernels/libcandidate_sampler_ops.lo;/WHOLEARCHIVE:core/kernels/librange_sampler.a;/WHOLEARCHIVE:core/libcandidate_sampling_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libgenerate_vocab_remapping_op.lo;/WHOLEARCHIVE:core/kernels/libload_and_remap_matrix_op.lo;/WHOLEARCHIVE:core/libcheckpoint_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libcollective_ops.lo;/WHOLEARCHIVE:core/libcollective_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libctc_ops.lo;/WHOLEARCHIVE:core/libctc_ops_op_lib.lo;/WHOLEARCHIVE:core/util/ctc/libctc_loss_calculator_lib.a;/WHOLEARCHIVE:core/kernels/libcudnn_rnn_kernels.lo;/WHOLEARCHIVE:core/libcudnn_rnn_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libbarrier_ops.lo;/WHOLEARCHIVE:core/kernels/libconditional_accumulator_base_op.lo;/WHOLEARCHIVE:core/kernels/libconditional_accumulator_op.lo;/WHOLEARCHIVE:core/kernels/libdynamic_partition_op.lo;/WHOLEARCHIVE:core/kernels/libdynamic_partition_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libdynamic_stitch_op.lo;/WHOLEARCHIVE:core/kernels/libdynamic_stitch_op_gpu.lo;/WHOL
EARCHIVE:core/kernels/libfifo_queue_op.lo;/WHOLEARCHIVE:core/kernels/libmap_stage_op.lo;/WHOLEARCHIVE:core/kernels/libpadding_fifo_queue_op.lo;/WHOLEARCHIVE:core/kernels/libpriority_queue_op.lo;"

$ENV:TFLIBS2="/WHOLEARCHIVE:core/kernels/libqueue_ops.lo;/WHOLEARCHIVE:core/kernels/librandom_shuffle_queue_op.lo;/WHOLEARCHIVE:core/kernels/librecord_input_op.lo;/WHOLEARCHIVE:core/kernels/libsession_ops.lo;/WHOLEARCHIVE:core/kernels/libsparse_conditional_accumulator_op.lo;/WHOLEARCHIVE:core/kernels/libstack_ops.lo;/WHOLEARCHIVE:core/kernels/libstage_op.lo;/WHOLEARCHIVE:core/kernels/libtensor_array_ops.lo;/WHOLEARCHIVE:core/kernels/libconditional_accumulator_base.a;/WHOLEARCHIVE:core/kernels/libpadding_fifo_queue.a;/WHOLEARCHIVE:core/kernels/libfifo_queue.a;/WHOLEARCHIVE:core/kernels/libpriority_queue.a;/WHOLEARCHIVE:core/kernels/libqueue_op.a;/WHOLEARCHIVE:core/kernels/libsplit_lib.a;/WHOLEARCHIVE:core/kernels/libsplit_lib_gpu.lo;/WHOLEARCHIVE:core/kernels/libtensor_array.lo;/WHOLEARCHIVE:core/kernels/libqueue_base.a;/WHOLEARCHIVE:core/libdata_flow_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/data/libbatch_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libcache_dataset_ops.lo;/WHOLEARCHIVE:core/kernels/data/libconcatenate_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libdataset_ops.lo;/WHOLEARCHIVE:core/kernels/data/libdense_to_sparse_batch_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libfilter_by_component_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libfilter_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libflat_map_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libgenerator_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libgroup_by_reducer_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libgroup_by_window_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libinterleave_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libiterator_ops.lo;/WHOLEARCHIVE:core/kernels/data/libmap_and_batch_dataset_op.lo;/WHOLEARCHIVE:core/kernels/libinplace_ops.lo;/WHOLEARCHIVE:core/kernels/libinplace_ops_gpu.lo;/WHOLEARCHIVE:core/kernels/data/libmap_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libmap_defun_op.lo;/WHOLEARCHIVE:core/kernels/data/libmodel_dataset_op.lo;/WHOLEARCHIVE:core/kernels/dat
a/libmulti_device_iterator_ops.lo;/WHOLEARCHIVE:core/kernels/data/liboptimize_dataset_op.lo;/WHOLEARCHIVE:core/grappler/libgrappler_item_builder.a;/WHOLEARCHIVE:core/grappler/inputs/libutils.a;/WHOLEARCHIVE:core/grappler/optimizers/data/libfilter_fusion.a;/WHOLEARCHIVE:core/grappler/optimizers/data/libhoist_random_uniform.a;/WHOLEARCHIVE:core/grappler/optimizers/data/liblatency_all_edges.a;/WHOLEARCHIVE:core/grappler/optimizers/data/libmap_and_batch_fusion.a;/WHOLEARCHIVE:core/grappler/optimizers/data/libmap_and_filter_fusion.a;/WHOLEARCHIVE:core/grappler/optimizers/data/libmap_fusion.a;/WHOLEARCHIVE:core/grappler/optimizers/data/libfusion_utils.a;/WHOLEARCHIVE:core/kernels/libcontrol_flow_ops.lo;/WHOLEARCHIVE:core/libcontrol_flow_ops_op_lib.lo;/WHOLEARCHIVE:core/grappler/optimizers/data/libmap_parallelization.a;/WHOLEARCHIVE:core/grappler/optimizers/data/libmap_vectorization.a;/WHOLEARCHIVE:core/grappler/optimizers/data/libvectorization_utils.a;/WHOLEARCHIVE:core/grappler/optimizers/data/libfunction_utils.a;/WHOLEARCHIVE:core/grappler/optimizers/data/vectorization/libcast_vectorizer.lo;/WHOLEARCHIVE:core/grappler/optimizers/data/vectorization/libunpack_vectorizer.lo;/WHOLEARCHIVE:core/grappler/optimizers/data/vectorization/libvectorizer_registry.a;/WHOLEARCHIVE:core/grappler/optimizers/data/libnoop_elimination.a;/WHOLEARCHIVE:core/grappler/optimizers/data/libshuffle_and_repeat_fusion.a;/WHOLEARCHIVE:core/grappler/optimizers/data/libgraph_utils.a;/WHOLEARCHIVE:core/grappler/libmutable_graph_view.a;/WHOLEARCHIVE:core/kernels/data/liboptional_ops.lo;/WHOLEARCHIVE:core/kernels/data/libpadded_batch_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libparallel_interleave_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libparallel_map_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libparse_example_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libparallel_map_iterator.a;/WHOLEARCHIVE:core/kernels/data/libprefetch_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libprefetch_autotuner
.a;/WHOLEARCHIVE:core/kernels/data/librandom_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/librange_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libreader_dataset_ops.lo;/WHOLEARCHIVE:core/kernels/data/librepeat_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libscan_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libshuffle_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libskip_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libslide_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libsparse_tensor_slice_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libsql_dataset_ops.lo;/WHOLEARCHIVE:core/kernels/data/sql/libsql.a;/WHOLEARCHIVE:core/kernels/data/libstats_aggregator_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libstats_aggregator_ops.lo;/WHOLEARCHIVE:core/kernels/data/libstats_dataset_ops.lo;/WHOLEARCHIVE:core/kernels/data/libtake_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libtensor_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libtensor_queue_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libtensor_slice_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libunbatch_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libwindow_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/libwindow_dataset.a;/WHOLEARCHIVE:core/kernels/data/libwriter_ops.lo;/WHOLEARCHIVE:core/kernels/data/libdataset_utils.a;/WHOLEARCHIVE:core/kernels/data/libcaptured_function.a;/WHOLEARCHIVE:core/kernels/data/libsingle_threaded_executor.lo;/WHOLEARCHIVE:core/kernels/data/libzip_dataset_op.lo;/WHOLEARCHIVE:core/libdataset_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/data/experimental/libassert_next_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/experimental/libcsv_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/experimental/libdirected_interleave_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/experimental/libignore_errors_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/experimental/libindexed_dataset.lo;/WHOLEARCHIVE:core/kernels/data/experimental/liblmdb_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/experimental/libprefetchin
g_kernels.lo;/WHOLEARCHIVE:core/kernels/data/experimental/libthreadpool_dataset_op.lo;/WHOLEARCHIVE:core/kernels/data/experimental/libunique_dataset_op.lo;/WHOLEARCHIVE:core/libexperimental_dataset_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libdecode_proto_op.lo;/WHOLEARCHIVE:core/libdecode_proto_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libencode_proto_op.lo;/WHOLEARCHIVE:core/libencode_proto_ops_op_lib.lo;/WHOLEARCHIVE:core/util/proto/libdescriptors.a;/WHOLEARCHIVE:core/util/proto/liblocal_descriptor_pool_registration.lo;/WHOLEARCHIVE:core/util/proto/libdescriptor_pool_registry.a;/WHOLEARCHIVE:core/util/proto/libproto_utils.a;/WHOLEARCHIVE:core/kernels/libfake_quant_ops.lo;/WHOLEARCHIVE:core/kernels/libfake_quant_ops_gpu.lo;/WHOLEARCHIVE:core/kernels/libfunctional_ops.lo;/WHOLEARCHIVE:core/kernels/libunary_ops_composition.lo;/WHOLEARCHIVE:core/kernels/libadjust_contrast_op.lo;/WHOLEARCHIVE:core/kernels/libadjust_contrast_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libadjust_hue_op.lo;/WHOLEARCHIVE:core/kernels/libadjust_hue_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libadjust_saturation_op.lo;/WHOLEARCHIVE:core/kernels/libadjust_saturation_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libattention_ops.lo;/WHOLEARCHIVE:core/kernels/libcolorspace_op.lo;/WHOLEARCHIVE:core/kernels/libcolorspace_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libcrop_and_resize_op.lo;/WHOLEARCHIVE:core/kernels/libcrop_and_resize_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libdecode_bmp_op.lo;/WHOLEARCHIVE:core/kernels/libdecode_image_op.lo;/WHOLEARCHIVE:core/kernels/libdraw_bounding_box_op.lo;/WHOLEARCHIVE:core/kernels/libencode_jpeg_op.lo;/WHOLEARCHIVE:core/kernels/libencode_png_op.lo;/WHOLEARCHIVE:core/kernels/libextract_jpeg_shape_op.lo;/WHOLEARCHIVE:core/kernels/libnon_max_suppression_op.lo;/WHOLEARCHIVE:core/kernels/librandom_crop_op.lo;/WHOLEARCHIVE:core/kernels/libresize_area_op.lo;/WHOLEARCHIVE:core/kernels/libresize_bicubic_op.lo;/WHOLEARCHIVE:core/kernels/libresize_bilinear_op.lo;/WHOLEARCHIVE:core/kernels/libresize_b
ilinear_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libresize_nearest_neighbor_op.lo;/WHOLEARCHIVE:core/kernels/libresize_nearest_neighbor_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libsample_distorted_bounding_box_op.lo;/WHOLEARCHIVE:core/libgif_internal.a;/WHOLEARCHIVE:core/libimage_ops_op_lib.lo;/WHOLEARCHIVE:core/libjpeg_internal.a;/WHOLEARCHIVE:core/kernels/libfixed_length_record_reader_op.lo;/WHOLEARCHIVE:core/kernels/libidentity_reader_op.lo;/WHOLEARCHIVE:core/kernels/liblmdb_reader_op.lo;/WHOLEARCHIVE:lmdb/liblmdb.a;/WHOLEARCHIVE:core/kernels/libmatching_files_op.lo;/WHOLEARCHIVE:core/kernels/libreader_ops.lo;/WHOLEARCHIVE:core/kernels/librestore_op.lo;/WHOLEARCHIVE:core/kernels/libsave_op.lo;/WHOLEARCHIVE:core/kernels/libsave_restore_v2_ops.lo;/WHOLEARCHIVE:core/kernels/libsave_restore_tensor.a;/WHOLEARCHIVE:core/kernels/libtext_line_reader_op.lo;/WHOLEARCHIVE:core/kernels/libtf_record_reader_op.lo;/WHOLEARCHIVE:core/kernels/libwhole_file_read_ops.lo;/WHOLEARCHIVE:core/libio_ops_op_lib.lo;/WHOLEARCHIVE:core/libreader_base.a;/WHOLEARCHIVE:core/util/tensor_bundle/libtensor_bundle.a;/WHOLEARCHIVE:core/util/tensor_bundle/libnaming.a;/WHOLEARCHIVE:core/kernels/libcholesky_grad.lo;/WHOLEARCHIVE:core/kernels/libcholesky_op.lo;/WHOLEARCHIVE:core/kernels/libdeterminant_op.lo;/WHOLEARCHIVE:core/kernels/libdeterminant_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libmatrix_exponential_op.lo;/WHOLEARCHIVE:core/kernels/libmatrix_inverse_op.lo;/WHOLEARCHIVE:core/kernels/libmatrix_logarithm_op.lo;/WHOLEARCHIVE:core/kernels/libmatrix_solve_ls_op.lo;/WHOLEARCHIVE:core/kernels/libmatrix_solve_op.lo;/WHOLEARCHIVE:core/kernels/libmatrix_triangular_solve_op.lo;/WHOLEARCHIVE:core/kernels/libqr_op.lo;/WHOLEARCHIVE:core/kernels/libeye_functor_gpu.lo;/WHOLEARCHIVE:core/kernels/libmatrix_band_part_op.lo;/WHOLEARCHIVE:core/kernels/libmatrix_band_part_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libself_adjoint_eig_op.lo;/WHOLEARCHIVE:core/kernels/libself_adjoint_eig_v2_op.lo;/WHOLEARCHIVE:core/kernels/libsvd_op.l
o;/WHOLEARCHIVE:core/kernels/libsvd_op_gpu.lo;/WHOLEARCHIVE:core/kernels/liblinalg_ops_common.a;/WHOLEARCHIVE:core/liblinalg_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/liblist_kernels.lo;/WHOLEARCHIVE:core/kernels/liblist_kernels_gpu.lo;/WHOLEARCHIVE:core/liblist_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/liblookup_table_init_op.lo;/WHOLEARCHIVE:core/kernels/liblookup_table_op.lo;/WHOLEARCHIVE:core/kernels/liblookup_util.a;/WHOLEARCHIVE:core/kernels/libinitializable_lookup_table.a;/WHOLEARCHIVE:core/liblookup_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/liblogging_ops.lo;/WHOLEARCHIVE:core/kernels/libsummary_audio_op.lo;/WHOLEARCHIVE:core/kernels/libsummary_image_op.lo;/WHOLEARCHIVE:core/kernels/libsummary_op.lo;/WHOLEARCHIVE:core/kernels/libsummary_tensor_op.lo;/WHOLEARCHIVE:core/liblogging_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libroll_op.lo;/WHOLEARCHIVE:core/libmanip_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libaggregate_ops.lo;/WHOLEARCHIVE:core/kernels/libaggregate_ops_gpu.lo;/WHOLEARCHIVE:core/kernels/libargmax_op.lo;/WHOLEARCHIVE:core/kernels/libargmax_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libbatch_matmul_op.lo;/WHOLEARCHIVE:core/kernels/libbetainc_op.lo;/WHOLEARCHIVE:core/kernels/libbetainc_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libbincount_op.lo;/WHOLEARCHIVE:core/kernels/libbincount_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libbucketize_op.lo;/WHOLEARCHIVE:core/kernels/libbucketize_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libcast_op.lo;/WHOLEARCHIVE:core/kernels/libcast_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libcheck_numerics_op.lo;/WHOLEARCHIVE:core/kernels/libcheck_numerics_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libcompare_and_bitpack_op.lo;/WHOLEARCHIVE:core/kernels/libcompare_and_bitpack_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libcross_op.lo;/WHOLEARCHIVE:core/kernels/libcross_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libfft_ops.lo;/WHOLEARCHIVE:core/libspectral_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libhistogram_op.lo;/WHOLEARCHIVE:core/kernels/libhistogram_op_gpu.lo;/WHOLEARCHI
VE:core/kernels/libmatmul_op.lo;/WHOLEARCHIVE:core/kernels/libpopulation_count_op.lo;/WHOLEARCHIVE:core/kernels/libpopulation_count_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libscan_ops.lo;/WHOLEARCHIVE:core/kernels/libscan_ops_gpu.lo;/WHOLEARCHIVE:core/kernels/libsegment_reduction_ops.lo;/WHOLEARCHIVE:core/kernels/libsegment_reduction_ops_gpu.lo;/WHOLEARCHIVE:core/kernels/libcuda_solvers.lo;cusolver.lib;/WHOLEARCHIVE:core/kernels/libsequence_ops.lo;/WHOLEARCHIVE:core/kernels/libmultinomial_op.lo;/WHOLEARCHIVE:core/kernels/libmultinomial_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libbatch_norm_op.lo;/WHOLEARCHIVE:core/kernels/libbatch_norm_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libbias_op.lo;/WHOLEARCHIVE:core/kernels/libbias_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libdata_format_ops.lo;/WHOLEARCHIVE:core/kernels/libdata_format_ops_gpu.lo;/WHOLEARCHIVE:core/kernels/libdepthwise_conv_grad_op.lo;/WHOLEARCHIVE:core/kernels/libdepthwise_conv_op.lo;/WHOLEARCHIVE:core/kernels/libdepthwise_conv_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libdilation_ops.lo;/WHOLEARCHIVE:core/kernels/libdilation_ops_gpu.lo;/WHOLEARCHIVE:core/kernels/libfused_batch_norm_op.lo;/WHOLEARCHIVE:core/kernels/libfused_batch_norm_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libin_topk_op.lo;/WHOLEARCHIVE:core/kernels/libl2loss_op.lo;/WHOLEARCHIVE:core/kernels/libl2loss_op_gpu.lo;/WHOLEARCHIVE:core/kernels/liblrn_op.lo;/WHOLEARCHIVE:core/kernels/libnth_element_op.lo;/WHOLEARCHIVE:core/kernels/librelu_op.lo;/WHOLEARCHIVE:core/kernels/librelu_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libsoftmax_op.lo;/WHOLEARCHIVE:core/kernels/libsoftmax_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libreduction_ops.lo;/WHOLEARCHIVE:core/kernels/libreduction_ops_gpu.lo;/WHOLEARCHIVE:core/kernels/libsoftplus_op.lo;/WHOLEARCHIVE:core/kernels/libsoftplus_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libsoftsign_op.lo;/WHOLEARCHIVE:core/kernels/libsoftsign_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libtopk_op.lo;/WHOLEARCHIVE:core/kernels/libtopk_op_gpu.lo;/WHOLEARCHIVE:core/kernel
s/libxent_op.lo;/WHOLEARCHIVE:core/kernels/libxent_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libfused_batch_norm_util_gpu.lo;/WHOLEARCHIVE:core/kernels/libpooling_ops.lo;/WHOLEARCHIVE:core/kernels/libpooling_ops_gpu.lo;/WHOLEARCHIVE:core/libnn_grad.lo;/WHOLEARCHIVE:core/kernels/libparameterized_truncated_normal_op.lo;/WHOLEARCHIVE:core/kernels/libparameterized_truncated_normal_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libdecode_compressed_op.lo;/WHOLEARCHIVE:core/kernels/libdecode_csv_op.lo;/WHOLEARCHIVE:core/kernels/libdecode_raw_op.lo;/WHOLEARCHIVE:core/kernels/libexample_parsing_ops.lo;/WHOLEARCHIVE:core/kernels/libparse_tensor_op.lo;/WHOLEARCHIVE:core/kernels/libstring_to_number_op.lo;/WHOLEARCHIVE:core/libparsing_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libpartitioned_function_ops.lo;/WHOLEARCHIVE:core/kernels/librandom_poisson_op.lo;/WHOLEARCHIVE:core/kernels/librandom_shuffle_op.lo;/WHOLEARCHIVE:core/kernels/libremote_fused_graph_ops.lo;/WHOLEARCHIVE:core/kernels/libremote_fused_graph_execute_utils.a;/WHOLEARCHIVE:core/libremote_fused_graph_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libresource_variable_ops.lo;/WHOLEARCHIVE:core/kernels/libmutex_ops.lo;/WHOLEARCHIVE:core/libresource_variable_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/librpc_op.lo;/WHOLEARCHIVE:core/librpc_ops_op_lib.lo;/WHOLEARCHIVE:core/util/rpc/librpc_factory_registry.a;/WHOLEARCHIVE:core/util/rpc/librpc_factory.a;/WHOLEARCHIVE:core/kernels/libscoped_allocator_ops.lo;/WHOLEARCHIVE:core/libscoped_allocator_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libsdca_ops.lo;/WHOLEARCHIVE:core/kernels/libsdca_internal.a;/WHOLEARCHIVE:core/libsdca_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libsearchsorted_op.lo;/WHOLEARCHIVE:core/kernels/libsearchsorted_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libconcat_lib.a;/WHOLEARCHIVE:core/kernels/libconcat_lib_gpu.lo;/WHOLEARCHIVE:core/kernels/libgather_functor.lo;/WHOLEARCHIVE:core/kernels/libgather_functor_gpu.lo;/WHOLEARCHIVE:core/kernels/libtranspose_functor.a;/WHOLEARCHIVE:core/ker
nels/libtranspose_functor_gpu.lo;/WHOLEARCHIVE:core/kernels/libconv_ops.lo;/WHOLEARCHIVE:core/kernels/libconv_ops_gpu.lo;/WHOLEARCHIVE:core/libnn_ops_op_lib.lo;/WHOLEARCHIVE:core/libarray_grad.lo;/WHOLEARCHIVE:core/libarray_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libset_kernels.lo;/WHOLEARCHIVE:core/libset_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libdeserialize_sparse_string_op.lo;/WHOLEARCHIVE:core/kernels/libdeserialize_sparse_variant_op.lo;/WHOLEARCHIVE:core/kernels/libserialize_sparse_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_add_grad_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_add_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_concat_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_cross_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_dense_binary_op_shared.lo;/WHOLEARCHIVE:core/kernels/libsparse_fill_empty_rows_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_reduce_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_reorder_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_reshape_op.lo;/WHOLEARCHIVE:core/kernels/libreshape_util.a;/WHOLEARCHIVE:core/kernels/libsparse_slice_grad_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_slice_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_softmax.lo;/WHOLEARCHIVE:core/kernels/libsparse_sparse_binary_op_shared.lo;/WHOLEARCHIVE:core/kernels/libcwise_op.lo;/WHOLEARCHIVE:core/kernels/libcwise_op_gpu.lo;/WHOLEARCHIVE:core/libmath_grad.lo;/WHOLEARCHIVE:core/libmath_ops_op_lib.lo;/WHOLEARCHIVE:core/libbitwise_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libsparse_split_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_tensor_dense_add_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_tensor_dense_matmul_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_tensor_dense_matmul_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libsparse_tensors_map_ops.lo;/WHOLEARCHIVE:core/kernels/libsparse_to_dense_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_xent_op.lo;/WHOLEARCHIVE:core/kernels/libsparse_xent_op_gpu.lo;/WHOLEARCHIVE:core/libsparse_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libcount_up_to_op.lo;/WHOLEARCHIVE:core/ke
rnels/libdense_update_ops.lo;/WHOLEARCHIVE:core/kernels/libscatter_nd_op.lo;/WHOLEARCHIVE:core/kernels/libscatter_nd_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libscatter_op.lo;/WHOLEARCHIVE:core/kernels/libscatter_op_gpu.lo;/WHOLEARCHIVE:core/kernels/libstateless_random_ops.lo;/WHOLEARCHIVE:core/kernels/librandom_op.lo;/WHOLEARCHIVE:core/kernels/librandom_op_gpu.lo;/WHOLEARCHIVE:core/librandom_ops_op_lib.lo;/WHOLEARCHIVE:core/libstateless_random_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libas_string_op.lo;/WHOLEARCHIVE:core/kernels/libbase64_ops.lo;/WHOLEARCHIVE:core/kernels/libreduce_join_op.lo;/WHOLEARCHIVE:core/kernels/libregex_full_match_op.lo;/WHOLEARCHIVE:core/kernels/libregex_replace_op.lo;/WHOLEARCHIVE:core/kernels/libstring_format_op.lo;/WHOLEARCHIVE:core/kernels/libstring_join_op.lo;/WHOLEARCHIVE:core/kernels/libstring_length_op.lo;/WHOLEARCHIVE:core/kernels/libstring_split_op.lo;/WHOLEARCHIVE:core/kernels/libstring_strip_op.lo;/WHOLEARCHIVE:core/kernels/libstring_to_hash_bucket_op.lo;/WHOLEARCHIVE:core/kernels/libsubstr_op.lo;/WHOLEARCHIVE:core/kernels/libstring_util.a;/WHOLEARCHIVE:core/kernels/libunicode_script_op.lo;/WHOLEARCHIVE:core/libstring_ops_op_lib.lo;/WHOLEARCHIVE:icu/libicuuc.a;/WHOLEARCHIVE:core/kernels/libsummary_kernels.lo;/WHOLEARCHIVE:contrib/tensorboard/db/libschema.a;/WHOLEARCHIVE:contrib/tensorboard/db/libsummary_db_writer.a;/WHOLEARCHIVE:contrib/tensorboard/db/libsummary_file_writer.a;/WHOLEARCHIVE:contrib/tensorboard/db/libsummary_converter.a;/WHOLEARCHIVE:core/libpng_internal.a;/WHOLEARCHIVE:png_archive/libpng.a;/WHOLEARCHIVE:core/libsummary_ops_op_lib.lo;/WHOLEARCHIVE:core/lib/db/libsqlite.a;/WHOLEARCHIVE:core/lib/db/libsnapfn.lo;/WHOLEARCHIVE:org_sqlite/liborg_sqlite.a;/WHOLEARCHIVE:core/kernels/libtraining_ops.lo;/WHOLEARCHIVE:core/kernels/libtraining_ops_gpu.lo;/WHOLEARCHIVE:core/kernels/libtraining_op_helpers.a;/WHOLEARCHIVE:core/kernels/libvariable_ops.lo;/WHOLEARCHIVE:core/kernels/libfill_functor.lo;/WHOLEARCHIVE:core/kernels/libfi
ll_functor_gpu.lo;/WHOLEARCHIVE:core/kernels/libscatter_functor.lo;/WHOLEARCHIVE:core/kernels/libscatter_functor_gpu.lo;/WHOLEARCHIVE:core/kernels/libdense_update_functor.a;/WHOLEARCHIVE:core/kernels/libdense_update_functor_gpu.lo;/WHOLEARCHIVE:core/libstate_ops_op_lib.lo;/WHOLEARCHIVE:core/libtraining_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libword2vec_kernels.lo;/WHOLEARCHIVE:core/libword2vec_ops.lo;/WHOLEARCHIVE:core/grappler/optimizers/libgpu_swapping_kernels.lo;/WHOLEARCHIVE:core/grappler/optimizers/libgpu_swapping_ops.lo;/WHOLEARCHIVE:core/libdirect_session_internal.lo;/WHOLEARCHIVE:core/libdevice_tracer.a;/WHOLEARCHIVE:core/platform/default/gpu/libcupti_wrapper.a;/WHOLEARCHIVE:core/debug/libdebug_graph_utils.lo;/WHOLEARCHIVE:core/kernels/libfunction_ops.lo;/WHOLEARCHIVE:core/libexample_parser_configuration.lo;/WHOLEARCHIVE:core/libcore_cpu_internal.lo;/WHOLEARCHIVE:core/grappler/optimizers/libmeta_optimizer.a;/WHOLEARCHIVE:core/grappler/optimizers/libarithmetic_optimizer.a;/WHOLEARCHIVE:core/grappler/optimizers/libgraph_optimizer_stage.a;/WHOLEARCHIVE:core/grappler/optimizers/libauto_parallel.a;/WHOLEARCHIVE:core/grappler/optimizers/libdebug_stripper.a;/WHOLEARCHIVE:core/grappler/optimizers/libdependency_optimizer.a;/WHOLEARCHIVE:core/grappler/optimizers/libexperimental_implementation_selector.a;/WHOLEARCHIVE:core/grappler/optimizers/libcustom_graph_optimizer_registry_impl.a;/WHOLEARCHIVE:core/grappler/optimizers/libfunction_api_info.a;/WHOLEARCHIVE:core/grappler/optimizers/libfunction_optimizer.a;"

$ENV:TFLIBS3="/WHOLEARCHIVE:core/grappler/optimizers/liblayout_optimizer.a;/WHOLEARCHIVE:core/grappler/optimizers/libloop_optimizer.a;/WHOLEARCHIVE:core/grappler/optimizers/libmemory_optimizer.a;/WHOLEARCHIVE:core/grappler/optimizers/libstatic_schedule.a;/WHOLEARCHIVE:core/grappler/costs/libgraph_memory.a;/WHOLEARCHIVE:core/grappler/clusters/libvirtual_cluster.a;/WHOLEARCHIVE:core/grappler/costs/libop_level_cost_estimator.a;/WHOLEARCHIVE:core/grappler/costs/libvirtual_scheduler.a;/WHOLEARCHIVE:core/grappler/costs/libvirtual_placer.a;/WHOLEARCHIVE:core/grappler/libdevices.a;/WHOLEARCHIVE:core/grappler/utils/libtraversal.a;/WHOLEARCHIVE:core/grappler/optimizers/libmodel_pruner.a;/WHOLEARCHIVE:core/grappler/optimizers/libgraph_rewriter.a;/WHOLEARCHIVE:core/grappler/optimizers/libpin_to_host_optimizer.a;/WHOLEARCHIVE:core/grappler/optimizers/libremapper.a;/WHOLEARCHIVE:core/grappler/optimizers/libconstant_folding.a;/WHOLEARCHIVE:core/grappler/optimizers/libevaluation_utils.a;/WHOLEARCHIVE:core/grappler/optimizers/libscoped_allocator_optimizer.a;/WHOLEARCHIVE:core/grappler/optimizers/libshape_optimizer.a;/WHOLEARCHIVE:core/grappler/costs/libgraph_properties.a;/WHOLEARCHIVE:core/grappler/costs/libutils.a;/WHOLEARCHIVE:core/grappler/clusters/libutils.a;/WHOLEARCHIVE:core/grappler/libgraph_view.a;/WHOLEARCHIVE:core/grappler/clusters/libcluster.a;/WHOLEARCHIVE:core/grappler/utils/libframe.a;/WHOLEARCHIVE:core/grappler/utils/libsymbolic_shapes.a;/WHOLEARCHIVE:core/grappler/utils/libcolocation.a;/WHOLEARCHIVE:core/grappler/utils/libfunctions.a;/WHOLEARCHIVE:core/grappler/utils/libtopological_sort.a;/WHOLEARCHIVE:core/libgpu_runtime_impl.lo;/WHOLEARCHIVE:core/libcore_cpu_base.lo;/WHOLEARCHIVE:core/libfunction_ops_op_lib.lo;/WHOLEARCHIVE:core/libfunctional_grad.lo;/WHOLEARCHIVE:core/libfunctional_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libno_op.lo;/WHOLEARCHIVE:core/kernels/libsendrecv_ops.lo;/WHOLEARCHIVE:core/libno_op_op_lib.lo;/WHOLEARCHIVE:core/libsendrecv_ops_op_lib.lo;/WH
OLEARCHIVE:core/libcore_cpu_impl.lo;/WHOLEARCHIVE:core/grappler/libgrappler_item.a;/WHOLEARCHIVE:core/grappler/libop_types.a;/WHOLEARCHIVE:core/grappler/libutils.a;/WHOLEARCHIVE:core/libgpu_id_impl.a;/WHOLEARCHIVE:core/libgpu_init_impl.lo;/WHOLEARCHIVE:core/libgpu_lib.a;/WHOLEARCHIVE:core/libgraph.a;/WHOLEARCHIVE:stream_executor/libcuda_platform.lo;/WHOLEARCHIVE:stream_executor/libstream_executor_impl.lo;/WHOLEARCHIVE:core/kernels/libops_util.a;/WHOLEARCHIVE:core/libframework_internal_impl.lo;/WHOLEARCHIVE:core/libfeature_util.a;/WHOLEARCHIVE:core/libprotos_all_proto_text.a;/WHOLEARCHIVE:core/liberror_codes_proto_text.a;/WHOLEARCHIVE:core/libstats_calculator_portable.a;/WHOLEARCHIVE:core/libversion_lib.a;cublas.lib;cuda.lib;cudnn.lib;cufft.lib;curand.lib;/WHOLEARCHIVE:core/liblib_internal_impl.a;/WHOLEARCHIVE:nsync/libnsync_cpp.a;/WHOLEARCHIVE:core/liblib_hash_crc32c_accelerate_internal.a;/WHOLEARCHIVE:core/liblib_proto_parsing.a;/WHOLEARCHIVE:core/libabi.a;/WHOLEARCHIVE:core/libplatform_base.a;/WHOLEARCHIVE:gif_archive/libgif.a;/WHOLEARCHIVE:jpeg/libjpeg.a;/WHOLEARCHIVE:jpeg/libsimd_win_x86_64.a;/WHOLEARCHIVE:com_googlesource_code_re2/libre2.a;/WHOLEARCHIVE:farmhash_archive/libfarmhash.a;/WHOLEARCHIVE:fft2d/libfft2d.a;/WHOLEARCHIVE:highwayhash/libsip_hash.a;/WHOLEARCHIVE:highwayhash/libarch_specific.a;/WHOLEARCHIVE:snappy/libsnappy.a;/WHOLEARCHIVE:zlib_archive/libzlib.a;/WHOLEARCHIVE:double_conversion/libdouble-conversion.a;/WHOLEARCHIVE:core/grappler/costs/libop_performance_data_cc_impl.a;/WHOLEARCHIVE:core/libprotos_all_proto_cc_impl.a;/WHOLEARCHIVE:core/liberror_codes_proto_cc_impl.a;/WHOLEARCHIVE:protobuf_archive/libprotobuf.a;/WHOLEARCHIVE:protobuf_archive/libprotobuf_lite.a;/WHOLEARCHIVE:com_google_absl/absl/strings/libstrings.a;/WHOLEARCHIVE:com_google_absl/absl/strings/libinternal.a;/WHOLEARCHIVE:com_google_absl/absl/base/libthrow_delegate.a;/WHOLEARCHIVE:com_google_absl/absl/numeric/libint128.a;/WHOLEARCHIVE:com_google_absl/absl/types/liboptional.a;/WHOLEA
RCHIVE:com_google_absl/absl/types/libbad_optional_access.a;/WHOLEARCHIVE:com_google_absl/absl/base/libbase.a;/WHOLEARCHIVE:com_google_absl/absl/base/libspinlock_wait.a;/WHOLEARCHIVE:com_google_absl/absl/base/libdynamic_annotations.a;cudart.lib;"

$ENV:TFLIBS4="/WHOLEARCHIVE:core/libgpu_runtime_impl.lo;/WHOLEARCHIVE:core/libgpu_id_impl.a;/WHOLEARCHIVE:core/libgpu_init_impl.lo;/WHOLEARCHIVE:core/grappler/optimizers/libcustom_graph_optimizer_registry_impl.a;/WHOLEARCHIVE:core/kernels/liblookup_util.a;/WHOLEARCHIVE:core/kernels/libinitializable_lookup_table.a;/WHOLEARCHIVE:core/util/tensor_bundle/libtensor_bundle.a;/WHOLEARCHIVE:core/util/tensor_bundle/libnaming.a;/WHOLEARCHIVE:core/libcore_cpu_base.lo;/WHOLEARCHIVE:core/libfunction_ops_op_lib.lo;/WHOLEARCHIVE:core/libfunctional_grad.lo;/WHOLEARCHIVE:core/libfunctional_ops_op_lib.lo;/WHOLEARCHIVE:core/kernels/libno_op.lo;/WHOLEARCHIVE:core/kernels/libsendrecv_ops.lo;/WHOLEARCHIVE:core/libno_op_op_lib.lo;/WHOLEARCHIVE:core/libsendrecv_ops_op_lib.lo;/WHOLEARCHIVE:core/libgpu_lib.a;/WHOLEARCHIVE:stream_executor/libcuda_platform.lo;/WHOLEARCHIVE:stream_executor/libstream_executor_impl.lo;/WHOLEARCHIVE:core/kernels/libops_util.a;cublas.lib;cuda.lib;cudnn.lib;cufft.lib;curand.lib;/WHOLEARCHIVE:core/libcore_cpu_impl.lo;/WHOLEARCHIVE:core/libgraph.a;/WHOLEARCHIVE:core/grappler/libgrappler_item.a;/WHOLEARCHIVE:core/grappler/libop_types.a;/WHOLEARCHIVE:core/grappler/libutils.a;/WHOLEARCHIVE:core/libframework_internal_impl.lo;/WHOLEARCHIVE:core/libfeature_util.a;/WHOLEARCHIVE:core/libprotos_all_proto_text.a;/WHOLEARCHIVE:core/liberror_codes_proto_text.a;/WHOLEARCHIVE:core/libstats_calculator_portable.a;/WHOLEARCHIVE:core/libversion_lib.a;cudart.lib;/WHOLEARCHIVE:core/liblib_internal_impl.a;/WHOLEARCHIVE:com_google_absl/absl/types/liboptional.a;/WHOLEARCHIVE:com_google_absl/absl/types/libbad_optional_access.a;/WHOLEARCHIVE:nsync/libnsync_cpp.a;/WHOLEARCHIVE:core/liblib_hash_crc32c_accelerate_internal.a;/WHOLEARCHIVE:core/liblib_proto_parsing.a;/WHOLEARCHIVE:core/libabi.a;/WHOLEARCHIVE:core/libplatform_base.a;/WHOLEARCHIVE:com_google_absl/absl/strings/libstrings.a;/WHOLEARCHIVE:com_google_absl/absl/strings/libinternal.a;/WHOLEARCHIVE:com_google_absl/absl/base/libthrow_delega
te.a;/WHOLEARCHIVE:com_google_absl/absl/base/libbase.a;/WHOLEARCHIVE:com_google_absl/absl/base/libspinlock_wait.a;/WHOLEARCHIVE:com_google_absl/absl/numeric/libint128.a;/WHOLEARCHIVE:com_google_absl/absl/base/libdynamic_annotations.a;/WHOLEARCHIVE:gif_archive/libgif.a;/WHOLEARCHIVE:jpeg/libjpeg.a;/WHOLEARCHIVE:jpeg/libsimd_win_x86_64.a;/WHOLEARCHIVE:com_googlesource_code_re2/libre2.a;/WHOLEARCHIVE:farmhash_archive/libfarmhash.a;/WHOLEARCHIVE:fft2d/libfft2d.a;/WHOLEARCHIVE:highwayhash/libsip_hash.a;/WHOLEARCHIVE:highwayhash/libarch_specific.a;/WHOLEARCHIVE:snappy/libsnappy.a;/WHOLEARCHIVE:zlib_archive/libzlib.a;/WHOLEARCHIVE:double_conversion/libdouble-conversion.a;/WHOLEARCHIVE:core/grappler/costs/libop_performance_data_cc_impl.a;/WHOLEARCHIVE:core/libprotos_all_proto_cc_impl.a;/WHOLEARCHIVE:core/liberror_codes_proto_cc_impl.a;/WHOLEARCHIVE:protobuf_archive/libprotobuf.a;/WHOLEARCHIVE:protobuf_archive/libprotobuf_lite.a;"

These lists use the following as library search paths:

bazel-tensorflow\bazel-out\x64_windows-opt\bin\tensorflow
bazel-tensorflow\bazel-out\x64_windows-opt\bin\external

The entries themselves come from libtensorflow_cc.so-2.params and libtensorflow_framework.so-2.params.

@samhodge

It all becomes a can of worms using a .vcxproj file to set this up.

The character limit comes into play over and over; with a 65K character limit you get stuffed at every turn. The 16-bit history of the Windows operating system has a lot to answer for.
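For anyone hitting the same wall, here is a minimal sketch of packing a long library list into several ';'-joined strings that each stay under a character budget, in the spirit of splitting the list across several TFLIBS-style variables. The function name `PackUnderLimit` and the `max_chars` parameter are hypothetical, and the actual limit depends on which tool is consuming the string, so treat the budget as an assumption to be tuned.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Greedily packs `libs` into ';'-terminated strings of at most `max_chars`
// characters each. An entry that is itself longer than `max_chars` still gets
// a chunk of its own (it cannot be split), so check for that separately.
// PackUnderLimit and max_chars are illustrative names, not part of any tool.
std::vector<std::string> PackUnderLimit(const std::vector<std::string>& libs,
                                        std::size_t max_chars) {
  std::vector<std::string> chunks;
  std::string current;
  for (const auto& lib : libs) {
    // +1 accounts for the ';' separator appended after each entry.
    if (!current.empty() && current.size() + lib.size() + 1 > max_chars) {
      chunks.push_back(current);
      current.clear();
    }
    current += lib;
    current += ';';
  }
  if (!current.empty()) chunks.push_back(current);
  return chunks;
}
```

Each returned chunk could then be assigned to its own variable or written to its own response file; which of those the build accepts is something to verify against the toolchain in use.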

@samhodge

Maybe if I just add the .obj files of the .pb.cc files directly that will heal the missing symbols.

@samhodge

samhodge commented Jun 20, 2020

@gunan @meteorcloudy @Artem-B

Simply adding

../../../tensorflow-r1.12/bazel-tensorflow/bazel-out/x64_windows-opt/bin/tensorflow/core/_objs/protos_all_proto_cc_impl/config.pb.o

to the objects being linked takes you from 34 missing symbols to 164 missing symbols.

I really need an expert's guide on how to run the following C++ code on Windows 10:

#include "tensorflow/core/protobuf/config.pb.h"
#include <iostream>

int main() {
	tensorflow::GPUOptions gpu_options;

	gpu_options.set_visible_device_list("0");

	std::cout << "allocator_type " << gpu_options.allocator_type() << std::endl;
	std::cout << "visible_device_list " << gpu_options.visible_device_list() << std::endl;

	gpu_options.set_allocator_type("7");

	std::cout << "allocator_type " << gpu_options.allocator_type() << std::endl;
	std::cout << "visible_device_list " << gpu_options.visible_device_list() << std::endl;


	gpu_options.set_visible_device_list("5");

	std::cout << "allocator_type " << gpu_options.allocator_type() << std::endl;
	std::cout << "visible_device_list " << gpu_options.visible_device_list() << std::endl;

	gpu_options.set_allocator_type("3");

	std::cout << "allocator_type " << gpu_options.allocator_type() << std::endl;
	std::cout << "visible_device_list " << gpu_options.visible_device_list() << std::endl;


}

The guide needs to go right down to the operating system, compiler, bazel version, CUDA version, etc.

The issue is not isolated to r1.12 or r2.2; it is systemic to how protobuf works with linking on Windows.

We do not have a solution for static linking, but static linking is a requirement for the above code to run correctly.

My hacked solution doesn't work beyond the simple driver version, under MSVC 2019 with r2.2.

It was based on adding the inputs to the tensorflow_cc.dll target under r2.2, as recorded in the linking script .params file.

Using the inputs for libtensorflow_cc.so under r1.12 should therefore work the same way, but it does not: they leave missing symbols for the compiled .proto-based files.

I am unsure what causes this behaviour.

@samhodge

As an example: if the core/platform/windows/port.cc file defines the symbols for port::InitMain, how can I find which artifact contains the compiled symbols for that code?

This is one of the missing symbols from the list above, when dynamically linking.

So if the inputs to the .dll are there when statically linking, why is this symbol missing?

Maybe if I can work this out I can apply similar logic to the remaining 33 missing symbols

Sam

@samhodge

samhodge commented Jun 21, 2020

libtensorflow_cc.so-2.params.zip
OK, I followed this up with a recursive grep through the .params files looking for port.o (built from core/platform/windows/port.cc). It is bundled into both libframework_internal_impl.lo and liblib_internal_impl.a, and example.obj is then linked against both to make a .dll.

So the symbols should be there twice!

One curiosity I have is /DEFAULTLIB:msvcrt.lib vs /DEFAULTLIB:libcmt.lib

see:
https://docs.microsoft.com/en-us/cpp/c-runtime-library/crt-library-features?view=vs-2019

example.obj, which will make example.dll when linked to all the TF symbols and other Cool Stuff (tm), is using /DEFAULTLIB:libcmt.lib, whereas tensorflow_cc.dll was made using /DEFAULTLIB:msvcrt.lib. I don't think this is the cause of all of my hassles.

But we may be in the situation where protobuf symbols are included more than once: libtensorflow_cc.so includes port.cc both via libframework_internal_impl.lo and via liblib_internal_impl.a. I know that port.cc is not a .proto file, but if it is happening with some source files it could be happening with others.

This is before I started hacking.

see file attached, libtensorflow_cc.so-2.params.zip

@samhodge

@meteorcloudy @sanjoy @gunan

We have established two things:

Including the same symbols twice when linking can cause unexpected behavior.

TensorFlow includes the same port.o file, with the same symbols, at least twice.

Can we make sure the inputs to a bazel target have a directed acyclic graph representation?

It seems my approach of taking the union of all of the object files achieves this outcome under Linux and macOS.

But we will be faced with a .a file that is larger than 2 GB, to say nothing of feeding the arguments via lists smaller than 65K characters.

How do we proceed?

Sam

@mihaimaruseac
Collaborator

Including the same symbol twice results in an ODR (One Definition Rule) violation, which is undefined behavior and prone to errors.

We have several ODR violations in TF and need to identify and fix them. Thanks for finding out that port.o causes it!

@samhodge

I guess the next step would be to find the size of the union of all the required object files by a recursive search of the .params files, then devise a linking strategy with a complexity lower than n factorial. My last attempt, which was n factorial, failed after a six-hour timeout.

Then see if the final linkage is under 2 GB.

I am guessing that is a testable binary outcome: the file will have a size either greater or less than 2 GB.
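A rough sketch of that size check, assuming the object paths and their sizes have already been gathered from the .params files. `ObjectFile`, `UnionSizeBytes`, `FitsUnderLimit` and `kAssumedArchiveLimit` are hypothetical names, and the 2 GB figure is simply the limit discussed in this thread, not something I have verified against the archiver. Deduplicating through a map keeps the work at O(n log n) in the number of listed objects rather than anything factorial.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// One object file as listed in an archive's .params, with its size on disk.
struct ObjectFile {
  std::string path;
  std::uint64_t bytes;
};

// Sums the sizes of the distinct objects across all archives, so an object
// bundled into several archives (such as port.o) is only counted once.
std::uint64_t UnionSizeBytes(
    const std::vector<std::vector<ObjectFile>>& archives) {
  std::map<std::string, std::uint64_t> unique;
  for (const auto& archive : archives) {
    for (const auto& obj : archive) {
      unique.emplace(obj.path, obj.bytes);  // keeps the first entry per path
    }
  }
  std::uint64_t total = 0;
  for (const auto& entry : unique) total += entry.second;
  return total;
}

// Assumption: the limit discussed in this thread, taken as 2 * 1024^3 bytes.
constexpr std::uint64_t kAssumedArchiveLimit = 2ULL * 1024 * 1024 * 1024;

// The "testable binary outcome": is the union under the assumed limit?
bool FitsUnderLimit(std::uint64_t total_bytes) {
  return total_bytes < kAssumedArchiveLimit;
}
```

The sum of the object sizes is only a lower-bound estimate of the final archive size, since the archive adds its own headers and symbol table.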

Sam

@MikhailStartsev
Contributor

Sorry if this is a stupid question, but which workaround is the "final" one?
I have issues with device placement on Windows in C++ (basically cannot reliably place anything on a particular GPU, if several are capable enough to run Tensorflow), which I tried to solve with options.config.mutable_gpu_options()->set_visible_device_list("0"); to work with GPU 0.

However, this leads to a crash: F tensorflow/core/framework/op.cc:214] Non-OK-status: RegisterAlreadyLocked(deferred_[i]) status: Invalid argument: No attr with name '0' for input 'constants'; in OpDef: name: "XlaLaunch" input_arg { name: "constants" description: "0" type_attr: "0" number_attr: "0" type_list_attr: "Tconstants" } input_arg { name: "args" description: "0" type_attr: "0" number_attr: "0" type_list_attr: "Targs" } input_arg { name: "resources" description: "0" type: DT_RESOURCE type_attr: "0" number_attr: "Nresources" type_list_attr: "0" } output_arg { name: "results" description: "0" type_attr: "0" number_attr: "0" type_list_attr: "Tresults" } attr { name: "Tconstants" type: "list(type)" description: "0" has_minimum: true } attr { name: "Targs" type: "list(type)" description: "0" has_minimum: true } attr { name: "Nresources" type: "int" description: "0" has_minimum: true } attr { name: "Tresults" type: "list(type)" description: "0" has_minimum: true } attr { name: "function" type: "func" description: "0" } summary: "XLA Launch Op. For use by the XLA JIT only." description: "0" is_stateful: true

I thought that this error might have something to do with what is described in this thread, since for me the pointers for visible_device_list and e.g. allocator_type are also identical, likely leading to the parsing errors later on.

Any way to navigate around this is appreciated!

@samhodge

samhodge commented Oct 3, 2020

I wish you well.

The solution is blocked by two issues:

The DLL cannot be dynamically linked because the symbols from libprotobuf will overlap with the symbols from TensorFlow.

Statically linking leads to an object file symbol archive which is larger than 2 GB.

So while there is this:

https://docs.microsoft.com/en-us/dotnet/framework/configure-apps/file-schema/runtime/gcallowverylargeobjects-element

I think it may take several hours to accumulate all of the symbols in the .obj files to make a static archive.

@rmothukuru
Contributor

@kognat-docs,
Can you please confirm whether your issue is resolved by the above comments? Thanks!

@rmothukuru rmothukuru self-assigned this Dec 4, 2020
@kognat-docs
Author

It is unclear to me how to apply the large archive option to the bazel build.

If one of the bazel Windows team could point it out, that would be great.

Sam

@rmothukuru rmothukuru removed their assignment Dec 8, 2020
@rmothukuru rmothukuru added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Dec 8, 2020
@kognat-docs
Author

How do I create a static archive in Windows 10 greater than 2Gb using bazel?

@samhodge

samhodge commented Jan 5, 2021

If there were clear instructions on how to create a static archive greater than 2 GB on Windows 10 using bazel, then this long-overdue issue could be closed.

@mihaimaruseac
Collaborator

I think we should ask this on the Bazel repo?

@samhodge

Sounds like a good idea!

@tensorflowbutler tensorflowbutler removed the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jan 23, 2021
@tensorflowbutler
Member

Hi There,

We are checking to see if you still need help on this, as you are using an older version of tensorflow which is officially considered end of life. We recommend that you upgrade to the latest 2.x version and let us know if the issue still persists in newer versions. Please open a new issue for any help you need against 2.x, and we will get you the right help.

This issue will be closed automatically 7 days from now. If you still need help with this issue, please provide us with more information.

@jdlee0

jdlee0 commented Mar 3, 2021

So the underlying surprising feature is that protobuf doesn't make separate allocations for each empty std::string in its structures. Instead it tries to have a single shared empty string, fixed_address_empty_string, which you get a pointer to with GetEmptyStringAlreadyInited. Then on every set_<field> there is a check against this magic default pointer, and a new string is allocated if the pointer matches. If it doesn't match, then it's assumed that this string is initialized (and not shared) and it gets written to.

The bug is then that (for unknown-to-me build reasons) either there are multiple fixed_address_empty_strings being generated, or the struct is being moved after some inlining has happened. In any event, the key point is that empty, uninitialized strings exist in these structs at a common location that is not *GetEmptyStringAlreadyInited(). So the run-time checks say that these are actually initialized and unshared, the set_<field> mutates this shared string, and many other structures become sad.

There's an unsafe workaround: the unsafe_arena_release_<field> functions forcibly reset string pointers to GetEmptyStringAlreadyInited() and, in case some (nominally) initialized string was about to leak, return the old pointer to the caller to deal with. If we "know" that the field should be unset, then the old pointer refers to some static string that protobuf will clean up, so we can just drop it without leaking. Calling set_<field>() afterwards then correctly allocates a new string and assigns it.
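To make the mechanism concrete, here is a self-contained toy model of it. This is not protobuf's real code: `ToyGpuOptions`, `ctor_empty`, `setter_empty`, the `reset_*` methods and `FieldsAlias` are all hypothetical stand-ins. It assumes that the constructor and the setters can end up seeing two different copies of the shared empty string, and that the reset-style workaround points a field back at the copy the setters compare against.

```cpp
#include <cassert>
#include <string>

// Toy model only; every name here is an illustrative stand-in.
class ToyGpuOptions {
 public:
  // ctor_empty:   shared empty string the constructor pointed the fields at.
  // setter_empty: shared empty string the setters compare against.
  // In a healthy build these are the same object.
  ToyGpuOptions(std::string* ctor_empty, std::string* setter_empty)
      : ctor_empty_(ctor_empty),
        setter_empty_(setter_empty),
        allocator_type_(ctor_empty),
        visible_device_list_(ctor_empty) {
    assert(ctor_empty != nullptr && setter_empty != nullptr);
  }
  ToyGpuOptions(const ToyGpuOptions&) = delete;
  ToyGpuOptions& operator=(const ToyGpuOptions&) = delete;
  ~ToyGpuOptions() {
    Free(allocator_type_);
    Free(visible_device_list_);
  }

  void set_allocator_type(const std::string& v) { Set(&allocator_type_, v); }
  void set_visible_device_list(const std::string& v) {
    Set(&visible_device_list_, v);
  }
  const std::string& allocator_type() const { return *allocator_type_; }
  const std::string& visible_device_list() const {
    return *visible_device_list_;
  }

  // Models the workaround: point the field back at the sentinel the setters
  // expect. (The real release call hands the old pointer back to the caller;
  // this toy frees it itself when it was privately owned.)
  void reset_allocator_type() { Reset(&allocator_type_); }
  void reset_visible_device_list() { Reset(&visible_device_list_); }

 private:
  bool IsShared(const std::string* p) const {
    return p == ctor_empty_ || p == setter_empty_;
  }
  void Free(std::string* p) {
    if (!IsShared(p)) delete p;
  }
  void Set(std::string** field, const std::string& v) {
    // Still the default sentinel: allocate a private string first.
    if (*field == setter_empty_) *field = new std::string;
    // Otherwise the field is assumed private and is written in place.
    **field = v;
  }
  void Reset(std::string** field) {
    Free(*field);
    *field = setter_empty_;
  }

  std::string* ctor_empty_;
  std::string* setter_empty_;
  std::string* allocator_type_;
  std::string* visible_device_list_;
};

// Returns true if writing allocator_type also changed visible_device_list.
bool FieldsAlias(bool mismatched_sentinels, bool apply_workaround) {
  std::string empty_a, empty_b;
  ToyGpuOptions opts(&empty_a, mismatched_sentinels ? &empty_b : &empty_a);
  if (apply_workaround) {
    opts.reset_allocator_type();
    opts.reset_visible_device_list();
  }
  opts.set_visible_device_list("0");
  opts.set_allocator_type("7");
  return opts.visible_device_list() == "7";
}
```

In this model, FieldsAlias(false, false) is false (healthy build), FieldsAlias(true, false) is true (both fields end up sharing one string, as in the original report), and FieldsAlias(true, true) is false again once the fields have been reset first.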

@samhodge

samhodge commented Mar 3, 2021

I am grateful for your insight into this issue.

It seems that the creation of multiple definitions, and thereby multiple symbols, in the shared library is the cause of the incorrect behaviour. When it is possible to create a static library, such as on Linux and OSX, we are left with a single definition during the linking of the static archive.

So the unsafe workaround seems to be less work than fixing all of the multiple definition problems in such a large code base.

But the arena needs to raise an error to prevent multiple definitions from squatting on the same memory address without ownership.

sam

@samhodge

samhodge commented Mar 4, 2021
