# Make tutorial_example_trainer build on Windows with Bazel #4796

Merged
merged 13 commits into from Oct 11, 2016

## Conversation

Projects
None yet
9 participants
Member

### meteorcloudy commented Oct 6, 2016 • edited Edited 1 time meteorcloudy edited Oct 6, 2016 (most recent)

 With these changes, we are able to build the C++ example trainer by bazel build --host_cpu=x64_windows_msvc --cpu=x64_windows_msvc -c opt tensorflow/cc:tutorials_example_trainer with Bazel from HEAD on Windows. I modified the BUILD files carefully, hopefully it won't break TF build on other platform. @mrry @damienmg @dslomov

### meteorcloudy added some commits Sep 21, 2016

 Make //tensorflow/core:core_cpu build on Windows 
 2e07e1b 
 Fix platform.bzl for cuda configuration on Windows 
I am not sure about the naming convention of the dlls, but I did see some dll name like
curand32_42_9.dll or cupti64_65.dll.
 7225d72 
 Fix //tensorflow/core:core_cpu again after sync to tf/master 
1. Excluded load_library.cc
2. Fixed logging.h
 32b5414 
 Exclude common_runtime/gpu/ and stream_executor on Windows 
 57b1294 
 Using #pragma to include necessary library 
 2acbb91 
 Add windows specified sources 
 26b3eea 
 Exclude some ops on Windows 
Including debug_ops, immutable_constant_op, sparse_matmul_op, fact_op, cswise_op, depthwise_conv_op, svd_op, summary_audio_op

This makes //tensorflow/core:tensorflow build on Windows
 3925e88 
 Add /DLANG_CXX11 in tf_copts for Windows 
 481d623 
 Fix example trainer build after sync to windows-20160915 
 3ccce56 
 Re-enable svd_op, cwise_op, summary_audio_op, on Windows in Bazel Build 
 9b4b07e 
 Fixed lib_proto_parsing by adding 'copts = tf_copts()' 
 fe74dfa 
Collaborator

### tensorflow-jenkins commented Oct 6, 2016

 Can one of the admins verify this patch?

### mention-bot commented Oct 6, 2016

 @meteorcloudy, thanks for your PR! By analyzing the history of the files in this pull request, we identified @josh11b, @vrv and @jhseu to be potential reviewers.

Contributor

### mrry commented Oct 6, 2016

 Thanks for sending this in, Yun!
Contributor

Closed

Member

### meteorcloudy commented Oct 6, 2016

 You are welcome! Looks like there are many failures, I'll look into it.

### mrry reviewed Oct 6, 2016

tensorflow/core/BUILD
 ] + # Protobuf deps already included through the ":lib_proto_parsing" # dependency. tf_additional_proto_srcs(),

#### mrry Oct 6, 2016

Contributor

Nit: The indentation is a bit inconsistent between L955 and here.

#### meteorcloudy Oct 6, 2016

Member

Thanks, I'll reformat all the BUILD files with buildifier.

tensorflow/core/BUILD
 "//third_party/eigen3", ], ] + if_not_windows([":stream_executor"]),

#### mrry Oct 6, 2016

Contributor

Instead of making this an (almost) no-op when building on Windows, could we modify :core so that it only pulls in :gpu_runtime on not-Windows?

There are a few targets that depend on :gpu_runtime in this file, but they all appear to depend on :core_cpu, so AFAICT this could be replaced with a dependency on :core.

#### meteorcloudy Oct 6, 2016

Member

I found there are 4 targets we need to modify if we want to cut the dependency on gpu_runtime.
//tensorflow/core:core,
//tensorflow/core:tensorflow_opensource,
//tensorflow/core/util/ctc:ctc_beam_search_lib,
//tensorflow/core/kernels:array
As you can see, they are scattered in 3 different files, I kind of prefer the original change, so it will be easier when we enable gpu support later. WDYT?

#### mrry Oct 7, 2016 • edited Edited 1 time mrry edited Oct 7, 2016 (most recent)

Contributor

Some of these dependencies look spurious (especially for ctc_beam_search_lib), but fixing that is probably orthogonal to this change. ("gpu_runtime" seems like a strange thing for CPU-only code to depend on :)....) Fine to leave as is.

 @@ -1608,7 +1657,9 @@ tf_kernel_libraries( "//tensorflow/core:nn_grad", "//tensorflow/core:nn_ops_op_lib", "//third_party/eigen3", ], ] + if_not_windows([

#### mrry Oct 6, 2016

Contributor

I'm confused why this is treated differently from the tf_kernel_libraries() target, and thus not in :math_not_windows. (It may have been in the wrong place originally, but it'd be good to reduce the complexity if possible.)

I believe it should be possible to remove :depthwise_conv_op from the deps here, and move it into the prefixes of :math_not_windows. Does that make sense?

#### meteorcloudy Oct 6, 2016

Member

Sounds good, I'll verify if it works

#### meteorcloudy Oct 6, 2016

Member

Hmm.. As you can see the definition here
Looks like math_now_windows lacks dependencies depthwise_conv_op needs.
:conv_ops, :ops_util, //tensorflow/core:nn_ops_op_lib
Do you think it's fine to add them into deps of math_not_window?

#### mrry Oct 7, 2016

Contributor

Sorry, I should have said that :depthwise_conv_ops should (probably, for consistency) be in a rule called :nn_not_windows. Does that make more sense?

#### meteorcloudy Oct 7, 2016

Member

Yes, I agree!

tensorflow/core/platform/default/build_config.bzl
 }) def tf_additional_lib_srcs(exclude = []): return select({

#### mrry Oct 6, 2016

Contributor

Nit: only one space between return and select

Member

Thanks~

### meteorcloudy added some commits Oct 6, 2016

 Reformat BUILD files 
 ebc40fb 
 Refactor BUILD files and make it work on Linux 
 aae924c 
Member

### meteorcloudy commented Oct 6, 2016

 Can we test it again?
Member

Member

### yifeif commented Oct 7, 2016

 @mrry could you take another look?

Member

### meteorcloudy commented Oct 7, 2016

 @mrry Just pushed, please review again. :)
Contributor

### mrry approved these changes Oct 7, 2016

Looks good to me - thanks for the cleanup!

Member

### meteorcloudy commented Oct 7, 2016

 From the error message in Linux GPU, looks like we should also pull depthwise_conv_grad_op out of deps, but depthwise_conv_grad_op needs depthwise_conv_op.h. What should we do?
Member

### meteorcloudy commented Oct 7, 2016

 Or maybe they are in the deps for a reason? See these two commits b51ef0c, 99e298a
Contributor

### mrry commented Oct 7, 2016

 I'm not sure what the proper solution would be for them. Assigning to @jmchen-g, since he wrote those ops originally.

Contributor

### jmchen-g commented Oct 10, 2016

 Which deps are not clear here? If you mean why it is in the deps of nn, Depthwise is one of the convs that we need to support in TF and is commonly used to reduce the computing needs for a model.
Member

### meteorcloudy commented Oct 10, 2016

 OK, I see. Sounds like depthwise_conv_ops is supposed to be in deps of nn. I've deleted my last commit. Please test again?
Member

### meteorcloudy commented Oct 11, 2016 • edited Edited 1 time meteorcloudy edited Oct 11, 2016 (most recent)

 @mrry Are you OK with keeping depthwise_conv_ops in deps of nn? If so, I think this PR is ready to be merged.
Contributor

### mrry commented Oct 11, 2016

 @meteorcloudy Yes, that's fine. I suspect our existing build rules are not as clean as they could be for this op, but that shouldn't derail what you're trying to do :). I'll go ahead and merge now.

### mrry merged commit 1975cd1 into tensorflow:master Oct 11, 2016 8 checks passed

#### 8 checks passed

Android Demo App SUCCESS
Details
Linux CPU Tests SUCCESS
Details
Linux CPU Tests (Python 3) SUCCESS
Details
Linux GPU SUCCESS
Details
MacOS CPU Tests SUCCESS
Details
Sanity Checks SUCCESS
Details
ci.tensorflow.org SUCCESS
Details
cla/google All necessary CLAs are signed
Member

### meteorcloudy commented Oct 11, 2016

 Nice, thanks! \o/
Member

### rohan100jain commented Oct 15, 2016

 This change is causing problems in testing now with BUILD rules. Apparently select() and glob() don't play nicely with each other.

Merged

Member

### meteorcloudy commented Oct 15, 2016

 @rohan100jain Can you show me the error message? I wonder why it's not causing problem before.. Users are beginning to try Windows build with Bazel now, reverting this will completely break them.
Member

### rohan100jain commented Oct 15, 2016

 Even after fixing the bug at https://github.com/tensorflow/tensorflow/pull/4796/files#diff-15609f78c0cd35bfdc31f9d8bd9b1e91R984 there are failures like: https://ci.tensorflow.org/job/tensorflow-cl-sanity/14995/console I tried various combinations of glob() and select() and none seem to work.

### bssrdf commented Oct 16, 2016

 Just want to follow up on this. I managed to build label_image example using the same procedure. Have to fix a bug at bssrdf@58f93bd#diff-f6254560279f78a9d74e66dfd2d85977 for it to build on win. But it failed to run due to some file path issue \$ bazel-bin/tensorflow/examples/label_image/label_image E C:\tools\msys64\var\tmp\Bazel\s5nfxnhX\execroot\tensorflow\tensorflow\examples\label_image\main.cc:279] Not found: Failed to load compute graph at 'tensorflow/examples/label_image/data/tensorflow_inception_graph.pb' although the file tensorflow_inception_graph.pb is present in that directory.
Member

### meteorcloudy commented Oct 17, 2016

 @rohan100jain Hmm.. This is weird. Here's what I tested: I reset my branch to c20da14, which is just before reverting this change. And I ran bazel build --nobuild tensorflow/... on Mac, Linux and Windows. It all passed. For glob() and select(), as long as you don't use select as arguments of glob, they will work just fine. So I'm not sure why it's failing like that on ci. Can you give me some instructions to reproduce this error?

Closed

Member

### meteorcloudy commented Oct 17, 2016 • edited Edited 1 time meteorcloudy edited Oct 17, 2016 (most recent)

 I believe there is nothing wrong with this PR, but I am not sure how to get it back. Another PR or can you revert the revert commit? @mrry @rohan100jain
Contributor

### mrry commented Oct 17, 2016

 I think the best way forward is another PR. Should be quick to review if it's the same as the last one :).
Member

### meteorcloudy commented Oct 17, 2016

 OK, PR sent. #5009