"The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations" in "Hello, TensorFlow!" program #7778

Closed
bingq opened this Issue Feb 22, 2017 · 56 comments

bingq commented Feb 22, 2017

Opening this with reference to #7500.

Installed TensorFlow 1.0 following https://www.tensorflow.org/install/install_windows on Windows 10 and hit the same issue discussed in #7500. After applying the solution suggested in that thread, the original issue disappeared, but I got these new warnings:

C:\Users\geldqb>python
Python 3.5.3 (v3.5.3:1880cb95a742, Jan 16 2017, 16:02:32) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
2017-02-22 22:28:20.696929: W c:\tf_jenkins\home\workspace\nightly-win\device\cpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-02-22 22:28:20.698285: W c:\tf_jenkins\home\workspace\nightly-win\device\cpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-02-22 22:28:20.700143: W c:\tf_jenkins\home\workspace\nightly-win\device\cpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-02-22 22:28:20.700853: W c:\tf_jenkins\home\workspace\nightly-win\device\cpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-02-22 22:28:20.701498: W c:\tf_jenkins\home\workspace\nightly-win\device\cpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-02-22 22:28:20.702190: W c:\tf_jenkins\home\workspace\nightly-win\device\cpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-02-22 22:28:20.702837: W c:\tf_jenkins\home\workspace\nightly-win\device\cpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-02-22 22:28:20.703460: W c:\tf_jenkins\home\workspace\nightly-win\device\cpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
print(sess.run(hello))
b'Hello, TensorFlow!'

Contributor

Carmezim commented Feb 22, 2017

Those are simply warnings. They are just informing you that if you build TensorFlow from source, it can be faster on your machine. Those instructions are not enabled by default in the available builds, I think to stay compatible with as many CPUs as possible.
If you have any other doubts regarding this please feel free to ask, otherwise this can be closed.

edit: To deactivate these warnings as @yaroslavvb suggested in another comment, do the following:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import tensorflow as tf

or if you're on a Unix system simply do export TF_CPP_MIN_LOG_LEVEL=2.

TF_CPP_MIN_LOG_LEVEL is a TensorFlow environment variable responsible for the logs: to silence INFO logs set it to 1, to additionally filter out WARNING logs set it to 2, and to also silence ERROR logs (not recommended) set it to 3.
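As a minimal sketch of the above (standard library only): the variable has to be set before the first tensorflow import, because the C++ runtime reads it once at library load time.

```python
import os

# '0' = all logs, '1' = filter INFO, '2' = also filter WARNING, '3' = also filter ERROR.
# Must be set before TensorFlow is first imported; setting it afterwards has no effect.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

# import tensorflow as tf  # import only after the variable is set
```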

bingq commented Feb 22, 2017

Thanks @Carmezim!
Any hint on how to kill those warnings?

Contributor

yaroslavvb commented Feb 22, 2017

@bingq the best way I found to kill unwanted TF output:

Run your script as tf.sh myscript.py, where tf.sh contains:

#!/bin/sh
# Run python script, filtering out TensorFlow logging
# https://github.com/tensorflow/tensorflow/issues/566#issuecomment-259170351
python "$@" 3>&1 1>&2 2>&3 3>&- | grep -v ": I " | grep -v "WARNING:tensorflow" | grep -v "^pciBusID" | grep -v "^major:" | grep -v "^name:" | grep -v "^Total memory:" | grep -v "^Free memory:"

You can add extra |grep -v parts to get rid of more things
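For the Python-side messages there is also a grep-free option (a sketch using only the standard logging module; TensorFlow 1.x routes its Python-level logs through a logger named 'tensorflow'). Note this does not touch the C++ cpu_feature_guard warnings, which are controlled by TF_CPP_MIN_LOG_LEVEL instead.

```python
import logging

# Raise the threshold of TensorFlow's Python-side logger so that
# WARNING and below are hidden; only ERROR and CRITICAL get through.
logging.getLogger('tensorflow').setLevel(logging.ERROR)
```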

tomrunia commented Feb 24, 2017

@Carmezim Any estimate of how much faster when compiling from source using advanced CPU instructions? Any reason to do this at all when running most of the graph on a GPU?

Contributor

Carmezim commented Feb 24, 2017

@tomrunia I haven't tested it myself yet (I'm actually building it with SSE right now), but I've heard 4-8x. Perhaps @yaroslavvb wants to chime in, as he got a 3x speed improvement himself.

Contributor

yaroslavvb commented Feb 24, 2017

Yup, 3x for a large matrix multiply on a Xeon v3; I expect it's probably due to FMA/AVX rather than SSE.

Contributor

Carmezim commented Feb 25, 2017

@tomrunia As @yaroslavvb points out, it wasn't SSE specifically in his case, although those CPU instructions are expected to provide a performance improvement.

Juanlu001 commented Mar 7, 2017

It would be very nice to silence these warnings from the Python side; it's not so easy to use grep on Windows.

Contributor

yaroslavvb commented Mar 7, 2017

Try export TF_CPP_MIN_LOG_LEVEL=2

Juanlu001 commented Mar 7, 2017

Try export TF_CPP_MIN_LOG_LEVEL=2

Thanks @yaroslavvb, I haven't tried it yet but an environment variable is definitely more useful.

evgfreyman commented Mar 8, 2017

It works, thank you

hughsando commented Mar 9, 2017

Is it possible that these errors are coming from the fact that with MSVC, x64, SSE2 is implicit (all x64 chips have SSE2) but the __SSE2__ et al defines are not explicitly set?
Perhaps the guards should be on EIGEN_VECTORIZE_SSE2 etc instead.

evgfreyman commented Mar 9, 2017

Could you say what exactly I should do according to your idea? What does "guards on" mean?

hughsando commented Mar 10, 2017

The SSE warnings use code like this:

#ifndef __SSE__
    WarnIfFeatureUnused(CPUFeature::SSE, "SSE");
#endif  // __SSE__
#ifndef __SSE2__
    WarnIfFeatureUnused(CPUFeature::SSE2, "SSE2");
#endif  // __SSE2__

But the Eigen implementation (eigen/Eigen/Core) uses more complicated logic to work out whether to use SSE1/SSE2:

#ifndef EIGEN_DONT_VECTORIZE
  #if defined (EIGEN_SSE2_ON_NON_MSVC_BUT_NOT_OLD_GCC) || defined(EIGEN_SSE2_ON_MSVC_2008_OR_LATER)

    // Defines symbols for compile-time detection of which instructions are
    // used.
    // EIGEN_VECTORIZE_YY is defined if and only if the instruction set YY is used
    #define EIGEN_VECTORIZE
    #define EIGEN_VECTORIZE_SSE
    #define EIGEN_VECTORIZE_SSE2

    // Detect sse3/ssse3/sse4:
    // gcc and icc defines __SSE3__, ...
    // there is no way to know about this on msvc. You can define EIGEN_VECTORIZE_SSE* if you
    // want to force the use of those instructions with msvc.
    #ifdef __SSE3__
      #define EIGEN_VECTORIZE_SSE3
    #endif
    #ifdef __SSSE3__
      #define EIGEN_VECTORIZE_SSSE3
    #endif
    #ifdef __SSE4_1__
      #define EIGEN_VECTORIZE_SSE4_1
    #endif
    #ifdef __SSE4_2__
      #define EIGEN_VECTORIZE_SSE4_2
    #endif
    #ifdef __AVX__
      #define EIGEN_VECTORIZE_AVX
      #define EIGEN_VECTORIZE_SSE3
      #define EIGEN_VECTORIZE_SSSE3
      #define EIGEN_VECTORIZE_SSE4_1
      #define EIGEN_VECTORIZE_SSE4_2
    #endif

This is due mainly to the fact that Visual Studio assumes SSE1/SSE2 when compiling for x64. Also note that it depends on EIGEN_DONT_VECTORIZE, which is perhaps some user customization.

So one solution would be to #include eigen/Eigen/Core and use the "EIGEN_VECTORIZE_SSE" symbols in the conditional code guard ("#ifndef EIGEN_VECTORIZE_SSE").
I'm not 100% sure about the build system, or whether Eigen is the only source of SSE operations, so I'm not 100% sure this is the right answer.

I'm also not sure what is the right thing to do if building a binary for distribution. Do you include AVX and risk it not running, or do you not include it and risk the warning (and low performance)? Ideally you would build with full vectorization and let the software choose at runtime. I guess another possibility would be to build 2 dlls, and dynamically load the right one at runtime.
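The "let the software choose at runtime" idea above can be sketched as follows (a hypothetical illustration, not TensorFlow's actual mechanism; cpu_flags reads Linux's /proc/cpuinfo and the helper names are made up):

```python
def cpu_flags():
    # Linux-specific: parse the 'flags' line of /proc/cpuinfo.
    # Returns an empty set on other platforms.
    try:
        with open('/proc/cpuinfo') as f:
            for line in f:
                if line.startswith('flags'):
                    return set(line.split(':', 1)[1].split())
    except OSError:
        pass
    return set()

def pick_kernel(flags):
    # Prefer the most capable instruction set the CPU actually supports;
    # a real build would dynamically load the matching binary/DLL here.
    for isa in ('avx2', 'avx', 'sse4_2', 'sse2'):
        if isa in flags:
            return isa
    return 'generic'

print(pick_kernel(cpu_flags()))
```

The compile-everything-and-dispatch approach trades binary size for portability, which is exactly the tension described above.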

metorm commented Mar 13, 2017

You can try this post and tell me if it really becomes faster.

Contributor

Carmezim commented Mar 16, 2017

@Juanlu001 Check this comment for how this variable works and @yaroslavvb's code below for a handy way to change it.

RaviTezu commented Mar 24, 2017

Thanks @Carmezim
Setting the env. variable TF_CPP_MIN_LOG_LEVEL=3 via the os package inside the code worked 👍

Member

aselle commented Apr 1, 2017

@hughsando, these are in fact the debates we had internally. Ideally in the future we'd be able to ship all compiled versions and choose at runtime, but logistically that's actually quite time consuming and tricky to implement. We are looking into it, but this was the best solution we had for now. We wanted people to know that if they see that warning, things are working, but it could be faster if they build it themselves. I.e., if you're benchmarking our system, it's not a valid benchmark without compiling it with the best optimizations.

Contributor

yaroslavvb commented Apr 2, 2017

@hughsando PS people have been uploading wheels built for their favorite configuration to https://github.com/yaroslavvb/tensorflow-community-wheels

hughsando commented Apr 3, 2017

Member

aselle commented Apr 3, 2017

@hughsando, in spirit that is the idea we would like to pursue. However, it is more difficult than that, in that we use Eigen for a lot of the kernel implementations. Eigen would have to be compiled multiple ways without causing any symbol conflicts in the final binary, and we would also probably have to break the modules up into more DSOs so as not to have too large a binary resident.

hekimgil commented Apr 3, 2017

So after reading these, I went ahead and reinstalled, this time from source, following the instructions at https://www.tensorflow.org/install/install_sources. I still see the "The TensorFlow library wasn't compiled to use XXXX instructions..." warnings. So did I miss something, or is installing and building from source not what you meant by "building it yourself"?

Member

aselle commented Apr 4, 2017

What did you put for the compiler optimization options that ./configure asked you for? And what are the remaining warnings it shows you? Are you running the binary on the same machine you compiled it on?

hekimgil commented Apr 4, 2017

Thank you aselle, my problem is solved now but here is what happened:

  • I used all default options with ./configure except: 1) Y for CUDA support, and 2) compute capability
  • The warnings were about the SSE3, SSE4.1, SSE4.2, AVX, AVX2, and FMA capabilities of my machine (+ negative NUMA node read)
  • Yes, the same machine...

However, after reading your message, I changed my bazel build command from
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
to
bazel build -c opt --copt=-march=native --config=cuda //tensorflow/tools/pip_package:build_pip_package
with that extra --copt=-march=native in there, and that did the trick. The warnings about instructions disappeared (albeit with the NUMA warning still remaining). So although my problem seems to be solved, I wonder whether -march=native is perhaps not really the default for the "optimization flags to use" question in the ./configure options?

Member

aselle commented Apr 4, 2017

@gunan, is the default different, as @hekimgil suggests? Is this expected behavior?

Member

gunan commented Apr 4, 2017

To fix the error messages, the bazel build command should be:
bazel build --config opt --config cuda tensorflow/tools/pip_package:build_pip_package

-c opt != --config opt
-c means "--compilation_mode"

hekimgil commented Apr 4, 2017

Thanks @gunan and @aselle and sorry for the confusion.

I saw the -c opt option in another website and wrongfully assumed it was equivalent to --config=opt... To double-check, I ran the bazel build again with --config=opt and without any extras (that is, no --copt=-march=native) and it works just fine...

Apologies for taking your time and thanks again for your responses...

Daniel451 commented Apr 10, 2017

Sorry to bother, but I think the second question by @tomrunia was not answered. While it is clear that CPU runtimes benefit a lot from compiling TensorFlow from source with optimizations enabled, I am also wondering whether it has an impact on runtimes when using the GPU version?
I guess it should not make much of a difference for the GPU version of TensorFlow?

Member

aselle commented Apr 10, 2017

@Daniel451, not as much, but @zheng-xq can comment further.

Contributor

zheng-xq commented Apr 10, 2017

Depending on the model, it could make quite a bit of difference when the model has a lot to process with its input pipeline.

Daniel451 commented Apr 10, 2017

Ok, thanks for the clarification!

fat-lobyte commented Apr 11, 2017

Well, is there an official (or semi-official) build available that uses the "advanced" CPU instruction sets?
Maybe add an official pip release, something like "tensorflow-sse" and "tensorflow-gpu-sse"?

After the official stable release of 1.0, I was kind of looking forward to not having to build TensorFlow myself anymore.
But if that implies taking a serious performance hit, I guess it's back to self-compiling again. 😒

KendallWeihe commented Apr 11, 2017

I am unfamiliar with SSE, AVX, and FMA. I'm confused why I would be getting these CPU warnings when I'm computing on my GPU. Furthermore, when I upgraded to r1.0 my compute time increased significantly. What is going on? I can confirm that TensorFlow is using the GPU by the following initialization messages:

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: GeForce GTX 1070
major: 6 minor: 1 memoryClockRate (GHz) 1.797
pciBusID 0000:03:00.0
Total memory: 7.92GiB
Free memory: 6.98GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:03:00.0)

Furthermore, I can see Tensorflow is using GPU memory:

volcart@volcart-Precision-Tower-7910:/usr/bin$ nvidia-smi
Tue Apr 11 12:25:36 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 0000:03:00.0      On |                  N/A |
| 17%   59C    P2    38W / 185W |   7871MiB /  8105MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1120    G   /usr/lib/xorg/Xorg                             412MiB |
|    0      2188    G   compiz                                         244MiB |
|    0      2674    G   ...s-passed-by-fd --v8-snapshot-passed-by-fd    48MiB |
|    0      3529    G   ...eCTForProblematicRoots/disabled/ExpectCTR    72MiB |
|    0     15786    C   python3                                          8MiB |
|    0     20754    C   python3                                       7082MiB |
+-----------------------------------------------------------------------------+

Why is my training so much slower now?

I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:03:00.0)

Furthermore, I can see Tensorflow is using GPU memory:

volcart@volcart-Precision-Tower-7910:/usr/bin$ nvidia-smi
Tue Apr 11 12:25:36 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 0000:03:00.0      On |                  N/A |
| 17%   59C    P2    38W / 185W |   7871MiB /  8105MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1120    G   /usr/lib/xorg/Xorg                             412MiB |
|    0      2188    G   compiz                                         244MiB |
|    0      2674    G   ...s-passed-by-fd --v8-snapshot-passed-by-fd    48MiB |
|    0      3529    G   ...eCTForProblematicRoots/disabled/ExpectCTR    72MiB |
|    0     15786    C   python3                                          8MiB |
|    0     20754    C   python3                                       7082MiB |
+-----------------------------------------------------------------------------+

Why is my training so much slower now?

Zephyr-D commented Apr 11, 2017

Hi all,

I'm using a MacBook Pro 2015 (8 GB) to do some simple feature extraction with CPU support only. I first installed TF easily via pip but got those warnings. Thanks to all above, I successfully recompiled TF from source following the official instructions. Here are some results I want to share:

  • Compiling (installing from source) time: about 1.5 hours
    (There are a lot of warnings during compilation, but it works in the end. Just follow the instructions.)
  • CPU computing speed-up: about 2x
    (Using ResNet-50 for feature extraction, the pip version takes about 85 s per batch vs. about 41 s for the source build.)

Hope this helps. But still, a GTX will be a much better choice...

ckalas commented Apr 13, 2017

I have successfully built the GPU version from source, but I'm not sure the GPU is being used. I'll attach the output below; basically, I don't see the CUDA imports. Does this mean something went wrong?

Using TensorFlow backend.
Found 2125 images belonging to 2 classes.
Found 832 images belonging to 2 classes.
demo.py:64: UserWarning: Update your fit_generator call to the Keras 2 API: fit_generator(<keras.pre..., validation_data=<keras.pre..., steps_per_epoch=128, epochs=5, validation_steps=832) nb_val_samples=nb_validation_samples)
Epoch 1/5
2017-04-12 20:41:32.502286: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:865] OS X does not support NUMA - returning NUMA node zero
2017-04-12 20:41:32.502390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties:
name: GeForce GT 750M
major: 3 minor: 0 memoryClockRate (GHz) 0.9255
pciBusID 0000:01:00.0
Total memory: 2.00GiB
Free memory: 1.74GiB
2017-04-12 20:41:32.502402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0
2017-04-12 20:41:32.502405: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0: Y
2017-04-12 20:41:32.502412: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 750M, pci bus id: 0000:01:00.0)

jshin49 commented Apr 18, 2017

Here is the build command I used to get rid of those warnings by actually optimizing TensorFlow to my CPU:

sudo bazel build --config opt --copt=-msse4.1 --copt=-msse4.1 --copt=-mavx --copt=-mavx2 --copt=-mfma //tensorflow/tools/pip_package:build_pip_package
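Before picking `--copt` flags like the ones above, it can help to check which SIMD features your CPU actually reports. The sketch below is a hypothetical helper (`cpu_simd_flags` and the sample string are made up for illustration, not part of TensorFlow or bazel) that parses the `flags` line Linux exposes in /proc/cpuinfo:

```python
# Hypothetical helper, not part of TensorFlow: parse the "flags" line from
# Linux's /proc/cpuinfo to see which SIMD features the CPU reports, so you
# know which --copt=-m... options are worth passing to bazel.
SIMD_FEATURES = ('sse', 'sse2', 'sse3', 'sse4_1', 'sse4_2', 'avx', 'avx2', 'fma')

def cpu_simd_flags(cpuinfo_text):
    for line in cpuinfo_text.splitlines():
        if line.startswith('flags'):
            present = set(line.split(':', 1)[1].split())
            return [f for f in SIMD_FEATURES if f in present]
    return []

# On a real Linux machine you would read the file itself:
# with open('/proc/cpuinfo') as f:
#     print(cpu_simd_flags(f.read()))

# Illustrative sample only:
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 sse3 sse4_1 sse4_2 avx avx2 fma\n"
print(cpu_simd_flags(sample))  # ['sse', 'sse2', 'sse3', 'sse4_1', 'sse4_2', 'avx', 'avx2', 'fma']
```

Note that /proc/cpuinfo uses `sse4_1`/`sse4_2` where the compiler flags are spelled `-msse4.1`/`-msse4.2`.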
logological commented Apr 25, 2017

@jshin49: Why are you using --copt=-msse4.1 twice? Or is the second one a typo, and you meant to write --copt=-msse4.2?

jshin49 commented Apr 25, 2017

@logological: You are right. It is a typo and I meant msse4.2.

Thanks for pointing it out!

apacha commented May 11, 2017

Just as a matter of interest: how come these warnings were not present in TensorFlow 1.0.1 but only in the newer TensorFlow 1.1.0?

akors commented May 11, 2017

Just as a matter of interest: how come these warnings were not present in TensorFlow 1.0.1 but only in the newer TensorFlow 1.1.0?

Not quite sure what you mean; I am using the binaries from pip for TF 1.0.1, and the warnings were already there.

apacha commented May 11, 2017

Aha... then maybe my cached wheel of TF 1.0.1 was somehow different. It displayed some warnings regarding OpKernels but no warnings regarding SSE instructions.

ProgramItUp commented May 21, 2017

@jshin49 and others who have tried it:
How much performance improvement are you seeing by recompiling with compiler optimizations?

apacha commented May 21, 2017

No performance improvements here when activating AVX, but probably because I was building and using the GPU version. I thought it would speed up some parts at least, but I didn't find it to make a big impact. It probably does make an impact when using the CPU version.

lgalke commented Jun 12, 2017

Single Instruction Multiple Data (SIMD) makes sense when you perform vector computations on the CPU. Thanks for the build-command examples.

mphz commented Jun 15, 2017

@apacha You should uninstall TensorFlow and then reinstall it to kill the OpKernels warnings.

neelkadia commented Jun 23, 2017

If I do export TF_CPP_MIN_LOG_LEVEL=2, the console doesn't show results; it is just empty.
Only if I do export TF_CPP_MIN_LOG_LEVEL=0 does it pop up with all the warnings and results:

2017-06-24 07:02:26.650752: W tensorflow/core/framework/op_def_util.cc:332] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
2017-06-24 07:02:26.851332: I tensorflow/examples/label_image/main.cc:251] withoutshadow (0): 0.801019
2017-06-24 07:02:26.851356: I tensorflow/examples/label_image/main.cc:251] withshadow (1): 0.198981

Any thoughts on why this is happening?

guruprasaad123 commented Aug 5, 2017

I am also getting this warning while using the TensorFlow APIs in Java. I have no idea why. Any ideas on how to resolve this warning?

Ewurama commented Aug 22, 2017

If you want to disable them, you may use the code below:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import tensorflow as tf

This should silence the warnings. TF_CPP_MIN_LOG_LEVEL is the TensorFlow environment variable responsible for logging. Also, if you are on Ubuntu, you may use the command below:

export TF_CPP_MIN_LOG_LEVEL=2

I hope this helps.
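For reference, the levels work roughly as follows (note the variable must be set before importing TensorFlow): 0 shows everything, 1 filters out INFO messages, 2 filters out INFO and WARNING, and 3 leaves only errors. A minimal sketch:

```python
import os

# TF_CPP_MIN_LOG_LEVEL must be set BEFORE importing tensorflow:
#   '0' -> all messages, '1' -> filter INFO,
#   '2' -> filter INFO + WARNING, '3' -> errors only.
# Level 2 is what hides the SSE/AVX/FMA warnings; it also hides any
# INFO-level output a program prints through TF's logging.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

# import tensorflow as tf  # would now start without the CPU-feature warnings

print(os.environ['TF_CPP_MIN_LOG_LEVEL'])  # 2
```

This also explains the observation above that level 2 hides label_image's results: those result lines are emitted at INFO severity (the leading "I"), so level 2 filters them along with the warnings.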

g10guang commented Aug 31, 2017

@Carmezim If I compute with the GPU, will I get this warning?
And how do I know whether TensorFlow is running on the GPU or the CPU?
Thanks.

scotthuang1989 commented Sep 3, 2017

Carmezim commented Sep 3, 2017

@g10guang Yeah, TF can use both CPU and GPU, but even if you're using the GPU only, it will inform you of the SIMD instructions available when you run the code.

To see which device TF is running on, you can set log_device_placement to True when creating the session:
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

You can see more details on this under Logging Device Placement in the documentation.

SragAR commented Sep 21, 2017

Follow the instructions here (just one instruction!).
It is amazing. The time taken for a training step is halved. #8037

shanuka commented Sep 23, 2017

Thanks @Carmezim !

giker17 commented Dec 15, 2017

I am getting the same warnings while using tensorflow_gpu 1.1.0 on Windows 10; the Python version is 3.6.3, installed via Anaconda.
Since the GPU version of TF is being used, why does the CPU matter?
I want to know what these warnings mean, and whether I should do something about them or just leave them.

Here are my warnings:

C:\DevTools\Anaconda3\envs\py36_tfg>python
Python 3.6.3 | packaged by conda-forge | (default, Dec  9 2017, 16:22:46) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> sess = tf.Session()
2017-12-15 09:59:27.506604: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-12-15 09:59:27.507839: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-15 09:59:27.509196: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-15 09:59:27.509641: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-15 09:59:27.510098: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-15 09:59:27.510475: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-12-15 09:59:27.512253: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-15 09:59:27.512821: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-12-15 09:59:27.824267: I c:\l\work\tensorflow-1.1.0\tensorflow\core\common_runtime\gpu\gpu_device.cc:887] Found device 0 with properties:
name: GeForce GTX 1060
major: 6 minor: 1 memoryClockRate (GHz) 1.6705
pciBusID 0000:01:00.0
Total memory: 3.00GiB
Free memory: 2.43GiB
2017-12-15 09:59:27.824508: I c:\l\work\tensorflow-1.1.0\tensorflow\core\common_runtime\gpu\gpu_device.cc:908] DMA: 0
2017-12-15 09:59:27.826709: I c:\l\work\tensorflow-1.1.0\tensorflow\core\common_runtime\gpu\gpu_device.cc:918] 0:   Y
2017-12-15 09:59:27.827566: I c:\l\work\tensorflow-1.1.0\tensorflow\core\common_runtime\gpu\gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0)

Another question:
How will I know whether TF is using the GPU or the CPU while running a program such as object detection on a video? Are there any tools suggested for monitoring my devices?

Thanks a lot.
: )

gknight7 commented Jun 19, 2018

Instead of removing the warning, is there any way to use those SSE instructions to speed up training?

SragAR commented Jun 20, 2018

@gknight7
Please check my comment above.
