
How to build TensorFlow from source so that it behaves the same across Intel and AMD CPUs (in terms of floating-point math precision) #56586

@GChaitanya2001


Issue Type

Build/Install

Source

source

Tensorflow Version

1.15

Custom Code

No

OS Platform and Distribution

Linux Ubuntu 18.04

Mobile device

No response

Python version

3.6.9

Bazel version

0.26.1

GCC/Compiler version

7.5.0

CUDA/cuDNN version

No response

GPU model and memory

No response

Current Behaviour?

I am trying to build TensorFlow v1.15 from source because I am seeing differences in floating-point math precision between Intel and AMD CPUs (see issue #56529). Which options can I pass to the build command to fix this? I am looking for a solution for either TF v1.15 or TF v2.3.
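To make "difference in precision" measurable, it can help to report it in units in the last place (ULPs) rather than as a raw decimal delta. The following is a minimal sketch, assuming the model outputs are float32 values that have been dumped to plain Python lists on each machine; the helper names `float32_bits`, `ulp_distance`, and `max_ulp_diff` are hypothetical and not part of TensorFlow.

```python
import struct


def float32_bits(x):
    """Return the IEEE-754 binary32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def _ordered(bits):
    """Map a sign-magnitude bit pattern to a monotonically ordered integer."""
    magnitude = bits & 0x7FFFFFFF
    return -magnitude if bits & 0x80000000 else magnitude


def ulp_distance(a, b):
    """Number of representable float32 values between a and b."""
    return abs(_ordered(float32_bits(a)) - _ordered(float32_bits(b)))


def max_ulp_diff(xs, ys):
    """Largest ULP distance between two equally long result lists."""
    if len(xs) != len(ys):
        raise ValueError("result lists must have the same length")
    return max(ulp_distance(x, y) for x, y in zip(xs, ys))


# Hypothetical outputs from two machines (illustrative values only).
intel_out = [1.0, 0.5, -2.0]
amd_out = [1.0 + 2.0 ** -23, 0.5, -2.0]
print(max_ulp_diff(intel_out, amd_out))
```

A difference of a few ULPs usually points to a different order of operations, whereas a much larger one suggests that different kernels or algorithms ran on the two machines.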

I tried the following build commands:

1)  bazel --output_base=/local/mnt/workspace/gsaichai build --config=v1 --config=mkl --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-mfpmath=387 --copt=-mtune=generic --copt=-march=x86-64 --host_copt=-march=x86-64 --verbose_failures //tensorflow/tools/pip_package:build_pip_package

2)  bazel --output_base=/local/mnt/workspace/gsaichai build --config=v1 --config=mkl --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-mfpmath=387 --copt=-march=x86-64 --host_copt=-march=x86-64 --verbose_failures //tensorflow/tools/pip_package:build_pip_package

With both builds, the difference was still present.

Both commands also initially failed with the following error:

ERROR: /local/mnt/workspace/gsaichai/qnn_src/tensorflow/tensorflow/python/BUILD:329:1: C++ compilation of rule '//tensorflow/python:bfloat16_lib' failed (Exit 1)
tensorflow/python/lib/core/bfloat16.cc: In function 'bool tensorflow::{anonymous}::Initialize()':
tensorflow/python/lib/core/bfloat16.cc:634:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [6], <unresolved overloaded function type>, const std::array<int, 3>&)'
                       compare_types)) {
                                    ^
tensorflow/python/lib/core/bfloat16.cc:608:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
                             const std::array<int, 3>& types) {
                                                            ^
tensorflow/python/lib/core/bfloat16.cc:608:60: note:   no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:638:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [10], <unresolved overloaded function type>, const std::array<int, 3>&)'
                       compare_types)) {
                                    ^
tensorflow/python/lib/core/bfloat16.cc:608:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
                             const std::array<int, 3>& types) {
                                                            ^
tensorflow/python/lib/core/bfloat16.cc:608:60: note:   no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:641:77: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [5], <unresolved overloaded function type>, const std::array<int, 3>&)'
   if (!register_ufunc("less", CompareUFunc<Bfloat16LtFunctor>, compare_types)) {
                                                                             ^
tensorflow/python/lib/core/bfloat16.cc:608:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
                             const std::array<int, 3>& types) {
                                                            ^
tensorflow/python/lib/core/bfloat16.cc:608:60: note:   no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:645:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [8], <unresolved overloaded function type>, const std::array<int, 3>&)'
                       compare_types)) {
                                    ^
tensorflow/python/lib/core/bfloat16.cc:608:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
                             const std::array<int, 3>& types) {
                                                            ^
tensorflow/python/lib/core/bfloat16.cc:608:60: note:   no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:649:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [11], <unresolved overloaded function type>, const std::array<int, 3>&)'
                       compare_types)) {
                                    ^
tensorflow/python/lib/core/bfloat16.cc:608:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
                             const std::array<int, 3>& types) {
                                                            ^
tensorflow/python/lib/core/bfloat16.cc:608:60: note:   no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:653:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [14], <unresolved overloaded function type>, const std::array<int, 3>&)'
                       compare_types)) {
                                    ^
tensorflow/python/lib/core/bfloat16.cc:608:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
                             const std::array<int, 3>& types) {
                                                            ^
tensorflow/python/lib/core/bfloat16.cc:608:60: note:   no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 232.568s, Critical Path: 63.39s
INFO: 1208 processes: 1208 local.
FAILED: Build did NOT complete successfully

The error above was resolved after I made some changes to my ./configure file. However, the resulting TensorFlow packages still show the difference between AMD and Intel. Please let me know which options I can pass when building TensorFlow from source (v1.15 or v2.3) so that floating-point math is consistent across AMD and Intel.
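One possible reason that compiler flags alone do not remove the difference (an assumption, not something confirmed in this issue) is that floating-point addition is not associative, so any library that picks a different vector width or reduction order at runtime on each CPU can produce slightly different sums from identical inputs. The sketch below imitates that effect in plain Python; `lane_sum` is a hypothetical helper, and the lane counts 1, 4, and 8 merely stand in for scalar, SSE-like, and AVX-like float32 reductions.

```python
import math


def lane_sum(values, lanes):
    """Sum values the way a SIMD reduction with `lanes` accumulators would.

    Element i is added to accumulator i % lanes, and the partial sums are
    combined at the end, so the order of additions depends on `lanes`.
    """
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    total = 0.0
    for p in partial:
        total += p
    return total


values = [0.1 * (i + 1) for i in range(1000)]

scalar = lane_sum(values, 1)    # strictly left-to-right
sse_like = lane_sum(values, 4)  # four accumulators
avx_like = lane_sum(values, 8)  # eight accumulators

# The three results agree closely but are not guaranteed to be bit-identical.
print(scalar, sse_like, avx_like)

# The underlying cause: regrouping the same additions changes the rounding.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # prints False
```

If this is what is happening, the two builds would need to execute the same kernels in the same order on both CPUs, which goes beyond `-march` and `-mfpmath` settings.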

Standalone code to reproduce the issue

None

Relevant log output

No response

Metadata


Labels

TF 1.15: for issues seen on TF 1.15
TF 2.3: Issues related to TF 2.3
stale: This label marks the issue/pr stale - to be closed automatically if no activity
stat:awaiting response: Status - Awaiting response from author
subtype: ubuntu/linux: Ubuntu/Linux Build/Installation Issues
type:build/install: Build and install issues
