
tensorflow 2.15.0 #353

Merged · 17 commits · Dec 17, 2023

Conversation

@xhochy (Member) commented Nov 15, 2023

Fixes #352

@conda-forge-webservices

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@xhochy mentioned this pull request Nov 15, 2023
@xhochy (Member, Author) commented Nov 15, 2023

Fails in the estimator build with:

+ bazel build tensorflow_estimator/tools/pip_package:build_pip_package
Starting local Bazel server and connecting to it...
Loading:
Loading:
Loading: 0 packages loaded
Analyzing: target //tensorflow_estimator/tools/pip_package:build_pip_package (1 packages loaded, 0 targets configured)
Analyzing: target //tensorflow_estimator/tools/pip_package:build_pip_package (44 packages loaded, 283 targets configured)
INFO: Analyzed target //tensorflow_estimator/tools/pip_package:build_pip_package (47 packages loaded, 333 targets configured).
INFO: Found 1 target...
[0 / 8] [Prepa] BazelWorkspaceStatusAction stable-status.txt
ERROR: /home/uwe/mambaforge/conda-bld/tensorflow-split_1700044396155/work/tensorflow-estimator/tensorflow_estimator/python/estimator/canned/linear_optimizer/BUILD:55:11: Extracting tensorflow_estimator APIs for //tensorflow_estimator/python/estimator/canned/linear_optimizer:sharded_mutable_dense_hashtable_py to bazel-out/k8-fastbuild/bin/tensorflow_estimator/python/estimator/canned/linear_optimizer/sharded_mutable_dense_hashtable_py_extracted_tensorflow_estimator_api.json. failed: (Aborted): extractor_wrapper failed: error executing command (from target //tensorflow_estimator/python/estimator/canned/linear_optimizer:sharded_mutable_dense_hashtable_py) bazel-out/k8-opt-exec-2B5CBBC6/bin/tensorflow_estimator/python/estimator/api/extractor_wrapper --output ... (remaining 6 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
[libprotobuf ERROR google/protobuf/descriptor_database.cc:642] File already exists in database: google/protobuf/descriptor.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:1986] CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
terminate called after throwing an instance of 'google::protobuf::FatalException'
  what():  CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
ERROR: /home/uwe/mambaforge/conda-bld/tensorflow-split_1700044396155/work/tensorflow-estimator/tensorflow_estimator/python/estimator/BUILD:633:11: Extracting tensorflow_estimator APIs for //tensorflow_estimator/python/estimator:dnn_linear_combined to bazel-out/k8-fastbuild/bin/tensorflow_estimator/python/estimator/dnn_linear_combined_extracted_tensorflow_estimator_api.json. failed: (Aborted): extractor_wrapper failed: error executing command (from target //tensorflow_estimator/python/estimator:dnn_linear_combined) bazel-out/k8-opt-exec-2B5CBBC6/bin/tensorflow_estimator/python/estimator/api/extractor_wrapper --output ... (remaining 6 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
[libprotobuf ERROR google/protobuf/descriptor_database.cc:642] File already exists in database: google/protobuf/descriptor.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:1986] CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
terminate called after throwing an instance of 'google::protobuf::FatalException'
  what():  CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
[12 / 75] Extracting tensorflow_estimator APIs for //tensorflow_estimator/python/estimator:export_output to bazel-out/k8-fastbuild/bin/tensorflow_estimator/python/estimator/export_output_extracted_tensorflow_estimator_api.json.; 1s linux-sandbox ... (46 actions, 45 running)
Target //tensorflow_estimator/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
ERROR: /home/uwe/mambaforge/conda-bld/tensorflow-split_1700044396155/work/tensorflow-estimator/tensorflow_estimator/tools/pip_package/BUILD:18:10 Middleman _middlemen/tensorflow_Uestimator_Stools_Spip_Upackage_Sbuild_Upip_Upackage-runfiles failed: (Aborted): extractor_wrapper failed: error executing command (from target //tensorflow_estimator/python/estimator/canned/linear_optimizer:sharded_mutable_dense_hashtable_py) bazel-out/k8-opt-exec-2B5CBBC6/bin/tensorflow_estimator/python/estimator/api/extractor_wrapper --output ... (remaining 6 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
INFO: Elapsed time: 5.216s, Critical Path: 1.16s
INFO: 59 processes: 59 internal.

@xhochy (Member, Author) commented Nov 17, 2023

Bisecting for this error:

Supposedly it is this:

% git bisect bad
7c8a95f2ab9b8996eccf5c33729018a45af467cb is the first bad commit
commit 7c8a95f2ab9b8996eccf5c33729018a45af467cb
Author: Shixin Li <shixinli@google.com>
Date:   Fri Sep 22 13:05:26 2023 -0700

    Enable cross compilation for PJRT GPU compiler:
    1. StreamExecutorGpuCompiler compiles w/o client.
    2. Add StreamExecutorGpuExecutable (the unloaded pjrt executable).
    3. Load StreamExecutorGpuExecutable to PjRtLoadedExecutable through `Load` API.

    PiperOrigin-RevId: 567697879

 third_party/xla/xla/client/local_client.h          |   2 +
 third_party/xla/xla/pjrt/BUILD                     |  16 ++
 third_party/xla/xla/pjrt/gpu/BUILD                 |  95 +++++++++++-
 third_party/xla/xla/pjrt/gpu/se_gpu_pjrt_client.cc |  45 ++++++
 third_party/xla/xla/pjrt/gpu/se_gpu_pjrt_client.h  |   5 +
 .../xla/xla/pjrt/gpu/se_gpu_pjrt_compiler.cc       | 108 +++++++++++++
 .../xla/xla/pjrt/gpu/se_gpu_pjrt_compiler.h        |  15 ++
 .../xla/pjrt/gpu/se_gpu_pjrt_compiler_aot_test.cc  | 167 +++++++++++++++++++++
 .../xla/xla/pjrt/gpu/se_gpu_pjrt_compiler_test.cc  |   1 +
 .../xla/xla/pjrt/pjrt_stream_executor_client.cc    |  39 ++---
 .../xla/xla/pjrt/pjrt_stream_executor_client.h     |   1 +
 .../pjrt/stream_executor_unloaded_executable.cc    |  31 ++++
 .../xla/pjrt/stream_executor_unloaded_executable.h |  78 ++++++++++
 .../pjrt/stream_executor_unloaded_executable.proto |  28 ++++
 third_party/xla/xla/service/gpu/BUILD              |  14 ++
 third_party/xla/xla/service/gpu/gpu_compiler.cc    |  15 --
 third_party/xla/xla/service/gpu/gpu_compiler.h     |  13 +-
 .../xla/xla/service/gpu/gpu_target_config.cc       |  38 +++++
 .../xla/xla/service/gpu/gpu_target_config.h        |  41 +++++
 19 files changed, 705 insertions(+), 47 deletions(-)
 create mode 100644 third_party/xla/xla/pjrt/gpu/se_gpu_pjrt_compiler_aot_test.cc
 create mode 100644 third_party/xla/xla/pjrt/stream_executor_unloaded_executable.cc
 create mode 100644 third_party/xla/xla/pjrt/stream_executor_unloaded_executable.h
 create mode 100644 third_party/xla/xla/pjrt/stream_executor_unloaded_executable.proto
 create mode 100644 third_party/xla/xla/service/gpu/gpu_target_config.cc
 create mode 100644 third_party/xla/xla/service/gpu/gpu_target_config.h

Maybe one of my bisects took a wrong turn?

@xhochy (Member, Author) commented Nov 20, 2023

I got past the problem by carefully reading the Bazel scripts of riegeli. Next stop: CUDA.

@xhochy (Member, Author) commented Nov 21, 2023

CUDA builds fail with the following (I have no idea what it means):

/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h: In constructor 'absl::lts_20230125::str_format_internal::FormatSpecTemplate<Args>::FormatSpecTemplate(const absl::lts_20230125::str_format_internal::ExtendedParsedFormat<absl::lts_20230125::FormatConversionCharSet(C)...>&)':
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:171:1: error: parse error in template argument list
  171 |     CheckArity<sizeof...(C), sizeof...(Args)>();
      | ^   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:171:63: error: expected ';' before ')' token
  171 |     CheckArity<sizeof...(C), sizeof...(Args)>();
      |                                                               ^
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:172:147: error: template argument 1 is invalid
  172 |     CheckMatches<C...>(absl::make_index_sequence<sizeof...(C)>{});
      |                                                                                                                                                   ^
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:172:151: error: expected primary-expression before '{' token
  172 |     CheckMatches<C...>(absl::make_index_sequence<sizeof...(C)>{});
      |                                                                                                                                                       ^
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:172:151: error: expected ';' before '{' token
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:172:153: error: expected primary-expression before ')' token
  172 |     CheckMatches<C...>(absl::make_index_sequence<sizeof...(C)>{});
      |                                                                                                                                                         ^
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h: In instantiation of 'constexpr absl::lts_20230125::FormatConversionCharSet absl::lts_20230125::str_format_internal::ArgumentToConv() [with Arg = long int]':
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/str_format.h:268:156:   required by substitution of 'template<class ... Args> using FormatSpec = absl::lts_20230125::str_format_internal::FormatSpecTemplate<absl::lts_20230125::FormatConversionCharSet((ArgumentToConv<Args>)())...> [with Args = {long int, const tensorflow::ResourceBase*}]'
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/str_format.h:351:1:   required by substitution of 'template<class ... Args> std::string absl::lts_20230125::StrFormat(absl::lts_20230125::FormatSpec<Args ...>&, const Args& ...) [with Args = {long int, const tensorflow::ResourceBase*}]'
./tensorflow/core/framework/resource_base.h:44:23:   required from here
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:403:43: error: no matching function for call to 'ExtractCharSet(ConvResult)'
  403 |   return absl::str_format_internal::ExtractCharSet(ConvResult{});
      |        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:196:1: note: candidate: 'template<absl::lts_20230125::FormatConversionCharSet C> constexpr absl::lts_20230125::FormatConversionCharSet absl::lts_20230125::str_format_internal::ExtractCharSet(absl::lts_20230125::FormatConvertResult<(absl::lts_20230125::FormatConversionCharSet)(C)>)'
  196 | constexpr FormatConversionCharSet ExtractCharSet(FormatConvertResult<C>) {
      | ^~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:196:1: note:   template argument deduction/substitution failed:
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:403:43: note:   couldn't deduce template parameter 'C'
  403 |   return absl::str_format_internal::ExtractCharSet(ConvResult{});
      |        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:201:1: note: candidate: 'template<absl::lts_20230125::FormatConversionCharSet C> constexpr absl::lts_20230125::FormatConversionCharSet absl::lts_20230125::str_format_internal::ExtractCharSet(absl::lts_20230125::str_format_internal::ArgConvertResult<(absl::lts_20230125::FormatConversionCharSet)(C)>)'
  201 | constexpr FormatConversionCharSet ExtractCharSet(ArgConvertResult<C>) {
      | ^~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:201:1: note:   template argument deduction/substitution failed:
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:403:43: note:   couldn't deduce template parameter 'C'
…

@xhochy (Member, Author) commented Nov 23, 2023

Next one:

tensorflow/core/kernels/cast_op_gpu.cu.cc(32): warning #846-D: this partial specialization would have made the instantiation of class "tensorflow::functor::CastFunctor<tensorflow::functor::GPUDevice, tsl::float8_e4m3fn, Eigen::half>" ambiguous

external/eigen_archive/Eigen/src/Core/MathFunctions.h(429): error: more than one user-defined conversion from "const tsl::uint4" to "tsl::int4" applies:
            function template "ml_dtypes::i4<UnderlyingTy>::operator T() const [with UnderlyingTy=uint8_t]"
bazel-out/k8-opt/bin/external/ml_dtypes/_virtual_includes/int4/ml_dtypes/include/int4.h(52): here
            function template "ml_dtypes::i4<UnderlyingTy>::i4(T) [with UnderlyingTy=int8_t]"
bazel-out/k8-opt/bin/external/ml_dtypes/_virtual_includes/int4/ml_dtypes/include/int4.h(42): here
          detected during:
            instantiation of "NewType Eigen::internal::cast_impl<OldType, NewType, EnableIf>::run(const OldType &) [with OldType=tsl::uint4, NewType=tsl::int4, EnableIf=void]"
(462): here
            instantiation of "NewType Eigen::internal::cast<OldType,NewType>(const OldType &) [with OldType=tsl::uint4, NewType=tsl::int4]"
external/eigen_archive/Eigen/src/Core/functors/UnaryFunctors.h(179): here
            instantiation of "const NewType Eigen::internal::scalar_cast_op<Scalar, NewType>::operator()(const Scalar &) const [with Scalar=tsl::uint4, NewType=tsl::int4]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorConversion.h(238): here
            instantiation of "TargetType Eigen::internal::CoeffConv<SrcType, TargetType, IsSameT>::run(const Eigen::TensorEvaluator<ArgType, Device> &, Eigen::Index) [with SrcType=tsl::uint4, TargetType=tsl::int4, IsSameT=false, ArgType=const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorConversion.h(395): here
            instantiation of "Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::CoeffReturnType Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::coeff(Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::Index) const [with TargetType=tsl::int4, ArgType=const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, Device=Eigen::GpuDevice]"

external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorAssign.h(174): here
            instantiation of "void Eigen::TensorEvaluator<const Eigen::TensorAssignOp<LeftArgType, RightArgType>, Device>::evalScalar(Eigen::TensorEvaluator<const Eigen::TensorAssignOp<LeftArgType, RightArgType>, Device>::Index) const [with LeftArgType=Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, RightArgType=const Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(607): here
            instantiation of "void Eigen::internal::EigenMetaKernelEval<Evaluator, StorageIndex, Vectorizable>::run(Evaluator &, StorageIndex, StorageIndex, StorageIndex) [with Evaluator=Eigen::TensorEvaluator<const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Eigen::GpuDevice>, StorageIndex=Eigen::DenseIndex, Vectorizable=false]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(644): here
            instantiation of "void Eigen::internal::EigenMetaKernel(Evaluator, StorageIndex) [with Evaluator=Eigen::TensorEvaluator<const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Eigen::GpuDevice>, StorageIndex=Eigen::DenseIndex]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(665): here
            instantiation of "void Eigen::internal::TensorExecutor<Expression, Eigen::GpuDevice, Vectorizable, Tiling>::run(const Expression &, const Eigen::GpuDevice &) [with Expression=const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Vectorizable=false, Tiling=Eigen::internal::Off]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorDevice.h(39): here
            instantiation of "Eigen::TensorDevice<ExpressionType, DeviceType> &Eigen::TensorDevice<ExpressionType, DeviceType>::operator=(const OtherDerived &) [with ExpressionType=Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, DeviceType=tensorflow::functor::GPUDevice, OtherDerived=Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>]"
tensorflow/core/kernels/cast_op_gpu.cu.cc(32): here
            instantiation of "void tensorflow::functor::CastFunctor<tensorflow::functor::GPUDevice, OUT_TYPE, IN_TYPE>::operator()(const tensorflow::functor::GPUDevice &, tensorflow::TTypes<OUT_TYPE, 1, Eigen::DenseIndex>::Flat, tensorflow::TTypes<IN_TYPE, 1, Eigen::DenseIndex>::ConstFlat, __nv_bool) [with OUT_TYPE=tsl::int4, IN_TYPE=tsl::uint4]"
tensorflow/core/kernels/cast_op_gpu.cu.cc(177): here

external/eigen_archive/Eigen/src/Core/MathFunctions.h(429): error: more than one user-defined conversion from "const tsl::int4" to "tsl::uint4" applies:
            function template "ml_dtypes::i4<UnderlyingTy>::operator T() const [with UnderlyingTy=int8_t]"
bazel-out/k8-opt/bin/external/ml_dtypes/_virtual_includes/int4/ml_dtypes/include/int4.h(52): here
            function template "ml_dtypes::i4<UnderlyingTy>::i4(T) [with UnderlyingTy=uint8_t]"
bazel-out/k8-opt/bin/external/ml_dtypes/_virtual_includes/int4/ml_dtypes/include/int4.h(42): here
          detected during:
            instantiation of "NewType Eigen::internal::cast_impl<OldType, NewType, EnableIf>::run(const OldType &) [with OldType=tsl::int4, NewType=tsl::uint4, EnableIf=void]"
(462): here
            instantiation of "NewType Eigen::internal::cast<OldType,NewType>(const OldType &) [with OldType=tsl::int4, NewType=tsl::uint4]"
external/eigen_archive/Eigen/src/Core/functors/UnaryFunctors.h(179): here
            instantiation of "const NewType Eigen::internal::scalar_cast_op<Scalar, NewType>::operator()(const Scalar &) const [with Scalar=tsl::int4, NewType=tsl::uint4]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorConversion.h(238): here
            instantiation of "TargetType Eigen::internal::CoeffConv<SrcType, TargetType, IsSameT>::run(const Eigen::TensorEvaluator<ArgType, Device> &, Eigen::Index) [with SrcType=tsl::int4, TargetType=tsl::uint4, IsSameT=false, ArgType=const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorConversion.h(395): here
            instantiation of "Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::CoeffReturnType Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::coeff(Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::Index) const [with TargetType=tsl::uint4, ArgType=const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, Device=Eigen::GpuDevice]"

external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorAssign.h(174): here
            instantiation of "void Eigen::TensorEvaluator<const Eigen::TensorAssignOp<LeftArgType, RightArgType>, Device>::evalScalar(Eigen::TensorEvaluator<const Eigen::TensorAssignOp<LeftArgType, RightArgType>, Device>::Index) const [with LeftArgType=Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, RightArgType=const Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(607): here
            instantiation of "void Eigen::internal::EigenMetaKernelEval<Evaluator, StorageIndex, Vectorizable>::run(Evaluator &, StorageIndex, StorageIndex, StorageIndex) [with Evaluator=Eigen::TensorEvaluator<const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Eigen::GpuDevice>, StorageIndex=Eigen::DenseIndex, Vectorizable=false]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(644): here
            instantiation of "void Eigen::internal::EigenMetaKernel(Evaluator, StorageIndex) [with Evaluator=Eigen::TensorEvaluator<const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Eigen::GpuDevice>, StorageIndex=Eigen::DenseIndex]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(665): here
            instantiation of "void Eigen::internal::TensorExecutor<Expression, Eigen::GpuDevice, Vectorizable, Tiling>::run(const Expression &, const Eigen::GpuDevice &) [with Expression=const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Vectorizable=false, Tiling=Eigen::internal::Off]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorDevice.h(39): here
            instantiation of "Eigen::TensorDevice<ExpressionType, DeviceType> &Eigen::TensorDevice<ExpressionType, DeviceType>::operator=(const OtherDerived &) [with ExpressionType=Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, DeviceType=tensorflow::functor::GPUDevice, OtherDerived=Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>]"
tensorflow/core/kernels/cast_op_gpu.cu.cc(32): here
            instantiation of "void tensorflow::functor::CastFunctor<tensorflow::functor::GPUDevice, OUT_TYPE, IN_TYPE>::operator()(const tensorflow::functor::GPUDevice &, tensorflow::TTypes<OUT_TYPE, 1, Eigen::DenseIndex>::Flat, tensorflow::TTypes<IN_TYPE, 1, Eigen::DenseIndex>::ConstFlat, __nv_bool) [with OUT_TYPE=tsl::uint4, IN_TYPE=tsl::int4]"
tensorflow/core/kernels/cast_op_gpu.cu.cc(178): here

2 errors detected in the compilation of "tensorflow/core/kernels/cast_op_gpu.cu.cc".
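
In essence, nvcc appears to see two equally good user-defined conversion paths between the two int4 wrapper types. A minimal repro sketch of that shape (a simplified stand-in for ml_dtypes' i4, not the real int4.h):

    #include <cstdint>

    // Simplified i4 wrapper: a templated converting constructor (int4.h(42)
    // in the log above) plus a templated conversion operator (int4.h(52)).
    template <typename UnderlyingTy>
    struct i4 {
      UnderlyingTy v;
      i4() = default;
      template <typename T>
      explicit i4(T t) : v(static_cast<UnderlyingTy>(t)) {}
      template <typename T>
      explicit operator T() const { return static_cast<T>(v); }
    };

    using int4_t = i4<std::int8_t>;
    using uint4_t = i4<std::uint8_t>;

    int main() {
      const uint4_t u{};
      // Eigen::internal::cast_impl boils down to this cast. Two user-defined
      // conversions apply: uint4_t::operator int4_t() const, and the
      // constructor int4_t::i4(uint4_t). nvcc reports the cast as ambiguous,
      // while host gcc/clang resolve it via the constructor.
      int4_t s = static_cast<int4_t>(u);
      (void)s;
      return 0;
    }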

@hmaarrfk (Contributor)

Can I ask what your workflow is for this? How do you set up your environment?

@xhochy (Member, Author) commented Nov 23, 2023

Can I ask what your workflow is for this? How do you set up your environment?

  • Run conda-build
  • Let it fail
  • cd into the work directory
  • source build_env_setup.sh
  • git init . && git add . && git commit -m "Initial commit" --no-verify --no-gpg-sign
  • Iterate with bash $RECIPE_DIR/build.sh

@hmaarrfk (Contributor)

interesting. thanks!

@xhochy (Member, Author) commented Nov 28, 2023

@conda-forge/tensorflow This is ready for review. I will clean up the patches locally and start building everything on Friday.

@h-vetinari (Member) left a comment

Thanks hugely! 👏

Can you tell us what happened with the protobuf situation? It sounds like you're adding some changes related to that, but I don't see the pinned versions changing in the .ci_support files.

Generally LGTM, though the bazel stuff is over my head as usual. It would be lovely if you could write some more context into the commit messages of the patches that are necessary. That's all from my side, except perhaps my recurring nit of generating the patches with --no-signature. ;-)

Comment on lines -31 to +40

-  - url: https://raw.githubusercontent.com/jax-ml/ml_dtypes/v0.2.0/ml_dtypes/include/float8.h
+  # yes, the headers come from a different version than the python package required below.
+  - url: https://raw.githubusercontent.com/jax-ml/ml_dtypes/v0.3.1/ml_dtypes/include/float8.h
     fn: float8.h
-    sha256: 7c3d32809adf01e1568434760bf3c347d0ef21d5fc4c5009815a5dd54635ed25
+    sha256: d2798fad4e64375b566b1df1d7bc440313e4b1024ca08f12cead3eaa4b73ff72
+  - url: https://raw.githubusercontent.com/jax-ml/ml_dtypes/v0.3.1/ml_dtypes/include/int4.h
+    fn: int4.h
+    sha256: b3a9970c3c6b169c41ac2fd4375f668d3fd1b492d48b912d89415fa1522a8f50
Member

It's still quite confusing/surprising what's happening here. I guess the hopes from 2.14 that this would be simpler in 2.15 didn't materialize? Not a blocker, just for understanding.

Member Author

They did a partial update of the ml_dtypes code, as outlined in the comment: the Python part still uses 0.2.0, while the C++ code depends on 0.3.1, which comes with the new int4 type. In master, they have aligned it again: https://github.com/tensorflow/tensorflow/blob/99926785f7c9eaf53d94343916f14f300965ae72/tensorflow/tools/pip_package/setup.py#L93

If we want to get rid of pulling them in manually, we either need to add code to the libtensorflow_cc.tar.gz generation or add support for pulling in ml_dtypes as a system dependency. Both are sadly more complicated than adding them as additional sources here.

Member

Just for my understanding, is this the sort of thing that a future TensorFlow release will have resolved?

Member Author

No idea 😢

Member Author

I have made a PR to staged-recipes to have the headers packaged separately: conda-forge/staged-recipes#24662

From 9e6d16913eedc72aad7ced7f9cdf07374c84fc8f Mon Sep 17 00:00:00 2001
From: "Uwe L. Korn" <uwe.korn@quantco.com>
Date: Sun, 19 Nov 2023 20:50:29 +0000
Subject: [PATCH] Blacklist well-known protos
Member

Can you mention what this does or why it's necessary?

I'm assuming it will not (re)generate certain protos, and that those would be incompatible if we did regenerate?

Member Author

The problem here was that this would pull in the well-known protos twice. Thus, when importing tensorflow, we got errors about already registered definitions. I found the fix in https://github.com/google/riegeli/blob/c2bcb54934acd28eace78bd4a1bf008347592cc4/third_party/protobuf.patch#L64

I will merge this patch during cleanup with the one above that adds the toolchain.
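
For context, the failure mode behind those import errors can be modeled in a few lines (a toy model, not the real protobuf API): every generated *.pb.cc registers its .proto file by name in a process-wide database during static initialization, so linking the well-known protos into the process twice aborts at startup, exactly as in the descriptor_database.cc error above.

    #include <cstdio>
    #include <cstdlib>
    #include <set>
    #include <string>

    // Toy model of protobuf's generated-descriptor registry (hypothetical
    // names; the real mechanism lives in descriptor_database.cc).
    std::set<std::string>& GeneratedDatabase() {
      static std::set<std::string> db;
      return db;
    }

    bool AddGeneratedFile(const std::string& name) {
      if (!GeneratedDatabase().insert(name).second) {
        std::fprintf(stderr, "File already exists in database: %s\n",
                     name.c_str());
        std::abort();  // protobuf raises a CHECK failure / FatalException here
      }
      return true;
    }

    // Two copies of the well-known protos linked into one process both
    // register descriptor.proto at static-initialization time:
    static bool reg1 = AddGeneratedFile("google/protobuf/descriptor.proto");
    static bool reg2 = AddGeneratedFile("google/protobuf/descriptor.proto");

    int main() { return 0; }  // never reached: the second registration aborts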

Comment on lines 41 to 43
+    patch_file = [
+        "//third_party/py/ml_dtypes:int4.patch",
+    ],
Member

Ah, patch-ception, wonderful 😅

Member Author

I should add a comment that we need this patch for nvcc to be able to compile code that imports int4.

From b1c5c65cd5b4db7e06bcdf5f4886e744b324cfb0 Mon Sep 17 00:00:00 2001
From: "Uwe L. Korn" <uwe.korn@quantco.com>
Date: Thu, 23 Nov 2023 09:05:37 +0000
Subject: [PATCH] Remove some usage of absl::str_format in CUDA
Member

Causes linker errors?

Member Author

No, nvcc doesn't understand the C++ template due to new C++ features. We can only use absl::str_format in code that isn't parsed/ingested by nvcc.

Member

Do you recall which new C++ features were used?

Member Author

It was a combination of sizeof...(args) and std::enable_if.
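
A minimal sketch of the offending shape (hypothetical names, modeled on the FormatSpecTemplate lines from the bind.h errors above); this is plain C++14 that host gcc/clang accept, but the nvcc compilation path failed to parse the equivalent lines in absl:

    #include <cstddef>
    #include <type_traits>

    template <int... C>
    struct Spec {
      template <std::size_t N, std::size_t M>
      static void CheckArity() {
        static_assert(N == M, "arity mismatch");
      }

      // enable_if over one parameter pack combined with sizeof... over
      // another, as in absl's FormatSpecTemplate constructor.
      template <typename... Args,
                typename = std::enable_if_t<sizeof...(Args) == sizeof...(C)>>
      void Bind(const Args&...) {
        // This is the line shape rejected at bind.h:171 in the log above.
        CheckArity<sizeof...(C), sizeof...(Args)>();
      }
    };

    int main() {
      Spec<1, 2> s;
      s.Bind(10, 20);  // OK on host compilers: two chars, two arguments
      return 0;
    }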

Member

Interesting, those look to be C++11 (or in some cases C++14) features. I would have thought GCC 10 (on CUDA 11.2) and GCC 11 (on CUDA 11.8) would be new enough. Maybe there is some edge case that wasn't handled until a later GCC.

@xhochy (Member, Author) commented Nov 28, 2023

Can you tell us what happened with the protobuf situation? It sounds like you're adding some changes related to that, but I don't see the pinned versions changing in the .ci_support files.

Nothing has changed. All the protobuf-related errors were down to the protobuf_toolchain and not the version itself. Once this is merged, I will start working on the unpinned build again.

except perhaps my recurring nit of generating the patches with --no-signature. ;-)

I always forget that option. If you find a way to set that as default globally in git, I would appreciate that.

@xhochy (Member, Author) commented Nov 29, 2023

I also pushed the patches as a branch to https://github.com/xhochy/tensorflow/tree/2.15.0-conda-forge-patches

(Resolved review thread on recipe/meta.yaml)
@ngam (Contributor) left a comment

Thanks 💎

@xhochy (Member, Author) commented Dec 15, 2023

There were some issues with the OSX builds, but it seems we're fine now; I have started the Linux and OSX builds for all configurations.

@xhochy (Member, Author) commented Dec 17, 2023

Builds are on my uwe.korn-tf-gpu and uwe.korn-tf-experimental channels with the following logs:

@xhochy (Member, Author) commented Dec 17, 2023

@h-vetinari @hmaarrfk Please review/copy ;)

@hmaarrfk (Contributor)

Would the goal be for one of us to do light testing? I'm mostly trying to understand a protocol that we can follow in the future too.

@xhochy (Member, Author) commented Dec 17, 2023

Testing should hopefully be covered by the tests in the feedstock; otherwise, we should extend those. I think isuruf scanned these logs to check whether they used the right OSX SDK, but that was back when build-locally.py didn't take care of that.

@hmaarrfk (Contributor)

It's just pretty hard to test hardware acceleration without guaranteed access to the right hardware.

I can scan the logs.

@hmaarrfk merged commit 2998b25 into conda-forge:main Dec 17, 2023
1 of 17 checks passed
@hmaarrfk (Contributor)

thank you hugely

@yuvipanda

(as a standby observer) - THANK YOU SO MUCH FOR WORKING ON THIS!
