[BUG FIX] Fix tensor_or_memref.h build error #59536

Merged
merged 1 commit into tensorflow:master on Feb 16, 2023

Conversation

i-chaochen
Contributor

We found the following build error in tensor_or_memref.h; this PR fixes it.

```
opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/tensor_or_memref.h:243:29: error: 'isnan' was not declared in this scope; did you mean 'std::isnan'?
  243 |         bool thisnan = isnan(at(indices));
      |                        ~~~~~^~~~~~~~~~~~~
      |                        std::isnan
In file included from /usr/include/c++/9/complex:44,
                 from bazel-out/k8-opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/interpreter_value.h:19,
                 from tensorflow/compiler/xla/mlir_hlo/tools/mlir_interpreter/framework/tests/interpreter_value_test.cc:16:
/usr/include/c++/9/cmath:632:5: note: 'std::isnan' declared here
  632 |     isnan(_Tp __x)
      |     ^~~~~

opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/interpreter_value.h:108:31:   required from here
bazel-out/k8-opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/tensor_or_memref.h:243:29: error: 'isnan' was not declared in this scope; did you mean 'std::isnan'?
  243 |         bool thisnan = isnan(at(indices));
      |                        ~~~~~^~~~~~~~~~~~~
      |                        std::isnan
In file included from /usr/include/c++/9/complex:44,
                 from bazel-out/k8-opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/interpreter_value.h:19,
                 from tensorflow/compiler/xla/mlir_hlo/tools/mlir_interpreter/framework/tests/interpreter_value_test.cc:16:
/usr/include/c++/9/cmath:632:5: note: 'std::isnan' declared here
  632 |     isnan(_Tp __x)
      |     ^~~~~
In file included from bazel-out/k8-opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/interpreter_value.h:34,
                 from tensorflow/compiler/xla/mlir_hlo/tools/mlir_interpreter/framework/tests/interpreter_value_test.cc:16:
bazel-out/k8-opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/tensor_or_memref.h:244:30: error: 'isnan' was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
  244 |         bool othernan = isnan(other.at(indices));
```
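The fix in this PR qualifies the calls as `std::isnan`, which `<cmath>` is guaranteed to declare. A minimal sketch of the failing pattern and the applied fix follows; `nan_aware_equal` is a hypothetical simplified stand-in (the real code compares templated elements via `at(indices)`), while the variable names `thisnan`/`othernan` come from the error log above:

```cpp
#include <cassert>
#include <cmath>  // declares std::isnan; an unqualified ::isnan is NOT guaranteed here

// Hypothetical, simplified stand-in for the NaN-aware element comparison in
// tensor_or_memref.h.
bool nan_aware_equal(double a, double b) {
  bool thisnan = std::isnan(a);   // qualified call: the fix applied in this PR
  bool othernan = std::isnan(b);
  if (thisnan || othernan) return thisnan && othernan;  // treat NaN == NaN
  return a == b;
}
```

With only `<cmath>` included, the unqualified `isnan(at(indices))` in the original code may fail to resolve, which is exactly the error above.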

@google-ml-butler google-ml-butler bot added the size:XS CL Change Size: Extra Small label Feb 3, 2023
@google-ml-butler google-ml-butler bot requested a review from r4nt February 3, 2023 11:41
@google-ml-butler google-ml-butler bot added the awaiting review Pull request awaiting review label Feb 3, 2023
@i-chaochen i-chaochen changed the title fix tensor_or_memref.h build error [BUG FIX] Fix tensor_or_memref.h build error Feb 3, 2023
@gbaned gbaned added the comp:xla XLA label Feb 15, 2023
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Feb 15, 2023
@gbaned gbaned requested a review from tpopp February 15, 2023 16:53
PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Feb 16, 2023
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Feb 16, 2023
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Feb 16, 2023
copybara-service bot pushed a commit to tensorflow/mlir-hlo that referenced this pull request Feb 16, 2023
Imported from GitHub PR tensorflow/tensorflow#59536

We found the following build error in tensor_or_memref.h; this PR fixes it.

```
opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/tensor_or_memref.h:243:29: error: 'isnan' was not declared in this scope; did you mean 'std::isnan'?
  243 |         bool thisnan = isnan(at(indices));
      |                        ~~~~~^~~~~~~~~~~~~
      |                        std::isnan
In file included from /usr/include/c++/9/complex:44,
                 from bazel-out/k8-opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/interpreter_value.h:19,
                 from tensorflow/compiler/xla/mlir_hlo/tools/mlir_interpreter/framework/tests/interpreter_value_test.cc:16:
/usr/include/c++/9/cmath:632:5: note: 'std::isnan' declared here
  632 |     isnan(_Tp __x)
      |     ^~~~~

opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/interpreter_value.h:108:31:   required from here
bazel-out/k8-opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/tensor_or_memref.h:243:29: error: 'isnan' was not declared in this scope; did you mean 'std::isnan'?
  243 |         bool thisnan = isnan(at(indices));
      |                        ~~~~~^~~~~~~~~~~~~
      |                        std::isnan
In file included from /usr/include/c++/9/complex:44,
                 from bazel-out/k8-opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/interpreter_value.h:19,
                 from tensorflow/compiler/xla/mlir_hlo/tools/mlir_interpreter/framework/tests/interpreter_value_test.cc:16:
/usr/include/c++/9/cmath:632:5: note: 'std::isnan' declared here
  632 |     isnan(_Tp __x)
      |     ^~~~~
In file included from bazel-out/k8-opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/interpreter_value.h:34,
                 from tensorflow/compiler/xla/mlir_hlo/tools/mlir_interpreter/framework/tests/interpreter_value_test.cc:16:
bazel-out/k8-opt/bin/tensorflow/compiler/xla/mlir_hlo/_virtual_includes/mlir_interpreter_framework/tools/mlir_interpreter/framework/tensor_or_memref.h:244:30: error: 'isnan' was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
  244 |         bool othernan = isnan(other.at(indices));

```
Copybara import of the project:

--
a7322dea05308067d3841ed6bdaab95500905ddf by Chao Chen <cchen104@amd.com>:

fix tensor_or_memref.h build  error

Merging this change closes #59536

PiperOrigin-RevId: 510173693
copybara-service bot pushed a commit to openxla/xla that referenced this pull request Feb 16, 2023
@copybara-service copybara-service bot merged commit 49acb93 into tensorflow:master Feb 16, 2023
PR Queue automation moved this from Approved by Reviewer to Merged Feb 16, 2023
copybara-service bot pushed a commit to tensorflow/mlir-hlo that referenced this pull request Feb 18, 2023
tensorflow/tensorflow#59536 broke building on Mac. The error is:

    error: no member named 'isnan' in namespace 'std'; did you mean simply 'isnan'?

PiperOrigin-RevId: 510557348
copybara-service bot pushed a commit to openxla/xla that referenced this pull request Feb 18, 2023
copybara-service bot pushed a commit that referenced this pull request Feb 18, 2023
Fix the Mac build by rolling back #59536

#59536 broke building on Mac. The error is:

    error: no member named 'isnan' in namespace 'std'; did you mean simply 'isnan'?

PiperOrigin-RevId: 510557348
@reedwm
Member

reedwm commented Feb 21, 2023

Unfortunately, I rolled back this PR in b3cc517 because it broke the Mac build with the error

error: no member named 'isnan' in namespace 'std'; did you mean simply 'isnan'?

According to this StackOverflow question, including math.h will put isnan in the global namespace, and optionally may also put it in the std namespace. Since only the Mac build was broken, presumably this means on Mac it was only put in the global namespace but on Windows/Linux, it was also put in the std namespace.

I think just including math.h without adding the std:: prefix to isnan will fix the issue. Can you create a new PR that just includes math.h?
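The suggested approach can be sketched as follows: including `<math.h>` guarantees `isnan` in the global namespace on every platform (it is only optionally also in `std`), so an unqualified call resolves on Mac, Linux, and Windows alike. The helper `either_nan` below is hypothetical, for illustration only:

```cpp
#include <cassert>
#include <math.h>  // C header: guarantees ::isnan in the global namespace on
                   // all platforms (std::isnan is only optional here)

// Hypothetical helper showing the portable, unqualified call.
bool either_nan(double a, double b) {
  return isnan(a) || isnan(b);  // resolves to ::isnan from <math.h>
}
```

This avoids both failure modes seen in this thread: the unqualified call that broke with `<cmath>`-only headers on Linux, and the `std::`-qualified call that broke on Mac.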

@i-chaochen
Contributor Author

i-chaochen commented Feb 22, 2023

> Unfortunately, I rolled back this PR in b3cc517 because it broke the Mac build with the error
>
> error: no member named 'isnan' in namespace 'std'; did you mean simply 'isnan'?
>
> According to this StackOverflow question, including math.h will put isnan in the global namespace, and optionally may also put it in the std namespace. Since only the Mac build was broken, presumably this means on Mac it was only put in the global namespace but on Windows/Linux, it was also put in the std namespace.
>
> I think just including math.h without adding the std:: prefix to isnan will fix the issue. Can you create a new PR that just includes math.h?

Thanks a lot for this reference and explanation!

I have created a new PR based on your suggestion, and it resolves the build error on my side.

#59775

copybara-service bot pushed a commit to tensorflow/mlir-hlo that referenced this pull request Feb 23, 2023
Imported from GitHub PR tensorflow/tensorflow#59775

tensorflow/tensorflow#59536 (comment)

Since the previous PR has been rolled back, this one adopts the reviewer's feedback to fix the Mac build error.

Thanks @reedwm for the feedback and suggestion!
Copybara import of the project:

--
4f9f856e0159f5dd7c4e8d0e3b5232e55795f700 by Chao Chen <cchen104@amd.com>:

fix tensor_or_memref build error without math.h

Merging this change closes #59775

PiperOrigin-RevId: 511716904
copybara-service bot pushed a commit to openxla/xla that referenced this pull request Feb 23, 2023
MichaelHudgins pushed a commit that referenced this pull request Feb 23, 2023
511837617  by A. Unique TensorFlower<gardener@tensorflow.org>:
    Automated rollback of changelist 509256232.

511836298  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [XLA:CPU] Outline fusion regions in the presence of implicit constant-like operands

--
511832706  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Remove decorator usage for parallel devices as a first step in removing `control_flow_ops.py`'s dependency on `def_function.py` to eliminate circular dependencies.

--
511832154  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [StableHLO to MHLO] Handle bounds of Gather op

    Based on:
    openxla/stablehlo#908

--
511832050  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Specify `save_type='checkpoint'` when calling `trackable_children`.

--
511830259  by A. Unique TensorFlower<gardener@tensorflow.org>:

      Uses the same clustering algorithm in the TF2XLA bridge for TAC passes. They have better results than the current clustering in the number of clusters.

--
511830057  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Add an option to strip location information from ops.

    These end up becoming huge during large fusions (like TpuRewritePass) and in practice don't help much with readability, as locations for TF are typically indicated by the op name.

--
511826780  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [XLA] Add all-gather-start/done multi-operand shape inference tests

--
511815596  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Describe potential problem in case if the LLVM toolchain is used.

--
511810373  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update TFRT dependency to use revision
    http://github.com/tensorflow/runtime/commit/1ed8df8df17f431936090acde5122456c5eed394.

--
511801672  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Plumb CollectiveAllToAllV2 to MLIR.

    Also fix a inconsistency regarding CollectiveGatherV2.

    Follow up of #59598

--
511799604  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Integrate LLVM at llvm/llvm-project@219ba2fb7b0a

    Updates LLVM usage to match
    [219ba2fb7b0a](llvm/llvm-project@219ba2fb7b0a)

--
511797844  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update TFRT dependency to use revision
    http://github.com/tensorflow/runtime/commit/9b50571c5b76f32103ad5099b0993581af3d0592.

--
511793598  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update type-checks of `DatasetV1` and `DatasetV2` to use abstract types.

    Changes usages of the internal `DatasetV1` and `DatasetV2` types
    to use the `tensorflow.types.data` versions instead of the concrete
    implementations.

    This helps reduce the tendency for cyclic dependencies involving the
    `dataset_ops.py` module.

    Usages of the concrete type (e.g. instantiation, member access) are not
    affected by this change.

--
511765379  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [GmlSt] Remove 'distribute' attribute from ParallelOp tiling params.

--
511760944  by A. Unique TensorFlower<gardener@tensorflow.org>:
    Automated rollback of changelist 510948939.

511756866  by A. Unique TensorFlower<gardener@tensorflow.org>:
    Automated rollback of changelist 482514900.

511749935  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [XLA:CPU Next] Disable tiling of linalg.generic.

--
511746972  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [GmlSt] Do not tile linalg.generic if it was tiled already.

    The "tiled labels" are not populated correctly. Theoretically, we shouldn't
    check for the parent.

--
511745314  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [XLA:CPU Next] Fix scalarization of scf.for.

    If the type of the output is not specified, then FromElementsOp creates a 1D tensor

    void FromElementsOp::build(OpBuilder &builder, OperationState &result,
                               ValueRange elements) {
      assert(!elements.empty() && "expected at least one element");
      Type resultType = RankedTensorType::get(
          {static_cast<int64_t>(elements.size())}, elements.front().getType());
      build(builder, result, resultType, elements);
    }

    The test that i had before in scalarization.mlir was 1D by coincidence and therefore worked.

--
511741158  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Fill calculates the output tensor if all the required information is available during Prepare.

    This means the output will be available for subsequent operators' Prepare, and Eval will be free.

--
511737032  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [GmlSt] Remove gml_st.materialize.

--
511736776  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [XLA:CPU Next] Disable scf.if vectorization.

--
511734615  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [DelegatePerformance] Removed a strengthened precondition and changed the metric value type from float to double.

    The change removes the strengthened precondition from the inherited method computeModelReport() in the derived class ModelBenchmarkReport. It also changes the metric value type from float to double to avoid potential calculation errors.

--
511730756  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [XLA:CPU Next] Add a pattern to tile linalg.generic to 1 and fuse greedily.

--
511726589  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [GmlSt] Remove useless peeling label.

--
511719057  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Transforming grouped convolution to depth wise when possible.

--
511716904  by A. Unique TensorFlower<gardener@tensorflow.org>:

    PR #59775: Fix tensor_or_memref build error without math.h

    Imported from GitHub PR #59775

    #59536 (comment)

    Since the previous PR has been rolled back, this one adopts the reviewer's feedback to fix the Mac build error.

    Thanks @reedwm for the feedback and suggestion!
    Copybara import of the project:

    --
    4f9f856 by Chao Chen <cchen104@amd.com>:

    fix tensor_or_memref build error without math.h

    Merging this change closes #59775

--
511715187  by A. Unique TensorFlower<gardener@tensorflow.org>:

    compat: Update forward compatibility horizon to 2023-02-23

--
511715185  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update GraphDef version to 1416.

--
511712677  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [XLA] Further speedup HloModule::Print by using integer key for CanonicalNameMap.

--
511712612  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Go: Update generated wrapper functions for TensorFlow ops.

--
511708539  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update ops-related pbtxt files.

--
511706378  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Add SegmentMaxV2op with num_segments as additional input.

    The only difference with SegmentMax is the additional input  `num_segment`.
    This helps in evaluating the output shape in compile time.

--
511703897  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Go: Update generated wrapper functions for TensorFlow ops.

--
511703121  by A. Unique TensorFlower<gardener@tensorflow.org>:

    PR #59616: ReLU Epilogue Fusion for FP8 GEMMs in XLA

    Imported from GitHub PR #59616

    Enables the epilogue fusion of ReLU activations for FP8 GEMMs.
    Copybara import of the project:

    --
    8813bb2 by Philipp Hack <phack@nvidia.com>:

    Epilogue fusion of ReLU activations for FP8 GEMMs.

    --
    c9ee1d5 by Philipp Hack <phack@nvidia.com>:

    Epilogue fusion of ReLU activations for FP8 GEMMs.

    Merging this change closes #59616

--
511693497  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Add SegmentMinV2op with num_segments as additional input.

    The only difference with SegmentMin is the additional input  `num_segment`.
    This helps in evaluating the output shape in compile time.

--
511692851  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Go: Update generated wrapper functions for TensorFlow ops.

--
511690064  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [jax2tf] Use CUDA and ROCM instead of GPU for XlaCallModuleOp platforms

    JAX is moving to using ROCM and CUDA instead of the generic GPU platform
    type and it is already supporting separate lowerings for ROCM and CUDA.
    To keep up with this functionality, we move the XlaCallModuleOp to supporting
    ROCM and CUDA platforms.

--
511687577  by A. Unique TensorFlower<gardener@tensorflow.org>:

    VLOG(1) upon fallback in phase 2. Whether the new or old bridge ran is usually the first thing to find out when debugging. So when fallback from new bridge to old bridge happens, it should be reported with log level 1.

--
511684159  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Cast BF16 Depthwise conv2D ops to f32 ops.

    This change is to cast BF16 Depthwise Conv2d ops to f32 to make it ready for quantization. But, as the Depthwise conv2D quantization is disabled due to performance improvement issue for now, this change does not guarantee the BF16 Depthwise Conv2D quantization.

--
511677183  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Added a util function to process einsum

--
511676949  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data Retry empty repetitions when repeating data service dataset.

    The ForeverRepeat op assumes if the first repetition produces no data,
    all future repetitions will produce no data. That is not always true.
    For example, when using tf.data service, different repetitions may
    produce different numbers of elements, and empty repetitions should be
    retried.

--
511657402  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [XLA] More speedups to HloModule::Print.

--
511647352  by A. Unique TensorFlower<gardener@tensorflow.org>:
    Automated rollback of changelist 511611390.

511644926  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Refactor TfrtPipelineOptions to a separate file so that it can be reused.

--
511642132  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update TFRT dependency to use revision
    http://github.com/tensorflow/runtime/commit/410c7f3b07f1d1170b13242a2cf9af4ec7edd33f.

--
511640480  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update `DimsAre` matcher to be polymorphic over both `TfLiteTensor` and `TfLiteIntArrays`.

--
511639990  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Fix Relayout handling under XLA SPMD.

    According to the plans by samuelslee@, now that chensunx@ contributed
    the pass to lower Relayout to Identity.

--
511639775  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Integrate LLVM at llvm/llvm-project@a7b6978285c1

    Updates LLVM usage to match
    [a7b6978285c1](llvm/llvm-project@a7b6978285c1)

--
511637257  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Include full node_def info in activity watcher.

--
511637139  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update TFRT dependency to use revision
    http://github.com/tensorflow/runtime/commit/bd588322b3d6660903ee0df9b55d2f589aa5dc8a.

--
511632436  by A. Unique TensorFlower<gardener@tensorflow.org>:
    Automated rollback of changelist 511597567.

511611390  by A. Unique TensorFlower<gardener@tensorflow.org>:

    PR #57956: Performance Enhancements for Sparse Embedding Lookups

    Imported from GitHub PR #57956

    Introduces performance options for sparse embedding lookups that can appreciably speed up the training of recommendation systems. Sparse lookups alternatively accept inputs described by RaggedTensors which are more memory efficient. Performance is further increased by the optional use of a simplified and typically faster embedding lookup.

    In the sparse embedding micro benchmarks in tensorflow/python/eager/benchmarks_test.py, the number of examples per second on a DGX A100 system increases from approx. 1,300 with SparseTensor and without simplified lookup to approx. 11,200 with RaggedTensor inputs and simplified lookup (+760%). The combination of SparseTensor inputs and simplified lookup yields approx. 3,000 examples per second (+130%).
    Copybara import of the project:

    --
    00ee1a9 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    fe6eb18 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    47157c3 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    8826a9b by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    4e77072 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    3d1096a by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    c115b80 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    842591c by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    2a307f3 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    ee90704 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    9e216e3 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    c393302 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    5165f07 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    f3e88c4 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    8384791 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    497f9c3 by Philipp Hack <phack@nvidia.com>:

    Adds performance enhancements for sparse embedding lookups.

    --
    2166e5d by Philipp Hack <phack@nvidia.com>:

    Performance enhancements for sparse embedding lookups.

    --
    13dd2c8 by Philipp Hack <phack@nvidia.com>:

    Performance enhancements for sparse embedding lookups.

    --
    a83f4d4 by Philipp Hack <phack@nvidia.com>:

    Performance enhancements for sparse embedding lookups.

    Merging this change closes #57956

--
511609283  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data Ramp down `autotune_buffer_optimization` experiment.

--
511606842  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Avoid double counting graph building time for nested functions.

--
511606056  by A. Unique TensorFlower<gardener@tensorflow.org>:

    PR #59619: [NVIDIA TF] Throw error only for non-empty function definitions.

    Imported from GitHub PR #59619

    During Function Serialization and Deserialization with Saved Models there can be Node ops with type `func` with default values that have no associated name. In such cases, we shouldn't look for undefined empty strings in Function Library. This is a trivial change and helps with how Saved Models are loaded.
    Copybara import of the project:

    --
    e0fe60f by Pavani Majety <pmajety@nvidia.com>:

    [BugFix] Throw error only for non-empty function definitions.

    Add reason for the required change.

    Add comment to both locations.

    Merging this change closes #59619

--
511605660  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Fix typo in comment

--
511604096  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service Put distributed_save_test and snapshot_ft_test together.

--
511603520  by A. Unique TensorFlower<gardener@tensorflow.org>:

    update tracking bug now that the step id is propagated correctly.

--
511602286  by A. Unique TensorFlower<gardener@tensorflow.org>:

    add a size() member function to HloProtoMap.

--
511599929  by A. Unique TensorFlower<gardener@tensorflow.org>:

    gpu_delegate: Update to support Mali G715

--
511597567  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [tf-lite] Enable parallel transpose.

--
511596662  by A. Unique TensorFlower<gardener@tensorflow.org>:

    add aggregated stats for the per-step result.
    the goal is to remove all_reduce_db_per_core.
    we will lose the ability to separate compute time and synchronization time for TPU only,
    but I think that is fine: most collectives on TPU are now async ops, so this algorithm is no longer very informative. we will count all async collectives as synchronization time for TPU.
    for GPU, this doesn't apply.

--
511590747  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [PJRT C API] Add a README file to provide communication channel and resources.

--
511585380  by A. Unique TensorFlower<gardener@tensorflow.org>:
    Automated rollback of changelist 511565925.

511578143  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [XLA] Minor fixes in ShardingPropagation.

    - Moves the misplaced comment block of replicate_on_last_tile_dim_.
    - Uses existing variable root_instr to eliminate redundant accesses.
    - Replaces bitwise and with logical and.
    - Uses reverse iterator instead of reversing the list.

--
511569312  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Remove `type_spec` direct dependency on  `framework/ops.py`.

    Changes `type_spec` references to `tf.Tensor` to use an abstract base type
    for `isinstance` checks.

    Uses the conversion function in `tensor_conversion_registry` instead
    of its wrapper in `ops`. While `tensor_conversion_registry` has an indirect
    dependency on `ops.py`, there is future work planned to remove that dependency.

--
511568795  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Check the shape of indices matches the shape of dense_shape in sparse_fill_empty_rows op

--
511567163  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [xla-next][mlir][sparse] add mhlo sparsity rewriting to pipeline

--
511565925  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update visibility to fix OSS build

--

PiperOrigin-RevId: 511837617