Skip to content

Conversation

@dependabot
Copy link

@dependabot dependabot bot commented on behalf of github Jun 8, 2021

Bumps tensorflow from 2.4.1 to 2.5.0.

Release notes

Sourced from tensorflow's releases.

TensorFlow 2.5.0

Release 2.5.0

Major Features and Improvements

  • Support for Python3.9 has been added.
  • tf.data:
    • tf.data service now supports strict round-robin reads, which is useful for synchronous training workloads where example sizes vary. With strict round robin reads, users can guarantee that consumers get similar-sized examples in the same step.
    • tf.data service now supports optional compression. Previously data would always be compressed, but now you can disable compression by passing compression=None to tf.data.experimental.service.distribute(...).
    • tf.data.Dataset.batch() now supports num_parallel_calls and deterministic arguments. num_parallel_calls is used to indicate that multiple input batches should be computed in parallel. With num_parallel_calls set, deterministic is used to indicate that outputs can be obtained in the non-deterministic order.
    • Options returned by tf.data.Dataset.options() are no longer mutable.
    • tf.data input pipelines can now be executed in debug mode, which disables any asynchrony, parallelism, or non-determinism and forces Python execution (as opposed to trace-compiled graph execution) of user-defined functions passed into transformations such as map. The debug mode can be enabled through tf.data.experimental.enable_debug_mode().
  • tf.lite
    • Enabled the new MLIR-based quantization backend by default
      • The new backend is used for 8 bits full integer post-training quantization
      • The new backend removes the redundant rescales and fixes some bugs (shared weight/bias, extremely small scales, etc)
      • Set experimental_new_quantizer in tf.lite.TFLiteConverter to False to disable this change
  • tf.keras
    • tf.keras.metrics.AUC now support logit predictions.
    • Enabled a new supported input type in Model.fit, tf.keras.utils.experimental.DatasetCreator, which takes a callable, dataset_fn. DatasetCreator is intended to work across all tf.distribute strategies, and is the only input type supported for Parameter Server strategy.
  • tf.distribute
    • tf.distribute.experimental.ParameterServerStrategy now supports training with Keras Model.fit when used with DatasetCreator.
    • Creating tf.random.Generator under tf.distribute.Strategy scopes is now allowed (except for tf.distribute.experimental.CentralStorageStrategy and tf.distribute.experimental.ParameterServerStrategy). Different replicas will get different random-number streams.
  • TPU embedding support
    • Added profile_data_directory to EmbeddingConfigSpec in _tpu_estimator_embedding.py. This allows embedding lookup statistics gathered at runtime to be used in embedding layer partitioning decisions.
  • PluggableDevice
  • oneAPI Deep Neural Network Library (oneDNN) CPU performance optimizations from Intel-optimized TensorFlow are now available in the official x86-64 Linux and Windows builds.
    • They are off by default. Enable them by setting the environment variable TF_ENABLE_ONEDNN_OPTS=1.
    • We do not recommend using them in GPU systems, as they have not been sufficiently tested with GPUs yet.
  • TensorFlow pip packages are now built with CUDA11.2 and cuDNN 8.1.0

Breaking Changes

  • The TF_CPP_MIN_VLOG_LEVEL environment variable has been renamed to to TF_CPP_MAX_VLOG_LEVEL which correctly describes its effect.

Bug Fixes and Other Changes

  • tf.keras:
    • Preprocessing layers API consistency changes:
      • StringLookup added output_mode, sparse, and pad_to_max_tokens arguments with same semantics as TextVectorization.
      • IntegerLookup added output_mode, sparse, and pad_to_max_tokens arguments with same semantics as TextVectorization. Renamed max_values, oov_value and mask_value to max_tokens, oov_token and mask_token to align with StringLookup and TextVectorization.
      • TextVectorization default for pad_to_max_tokens switched to False.
      • CategoryEncoding no longer supports adapt, IntegerLookup now supports equivalent functionality. max_tokens argument renamed to num_tokens.
      • Discretization added num_bins argument for learning bins boundaries through calling adapt on a dataset. Renamed bins argument to bin_boundaries for specifying bins without adapt.
    • Improvements to model saving/loading:
      • model.load_weights now accepts paths to saved models.

... (truncated)

Changelog

Sourced from tensorflow's changelog.

Release 2.5.0

Breaking Changes

  • The TF_CPP_MIN_VLOG_LEVEL environment variable has been renamed to to TF_CPP_MAX_VLOG_LEVEL which correctly describes its effect.

Known Caveats

Major Features and Improvements

  • TPU embedding support

    • Added profile_data_directory to EmbeddingConfigSpec in _tpu_estimator_embedding.py. This allows embedding lookup statistics gathered at runtime to be used in embedding layer partitioning decisions.
  • tf.keras.metrics.AUC now support logit predictions.

  • Creating tf.random.Generator under tf.distribute.Strategy scopes is now allowed (except for tf.distribute.experimental.CentralStorageStrategy and tf.distribute.experimental.ParameterServerStrategy). Different replicas will get different random-number streams.

  • tf.data:

    • tf.data service now supports strict round-robin reads, which is useful for synchronous training workloads where example sizes vary. With strict round robin reads, users can guarantee that consumers get similar-sized examples in the same step.
    • tf.data service now supports optional compression. Previously data would always be compressed, but now you can disable compression by passing compression=None to tf.data.experimental.service.distribute(...).
    • tf.data.Dataset.batch() now supports num_parallel_calls and deterministic arguments. num_parallel_calls is used to indicate that multiple input batches should be computed in parallel. With num_parallel_calls set, deterministic is used to indicate that outputs can be obtained in the non-deterministic order.
    • Options returned by tf.data.Dataset.options() are no longer mutable.
    • tf.data input pipelines can now be executed in debug mode, which disables any asynchrony, parallelism, or non-determinism and forces Python execution (as opposed to trace-compiled graph execution) of user-defined functions passed into transformations such as map. The debug mode can be enabled through tf.data.experimental.enable_debug_mode().
  • tf.lite

    • Enabled the new MLIR-based quantization backend by default
      • The new backend is used for 8 bits full integer post-training quantization
      • The new backend removes the redundant rescales and fixes some bugs (shared weight/bias, extremely small scales, etc)

... (truncated)

Commits
  • a4dfb8d Merge pull request #49124 from tensorflow/mm-cherrypick-tf-data-segfault-fix-...
  • 2107b1d Merge pull request #49116 from tensorflow-jenkins/version-numbers-2.5.0-17609
  • 16b8139 Update snapshot_dataset_op.cc
  • 86a0d86 Merge pull request #49126 from geetachavan1/cherrypicks_X9ZNY
  • 9436ae6 Merge pull request #49128 from geetachavan1/cherrypicks_D73J5
  • 6b2bf99 Validate that a and b are proper sparse tensors
  • c03ad1a Ensure validation sticks in banded_triangular_solve_op
  • 12a6ead Merge pull request #49120 from geetachavan1/cherrypicks_KJ5M9
  • b67f5b8 Merge pull request #49118 from geetachavan1/cherrypicks_BIDTR
  • a13c0ad [tf.data][cherrypick] Fix snapshot segfault when using repeat and prefecth
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 2.4.1 to 2.5.0.
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](tensorflow/tensorflow@v2.4.1...v2.5.0)

---
updated-dependencies:
- dependency-name: tensorflow
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot force-pushed the dependabot/pip/python/requirements/rllib/tensorflow-2.5.0 branch from 64009d2 to 00eb3fa Compare June 8, 2021 00:26
@dependabot dependabot bot added the dependencies Pull requests that update a dependency file label Jun 8, 2021
@dependabot @github
Copy link
Author

dependabot bot commented on behalf of github Aug 14, 2021

Superseded by #18.

@dependabot dependabot bot closed this Aug 14, 2021
@dependabot dependabot bot deleted the dependabot/pip/python/requirements/rllib/tensorflow-2.5.0 branch August 14, 2021 07:08
matthewdeng pushed a commit that referenced this pull request Jul 28, 2022
We encountered SIGSEGV when running Python test `python/ray/tests/test_failure_2.py::test_list_named_actors_timeout`. The stack is:

```
#0  0x00007fffed30f393 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) ()
   from /lib64/libstdc++.so.6
#1  0x00007fffee707649 in ray::RayLog::GetLoggerName() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#2  0x00007fffee70aa90 in ray::SpdLogMessage::Flush() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#3  0x00007fffee70af28 in ray::RayLog::~RayLog() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#4  0x00007fffee2b570d in ray::asio::testing::(anonymous namespace)::DelayManager::Init() [clone .constprop.0] ()
   from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#5  0x00007fffedd0d95a in _GLOBAL__sub_I_asio_chaos.cc () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#6  0x00007ffff7fe282a in call_init.part () from /lib64/ld-linux-x86-64.so.2
#7  0x00007ffff7fe2931 in _dl_init () from /lib64/ld-linux-x86-64.so.2
#8  0x00007ffff7fe674c in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#9  0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#10 0x00007ffff7fe5ffe in _dl_open () from /lib64/ld-linux-x86-64.so.2
#11 0x00007ffff7d5f39c in dlopen_doit () from /lib64/libdl.so.2
#12 0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#13 0x00007ffff7b82f13 in _dl_catch_error () from /lib64/libc.so.6
#14 0x00007ffff7d5fb09 in _dlerror_run () from /lib64/libdl.so.2
#15 0x00007ffff7d5f42a in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#16 0x00007fffef04d330 in py_dl_open (self=<optimized out>, args=<optimized out>)
    at /tmp/python-build.20220507135524.257789/Python-3.7.11/Modules/_ctypes/callproc.c:1369
```

The root cause is that when loading `_raylet.so`, `static DelayManager _delay_manager` is initialized and `RAY_LOG(ERROR) << "RAY_testing_asio_delay_us is set to " << delay_env;` is executed. However, the static variables declared in `logging.cc` are not initialized yet (in this case, `std::string RayLog::logger_name_ = "ray_log_sink"`).

It's better not to rely on the initialization order of static variables in different compilation units because it's not guaranteed. I propose to change all `RAY_LOG`s to `std::cerr` in `DelayManager::Init()`.

The crash happens in Ant's internal codebase. Not sure why this test case passes in the community version though.

BTW, I've tried different approaches:

1. Using a static local variable in `get_delay_us` and remove the global variable. This doesn't work because `init()` needs to access the variable as well.
2. Defining the global variable as type `std::unique_ptr<DelayManager>` and initialize it in `get_delay_us`. This works but it requires a lock to be thread-safe.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dependencies Pull requests that update a dependency file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant