@tensorflow-jenkins released this Jul 27, 2020 · 54 commits to r2.3 since this release

Release 2.3.0

Major Features and Improvements

  • tf.data adds two new mechanisms to solve input pipeline bottlenecks and save resources:
    • snapshot
    • tf.data service.

In addition, check out the detailed guide for analyzing input pipeline performance with TF Profiler.

  • tf.distribute.TPUStrategy is now a stable API and no longer considered experimental (previously tf.distribute.experimental.TPUStrategy).

  • TF Profiler introduces two new tools: a memory profiler to visualize your model’s memory usage over time, and a Python tracer that lets you trace Python function calls in your model. Usability improvements include better diagnostic messages and profile options to customize the host and device trace verbosity level.

  • Introduces experimental support for the Keras Preprocessing Layers API (tf.keras.layers.experimental.preprocessing.*) to handle data preprocessing operations, with support for composite tensor inputs. Please see below for additional details on these layers.

  • TFLite now properly supports dynamic shapes during conversion and inference. We’ve also added opt-in support on Android and iOS for XNNPACK, a highly optimized set of CPU kernels, as well as opt-in support for executing quantized models on the GPU.

  • Libtensorflow packages are available in GCS starting with this release. We have also started to release a nightly version of these packages.

  • The experimental Python API tf.debugging.experimental.enable_dump_debug_info() now allows you to instrument a TensorFlow program and dump debugging information to a directory on the file system. The directory can be read and visualized by a new interactive dashboard in TensorBoard 2.3 called Debugger V2, which reveals the details of the TensorFlow program, including graph structures, the history of op executions at the Python (eager) and intra-graph levels, and the runtime dtype, shape, and numerical composition of tensors, as well as their code locations.
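For reference, a minimal sketch of instrumenting a program for Debugger V2 (the dump directory path is illustrative):

```python
import tensorflow as tf

# Dump debugging information (graph structures, op execution history,
# tensor dtypes/shapes/health) to a directory readable by TensorBoard.
tf.debugging.experimental.enable_dump_debug_info(
    "/tmp/tfdbg2_logdir",             # illustrative dump directory
    tensor_debug_mode="FULL_HEALTH",  # record the numerical health of tensors
    circular_buffer_size=-1)          # -1 disables the circular buffer

# ... build and run the TensorFlow program as usual, then inspect with:
#   tensorboard --logdir /tmp/tfdbg2_logdir
```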

Breaking Changes

  • Increases the minimum Bazel version required to build TF to 3.1.0.
  • tf.data
    • Makes the following (breaking) changes to the tf.data C++ API:
    • IteratorBase::RestoreInternal, IteratorBase::SaveInternal, and DatasetBase::CheckExternalState become pure-virtual, and subclasses are now expected to provide an implementation.
    • The deprecated DatasetBase::IsStateful method is removed in favor of DatasetBase::CheckExternalState.
    • Deprecated overrides of DatasetBase::MakeIterator and MakeIteratorFromInputElement are removed.
    • The signatures of tensorflow::data::IteratorBase::SaveInternal and tensorflow::data::IteratorBase::SaveInput have been extended with a SerializationContext argument to enable overriding the default policy for handling external state during iterator checkpointing. This is not a backwards-compatible change, and all subclasses of IteratorBase need to be updated accordingly.
  • tf.keras
    • Add a new BackupAndRestore callback for handling distributed training failures and restarts (a minimal usage sketch follows this list). Please take a look at this tutorial for details on how to use the callback.
  • tf.image.extract_glimpse has been updated to correctly process the case
    where centered=False and normalized=False. This is a breaking change, as
    the output differs from (incorrect) previous versions. Note that this
    breaking change only impacts the tf.image.extract_glimpse and
    tf.compat.v2.image.extract_glimpse API endpoints. The behavior of
    tf.compat.v1.image.extract_glimpse does not change. The behavior of the
    existing C++ kernel ExtractGlimpse does not change either, so saved
    models using tf.raw_ops.ExtractGlimpse will not be impacted.
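As referenced in the tf.keras item above, a minimal sketch of the new BackupAndRestore callback; the strategy, model, data, and backup directory are all illustrative:

```python
import tensorflow as tf

strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    model.compile(optimizer="adam", loss="mse")

# Checkpoints training state; if a worker fails and restarts, fit() resumes
# from the last completed epoch instead of starting over.
backup = tf.keras.callbacks.experimental.BackupAndRestore(backup_dir="/tmp/backup")
model.fit(tf.random.normal((32, 4)), tf.random.normal((32, 1)),
          epochs=5, callbacks=[backup])
```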

Known Caveats

  • tf.lite
    • Keras-based LSTM models must be converted with an explicit batch size in the input layer.
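A minimal sketch of working around this caveat, assuming an illustrative sequence shape: fix the batch size in the input layer before conversion.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28), batch_size=1),  # explicit batch size
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(10),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
```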

Bug Fixes and Other Changes

TF Core:

  • Set tf2_behavior to 1 to enable V2 for early loading cases.
  • Add execute_fn_for_device function to dynamically choose the implementation based on underlying device placement.
  • Eager:
    • Add a reduce_logsumexp benchmark with experimental compile.
    • Give EagerTensors a meaningful __array__ implementation.
    • Add another version of defun matmul for performance analysis.
  • tf.function/AutoGraph:
    • AutoGraph now includes into TensorFlow loops any variables that are closed over by local functions. Previously, such variables were sometimes incorrectly ignored.
    • Functions returned by the get_concrete_function method of tf.function objects can now be called with arguments consistent with the original arguments or type specs passed to get_concrete_function. This calling convention is now the preferred way to use concrete functions with nested values and composite tensors. Please check the guide for more details on concrete functions.
    • Update tf.function's experimental_relax_shapes to handle composite tensors appropriately.
    • Optimize tf.function invocation by removing a redundant list converter.
    • tf.function will now retrace when called with a different variable, instead of relying solely on the variable's dtype & shape.
    • Improve support for dynamically-sized TensorArray inside tf.function.
  • tf.math:
    • Narrow down argmin/argmax contract to always return the smallest index for ties.
    • tf.math.reduce_variance and tf.math.reduce_std return correct computation for complex types and no longer support integer types.
    • Add Bessel functions of orders 0 and 1 to tf.math.special.
    • tf.divide now always returns a tensor to be consistent with documentation and other APIs.
  • tf.image:
    • Replaced tf.image.non_max_suppression_padded with a new implementation that supports batched inputs, which is considerably faster on TPUs and GPUs. Boxes with area=0 will be ignored. Existing usage with single inputs should still work as before.
  • tf.linalg
    • Add tf.linalg.banded_triangular_solve.
  • tf.random:
    • Add tf.random.stateless_parameterized_truncated_normal.
  • tf.ragged:
    • Add tf.ragged.cross and tf.ragged.cross_hashed operations (see the sketch after this list).
  • tf.RaggedTensor:
    • RaggedTensor.to_tensor() now preserves static shape.
    • Add tf.strings.format() and tf.print() to support RaggedTensors.
  • tf.saved_model:
    • @tf.function from SavedModel no longer ignores args after a RaggedTensor when selecting the concrete function to run.
    • Fix a SavedModel issue for ops with a list of functions.
    • Add tf.saved_model.LoadOptions with an experimental_io_device argument (default None) to choose the I/O device for loading models and weights.
    • Update tf.saved_model.SaveOptions with an experimental_io_device argument (default None) to choose the I/O device for saving models and weights.
    • Mutable tables now restore checkpointed values when loaded from SavedModel.
  • GPU
    • TF 2.3 includes PTX kernels only for compute capability 7.0 to reduce the TF pip binary size. Earlier releases included PTX for a variety of older compute capabilities.
  • Others
    • Retain parent namescope for ops added inside tf.while_loop/tf.cond/tf.switch_case.
    • Update tf.vectorized_map to support vectorizing tf.while_loop and TensorList operations.
    • tf.custom_gradient can now be applied to functions that accept nested structures of tensors as inputs (instead of just a list of tensors). Note that Python structures such as tuples and lists now won't be treated as tensors, so if you still want them to be treated that way, you need to wrap them with tf.convert_to_tensor.
    • No lowering on gradient case op when input is DeviceIndex op.
    • Extend the ragged version of tf.gather to support batch_dims and axis args.
    • Update tf.map_fn to support RaggedTensors and SparseTensors.
    • Deprecate tf.group. It is not useful in eager mode.
    • Add CPU and GPU implementations of a modified variant of FTRL/FTRLV2 that can be triggered by multiply_linear_by_lr, allowing a learning rate of zero.
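As noted in the tf.ragged item above, a minimal sketch of the new cross operations on illustrative inputs:

```python
import tensorflow as tf

a = tf.ragged.constant([["a"], ["b", "c"]])
b = tf.ragged.constant([["x", "y"], ["z"]])

# Cross joins feature values row-wise, e.g. "a_X_x", "a_X_y", "b_X_z", "c_X_z".
print(tf.ragged.cross([a, b]))
# The hashed variant buckets the crossed values instead of materializing strings.
print(tf.ragged.cross_hashed([a, b], num_buckets=100))
```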

tf.data:

  • tf.data.experimental.dense_to_ragged_batch works correctly with tuples.
  • tf.data.experimental.dense_to_ragged_batch can now output variable ragged rank.
  • tf.data.experimental.cardinality is now a method on tf.data.Dataset.
  • tf.data.Dataset now supports len(Dataset) when the cardinality is finite.
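A minimal sketch of the new cardinality support:

```python
import tensorflow as tf

ds = tf.data.Dataset.range(10).batch(2)
print(ds.cardinality())  # tf.Tensor(5, shape=(), dtype=int64)
print(len(ds))           # 5; len() is valid because the cardinality is finite
```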

tf.distribute:

  • Expose experimental tf.distribute.DistributedDataset and tf.distribute.DistributedIterator to distribute input data when using tf.distribute to scale training on multiple devices.
  • Allow var.assign on MirroredVariables with aggregation=NONE in replica context. Previously this would raise an error. We now allow this because many users and library writers find using .assign in replica context to be more convenient, instead of having to use Strategy.extended.update, which was the previous way of updating variables in this situation.
  • tf.distribute.experimental.MultiWorkerMirroredStrategy adds support for partial batches. Workers running out of data now continue to participate in training with empty inputs, instead of raising an error. Learn more about partial batches here.
  • Improve the performance of reading metrics eagerly under tf.distribute.experimental.MultiWorkerMirroredStrategy.
  • Fix the issue that strategy.reduce() inside tf.function may raise exceptions when the values to reduce are from loops or if-clauses.
  • Fix the issue that tf.distribute.MirroredStrategy cannot be used together with tf.distribute.experimental.MultiWorkerMirroredStrategy.
  • Add a tf.distribute.cluster_resolver.TPUClusterResolver.connect API to simplify TPU initialization.
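A minimal sketch of the simplified TPU initialization; the TPU name is hypothetical and depends on your environment:

```python
import tensorflow as tf

# connect() initializes the TPU system and returns a cluster resolver.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver.connect(tpu="my-tpu")
strategy = tf.distribute.TPUStrategy(resolver)  # stable API as of 2.3
```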

tf.keras:

  • Introduces an experimental preprocessing layers API (tf.keras.layers.experimental.preprocessing) to handle data preprocessing operations such as categorical feature encoding, text vectorization, data normalization, and data discretization (binning). The newly added layers provide a replacement for the legacy feature column API and support composite tensor inputs (a short usage sketch follows this section).
  • Added categorical data processing layers:
    • IntegerLookup & StringLookup: build an index of categorical feature values
    • CategoryEncoding: turn integer-encoded categories into one-hot, multi-hot, or tf-idf encoded representations
    • CategoryCrossing: create new categorical features representing co-occurrences of previous categorical feature values
    • Hashing: the hashing trick, for large-vocabulary categorical features
    • Discretization: turn continuous numerical features into categorical features by binning their values
  • Improved image preprocessing layers: CenterCrop, Rescaling
  • Improved image augmentation layers: RandomCrop, RandomFlip, RandomTranslation, RandomRotation, RandomHeight, RandomWidth, RandomZoom, RandomContrast
  • Improved TextVectorization layer, which handles string tokenization, n-gram generation, and token encoding
    • The TextVectorization layer now accounts for the mask_token as part of the vocabulary size when output_mode='int'. This means that, if you have a max_tokens value of 5000, your output will have 5000 unique values (not 5001 as before).
    • Change the return value of TextVectorization.get_vocabulary() from bytes to strings. Users who previously called 'decode' on the output of this method should no longer need to do so.
  • Introduce new Keras dataset generation utilities:
    • image_dataset_from_directory is a utility based on tf.data.Dataset, meant to replace the legacy ImageDataGenerator. It takes you from a structured directory of images to a labeled dataset, in one function call. Note that it doesn't perform image data augmentation (which is meant to be done using preprocessing layers).
    • text_dataset_from_directory takes you from a structured directory of text files to a labeled dataset, in one function call.
    • timeseries_dataset_from_array is a tf.data.Dataset-based replacement for the legacy TimeseriesGenerator. It takes you from an array of timeseries data to a dataset of shifting windows with their targets.
  • Added an experimental_steps_per_execution argument to model.compile to indicate the number of batches to run per tf.function call. This can speed up Keras Models on TPUs by up to 3x.
  • Extends tf.keras.layers.Lambda layers to support multi-argument lambdas, and keyword arguments when calling the layer.
  • Functional models now get constructed if any tensor in a layer call's arguments or keyword arguments comes from a Keras input. Previously, the Functional API would only work if all of the elements in the first argument to the layer came from a Keras input.
  • Clean up the BatchNormalization layer's trainable property so that it acts like standard Python state when used inside tf.function (frozen at tracing time), instead of acting like a pseudo-variable whose updates were only sometimes reflected in already-traced tf.function traces.
  • Add the Conv1DTranspose layer.
  • Refine the semantics of SensitivitySpecificityBase-derived metrics. See the updated API docstrings for tf.keras.metrics.SensitivityAtSpecificity and tf.keras.metrics.SpecificityAtSensitivity.
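As referenced above, a short sketch of the experimental preprocessing layers on toy data; the corpus is illustrative:

```python
import tensorflow as tf
from tensorflow.keras.layers.experimental import preprocessing

texts = tf.data.Dataset.from_tensor_slices(
    ["the quick brown fox", "jumped over the lazy dog"]).batch(2)

vectorizer = preprocessing.TextVectorization(max_tokens=5000, output_mode="int")
vectorizer.adapt(texts)                      # build the vocabulary from data
print(vectorizer.get_vocabulary()[:5])       # now returns strings, not bytes
print(vectorizer(tf.constant(["the fox"])))  # integer token ids
```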

tf.lite:

  • Converter
    • Restored the inference_input_type and inference_output_type flags in the TF 2.x TFLiteConverter (backward compatible with TF 1.x) to support integer (tf.int8, tf.uint8) input and output types in post-training full-integer quantized models (a hedged usage sketch follows this section).
    • Added support for converting and resizing models with dynamic (placeholder) dimensions. Previously, there was only limited support for dynamic batch size, and even that did not guarantee that the model could be properly resized at runtime.
    • Enabled experimental support for a new quantization mode with 16-bit activations and 8-bit weights. See lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8.
  • CPU
    • Fix an issue with dynamic weights and Conv2D on x86.
    • Add a runtime Android flag for enabling XNNPACK for optimized CPU performance.
    • Add a runtime iOS flag for enabling XNNPACK for optimized CPU performance.
    • Add a compiler flag to enable building a TFLite library that applies the XNNPACK delegate automatically when the model has an fp32 operation.
  • GPU
    • Allow GPU acceleration starting with internal graph nodes.
    • Experimental support for quantized models with the Android GPU delegate.
    • Add GPU delegate whitelist.
    • Rename GPU whitelist -> compatibility (list).
    • Improve GPU compatibility list entries from crash reports.
  • NNAPI
    • Set default value for StatefulNnApiDelegate::Options::max_number_delegated_partitions to 3.
    • Add capability to disable NNAPI CPU and check NNAPI Errno.
    • Fix crashes when using NNAPI with a target accelerator specified and a model containing Conv2d, FullyConnected, or LSTM nodes with quantized weights.
    • Fix ANEURALNETWORKS_BAD_DATA execution failures with sum/max/min/reduce operations with scalar inputs.
  • Hexagon
    • Moved the TFLite Hexagon delegate out of experimental.
    • Experimental int8 support for most Hexagon ops.
    • Experimental per-channel quantization support for conv in the Hexagon delegate.
    • Support dynamic batch size in C++ API.
  • CoreML
    • Open-sourced the CoreML delegate.
  • Misc
    • Enable building Android TFLite targets on Windows.
    • Add support for BatchMatMul.
    • Add support for half_pixel_centers with ResizeNearestNeighbor.
    • Add 3D support for BatchToSpaceND.
    • Add 5D support for BroadcastSub, Maximum, Minimum, Transpose and BroadcastDiv.
    • Rename kTfLiteActRelu1 to kTfLiteActReluN1To1.
    • Enable flex delegate on tensorflow.lite.Interpreter Python package.
    • Add Buckettize, SparseCross and BoostedTreesBucketize to the flex whitelist.
    • Add support for selective registration of flex ops.
    • Add missing kernels for flex delegate whitelisted ops.
    • Fix issue when using direct ByteBuffer inputs with graphs that have dynamic shapes.
    • Fix error checking supported operations in a model containing HardSwish.
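As referenced in the Converter item above, a hedged sketch of post-training full-integer quantization using the restored flags; the saved-model path, input shape, and calibration generator are hypothetical:

```python
import numpy as np
import tensorflow as tf

def representative_data_gen():
    # Yield a few calibration samples matching the model's input signature.
    for _ in range(10):
        yield [np.random.rand(1, 28, 28).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("/tmp/my_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8   # restored flag
converter.inference_output_type = tf.int8  # restored flag
tflite_model = converter.convert()
```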

Packaging Support

  • Added tf.sysconfig.get_build_info(). Returns a dict that describes the build environment of the currently installed TensorFlow package, e.g. the NVIDIA CUDA and NVIDIA CuDNN versions used when TensorFlow was built.
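A minimal sketch; the exact keys in the returned dict depend on how the installed package was built:

```python
import tensorflow as tf

info = tf.sysconfig.get_build_info()
print(info.get("cuda_version"), info.get("cudnn_version"))
```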

Profiler

  • Fix a subtle use-after-free issue in XStatVisitor::RefValue().

TPU Enhancements

  • Adds 3D mesh support in TPU configuration ops.
  • Added TPU code for FTRL with multiply_linear_by_lr.
  • Silently adds a new file system registry at gstpu.
  • Support restartType in the Cloud TPU client.
  • Depend on a specific version of google-api-python-client.
  • Fixes apiclient import.

Tracing and Debugging

  • Add a TFE_Py_Execute traceme.

XLA Support

  • Implement stable argmin and argmax.

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

902449@58880@bigcat_chen@ASIC, Abdul Baseer Khan, Abhineet Choudhary, Abolfazl Shahbazi, Adam Hillier, ag.ramesh, Agoniii, Ajay P, Alex Hoffman, Alexander Bayandin, Alexander Grund, Alexandre Abadie, Alexey Rogachevskiy, amoitra, Andrew Stevens, Angus-Luo, Anshuman Tripathy, Anush Elangovan, Artem Mavrin, Ashutosh Hathidara, autoih, Ayushman Kumar, ayushmankumar7, Bairen Yi, Bas Aarts, Bastian Eichenberger, Ben Barsdell, bhack, Bharat Raghunathan, Biagio Montaruli, Bigcat-Himax, blueyi, Bryan Cutler, Byambaa, Carlos Hernandez-Vaquero, Chen Lei, Chris Knorowski, Christian Clauss, chuanqiw, CuiYifeng, Daniel Situnayake, Daria Zhuravleva, Dayananda-V, Deven Desai, Devi Sandeep Endluri, Dmitry Zakharov, Dominic Jack, Duncan Riach, Edgar Liberis, Ehsan Toosi, ekuznetsov139, Elena Zhelezina, Eugene Kuznetsov, Eugene Mikhantiev, Evgenii Zheltonozhskii, Fabio Di Domenico, Fausto Morales, Fei Sun, feihugis, Felix E. Klee, flyingcat, Frederic Bastien, Fredrik Knutsson, frreiss, fsx950223, ganler, Gaurav Singh, Georgios Pinitas, Gian Marco Iodice, Giorgio Arena, Giuseppe Rossini, Gregory Keith, Guozhong Zhuang, gurushantj, Hahn Anselm, Harald Husum, Harjyot Bagga, Hristo Vrigazov, Ilya Persky, Ir1d, Itamar Turner-Trauring, jacco, Jake Tae, Janosh Riebesell, Jason Zaman, jayanth, Jeff Daily, Jens Elofsson, Jinzhe Zeng, JLZ, Jonas Skog, Jonathan Dekhtiar, Josh Meyer, Joshua Chia, Judd, justkw, Kaixi Hou, Kam D Kasravi, Kamil Rakoczy, Karol Gugala, Kayou, Kazuaki Ishizaki, Keith Smiley, Khaled Besrour, Kilaru Yasaswi Sri Chandra Gandhi, Kim, Young Soo, Kristian Hartikainen, Kwabena W. Agyeman, Leslie-Fang, Leslie-Fang-Intel, Li, Guizi, Lukas Geiger, Lutz Roeder, Måns Nilsson, Mahmoud Abuzaina, Manish, Marcel Koester, Marcin Sielski, marload, Martin Jul, Matt Conley, mdfaijul, Meng, Peng, Meteorix, Michael Käufl, Michael137, Milan Straka, Mitchell Vitez, Ml-0, Mokke Meguru, Mshr-H, nammbash, Nathan Luehr, naumkin, Neeraj Bhadani, ngc92, Nick Morgan, nihui, Niranjan Hasabnis, Niranjan Yadla, Nishidha Panpaliya, Oceania2018, oclyke, Ouyang Jin, OverLordGoldDragon, Owen Lyke, Patrick Hemmer, Paul Andrey, Peng Sun, periannath, Phil Pearl, Prashant Dandriyal, Prashant Kumar, Rahul Huilgol, Rajan Singh, Rajeshwar Reddy T, rangjiaheng, Rishit Dagli, Rohan Reddy, rpalakkal, rposts, Ruan Kunliang, Rushabh Vasani, Ryohei Ikegami, Semun Lee, Seo-Inyoung, Sergey Mironov, Sharada Shiddibhavi, ShengYang1, Shraiysh Vaishay, Shunya Ueta, shwetaoj, Siyavash Najafzade, Srinivasan Narayanamoorthy, Stephan Uphoff, storypku, sunchenggen, sunway513, Sven-Hendrik Haase, Swapnil Parekh, Tamas Bela Feher, Teng Lu, tigertang, tomas, Tomohiro Ubukata, tongxuan.ltx, Tony Tonev, Tzu-Wei Huang, Téo Bouvard, Uday Bondhugula, Vaibhav Jade, Vijay Tadikamalla, Vikram Dattu, Vincent Abriou, Vishnuvardhan Janapati, Vo Van Nghia, VoVAllen, Will Battel, William D. Irons, wyzhao, Xiaoming (Jason) Cui, Xiaoquan Kong, Xinan Jiang, xutianming, Yair Ehrenwald, Yasir Modak, Yasuhiro Matsumoto, Yixing Fu, Yong Tang, Yuan Tang, zhaozheng09, Zilin Zhu, zilinzhu, 张志豪

Assets 2
Pre-release
Pre-release

@geetachavan1 geetachavan1 released this Jul 18, 2020 · 77 commits to r2.3 since this release

Release 2.3.0

Major Features and Improvements

  • tf.data adds two new mechanisms to solve input pipeline bottlenecks and save resources:

In addition checkout the detailed guide for analyzing input pipeline performance with TF Profiler.

  • tf.distribute.TPUStrategy is now a stable API and no longer considered experimental for TensorFlow. (earlier tf.distribute.experimental.TPUStrategy).

  • TF Profiler introduces two new tools: a memory profiler to visualize your model’s memory usage over time and a python tracer which allows you to trace python function calls in your model. Usability improvements include better diagnostic messages and profile options to customize the host and device trace verbosity level.

  • Introduces experimental support for Keras Preprocessing Layers API (tf.keras.layers.experimental.preprocessing.*) to handle data preprocessing operations, with support for composite tensor inputs. Please see below for additional details on these layers.

  • TFLite now properly supports dynamic shapes during conversion and inference. We’ve also added opt-in support on Android and iOS for XNNPACK, a highly optimized set of CPU kernels, as well as opt-in support for executing quantized models on the GPU.

  • Libtensorflow packages are available in GCS starting this release. We have also started to release a nightly version of these packages.

  • The experimental Python API tf.debugging.experimental.enable_dump_debug_info() now allows you to instrument a TensorFlow program and dump debugging information to a directory on the file system. The directory can be read and visualized by a new interactive dashboard in TensorBoard 2.3 called Debugger V2, which reveals the details of the TensorFlow program including graph structures, history of op executions at the Python (eager) and intra-graph levels, the runtime dtype, shape, and numerical composistion of tensors, as well as their code locations.

Breaking Changes

  • Increases the minimum bazel version required to build TF to 3.1.0.
  • tf.data
    • Makes the following (breaking) changes to the tf.data.
    • C++ API: - IteratorBase::RestoreInternal, IteratorBase::SaveInternal, and DatasetBase::CheckExternalState become pure-virtual and subclasses are now expected to provide an implementation.
    • The deprecated DatasetBase::IsStateful method is removed in favor of DatasetBase::CheckExternalState.
    • Deprecated overrides of DatasetBase::MakeIterator and MakeIteratorFromInputElement are removed.
    • The signature of tensorflow::data::IteratorBase::SaveInternal and tensorflow::data::IteratorBase::SaveInput has been extended with SerializationContext argument to enable overriding the default policy for the handling external state during iterator checkpointing. This is not a backwards compatible change and all subclasses of IteratorBase need to be updated accordingly.
  • tf.keras
    • Add a new BackupAndRestore callback for handling distributed training failures & restarts. Please take a look at this tutorial for details on how to use the callback.
  • tf.image.extract_glimpse has been updated to correctly process the case
    where centered=False and normalized=False. This is a breaking change as
    the output is different from (incorrect) previous versions. Note this
    breaking change only impacts tf.image.extract_glimpse and
    tf.compat.v2.image.extract_glimpse API endpoints. The behavior of
    tf.compat.v1.image.extract_glimpse does not change. The behavior of
    exsiting C++ kernel ExtractGlimpse does not change either, so saved
    models using tf.raw_ops.ExtractGlimpse will not be impacted.

Bug Fixes and Other Changes

TF Core:

  • Set tf2_behavior to 1 to enable V2 for early loading cases.
  • Add execute_fn_for_device function to dynamically choose the implementation based on underlying device placement.
  • Eager:
    • Add reduce_logsumexp benchmark with experiment compile.
    • Give EagerTensors a meaningful __array__ implementation.
    • Add another version of defun matmul for performance analysis.
  • tf.function/AutoGraph:
    • AutoGraph now includes into TensorFlow loops any variables that are closed over by local functions. Previously, such variables were sometimes incorrectly ignored.
    • functions returned by the get_concrete_function method of tf.function objects can now be called with arguments consistent with the original arguments or type specs passed to get_concrete_function. This calling convention is now the preferred way to use concrete functions with nested values and composite tensors. Please check the guide for more details on concrete_ function.
    • Update tf.function's experimental_relax_shapes to handle composite tensors appropriately.
    • Optimize tf.function invocation, by removing redundant list converter.
    • tf.function will retrace when called with a different variable instead of simply using the dtype & shape.
    • Improve support for dynamically-sized TensorArray inside tf.function.
  • tf.math:
    • Narrow down argmin/argmax contract to always return the smallest index for ties.
    • tf.math.reduce_variance and tf.math.reduce_std return correct computation for complex types and no longer support integer types.
    • Add Bessel functions of order 0,1 to tf.math.special.
    • tf.divide now always returns a tensor to be consistent with documentation and other APIs.
  • tf.image:
    • Replaced tf.image.non_max_suppression_padded with a new implementation that supports batched inputs, which is considerably faster on TPUs and GPUs. Boxes with area=0 will be ignored. Existing usage with single inputs should still work as before.
  • tf.linalg
    • Add tf.linalg.banded_triangular_solve.
  • tf.random:
    • Add tf.random.stateless_parameterized_truncated_normal.
  • tf.ragged:
    • Add tf.ragged.cross and tf.ragged.cross_hashed operations.
  • tf.RaggedTensor:
    • RaggedTensor.to_tensor() now preserves static shape.
    • Add tf.strings.format() and tf.print() to support RaggedTensors.
  • tf.saved_model:
    • @tf.function from SavedModel no longer ignores args after a RaggedTensor when selecting the concrete function to run.
    • Fix save model issue for ops with a list of functions.
    • Add tf.saved_model.LoadOptions with experimental_io_device as arg with default value None to choose the I/O device for loading models and weights.
    • Update tf.saved_model.SaveOptions with experimental_io_device as arg with default value None to choose the I/O device for saving models and weights.
  • GPU
    • No longer includes PTX kernels for GPU except for sm_70 to reduce binary size.
  • Others
    • Retain parent namescope for ops added inside tf.while_loop/tf.cond/tf.switch_case.
    • Update tf.vectorized_map to support vectorizing tf.while_loop and TensorList operations.
    • tf.custom_gradient can now be applied to functions that accept nested structures of tensors as inputs (instead of just a list of tensors). Note that Python structures such as tuples and lists now won't be treated as tensors, so if you still want them to be treated that way, you need to wrap them with tf.convert_to_tensor.
    • No lowering on gradient case op when input is DeviceIndex op.
    • Extend the ragged version of tf.gather to support batch_dims and axis args.
    • Update tf.map_fn to support RaggedTensors and SparseTensors.
    • Deprecate tf.group. It is not useful in eager mode.
    • Add CPU and GPU implementation of modified variation of FTRL/FTRLV2 that can triggerred by multiply_linear_by_lr allowing a learning rate of zero.

tf.data:

  • tf.data.experimental.dense_to_ragged_batch works correctly with tuples.
  • tf.data.experimental.dense_to_ragged_batch to output variable ragged rank.
  • tf.data.experimental.cardinality is now a method on tf.data.Dataset.
  • tf.data.Dataset now supports len(Dataset) when the cardinality is finite.

tf.distribute:

  • Expose experimental tf.distribute.DistributedDataset and tf.distribute.DistributedIterator to distribute input data when using tf.distribute to scale training on multiple devices.
  • Allow var.assign on MirroredVariables with aggregation=NONE in replica context. Previously this would raise an error. We now allow this because many users and library writers find using .assign in replica context to be more convenient, instead of having to use Strategy.extended.update which was the previous way of updating variables in this situation.
  • tf.distribute.experimental.MultiWorkerMirroredStrategy adds support for partial batches. Workers running out of data now continue to participate in the training with empty inputs, instead of raising an error. Learn more about partial batches here.
  • Improve the performance of reading metrics eagerly under tf.distribute.experimental.MultiWorkerMirroredStrategy.
  • Fix the issue that strategy.reduce() inside tf.function may raise exceptions when the values to reduce are from loops or if-clauses.
  • Fix the issue that tf.distribute.MirroredStrategy cannot be used together with tf.distribute.experimental.MultiWorkerMirroredStrategy.
  • Add a tf.distribute.cluster_resolver.TPUClusterResolver.connect API to simplify TPU initialization.

tf.keras:

  • Introduces experimental preprocessing layers API (tf.keras.layers.experimental.preprocessing) to handle data preprocessing operations such as categorical feature encoding, text vectorization, data normalization, and data discretization (binning). The newly added layers provide a replacement for the legacy feature column API, and support composite tensor inputs.
  • Added categorical data processing layers:
    • IntegerLookup & StringLookup: build an index of categorical feature values
    • CategoryEncoding: turn integer-encoded categories into one-hot, multi-hot, or tf-idf encoded representations
    • CategoryCrossing: create new categorical features representing co-occurrences of previous categorical feature values
    • Hashing: the hashing trick, for large-vocabulary categorical features
    • Discretization: turn continuous numerical features into categorical features by binning their values
  • Improved image preprocessing layers: CenterCrop, Rescaling
  • Improved image augmentation layers: RandomCrop, RandomFlip, RandomTranslation, RandomRotation, RandomHeight, RandomWidth, RandomZoom, RandomContrast
  • Improved TextVectorization layer, which handles string tokenization, n-gram generation, and token encoding
    • The TextVectorization layer now accounts for the mask_token as part of the vocabulary size when output_mode='int'. This means that, if you have a max_tokens value of 5000, your output will have 5000 unique values (not 5001 as before).
    • Change the return value of TextVectorization.get_vocabulary() from byte to string. Users who previously were calling 'decode' on the output of this method should no longer need to do so.
  • Introduce new Keras dataset generation utilities :
    • image_dataset_from_directory is a utility based on tf.data.Dataset, meant to replace the legacy ImageDataGenerator. It takes you from a structured directory of images to a labeled dataset, in one function call. Note that it doesn't perform image data augmentation (which is meant to be done using preprocessing layers).
    • text_dataset_from_directory takes you from a structured directory of text files to a labeled dataset, in one function call.
    • timeseries_dataset_from_array is a tf.data.Dataset-based replacement of the legacy TimeseriesGenerator. It takes you from an array of timeseries data to a dataset of shifting windows with their targets.
  • Added experimental_steps_per_execution
    arg to model.compile to indicate the number of batches to run per tf.function call. This can speed up Keras Models on TPUs up to 3x.
  • Extends tf.keras.layers.Lambda layers to support multi-argument lambdas, and keyword arguments when calling the layer.
  • Functional models now get constructed if any tensor in a layer call's arguments/keyword arguments comes from a keras input. Previously the functional api would only work if all of the elements in the first argument to the layer came from a keras input.
  • Clean up BatchNormalization layer's trainable property to act like standard python state when it's used inside tf.functions (frozen at tracing time), instead of acting like a pseudo-variable whose updates kind of sometimes get reflected in already-traced tf.function traces.
  • Add the Conv1DTranspose layer.
  • Refine the semantics of SensitivitySpecificityBase derived metrics. See the updated API docstrings for tf.keras.metrics.SensitivityAtSpecificity and tf.keras.metrics.SpecificityAtSensitivty.

tf.lite:

  • Converter
    • Restored inference_input_type and inference_output_type flags in TF 2.x TFLiteConverter (backward compatible with TF 1.x) to support integer (tf.int8, tf.uint8) input and output types in post training full integer quantized models.
    • Added support for converting and resizing models with dynamic (placeholder) dimensions. Previously, there was only limited support for dynamic batch size, and even that did not guarantee that the model could be properly resized at runtime.
    • Enabled experimental support for a new quantization mode with 16-bit activations and 8-bit weights. See lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8.
  • CPU
    • Fix an issue w/ dynamic weights and Conv2D on x86.
    • Add a runtime Android flag for enabling XNNPACK for optimized CPU performance.
    • Add a runtime iOS flag for enabling XNNPACK for optimized CPU performance.
    • Add a compiler flag to enable building a TFLite library that applies XNNPACK delegate automatically when the model has a fp32 operation.
  • GPU
    • Allow GPU acceleration starting with internal graph nodes
    • Experimental support for quantized models with the Android GPU delegate
    • Add GPU delegate whitelist.
    • Rename GPU whitelist -> compatibility (list).
    • Improve GPU compatibility list entries from crash reports.
  • NNAPI
    • Set default value for StatefulNnApiDelegate::Options::max_number_delegated_partitions to 3.
    • Add capability to disable NNAPI CPU and check NNAPI Errno.
    • Fix crashes when using NNAPI with target accelerator specified with model containing Conv2d or FullyConnected or LSTM nodes with quantized weights.
    • Fix ANEURALNETWORKS_BAD_DATA execution failures with sum/max/min/reduce operations with scalar inputs.
  • Hexagon
    • TFLite Hexagon Delegate out of experimental.
    • Experimental int8 support for most hexagon ops.
    • Experimental per-channel quant support for conv in Hexagon delegate.
    • Support dynamic batch size in C++ API.
  • CoreML
    • Opensource CoreML delegate
  • Misc
    • Enable building Android TFLite targets on Windows
    • Add support for BatchMatMul.
    • Add support for half_pixel_centers with ResizeNearestNeighbor.
    • Add 3D support for BatchToSpaceND.
    • Add 5D support for BroadcastSub, Maximum, Minimum, Transpose and BroadcastDiv.
    • Rename kTfLiteActRelu1 to kTfLiteActReluN1To1.
    • Enable flex delegate on tensorflow.lite.Interpreter Python package.
    • Add Buckettize, SparseCross and BoostedTreesBucketize to the flex whitelist.
    • Add support for selective registration of flex ops.
    • Add missing kernels for flex delegate whitelisted ops.
    • Fix issue when using direct ByteBuffer inputs with graphs that have dynamic shapes.
    • Fix error checking supported operations in a model containing HardSwish.

Packaging Support

  • Added tf.sysconfig.get_build_info(). Returns a dict that describes the currently installed TensorFlow package, e.g. the NVIDIA CUDA and NVIDIA CuDNN versions that the package was built to support.

Profiler

  • Fix a subtle use-after-free issue in XStatVisitor::RefValue().

TPU Enhancements

  • Adds 3D mesh support in TPU configurations ops.
  • Added TPU code for FTRL with multiply_linear_by_lr.
  • Silently adds a new file system registry at gstpu.
  • Support restartType in cloud tpu client.
  • Depend on a specific version of google-api-python-client.
  • Fixes apiclient import.

Tracing and Debugging

  • Add a TFE_Py_Execute traceme.

XLA Support

  • Implement stable argmin and argmax

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

902449@58880@bigcat_chen@ASIC, Abdul Baseer Khan, Abhineet Choudhary, Abolfazl Shahbazi, Adam Hillier, ag.ramesh, Agoniii, Ajay P, Alex Hoffman, Alexander Bayandin, Alexander Grund, Alexandre Abadie, Alexey Rogachevskiy, amoitra, Andrew Stevens, Angus-Luo, Anshuman Tripathy, Anush Elangovan, Artem Mavrin, Ashutosh Hathidara, autoih, Ayushman Kumar, ayushmankumar7, Bairen Yi, Bas Aarts, Bastian Eichenberger, Ben Barsdell, bhack, Bharat Raghunathan, Biagio Montaruli, Bigcat-Himax, blueyi, Bryan Cutler, Byambaa, Carlos Hernandez-Vaquero, Chen Lei, Chris Knorowski, Christian Clauss, chuanqiw, CuiYifeng, Daniel Situnayake, Daria Zhuravleva, Dayananda-V, Deven Desai, Devi Sandeep Endluri, Dmitry Zakharov, Dominic Jack, Duncan Riach, Edgar Liberis, Ehsan Toosi, ekuznetsov139, Elena Zhelezina, Eugene Kuznetsov, Eugene Mikhantiev, Evgenii Zheltonozhskii, Fabio Di Domenico, Fausto Morales, Fei Sun, feihugis, Felix E. Klee, flyingcat, Frederic Bastien, Fredrik Knutsson, frreiss, fsx950223, ganler, Gaurav Singh, Georgios Pinitas, Gian Marco Iodice, Giorgio Arena, Giuseppe Rossini, Gregory Keith, Guozhong Zhuang, gurushantj, Hahn Anselm, Harald Husum, Harjyot Bagga, Hristo Vrigazov, Ilya Persky, Ir1d, Itamar Turner-Trauring, jacco, Jake Tae, Janosh Riebesell, Jason Zaman, jayanth, Jeff Daily, Jens Elofsson, Jinzhe Zeng, JLZ, Jonas Skog, Jonathan Dekhtiar, Josh Meyer, Joshua Chia, Judd, justkw, Kaixi Hou, Kam D Kasravi, Kamil Rakoczy, Karol Gugala, Kayou, Kazuaki Ishizaki, Keith Smiley, Khaled Besrour, Kilaru Yasaswi Sri Chandra Gandhi, Kim, Young Soo, Kristian Hartikainen, Kwabena W. Agyeman, Leslie-Fang, Leslie-Fang-Intel, Li, Guizi, Lukas Geiger, Lutz Roeder, M\U00E5Ns Nilsson, Mahmoud Abuzaina, Manish, Marcel Koester, Marcin Sielski, marload, Martin Jul, Matt Conley, mdfaijul, Meng, Peng, Meteorix, Michael Käufl, Michael137, Milan Straka, Mitchell Vitez, Ml-0, Mokke Meguru, Mshr-H, nammbash, Nathan Luehr, naumkin, Neeraj Bhadani, ngc92, Nick Morgan, nihui, Niranjan Hasabnis, Niranjan Yadla, Nishidha Panpaliya, Oceania2018, oclyke, Ouyang Jin, OverLordGoldDragon, Owen Lyke, Patrick Hemmer, Paul Andrey, Peng Sun, periannath, Phil Pearl, Prashant Dandriyal, Prashant Kumar, Rahul Huilgol, Rajan Singh, Rajeshwar Reddy T, rangjiaheng, Rishit Dagli, Rohan Reddy, rpalakkal, rposts, Ruan Kunliang, Rushabh Vasani, Ryohei Ikegami, Semun Lee, Seo-Inyoung, Sergey Mironov, Sharada Shiddibhavi, ShengYang1, Shraiysh Vaishay, Shunya Ueta, shwetaoj, Siyavash Najafzade, Srinivasan Narayanamoorthy, Stephan Uphoff, storypku, sunchenggen, sunway513, Sven-Hendrik Haase, Swapnil Parekh, Tamas Bela Feher, Teng Lu, tigertang, tomas, Tomohiro Ubukata, tongxuan.ltx, Tony Tonev, Tzu-Wei Huang, Téo Bouvard, Uday Bondhugula, Vaibhav Jade, Vijay Tadikamalla, Vikram Dattu, Vincent Abriou, Vishnuvardhan Janapati, Vo Van Nghia, VoVAllen, Will Battel, William D. Irons, wyzhao, Xiaoming (Jason) Cui, Xiaoquan Kong, Xinan Jiang, xutianming, Yair Ehrenwald, Yasir Modak, Yasuhiro Matsumoto, Yixing Fu, Yong Tang, Yuan Tang, zhaozheng09, Zilin Zhu, zilinzhu, 张志豪

Assets 2
Pre-release
Pre-release

@goldiegadde goldiegadde released this Jul 9, 2020 · 92 commits to r2.3 since this release

Release 2.3.0

Major Features and Improvements

In addition checkout the detailed guide for analyzing input pipeline performance with TF Profiler.

  • tf.distribute.TPUStrategy is now a stable API and no longer considered experimental for TensorFlow. (earlier tf.distribute.experimental.TPUStrategy).

  • TF Profiler introduces two new tools: a memory profiler to visualize your model’s memory usage over time and a python tracer which allows you to trace python function calls in your model. Usability improvements include better diagnostic messages and profile options to customize the host and device trace verbosity level.

  • Introduces experimental support for Keras Preprocessing Layers API (tf.keras.layers.experimental.preprocessing.*) to handle data preprocessing operations, with support for composite tensor inputs. Please see below for additional details on these layers.

  • TFLite now properly supports dynamic shapes during conversion and inference. We’ve also added opt-in support on Android and iOS for XNNPACK, a highly optimized set of CPU kernels, as well as opt-in support for executing quantized models on the GPU.

  • Libtensorflow packages are available in GCS starting this release. We have also started to release a nightly version of these packages.

Breaking Changes

  • Increases the minimum bazel version required to build TF to 3.1.0.
  • tf.data
    • Makes the following (breaking) changes to the tf.data.
    • C++ API: - IteratorBase::RestoreInternal, IteratorBase::SaveInternal, and DatasetBase::CheckExternalState become pure-virtual and subclasses are now expected to provide an implementation.
    • The deprecated DatasetBase::IsStateful method is removed in favor of DatasetBase::CheckExternalState.
    • Deprecated overrides of DatasetBase::MakeIterator and MakeIteratorFromInputElement are removed.
    • The signature of tensorflow::data::IteratorBase::SaveInternal and tensorflow::data::IteratorBase::SaveInput has been extended with SerializationContext argument to enable overriding the default policy for the handling external state during iterator checkpointing. This is not a backwards compatible change and all subclasses of IteratorBase need to be updated accordingly.
  • tf.keras
    • Add a new BackupAndRestore callback for handling distributed training failures & restarts. Please take a look at this tutorial for details on how to use the callback.
  • tf.image.extract_glimpse has been updated to correctly process the case
    where centered=False and normalized=False. This is a breaking change as
    the output is different from (incorrect) previous versions. Note this
    breaking change only impacts tf.image.extract_glimpse and
    tf.compat.v2.image.extract_glimpse API endpoints. The behavior of
    tf.compat.v1.image.extract_glimpse does not change. The behavior of
    exsiting C++ kernel ExtractGlimpse does not change either, so saved
    models using tf.raw_ops.ExtractGlimpse will not be impacted.

Bug Fixes and Other Changes

TF Core:

  • Set tf2_behavior to 1 to enable V2 for early loading cases.
  • Add a function to dynamically choose the implementation based on underlying device placement.
  • Eager:
    • Add reduce_logsumexp benchmark with experiment compile.
    • Give EagerTensors a meaningful __array__ implementation.
    • Add another version of defun matmul for performance analysis.
  • tf.function/AutoGraph:
    • AutoGraph now includes into TensorFlow loops any variables that are closed over by local functions. Previously, such variables were sometimes incorrectly ignored.
    • functions returned by the get_concrete_function method of tf.function objects can now be called with arguments consistent with the original arguments or type specs passed to get_concrete_function. This calling convention is now the preferred way to use concrete functions with nested values and composite tensors. Please check the guide for more details on concrete_ function.
    • Update tf.function's experimental_relax_shapes to handle composite tensors appropriately.
    • Optimize tf.function invocation, by removing redundant list converter.
    • tf.function will retrace when called with a different variable instead of simply using the dtype & shape.
    • Improve support for dynamically-sized TensorArray inside tf.function.
  • tf.math:
    • Narrow down argmin/argmax contract to always return the smallest index for ties.
    • tf.math.reduce_variance and tf.math.reduce_std return correct computation for complex types and no longer support integer types.
    • Add Bessel functions of order 0,1 to tf.math.special.
    • tf.divide now always returns a tensor to be consistent with documentation and other APIs.
  • tf.image:
    • Replaced tf.image.non_max_suppression_padded with a new implementation that supports batched inputs, which is considerably faster on TPUs and GPUs. Boxes with area=0 will be ignored. Existing usage with single inputs should still work as before.
  • tf.linalg
    • Add tf.linalg.banded_triangular_solve.
  • tf.random:
    • Add tf.random.stateless_parameterized_truncated_normal.
  • tf.ragged:
    • Add tf.ragged.cross and tf.ragged.cross_hashed operations.
  • tf.RaggedTensor:
    • RaggedTensor.to_tensor() now preserves static shape.
    • Add tf.strings.format() and tf.print() to support RaggedTensors.
  • tf.saved_model:
    • @tf.function from SavedModel no longer ignores args after a RaggedTensor when selecting the concrete function to run.
    • Fix save model issue for ops with a list of functions.
    • Add tf.saved_model.LoadOptions with experimental_io_device as arg with default value None to choose the I/O device for loading models and weights.
    • Update tf.saved_model.SaveOptions with experimental_io_device as arg with default value None to choose the I/O device for saving models and weights.
  • GPU
    • No longer includes PTX kernels for GPU except for sm_70 to reduce binary size.
  • Others
    • Retain parent namescope for ops added inside tf.while_loop/tf.cond/tf.switch_case.
    • Update tf.vectorized_map to support vectorizing tf.while_loop and TensorList operations.
    • tf.custom_gradient can now be applied to functions that accept nested structures of tensors as inputs (instead of just a list of tensors). Note that Python structures such as tuples and lists now won't be treated as tensors, so if you still want them to be treated that way, you need to wrap them with tf.convert_to_tensor.
    • No lowering on gradient case op when input is DeviceIndex op.
    • Fix in c_api DEFINE_GETATTR.
    • Extend the ragged version of tf.gather to support batch_dims and axis args.
    • Update tf.map_fn to support RaggedTensors and SparseTensors.
    • Deprecate tf.group. It is not useful in eager mode.
    • Add a new variant of FTRL allowing a learning rate of zero.

tf.data:

  • tf.data.experimental.dense_to_ragged_batch works correctly with tuples.
  • tf.data.experimental.dense_to_ragged_batch to output variable ragged rank.
  • tf.data.experimental.cardinality is now a method on tf.data.Dataset.
  • tf.data.Dataset now supports len(Dataset) when the cardinality is finite.

tf.distribute:

  • Expose experimental tf.distribute.DistributedDataset and tf.distribute.DistributedIterator to distribute input data when using tf.distribute to scale training on multiple devices.
  • Allow var.assign on MirroredVariables with aggregation=NONE in replica context. Previously this would raise an error since there was no way to confirm that the values being assigned to the MirroredVariables were in fact identical.
  • tf.distribute.experimental.MultiWorkerMirroredStrategy adds support for partial batches. Workers running out of data now continue to participate in the training with empty inputs, instead of raising an error.
  • Improve the performance of reading metrics eagerly under tf.distribute.experimental.MultiWorkerMirroredStrategy.
  • Fix the issue that strategy.reduce() inside tf.function may raise exceptions when the values to reduce are from loops or if-clauses.
  • Fix the issue that tf.distribute.MirroredStrategy cannot be used together with tf.distribute.experimental.MultiWorkerMirroredStrategy.
  • Add a tf.distribute.cluster_resolver.TPUClusterResolver.connect API to simplify TPU initialization.

tf.keras:

  • Introduces experimental preprocessing layers API (tf.keras.layers.experimental.preprocessing) to handle data preprocessing operations such as categorical feature encoding, text vectorization, data normalization, and data discretization (binning). The newly added layers provide a replacement for the legacy feature column API, and support composite tensor inputs.
  • Added categorical data processing layers:
    • IntegerLookup & StringLookup: build an index of categorical feature values
    • CategoryEncoding: turn integer-encoded categories into one-hot, multi-hot, or tf-idf encoded representations
    • CategoryCrossing: create new categorical features representing co-occurrences of previous categorical feature values
    • Hashing: the hashing trick, for large-vocabulary categorical features
    • Discretization: turn continuous numerical features into categorical features by binning their values
  • Improved image preprocessing layers: CenterCrop, Rescaling
  • Improved image augmentation layers: RandomCrop, RandomFlip, RandomTranslation, RandomRotation, RandomHeight, RandomWidth, RandomZoom, RandomContrast
  • Improved TextVectorization layer, which handles string tokenization, n-gram generation, and token encoding
    • The TextVectorization layer now accounts for the mask_token as part of the vocabulary size when output_mode='int'. This means that, if you have a max_tokens value of 5000, your output will have 5000 unique values (not 5001 as before).
    • Change the return value of TextVectorization.get_vocabulary() from byte to string. Users who previously were calling 'decode' on the output of this method should no longer need to do so.
  • Introduce new Keras dataset generation utilities :
    • image_dataset_from_directory is a utility based on tf.data.Dataset, meant to replace the legacy ImageDataGenerator. It takes you from a structured directory of images to a labeled dataset, in one function call. Note that it doesn't perform image data augmentation (which is meant to be done using preprocessing layers).
    • text_dataset_from_directory takes you from a structured directory of text files to a labeled dataset, in one function call.
    • timeseries_dataset_from_array is a tf.data.Dataset-based replacement of the legacy TimeseriesGenerator. It takes you from an array of timeseries data to a dataset of shifting windows with their targets.
  • Added experimental_steps_per_execution
    arg to model.compile to indicate the number of batches to run per tf.function call. This can speed up Keras Models on TPUs up to 3x.
  • Extends tf.keras.layers.Lambda layers to support multi-argument lambdas, and keyword arguments when calling the layer.
  • Functional models now get constructed if any tensor in a layer call's arguments/keyword arguments comes from a keras input. Previously the functional api would only work if all of the elements in the first argument to the layer came from a keras input.
  • Clean up BatchNormalization layer's trainable property to act like standard python state when it's used inside tf.functions (frozen at tracing time), instead of acting like a pseudo-variable whose updates kind of sometimes get reflected in already-traced tf.function traces.
  • Add the Conv1DTranspose layer.
  • Fix bug in SensitivitySpecificityBase derived metrics.
  • Blacklist Case op from callback

tf.lite:

  • Converter
    • Restored inference_input_type and inference_output_type flags in TF 2.x TFLiteConverter (backward compatible with TF 1.x) to support integer (tf.int8, tf.uint8) input and output types in post training full integer quantized models.
    • Added support for converting and resizing models with dynamic (placeholder) dimensions. Previously, there was only limited support for dynamic batch size, and even that did not guarantee that the model could be properly resized at runtime.
  • CPU
    • Fix an issue w/ dynamic weights and Conv2D on x86.
    • Add a runtime Android flag for enabling XNNPACK for optimized CPU performance.
    • Add a runtime iOS flag for enabling XNNPACK for optimized CPU performance.
    • Add a compiler flag to enable building a TFLite library that applies XNNPACK delegate automatically when the model has a fp32 operation.
  • GPU
    • Allow GPU acceleration starting with internal graph nodes
    • Experimental support for quantized models with the Android GPU delegate
    • Add GPU delegate whitelist.
    • Rename GPU whitelist -> compatibility (list).
    • Improve GPU compatibility list entries from crash reports.
  • NNAPI
    • Set default value for StatefulNnApiDelegate::Options::max_number_delegated_partitions to 3.
    • Add capability to disable NNAPI CPU and check NNAPI Errno.
    • Fix crashes when using NNAPI with target accelerator specified with model containing Conv2d or FullyConnected or LSTM nodes with quantized weights.
    • Fix ANEURALNETWORKS_BAD_DATA execution failures with sum/max/min/reduce operations with scalar inputs.
  • Hexagon
    • TFLite Hexagon Delegate out of experimental.
    • Experimental int8 support for most hexagon ops.
    • Experimental per-channel quant support for conv in Hexagon delegate.
    • Support dynamic batch size in C++ API.
  • CoreML
    • Opensource CoreML delegate
  • Misc
    • Enable building Android TFLite targets on Windows
    • Add support for BatchMatMul.
    • Add support for half_pixel_centers with ResizeNearestNeighbor.
    • Add 3D support for BatchToSpaceND.
    • Add 5D support for BroadcastSub, Maximum, Minimum, Transpose and BroadcastDiv.
    • Rename kTfLiteActRelu1 to kTfLiteActReluN1To1.
    • Enable flex delegate on tensorflow.lite.Interpreter Python package.
    • Add Buckettize, SparseCross and BoostedTreesBucketize to the flex whitelist.
    • Add support for selective registration of flex ops.
    • Add missing kernels for flex delegate whitelisted ops.
    • Fix issue when using direct ByteBuffer inputs with graphs that have dynamic shapes.
    • Fix error checking supported operations in a model containing HardSwish.

Profiler

* Fix a subtle use-after-free issue in `XStatVisitor::RefValue()`.

TPU Enhancements

  • 3D mesh support
  • Added TPU code for FTRL with multiply_linear_by_lr.
  • Silently adds a new file system registry at gstpu.
  • Support restartType in cloud tpu client.
  • Depend on a specific version of google-api-python-client.
  • Fixes apiclient import.

XLA Support

  • Implement stable argmin and argmax

Tracing and Debugging

  • Add a TFE_Py_Execute traceme.

Packaging Support

  • Added tf.sysconfig.get_build_info(). Returns a dict that describes the currently installed TensorFlow package, e.g. the NVIDIA CUDA and NVIDIA CuDNN versions that the package was built to support.

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

902449@58880@bigcat_chen@ASIC, Abdul Baseer Khan, Abhineet Choudhary, Abolfazl Shahbazi, Adam Hillier, ag.ramesh, Agoniii, Ajay P, Alex Hoffman, Alexander Bayandin, Alexander Grund, Alexandre Abadie, Alexey Rogachevskiy, amoitra, Andrew Stevens, Angus-Luo, Anshuman Tripathy, Anush Elangovan, Artem Mavrin, Ashutosh Hathidara, autoih, Ayushman Kumar, ayushmankumar7, Bairen Yi, Bas Aarts, Bastian Eichenberger, Ben Barsdell, bhack, Bharat Raghunathan, Biagio Montaruli, Bigcat-Himax, blueyi, Bryan Cutler, Byambaa, Carlos Hernandez-Vaquero, Chen Lei, Chris Knorowski, Christian Clauss, chuanqiw, CuiYifeng, Daniel Situnayake, Daria Zhuravleva, Dayananda-V, Deven Desai, Devi Sandeep Endluri, Dmitry Zakharov, Dominic Jack, Duncan Riach, Edgar Liberis, Ehsan Toosi, ekuznetsov139, Elena Zhelezina, Eugene Kuznetsov, Eugene Mikhantiev, Evgenii Zheltonozhskii, Fabio Di Domenico, Fausto Morales, Fei Sun, feihugis, Felix E. Klee, flyingcat, Frederic Bastien, Fredrik Knutsson, frreiss, fsx950223, ganler, Gaurav Singh, Georgios Pinitas, Gian Marco Iodice, Giorgio Arena, Giuseppe Rossini, Gregory Keith, Guozhong Zhuang, gurushantj, Hahn Anselm, Harald Husum, Harjyot Bagga, Hristo Vrigazov, Ilya Persky, Ir1d, Itamar Turner-Trauring, jacco, Jake Tae, Janosh Riebesell, Jason Zaman, jayanth, Jeff Daily, Jens Elofsson, Jinzhe Zeng, JLZ, Jonas Skog, Jonathan Dekhtiar, Josh Meyer, Joshua Chia, Judd, justkw, Kaixi Hou, Kam D Kasravi, Kamil Rakoczy, Karol Gugala, Kayou, Kazuaki Ishizaki, Keith Smiley, Khaled Besrour, Kilaru Yasaswi Sri Chandra Gandhi, Kim, Young Soo, Kristian Hartikainen, Kwabena W. Agyeman, Leslie-Fang, Leslie-Fang-Intel, Li, Guizi, Lukas Geiger, Lutz Roeder, M\U00E5Ns Nilsson, Mahmoud Abuzaina, Manish, Marcel Koester, Marcin Sielski, marload, Martin Jul, Matt Conley, mdfaijul, Meng, Peng, Meteorix, Michael Käufl, Michael137, Milan Straka, Mitchell Vitez, Ml-0, Mokke Meguru, Mshr-H, nammbash, Nathan Luehr, naumkin, Neeraj Bhadani, ngc92, Nick Morgan, nihui, Niranjan Hasabnis, Niranjan Yadla, Nishidha Panpaliya, Oceania2018, oclyke, Ouyang Jin, OverLordGoldDragon, Owen Lyke, Patrick Hemmer, Paul Andrey, Peng Sun, periannath, Phil Pearl, Prashant Dandriyal, Prashant Kumar, Rahul Huilgol, Rajan Singh, Rajeshwar Reddy T, rangjiaheng, Rishit Dagli, Rohan Reddy, rpalakkal, rposts, Ruan Kunliang, Rushabh Vasani, Ryohei Ikegami, Semun Lee, Seo-Inyoung, Sergey Mironov, Sharada Shiddibhavi, ShengYang1, Shraiysh Vaishay, Shunya Ueta, shwetaoj, Siyavash Najafzade, Srinivasan Narayanamoorthy, Stephan Uphoff, storypku, sunchenggen, sunway513, Sven-Hendrik Haase, Swapnil Parekh, Tamas Bela Feher, Teng Lu, tigertang, tomas, Tomohiro Ubukata, tongxuan.ltx, Tony Tonev, Tzu-Wei Huang, Téo Bouvard, Uday Bondhugula, Vaibhav Jade, Vijay Tadikamalla, Vikram Dattu, Vincent Abriou, Vishnuvardhan Janapati, Vo Van Nghia, VoVAllen, Will Battel, William D. Irons, wyzhao, Xiaoming (Jason) Cui, Xiaoquan Kong, Xinan Jiang, xutianming, Yair Ehrenwald, Yasir Modak, Yasuhiro Matsumoto, Yixing Fu, Yong Tang, Yuan Tang, zhaozheng09, Zilin Zhu, zilinzhu, 张志豪


@goldiegadde released this Jun 26, 2020 · 107 commits to r2.3 since this release

Release 2.3.0

Major Features and Improvements

  • tf.data adds two new mechanisms to solve input pipeline bottlenecks and save resources: the snapshot API, which persists the output of your input pipeline to disk so it can be reused across training runs, and the tf.data service, which distributes input preprocessing across a cluster of workers.

In addition, check out the detailed guide for analyzing input pipeline performance with TF Profiler.

  • tf.distribute.TPUStrategy is now a stable API and no longer considered experimental for TensorFlow (previously tf.distribute.experimental.TPUStrategy).

  • TF Profiler introduces two new tools: a memory profiler to visualize your model’s memory usage over time and a Python tracer which allows you to trace Python function calls in your model. Usability improvements include better diagnostic messages and profile options to customize the host and device trace verbosity level.

  • Introduces experimental support for Keras Preprocessing Layers API (tf.keras.layers.experimental.preprocessing.*) to handle data preprocessing operations, with support for composite tensor inputs. Please see below for additional details on these layers.

  • TFLite now properly supports dynamic shapes during conversion and inference. We’ve also added opt-in support on Android and iOS for XNNPACK, a highly optimized set of CPU kernels, as well as opt-in support for executing quantized models on the GPU.

  • Libtensorflow packages are available in GCS starting this release. We have also started to release a nightly version of these packages.

Breaking Changes

  • Increases the minimum bazel version required to build TF to 3.1.0.
  • tf.data
    • Makes the following (breaking) changes to the tf.data C++ API:
    • IteratorBase::RestoreInternal, IteratorBase::SaveInternal, and DatasetBase::CheckExternalState become pure-virtual and subclasses are now expected to provide an implementation.
    • The deprecated DatasetBase::IsStateful method is removed in favor of DatasetBase::CheckExternalState.
    • Deprecated overrides of DatasetBase::MakeIterator and MakeIteratorFromInputElement are removed.
    • The signature of tensorflow::data::IteratorBase::SaveInternal and tensorflow::data::IteratorBase::SaveInput has been extended with a SerializationContext argument to enable overriding the default policy for handling external state during iterator checkpointing. This is not a backwards-compatible change, and all subclasses of IteratorBase need to be updated accordingly.
  • tf.keras
    • Add a new BackupAndRestore callback for handling distributed training failures & restarts (a minimal sketch follows this list). Please take a look at this tutorial for details on how to use the callback.
  • tf.image.extract_glimpse has been updated to correctly process the case
    where centered=False and normalized=False. This is a breaking change as
    the output is different from (incorrect) previous versions. Note this
    breaking change only impacts tf.image.extract_glimpse and
    tf.compat.v2.image.extract_glimpse API endpoints. The behavior of
    tf.compat.v1.image.extract_glimpse does not change. The behavior of the
    existing C++ kernel ExtractGlimpse also does not change, so saved
    models will not be impacted.
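
For the BackupAndRestore callback above, a minimal sketch (the backup directory and toy data are illustrative; in TF 2.3 the callback lives under tf.keras.callbacks.experimental):

    import tensorflow as tf

    x = tf.random.normal([32, 4])
    y = tf.random.normal([32, 1])

    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    model.compile(optimizer="sgd", loss="mse")

    # Checkpoints training state so an interrupted fit() can resume where it left off.
    backup = tf.keras.callbacks.experimental.BackupAndRestore(backup_dir="/tmp/backup")
    model.fit(x, y, epochs=3, callbacks=[backup])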

Bug Fixes and Other Changes

TF Core:

  • Set tf2_behavior to 1 to enable V2 for early loading cases.
  • Add a function to dynamically choose the implementation based on underlying device placement.
  • Eager:
    • Add reduce_logsumexp benchmark with experimental compile.
    • Give EagerTensors a meaningful __array__ implementation.
    • Add another version of defun matmul for performance analysis.
  • tf.function/AutoGraph:
    • AutoGraph now includes into TensorFlow loops any variables that are closed over by local functions. Previously, such variables were sometimes incorrectly ignored.
    • Functions returned by the get_concrete_function method of tf.function objects can now be called with arguments consistent with the original arguments or type specs passed to get_concrete_function. This calling convention is now the preferred way to use concrete functions with nested values and composite tensors (a minimal sketch follows this list). Please check the guide for more details on concrete functions.
    • Update tf.function's experimental_relax_shapes to handle composite tensors appropriately.
    • Optimize tf.function invocation, by removing redundant list converter.
    • tf.function will retrace when called with a different variable instead of simply using the dtype & shape.
    • Improve support for dynamically-sized TensorArray inside tf.function.
  • tf.math:
    • Narrow down argmin/argmax contract to always return the smallest index for ties.
    • tf.math.reduce_variance and tf.math.reduce_std return correct computation for complex types and no longer support integer types.
    • Add Bessel functions of order 0,1 to tf.math.special.
    • tf.divide now always returns a tensor to be consistent with documentation and other APIs.
  • tf.image:
    • Replaces tf.image.non_max_suppression_padded with a new implementation that supports batched inputs, which is considerably faster on TPUs and GPUs. Boxes with area=0 will be neglected. Existing usage with single inputs should still work as before.
  • tf.linalg
    • Add tf.linalg.banded_triangular_solve.
  • tf.random:
    • Add tf.random.stateless_parameterized_truncated_normal.
  • tf.ragged:
    • Add tf.ragged.cross and tf.ragged.cross_hashed operations.
  • tf.RaggedTensor:
    • RaggedTensor.to_tensor() now preserves static shape.
    • Add tf.strings.format() and tf.print() to support RaggedTensors.
  • tf.saved_model:
    • @tf.function from SavedModel no longer ignores args after a RaggedTensor when selecting the concrete function to run.
    • Fix save model issue for ops with a list of functions.
    • Add tf.saved_model.LoadOptions with experimental_io_device as arg with default value None to choose the I/O device for loading models and weights.
    • Update tf.saved_model.SaveOptions with experimental_io_device as arg with default value None to choose the I/O device for saving models and weights.
  • GPU
    • No longer includes PTX kernels for GPU except for sm_70 to reduce binary size.
  • Profiler
    • Fix a subtle use-after-free issue in XStatVisitor::RefValue().
  • Others
    • Retain parent namescope for ops added inside tf.while_loop/tf.cond/tf.switch_case.
    • Update tf.vectorized_map to support vectorizing tf.while_loop and TensorList operations.
    • tf.custom_gradient can now be applied to functions that accept nested structures of tensors as inputs (instead of just a list of tensors). Note that Python structures such as tuples and lists now won't be treated as tensors, so if you still want them to be treated that way, you need to wrap them with tf.convert_to_tensor.
    • No lowering on gradient case op when input is DeviceIndex op.
    • Fix in c_api DEFINE_GETATTR.
    • Extend the ragged version of tf.gather to support batch_dims and axis args.
    • Update tf.map_fn to support RaggedTensors and SparseTensors.
    • Deprecate tf.group. It is not useful in eager mode.
    • Add a new variant of FTRL allowing a learning rate of zero.
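
As an illustration of the concrete-function calling convention noted above, a minimal sketch:

    import tensorflow as tf

    @tf.function
    def add_one(x):
        return x + 1

    # The concrete function is called with arguments matching the original
    # type spec; this also works for nested values and composite tensors.
    cf = add_one.get_concrete_function(tf.TensorSpec([None], tf.float32))
    print(cf(tf.constant([1.0, 2.0])))  # tf.Tensor([2. 3.], shape=(2,), dtype=float32)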

tf.data:

  • tf.data.experimental.dense_to_ragged_batch works correctly with tuples.
  • Extended tf.data.experimental.dense_to_ragged_batch to output variable ragged rank.
  • tf.data.experimental.cardinality is now a method on tf.data.Dataset.
  • tf.data.Dataset now supports len(Dataset) when the cardinality is finite.
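
A minimal sketch of the new cardinality conveniences:

    import tensorflow as tf

    ds = tf.data.Dataset.range(5)
    print(ds.cardinality().numpy())  # 5
    print(len(ds))                   # 5; len() is supported when cardinality is finite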

tf.distribute:

  • Expose experimental tf.distribute.DistributedDataset and tf.distribute.DistributedIterator to distribute input data when using tf.distribute to scale training on multiple devices.
    • Added a get_next_as_optional method to the tf.distribute.DistributedIterator class. It returns a tf.experimental.Optional that contains the next value for all replicas, or no value, instead of raising an out-of-range error (a minimal sketch follows this list). Also see the new guide on input distribution.
  • Allow var.assign on MirroredVariables with aggregation=NONE in replica context. Previously this would raise an error since there was no way to confirm that the values being assigned to the MirroredVariables were in fact identical.
  • tf.distribute.experimental.MultiWorkerMirroredStrategy adds support for partial batches. Workers running out of data now continue to participate in the training with empty inputs, instead of raising an error.
  • Improve the performance of reading metrics eagerly under tf.distribute.experimental.MultiWorkerMirroredStrategy.
  • Fix the issue that strategy.reduce() inside tf.function may raise exceptions when the values to reduce are from loops or if-clauses.
  • Fix the issue that tf.distribute.MirroredStrategy cannot be used together with tf.distribute.experimental.MultiWorkerMirroredStrategy.
  • Add a tf.distribute.cluster_resolver.TPUClusterResolver.connect API to simplify TPU initialization.
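
A minimal sketch of get_next_as_optional (the strategy and toy dataset are illustrative):

    import tensorflow as tf

    strategy = tf.distribute.MirroredStrategy()
    dist_ds = strategy.experimental_distribute_dataset(
        tf.data.Dataset.range(4).batch(2))

    iterator = iter(dist_ds)
    opt = iterator.get_next_as_optional()
    if opt.has_value():       # no out-of-range error when the data runs out
        batch = opt.get_value()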

tf.keras:

  • Introduces experimental preprocessing layers API (tf.keras.layers.experimental.preprocessing) to handle data preprocessing operations such as categorical feature encoding, text vectorization, data normalization, and data discretization (binning). The newly added layers provide a replacement for the legacy feature column API, and support composite tensor inputs.
  • Added categorical data processing layers:
    • IntegerLookup & StringLookup: build an index of categorical feature values
    • CategoryEncoding: turn integer-encoded categories into one-hot, multi-hot, or tf-idf encoded representations
    • CategoryCrossing: create new categorical features representing co-occurrences of previous categorical feature values
    • Hashing: the hashing trick, for large-vocabulary categorical features
    • Discretization: turn continuous numerical features into categorical features by binning their values
  • Improved image preprocessing layers: CenterCrop, Rescaling
  • Improved image augmentation layers: RandomCrop, RandomFlip, RandomTranslation, RandomRotation, RandomHeight, RandomWidth, RandomZoom, RandomContrast
  • Improved TextVectorization layer, which handles string tokenization, n-gram generation, and token encoding
    • The TextVectorization layer now accounts for the mask_token as part of the vocabulary size when output_mode='int'. This means that, if you have a max_tokens value of 5000, your output will have 5000 unique values (not 5001 as before).
    • Change the return value of TextVectorization.get_vocabulary() from byte to string. Users who previously were calling 'decode' on the output of this method should no longer need to do so.
  • Introduce new Keras dataset generation utilities:
    • image_dataset_from_directory is a utility based on tf.data.Dataset, meant to replace the legacy ImageDataGenerator. It takes you from a structured directory of images to a labeled dataset, in one function call (a minimal sketch follows this list). Note that it doesn't perform image data augmentation (which is meant to be done using preprocessing layers).
    • text_dataset_from_directory takes you from a structured directory of text files to a labeled dataset, in one function call.
    • timeseries_dataset_from_array is a tf.data.Dataset-based replacement of the legacy TimeseriesGenerator. It takes you from an array of timeseries data to a dataset of shifting windows with their targets.
  • Added an experimental_steps_per_execution arg to model.compile to indicate the number of batches to run per tf.function call. This can speed up Keras Models on TPUs by up to 3x.
  • Functional models now get constructed if any tensor in a layer call's arguments/keyword arguments comes from a Keras input. Previously the functional API would only work if all of the elements in the first argument to the layer came from a Keras input.
  • Clean up the BatchNormalization layer's trainable property to act like standard Python state when it's used inside tf.functions (frozen at tracing time), instead of acting like a pseudo-variable whose updates are only sometimes reflected in already-traced tf.function traces.
  • Add the Conv1DTranspose layer.
  • Fix bug in SensitivitySpecificityBase derived metrics.
  • Blacklist Case op from callback
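
A minimal sketch combining the new dataset utility with a preprocessing layer (the "flowers" directory, containing one sub-directory per class, is hypothetical):

    import tensorflow as tf

    train_ds = tf.keras.preprocessing.image_dataset_from_directory(
        "flowers", image_size=(180, 180), batch_size=32)

    # Normalize pixel values inside the input pipeline; augmentation would
    # similarly be done with preprocessing layers, not the dataset utility.
    rescale = tf.keras.layers.experimental.preprocessing.Rescaling(1.0 / 255)
    train_ds = train_ds.map(lambda x, y: (rescale(x), y))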

tf.lite:

  • Converter
    • Restored inference_input_type and inference_output_type flags in the TF 2.x TFLiteConverter (backward compatible with TF 1.x) to support integer (tf.int8, tf.uint8) input and output types in post-training full-integer quantized models (a minimal sketch follows this list).
    • Added support for converting and resizing models with dynamic (placeholder) dimensions. Previously, there was only limited support for dynamic batch size, and even that did not guarantee that the model could be properly resized at runtime.
  • CPU
    • Fix an issue with dynamic weights and Conv2D on x86.
    • Add a runtime Android flag for enabling XNNPACK for optimized CPU performance.
    • Add a runtime iOS flag for enabling XNNPACK for optimized CPU performance.
    • Add a compiler flag to enable building a TFLite library that applies the XNNPACK delegate automatically when the model has an fp32 operation.
  • GPU
    • Allow GPU acceleration starting with internal graph nodes
    • Experimental support for quantized models with the Android GPU delegate
    • Add GPU delegate whitelist.
    • Rename GPU whitelist -> compatibility (list).
    • Improve GPU compatibility list entries from crash reports.
  • NNAPI
    • Set default value for StatefulNnApiDelegate::Options::max_number_delegated_partitions to 3.
    • Add capability to disable NNAPI CPU and check NNAPI Errno.
    • Fix crashes when using NNAPI with target accelerator specified with model containing Conv2d or FullyConnected or LSTM nodes with quantized weights.
    • Fix ANEURALNETWORKS_BAD_DATA execution failures with sum/max/min/reduce operations with scalar inputs.
  • Hexagon
    • Moved the TFLite Hexagon delegate out of experimental.
    • Experimental int8 support for most Hexagon ops.
    • Experimental per-channel quant support for conv in Hexagon delegate.
    • Support dynamic batch size in C++ API.
  • CoreML
    • Open-sourced the CoreML delegate.
  • Misc
    • Enable building Android TFLite targets on Windows
    • Add support for BatchMatMul.
    • Add support for half_pixel_centers with ResizeNearestNeighbor.
    • Add 3D support for BatchToSpaceND.
    • Add 5D support for BroadcastSub, Maximum, Minimum, Transpose and BroadcastDiv.
    • Rename kTfLiteActRelu1 to kTfLiteActReluN1To1.
    • Enable flex delegate on tensorflow.lite.Interpreter Python package.
    • Add Bucketize, SparseCross and BoostedTreesBucketize to the flex whitelist.
    • Add support for selective registration of flex ops.
    • Add missing kernels for flex delegate whitelisted ops.
    • Fix issue when using direct ByteBuffer inputs with graphs that have dynamic shapes.
    • Fix error checking supported operations in a model containing HardSwish.
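
A minimal sketch of the restored integer input/output flags (the SavedModel path, input shape, and calibration generator are illustrative):

    import tensorflow as tf

    def representative_dataset():
        for _ in range(100):
            yield [tf.random.normal([1, 224, 224, 3])]

    converter = tf.lite.TFLiteConverter.from_saved_model("/tmp/saved_model")
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8   # restored flag
    converter.inference_output_type = tf.int8
    tflite_model = converter.convert()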

TPU Enhancements

  • 3D mesh support
  • Added TPU code for FTRL with multiply_linear_by_lr.
  • Silently adds a new file system registry at gstpu.
  • Support restartType in cloud tpu client.
  • Depend on a specific version of google-api-python-client.
  • Fixes apiclient import.

XLA Support

  • Implement stable argmin and argmax

Tracing and Debugging

  • Add a TFE_Py_Execute traceme.

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:
902449@58880@bigcat_chen@ASIC, Abdul Baseer Khan, Abhineet Choudhary, Abolfazl Shahbazi, Adam Hillier, ag.ramesh, Agoniii, Ajay P, Alex Hoffman, Alexander Bayandin, Alexander Grund, Alexandre Abadie, Alexey Rogachevskiy, amoitra, Andrew Stevens, Angus-Luo, Anshuman Tripathy, Anush Elangovan, Artem Mavrin, Ashutosh Hathidara, autoih, Ayushman Kumar, ayushmankumar7, Bairen Yi, Bas Aarts, Bastian Eichenberger, Ben Barsdell, bhack, Bharat Raghunathan, Biagio Montaruli, Bigcat-Himax, blueyi, Bryan Cutler, Byambaa, Carlos Hernandez-Vaquero, Chen Lei, Chris Knorowski, Christian Clauss, chuanqiw, CuiYifeng, Daniel Situnayake, Daria Zhuravleva, Dayananda-V, Deven Desai, Devi Sandeep Endluri, Dmitry Zakharov, Dominic Jack, Duncan Riach, Edgar Liberis, Ehsan Toosi, ekuznetsov139, Elena Zhelezina, Eugene Kuznetsov, Eugene Mikhantiev, Evgenii Zheltonozhskii, Fabio Di Domenico, Fausto Morales, Fei Sun, feihugis, Felix E. Klee, flyingcat, Frederic Bastien, Fredrik Knutsson, frreiss, fsx950223, ganler, Gaurav Singh, Georgios Pinitas, Gian Marco Iodice, Giorgio Arena, Giuseppe Rossini, Gregory Keith, Guozhong Zhuang, gurushantj, Hahn Anselm, Harald Husum, Harjyot Bagga, Hristo Vrigazov, Ilya Persky, Ir1d, Itamar Turner-Trauring, jacco, Jake Tae, Janosh Riebesell, Jason Zaman, jayanth, Jeff Daily, Jens Elofsson, Jinzhe Zeng, JLZ, Jonas Skog, Jonathan Dekhtiar, Josh Meyer, Joshua Chia, Judd, justkw, Kaixi Hou, Kam D Kasravi, Kamil Rakoczy, Karol Gugala, Kayou, Kazuaki Ishizaki, Keith Smiley, Khaled Besrour, Kilaru Yasaswi Sri Chandra Gandhi, Kim, Young Soo, Kristian Hartikainen, Kwabena W. Agyeman, Leslie-Fang, Leslie-Fang-Intel, Li, Guizi, Lukas Geiger, Lutz Roeder, Måns Nilsson, Mahmoud Abuzaina, Manish, Marcel Koester, Marcin Sielski, marload, Martin Jul, Matt Conley, mdfaijul, Meng, Peng, Meteorix, Michael Käufl, Michael137, Milan Straka, Mitchell Vitez, Ml-0, Mokke Meguru, Mshr-H, nammbash, Nathan Luehr, naumkin, Neeraj Bhadani, ngc92, Nick Morgan, nihui, Niranjan Hasabnis, Niranjan Yadla, Nishidha Panpaliya, Oceania2018, oclyke, Ouyang Jin, OverLordGoldDragon, Owen Lyke, Patrick Hemmer, Paul Andrey, Peng Sun, periannath, Phil Pearl, Prashant Dandriyal, Prashant Kumar, Rahul Huilgol, Rajan Singh, Rajeshwar Reddy T, rangjiaheng, Rishit Dagli, Rohan Reddy, rpalakkal, rposts, Ruan Kunliang, Rushabh Vasani, Ryohei Ikegami, Semun Lee, Seo-Inyoung, Sergey Mironov, Sharada Shiddibhavi, ShengYang1, Shraiysh Vaishay, Shunya Ueta, shwetaoj, Siyavash Najafzade, Srinivasan Narayanamoorthy, Stephan Uphoff, storypku, sunchenggen, sunway513, Sven-Hendrik Haase, Swapnil Parekh, Tamas Bela Feher, Teng Lu, tigertang, tomas, Tomohiro Ubukata, tongxuan.ltx, Tony Tonev, Tzu-Wei Huang, Téo Bouvard, Uday Bondhugula, Vaibhav Jade, Vijay Tadikamalla, Vikram Dattu, Vincent Abriou, Vishnuvardhan Janapati, Vo Van Nghia, VoVAllen, Will Battel, William D. Irons, wyzhao, Xiaoming (Jason) Cui, Xiaoquan Kong, Xinan Jiang, xutianming, Yair Ehrenwald, Yasir Modak, Yasuhiro Matsumoto, Yixing Fu, Yong Tang, Yuan Tang, zhaozheng09, Zilin Zhu, zilinzhu, 张志豪

  • v2.1.1 (3ffdb91)

@mihaimaruseac released this May 18, 2020 · 82 commits to r2.1 since this release

Release 2.1.1

Bug Fixes and Other Changes

  • v1.15.3 (4386a66)

@mihaimaruseac released this May 18, 2020 · 68 commits to r1.15 since this release

Release 1.15.3

Bug Fixes and Other Changes

  • v2.0.2 (2c2fdd3)

@mihaimaruseac released this May 18, 2020 · 52 commits to r2.0 since this release

Release 2.0.2

Bug Fixes and Other Changes


@tensorflow-jenkins released this May 6, 2020 · 57 commits to r2.2 since this release

Release 2.2.0

TensorFlow 2.2 discontinues support for Python 2, as previously announced, following Python 2's EOL on January 1, 2020.

Coinciding with this change, new releases of TensorFlow's Docker images provide Python 3 exclusively. Because all images now use Python 3, Docker tags containing -py3 will no longer be provided and existing -py3 tags like latest-py3 will not be updated.

Major Features and Improvements

  • Replaced the scalar type for string tensors from std::string to tensorflow::tstring which is now ABI stable.

  • A new Profiler for TF 2 that covers CPU, GPU, and TPU. It offers both device and host performance analysis, including input pipeline and TF Ops. Optimization advisory is provided whenever possible. Please see this tutorial and guide for usage guidelines.

  • Export C++ functions to Python using pybind11 as opposed to SWIG, as part of our effort to deprecate SWIG.

  • tf.distribute:

    • Support added for global sync BatchNormalization by using the newly added tf.keras.layers.experimental.SyncBatchNormalization layer. This layer will sync BatchNormalization statistics every step across all replicas taking part in sync training.
    • Performance improvements for GPU multi-worker distributed training using tf.distribute.experimental.MultiWorkerMirroredStrategy
      • Update NVIDIA NCCL to 2.5.7-1 for better performance and performance tuning. Please see nccl developer guide for more information on this.
      • Support gradient allreduce in float16. See this example usage.
      • Experimental support for all-reduce gradient packing to allow overlapping gradient aggregation with backward-path computation.
      • Deprecated experimental_run_v2 method for distribution strategies and renamed the method run as it is no longer experimental.
      • Add CompositeTensor support for DistributedIterators. This should help prevent unnecessary function retracing and memory leaks.
  • tf.keras:

    • Model.fit major improvements:
      • You can now use custom training logic with Model.fit by overriding Model.train_step (a minimal sketch follows this list).
      • Easily write state-of-the-art training loops without worrying about all of the features Model.fit handles for you (distribution strategies, callbacks, data formats, looping logic, etc.).
      • See the default Model.train_step for an example of what this function should look like. Same applies for validation and inference via Model.test_step and Model.predict_step.
      • SavedModel uses its own Model._saved_model_inputs_spec attr now instead of
        relying on Model.inputs and Model.input_names, which are no longer set for subclass Models.
        This attr is set in eager, tf.function, and graph modes. This gets rid of the need for users to
        manually call Model._set_inputs when using Custom Training Loops (CTLs).
      • Dynamic shapes are supported for generators by calling the Model on the first batch we "peek" from the generator.
        This used to happen implicitly in Model._standardize_user_data. Long-term, a solution where the
        DataAdapter doesn't need to call the Model is probably preferable.
    • The SavedModel format now supports all Keras built-in layers (including metrics, preprocessing layers, and stateful RNN layers)
    • Update Keras batch normalization layer to use the running mean and average computation in the fused_batch_norm. You should see significant performance improvements when using fused_batch_norm in Eager mode.
  • tf.lite:

    • Enable TFLite experimental new converter by default.
  • XLA

    • XLA now builds and works on Windows. All prebuilt packages come with XLA available.
    • XLA can be enabled for a tf.function with “compile or throw exception” semantics on CPU and GPU.
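
A minimal sketch of overriding Model.train_step (the loss/metric handling shown relies on the compiled loss and metrics):

    import tensorflow as tf

    class CustomModel(tf.keras.Model):
        def train_step(self, data):
            x, y = data
            with tf.GradientTape() as tape:
                y_pred = self(x, training=True)
                loss = self.compiled_loss(y, y_pred)
            grads = tape.gradient(loss, self.trainable_variables)
            self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
            self.compiled_metrics.update_state(y, y_pred)
            return {m.name: m.result() for m in self.metrics}

Everything else (callbacks, distribution strategies, data adapters) is still handled by Model.fit.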

Breaking Changes

  • tf.keras:
    • In tf.keras.applications the name of the "top" layer has been standardized to "predictions". This is only a problem if your code relies on the exact name of the layer.
    • Huber loss function has been updated to be consistent with other Keras losses. It now computes mean over the last axis of per-sample losses before applying the reduction function.
  • AutoGraph no longer converts functions passed to tf.py_function, tf.py_func and tf.numpy_function.
  • Deprecating XLA_CPU and XLA_GPU devices with this release.
  • Increasing the minimum bazel version to build TF to 2.0.0 to use Bazel's cc_experimental_shared_library.
  • Keras compile/fit behavior for functional and subclassed models has been unified. Model properties such as metrics and metrics_names will now be available only after training/evaluating the model on actual data for functional models. metrics will now include model loss and output losses. The loss_functions property has been removed from the model; this was an undocumented property that was accidentally public and has now been removed.

Known Caveats

  • The current TensorFlow release now requires gast version 0.3.3.

Bug Fixes and Other Changes

  • tf.data:
    • Removed autotune_algorithm from experimental optimization options.
  • TF Core:
    • tf.constant always creates CPU tensors irrespective of the current device context.
    • Eager TensorHandles maintain a list of mirrors for any copies to local or remote devices. This avoids any redundant copies due to op execution.
    • For tf.Tensor & tf.Variable, .experimental_ref() is no longer experimental and is available as simply .ref().
    • pfor/vectorized_map: Added support for vectorizing 56 more ops. Vectorizing tf.cond is also supported now.
    • Set as much partial shape as we can infer statically within the gradient impl of the gather op.
    • Gradient of tf.while_loop emits StatelessWhile op if cond and body functions are stateless. This allows multiple gradients while ops to run in parallel under distribution strategy.
    • Speed up GradientTape in eager mode by auto-generating list of op inputs/outputs which are unused and hence not cached for gradient functions.
    • Support back_prop=False in while_v2 but mark it as deprecated.
    • Improve error message when attempting to use None in data-dependent control flow.
    • Add RaggedTensor.numpy().
    • Update RaggedTensor.__getitem__ to preserve uniform dimensions & allow indexing into uniform dimensions.
    • Update tf.expand_dims to always insert the new dimension as a non-ragged dimension.
    • Update tf.embedding_lookup to use partition_strategy and max_norm when ids is ragged.
    • Allow batch_dims==rank(indices) in tf.gather.
    • Add support for bfloat16 in tf.print.
  • tf.distribute:
    • Support embedding_column with variable-length input features for MultiWorkerMirroredStrategy.
  • tf.keras:
    • Added experimental_aggregate_gradients argument to tf.keras.optimizer.Optimizer.apply_gradients. This allows custom gradient aggregation and processing aggregated gradients in custom training loop.
    • Allow pathlib.Path paths for loading models via Keras API.
  • tf.function/AutoGraph:
    • AutoGraph is now available in ReplicaContext.merge_call, Strategy.extended.update and Strategy.extended.update_non_slot.
    • Experimental support for shape invariants has been enabled in tf.function. See the API docs for tf.autograph.experimental.set_loop_options for additional info.
    • AutoGraph error messages now exclude frames corresponding to APIs internal to AutoGraph.
    • Improve shape inference for tf.function input arguments to unlock more Grappler optimizations in TensorFlow 2.x.
    • Improve automatic control dependency management of resources by allowing resource reads to occur in parallel and synchronizing only on writes.
    • Fix execution order of multiple stateful calls to experimental_run_v2 in tf.function.
    • You can now iterate over RaggedTensors using a for loop inside tf.function (a minimal sketch follows this list).
  • tf.lite:
    • Migrated the tf.lite C inference API out of experimental into lite/c.
    • Add an option to disallow NNAPI CPU / partial acceleration on Android 10
    • TFLite Android AARs now include the C headers and APIs required to use TFLite from native code.
    • Refactors the delegate and delegate kernel sources to allow usage in the linter.
    • Limit delegated ops to actually supported ones if a device name is specified or NNAPI CPU Fallback is disabled.
    • TFLite now supports the tf.math.reciprocal op by lowering it to a tf.div op.
    • TFLite's unpack op now supports boolean tensor inputs.
    • Microcontroller and embedded code moved from experimental to main TensorFlow Lite folder
    • Check for large TFLite tensors.
    • Fix GPU delegate crash with C++17.
    • Add 5D support to TFLite strided_slice.
    • Fix error in delegation of DEPTH_TO_SPACE to NNAPI causing op not to be accelerated.
    • Fix segmentation fault when running a model with LSTM nodes using NNAPI Delegate
    • Fix NNAPI delegate failure when an operand for Maximum/Minimum operation is a scalar.
    • Fix NNAPI delegate failure when Axis input for reduce operation is a scalar.
    • Expose option to limit the number of partitions that will be delegated to NNAPI.
    • If a target accelerator is specified, use its feature level to determine operations to delegate instead of SDK version.
  • tf.random:
    • Various random number generation improvements:
      • Add a fast path for default random_uniform
      • random_seed documentation improvement.
      • RandomBinomial broadcasts and appends the sample shape to the left rather than the right.
    • Added tf.random.stateless_binomial, tf.random.stateless_gamma, tf.random.stateless_poisson
    • tf.random.stateless_uniform now supports unbounded sampling of int types.
  • Math and Linear Algebra:
    • Add tf.linalg.LinearOperatorTridiag.
    • Add LinearOperatorBlockLowerTriangular
    • Add broadcasting support to tf.linalg.triangular_solve (#26204) and tf.math.invert_permutation.
    • Add tf.math.sobol_sample op.
    • Add tf.math.xlog1py.
    • Add tf.math.special.{dawsn,expi,fresnel_cos,fresnel_sin,spence}.
    • Add a Modified Discrete Cosine Transform (MDCT) and its inverse to tf.signal.
  • TPU Enhancements:
    • Refactor TpuClusterResolver to move shared logic to a separate pip package.
    • Support configuring TPU software version from cloud tpu client.
    • Allowed TPU embedding weight decay factor to be multiplied by learning rate.
  • XLA Support:
    • Add standalone XLA AOT runtime target + relevant .cc sources to pip package.
    • Add check for memory alignment to MemoryAllocation::MemoryAllocation() on 32-bit ARM. This ensures a deterministic early exit instead of a hard to debug bus error later.
    • saved_model_cli aot_compile_cpu allows you to compile saved models to XLA header+object files and include them in your C++ programs.
    • Enable Igamma, Igammac for XLA.
  • Deterministic Op Functionality:
    • The XLA reduction emitter is deterministic when the environment variable TF_DETERMINISTIC_OPS is set to "true" or "1". This extends deterministic tf.nn.bias_add back-prop functionality (and therefore also deterministic back-prop of bias-addition in Keras layers) to include when XLA JIT compilation is enabled (a minimal sketch follows this list).
    • Fix a problem, when running on a CUDA GPU and when either environment variable TF_DETERMINISTIC_OPS or TF_CUDNN_DETERMINISTIC is set to "true" or "1", in which some layer configurations led to an exception with the message "No algorithm worked!"
  • Tracing and Debugging:
    • Add source, destination name to _send traceme to allow easier debugging.
    • Add traceme event to fastpathexecute.
  • Other:
    • Fix an issue with AUC.reset_states for multi-label AUC #35852
    • Fix the TF upgrade script to not delete files when there is a parsing error and the output mode is in-place.
    • Move tensorflow/core:framework/*_pyclif rules to tensorflow/core/framework:*_pyclif.
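
Two minimal sketches for items above. First, iterating over a RaggedTensor inside tf.function (toy values):

    import tensorflow as tf

    @tf.function
    def ragged_total(rt):
        total = tf.constant(0, dtype=rt.dtype)
        for row in rt:        # iteration inside tf.function is now supported
            total += tf.reduce_sum(row)
        return total

    print(ragged_total(tf.ragged.constant([[1, 2], [3, 4, 5]])))  # 15

Second, opting in to the deterministic-op functionality via the environment variable, which must be set before TensorFlow executes any ops:

    import os
    os.environ["TF_DETERMINISTIC_OPS"] = "1"

    import tensorflow as tf  # subsequent XLA reductions and bias-add back-prop run deterministically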

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

372046933, 8bitmp3, aaronhma, Abin Shahab, Aditya Patwardhan, Agoniii, Ahti Kitsik, Alan Yee, Albin Joy, Alex Hoffman, Alexander Grund, Alexandre E. Eichenberger, Amit Kumar Jaiswal, amoitra, Andrew Anderson, Angus-Luo, Anthony Barbier, Anton Kachatkou, Anuj Rawat, archis, Arpan-Dhatt, Arvind Sundararajan, Ashutosh Hathidara, autoih, Bairen Yi, Balint Cristian, Bas Aarts, BashirSbaiti, Basit Ayantunde, Ben Barsdell, Benjamin Gaillard, boron, Brett Koonce, Bryan Cutler, Christian Goll, Christian Sachs, Clayne Robison, comet, Daniel Falbel, Daria Zhuravleva, darsh8200, David Truby, Dayananda-V, deepakm, Denis Khalikov, Devansh Singh, Dheeraj R Reddy, Diederik Van Liere, Diego Caballero, Dominic Jack, dothinking, Douman, Drake Gens, Duncan Riach, Ehsan Toosi, ekuznetsov139, Elena Zhelezina, elzino, Ending2015a, Eric Schweitz, Erik Zettel, Ethan Saadia, Eugene Kuznetsov, Evgeniy Zheltonozhskiy, Ewout Ter Hoeven, exfalso, FAIJUL, Fangjun Kuang, Fei Hu, Frank Laub, Frederic Bastien, Fredrik Knutsson, frreiss, Frédéric Rechtenstein, fsx950223, Gaurav Singh, gbaned, George Grzegorz Pawelczak, George Sterpu, Gian Marco Iodice, Giorgio Arena, Hans Gaiser, Hans Pabst, Haoyu Wu, Harry Slatyer, hsahovic, Hugo, Hugo Sjöberg, IrinaM21, jacco, Jake Tae, Jean-Denis Lesage, Jean-Michel Gorius, Jeff Daily, Jens Elofsson, Jerry Shih, jerryyin, Jin Mingjian, Jinjing Zhou, JKIsaacLee, jojimonv, Jonathan Dekhtiar, Jose Ignacio Gomez, Joseph-Rance, Judd, Julian Gross, Kaixi Hou, Kaustubh Maske Patil, Keunwoo Choi, Kevin Hanselman, Khor Chean Wei, Kilaru Yasaswi Sri Chandra Gandhi, Koan-Sin Tan, Koki Ibukuro, Kristian Holsheimer, kurileo, Lakshay Tokas, Lee Netherton, leike666666, Leslie-Fang-Intel, Li, Guizi, LIUJIAN435, Lukas Geiger, Lyo Nguyen, madisetti, Maher Jendoubi, Mahmoud Abuzaina, Manuel Freiberger, Marcel Koester, Marco Jacopo Ferrarotti, Markus Franke, marload, Mbah-Javis, mbhuiyan, Meng Zhang, Michael Liao, MichaelKonobeev, Michal Tarnowski, Milan Straka, minoring, Mohamed Nour Abouelseoud, MoussaMM, Mrinal Jain, mrTsjolder, Måns Nilsson, Namrata Bhave, Nicholas Gao, Niels Ole Salscheider, nikochiko, Niranjan Hasabnis, Nishidha Panpaliya, nmostafa, Noah Trenaman, nuka137, Officium, Owen L - Sfe, Pallavi G, Paul Andrey, Peng Sun, Peng Wu, Phil Pearl, PhilipMay, pingsutw, Pooya Davoodi, PragmaTwice, pshiko, Qwerty71, R Gomathi, Rahul Huilgol, Richard Xiao, Rick Wierenga, Roberto Rosmaninho, ruchit2801, Rushabh Vasani, Sami, Sana Damani, Sarvesh Dubey, Sasan Jafarnejad, Sergii Khomenko, Shane Smiskol, Shaochen Shi, sharkdtu, Shawn Presser, ShengYang1, Shreyash Patodia, Shyam Sundar Dhanabalan, Siju Samuel, Somyajit Chakraborty Sam, Srihari Humbarwadi, srinivasan.narayanamoorthy, Srishti Yadav, Steph-En-M, Stephan Uphoff, Stephen Mugisha, SumanSudhir, Taehun Kim, Tamas Bela Feher, TengLu, Tetragramm, Thierry Herrmann, Tian Jin, tigertang, Tom Carchrae, Tom Forbes, Trent Lo, Victor Peng, vijayphoenix, Vincent Abriou, Vishal Bhola, Vishnuvardhan Janapati, vladbataev, VoVAllen, Wallyss Lima, Wen-Heng (Jack) Chung, wenxizhu, William D. Irons, William Zhang, Xiaoming (Jason) Cui, Xiaoquan Kong, Xinan Jiang, Yasir Modak, Yasuhiro Matsumoto, Yaxun (Sam) Liu, Yong Tang, Ytyt-Yt, yuan, Yuan Mingshuai, Yuan Tang, Yuki Ueda, Yusup, zhangshijin, zhuwenxi


@tensorflow-jenkins released this Apr 30, 2020 · 65 commits to r2.2 since this release

Release 2.2.0

TensorFlow 2.2 discontinues support for Python 2, as previously announced, following Python 2's EOL on January 1, 2020.

Coinciding with this change, new releases of TensorFlow's Docker images provide Python 3 exclusively. Because all images now use Python 3, Docker tags containing -py3 will no longer be provided and existing -py3 tags like latest-py3 will not be updated.

Major Features and Improvements

  • Replaced the scalar type for string tensors from std::string to tensorflow::tstring which is now ABI stable.

  • A new Profiler for TF 2 that covers CPU, GPU, and TPU. It offers both device and host performance analysis, including input pipeline and TF Ops. Optimization advisory is provided whenever possible. Please see this tutorial and guide for usage guidelines.

  • Export C++ functions to Python using pybind11 as opposed to SWIG, as part of our effort to deprecate SWIG.

  • tf.distribute:

    • Support added for global sync BatchNormalization by using the newly added tf.keras.layers.experimental.SyncBatchNormalization layer. This layer will sync BatchNormalization statistics every step across all replicas taking part in sync training.
    • Performance improvements for GPU multi-worker distributed training using tf.distribute.experimental.MultiWorkerMirroredStrategy
      • Update NVIDIA NCCL to 2.5.7-1 for better performance and performance tuning. Please see nccl developer guide for more information on this.
      • Support gradient allreduce in float16. See this example usage.
      • Experimental support for all-reduce gradient packing to allow overlapping gradient aggregation with backward-path computation.
      • Deprecated experimental_run_v2 method for distribution strategies and renamed the method run as it is no longer experimental.
      • Add CompositeTensor support for DistributedIterators. This should help prevent unnecessary function retracing and memory leaks.
  • tf.keras:

    • Model.fit major improvements:
      • You can now use custom training logic with Model.fit by overriding Model.train_step.
      • Easily write state-of-the-art training loops without worrying about all of the features Model.fit handles for you (distribution strategies, callbacks, data formats, looping logic, etc)
      • See the default Model.train_step for an example of what this function should look like. Same applies for validation and inference via Model.test_step and Model.predict_step.
      • SavedModel uses its own Model._saved_model_inputs_spec attr now instead of
        relying on Model.inputs and Model.input_names, which are no longer set for subclass Models.
        This attr is set in eager, tf.function, and graph modes. This gets rid of the need for users to
        manually call Model._set_inputs when using Custom Training Loops (CTLs).
      • Dynamic shapes are supported for generators by calling the Model on the first batch we "peek" from the generator.
        This used to happen implicitly in Model._standardize_user_data. Long-term, a solution where the
        DataAdapter doesn't need to call the Model is probably preferable.
    • The SavedModel format now supports all Keras built-in layers (including metrics, preprocessing layers, and stateful RNN layers)
    • Update Keras batch normalization layer to use the running mean and average computation in the fused_batch_norm. You should see significant performance improvements when using fused_batch_norm in Eager mode.
  • tf.lite:

    • Enable TFLite experimental new converter by default.
  • XLA

    • XLA now builds and works on Windows. All prebuilt packages come with XLA available.
    • XLA can be enabled for a tf.function with “compile or throw exception” semantics on CPU and GPU.

Breaking Changes

  • tf.keras:
    • In tf.keras.applications the name of the "top" layer has been standardized to "predictions". This is only a problem if your code relies on the exact name of the layer.
    • Huber loss function has been updated to be consistent with other Keras losses. It now computes mean over the last axis of per-sample losses before applying the reduction function.
  • AutoGraph no longer converts functions passed to tf.py_function, tf.py_func and tf.numpy_function.
  • Deprecating XLA_CPU and XLA_GPU devices with this release.
  • Increasing the minimum bazel version to build TF to 2.0.0 to use Bazel's cc_experimental_shared_library.
  • Keras compile/fit behavior for functional and subclassed models has been unified. Model properties such as metrics and metrics_names will now be available only after training/evaluating the model on actual data for functional models. metrics will now include model loss and output losses. The loss_functions property has been removed from the model; this was an undocumented property that was accidentally public and has now been removed.

Known Caveats

  • The current TensorFlow release now requires gast version 0.3.3.
  • There is a known issue that might surface with CompositeTensor on TPU pods. As a temporary workaround you can set _enable_legacy_iterators to True.

Bug Fixes and Other Changes

  • tf.data:
    • Removed autotune_algorithm from experimental optimization options.
  • TF Core:
    • tf.constant always creates CPU tensors irrespective of the current device context.
    • Eager TensorHandles maintain a list of mirrors for any copies to local or remote devices. This avoids any redundant copies due to op execution.
    • For tf.Tensor & tf.Variable, .experimental_ref() is no longer experimental and is available as simply .ref().
    • pfor/vectorized_map: Added support for vectorizing 56 more ops. Vectorizing tf.cond is also supported now.
    • Set as much partial shape as we can infer statically within the gradient impl of the gather op.
    • Gradient of tf.while_loop emits StatelessWhile op if cond and body functions are stateless. This allows multiple gradients while ops to run in parallel under distribution strategy.
    • Speed up GradientTape in eager mode by auto-generating list of op inputs/outputs which are unused and hence not cached for gradient functions.
    • Support back_prop=False in while_v2 but mark it as deprecated.
    • Improve error message when attempting to use None in data-dependent control flow.
    • Add RaggedTensor.numpy().
    • Update RaggedTensor.__getitem__ to preserve uniform dimensions & allow indexing into uniform dimensions.
    • Update tf.expand_dims to always insert the new dimension as a non-ragged dimension.
    • Update tf.embedding_lookup to use partition_strategy and max_norm when ids is ragged.
    • Allow batch_dims==rank(indices) in tf.gather.
    • Add support for bfloat16 in tf.print.
  • tf.distribute:
    • Support embedding_column with variable-length input features for MultiWorkerMirroredStrategy.
  • tf.keras:
    • Added experimental_aggregate_gradients argument to tf.keras.optimizer.Optimizer.apply_gradients. This allows custom gradient aggregation and processing aggregated gradients in custom training loop.
    • Allow pathlib.Path paths for loading models via Keras API.
  • tf.function/AutoGraph:
    • AutoGraph is now available in ReplicaContext.merge_call, Strategy.extended.update and Strategy.extended.update_non_slot.
    • Experimental support for shape invariants has been enabled in tf.function. See the API docs for tf.autograph.experimental.set_loop_options for additional info.
    • AutoGraph error messages now exclude frames corresponding to APIs internal to AutoGraph.
    • Improve shape inference for tf.function input arguments to unlock more Grappler optimizations in TensorFlow 2.x.
    • Improve automatic control dependency management of resources by allowing resource reads to occur in parallel and synchronizing only on writes.
    • Fix execution order of multiple stateful calls to experimental_run_v2 in tf.function.
    • You can now iterate over RaggedTensors using a for loop inside tf.function.
  • tf.lite:
    • Migrated the tf.lite C inference API out of experimental into lite/c.
    • Add an option to disallow NNAPI CPU / partial acceleration on Android 10
    • TFLite Android AARs now include the C headers and APIs required to use TFLite from native code.
    • Refactors the delegate and delegate kernel sources to allow usage in the linter.
    • Limit delegated ops to actually supported ones if a device name is specified or NNAPI CPU Fallback is disabled.
    • TFLite now supports the tf.math.reciprocal op by lowering it to a tf.div op.
    • TFLite's unpack op now supports boolean tensor inputs.
    • Microcontroller and embedded code moved from experimental to main TensorFlow Lite folder
    • Check for large TFLite tensors.
    • Fix GPU delegate crash with C++17.
    • Add 5D support to TFLite strided_slice.
    • Fix error in delegation of DEPTH_TO_SPACE to NNAPI causing op not to be accelerated.
    • Fix segmentation fault when running a model with LSTM nodes using NNAPI Delegate
    • Fix NNAPI delegate failure when an operand for Maximum/Minimum operation is a scalar.
    • Fix NNAPI delegate failure when Axis input for reduce operation is a scalar.
    • Expose option to limit the number of partitions that will be delegated to NNAPI.
    • If a target accelerator is specified, use its feature level to determine operations to delegate instead of SDK version.
  • tf.random:
    • Various random number generation improvements:
      • Add a fast path for default random_uniform
      • random_seed documentation improvement.
      • RandomBinomial broadcasts and appends the sample shape to the left rather than the right.
    • Added tf.random.stateless_binomial, tf.random.stateless_gamma, tf.random.stateless_poisson
    • tf.random.stateless_uniform now supports unbounded sampling of int types.
  • Math and Linear Algebra:
    • Add tf.linalg.LinearOperatorTridiag.
    • Add LinearOperatorBlockLowerTriangular
    • Add broadcasting support to tf.linalg.triangular_solve (#26204) and tf.math.invert_permutation.
    • Add tf.math.sobol_sample op.
    • Add tf.math.xlog1py.
    • Add tf.math.special.{dawsn,expi,fresnel_cos,fresnel_sin,spence}.
    • Add a Modified Discrete Cosine Transform (MDCT) and its inverse to tf.signal.
  • TPU Enhancements:
    • Refactor TpuClusterResolver to move shared logic to a separate pip package.
    • Support configuring TPU software version from cloud tpu client.
    • Allowed TPU embedding weight decay factor to be multiplied by learning rate.
  • XLA Support:
    • Add standalone XLA AOT runtime target + relevant .cc sources to pip package.
    • Add check for memory alignment to MemoryAllocation::MemoryAllocation() on 32-bit ARM. This ensures a deterministic early exit instead of a hard to debug bus error later.
    • saved_model_cli aot_compile_cpu allows you to compile saved models to XLA header+object files and include them in your C++ programs.
    • Enable Igamma, Igammac for XLA.
  • Deterministic Op Functionality:
    • The XLA reduction emitter is deterministic when the environment variable TF_DETERMINISTIC_OPS is set to "true" or "1". This extends deterministic tf.nn.bias_add back-prop functionality (and therefore also deterministic back-prop of bias-addition in Keras layers) to include when XLA JIT compilation is enabled.
    • Fix a problem, when running on a CUDA GPU and when either environment variable TF_DETERMINISTIC_OPS or TF_CUDNN_DETERMINISTIC is set to "true" or "1", in which some layer configurations led to an exception with the message "No algorithm worked!"
  • Tracing and Debugging:
    • Add source, destination name to _send traceme to allow easier debugging.
    • Add traceme event to fastpathexecute.
  • Other:
    • Fix an issue with AUC.reset_states for multi-label AUC #35852
    • Fix the TF upgrade script to not delete files when there is a parsing error and the output mode is in-place.
    • Move tensorflow/core:framework/*_pyclif rules to tensorflow/core/framework:*_pyclif.

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

372046933, 8bitmp3, aaronhma, Abin Shahab, Aditya Patwardhan, Agoniii, Ahti Kitsik, Alan Yee, Albin Joy, Alex Hoffman, Alexander Grund, Alexandre E. Eichenberger, Amit Kumar Jaiswal, amoitra, Andrew Anderson, Angus-Luo, Anthony Barbier, Anton Kachatkou, Anuj Rawat, archis, Arpan-Dhatt, Arvind Sundararajan, Ashutosh Hathidara, autoih, Bairen Yi, Balint Cristian, Bas Aarts, BashirSbaiti, Basit Ayantunde, Ben Barsdell, Benjamin Gaillard, boron, Brett Koonce, Bryan Cutler, Christian Goll, Christian Sachs, Clayne Robison, comet, Daniel Falbel, Daria Zhuravleva, darsh8200, David Truby, Dayananda-V, deepakm, Denis Khalikov, Devansh Singh, Dheeraj R Reddy, Diederik Van Liere, Diego Caballero, Dominic Jack, dothinking, Douman, Drake Gens, Duncan Riach, Ehsan Toosi, ekuznetsov139, Elena Zhelezina, elzino, Ending2015a, Eric Schweitz, Erik Zettel, Ethan Saadia, Eugene Kuznetsov, Evgeniy Zheltonozhskiy, Ewout Ter Hoeven, exfalso, FAIJUL, Fangjun Kuang, Fei Hu, Frank Laub, Frederic Bastien, Fredrik Knutsson, frreiss, Frédéric Rechtenstein, fsx950223, Gaurav Singh, gbaned, George Grzegorz Pawelczak, George Sterpu, Gian Marco Iodice, Giorgio Arena, Hans Gaiser, Hans Pabst, Haoyu Wu, Harry Slatyer, hsahovic, Hugo, Hugo Sjöberg, IrinaM21, jacco, Jake Tae, Jean-Denis Lesage, Jean-Michel Gorius, Jeff Daily, Jens Elofsson, Jerry Shih, jerryyin, Jin Mingjian, Jinjing Zhou, JKIsaacLee, jojimonv, Jonathan Dekhtiar, Jose Ignacio Gomez, Joseph-Rance, Judd, Julian Gross, Kaixi Hou, Kaustubh Maske Patil, Keunwoo Choi, Kevin Hanselman, Khor Chean Wei, Kilaru Yasaswi Sri Chandra Gandhi, Koan-Sin Tan, Koki Ibukuro, Kristian Holsheimer, kurileo, Lakshay Tokas, Lee Netherton, leike666666, Leslie-Fang-Intel, Li, Guizi, LIUJIAN435, Lukas Geiger, Lyo Nguyen, madisetti, Maher Jendoubi, Mahmoud Abuzaina, Manuel Freiberger, Marcel Koester, Marco Jacopo Ferrarotti, Markus Franke, marload, Mbah-Javis, mbhuiyan, Meng Zhang, Michael Liao, MichaelKonobeev, Michal Tarnowski, Milan Straka, minoring, Mohamed Nour Abouelseoud, MoussaMM, Mrinal Jain, mrTsjolder, Måns Nilsson, Namrata Bhave, Nicholas Gao, Niels Ole Salscheider, nikochiko, Niranjan Hasabnis, Nishidha Panpaliya, nmostafa, Noah Trenaman, nuka137, Officium, Owen L - Sfe, Pallavi G, Paul Andrey, Peng Sun, Peng Wu, Phil Pearl, PhilipMay, pingsutw, Pooya Davoodi, PragmaTwice, pshiko, Qwerty71, R Gomathi, Rahul Huilgol, Richard Xiao, Rick Wierenga, Roberto Rosmaninho, ruchit2801, Rushabh Vasani, Sami, Sana Damani, Sarvesh Dubey, Sasan Jafarnejad, Sergii Khomenko, Shane Smiskol, Shaochen Shi, sharkdtu, Shawn Presser, ShengYang1, Shreyash Patodia, Shyam Sundar Dhanabalan, Siju Samuel, Somyajit Chakraborty Sam, Srihari Humbarwadi, srinivasan.narayanamoorthy, Srishti Yadav, Steph-En-M, Stephan Uphoff, Stephen Mugisha, SumanSudhir, Taehun Kim, Tamas Bela Feher, TengLu, Tetragramm, Thierry Herrmann, Tian Jin, tigertang, Tom Carchrae, Tom Forbes, Trent Lo, Victor Peng, vijayphoenix, Vincent Abriou, Vishal Bhola, Vishnuvardhan Janapati, vladbataev, VoVAllen, Wallyss Lima, Wen-Heng (Jack) Chung, wenxizhu, William D. Irons, William Zhang, Xiaoming (Jason) Cui, Xiaoquan Kong, Xinan Jiang, Yasir Modak, Yasuhiro Matsumoto, Yaxun (Sam) Liu, Yong Tang, Ytyt-Yt, yuan, Yuan Mingshuai, Yuan Tang, Yuki Ueda, Yusup, zhangshijin, zhuwenxi


@tensorflow-jenkins released this Apr 14, 2020 · 98 commits to r2.2 since this release

Release 2.2.0

Major Features and Improvements

  • Replaced the scalar type for string tensors from std::string to tensorflow::tstring which is now ABI stable.

  • A new Profiler for TF 2 that covers CPU, GPU, and TPU. It offers both device and host performance analysis, including input pipeline and TF Ops. Optimization advisory is provided whenever possible. Please see this tutorial and guide for usage guidelines.

  • Export C++ functions to Python using pybind11 as opposed to SWIG, as part of our effort to deprecate SWIG.

  • tf.distribute:

    • Support added for global sync BatchNormalization by using the newly added tf.keras.layers.experimental.SyncBatchNormalization layer. This layer will sync BatchNormalization statistics every step across all replicas taking part in sync training.
    • Performance improvements for GPU multi-worker distributed training using tf.distribute.experimental.MultiWorkerMirroredStrategy
      • Update NVIDIA NCCL to 2.5.7-1 for better performance and performance tuning. Please see nccl developer guide for more information on this.
      • Support gradient allreduce in float16. See this example usage.
      • Experimental support for all-reduce gradient packing to allow overlapping gradient aggregation with backward-path computation.
      • Deprecated experimental_run_v2 method for distribution strategies and renamed the method run as it is no longer experimental.
  • tf.keras:

    • Model.fit major improvements:
      • You can now use custom training logic with Model.fit by overriding Model.train_step.
      • Easily write state-of-the-art training loops without worrying about all of the features Model.fit handles for you (distribution strategies, callbacks, data formats, looping logic, etc)
      • See the default Model.train_step for an example of what this function should look like. Same applies for validation and inference via Model.test_step and Model.predict_step.
      • SavedModel uses its own Model._saved_model_inputs_spec attr now instead of
        relying on Model.inputs and Model.input_names, which are no longer set for subclass Models.
        This attr is set in eager, tf.function, and graph modes. This gets rid of the need for users to
        manually call Model._set_inputs when using Custom Training Loops (CTLs).
      • Dynamic shapes are supported for generators by calling the Model on the first batch we "peek" from the generator.
        This used to happen implicitly in Model._standardize_user_data. Long-term, a solution where the
        DataAdapter doesn't need to call the Model is probably preferable.
    • The SavedModel format now supports all Keras built-in layers (including metrics, preprocessing layers, and stateful RNN layers)
    • Update Keras batch normalization layer to use the running mean and average computation in the fused_batch_norm. You should see significant performance improvements when using fused_batch_norm in Eager mode.
  • tf.lite:

    • Enable TFLite experimental new converter by default.
  • XLA

    • XLA now builds and works on Windows. All prebuilt packages come with XLA available.
    • XLA can be enabled for a tf.function with “compile or throw exception” semantics on CPU and GPU.

Breaking Changes

  • tf.keras:
    • In tf.keras.applications the name of the "top" layer has been standardized to "predictions". This is only a problem if your code relies on the exact name of the layer.
    • Huber loss function has been updated to be consistent with other Keras losses. It now computes mean over the last axis of per-sample losses before applying the reduction function.
  • AutoGraph no longer converts functions passed to tf.py_function, tf.py_func and tf.numpy_function.
  • Deprecating XLA_CPU and XLA_GPU devices with this release.
  • Increasing the minimum bazel version to build TF to 2.0.0 to use Bazel's cc_experimental_shared_library.
  • Keras compile/fit behavior for functional and subclassed models has been unified. Model properties such as metrics and metrics_names will now be available only after training/evaluating the model on actual data for functional models. metrics will now include model loss and output losses. The loss_functions property has been removed from the model; this was an undocumented property that was accidentally public and has now been removed.

Known Caveats

  • Due to certain unforeseen circumstances, we are unable to release macOS py3.8 binaries, but Windows/Linux binaries for py3.8 are available.
  • The current TensorFlow release now requires gast version 0.3.3.

Bug Fixes and Other Changes

  • tf.data:
    • Removed autotune_algorithm from experimental optimization options.
  • TF Core:
    • tf.constant always creates CPU tensors irrespective of the current device context.
    • Eager TensorHandles maintain a list of mirrors for any copies to local or remote devices. This avoids any redundant copies due to op execution.
    • For tf.Tensor & tf.Variable, .experimental_ref() is no longer experimental and is available as simply .ref().
    • pfor/vectorized_map: Added support for vectorizing 56 more ops. Vectorizing tf.cond is also supported now.
    • Set as much partial shape as we can infer statically within the gradient impl of the gather op.
    • Gradient of tf.while_loop emits StatelessWhile op if cond and body functions are stateless. This allows multiple gradients while ops to run in parallel under distribution strategy.
    • Speed up GradientTape in eager mode by auto-generating list of op inputs/outputs which are unused and hence not cached for gradient functions.
    • Support back_prop=False in while_v2 but mark it as deprecated.
    • Improve error message when attempting to use None in data-dependent control flow.
    • Add RaggedTensor.numpy().
    • Update RaggedTensor.__getitem__ to preserve uniform dimensions & allow indexing into uniform dimensions.
    • Update tf.expand_dims to always insert the new dimension as a non-ragged dimension.
    • Update tf.embedding_lookup to use partition_strategy and max_norm when ids is ragged.
    • Allow batch_dims==rank(indices) in tf.gather.
    • Add support for bfloat16 in tf.print.
  • tf.distribute:
    • Support embedding_column with variable-length input features for MultiWorkerMirroredStrategy.
  • tf.keras:
    • Added experimental_aggregate_gradients argument to tf.keras.optimizer.Optimizer.apply_gradients. This allows custom gradient aggregation and processing aggregated gradients in custom training loop.
    • Allow pathlib.Path paths for loading models via Keras API.
  • tf.function/AutoGraph:
    • AutoGraph is now available in ReplicaContext.merge_call, Strategy.extended.update and Strategy.extended.update_non_slot.
    • Experimental support for shape invariants has been enabled in tf.function; see the API docs for tf.autograph.experimental.set_loop_options for additional info, and the example below.
    • AutoGraph error messages now exclude frames corresponding to APIs internal to AutoGraph.
    • Improve shape inference for tf.function input arguments to unlock more Grappler optimizations in TensorFlow 2.x.
    • Improve automatic control dependency management of resources by allowing resource reads to occur in parallel and synchronizing only on writes.
    • Fix execution order of multiple stateful calls to experimental_run_v2 in tf.function.
    • You can now iterate over RaggedTensors using a for loop inside tf.function.
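A minimal sketch of a shape invariant declared inside a tf.function loop (the doubling loop is a hypothetical example):

```python
import tensorflow as tf

@tf.function
def repeat_concat(x):
    for _ in tf.range(3):
        # Relax the shape invariant so x may grow across iterations.
        tf.autograph.experimental.set_loop_options(
            shape_invariants=[(x, tf.TensorShape([None]))])
        x = tf.concat([x, x], axis=0)
    return x

print(repeat_concat(tf.constant([1.0, 2.0])))  # 16 elements after 3 doublings
```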
  • tf.lite:
    • Migrated the tf.lite C inference API out of experimental into lite/c.
    • Add an option to disallow NNAPI CPU / partial acceleration on Android 10.
    • TFLite Android AARs now include the C headers and APIs that are required to use TFLite from native code.
    • Refactors the delegate and delegate kernel sources to allow usage in the linter.
    • Limit delegated ops to actually supported ones if a device name is specified or NNAPI CPU Fallback is disabled.
    • TFLite now supports the tf.math.reciprocal op by lowering it to the tf.div op.
    • TFLite's unpack op now supports boolean tensor inputs.
    • Microcontroller and embedded code moved from experimental to the main TensorFlow Lite folder.
    • Check for large TFLite tensors.
    • Fix GPU delegate crash with C++17.
    • Add 5D support to TFLite strided_slice.
    • Fix an error in the delegation of DEPTH_TO_SPACE to NNAPI that caused the op not to be accelerated.
    • Fix a segmentation fault when running a model with LSTM nodes using the NNAPI delegate.
    • Fix an NNAPI delegate failure when an operand of a Maximum/Minimum operation is a scalar.
    • Fix an NNAPI delegate failure when the axis input of a reduce operation is a scalar.
    • Expose option to limit the number of partitions that will be delegated to NNAPI.
    • If a target accelerator is specified, use its feature level to determine which operations to delegate, instead of the SDK version.
  • tf.random:
    • Various random number generation improvements:
      • Add a fast path for the default random_uniform.
      • Improve the random_seed documentation.
      • RandomBinomial broadcasts and appends the sample shape to the left rather than the right.
    • Added tf.random.stateless_binomial, tf.random.stateless_gamma, and tf.random.stateless_poisson (see the example below).
    • tf.random.stateless_uniform now supports unbounded sampling of int types.
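A minimal sketch of the stateless samplers; being pure functions of the seed, they return identical samples on every call (the seed and distribution parameters are hypothetical):

```python
import tensorflow as tf

seed = [7, 42]  # stateless ops take an explicit two-element seed
print(tf.random.stateless_binomial(
    shape=[5], seed=seed, counts=10., probs=0.5).numpy())
print(tf.random.stateless_gamma(
    shape=[5], seed=seed, alpha=2.0).numpy())
print(tf.random.stateless_poisson(
    shape=[5], seed=seed, lam=3.0).numpy())
```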
  • Math and Linear Algebra:
    • Add tf.linalg.LinearOperatorTridiag.
    • Add tf.linalg.LinearOperatorBlockLowerTriangular.
    • Add broadcasting support to tf.linalg.triangular_solve (#26204) and tf.math.invert_permutation.
    • Add tf.math.sobol_sample op.
    • Add tf.math.xlog1py (see the example below).
    • Add tf.math.special.{dawsn,expi,fresnel_cos,fresnel_sin,spence}.
    • Add a Modified Discrete Cosine Transform (MDCT) and its inverse to tf.signal.
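A minimal sketch of tf.math.xlog1py (inputs are hypothetical):

```python
import tensorflow as tf

x = tf.constant([0.0, 1.0, 2.0])
y = tf.constant([-1.0, 0.5, 1.0])

# xlog1py(x, y) = x * log1p(y), defined to be 0 where x == 0 even if
# log1p(y) diverges there (here y == -1).
print(tf.math.xlog1py(x, y).numpy())  # [0.        0.4054651 1.3862944]
```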
  • TPU Enhancements:
    • Refactor TpuClusterResolver to move shared logic to a separate pip package.
    • Support configuring the TPU software version from the Cloud TPU client.
    • Allowed TPU embedding weight decay factor to be multiplied by learning rate.
  • XLA Support:
    • Add standalone XLA AOT runtime target + relevant .cc sources to pip package.
    • Add a check for memory alignment to MemoryAllocation::MemoryAllocation() on 32-bit ARM. This ensures a deterministic early exit instead of a hard-to-debug bus error later.
    • saved_model_cli aot_compile_cpu allows you to compile saved models to XLA header+object files and include them in your C++ programs.
    • Enable Igamma, Igammac for XLA.
    • XLA reduction emitter is deterministic when the environment variable TF_DETERMINISTIC_OPS is set.
  • Tracing and Debugging:
    • Add the source and destination names to the _send traceme to allow easier debugging.
    • Add a traceme event to fastpathexecute.
  • Other:
    • Fix an issue with AUC.reset_states for multi-label AUC (#35852).
    • Fix the TF upgrade script to not delete files when there is a parsing error and the output mode is in-place.
    • Move tensorflow/core:framework/*_pyclif rules to tensorflow/core/framework:*_pyclif.

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

372046933, 8bitmp3, aaronhma, Abin Shahab, Aditya Patwardhan, Agoniii, Ahti Kitsik, Alan Yee, Albin Joy, Alex Hoffman, Alexander Grund, Alexandre E. Eichenberger, Amit Kumar Jaiswal, amoitra, Andrew Anderson, Angus-Luo, Anthony Barbier, Anton Kachatkou, Anuj Rawat, archis, Arpan-Dhatt, Arvind Sundararajan, Ashutosh Hathidara, autoih, Bairen Yi, Balint Cristian, Bas Aarts, BashirSbaiti, Basit Ayantunde, Ben Barsdell, Benjamin Gaillard, boron, Brett Koonce, Bryan Cutler, Christian Goll, Christian Sachs, Clayne Robison, comet, Daniel Falbel, Daria Zhuravleva, darsh8200, David Truby, Dayananda-V, deepakm, Denis Khalikov, Devansh Singh, Dheeraj R Reddy, Diederik Van Liere, Diego Caballero, Dominic Jack, dothinking, Douman, Drake Gens, Duncan Riach, Ehsan Toosi, ekuznetsov139, Elena Zhelezina, elzino, Ending2015a, Eric Schweitz, Erik Zettel, Ethan Saadia, Eugene Kuznetsov, Evgeniy Zheltonozhskiy, Ewout Ter Hoeven, exfalso, FAIJUL, Fangjun Kuang, Fei Hu, Frank Laub, Frederic Bastien, Fredrik Knutsson, frreiss, Frédéric Rechtenstein, fsx950223, Gaurav Singh, gbaned, George Grzegorz Pawelczak, George Sterpu, Gian Marco Iodice, Giorgio Arena, Hans Gaiser, Hans Pabst, Haoyu Wu, Harry Slatyer, hsahovic, Hugo, Hugo Sjöberg, IrinaM21, jacco, Jake Tae, Jean-Denis Lesage, Jean-Michel Gorius, Jeff Daily, Jens Elofsson, Jerry Shih, jerryyin, Jin Mingjian, Jinjing Zhou, JKIsaacLee, jojimonv, Jonathan Dekhtiar, Jose Ignacio Gomez, Joseph-Rance, Judd, Julian Gross, Kaixi Hou, Kaustubh Maske Patil, Keunwoo Choi, Kevin Hanselman, Khor Chean Wei, Kilaru Yasaswi Sri Chandra Gandhi, Koan-Sin Tan, Koki Ibukuro, Kristian Holsheimer, kurileo, Lakshay Tokas, Lee Netherton, leike666666, Leslie-Fang-Intel, Li, Guizi, LIUJIAN435, Lukas Geiger, Lyo Nguyen, madisetti, Maher Jendoubi, Mahmoud Abuzaina, Manuel Freiberger, Marcel Koester, Marco Jacopo Ferrarotti, Markus Franke, marload, Mbah-Javis, mbhuiyan, Meng Zhang, Michael Liao, MichaelKonobeev, Michal Tarnowski, Milan Straka, minoring, Mohamed Nour Abouelseoud, MoussaMM, Mrinal Jain, mrTsjolder, Måns Nilsson, Namrata Bhave, Nicholas Gao, Niels Ole Salscheider, nikochiko, Niranjan Hasabnis, Nishidha Panpaliya, nmostafa, Noah Trenaman, nuka137, Officium, Owen L - Sfe, Pallavi G, Paul Andrey, Peng Sun, Peng Wu, Phil Pearl, PhilipMay, pingsutw, Pooya Davoodi, PragmaTwice, pshiko, Qwerty71, R Gomathi, Rahul Huilgol, Richard Xiao, Rick Wierenga, Roberto Rosmaninho, ruchit2801, Rushabh Vasani, Sami, Sana Damani, Sarvesh Dubey, Sasan Jafarnejad, Sergii Khomenko, Shane Smiskol, Shaochen Shi, sharkdtu, Shawn Presser, ShengYang1, Shreyash Patodia, Shyam Sundar Dhanabalan, Siju Samuel, Somyajit Chakraborty Sam, Srihari Humbarwadi, srinivasan.narayanamoorthy, Srishti Yadav, Steph-En-M, Stephan Uphoff, Stephen Mugisha, SumanSudhir, Taehun Kim, Tamas Bela Feher, TengLu, Tetragramm, Thierry Herrmann, Tian Jin, tigertang, Tom Carchrae, Tom Forbes, Trent Lo, Victor Peng, vijayphoenix, Vincent Abriou, Vishal Bhola, Vishnuvardhan Janapati, vladbataev, VoVAllen, Wallyss Lima, Wen-Heng (Jack) Chung, wenxizhu, William D. Irons, William Zhang, Xiaoming (Jason) Cui, Xiaoquan Kong, Xinan Jiang, Yasir Modak, Yasuhiro Matsumoto, Yaxun (Sam) Liu, Yong Tang, Ytyt-Yt, yuan, Yuan Mingshuai, Yuan Tang, Yuki Ueda, Yusup, zhangshijin, zhuwenxi
