Merged
2574 commits
d8c60d3
Add a new ShouldFuse method that takes a function to decide whether t…
tensorflower-gardener Sep 12, 2024
0fcf562
Create the concept of pass_regisrty_utils that will offer boilerplate…
vamsimanchala Sep 12, 2024
62de590
Integrate LLVM at llvm/llvm-project@e55d6f5ea265
tensorflower-gardener Sep 12, 2024
f920c2f
Check for dynamism when fusing reshapes.
vamsimanchala Sep 12, 2024
29bc624
Automated Code Change
tensorflower-gardener Sep 12, 2024
1d45467
Automated Code Change
tensorflower-gardener Sep 12, 2024
4b0d407
Automated Code Change
tensorflower-gardener Sep 12, 2024
6a6b58d
NFC: Remove default constructor in HloInstructionAdaptor so that the …
chsigg Sep 12, 2024
fcc9ce2
PR #17044: Explain the different PTX compilation and linking methods.
dimvar Sep 12, 2024
a31cd33
PR #17053: [GPU] cuDNN GEMM: handle more tensors with 1-sized dimensi…
sergachev Sep 12, 2024
d7e1643
Introduce a dependency violation check
beckerhe Sep 12, 2024
d295c30
Remove unused DeviceDescription::runtime_version_string and driver_ve…
beckerhe Sep 12, 2024
a550824
Automated Code Change
tensorflower-gardener Sep 12, 2024
93c74eb
Update GraphDef version to 1983.
tensorflower-gardener Sep 12, 2024
85cc9cd
compat: Update forward compatibility horizon to 2024-09-12
tensorflower-gardener Sep 12, 2024
18257c5
[XLA:GPU] Compute a block-specific base ptr, and residual shapes when…
dimitar-asenov Sep 12, 2024
5025efa
Add comments to GemmFusionAutotuner and rename TilingConfig to Backen…
derdrdirk Sep 12, 2024
eabda5b
[XLA:GPU] Remove `BlockLevelParameters` parameter to `CreateTritonIrA…
bchetioui Sep 12, 2024
26e7d34
Integrate LLVM at llvm/llvm-project@3cd01371e007
tensorflower-gardener Sep 12, 2024
1afc3e8
[XLA:GPU] Move part of `SymbolicTileAnalysis`'s constraint-checking l…
bchetioui Sep 12, 2024
ba46bf5
Remove unused dependencies on cuda_activation.
klucke Sep 12, 2024
0364a96
Stop using the ScopedActivateExecutorContext wrapper class in favor o…
klucke Sep 12, 2024
c56f6f2
Reverts 31001e011efa7c1acca4f0fc82e39b7d85d5ac30
tensorflower-gardener Sep 12, 2024
fb67f6f
Expand the set of strategies generated for kScatter HLO ops.
tensorflower-gardener Sep 12, 2024
dedbdfd
[xla:ffi] Add support for returning xla::ffi::Future from FFI handler
ezhulenev Sep 12, 2024
aac8b1d
Deprecating GPU compatibility experimental feature from both tf/compi…
ecalubaquib Sep 12, 2024
c7a31d1
Integrate LLVM at llvm/llvm-project@36adf8ecedb6
tensorflower-gardener Sep 12, 2024
5ae35fb
[XLA:GPU] Remove temporary workaround.
akuegel Sep 12, 2024
cfa7bed
Move `tsl/profiler/rpc` to `xla/tsl/profiler/rpc`
ddunl Sep 12, 2024
9d64133
Reverts 65811e53f11472c8be120b269a8eac741137649b
tensorflower-gardener Sep 12, 2024
f879b13
Allow python interpreter to access tensor details for all the subgraphs.
rewu93 Sep 12, 2024
2e50040
Move CUDA configuration paragraph up in the document.
tensorflower-gardener Sep 12, 2024
3f7f0ae
Add noop kernel on empty conditional graph as a workaround.
IllogicalMoose Sep 12, 2024
f411224
Move `BatchedGatherScatterNormalizer` to `PreOptimizationPipeline` be…
tomnatan30 Sep 12, 2024
391af23
[xla:ffi] Add support for returning tsl::AsyncValueRef from FFI handler
ezhulenev Sep 12, 2024
18c2514
Expose LocalDeviceManager in LiteSessionWrapper
tensorflower-gardener Sep 12, 2024
4aba241
mlir_hlo_to_hlo: Propagate diagnostics from PrepareForExport.
pizzud Sep 12, 2024
228f636
Reverts ba46bf55c91773be94f5983dad8bd61a970c5c05
klucke Sep 12, 2024
4e790e8
Insert default value attributes when converting dialect ops to node d…
tensorflower-gardener Sep 12, 2024
2deb71a
Fail fast if required folders under CUDA/CUDNN/NCCL local paths are n…
tensorflower-gardener Sep 12, 2024
6411af2
Update default CUDA Toolkit version to 12.5.1
beckerhe Sep 12, 2024
5bbcead
[XLA:GPU] Build a workaround to fix tiling propagation for Triton whe…
bchetioui Sep 12, 2024
00c8f64
No public description
zzzaries Sep 12, 2024
50dd24f
Reorder optimization passes in tflite to tackle BatchMatMul optimizat…
vamsimanchala Sep 12, 2024
a18f7d6
Turn ScopedActivateExecutorContext into an alias to ScopedActivateCon…
klucke Sep 12, 2024
5adafde
Fix a bug where treedef.flatten_up_to(...) was overly permissive for …
hawkinsp Sep 12, 2024
04d81ab
[XLA:GPU] Remove the forced cast to `f32` when generating Triton redu…
dimitar-asenov Sep 12, 2024
e4a35b4
No public description
tensorflower-gardener Sep 12, 2024
cd563c3
Use ScopedActivateContext instead of wrapper ScopedActivateExecutorCo…
klucke Sep 12, 2024
dadeda8
We do not want to prevent _all_ cublas fallback (even those not invol…
tensorflower-gardener Sep 12, 2024
be8998c
Remove unnecessary dependency on gpu_activation.h.
klucke Sep 12, 2024
bd5f62a
[MHLO] Fix typo in mhlo.dynamic_broadcast_in_dim op argument name
abhigunj Sep 12, 2024
5397391
device_manager_ creation must be guarded by mutex so that concurrent …
SiqiaoWu1993 Sep 12, 2024
d58d166
No public description
yangustc07 Sep 13, 2024
3f35d42
PR #16921: [PJRT:GPU] Treat GPU collective memory space as device mem…
zhenying-liu Sep 13, 2024
52be9aa
Prepare code for breaking change in Protobuf C++ API.
evalon32 Sep 13, 2024
54acff0
Reverts 5adafde02706dfe24e3a56744d0baecfdc3df90b
hawkinsp Sep 13, 2024
fb500de
Automated Code Change
tensorflower-gardener Sep 13, 2024
ec50b3a
Fix space-to-batch propagation bug on reduce.
amitsabne1 Sep 13, 2024
80c672d
Automated Code Change
tensorflower-gardener Sep 13, 2024
7f9abb8
[lrt-model] model implementation
LukeBoyer Sep 13, 2024
18efd7e
[lrt-compiler-plugin] plugin header
LukeBoyer Sep 13, 2024
be42da0
compat: Update forward compatibility horizon to 2024-09-13
tensorflower-gardener Sep 13, 2024
24133c6
Update GraphDef version to 1984.
tensorflower-gardener Sep 13, 2024
7e5658f
GPU TopK custom call needs the same layout for operand and output.
akuegel Sep 13, 2024
8e090e5
[XLA:GPU][NFC] Move "tile size fits in registers" check to a separate…
olegshyshkov Sep 13, 2024
8da03ba
#sdy Add support for exporting nested ManualComputations in MHLO export.
bartchr808 Sep 13, 2024
0b37b3d
[XLA:GPU] Make sure that xla_gpu_enable_pgle_accuracy_checker is fals…
golechwierowicz Sep 13, 2024
f10f55b
Add custom kernel fusion to gemm fusion autotuner.
derdrdirk Sep 13, 2024
7ebc4a3
#sdy Move shard map import pass to mhlo_round_trip.
bartchr808 Sep 13, 2024
0a52736
[XLA:GPU] Use padded tile size to estimate FLOPs and choose the numbe…
olegshyshkov Sep 13, 2024
2654559
Workaround a potential compile breakage with Clang HEAD and header mo…
ilya-biryukov Sep 13, 2024
1135e7d
#sdy Add sdy-round-trip-shard-map-export Pass.
bartchr808 Sep 13, 2024
736c424
Update visibility for an internal project.
tensorflower-gardener Sep 13, 2024
89fbe40
Make ptx_compilation_test less strict and more robust
beckerhe Sep 13, 2024
0a5416d
[PJRT] Use `typedef struct X {} X;` idiom for PJRT_Api.
mooskagh Sep 13, 2024
eeec584
internal BUILD rule visibility
tensorflower-gardener Sep 13, 2024
0b03ec9
Delete unused rocm_activation.h.
klucke Sep 13, 2024
0a454fc
Remove unused cuda_dnn_headers target.
klucke Sep 13, 2024
4fd91cf
[xla:cpu] Temporarily disable fast F64 Tanh due to incorrect results.
penpornk Sep 13, 2024
70db77c
Fix some layout test failures on gpu backend
kanglant Sep 13, 2024
850da00
Integrate LLVM at llvm/llvm-project@f0b3287297ae
tensorflower-gardener Sep 13, 2024
ebb6ee4
Remove the use of gpu_activation.h in rocm code.
klucke Sep 13, 2024
8ec1fbc
Add support for Tegra chips in hermetic CUDA rules.
tensorflower-gardener Sep 13, 2024
241337e
#tf-data Fixes `data_service_ops.py` broken doc test in NumPy 2.0 upg…
jimlinntu Sep 13, 2024
47d1906
[xla:ffi] Add CallAsync for invoking potentially asynchronous FFI ha…
ezhulenev Sep 13, 2024
6ecdeb4
Add coming changes related to LiteRT repo to release.md
tensorflower-gardener Sep 13, 2024
fc6a89c
Add manual tag to lrt_model_test
LukeBoyer Sep 13, 2024
763e8ed
Clarify that `allocated_bytes` refers to heap only, not total memory …
fergushenderson Sep 13, 2024
742f770
Compute memory usage values that are shown as "MB" as megabytes (1000…
fergushenderson Sep 13, 2024
16b07ae
[lrt-compiler-plugin] dummy example plugin impl
LukeBoyer Sep 13, 2024
c2b90c0
[lite-rt-compiler-plugin] Basic algos needed to plugin application.
LukeBoyer Sep 13, 2024
a0dd5c0
[lrt-model] Basic model re-serialization.
LukeBoyer Sep 13, 2024
36fce8f
Rollback of change breaking downstream tests.
tensorflower-gardener Sep 13, 2024
b0a8383
Add support for kConvolution to FusionWrapper.
klucke Sep 13, 2024
3dc1596
[TOCO Removal] Use new converter/model flag proto defs in converter d…
arfaian Sep 13, 2024
42faf39
Add support for keep dims in current sum folding
LukeBoyer Sep 13, 2024
2dcfc06
Support multiple TF platforms at the same time
tensorflower-gardener Sep 13, 2024
aeb9cda
Add debug metadata support to TFLite flatbuffer schema.
tensorflower-gardener Sep 14, 2024
5c4a4bb
Breaks Mosaic tests.
tensorflower-gardener Sep 14, 2024
ad22c0b
Add a test.
amitsabne1 Sep 14, 2024
c0424ea
HostOffloading: Properly handle XLA tuple shapes in host compute outp…
tensorflower-gardener Sep 14, 2024
1573083
[xla:ffi] Add benchmarks for async FFI calls
ezhulenev Sep 14, 2024
052ca43
[xla:ffi] Optimize returning tsl::AsyncValueRef from FFI handler
ezhulenev Sep 14, 2024
3519b3d
Fix a typo in the code for enumerating sharding strategies for kScatt…
tensorflower-gardener Sep 14, 2024
96f85be
Automated Code Change
tensorflower-gardener Sep 14, 2024
1cc16d1
Bump 4w compat to v1.5.0 release Aug 1, 2024
GleasonK Sep 14, 2024
2fca3f7
Add missing field when initialize the CuptiTracerEvent.
tensorflower-gardener Sep 14, 2024
6468955
CHLO -> StableHLO : use TanOp and StablehloCreateCompatibilityExpande…
abhigunj Sep 14, 2024
c744227
[HLO Componentization] Create pass sub-component
sdasgup3 Sep 14, 2024
732d8a0
Automated Code Change
tensorflower-gardener Sep 14, 2024
7d18d2d
compat: Update forward compatibility horizon to 2024-09-14
tensorflower-gardener Sep 14, 2024
b989b07
Update GraphDef version to 1985.
tensorflower-gardener Sep 14, 2024
8e39d28
Automated Code Change
tensorflower-gardener Sep 14, 2024
f261a4a
Automated Code Change
tensorflower-gardener Sep 14, 2024
abdf78e
Use `xla::GetDefaultStablehloVersion` with ~12w compatibility require…
tomnatan30 Sep 14, 2024
84a4740
Allow propagations on reduce to occur
amitsabne1 Sep 14, 2024
14fbade
[XLA] Propagate the layout of layout constrained custom calls with hi…
blakehechtman Sep 15, 2024
6d72267
Update GraphDef version to 1986.
tensorflower-gardener Sep 15, 2024
04daa7d
compat: Update forward compatibility horizon to 2024-09-15
tensorflower-gardener Sep 15, 2024
779c7e5
Pull out broadcasts from splat const after phase 1.
LukeBoyer Sep 15, 2024
6537419
Add a specialization of IsEqual for bfloat16 based on the specializat…
LarryLansing Sep 15, 2024
d17fe71
Fix typo
syzygial Sep 15, 2024
4569755
Automated Code Change
tensorflower-gardener Sep 16, 2024
3f9a8c4
[Distributed Eager] Fix reference counting of handle objects in the S…
mrry Sep 16, 2024
976b3eb
[TOCO Removal] Remove enable_mlir_converter from Python API.
arfaian Sep 16, 2024
d2df51c
Add TFLite flatbuffer debug metadata serialization logic.
tensorflower-gardener Sep 16, 2024
c704865
[XLA:TPU] Add LoopOptimizerBestFitHeap class that models alternate me…
subhankarshah Sep 16, 2024
c0db4ce
Add pre simulation device assignment to `HloModuleConfig` to use in c…
subhankarshah Sep 16, 2024
aa76e3e
Automated Code Change
tensorflower-gardener Sep 16, 2024
10ebf7e
Reverts f10f55bb8c13f8fd7e9ce104615198964570ef08
derdrdirk Sep 16, 2024
7b92e74
Update GraphDef version to 1987.
tensorflower-gardener Sep 16, 2024
1f07ad6
compat: Update forward compatibility horizon to 2024-09-16
tensorflower-gardener Sep 16, 2024
eb9f104
[XLA:GPU] Run post-scheduling verification under the `xla_gpu_enable_…
golechwierowicz Sep 16, 2024
7bb8b46
[XLA:GPU] Remove debug spew in triton_fusion_emitter_device_legacy_test.
chsigg Sep 16, 2024
f44dde6
PR #17182: Parametrize ConstantsFloatTest OneCellFloat
apivovarov Sep 16, 2024
ad0d77d
PR #17156: [ROCm] Skip ConditionalIfWithMemset on ROCm, introduced in…
hsharsha Sep 16, 2024
5bb9ea8
PR #17130: Add default upcasting behavior to DoWithUpcastToF32
apivovarov Sep 16, 2024
6e7842c
PR #17112: Clean up for legacy comment and misleading error message.
elfiegg Sep 16, 2024
94f723a
PR #17058: Replace "Navi" with corresponding public product names
ScXfjiang Sep 16, 2024
d1447b5
PR #17135: Add TypeParam to FP8E4M3DistanceTest
apivovarov Sep 16, 2024
681a7cd
PR #17177: Parametrize FloatNormalizationF8Test ResolveIfUnsupportedF8
apivovarov Sep 16, 2024
ca58a14
[XLA] Use assertion_result instead of LOG(ERROR) in HloTestBase::Prin…
chsigg Sep 16, 2024
cced9db
[XLA:GPU] Fix `CHECK:` directives in `cuddn_test.cc` to account for d…
dimitar-asenov Sep 16, 2024
4116626
[XLA:GPU] Fix InsertOp's lowering when applying indices
vwbaker Sep 16, 2024
64568fe
MLIR emitters: Fix multi-row reduction triggering and vectorization.
jreiffers Sep 16, 2024
6466e66
Update DPB metric names to better reflect their actual meaning.
fergushenderson Sep 16, 2024
69dae8a
[XLA:GPU] Support complex numbers in materialize & insert op lowering
vwbaker Sep 16, 2024
2f7a4c6
Merge pull request #75567 from tensorflow:tilakrayal-patch-3
tensorflower-gardener Sep 16, 2024
b26c4df
Update `cuda_clang_official` config with CUDA 12.5.1.
tensorflower-gardener Sep 16, 2024
202e9d1
Fix reductions with side outputs that are unrelated to any reduction.
jreiffers Sep 16, 2024
8664a7e
[XLA:GPU] Add some logging with the fusing decisions.
loislo Sep 16, 2024
754f492
[XLA:GPU] Bail out during `SymbolicTileAnalysis` if standalone tile d…
bchetioui Sep 16, 2024
c5589c7
Reverts 64568fe2396adb34c616bc386c6a9da082332a53
pifon2a Sep 16, 2024
d9f9977
[XLA:TPU] Update memory bound loop optimizer tests such that the requ…
subhankarshah Sep 16, 2024
b6f6a73
cleanup: remove cc_stubby_versions from BUILD and bazel files
ericsalo Sep 16, 2024
3e7ebb9
Update np_test.py cuda config to be compatible with numpy 2.0.0.
tensorflower-gardener Sep 16, 2024
0ace693
Add cuda c++ for command buffer kernels
IllogicalMoose Sep 16, 2024
b60283f
#sdy Add sdy round trip shard map import.
bartchr808 Sep 16, 2024
d2903d1
Merge pull request #75822 from syzygial:fix_typo
tensorflower-gardener Sep 16, 2024
763beea
[HLO Componentization] Create pass sub-component
sdasgup3 Sep 16, 2024
468b368
cleanup: remove cc_stubby_versions from BUILD and bazel files
ericsalo Sep 16, 2024
cb2bf69
[xla:python] Move registration of xla_python_gpu_callback into GPU cl…
dfm Sep 16, 2024
95aef2f
Remove the usage of internal bridge passes from prepare_tf.cc
tensorflower-gardener Sep 16, 2024
f56d646
Update to use a static global variable for the devices retrieved from…
changhuilin Sep 16, 2024
bb2160b
[XLA:GPU] Update gpu_command_buffer ConditionalCase to support > 8 br…
IllogicalMoose Sep 16, 2024
23b5e27
Fix the bug when using threshold as default cost for some ops
SiqiaoWu1993 Sep 16, 2024
ab504aa
Add Python 3.13.0rc2 support to rules_python in a form of a patch.
vam-google Sep 16, 2024
92c55b7
Stop using the `ScopedActivateExecutorContext` wrapper class in favor…
dimitar-asenov Sep 16, 2024
994547c
Automated Code Change
ckennelly Sep 16, 2024
1d83938
Adds accessor methods to StrategyGroup (so that clients can't directl…
tensorflower-gardener Sep 16, 2024
4ce2ff4
Allow closures for ErrorSpecGen in exhaustive test
Sep 16, 2024
d3d0624
Integrate Triton up to [50d803cd](https://github.com/openai/triton/co…
chsigg Sep 16, 2024
137e993
[HLO Componentization] Create pass sub-component
sdasgup3 Sep 16, 2024
bdf868d
Cleanup in BFloat16Propagation.
SandSnip3r Sep 16, 2024
76bb934
Use compiler version of model_builder
tensorflower-gardener Sep 16, 2024
c17a980
Fix failing build for LOCAL_CUDA_PATH without NVIDIA driver inside `<…
tensorflower-gardener Sep 16, 2024
1628e27
Add transmission/delay budgets to XProf MXLA trace/graph viewer.
tensorflower-gardener Sep 16, 2024
45550c1
Reverts 976b3eb8226ce31cad36bb28f40bc3666a59bc53
sirakiin Sep 17, 2024
3ecd4ca
Reverts 14fbade4a33070bcdbbfd1f9c118354b7618b848
tensorflower-gardener Sep 17, 2024
6ed0d07
PR #17170: Code dedup in execution_trace_utils LiteralToValue
apivovarov Sep 17, 2024
424e3a5
PR #15403: Handle multiple users in all-gather dynamic-slice simplifi…
patrick-toulme Sep 17, 2024
f078976
Decouples strategies from their associated input shardings.
tensorflower-gardener Sep 17, 2024
22372bd
Update copyright year and author for LiteRT notebooks: "Copyright 202…
ktonthat Sep 17, 2024
75e3018
Automated Code Change
tensorflower-gardener Sep 17, 2024
2949d16
Automated Code Change
tensorflower-gardener Sep 17, 2024
4b67701
Add move reshape after fc pass
chunnienc Sep 17, 2024
fcb903d
Reverts 95aef2fd140e5e049554029c7fdfc8d6815e2805
tensorflower-gardener Sep 17, 2024
2acd7f0
Automated Code Change
tensorflower-gardener Sep 17, 2024
3163ed8
Add enable FC keep num dims pass
chunnienc Sep 17, 2024
a71aa5f
Automated Code Change
tensorflower-gardener Sep 17, 2024
3533f43
Silence misleading `ALREADY_EXIST` logs. This error message is not he…
tensorflower-gardener Sep 17, 2024
165ac23
Mosaic tests are now fixed.
tensorflower-gardener Sep 17, 2024
4cfb36e
Automated Code Change
tensorflower-gardener Sep 17, 2024
b6fd032
Automated Code Change
tensorflower-gardener Sep 17, 2024
f3bffba
Update GraphDef version to 1988.
tensorflower-gardener Sep 17, 2024
18400cc
compat: Update forward compatibility horizon to 2024-09-17
tensorflower-gardener Sep 17, 2024
37d52a7
[XLA:GPU] Fix a bug in Cost Model that doesn't allow concatenate as o…
olegshyshkov Sep 17, 2024
50609ba
Fix ptx_compilation_test failure on H100
beckerhe Sep 17, 2024
d4d4a31
Remove calls to xnnpack from TFLite kernels.
alankelly Sep 17, 2024
d7f222d
Integrate LLVM at llvm/llvm-project@b39a100ff4ec
d0k Sep 17, 2024
0e94b9e
#tf-data Upgrade error for unexpected element lengths.
mpcallanan Sep 17, 2024
09aaa7b
[xla:doc] Update broken links to the deleted "Code review" page.
penpornk Sep 17, 2024
2cd19cb
[XLA:GPU][Emitters] Add xla_gpu.reduce op.
pifon2a Sep 17, 2024
4accf4b
[XLA:GPU] Allow Priority Fusion to fuse small constants into Triton f…
olegshyshkov Sep 17, 2024
de7bb7f
Remove unused cuda_activation target.
klucke Sep 17, 2024
a8125e3
Add Python 3.13.0rc2 to JAX Docker images with CUDA 12.3 and CUDA 12.1.
tensorflower-gardener Sep 17, 2024
57bc708
Add a few changes of int64->int32.
haozha111 Sep 17, 2024
1757863
Adjust cloning behavior to work properly for send + send-done pairs.
pschuh Sep 17, 2024
432d738
Integrate StableHLO at openxla/stablehlo@78c753ad
sdasgup3 Sep 17, 2024
f3f583e
Add CUDNN 9.4.0 to the list of redistributions.
tensorflower-gardener Sep 17, 2024
1287339
Add a command line flag to disable XLA GPU passes based on binary lib…
klucke Sep 17, 2024
50138e2
[xla:multihost_hlo_runner] Add an SpmdMode to enable use_shardy_parti…
bixia1 Sep 17, 2024
152e474
[HLO Componentization] Create pass sub-component
sdasgup3 Sep 17, 2024
6e03f23
Integrate LLVM at llvm/llvm-project@c23d6df60d62
d0k Sep 17, 2024
6102f1c
[XLA:SPMD] Fix scatter index-parallel partitioning issues.
Tongfei-Guo Sep 17, 2024
2751aea
#tf-data Allows `tf.data.experiemental.get_model_proto` to accept `Nu…
jimlinntu Sep 17, 2024
a8a294f
Remove unnecessary dependency from se_gpu_pjrt_compiler.
klucke Sep 17, 2024
1508689
Disable more binary libraries if the disable flag is true.
klucke Sep 17, 2024
1d7a921
[TSL] Bump ml_dtypes. Add float8_e4m3, float8_e3m4
apivovarov Sep 17, 2024
74d383a
Fix comment and add `ToString` function for `WhileMoveInfo`.
SandSnip3r Sep 17, 2024
b3f1a8c
Add constructor to MockSharding that also takes device list to allow …
tensorflower-gardener Sep 17, 2024
9351c81
Make np_random.poisson test compatible w/ windows numpy2x.
tensorflower-gardener Sep 17, 2024
abca9b5
Reverts f975479fb985a23b3cf1d1289dee8e09931d5d57
LukeBoyer Sep 17, 2024
f306822
[xla:cpu] Use ffi::CallAsync in custom call thunk
ezhulenev Sep 17, 2024
cec298c
Cost model now considers the compute latency in addition to its throu…
mehrdadkhani Sep 17, 2024
bc4b65a
[mhlo] fix ScatterOp::fold for batching dims
tomnatan30 Sep 17, 2024
983ce79
Fix FC enable keep_num_dims pass
chunnienc Sep 17, 2024
310d5fc
Delete unused cuda:cuda_activation_header target.
klucke Sep 17, 2024
1d62e63
Update project structure
LukeBoyer Sep 17, 2024
ff501e5
Refactor lambdas (`RunToFixPoint`/`run_to_fix_point` and `GetRelatedI…
SandSnip3r Sep 17, 2024
6aa5487
[Take 2] Generalize global jit cpp cache keys so we can add more keys…
pschuh Sep 17, 2024
ac13ec0
[XLA][HloDCE] Removal of unused outputs of fusions to consider multip…
tensorflower-gardener Sep 17, 2024
ebb9a63
Memory space related copies should not be normalized.
tensorflower-gardener Sep 17, 2024
1ea72da
"Include what you use" fixes.
fergushenderson Sep 17, 2024
35c2bea
Make cupti_buffer_events_test_cpu compile & link with --config=cuda.
klucke Sep 17, 2024
9a6827a
Rollback of change breaking downstream tests.
SiqiaoWu1993 Sep 17, 2024
2efd624
Move `tsl/lib/monitoring` to `xla/tsl/lib/monitoring`
ddunl Sep 18, 2024
4b32024
Integrate LLVM at llvm/llvm-project@815b0046b899
tensorflower-gardener Sep 18, 2024
9ac879b
Internal change to make the targets be visible to a new package.
niuchl Sep 18, 2024
fbe5db4
Automated Code Change
tensorflower-gardener Sep 18, 2024
efa9e38
Automated Code Change
tensorflower-gardener Sep 18, 2024
73f52ec
[XLA:GPU][Emitters] Fix RUN directive in test.
pifon2a Sep 18, 2024
d3d93d1
PR #17022: Add slicing -> reduce-scatter for dynamic-slice-fusion
shraiysh Sep 18, 2024
ed484f0
[HLO Componentization] Create pass sub-component
sdasgup3 Sep 18, 2024
f97d2f4
PR #17133: Dedup LiteralComparisonTests
apivovarov Sep 18, 2024
08e0969
compat: Update forward compatibility horizon to 2024-09-18
tensorflower-gardener Sep 18, 2024
1be648b
Update GraphDef version to 1989.
tensorflower-gardener Sep 18, 2024
c9aa896
Automated Code Change
tensorflower-gardener Sep 18, 2024
0d2fea3
Fix reduce result type when the minor-most dimension is not reduced.
jreiffers Sep 18, 2024
The diff you're trying to view is too large. We only load the first 3000 changed files.
119 changes: 51 additions & 68 deletions .bazelrc

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion .github/workflows/osv-scanner-scheduled.yml
@@ -28,7 +28,7 @@ permissions:
jobs:
scan-scheduled:
if: github.repository == 'tensorflow/tensorflow'
uses: "google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@v1.8.1"
uses: "google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@v1.8.4"
with:
scan-args: |-
--lockfile=requirements.txt:./requirements_lock_3_9.txt
2 changes: 1 addition & 1 deletion .github/workflows/pylint-presubmit.yml
@@ -38,7 +38,7 @@ jobs:
run: |
echo Changed files: ${{ steps.get_file_changes.outputs.files }}
- name: Set up Python 3.9
uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5.1.0
uses: actions/setup-python@f677139bbe7f9c59b41e40162b753c062f5d49a3 # v5.2.0
with:
python-version: "3.9"
- name: Install Python dependencies
6 changes: 3 additions & 3 deletions .github/workflows/scorecards-analysis.yml
@@ -46,7 +46,7 @@ jobs:
persist-credentials: false

- name: "Run analysis"
uses: ossf/scorecard-action@dc50aa9510b46c811795eb24b2f1ba02a914e534 # v2.3.3
uses: ossf/scorecard-action@62b2cac7ed8198b15735ed49ab1e5cf35480ba46 # v2.4.0
with:
results_file: results.sarif
results_format: sarif
@@ -55,7 +55,7 @@ jobs:
# Upload the results as artifacts (optional). Commenting out will disable uploads of run results in SARIF
# format to the repository Actions tab.
- name: "Upload artifact"
uses: actions/upload-artifact@65462800fd760344b1a7b4382951275a0abb4808 # v4.3.3
uses: actions/upload-artifact@50769540e7f4bd5e21e526ee35c689e35e0d6874 # v4.4.0
with:
name: SARIF file
path: results.sarif
@@ -64,6 +64,6 @@ jobs:
# Upload the results to GitHub's code scanning dashboard (optional).
# Commenting out will disable upload of results to your repo's Code Scanning dashboard
- name: "Upload to code-scanning"
uses: github/codeql-action/upload-sarif@b611370bb5703a7efb587f9d136a52ea24c5c38c # v3.25.11
uses: github/codeql-action/upload-sarif@4dd16135b69a43b6c8efb853346f8437d92d3c93 # v3.26.6
with:
sarif_file: results.sarif
8 changes: 4 additions & 4 deletions .github/workflows/sigbuild-docker-branch.yml
@@ -43,16 +43,16 @@ jobs:
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@d70bba72b1f3fd22344832f00baa16ece964efeb # v3.3.0
uses: docker/setup-buildx-action@988b5a0280414f521da01fcc63a27aeeb4b104db # v3.6.1
-
name: Login to DockerHub
uses: docker/login-action@0d4c9c5ea7693da7b068278f7b52bda2a190a446 # v3.2.0
uses: docker/login-action@9780b0c442fbb1117ed29e0efdff1e18412f7567 # v3.3.0
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
-
name: Login to GCR
uses: docker/login-action@0d4c9c5ea7693da7b068278f7b52bda2a190a446 # v3.2.0
uses: docker/login-action@9780b0c442fbb1117ed29e0efdff1e18412f7567 # v3.3.0
with:
registry: gcr.io
username: _json_key
@@ -67,7 +67,7 @@ jobs:
-
name: Build and push
id: docker_build
uses: docker/build-push-action@15560696de535e4014efeff63c48f16952e52dd1 # v6.2.0
uses: docker/build-push-action@5cd11c3a4ced054e52742c5fd54dca954e0edd85 # v6.7.0
with:
push: true
context: ./tensorflow/tools/tf_sig_build_dockerfiles
16 changes: 13 additions & 3 deletions .github/workflows/sigbuild-docker-presubmit.yml
@@ -47,15 +47,24 @@ jobs:
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@d70bba72b1f3fd22344832f00baa16ece964efeb # v3.3.0
uses: docker/setup-buildx-action@988b5a0280414f521da01fcc63a27aeeb4b104db # v3.6.1
-
name: Login to GCR
if: contains(github.event.pull_request.labels.*.name, 'build and push to gcr.io for staging')
uses: docker/login-action@0d4c9c5ea7693da7b068278f7b52bda2a190a446 # v3.2.0
uses: docker/login-action@9780b0c442fbb1117ed29e0efdff1e18412f7567 # v3.3.0
with:
registry: gcr.io
username: _json_key
password: ${{ secrets.GCP_CREDS }}
-
name: Login to AR
# Once this is verified, change the label's name. For now, we will piggyback on gcr.io actions.
if: contains(github.event.pull_request.labels.*.name, 'build and push to gcr.io for staging')
uses: docker/login-action@9780b0c442fbb1117ed29e0efdff1e18412f7567 # v3.3.0
with:
registry: us-central1-docker.pkg.dev
username: _json_key
password: ${{ secrets.GCP_CREDS }}
-
name: Grab the date to do cache busting (assumes same day OK to keep)
run: |
@@ -64,7 +73,7 @@ jobs:
-
name: Build containers, and push to GCR only if the 'build and push to gcr.io for staging' label is applied
id: docker_build
uses: docker/build-push-action@15560696de535e4014efeff63c48f16952e52dd1 # v6.2.0
uses: docker/build-push-action@5cd11c3a4ced054e52742c5fd54dca954e0edd85 # v6.7.0
with:
push: ${{ contains(github.event.pull_request.labels.*.name, 'build and push to gcr.io for staging') }}
context: ./tensorflow/tools/tf_sig_build_dockerfiles
@@ -74,6 +83,7 @@ jobs:
CACHEBUSTER=${{ steps.date.outputs.DATE }}
tags: |
gcr.io/tensorflow-sigs/build:${{ github.event.number }}-${{ matrix.python-version }}
us-central1-docker.pkg.dev/tensorflow-sigs/tensorflow/build:${{ github.event.number }}-${{ matrix.python-version }}
cache-from: |
type=registry,ref=tensorflow/build:latest-${{ matrix.python-version }}
type=registry,ref=gcr.io/tensorflow-sigs/build:${{ github.event.number }}-${{ matrix.python-version }}
18 changes: 14 additions & 4 deletions .github/workflows/sigbuild-docker.yml
@@ -46,20 +46,28 @@ jobs:
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@d70bba72b1f3fd22344832f00baa16ece964efeb # v3.3.0
uses: docker/setup-buildx-action@988b5a0280414f521da01fcc63a27aeeb4b104db # v3.6.1
-
name: Login to DockerHub
uses: docker/login-action@0d4c9c5ea7693da7b068278f7b52bda2a190a446 # v3.2.0
uses: docker/login-action@9780b0c442fbb1117ed29e0efdff1e18412f7567 # v3.3.0
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
-
name: Login to GCR
uses: docker/login-action@0d4c9c5ea7693da7b068278f7b52bda2a190a446 # v3.2.0
uses: docker/login-action@9780b0c442fbb1117ed29e0efdff1e18412f7567 # v3.3.0
with:
registry: gcr.io
username: _json_key
password: ${{ secrets.GCP_CREDS }}
-
name: Login to AR
# Once this is verified, removed gcr.io actions.
uses: docker/login-action@9780b0c442fbb1117ed29e0efdff1e18412f7567 # v3.3.0
with:
registry: us-central1-docker.pkg.dev
username: _json_key
password: ${{ secrets.GCP_CREDS }}
-
name: Grab the upcoming TF version to tag this container
run: |
@@ -74,7 +82,7 @@ jobs:
-
name: Build and push
id: docker_build
uses: docker/build-push-action@15560696de535e4014efeff63c48f16952e52dd1 # v6.2.0
uses: docker/build-push-action@5cd11c3a4ced054e52742c5fd54dca954e0edd85 # v6.7.0
with:
push: true
context: ./tensorflow/tools/tf_sig_build_dockerfiles
@@ -87,6 +95,8 @@ jobs:
tensorflow/build:${{ steps.tf-version.outputs.TF_VERSION }}-${{ matrix.python-version }}
gcr.io/tensorflow-sigs/build:latest-${{ matrix.python-version }}
gcr.io/tensorflow-sigs/build:${{ steps.tf-version.outputs.TF_VERSION }}-${{ matrix.python-version }}
us-central1-docker.pkg.dev/tensorflow-sigs/tensorflow/build:latest-${{ matrix.python-version }}
us-central1-docker.pkg.dev/tensorflow-sigs/tensorflow/build:${{ steps.tf-version.outputs.TF_VERSION }}-${{ matrix.python-version }}
cache-from: type=registry,ref=tensorflow/build:latest-${{ matrix.python-version }}
cache-to: type=inline
-
20 changes: 14 additions & 6 deletions CONTRIBUTING.md
@@ -253,13 +253,21 @@ There are two ways to run TensorFlow unit tests.
export flags="--config=opt -k"
```

If the tests are to be run on the GPU, add CUDA paths to LD_LIBRARY_PATH and
add the `cuda` option flag
If the tests are to be run on the GPU:
* For TensorFlow versions starting from v.2.18.0:
Add the `cuda` option flag.

```bash
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH"
export flags="--config=opt --config=cuda -k"
```
```bash
export flags="--config=opt --config=cuda -k"
```

* For TensorFlow versions prior to v.2.18.0:
Add CUDA paths to LD_LIBRARY_PATH and add the `cuda` option flag.

```bash
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH"
export flags="--config=opt --config=cuda -k"
```

For example, to run all tests under tensorflow/python, do:

42 changes: 42 additions & 0 deletions RELEASE.md
@@ -9,6 +9,26 @@
* <DOCUMENT BREAKING CHANGES HERE>
* <THIS SECTION SHOULD CONTAIN API, ABI AND BEHAVIORAL BREAKING CHANGES>

* `tf.lite`
* C API:
* An optional, fourth parameter was added to `TfLiteOperatorCreate` as a step
forward towards a cleaner API for `TfLiteOperator`. Function
`TfLiteOperatorCreate` was added recently, in TensorFlow Lite version 2.17.0,
released on 7/11/2024, and we do not expect there will be much code using this
function yet. Any code breakages can be easily resolved by passing nullptr as
the new, 4th parameter.
* SignatureRunner is now supported for models with no signatures.

* TensorRT support is disabled in CUDA builds for code health improvement.

* Hermetic CUDA support is added.

Hermetic CUDA uses a specific downloadable version of CUDA instead of the
user’s locally installed CUDA. Bazel will download CUDA, CUDNN and NCCL
distributions, and then use CUDA libraries and tools as dependencies in
various Bazel targets. This enables more reproducible builds for Google ML
projects and supported CUDA versions.

### Known Caveats

* <CAVEATS REGARDING THE RELEASE (BUT NOT BREAKING CHANGES).>
@@ -20,6 +40,12 @@
* <INSERT MAJOR FEATURE HERE, USING MARKDOWN SYNTAX>
* <IF RELEASE CONTAINS MULTIPLE FEATURES FROM SAME AREA, GROUP THEM TOGETHER>

* `tf.lite`:
* The LiteRT [repo](https://github.com/google-ai-edge/LiteRT) is
live (see [announcement](https://developers.googleblog.com/en/tensorflow-lite-is-now-litert/)),
which means that in the coming months there will be changes to the
development experience for TFLite. The TF Lite Runtime source will be moved
later this year, and sometime after that we will start accepting
contributions through that repo.

### Bug Fixes and Other Changes

* <SIMILAR TO ABOVE SECTION, BUT FOR OTHER IMPORTANT CHANGES / BUG FIXES>
@@ -31,10 +57,26 @@
should run synchronously, as opposed to being parallelizable when
`options.experimental_optimization.map_parallelization=True`. This saves
memory compared to setting `num_parallel_calls=1`.
* Add optional `use_unbounded_threadpool` argument to `map`, to specify that
the `map` should use an unbounded threadpool instead of the default pool
that is based on the number of cores on the machine. This can improve
throughput for map functions which perform IO or otherwise release the
CPU.
* Add [`tf.data.experimental.get_model_proto`](https://www.tensorflow.org/api_docs/python/tf/data/experimental/get_model_proto)
to allow users to peek into the analytical model inside of a dataset
iterator.

* `tf.lite`
* `Dequantize` op supports `TensorType_INT4`.
* This change includes per-channel dequantization.
* Add support for `stablehlo.composite`.
* `EmbeddingLookup` op supports per-channel
quantization and `TensorType_INT4` values.
* `FullyConnected` op supports `TensorType_INT16` activation and
`TensorType_INT4` weight per-channel quantization.

* `tf.tensor_scatter_update`, `tf.tensor_scatter_add`, and other reduce types.
* Support `bad_indices_policy`.

## Keras

47 changes: 47 additions & 0 deletions WORKSPACE
@@ -64,3 +64,50 @@ tf_workspace1()
load("@//tensorflow:workspace0.bzl", "tf_workspace0")

tf_workspace0()

load(
"@local_tsl//third_party/gpus/cuda/hermetic:cuda_json_init_repository.bzl",
"cuda_json_init_repository",
)

cuda_json_init_repository()

load(
"@cuda_redist_json//:distributions.bzl",
"CUDA_REDISTRIBUTIONS",
"CUDNN_REDISTRIBUTIONS",
)
load(
"@local_tsl//third_party/gpus/cuda/hermetic:cuda_redist_init_repositories.bzl",
"cuda_redist_init_repositories",
"cudnn_redist_init_repository",
)

cuda_redist_init_repositories(
cuda_redistributions = CUDA_REDISTRIBUTIONS,
)

cudnn_redist_init_repository(
cudnn_redistributions = CUDNN_REDISTRIBUTIONS,
)

load(
"@local_tsl//third_party/gpus/cuda/hermetic:cuda_configure.bzl",
"cuda_configure",
)

cuda_configure(name = "local_config_cuda")

load(
"@local_tsl//third_party/nccl/hermetic:nccl_redist_init_repository.bzl",
"nccl_redist_init_repository",
)

nccl_redist_init_repository()

load(
"@local_tsl//third_party/nccl/hermetic:nccl_configure.bzl",
"nccl_configure",
)

nccl_configure(name = "local_config_nccl")