Only call the WriteAnalysisCacheToFS PTransform in Transform if the input dictionary isn't empty.

PiperOrigin-RevId: 339255496
tfx-copybara committed Oct 27, 2020
1 parent 4b1937e commit e6c3d05
Showing 2 changed files with 11 additions and 6 deletions.
2 changes: 2 additions & 0 deletions RELEASE.md
@@ -37,6 +37,8 @@
     tfx/examples.
 * Fixed the run_component script.
 * Stopped depending on `WTForms`.
+* Fixed an issue with Transform cache and beam 2.24-2.25 in an interactive
+  notebook that caused it to fail.

 ### For pipeline authors

15 changes: 9 additions & 6 deletions tfx/components/transform/executor.py
@@ -1125,12 +1125,15 @@ def _RunBeamImpl(self, analyze_data_list: List[_Dataset],
                 full_span_cache_dir,
                 os.path.join(output_cache_dir, span_cache_dir.key))

-          (cache_output
-           | 'WriteCache' >> analyzer_cache.WriteAnalysisCacheToFS(
-               pipeline=pipeline,
-               cache_base_dir=output_cache_dir,
-               sink=self._GetCacheSink(),
-               dataset_keys=full_analyze_dataset_keys_list))
+          # TODO(b/157479287, b/171165988): Remove this condition when beam 2.26
+          # is used.
+          if cache_output:
+            (cache_output
+             | 'WriteCache' >> analyzer_cache.WriteAnalysisCacheToFS(
+                 pipeline=pipeline,
+                 cache_base_dir=output_cache_dir,
+                 sink=self._GetCacheSink(),
+                 dataset_keys=full_analyze_dataset_keys_list))

         if compute_statistics or materialization_format is not None:
           # Do not compute pre-transform stats if the input format is raw proto,
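
The guard added in executor.py relies on an empty Python dict being falsy, so the write step is skipped when there is nothing to write. A minimal standalone sketch of that pattern follows. It does not use Beam or TFX: `write_cache` and `maybe_write_cache` are hypothetical names, and the stand-in's choice to raise on empty input is an assumption made purely for illustration, not a description of how `WriteAnalysisCacheToFS` behaves.

```python
from typing import Dict, List


def write_cache(cache_output: Dict[str, List[int]], written: List[str]) -> None:
    """Hypothetical stand-in for a write step that rejects empty input.

    Assumption for illustration only: the real transform's failure mode is
    not reproduced here.
    """
    if not cache_output:
        raise ValueError("nothing to write")
    # Record each dataset key in a deterministic order.
    for key in sorted(cache_output):
        written.append(key)


def maybe_write_cache(cache_output: Dict[str, List[int]],
                      written: List[str]) -> bool:
    """Calls write_cache only for a non-empty dict; returns whether it ran."""
    if cache_output:  # An empty dict is falsy, so the write is skipped.
        write_cache(cache_output, written)
        return True
    return False


if __name__ == "__main__":
    log: List[str] = []
    print(maybe_write_cache({}, log))               # write skipped
    print(maybe_write_cache({"span-0": [1]}, log))  # write performed
    print(log)
```

Note that `if cache_output:` also skips the write when the value is `None`, which is a side effect of using truthiness rather than an explicit length check.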