Upgrade TensorFlow to eaacee173897b77cdb6afd22d5e78154177a10f3 #363

lgeiger · 2020-05-13T18:19:59Z

What do these changes do?

This PR upgrades the TensorFlow dependency in order to include eaacee173897b77cdb6afd22d5e78154177a10f3.

Ruy heavily changed its internal API, which requires quite a few code changes on our side. A full diff of the changes in ruy can be seen here. This is a first stab at adapting our kernels to the API changes.

How Has This Been Tested?

CI

Related issue number

This update is required to enable easy int8 benchmarking in #357.

lgeiger · 2020-05-13T18:23:19Z

larq_compute_engine/core/bgemm_impl_ruy.h

-        transposed_lhs, rhs, mul_params, ruy_context, &dst, bgemm_runtime_path,
-        &binary_trmul_params);
-
-    // pre-pack the lhs and rhs matrices
-    ruy::PrepackedMatrix prepacked_lhs;
-    ruy::PrepackedMatrix prepacked_rhs;
-    ruy::SidePair<ruy::PrepackedMatrix*> prepacked(&prepacked_lhs,
-                                                   &prepacked_rhs);
-
-    const ruy::SidePair<int> origin{0, 0};
-    const ruy::SidePair<int> rounded_dims{
-        binary_trmul_params.packed[ruy::Side::kLhs].layout.cols,
-        binary_trmul_params.packed[ruy::Side::kRhs].layout.cols};
-
-    ruy::Tuning tuning = ContextInternal::GetMainThreadTuning(ruy_context);
-    for (ruy::Side side : {ruy::Side::kLhs, ruy::Side::kRhs}) {
-      if (prepacked[side]) {
-        prepacked[side]->data_size = DataSize(binary_trmul_params.packed[side]);
-        prepacked[side]->sums_size = SumsSize(binary_trmul_params.packed[side]);
-        prepacked[side]->data = alloc_fn(prepacked[side]->data_size);
-        prepacked[side]->sums = alloc_fn(prepacked[side]->sums_size);
-        binary_trmul_params.packed[side].data = prepacked[side]->data;
-        binary_trmul_params.packed[side].sums = prepacked[side]->sums;
-        binary_trmul_params.RunPack(side, tuning, origin[side],
-                                    rounded_dims[side]);
-        binary_trmul_params.is_prepacked[side] = true;
-      }
-    }
+        transposed_lhs, internal_rhs, mul_params, ruy_ctx, &internal_dst,
+        bgemm_runtime_path, &binary_trmul_params);

-    ruy::TrMul(&binary_trmul_params, ruy_context);
+    HandlePrepackedCaching(&binary_trmul_params, ruy_ctx);
+    ruy::TrMul(&binary_trmul_params, ruy_ctx);


⚠️ Review with care. I am not familiar with this code so I am not sure what these changes do, but for now at least they seem to compile.

/cc @Tombana @arashb

Here I was using RUY's advanced API, but it has been completely removed in this commit. I need to dedicate more time to figuring out how to replicate that.
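For readers unfamiliar with the pattern that the removed lines implemented (pack each operand once, then reuse the packed buffer), here is a minimal standalone sketch. It does not use ruy at all: `ToyPackCache`, `GetOrPack`, and the transpose-as-packing step are hypothetical names and simplifications chosen for illustration, and nothing here is meant to describe what `HandlePrepackedCaching` actually does internally.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical cache, NOT part of ruy: it maps the address of a source
// buffer to a packed copy of that buffer, so packing happens only once.
class ToyPackCache {
 public:
  // Returns the packed version of a row-major rows x cols matrix.
  // "Packing" is modelled here as a plain transpose to column-major order.
  const std::vector<float>& GetOrPack(const float* src, std::size_t rows,
                                      std::size_t cols) {
    auto it = cache_.find(src);
    if (it != cache_.end()) {
      return it->second;  // cache hit: reuse the earlier packing work
    }
    ++pack_count_;
    std::vector<float> packed(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
      for (std::size_t c = 0; c < cols; ++c) {
        packed[c * rows + r] = src[r * cols + c];
      }
    }
    return cache_.emplace(src, std::move(packed)).first->second;
  }

  // Number of times a buffer actually had to be packed.
  int pack_count() const { return pack_count_; }

 private:
  std::unordered_map<const float*, std::vector<float>> cache_;
  int pack_count_ = 0;
};
```

Keying on the source pointer only makes sense for buffers that never change, such as constant weights. That is an assumption of this sketch, not a statement about ruy's caching policy.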

Tombana · 2020-06-08T08:58:45Z

The is_mutable_ error turned out to be an issue with constness of the lhs and rhs matrices. In the tflite/ruy code, a non-const Matrix object was created, and then passed to the ruy::Mul function as a const Matrix. However, since we don't have that function call, our matrix was non-const. This is now fixed.
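To illustrate how the constness of an argument can silently change a recorded flag, here is a minimal sketch of a pointer wrapper with const and non-const overloads. `CheckedPtr` and its members are hypothetical and only loosely modelled on the `is_mutable_` name mentioned above; this is not ruy's actual implementation.

```cpp
// Hypothetical wrapper, NOT ruy's class: it remembers whether it was handed
// a pointer to const or to non-const data.
template <typename T>
class CheckedPtr {
 public:
  // Chosen by overload resolution when the argument is a non-const pointer.
  void set(T* ptr) {
    ptr_ = ptr;
    is_mutable_ = true;
  }
  // Chosen when the argument is a pointer to const.
  void set(const T* ptr) {
    ptr_ = ptr;
    is_mutable_ = false;
  }
  const T* get() const { return ptr_; }
  bool is_mutable() const { return is_mutable_; }

 private:
  const T* ptr_ = nullptr;
  bool is_mutable_ = false;
};

// Helper that wraps data through a pointer to const, mimicking a call path
// that receives its input as const.
template <typename T>
CheckedPtr<T> WrapAsConst(const T* data) {
  CheckedPtr<T> p;
  p.set(data);
  return p;
}
```

In this sketch, calling `set` directly on a plain `float` array records the data as mutable, while routing the same array through `WrapAsConst` records it as immutable. Any later check that depends on the flag therefore behaves differently depending on which call path was taken.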

Then a second error got thrown, because RUY removed the kReference path (now there's only kStandardCpp). For legacy reasons, they set kReference = kStandardCpp. We therefore can simply remove the check for not doing the reference path.
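The effect of aliasing one enumerator to another can be sketched as follows. `ToyPath` and its numeric values are hypothetical and chosen only for illustration; they are not copied from ruy's `Path` enum.

```cpp
#include <cstdint>

// Hypothetical enum, NOT ruy's Path: kReference is kept only as a legacy
// alias and has the same value as kStandardCpp.
enum class ToyPath : std::uint8_t {
  kNone = 0,
  kStandardCpp = 0x1,
  kReference = kStandardCpp,  // legacy alias, indistinguishable at runtime
  kNeon = 0x2,
};

// A guard written for the old API, when kReference was a separate path.
// With the alias in place it also rejects kStandardCpp.
inline bool IsAllowedByOldCheck(ToyPath path) {
  return path != ToyPath::kReference;
}

static_assert(ToyPath::kReference == ToyPath::kStandardCpp,
              "the alias and the standard path compare equal");
```

Under these assumptions a "not the reference path" check can no longer tell the two names apart, which is consistent with simply removing the check rather than trying to adapt it.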

Tombana · 2020-06-08T10:26:16Z

Rebased on master.

Tombana · 2020-06-08T12:23:56Z

Just benchmarked this branch on the pixel. Quicknet got 18.0 ms so there is no performance degradation (just to confirm we are not messing up the ruy prepacking or caching or anything). From my side it can be merged.

lgeiger · 2020-06-08T14:24:05Z

> Just benchmarked this branch on the pixel. Quicknet got 18.0 ms so there is no performance degradation (just to confirm we are not messing up the ruy prepacking or caching or anything). From my side it can be merged.

Thanks for integrating all the changes and fixing the PR.

arashb

LGTM 👍

* Split weight-writer function into a process and interleave function
* Move ProcessWeights function to its own file and make it layer-agnostic
* Add extra bconv test with different filter count
* Fix minor bug in output-transform MLIR pass
* Move weight-processing to an MLIR prepare pass
* Add process-weights step to conv unittest
* Add new flip-weights function for the new filter layout
* Bump Ikva tflite version number
* Update Ikva model example to v0.3 with the latest converter
* Add utility to re-generate model_data_example.cc if needed
* Update lce-utils requirement for a new compute-engine release today

lgeiger added the internal-improvement Internal Improvements and Maintenance label May 13, 2020

lgeiger requested a review from a team May 13, 2020 18:19

lgeiger commented May 13, 2020

lgeiger marked this pull request as draft May 13, 2020 18:23

lgeiger force-pushed the tf-nightly-int8-qat-upgrade branch from f96d7c7 to 8b624bb Compare May 14, 2020 17:12

lgeiger changed the base branch from tf-nightly to master May 14, 2020 17:12

lgeiger added the help wanted Extra attention is needed label May 14, 2020

lgeiger force-pushed the tf-nightly-int8-qat-upgrade branch from 8b624bb to fef1bc8 Compare May 19, 2020 14:11

lgeiger mentioned this pull request May 19, 2020

Allow to set fake default ranges to enable latency test of int8 models #357

Merged

AdamHillier added blocked Relies on something else being done first and removed blocked Relies on something else being done first labels Jun 5, 2020

lgeiger and others added 4 commits June 8, 2020 12:17

⬆️ tensorflow@eaacee173897b77cdb6afd22d5e78154177a10f3

d35bd5c

Fix updated RUY code

49223c7

Fix aarch64 bazel build

b07e889

Update submodule

1562a38

lgeiger marked this pull request as ready for review June 8, 2020 10:22

lgeiger removed the help wanted Extra attention is needed label Jun 8, 2020

lgeiger requested a review from arashb June 8, 2020 10:22

Tombana force-pushed the tf-nightly-int8-qat-upgrade branch from 38a0226 to 1562a38 Compare June 8, 2020 10:26

Tombana approved these changes Jun 8, 2020

arashb approved these changes Jun 8, 2020

Tombana merged commit f374319 into master Jun 8, 2020

Tombana deleted the tf-nightly-int8-qat-upgrade branch June 8, 2020 15:15

lgeiger added a commit that referenced this pull request Jun 8, 2020

Remove broken include

dbcba3f

This fixes build problems after merging #363

lgeiger added a commit that referenced this pull request Jun 9, 2020

Remove broken include

2f6d563

This fixes build problems after merging #363