[TF] TensorFlow/TensorFlowCore Refactoring #24261

eaplatanios · 2019-04-25T01:33:57Z

This PR is part of the following "PR family":

This set of PRs splits the current TensorFlow module of the stdlib into two parts:

TensorFlowCore: Contains the core operations related to the interaction between TensorFlow and the Swift compiler (i.e., compiler runtime, tensor handle, tensor shape, and tensor data types).
TensorFlow: Contains all APIs defined by Swift for TensorFlow. This includes all functions that wrap TensorFlow ops, as well as the contents of the tensorflow/swift-bindings and tensorflow/swift-apis repositories.

The main benefit is that all compiler logic remains in the Swift compiler repository, while all API-related code lives in the tensorflow/swift-apis repository. Separating TensorFlowCore from TensorFlow also allows all code in the APIs repository to be compiled and tested independently, without being part of the compiler repository (as is currently the case).

Alternatives Considered

Just moving parts of the code to tensorflow/swift-apis results in the code in that repository not being independently compilable and testable. It can only be compiled while compiling the Swift stdlib.
Moving everything to tensorflow/swift-apis has the main disadvantage that whenever the compiler changes, the APIs repository has to change as well. Ideally, we would like to keep such coupled changes to a minimum, and I believe the current approach achieves that.

Current Status

The following tests currently fail on my machine:

TensorFlow/integration.swift
TensorFlow/retain_release.swift

I had to keep a few uses of @inline(__always) in tensorflow/swift-apis in order to make the partitioning tests pass.

Questions

Should we remove the partitioning tests and clean up code so that @inline(__always) is not used anywhere? The main issue these tests cause is that we cannot currently switch to using eager mode for the raw op bindings. More generally, we cannot currently move away from using #tfop as many of these tests rely on that.
It looks like many parts of CompilerRuntime.swift are outdated / have become redundant now that GPE is not supported. Should we clean that up at some point?

Comments

This set of PRs also includes the changes introduced in:

@rxwei @dan-zheng @pschuh

rxwei · 2019-04-26T05:51:08Z

This is where a @differentiable attribute gets deserialized from another module: https://github.com/apple/swift/blob/57f8cf549dce88ad3e5c4b150a19b8b7cf8d60db/lib/Serialization/Deserialization.cpp#L2579.
I haven't found anything wrong here yet.

One hack you can try is to skip the ones whose parameter indices are nullptr here:
https://github.com/apple/swift/blob/57f8cf549dce88ad3e5c4b150a19b8b7cf8d60db/lib/SIL/SILFunctionBuilder.cpp#L81

eaplatanios · 2019-04-26T06:02:42Z

> This is where a @differentiable attribute gets deserialized from another module:
>
> swift/lib/Serialization/Deserialization.cpp, line 2579 in 57f8cf5:
> auto *indices = AutoDiffParameterIndices::get(parametersBitVector, ctx);
>
> I haven't found anything wrong here yet.

I looked into this earlier today, but it didn't seem to be called at all before the error was thrown.

> One hack you can try is to skip the ones whose parameter indices are nullptr here:
>
> swift/lib/SIL/SILFunctionBuilder.cpp, line 81 in 57f8cf5:
> for (auto *A : Attrs.getAttributes()) {

I actually considered that but wasn't sure what the implications are. Could that break auto-diff?

rxwei · 2019-04-26T06:06:15Z

Some of my earlier debugging told me that addFunctionAttributes is actually called twice, so it should be okay. I still don't understand this code path well enough.

eaplatanios · 2019-04-26T06:23:08Z

Ok I am trying that now. Also, I was wondering why we need the DifferentialOperators.swift file in swift-apis (it's the Gradients.swift file in master). The functions it defines are already defined in AutoDiff.swift.

rxwei · 2019-04-26T06:30:31Z

> The functions it defines are already defined in AutoDiff.swift.

Not quite. The standard library AutoDiff.swift file defines general differential operators, which include gradient APIs that are applicable to functions that return a FloatingPoint scalar for mathematical correctness (gradient is only defined on functions that return a scalar). However, Tensor does not conform to FloatingPoint because it's not always a scalar. So we define a new set of differential operators that take functions returning a floating point Tensor, and assume (assert) that the function being differentiated returns a scalar tensor.

Maybe Tensor should conform to FloatingPoint (and assert in certain FloatingPoint requirements that the tensor is rank-0) at some point.
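To illustrate the "assume (assert) that the function returns a scalar tensor" idea, here is a minimal sketch in plain Swift. Everything in it is hypothetical: `ToyTensor` and `toyGradient` are stand-ins invented for this example, not the real `Tensor` type or the operators in DifferentialOperators.swift, and the derivative is approximated with central finite differences rather than automatic differentiation.

```swift
// Hypothetical stand-in for a floating-point tensor; NOT the real TensorFlow `Tensor`.
struct ToyTensor {
    var shape: [Int]
    var scalars: [Double]

    init(shape: [Int], scalars: [Double]) {
        precondition(shape.reduce(1, *) == scalars.count,
                     "Shape and scalar count must agree.")
        self.shape = shape
        self.scalars = scalars
    }

    /// Creates a rank-0 (scalar) tensor.
    init(_ value: Double) {
        self.init(shape: [], scalars: [value])
    }

    var rank: Int { shape.count }
    var isScalar: Bool { rank == 0 }
}

/// Hypothetical gradient operator for functions that return a tensor.
/// It checks at run time that the result is rank-0, because a gradient is
/// only defined for scalar-valued functions. Finite differences are used
/// here purely to keep the sketch self-contained.
func toyGradient(
    at x: ToyTensor,
    epsilon: Double = 1e-6,
    in f: (ToyTensor) -> ToyTensor
) -> ToyTensor {
    precondition(f(x).isScalar,
                 "The function being differentiated must return a scalar tensor.")
    var result = ToyTensor(
        shape: x.shape,
        scalars: Array(repeating: 0, count: x.scalars.count))
    for i in x.scalars.indices {
        var plus = x
        var minus = x
        plus.scalars[i] += epsilon
        minus.scalars[i] -= epsilon
        // Central difference approximation of the i-th partial derivative.
        result.scalars[i] = (f(plus).scalars[0] - f(minus).scalars[0]) / (2 * epsilon)
    }
    return result
}

// Usage: f(x) = sum of squares, so the gradient is approximately 2x.
let x = ToyTensor(shape: [3], scalars: [1, 2, 3])
let g = toyGradient(at: x) { t in
    ToyTensor(t.scalars.reduce(0) { $0 + $1 * $1 })
}
```

Passing a function that returns a non-scalar tensor (for example, the identity function on `x`) would trip the precondition. That run-time check is the trade-off described above: it stands in for a static guarantee that a `FloatingPoint` conformance would otherwise provide.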

eaplatanios · 2019-04-26T06:47:08Z

Btw, the hacky fix works and now all tests defined using XCTest in swift-apis pass when compiled independently using the S4TF compiler. :)

I'll commit the fixes by tomorrow noon. It all seems to work now locally, including building and using the toolchain.

eaplatanios · 2019-04-26T07:04:04Z

@rxwei All changes are now pushed to this PR and to tensorflow/swift-apis#109. All tests pass on my machine, both for the Swift compiler and for swift-apis (by running swift test with the Swift Package Manager). I have also verified that the toolchain builds and that I can use everything normally by doing import TensorFlow.

rxwei · 2019-04-26T07:18:20Z

That's awesome!

eaplatanios · 2019-04-27T16:24:04Z

I think the best way to go about this PR is to first review and merge tensorflow/swift-apis#109 and tensorflow/swift-bindings#26 and then I can update the checkout commits here so that all tests can be run on the CI server.

eaplatanios · 2019-04-28T02:48:21Z

@rxwei After making the changes we discussed, I ran into issues with the following tests:

TensorFlow/integration.swift
TensorFlow/retain_release.swift

I spent some time trying to debug them but didn't figure out a workaround. I just pushed all changes here and in the paired swift-apis PR. Could you please take a look when you get a chance?

rxwei · 2019-04-28T11:08:59Z

What are the issues you ran into? Could you paste the test log into a gist and share it?

eaplatanios · 2019-04-28T13:00:30Z

@rxwei There are a few errors about the SIL not being what was expected. The log is very large, I think because it prints the whole SIL. Do you want me to share all of it, or is there an easy way to find the relevant parts?

eaplatanios · 2019-04-28T16:49:28Z

@rxwei Full gist is here: https://gist.github.com/eaplatanios/b9a8ff1a881ad70dbe5b7ae805a9f3d5

rxwei · 2019-04-28T20:38:30Z

Thanks! I see that those GPE tests are very brittle. Our team is starting a discussion about what to do with GPE this week. How about putting this on hold and discussing it in Wednesday's meeting?

eaplatanios · 2019-04-28T21:52:56Z

Sounds good. I believe it would be very useful to decide where GPE fits in, in the current design.

eaplatanios · 2019-05-03T20:07:56Z

@rxwei Given the newer #24452 I believe we can close this PR.

rxwei · 2019-05-03T20:55:04Z

Sounds good.

eaplatanios added 30 commits April 19, 2019 13:37

Moved a couple of tensor initializers to swift-apis.

de8fdbd

Moved the activation functions to swift-apis.

06b96c0

Minor edit.

dcd46be

Moved the log-softmax VJP to swift-apis.

7a957cc

Moved some tensor initializers to swift-apis.

75f7039

Moved some more stuff to swift-apis.

a897918

Moved some more stuff to swift-apis.

da184a8

Moved some more stuff to swift-apis.

cde450e

Removed the now-redundant 'Ops.swift' file.

a878600

Moved the gradient helper methods to swift-apis.

e790f78

Moved the tensor tests to swift-apis.

93041e0

Brought back the tensor API tests.

d988084

Added support for the TensorFlow op.

a238b6e

Bug fix.

d759e24

Bug fix.

944d7f6

Updated the swift-apis dependency.

12ec483

Merged upstream changes.

97d0ea8

Minor edit.

993c972

Added support for 'Dataset.repeated(count:)' since '#tfop' does not work currently from outside stdlib.

aa72c70

Added support for prefetched datasets.

4f8c2ca

Moved the dataset ops to swift-apis.

98e5704

Removed the now-redundant 'ArrayOps.swift' file.

df0ec40

Changes to support the new swift-bindings.

ce4dfd3

Updated the 'TensorArrayProtocol' and its automatic derivation implementation.

701e31d

Bug fixes.

a5fcdc2

Minor edits.

22fed17

Addressed Dan's comments regarding the 'TensorArrayProtocol' derived conformance.

2de2d2b

Minor bug fix.

93e335a

Minor edit.

6a79ac8

Enhancements to 'TensorArrayProtocol'.

5890133

Merge remote-tracking branch 'upstream/tensorflow' into swift-apis

bade22a

Bug fix related to auto-differentiation.

194b22b

eaplatanios added 5 commits April 26, 2019 10:22

Addressed comments by @rxwei.

dca11f5

Added a missing check.

681190f

Made '_vjpAdd' and '_vjpSubtract' a bit more efficient and removed the need for 'Tensor.unbroadcasted'.

c3c3305

Merge remote-tracking branch 'upstream/tensorflow' into swift-apis

7903e08

Bug fix.

63dca1a

Minor optimization.

0e20045

rxwei reviewed Apr 27, 2019

stdlib/public/TensorFlowCore/Tensor.swift (outdated)

Addressed Richard's feedback.

6dec2de

Merged upstream changes.

35bcbcf

Merged upstream changes.

23c3ad7

rxwei closed this May 3, 2019
