Skip to content

Conversation

eaplatanios
Copy link

@eaplatanios eaplatanios commented Apr 25, 2019

This PR is part of the following "PR family":

This set of PRs splits the current TensorFlow module of stdlib in two parts:

  • TensorFlowCore: Contains the core operations related to the interaction between TensorFlow and the Swift compiler (i.e., compiler runtime, tensor handle, tensor shape, and tensor data types).
  • TensorFlow: Contains all APIs defined by Swift for TensorFlow. This includes all functions that wrap TensorFlow ops, as well as the contents of the tensorflow/swift-bindings and tensorflow/swift-apis repositories.

This results in the main benefit that all compiler logic remains part of the Swift compiler repository, while all API-related code lies in the tensorflow/swift-apis repository. The separation to TensorFlowCore and TensorFlow further allows all code in the APIs repository to be independently compiled and tested, without needing to be part of the compiler repository (which is the current case).

Alternatives Considered

  1. Just moving parts of the code to tensorflow/swift-apis results in the code in that repository not being independently compilable and testable. It can only be compiled while compiling the Swift stdlib.
  2. Moving everything to tensorflow/swift-apis has the main disadvantage that whenever changes are made to the compiler, changes will also have to be made to the APIs repository. Ideally, we would like to reduce that to the minimum and I believe that the current approach achieves that.

Current Status

The following tests currently fail on my machine:

  • TensorFlow/integration.swift
  • TensorFlow/retain_release.swift

I had to keep a few uses of @inline(__always) in tensorflow/swift-apis in order to make the partitioning tests pass.

Questions

  • Should we remove the partitioning tests and clean up code so that @inline(__always) is not used anywhere? The main issue these tests cause is that we cannot currently switch to using eager mode for the raw op bindings. More generally, we cannot currently move away from using #tfop as many of these tests rely on that.
  • It looks like many parts of CompilerRuntime.swift are outdated / have become redundant now that GPE is not supported. Should we clean that up at some point?

Comments

This set of PRs also includes the changes introduced in:

@rxwei @dan-zheng @pschuh

@rxwei
Copy link
Contributor

rxwei commented Apr 26, 2019

This is where a @differentiable attribute gets deserialized from another module: https://github.com/apple/swift/blob/57f8cf549dce88ad3e5c4b150a19b8b7cf8d60db/lib/Serialization/Deserialization.cpp#L2579.
I haven't found anything wrong here yet.

One hack you can try is to skip the ones whose parameter indices are nullptr here:
https://github.com/apple/swift/blob/57f8cf549dce88ad3e5c4b150a19b8b7cf8d60db/lib/SIL/SILFunctionBuilder.cpp#L81

@eaplatanios
Copy link
Author

This is where a @differentiable attribute gets deserialized from another module:

swift/lib/Serialization/Deserialization.cpp
Line 2579 in 57f8cf5
auto *indices = AutoDiffParameterIndices::get(parametersBitVector, ctx);
I haven't found anything wrong here yet.

I looked into this earlier today but it didn't seem to be called at all before the error is thrown.

One hack you can try is to skip the ones whose parameter indices are nullptr here:

swift/lib/SIL/SILFunctionBuilder.cpp
Line 81 in 57f8cf5
for (auto *A : Attrs.getAttributes()) {

I actually considered that but wasn't sure what the implications are. Could that break auto-diff?

@rxwei
Copy link
Contributor

rxwei commented Apr 26, 2019

Some of my earlier debugging told me that addFunctionAttributes is actually called twice, so it should be okay. I still don't understand this code path well enough.

@eaplatanios
Copy link
Author

Ok I am trying that now. Also, I was wondering why we need the DifferentialOperators.swift file in swift-apis (it's the Gradients.swift file in master). The functions it defines are already defined in AutoDiff.swift.

@rxwei
Copy link
Contributor

rxwei commented Apr 26, 2019

The functions it defines are already defined in AutoDiff.swift.

Not quite. The standard library AutoDiff.swift file defines general differential operators, which include gradient APIs that are applicable to functions that return a FloatingPoint scalar for mathematical correctness (gradient is only defined on functions that return a scalar). However, Tensor does not conform to FloatingPoint because it's not always a scalar. So we define a new set of differential operators that take functions returning a floating point Tensor, and assume (assert) that the function being differentiated returns a scalar tensor.

Maybe Tensor should conform to FloatingPoint (and assert in certain FloatingPoint requirements that the tensor is rank-0) at some point.

@eaplatanios
Copy link
Author

Btw, the hacky fix works and now all tests defined using XCTest in swift-apis pass when compiled independently using the S4TF compiler. :)

I'll commit the fixes by tomorrow noon. It all seems to work now locally, including building and using the toolchain.

@eaplatanios
Copy link
Author

@rxwei All changes are pushed now in this PR and in tensorflow/swift-apis#109 . All tests pass on my machine, both for the Swift compiler and for swift-apis by running swift test using the Swift PM. Also, I have verified that the toolchain builds and that I can use everything normally by doing import TensorFlow.

@rxwei
Copy link
Contributor

rxwei commented Apr 26, 2019

That's awesome!

@eaplatanios
Copy link
Author

I think the best way to go about this PR is to first review and merge tensorflow/swift-apis#109 and tensorflow/swift-bindings#26 and then I can update the checkout commits here so that all tests can be run on the CI server.

@eaplatanios
Copy link
Author

eaplatanios commented Apr 28, 2019

@rxwei After making the changes we discussed I run into issues with the following tests:

  • TensorFlow/integration.swift
  • TensorFlow/retain_release.swift

I spent some time trying to debug them but didn't figure out a workaround. I just pushed all changes here and in the paired swift-apis PR. Could you please take a look when you get a chance?

@rxwei
Copy link
Contributor

rxwei commented Apr 28, 2019

What are the issues you ran into? Could you paste the test log into a gist and share it?

@eaplatanios
Copy link
Author

@rxwei there’s a few errors about the SIL not being what expected. The log is really big due to printing the whole SIL I think. Do you want me to share all of it or is there an easy way to find the relevant parts?

@eaplatanios
Copy link
Author

@rxwei
Copy link
Contributor

rxwei commented Apr 28, 2019

Thanks! I see that those GPE tests are very brittle. Our team is starting a discussion about what to do with GPE this week. How about putting this on hold and discussing it in Wednesday's meeting?

@eaplatanios
Copy link
Author

Sounds good. I believe it would be very useful to decide where GPE fits in, in the current design.

@eaplatanios
Copy link
Author

@rxwei Given the newer #24452 I believe we can close this PR.

@rxwei
Copy link
Contributor

rxwei commented May 3, 2019

Sounds good.

@rxwei rxwei closed this May 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants