This repository has been archived by the owner on Mar 30, 2022. It is now read-only.

Land the first draft of the "Why *Swift* for TensorFlow?" doc.
lattner committed Apr 26, 2018
1 parent 12858e9 commit 028f245
Showing 4 changed files with 283 additions and 2 deletions.
6 changes: 5 additions & 1 deletion docs/GraphProgramExtraction.md
@@ -386,7 +386,9 @@ These warnings can be annoying when the copies are intentional, so the user can
This model is particularly helpful to production engineers who deploy code at scale, because they can upgrade the warning to an error, to make sure that no implicit copies creep in, even as the code continues to evolve.

Now that we have a robust approach for handling communication between the host program and the TensorFlow program, let’s return to the task of improving the programming model to something that is more user friendly and less primitive.

### Adding structures and tuples

As we discussed before, we can add any abstractions to our model as long as we have a provable way to eliminate them. Once they are eliminated, we can lower the code to graph using the approaches described above. In the case of structs and tuples, Swift is guaranteed to be able to scalarize them away, so long as the compiler can see the type definition. This is possible because we have no aliasing in our programming model.

This is great because it allows users to compose high level abstractions out of tensors and other values. For example, the compiler scalarizes this code:
@@ -518,12 +520,14 @@ fcl = DenseLayer_Float(inputSize: 28 * 28, outputSize: 10)
and then Swift will apply all the other destructuring transformations until we get to something that can be trivially transformed into a graph.

There is a lot more to go here, but this document is already too long, so we’ll avoid going case by case any further. One last important honorable mention is that Swift’s approach to “[Protocol-Oriented Programming](https://developer.apple.com/videos/play/wwdc2015/408/)” [[youtube](https://www.youtube.com/watch?v=g2LwFZatfTI)] allows many things traditionally expressed with OOP to be expressed in a purely static way through composition of structs using the mix-in behavior granted by default implementations of protocol requirements.
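
As a rough illustration of the mix-in behavior mentioned above, here is a minimal sketch in plain Swift. The names `Parameterized`, `DenseSketch`, and `BiasFreeSketch` are hypothetical and are not part of any TensorFlow API; they only show how a default implementation of a protocol requirement lets structs share behavior without a class hierarchy.

```swift
// Hypothetical protocol (not a Swift for TensorFlow API): anything that can
// report how many trainable parameters it holds.
protocol Parameterized {
    var inputSize: Int { get }
    var outputSize: Int { get }
    // Requirement that receives a default implementation below.
    func parameterCount() -> Int
}

extension Parameterized {
    // Default implementation acting as a mix-in: one weight per
    // input/output pair plus one bias per output.
    func parameterCount() -> Int {
        return inputSize * outputSize + outputSize
    }
}

// A struct that adopts the protocol and inherits the default behavior.
struct DenseSketch: Parameterized {
    let inputSize: Int
    let outputSize: Int
}

// A struct that customizes the requirement; no class or subclassing involved.
struct BiasFreeSketch: Parameterized {
    let inputSize: Int
    let outputSize: Int
    func parameterCount() -> Int {
        return inputSize * outputSize
    }
}

let dense = DenseSketch(inputSize: 28 * 28, outputSize: 10)
let biasFree = BiasFreeSketch(inputSize: 28 * 28, outputSize: 10)
print(dense.parameterCount())    // 7850
print(biasFree.parameterCount()) // 7840
```

Because both values have concrete struct types, the compiler knows at compile time which `parameterCount()` each call refers to, which is the kind of purely static composition the paragraph above describes.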

### Limitations of this approach: out of model language features

We’ve covered the static side of Swift extensively, but have completely neglected its dynamic side: classes, [existentials](https://wiki.haskell.org/Existential_type), and dynamic data structures built upon them like dictionaries and arrays. These are actually two different classes to consider. Let’s start with the dynamic types first:

Swift puts aggregate types into two categories: dynamic (classes and existentials) and static (struct, enum, and tuple). Because existential types (i.e., values whose static type is a protocol) could be implemented by a class, here we just describe the issues with classes. Classes in Swift are extremely dynamic: each method is dynamically dispatched via a vtable (in the case of a Swift object) or a message send (with a type deriving from `NSObject` or another Objective-C class on Apple platforms only). Furthermore, properties in classes can be overridden by derived classes, and pointers to instances of classes can be aliased in an unstructured way.
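
The difference between the two categories can be sketched in a few lines of plain Swift. The types `BoxedScale` and `InlineScale` are hypothetical names chosen for this illustration: a class reference can be aliased, so a write through one name is visible through another, while a struct is copied on assignment.

```swift
// Hypothetical reference type: two variables can refer to the same instance.
final class BoxedScale {
    var value: Float
    init(value: Float) { self.value = value }
}

// Hypothetical value type: assignment produces an independent copy.
struct InlineScale {
    var value: Float
}

let a = BoxedScale(value: 1.0)
let b = a            // b aliases the same object as a
b.value = 2.0        // the write is visible through a as well
print(a.value)       // 2.0

let c = InlineScale(value: 1.0)
var d = c            // d is a separate copy of c
d.value = 2.0        // c is unaffected
print(c.value)       // 1.0
```

In the struct case the compiler can reason about `c` and `d` independently, whereas in the class case it would have to prove that no other reference to the object exists before making the same assumption.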

-As it turns out, this is the same sort of situation you get in object oriented languages like Java, C#, Scala, and Objective-C: in full generality, class references cannot be analyzed (this is discussed in our [Why *Swift* for TensorFlow](https://docs.google.com/document/d/1G7D_gUhLV8AFXEO48n8-uXcDa3QOaeRnTq07DzKtCRE/edit#) document). A compiler can handle many common situations through heuristic-based analysis (using techniques like [interprocedural alias analysis](http://llvm.org/pubs/2005-05-04-LattnerPHDThesis.html) and [class hierarchy analysis](https://dl.acm.org/citation.cfm?id=679523)) but relying on these techniques as part of the programming model means that small changes to code can break the heuristics they depend on. This is an inherent result of relying on [Heroic optimizations](http://nondot.org/sabre/2012-04-02-CGOKeynote.pdf) as part of the user-visible behavior of the programming model.
+As it turns out, this is the same sort of situation you get in object oriented languages like Java, C#, Scala, and Objective-C: in full generality, class references cannot be analyzed (this is discussed in our [Why *Swift* for TensorFlow](WhySwiftForTensorFlow.md) document). A compiler can handle many common situations through heuristic-based analysis (using techniques like [interprocedural alias analysis](http://llvm.org/pubs/2005-05-04-LattnerPHDThesis.html) and [class hierarchy analysis](https://dl.acm.org/citation.cfm?id=679523)) but relying on these techniques as part of the programming model means that small changes to code can break the heuristics they depend on. This is an inherent result of relying on [Heroic optimizations](http://nondot.org/sabre/2012-04-02-CGOKeynote.pdf) as part of the user-visible behavior of the programming model.

Our feeling is that it isn’t acceptable to bake heuristics like these into the user-visible part of the programming model. The problem is that these approaches rely on global properties of a program under analysis, and small local changes can upset global properties. In our case, that means that a small change to an isolated module can cause new implicit data copies to be introduced in a completely unrelated part of the code - which could cause gigabytes worth of data transfer to be unexpectedly introduced. We refer to this as “spooky action at a distance”, and because it could introduce unsettling feelings into our users, we deny it.

2 changes: 1 addition & 1 deletion docs/PythonIntegration.md
@@ -1,6 +1,6 @@
# Python Interoperability

-As described in the [design overview document](https://docs.google.com/document/d/11cM7sjTaPOxQhBBozosfV2f9XOz30QYjOQhE_-8StEI/edit#), Python API interoperability is an important requirement for this project. While Swift is designed to integrate with other programming languages (and their runtimes), the nature of dynamic languages does not require the deep integration needed to support static languages. Python in particular is [designed to be embedded](https://docs.python.org/3/extending/index.html) into other applications and has a [simple C interface API](https://oleb.net/blog/2017/12/importing-c-library-into-swift/). For the purposes of our work, we can provide a meta-embedding, which allows Swift programs to use Python APIs as though they are directly embedding Python itself.
+As described in the [design overview document](DesignOverview.md), Python API interoperability is an important requirement for this project. While Swift is designed to integrate with other programming languages (and their runtimes), the nature of dynamic languages does not require the deep integration needed to support static languages. Python in particular is [designed to be embedded](https://docs.python.org/3/extending/index.html) into other applications and has a [simple C interface API](https://oleb.net/blog/2017/12/importing-c-library-into-swift/). For the purposes of our work, we can provide a meta-embedding, which allows Swift programs to use Python APIs as though they are directly embedding Python itself.

To accomplish this, the Swift script/program simply links the Python interpreter into its code. Our goal changes from “how do we work with Python APIs” into a question of “how do we make Python APIs feel natural, accessible, and easy to reach for from Swift code?” This isn’t a trivial problem - there are significant design differences between Swift and Python, including their approaches to error handling, the super-dynamic nature of Python, the differences in surface-level syntax between the two languages, and the desire to not “compromise” the things that Swift programmers have come to expect. We also care about convenience and ergonomics and think it is unacceptable to require a wrapper generator like SWIG.
