
Conversation

Contributor

@saeta saeta commented Feb 22, 2019


Many deep learning models are composed of sequential layers stacked one on
top of the other. Writing out the explicit `applied(to:)` function is
fairly tedious because it's repetitive and it obscures the underlying
intent. (It can be especially bothersome because it's the 2nd or 3rd time
you're writing out all the layers: the first time is to declare all the
instance variables, and the second time, if necessary, is in the
initializer.)

Fortunately, with a family of helper functions, we can make everything type
safe as well as convenient, expressive, and readable!

This commit adds a family of `Sequential` functions that take a context, an
input, and a variable number of layers, chaining the output of each layer
into the input of the next.
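
For illustration only, here is a rough sketch of how a model might read with
this change. The layer types, initializers, and the exact per-layer application
signature (`applied(to:in:)`) are assumptions made for this sketch, not part of
this diff:

struct MLP: Layer {
    var dense1 = Dense<Float>(inputSize: 784, outputSize: 128, activation: relu)
    var dense2 = Dense<Float>(inputSize: 128, outputSize: 10, activation: softmax)

    @differentiable(wrt: (self, input))
    func applied(to input: Tensor<Float>, in context: Context) -> Tensor<Float> {
        // One call replaces the hand-written chain of per-layer applications.
        return Sequential(in: context, from: input, dense1, dense2)
    }
}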

This API approach has a number of advantages:

 1. It avoids introducing new symbolic operators, which can be very confusing
    to new users.

 2. It works with today's AutoDiff implementation. (Yay!)

 3. It is very readable and clean.

 4. It avoids users "getting stuck". Concretely, if someone implemented a model
    using my previously proposed `>>>` operator and then wanted to add a
    residual (or skip) connection, they would basically have to rewrite their
    whole model as a struct, etc. With this API structure, only "local"
    changes are required. (E.g., if only one skip connection is required, they
    can split the sequential chain into two pieces; see the sketch after this
    list.)
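
As a sketch (the layer names and per-layer application signature here are
illustrative assumptions, not part of this diff), adding a single skip
connection only requires splitting the chain at the point of the residual add:

@differentiable(wrt: (self, input))
func applied(to input: Tensor<Float>, in context: Context) -> Tensor<Float> {
    // First piece of the chain.
    let features = Sequential(in: context, from: input, conv1, batchNorm1, conv2)
    // The "local" change: a residual (skip) connection.
    let withSkip = features + input
    // Second piece of the chain.
    return Sequential(in: context, from: withSkip, batchNorm2, avgPool, dense)
}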

Downsides of this approach:

 1. It doesn't DRY-out the types required to define a model. (I have some
    thoughts here, but there isn't enough room in this
    margin^H^H^H^H^H^Hcommit message.)

 2. We should think hard about how things should look when we have loops.

 3. I'm sure there's a better way to code-gen out all the different Sequential
    arities. (I got bored hand-writing them out after 4...) Suggestions
    welcome!
Contributor

@rxwei rxwei left a comment


I like this proposed direction!

(Feedback below is copied from internal chat, but I'll continue that in the open.)

There are always tradeoffs between sugar operators and methods with clear names. Swift standard library APIs (and Swift libraries in general) tend to discourage functional sugar. That guideline doesn't necessarily apply to every domain-specific library, but we should start with clearly named APIs and carefully consider the consequences of each proposed sugar.

I have the following concerns:

  1. Naming. Swift function names should use camel case. Moreover, function names should be clear about the operation -- `sequential(in:from:...)` does not do a good job of indicating that this function performs a chain of function applications, because it reads as if it returns a sequence.
    • Usually a gerund is better in Swift. However, the argument label `from:` is still unclear.
      func sequencing<...>(in context: Context, from input: Input, _ layer1: Layer1, ...) -> LayerN.Output
    • We can use `through:` to make it read better, but it's overly verbose. So, tradeoffs.
      func sequencing<...>(in context: Context, from input: Input, through layer1: Layer1, ...) -> LayerN.Output
  2. Free functions should be introduced with extra care. They are generally discouraged because they 1) mess up code completion and 2) are not quite idiomatic unless they have lots of common precedents, e.g. sin(_:), matmul(_:_:). In this case, I do not think this should be a top-level function. Instead, I'd very much prefer to define it as a protocol extension method on Differentiable:
    extension Differentiable {
        func sequenced<...>(in context: Context, through layer1: Layer1, _ layer2: Layer2, ...) -> LayerN.Output
    }
    This approach would make the API easy to discover without messing up top-level code completion, and it conforms to the Swift API design guidelines. It also keeps the input from being juxtaposed with the layers, emphasizing that the input is the source of what's being sequenced. Of course, it would look much better without the context argument, but we don't support implicit arguments yet.
    func applied(to input: Input) -> Output {
        return input.sequenced(in: context, through: layer1, layer2, layer3, ...)
    }

}

@differentiable(wrt: (input, l1, l2))
public func Sequential<L1: Layer, L2: Layer>(in context: Context, from input: L1.Input, _ l1: L1, _ l2: L2) -> L2.Output where L1.Output == L2.Input {
Contributor


Function names should always use camel case.

…ntax and avoiding polluting the global function namespace. Also switch to camelCase.
Contributor

@rxwei rxwei left a comment


Love it!

}
}

extension Differentiable {
Contributor


Use `public extension` so you can drop the `public` from individual methods.
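
For illustration, a minimal self-contained sketch of that suggestion (the
method here is a toy stand-in, not the real `sequenced` overloads):

public extension Differentiable {
    // `public` on the extension applies to its members, so no per-method modifier is needed.
    func identityApplied() -> Self {
        return self
    }
}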

Contributor Author


Thanks; fixed!

@differentiable(wrt: (self, l1, l2))
public func sequenced<L1: Layer, L2: Layer>(
in context: Context, through l1: L1, _ l2: L2)
-> L2.Output
Contributor


I was about to leave comments about indentation, but I thought "it's gonna be gyb'd anyway" :)

Contributor Author


Yeah, I don't have my editor set up to do the indentation automatically for me. :-( Sorry!

@saeta saeta merged commit 9b3b609 into master Feb 24, 2019
@saeta saeta deleted the sequential-take-two branch February 24, 2019 02:06