Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AutoDiff] WIP: Implement the Differentiable Curry #28516

Closed
wants to merge 2,382 commits into from

Conversation

rxwei
Copy link
Member

@rxwei rxwei commented Dec 1, 2019

This PR contains my ongoing implementation of The Differentiable Curry.

Umbrella issue: SR-11882

cc @dan-zheng @marcrasi @dimitriv

TODOs:

  • Type calculation
    • [SIL] Change SILFunctionType::getAutoDiffDerivativeFunctionType to include closure context tangents in the result linear map type.
    • [Sema] Add an option to AnyFunctionType::getAutoDiffDerivativeFunctionType to include closure context tangents in the result linear map type, for calculating the result linear map type of Builtin.applyDerivative*.
  • Standard library and builtins
    • [AST] Change Builtin.differentiableFunction*'s type to include closure context tangents.
    • [stdlib] Update differential operators to use the updated AD builtins.
  • SIL generation
    • [SILGen] Change derivative registration linear map thunks (SILGenFunction::getThunkedAutoDiffLinearMap) to include closure context tangents.
    • [SILGen] Change @differentiable reabstraction thunk generation (createAutoDiffThunk in SILGenPoly.cpp) to include closure context tangents in derivative type calculation.
  • Differentiation transform
    • [Main pass] Handle closure context tangents in subset parameters thunks.
    • [ActivityAnalysis] Propagate variedness and usefulness for function values when visiting apply.
    • [DifferentialEmitter] Differentiate apply with respect to the function, forward-propagating closure context tangents.
    • [PullbackEmitter] Differentiate apply with respect to the function, producing closure context tangents of AnyDerivative type in the pullback.
    • [PullbackEmitter] Differentiate partial_apply, casting closure context tangents from tan[f] to tan[args].

Eugene Burmako and others added 30 commits October 2, 2019 16:08
Merge tag 'swift-DEVELOPMENT-SNAPSHOT-2019-09-30-a' into tensorflow
Now that Swift for TensorFlow 0.5 has been released, remove
`Differentiable` protocol APIs that were deprecated in 0.5.
- Set default true for `-enable-experimental-differentiable-programming` flag
  on tensorflow branch. The default will be false on master.
- Rename unit test to AutoDiffIndexSubset.cpp, since `AutoDiffIndexSubset` is
  the tested data structure.
Previously, TransposingAttr::TransposingAttr would set the DeclAttribute to be
DAK_Differentiating. This is incorrect, as it should be DAK_Transposing. This
commit fixes that.

Note: I noticed that this bug hasn't been observed because this code path is
currently unused. However, it is planned to be used in the future, so I will
not remove the code. This unfortunately makes it impossible to write a test
that would have caught this regression.
Replace the last usages of `AutoDiffParameterIndices` (in `DifferentiableAttr`
and `DifferentiatingAttr`) with `AutoDiffIndexSubset`.

Delete `AutoDiffParameterIndices`.
Delete `AutoDiffParameterIndicesBuilder`; directly use `AutoDiffIndexSubset::get`
instead, constructing temporary `llvm::SmallBitVector` when necessary.

Rewrite parameter indices utilities to use `AutoDiffIndexSubset`:
- `autodiff::getLoweredParameterIndices`
- `autodiff::getSubsetParameterTypes `

Resolves TF-538.

TF-874 tracks improving the implementation of `AutoDiffIndexSubset` helpers:
see issue for details.
…ple#27556)

* Add tests for 'AutoDiffIndexSubset' initializers.
* Unify file headers with apple#27555.
apple#27554)

Fix PullbackEmitter crash for functions with multiple basic blocks, where some
basic blocks have no dominated active values.

This occurs because PullbackEmitter currently assumes that all bbs in
multiple-bb functions dominate some active values (at least the entry bb
will contain an active value). This assumption is not always true.

One fix is to add PullbackEmitter support for "bbs with no dominated active
values". This sounds plausible but requires nontrivial work.

Instead, we can infer:
- If no original bbs dominate active values,
- Then the original result must be non-varied (it is useful, so non-active
  implies non-varied),
- Which can be detected easily, and tells us statically that the derivative of
  the original result must be zero,
- So we can skip full pullback generation and simply return zero for wrt
  parameters. This is a simpler fix and is robust for arbitrarily complex
  functions with non-varied result.

A similar shortcut may be necessary for forward-mode differentiation when
it supports control flow.

Resolves TF-876.
Note that the shortcut is semantically sound: see TF-878 for details.
Remove unused known protocols (`TensorProtocol`, `TensorFlowDataTypeCompatible`)
and helper functions.
* Inverse trigonometric functions.
* Exponents and logarithms.
* Hyperbolic functions.
* Error functions ( erf, erfc ).
* Free generic functions: sqrt, fma.

Partially resolves [TF-812](https://bugs.swift.org/browse/TF-812).
)

Previously, `LinearMapInfo::addLinearMapToStruct` did not remap `apply` callee
type in derivative context.

Now, remapping is done. Remapping is significant when the derivative has a
more constrained generic signature.

Resolves TF-817.
Rename the `Differentiable` protocol to `_Differentiable`.

This matches upstreaming PR apple#27511,
which adds `protocol _Differentiable` in a new `Differentiable.swift` file.

`tensorflow` branch defines `typealias Differentiable = _Differentiable` for
usability. There should be no user-facing changes, except that some error
messages now show '_Differentiable' instead of 'Differentiable'.
…apple#27579)

The differentiation order field in `differentiable_function` and `differentiable_function_extract` instructions is unsupported and will not be used by the current design. Quite a lot of dead code exists to try to handle `order`, but it is mostly incomplete and untested. This PR removes the differentiation order from the code base to simplify what we upstream to the 'master' branch.

Changes include:
* Remove `differentiationOrder` from `DifferentiableFunctionInst` and `DifferentiableFunctionExtractInst`.
* Make `DifferentiableFunctionInst::DifferentiableFunctionInst` take an optional pair of JVP and VJP instead of a variable-size array.
* Rename "associated functions" to "derivative functions" in `DifferentiableFunctionInst` to align better with [the design](https://forums.swift.org/t/differentiable-programming-mega-proposal/28547). Filed task [TF-882](https://bugs.swift.org/browse/TF-882) to track the renaming of all other occurrences of "associated functions".

Resolves [TF-880](https://bugs.swift.org/browse/TF-880).
…ple#27603)

`assocFn` -> `derivativeFn`
`AssocFn` -> `DerivativeFn`
`assocFunc` -> `derivativeFunc`
`AssocFunc` -> `DerivativeFunc`
`associatedFunction` -> `derivativeFunction`
`AssociatedFunction` -> `DerivativeFunction`
`autodiff associated function` -> `derivative function`
`autodiff-associated function` -> `derivative function`
`AD associated function` -> `derivative function`
`associated differentiation function` -> `derivative function`

This is a follow-up to apple#27597.

Resolves [TF-882](https://bugs.swift.org/browse/TF-882).
…apple#27604)

Fix `partial_apply` substitution map for subset parameters linear map thunk.

The correct substitution map is computed by `buildThunkType` in
the helper `ADContext::getOrCreateSubsetParametersThunkForLinearMap` and is
now returned by the helper.

Resolves TF-886.
Per @DougGregor's review comment (apple#27555 (comment)), `AutoDiffIndexSubset` data structure could have a more general name. We rename it to `IndexSubset` and move it to its own file.
…apple#27613)

Type-check `@differentiable` attributes during `TypeChecker::validateDecl` for
all relevant declaration kinds (initializers, subscripts, variables), not just
function declarations.

Resolves TF-888.
TF-789 tracks proper request-based type-checking for `@differentiable` attribute.

Exposes TF-892: `ElementaryFunctions` linker error on Linux.
…e. (apple#27627)

- Move some 'IndexSubset' method implemenetations from AutoDiff.cpp to a new file called IndexSubset.cpp.
- Fix some indentation errors caused by the earlier renaming of `AutoDiffIndexSubset`.
Define `DifferentiableActivityInfo::getLookupConformanceFunction` helper.
Use `LookUpConformanceInModule` when derivative generic signature is undefined.
Tf-999 tracks enabling the test after SR-11336 is fixed.
…pple#27638)

`ValueOwnershipKindClassifier` and `OperandOwnershipKindClassifier` should have the same classification for `autodiff_function_extract`. `ValueOwnershipKindClassifier`'s classification was fixed by apple#27199, which gave the correct ownership verification results. Now we fix it in `OperandOwnershipKindClassifier`.

`DifferentiableFunctionExtractOriginalExpr`'s SILGen is now corrected to borrowing the argument and emitting a copy for the extracted original function.
dan-zheng and others added 7 commits November 26, 2019 23:04
…e#28490)

Check that original declaration has a non-error interface type to avoid crash.

Resolves TF-1017.
…lement. (apple#28492)

Fix differentiation of tuples with single differentiable element, i.e.
non-tuple-typed tangent space:
- `JVPEmitter::emitTangentForTupleInst`
- `JVPEmitter::emitTangentForDestructureTupleInst`
- `PullbackEmitter::visitTupleInst`

These visitors now check when tuple values have non-tuple-typed tangent spaces.

Resolves TF-964. Add tests (using non-trivial values to test ownership).
Add forward-mode crasher tests: TF-984, TF-1011.
Remove duplicate tests from test/AutoDiff/forward_mode_runtime.swift.
Previously, `@noDerivative` warnings for `Differentiable` derived conformances
were unsilenceable for conditional `Differentiable` conformances: adding
`@noDerivative` in the nominal type declaration was rejected by `@noDerivative`
attribute type-checking.

Now, adding an explicit `@noDerivative` attribute silences these warnings.
`@noDerivative` can always be added in the nominal type declaration because
cross-file derived conformances are rejected.

Add `@noDerivative` tests for `Differentiable`-conforming structs/classes.

Resolves TF-1018.
# Main change

The differentiation transform now looks up differentiability witnesses and
creates `differentiable_witness_function` instructions (derivative function
references) instead of looking up `[differentiable]` attributes and creating
`function_ref` instructions.

Differentiability witnesses are now used end-to-end, from SILGen to IRGen.
- Runtime performance impact: none with `differentiability_witness_function`
  devirtualization, enabled with `-O`. (TF-994)
- Compile-time performance impact: slight regressions, to be investigated.
  (TF-1013)

# Related changes

Some fixes were required to make the main change work properly.

- `LinkEntity::SecondaryPointer` is now set to `nullptr` for the
  `DifferentiabilityWitness` case. Otherwise, the default wild pointer value
  messes up equality comparisons.
- Handle `differentiable_witness_function` rewriting in LoadableByAddress.
  - This requires support for `differentiable_witness_function` with explicit
    lowered type, similar to `differentiable_function_extract`.
- Do not canonicalize differentiability witness generic signatures. Otherwise,
  `τ` generic parameter types are printed in diagnostics, breaking some tests.
  - As a matter of principle, canonicalization is also unnecessary; only users
    that require canonical signatures should canonicalize.
- Temporarily disable test/AutoDiff/differentiable_sil_attr_roundtrip.swift.
  - Round-trip SIL tests for external `@differentiable` functions are broken.
  - TF-988 tracks fixing this issue.
- Move derivative lookup logic to SILOptimizer/Utils/DerivativeLookup.{h,cpp}.
  - Organizing code in separate files is incremental compilation and hiding
    private helper functions.
  - TF-993 tracks further organization of the Differentiation.cpp mega-file.

# Next steps

TF-866 is the master issue for retroactive derivative registration.

- Remove `SILDifferentiableAttr`. (TF-898)
  - `SILDifferentiableAttr` is still used in the differentiation transform for:
    - Initial triggering of differentiation.
    - Middle-man for differentiability witness anonicalization.
  - `SILDifferentiableAttr` can be removed after these uses are rewritten.
- Lift same-file-only derivative registration limitation. (TF-1021)
- Simplify `@derivative` attribute type-checking and SILGen. (TF-835)
  - SILGen `@derivative` attributes directly to differentiability witnesses.
…ple#28505)

Add `differentiability_witness_function` instruction assertions:
- Must never be constructed with null witness.
- Can only have explicit type in lowered SIL.

Add `differentiability_witness_function` instruction verification: check that
type of `differentiability_witness_function` instruction matches the type of
the witnessed SIL function.
rxwei added a commit to rxwei/swift that referenced this pull request Dec 2, 2019
Make `AnyDerivative` `@frozen` so that differentiable curry can be efficient (apple#28516).

Resolves SR-11879.
@rxwei
Copy link
Member Author

rxwei commented Dec 2, 2019

AD is now using the right linear map type (with AnyDerivative) everywhere including the differentiation transform. Besides FileCheck errors which are obviously expected, I'm getting quite a few runtime segfaults which I think could be caused by incorrect thunking. I'm signing off for today and will continue when I get a chance (next weekend or so).

@dan-zheng
Copy link
Collaborator

I accidentally deleted tensorflow branch, which closed this PR. That was not intentional, sorry!
It would be nice to protect tensorflow branch against deletion to prevent this from happening again.

@dan-zheng dan-zheng reopened this Feb 22, 2020
@dan-zheng dan-zheng changed the base branch from tensorflow to master September 19, 2020 05:34
@shahmishal
Copy link
Member

Please update the base branch to main by Oct 5th otherwise the pull request will be closed automatically.

  • How to change the base branch: (Link)
  • More detail about the branch update: (Link)

@rxwei rxwei closed this Oct 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet