New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AutoDiff] WIP: Implement the Differentiable Curry #28516
Conversation
Merge tag 'swift-DEVELOPMENT-SNAPSHOT-2019-09-30-a' into tensorflow
Now that Swift for TensorFlow 0.5 has been released, remove `Differentiable` protocol APIs that were deprecated in 0.5.
- Set default true for `-enable-experimental-differentiable-programming` flag on tensorflow branch. The default will be false on master. - Rename unit test to AutoDiffIndexSubset.cpp, since `AutoDiffIndexSubset` is the tested data structure.
Previously, TransposingAttr::TransposingAttr would set the DeclAttribute to be DAK_Differentiating. This is incorrect, as it should be DAK_Transposing. This commit fixes that. Note: I noticed that this bug hasn't been observed because this code path is currently unused. However, it is planned to be used in the future, so I will not remove the code. This unfortunately makes it impossible to write a test that would have caught this regression.
Replace the last usages of `AutoDiffParameterIndices` (in `DifferentiableAttr` and `DifferentiatingAttr`) with `AutoDiffIndexSubset`. Delete `AutoDiffParameterIndices`. Delete `AutoDiffParameterIndicesBuilder`; directly use `AutoDiffIndexSubset::get` instead, constructing temporary `llvm::SmallBitVector` when necessary. Rewrite parameter indices utilities to use `AutoDiffIndexSubset`: - `autodiff::getLoweredParameterIndices` - `autodiff::getSubsetParameterTypes ` Resolves TF-538. TF-874 tracks improving the implementation of `AutoDiffIndexSubset` helpers: see issue for details.
…ple#27556) * Add tests for 'AutoDiffIndexSubset' initializers. * Unify file headers with apple#27555.
apple#27554) Fix PullbackEmitter crash for functions with multiple basic blocks, where some basic blocks have no dominated active values. This occurs because PullbackEmitter currently assumes that all bbs in multiple-bb functions dominate some active values (at least the entry bb will contain an active value). This assumption is not always true. One fix is to add PullbackEmitter support for "bbs with no dominated active values". This sounds plausible but requires nontrivial work. Instead, we can infer: - If no original bbs dominate active values, - Then the original result must be non-varied (it is useful, so non-active implies non-varied), - Which can be detected easily, and tells us statically that the derivative of the original result must be zero, - So we can skip full pullback generation and simply return zero for wrt parameters. This is a simpler fix and is robust for arbitrarily complex functions with non-varied result. A similar shortcut may be necessary for forward-mode differentiation when it supports control flow. Resolves TF-876. Note that the shortcut is semantically sound: see TF-878 for details.
Remove unused known protocols (`TensorProtocol`, `TensorFlowDataTypeCompatible`) and helper functions.
* Inverse trigonometric functions. * Exponents and logarithms. * Hyperbolic functions. * Error functions ( erf, erfc ). * Free generic functions: sqrt, fma. Partially resolves [TF-812](https://bugs.swift.org/browse/TF-812).
Rename the `Differentiable` protocol to `_Differentiable`. This matches upstreaming PR apple#27511, which adds `protocol _Differentiable` in a new `Differentiable.swift` file. `tensorflow` branch defines `typealias Differentiable = _Differentiable` for usability. There should be no user-facing changes, except that some error messages now show '_Differentiable' instead of 'Differentiable'.
…apple#27579) The differentiation order field in `differentiable_function` and `differentiable_function_extract` instructions is unsupported and will not be used by the current design. Quite a lot of dead code exists to try to handle `order`, but it is mostly incomplete and untested. This PR removes the differentiation order from the code base to simplify what we upstream to the 'master' branch. Changes include: * Remove `differentiationOrder` from `DifferentiableFunctionInst` and `DifferentiableFunctionExtractInst`. * Make `DifferentiableFunctionInst::DifferentiableFunctionInst` take an optional pair of JVP and VJP instead of a variable-size array. * Rename "associated functions" to "derivative functions" in `DifferentiableFunctionInst` to align better with [the design](https://forums.swift.org/t/differentiable-programming-mega-proposal/28547). Filed task [TF-882](https://bugs.swift.org/browse/TF-882) to track the renaming of all other occurrences of "associated functions". Resolves [TF-880](https://bugs.swift.org/browse/TF-880).
…ple#27603) `assocFn` -> `derivativeFn` `AssocFn` -> `DerivativeFn` `assocFunc` -> `derivativeFunc` `AssocFunc` -> `DerivativeFunc` `associatedFunction` -> `derivativeFunction` `AssociatedFunction` -> `DerivativeFunction` `autodiff associated function` -> `derivative function` `autodiff-associated function` -> `derivative function` `AD associated function` -> `derivative function` `associated differentiation function` -> `derivative function` This is a follow-up to apple#27597. Resolves [TF-882](https://bugs.swift.org/browse/TF-882).
…apple#27604) Fix `partial_apply` substitution map for subset parameters linear map thunk. The correct substitution map is computed by `buildThunkType` in the helper `ADContext::getOrCreateSubsetParametersThunkForLinearMap` and is now returned by the helper. Resolves TF-886.
Per @DougGregor's review comment (apple#27555 (comment)), `AutoDiffIndexSubset` data structure could have a more general name. We rename it to `IndexSubset` and move it to its own file.
…apple#27613) Type-check `@differentiable` attributes during `TypeChecker::validateDecl` for all relevant declaration kinds (initializers, subscripts, variables), not just function declarations. Resolves TF-888. TF-789 tracks proper request-based type-checking for `@differentiable` attribute. Exposes TF-892: `ElementaryFunctions` linker error on Linux.
…e. (apple#27627) - Move some 'IndexSubset' method implemenetations from AutoDiff.cpp to a new file called IndexSubset.cpp. - Fix some indentation errors caused by the earlier renaming of `AutoDiffIndexSubset`.
Define `DifferentiableActivityInfo::getLookupConformanceFunction` helper. Use `LookUpConformanceInModule` when derivative generic signature is undefined.
Tf-999 tracks enabling the test after SR-11336 is fixed.
…pple#27638) `ValueOwnershipKindClassifier` and `OperandOwnershipKindClassifier` should have the same classification for `autodiff_function_extract`. `ValueOwnershipKindClassifier`'s classification was fixed by apple#27199, which gave the correct ownership verification results. Now we fix it in `OperandOwnershipKindClassifier`. `DifferentiableFunctionExtractOriginalExpr`'s SILGen is now corrected to borrowing the argument and emitting a copy for the extracted original function.
…e#28490) Check that original declaration has a non-error interface type to avoid crash. Resolves TF-1017.
…lement. (apple#28492) Fix differentiation of tuples with single differentiable element, i.e. non-tuple-typed tangent space: - `JVPEmitter::emitTangentForTupleInst` - `JVPEmitter::emitTangentForDestructureTupleInst` - `PullbackEmitter::visitTupleInst` These visitors now check when tuple values have non-tuple-typed tangent spaces. Resolves TF-964. Add tests (using non-trivial values to test ownership). Add forward-mode crasher tests: TF-984, TF-1011. Remove duplicate tests from test/AutoDiff/forward_mode_runtime.swift.
Previously, `@noDerivative` warnings for `Differentiable` derived conformances were unsilenceable for conditional `Differentiable` conformances: adding `@noDerivative` in the nominal type declaration was rejected by `@noDerivative` attribute type-checking. Now, adding an explicit `@noDerivative` attribute silences these warnings. `@noDerivative` can always be added in the nominal type declaration because cross-file derived conformances are rejected. Add `@noDerivative` tests for `Differentiable`-conforming structs/classes. Resolves TF-1018.
…pple#24068) * No bangs! * Index from the dictionary directly.
# Main change The differentiation transform now looks up differentiability witnesses and creates `differentiable_witness_function` instructions (derivative function references) instead of looking up `[differentiable]` attributes and creating `function_ref` instructions. Differentiability witnesses are now used end-to-end, from SILGen to IRGen. - Runtime performance impact: none with `differentiability_witness_function` devirtualization, enabled with `-O`. (TF-994) - Compile-time performance impact: slight regressions, to be investigated. (TF-1013) # Related changes Some fixes were required to make the main change work properly. - `LinkEntity::SecondaryPointer` is now set to `nullptr` for the `DifferentiabilityWitness` case. Otherwise, the default wild pointer value messes up equality comparisons. - Handle `differentiable_witness_function` rewriting in LoadableByAddress. - This requires support for `differentiable_witness_function` with explicit lowered type, similar to `differentiable_function_extract`. - Do not canonicalize differentiability witness generic signatures. Otherwise, `τ` generic parameter types are printed in diagnostics, breaking some tests. - As a matter of principle, canonicalization is also unnecessary; only users that require canonical signatures should canonicalize. - Temporarily disable test/AutoDiff/differentiable_sil_attr_roundtrip.swift. - Round-trip SIL tests for external `@differentiable` functions are broken. - TF-988 tracks fixing this issue. - Move derivative lookup logic to SILOptimizer/Utils/DerivativeLookup.{h,cpp}. - Organizing code in separate files is incremental compilation and hiding private helper functions. - TF-993 tracks further organization of the Differentiation.cpp mega-file. # Next steps TF-866 is the master issue for retroactive derivative registration. - Remove `SILDifferentiableAttr`. (TF-898) - `SILDifferentiableAttr` is still used in the differentiation transform for: - Initial triggering of differentiation. - Middle-man for differentiability witness anonicalization. - `SILDifferentiableAttr` can be removed after these uses are rewritten. - Lift same-file-only derivative registration limitation. (TF-1021) - Simplify `@derivative` attribute type-checking and SILGen. (TF-835) - SILGen `@derivative` attributes directly to differentiability witnesses.
…ple#28505) Add `differentiability_witness_function` instruction assertions: - Must never be constructed with null witness. - Can only have explicit type in lowered SIL. Add `differentiability_witness_function` instruction verification: check that type of `differentiability_witness_function` instruction matches the type of the witnessed SIL function.
1304c7b
to
6a8d48d
Compare
Make `AnyDerivative` `@frozen` so that differentiable curry can be efficient (apple#28516). Resolves SR-11879.
Make `AnyDerivative` `@frozen` so that differentiable curry can be efficient (apple#28516). Resolves SR-11879.
…erentiable' reabstraction thunks.
e3452cc
to
0ffd1a8
Compare
AD is now using the right linear map type (with |
I accidentally deleted |
This PR contains my ongoing implementation of The Differentiable Curry.
Umbrella issue: SR-11882
cc @dan-zheng @marcrasi @dimitriv
TODOs:
SILFunctionType::getAutoDiffDerivativeFunctionType
to include closure context tangents in the result linear map type.AnyFunctionType::getAutoDiffDerivativeFunctionType
to include closure context tangents in the result linear map type, for calculating the result linear map type ofBuiltin.applyDerivative*
.Builtin.differentiableFunction*
's type to include closure context tangents.SILGenFunction::getThunkedAutoDiffLinearMap
) to include closure context tangents.@differentiable
reabstraction thunk generation (createAutoDiffThunk
in SILGenPoly.cpp) to include closure context tangents in derivative type calculation.apply
.apply
with respect to the function, forward-propagating closure context tangents.apply
with respect to the function, producing closure context tangents ofAnyDerivative
type in the pullback.partial_apply
, casting closure context tangents fromtan[f]
totan[args]
.