
Change "engine" to run on autodiff types only #370

Merged 2 commits into web-runtime on Sep 1, 2020

Conversation

kai-qu
Contributor

@kai-qu kai-qu commented Aug 26, 2020

Description

Previously, the "engine" had the option of using number or Tensor types in the evaluator via the autodiff flag, because we needed the former type for displaying shapes (evaluating the translation) and the latter type for optimization (evaluating the energy).

While concise, the design started to cause some challenges:

  • Computations appeared in both kinds of evaluation, so each had to be written for both numeric types
  • Lots of code duplication in the evaluator, since it had to interpret expressions over both numeric types and deal with mismatches between them
  • Hard to debug when a single value of the wrong type slipped into the wrong evaluation phase (e.g. plain numbers in the optimization)

Therefore, this PR changes all evaluation + optimization-related functions and types to be on autodiff types only (in this case, Tensors).

The tradeoff is that the display step may take slightly longer and use more space, since it requires doing all work on autodiff types (which are objects that store a good amount of information), then converting them to numbers. However, that tradeoff is acceptable because the engine does much more optimization-type evaluation than display-type evaluation.

Implementation strategy and design decisions

The new pipeline is as follows:

  • decode state from backend JSON, which gives us numbers
    • note that the deserialization is the least type-safe part of our code [0]
  • in the initial state, convert all Value<number> into Value<Tensor> to store in the Translation, which now only holds Tensors
    • the varyingValues are also converted to be all Tensors, as are the pending values (label dimensions) after they are calculated
    • expressions (OptEval) that can be evaluated at optimization-time are not converted; if a number is encountered in evaluation, it's converted on the spot in evalExpr
  • all evaluation happens on Tensors
    • in the optimization, it just uses the Tensors stored in the translation
    • for displaying shapes via evalTranslation, it also does all computations on Tensors, and does a final conversion pass from Value<Tensor> to Value<number>

In general, numeric values are converted to tf Scalars via the scalar function (so they are just tensors, not variables in the optimization), unless they are varying values, in which case they are converted to tf Variables.
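The fixed-vs-varying rule above can be sketched as a toy stand-in (the `scalar`/`variable`/`toAutodiff` helpers here are hypothetical illustrations, not tf.js itself or the PR's actual EngineUtils code):

```typescript
// Toy stand-in for the conversion rule: fixed values become constant
// scalars; varying values become optimization variables.
type Num = { kind: "scalar" | "variable"; value: number };

const scalar = (n: number): Num => ({ kind: "scalar", value: n });
const variable = (n: number): Num => ({ kind: "variable", value: n });

// Promote only the varying paths to variables; everything else is constant.
function toAutodiff(
  values: Record<string, number>,
  varyingPaths: Set<string>
): Record<string, Num> {
  return Object.fromEntries(
    Object.entries(values).map(([path, n]) => [
      path,
      varyingPaths.has(path) ? variable(n) : scalar(n),
    ])
  );
}
```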

Most changes are in Evaluator and types.d.ts, with some changes in PropagateUpdate and Canvas, and new files EngineUtils and Computations.

Also, the diff is a little misleading because I changed my editor's indentation, but I kept it because I will probably use this indentation in the future.

[0] If something is deserialized with the wrong type (e.g. the deserializer doesn't know that something is supposed to be a Tensor, not a number), our types are gone at runtime, so there are no guarantees and the wrong type just gets everywhere.
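The type-erasure problem in [0] can be illustrated with a toy runtime guard (the `Value` shape and `assertNumberValue` helper are hypothetical, not the actual decoder):

```typescript
// Static types are erased at runtime: a decoded value cast to Value<number>
// carries no guarantee about what it actually holds. Only an explicit
// runtime check catches a mismatch.
type Value<T> = { tag: "FloatV"; contents: T };

function assertNumberValue(v: Value<unknown>): number {
  if (typeof v.contents !== "number") {
    throw new Error(`expected a number at runtime, got: ${typeof v.contents}`);
  }
  return v.contents;
}

// The cast typechecks, but only the guard above actually verifies it.
const decoded = JSON.parse('{"tag":"FloatV","contents":3.5}') as Value<number>;
```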

Examples with steps to reproduce them

  • Test inline computation with Tensors: runpenrose hyperbolic-domain/hyperbolic-example.sub hyperbolic-domain/PoincareDisk.sty hyperbolic-domain/hyperbolic.dsl

The following three are successfully reproduced examples from the wiki page of working web-runtime examples.

  • Test labels, pending values, varying values, and optimization: runpenrose set-theory-domain/tree.sub set-theory-domain/venn-opt-test.sty set-theory-domain/setTheory.dsl

  • runpenrose set-theory-domain/tree.sub set-theory-domain/tree.sty set-theory-domain/setTheory.dsl

  • runpenrose set-theory-domain/continuousmap.sub set-theory-domain/continuousmap.sty set-theory-domain/setTheory.dsl

Checklist

  • I have commented my code, particularly in hard-to-understand areas
  • I ran Haddock and there were no errors when generating the HTML site
  • My changes generate no new warnings
  • New and existing tests pass locally using stack test
  • My code follows the style guidelines of this project (e.g.: no HLint warnings)

Open questions

Questions that require more discussion or to be addressed in future development:

  • In the future, the system should use a custom autodiff type that is easily swappable (i.e. not Tensors). Using Tensors is a stopgap until web-perf and @strout18's macro code get merged.
  • I wasn't sure if tf Variables needed to have their gradients cleared after each step, so for safety, I did that in Optimizer at the end of each stepEP. In general, when we use custom AD vars, we may need to clear them.
  • I wasn't super careful about the distinction between tf Scalars and Variable. Hopefully this generates the right computational graph when we use custom AD vars.
  • Maybe just convert Shapes from autofloats to numbers, instead of converting the whole translation? (@wodeni's suggestion)

@kai-qu kai-qu requested a review from wodeni August 26, 2020 23:12
@kai-qu
Contributor Author

kai-qu commented Aug 26, 2020

Hey @wodeni, requesting a review as you wrote the original code. It works on the examples so I'm pretty confident, but can you give it a once-over by Monday so I can merge into web-runtime? Let me know! (See also my note in the README about editor indentation re: this large diff.)

@wodeni
Member

wodeni commented Aug 26, 2020

See also my note in the README about editor indentation re: this large diff.

I'll go over the changes soon. Quick q about indentation: did you just change from 2 to 4 spaces? If so, I'd love to hear the rationale. Where's this README you were referring to?

@kai-qu
Contributor Author

kai-qu commented Aug 26, 2020

Yeah, 2 to 4 spaces, because it's what makes my editor auto-indent work correctly on both my machines... (not very principled!). I'm just referring to the PR README here.


@wodeni wodeni left a comment


Yeah, 2 to 4 spaces, because it's what makes my editor auto-indent work correctly on both my machines... (not very principled!). I'm just referring to the PR README here.

I'd encourage you to try getting 2-space indentations to work in your editor just to avoid large diffs in the future. The rest of us seem to use linters/beautifiers that have this setting by default. If you have a strong preference, we can discuss that and put a rule in the linter config.

Review threads (resolved):
  • react-renderer/src/Circle.tsx
  • react-renderer/src/Constraints.ts
@kai-qu
Contributor Author

kai-qu commented Aug 28, 2020

@wodeni Sure, I'll try to get the 2-space indent to work. In the meantime, can you let me know if you were planning to review this PR further? Or did you want me to fix the indent first?


@wodeni wodeni left a comment


Overall okay with the changes. I'm still a little unsure about the exact amount of "duplicated code" that this PR is supposed to remove, as the original autodiff flag affects only a few functions. If the extra conversion doesn't cost us much, I definitely would go with this cleaner version, though.

// https://stackoverflow.com/questions/31084619/map-a-javascript-es6-map
// Basically Haskell's fmap over a Map's values
export function mapMap(map: Map<any, any>, fn: any) {
  return new Map(Array.from(map, ([key, value]) => [key, fn(value, key, map)]));
}
Member

My experience with Map hasn't been great if I don't really add or remove entries and want to change existing entries a lot... You can probably use mapValues and regular js objects if that's easier.
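The plain-object alternative suggested here can be sketched without lodash (a hand-rolled `mapValuesObj` standing in for lodash's `mapValues`, kept dependency-free for illustration):

```typescript
// Map over the values of a plain object, preserving keys; equivalent in
// spirit to lodash's mapValues.
function mapValuesObj<V, W>(
  obj: Record<string, V>,
  fn: (v: V, k: string) => W
): Record<string, W> {
  return Object.fromEntries(
    Object.entries(obj).map(([k, v]) => [k, fn(v, k)])
  );
}
```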

Contributor Author

Got it. mapMap isn't currently used anywhere, so I'll just leave it in as a util.


/**
* Find the value of a property in a list of fully evaluated shapes.
* @param shapes a list of shapes
* @param path a path to a property value in one of the shapes
*/
// the `any` is to accommodate `collectLabels` storing updated property values in a new property that's not in the type system
Member

We should probably fix that

Contributor Author

Yes...

import { Tensor, scalar } from "@tensorflow/tfjs";
import { mapValues } from 'lodash';

// TODO: Is there a way to write these mapping/conversion functions with less boilerplate?
Member

High-level question: if everything is autodiff format, why do we have to do this deep conversion of the translation? For rendering purposes, can't we output shapes with autofloats and do the conversion on shapes instead?

Contributor Author

Yeah, true... I liked that the Shape type enforced that it only contains Value<number>, so there is no chance an autofloat makes it to the frontend. But it's probably faster to render if we only convert shapes. Maybe I'll make this change in the future?

Member

Yeah the timing is up to you. If it's too involved, let's make sure we come back to it. Seems like this is even more code duplication than before :(

Contributor Author

@kai-qu kai-qu Sep 1, 2020


This code is pretty generic, so it can be used for mapping over any translation contents; it's not specific to autofloats/floats. However I am down to use the shape-conversion approach instead, later.

(The earlier code-duplication problem was that computations and operations in evaluation had to be defined on both autofloats and floats; e.g. adding two numbers had to use either tf.add or the JS + depending on the inputs, and it was difficult for me to debug why autofloats/floats ended up where they weren't supposed to be.)
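The kind of duplication described above can be sketched as a toy (the `TensorLike` type and `addEither` function are illustrative, not the old evaluator's actual code):

```typescript
// Toy illustration of the old dual-typed evaluator: every operation had to
// branch on number vs. Tensor-like inputs, and mixed inputs were the
// hard-to-debug failure mode.
type TensorLike = { data: number; add(t: TensorLike): TensorLike };

const tensorLike = (n: number): TensorLike => ({
  data: n,
  add(t: TensorLike) { return tensorLike(this.data + t.data); },
});

function addEither(a: number | TensorLike, b: number | TensorLike): number | TensorLike {
  if (typeof a === "number" && typeof b === "number") return a + b; // JS +
  if (typeof a !== "number" && typeof b !== "number") return a.add(b); // tf.add analogue
  throw new Error("mixed number/Tensor inputs: the mismatch case");
}
```

Moving everything onto one autodiff type removes both the branching and the mismatch case.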

const props = mapValues(propExprs, (prop: TagExpr<Tensor>): Value<number> => {
if (prop.tag === "OptEval") {
// For display, evaluate expressions with autodiff types (incl. varying vars as AD types), then convert to numbers
// (The tradeoff for using autodiff types is that evaluating the display step will be a little slower, but then we won't have to write two versions of all computations)
Member

Will need to see profiling results to see the overhead of the Tensor => number step, otherwise okay with me.

Contributor Author

Here's a quick shot of three steps for the tree example (using tf.js on this branch). You can see that optimization takes most of the time and the conversion time seems negligible. Maybe this will change once we have compiled gradients.

[screenshot: profiling timeline of three steps for the tree example]

const { energy } = minimize(f, fgrad, xs, steps);
// insert the resulting variables back into the translation for rendering
// NOTE: this is a synchronous operation on all varying values; may block
// Note: after we finish one evaluation, we "destroy" the AD variables (by making them into scalars) and remake them
Member

What's the consequence of not "destroying" the AD variables?

Contributor Author

If you reuse an AD variable later with our dynamic evaluator approach, then my guess is that Tensorflow records those additional operations in its computational graph, which might affect the gradient. But it's a moot point after we convert to using our custom AD + compiled gradients, because the AD vars become irrelevant after the gradient is compiled.

@@ -4,7 +4,7 @@
   "outDir": "build/dist",
   "module": "esnext",
   "target": "es5",
-  "lib": ["es5", "es6", "es7", "dom"],
+  "lib": ["es5", "es6", "es7", "esnext", "dom"],
Contributor Author

FYI I added esnext here @wodeni @maxkrieger. Wanted to use Object.fromEntries in EngineUtils. Lmk if unnecessary
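For reference, Object.fromEntries (standardized in ES2019, hence the lib addition) inverts Object.entries, which makes object-valued data easy to transform. This snippet is only an illustration of the pattern, not the actual EngineUtils code:

```typescript
// Transform every value of an object by round-tripping through entries.
const dims = { width: 100, height: 50 };
const doubled = Object.fromEntries(
  Object.entries(dims).map(([k, v]) => [k, v * 2])
);
// doubled is { width: 200, height: 100 }
```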

Member

Lgtm

Member

Not sure how the polyfill coverage is but shouldn't be a problem

…s and types to be on autodiff types only (in this case, Tensors). remove autodiff=true flag from evaluator. convert to and from the `number` type for display. factor out computations into a separate file, and convert their types to be autodiff as well
@kai-qu
Contributor Author

kai-qu commented Sep 1, 2020

FYI @wodeni I just made changes WRT the above suggestions and reverted the indentation to 2 spaces, and squashed/force-pushed to this branch. Diff should make more sense now. (Also added react-renderer/tsfmt.json for my editor.) Thanks for the review!

@kai-qu kai-qu merged commit 4d7c2a5 into web-runtime Sep 1, 2020
@kai-qu kai-qu deleted the evaluator-fix branch September 1, 2020 18:09
@wodeni
Member

wodeni commented Sep 1, 2020

FYI @wodeni I just made changes WRT the above suggestions and reverted the indentation to 2 spaces, and squashed/force-pushed to this branch. Diff should make more sense now. (Also added react-renderer/tsfmt.json for my editor.) Thanks for the review!

Thanks! BTW GitHub does give you an option to ignore whitespace changes, so the diff was no issue for me :D
