V0.9.0 #195

Merged
chewxy merged 145 commits into master from v0.9.0-working on Aug 19, 2018

Conversation

@chewxy
Member

chewxy commented Mar 11, 2018

Ongoing notes:

  • CUDA: Better CUDA support (IN PROGRESS)
    • ColMajor used by default if engine is CUDA. (ColMajor is supported, but RowMajor remains the default for all the major cuBLAS versions. Careful reasoning about the cuBLAS parameters obviates the need for ColMajor by default, which causes more headaches. ColMajor is still supported.)
    • Transposition will be done automatically when transferring data back to the CPU.
    • cuDNN operations supported (IN PROGRESS) (note: these are the ones I use more often, hence they get more attention):
      • Conv2d
      • Dropout
      • Maxpool2d
      • BatchNorm
      • Rectify
    • Other CUDA related optimizations
      • full cuBLAS support
  • New Ops:
    • BatchNorm
    • InvSqrt
    • CUDA enabled ops in ops/nn (preview for how things will start to look in v0.10.0)
  • New Features:
    • Limited shape inference. Working towards a calculus for shapes (first raised in #96 and #97).
  • Optimizations:
    • Basic ops now use engine functions if available; otherwise they fall back to using Apply, which adds a penalty from repeatedly calling functions.
    • Faster VMs (1 of 2 VMs): greedy goroutines grab gigs from a priority queue. This causes faster execution of code in general. (This has been moved to a future 0.9.xx release):
benchmark                           old ns/op      new ns/op      delta
BenchmarkTapeMachineExecution-8     3129074510     2695304022     -13.86%

benchmark                           old allocs     new allocs     delta
BenchmarkTapeMachineExecution-8     25745          25122          -2.42%

benchmark                           old bytes      new bytes      delta
BenchmarkTapeMachineExecution-8     4804578705     4803784111     -0.02%
  • Code generation: some exported API is now auto generated
  • New Solver : @ynqa added the Momentum solver.
  • Breaking API: Solvers now take a slice of ValueGrad instead of Nodes. ValueGrad is an interface, which *Node fulfils. An additional utility function, NodesToValueGrads, has been added to aid with refactoring. This was done for two reasons:
    • The support for BatchNorm operation, which is a verily impure and highly stateful function. The BatchNorm Op has internal states that need to have their gradients updated as well. But the internal state of BatchNorm isn't really part of the expression graph, and really it shouldn't be. Turns out there was a better API for BatchNorm.
    • In the next version, v0.10.0, we aim to do better package organization for manageability. With this API breaking change, the solver is now less dependent on the other parts of Gorgonia and can be easily separated.
  • Breaking Semantics: A gorgonia.VM now implements io.Closer. It should be treated as a resource as well as a computation device - the VM must be Close()d in order for the resources acquired by the VM to actually be released. Turns out, automatic resource management is too difficult. Who'd thunk that?
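A rough sketch of how the two breaking changes above fit together. This is not Gorgonia's actual code: `Node`, `toyVM`, `sgdStep` and `trainOnce` are toy stand-ins invented for illustration, and the method set given to `ValueGrad` here is an assumption (these notes don't spell out the real one). It only shows the shape of the pattern: the solver step depends on an interface rather than on graph nodes, and the VM is closed explicitly.

```go
package main

import (
	"fmt"
	"io"
)

// ValueGrad is a hypothetical stand-in for the interface described above.
// The real method set in Gorgonia may differ.
type ValueGrad interface {
	Value() float64
	Grad() float64
	SetValue(float64)
}

// Node is a toy graph node; *Node satisfies ValueGrad.
type Node struct{ val, grad float64 }

func (n *Node) Value() float64     { return n.val }
func (n *Node) Grad() float64      { return n.grad }
func (n *Node) SetValue(v float64) { n.val = v }

// NodesToValueGrads copies a []*Node into a []ValueGrad. Go does not
// convert between these slice types implicitly, hence the helper.
func NodesToValueGrads(nodes []*Node) []ValueGrad {
	out := make([]ValueGrad, len(nodes))
	for i, n := range nodes {
		out[i] = n
	}
	return out
}

// sgdStep is a hypothetical solver step (plain gradient descent).
// It only sees ValueGrads, so it needs nothing else from the graph.
func sgdStep(model []ValueGrad, lr float64) {
	for _, vg := range model {
		vg.SetValue(vg.Value() - lr*vg.Grad())
	}
}

// toyVM stands in for a VM that holds resources until Close is called.
type toyVM struct{ closed bool }

func (m *toyVM) Close() error { m.closed = true; return nil }

// Compile-time check that *toyVM is an io.Closer.
var _ io.Closer = (*toyVM)(nil)

// trainOnce treats the VM as a resource: Close is deferred so it runs
// on every return path. The error from Close is ignored in this sketch.
func trainOnce(vm *toyVM, nodes []*Node, lr float64) error {
	defer vm.Close()
	sgdStep(NodesToValueGrads(nodes), lr)
	return nil
}

func main() {
	vm := &toyVM{}
	w := &Node{val: 1.0, grad: 0.5}
	if err := trainOnce(vm, []*Node{w}, 0.5); err != nil {
		fmt.Println("training failed:", err)
		return
	}
	fmt.Println(w.Value(), vm.closed)
}
```

Because anything with a value and a gradient can be handed to the step function, internal state such as BatchNorm's can be updated without being a node in the expression graph.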
@coveralls

coveralls commented Mar 11, 2018

Coverage Status

Coverage increased (+0.1%) to 62.031% when pulling 8453c8b on v0.9.0-working into 32c6fd8 on master.

@coveralls

coveralls commented Mar 11, 2018

Coverage Status

Coverage decreased (-0.004%) to 61.363% when pulling 64ce9c6 on v0.9.0-working into d7b00e0 on master.

chewxy added 6 commits Mar 11, 2018
Generated new Gopkg.lock files
…nes in operations.go which were partially generated using a python script
…math.go). Cleaned up some spelling errors
@docmerlin

Collaborator

docmerlin commented Mar 18, 2018

Awesome. Excited about the col-major stuff for more speed.

@chewxy chewxy force-pushed the v0.9.0-working branch from 16267e9 to 14b9bdc Mar 19, 2018
chewxy added 9 commits Mar 20, 2018
…TapeMachine can be done.

As a result the example is much less clear than expected :(
Added checks for shape inference of linear algebra
Added scalar checks for WithShape()
chewxy added 28 commits Aug 3, 2018
…of batchNorm by requiring gamma(scale) and beta(bias) though
@chewxy chewxy merged commit 978b1c3 into master Aug 19, 2018
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
5 participants