Implement Convolution, Dropout and Pooling layers #180

hweom · 2022-12-10T20:26:29Z

What does this PR accomplish?

Implements Convolution, Dropout and Pooling layers in the new architecture.

🦚 Feature
🧭 Architecture

Changes proposed by this PR:

Additional by-product changes:

Passes the backend to layer/net init functions, as it's sometimes needed to create backend-dependent helper structures.
Adjusts the layer testing helpers to better handle different input/output dimensionality.
Adjusts the layer testing helpers to additionally take a backend reference, to support layers that are currently only implemented for CUDA.
Automatically extends the CUDA descriptor size to a minimum of 3 instead of returning an error.
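To illustrate the last point, here is a minimal sketch of how a tensor shape could be extended to a minimum rank of 3 without changing its element count. The helper names `pad_shape` and `contiguous_strides` are hypothetical and not taken from this PR, and padding with trailing 1s is an assumption about the approach, not a description of the actual coaster code.

```rust
/// Hypothetical helper (not from the PR): pads a tensor shape with trailing
/// 1s until it has at least `min_rank` dimensions. The element count is
/// unchanged because every added dimension has size 1.
fn pad_shape(shape: &[usize], min_rank: usize) -> Vec<usize> {
    let mut padded = shape.to_vec();
    while padded.len() < min_rank {
        padded.push(1);
    }
    padded
}

/// Hypothetical helper: row-major (contiguous) strides for a shape, as
/// would typically accompany the dimensions in an N-d tensor descriptor.
fn contiguous_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    // Walk from the second-to-last dimension back to the first.
    for i in (0..shape.len().saturating_sub(1)).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}
```

Under these assumptions, `pad_shape(&[2, 3], 3)` gives `[2, 3, 1]`, and `contiguous_strides` on that result gives `[3, 1, 1]`. Shapes that already have three or more dimensions pass through unchanged.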

📜 Checklist

Test coverage is excellent
All unit tests pass
The juice-examples run just fine
Documentation is thorough, extensive and explicit


* Move Convolution workspace into context
* Formatting fixes
* Fixed unit tests
* Partial implementation of the Convolution layer
* Implement the remaining parts for Convolution layer
* Implement dropout and pooling layers
* Fix CUDA tensor descriptor size error and adjust layer testing infra
* Extended debug output for layers with custom Debug impl

Co-authored-by: Mikhail Balakhno <{ID}+{username}@users.noreply.github.com>

* Fix coaster UI tests (rustc error messages changed in 1.62 (#172)
* Fix Linear layer bias gradient computation; add size checks to CUDA functions (#170)
* Assert the correct tensor sizes in copy() and gemm(); fix related Linear logic
* Check output matrix dims in GEMM; fix corresponding Linear layer logic
* Update coaster-blas/src/frameworks/cuda/helper.rs
* Fix merge mistake in commit 6952a49 (#173)
* doc: clarify remote test (#175)
* bump rust-bindgen to 0.60.1, bump cargo lock file (#174)
* build(deps): bump capnp from 0.14.9 to 0.14.11 (#179)
  Bumps [capnp](https://github.com/capnproto/capnproto-rust) from 0.14.9 to 0.14.11. - [Release notes](https://github.com/capnproto/capnproto-rust/releases) - [Commits](capnproto/capnproto-rust@capnp-v0.14.9...capnp-v0.14.11) --- updated-dependencies: - dependency-name: capnp dependency-type: direct:production ...
* build(deps): bump tokio from 1.21.0 to 1.23.1 (#183)
  Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.21.0 to 1.23.1. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](tokio-rs/tokio@tokio-1.21.0...tokio-1.23.1) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production ...
* build(deps): bump bumpalo from 3.11.0 to 3.12.0 (#187)
  Bumps [bumpalo](https://github.com/fitzgen/bumpalo) from 3.11.0 to 3.12.0. - [Release notes](https://github.com/fitzgen/bumpalo/releases) - [Changelog](https://github.com/fitzgen/bumpalo/blob/main/CHANGELOG.md) - [Commits](fitzgen/bumpalo@3.11.0...3.12.0) --- updated-dependencies: - dependency-name: bumpalo dependency-type: indirect ...
* build(deps): bump tokio from 1.23.1 to 1.24.2 (#191)
  Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.23.1 to 1.24.2. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/commits) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production ...
* Now also saves bias layers (#193)
* build(deps): bump openssl from 0.10.41 to 0.10.48
  Bumps [openssl](https://github.com/sfackler/rust-openssl) from 0.10.41 to 0.10.48. - [Release notes](https://github.com/sfackler/rust-openssl/releases) - [Commits](sfackler/rust-openssl@openssl-v0.10.41...openssl-v0.10.48) updated-dependencies: - dependency-name: openssl dependency-type: indirect ...
  Signed-off-by: dependabot[bot] <support@github.com>
* Do not pass batch_size to cudnnGetRNNParamsSize().
* Add a feature for deterministic (pseudo)randomizing.
* New network architecture pieces: Layer, Descriptor, Context, Network (#165)
* New network architecture pieces: Layer, Descriptor, Context, Network
* Update juice/src/net/descriptor.rs
* Implement Sequential layer for the new architecture (#168)
* Implement Sequential layer
* Fix coaster UI tests (rustc error messages changed in 1.62 (#172)
* Fix Linear layer bias gradient computation; add size checks to CUDA functions (#170)
* Assert the correct tensor sizes in copy() and gemm(); fix related Linear logic
* Check output matrix dims in GEMM; fix corresponding Linear layer logic
* Update coaster-blas/src/frameworks/cuda/helper.rs
* More ergonomic net creation and fallible Sequential constructor
* Fix merge mistake in commit 6952a49
* Add a few more layers to the new architecture (#176)
* Add trainer subsystem with SGD and Adam optimizers (#177)
* Coaster convolution API cleanup (#178)
* Move Convolution workspace into context
* Implement Convolution, Dropout and Pooling layers (#180)
* Move Convolution workspace into context
* Formatting fixes
* Fixed unit tests
* Partial implementation of the Convolution layer
* Implement the remaining parts for Convolution layer
* Implement dropout and pooling layers
* Fix CUDA tensor descriptor size error and adjust layer testing infra
* Extended debug output for layers with custom Debug impl
* Add softmax layers and convert MNIST example (#184)
* Move Convolution workspace into context
* Formatting fixes
* Fixed unit tests
* Partial implementation of the Convolution layer
* Implement the remaining parts for Convolution layer
* Implement dropout and pooling layers
* Fix CUDA tensor descriptor size error and adjust layer testing infra
* Extended debug output for layers with custom Debug impl
* Changed mnist example to the new architecture
* Plumbed the momentum arg in the mnist example
* Implemented softmax and logsoftmax layers
* Remove unnecessary NLL parameter and fix mnist example
* Fix native backend softmax and logsoftmax grad computation
* Changed slicing syntax in native backend softmax functions
* Convert juice benchtests to Criterion (#192)
* Convert Juice benchmarks to Criterion
* Add newline at the end of Cargo.toml
* Made Layer operations return a Result (#186)
* Made Layer operations return a Result
* Change LayerError to contain Boxes
* Update benchmarks for new layer API
* Simplify new_rnn_config()

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Mikhail Balakhno <{ID}+{username}@users.noreply.github.com>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: opfromthestart <opfromthestart@gmail.com>

Mikhail Balakhno added 9 commits November 5, 2022 15:05

Move Convolution workspace into context

75ea138

Formatting fixes

e165757

Fixed unit tests

56a2d63

Partial implementation of the Convolution layer

59e2ef5

Implement the remaining parts for Convolution layer

b0c66e5

Merge remote-tracking branch 'upstream/arch-refactor' into arch-refactor

0afa7df

Merge branch 'new_convolution' into arch-refactor

d0ebf04

Implement dropout and pooling layers

5d58e33

Fix CUDA tensor descriptor size error and adjust layer testing infra

f421a8f

drahnr reviewed Dec 22, 2022

juice/src/net/common/convolution.rs Outdated

drahnr approved these changes Dec 22, 2022

Extended debug output for layers with custom Debug impl

45f404b

drahnr approved these changes Dec 24, 2022

drahnr merged commit c388ebb into fff-rs:arch-refactor Dec 24, 2022

hweom deleted the arch-refactor branch January 3, 2023 01:19
